> > As you know, we have a problem of lack of team members and contributors.
> > So we should break down every tasks as small as possible.

Where was this task not broken into pieces? There are at least two tasks:

- Improve GraphJobRunner memory consumption (HAMA-704, even reviewed on reviewboard with huge memory savings)
- Implement SpillingQueue / SortedSpillingQueue (HAMA-644, HAMA-723 and whatever else)

This is the change we talked about very extensively on the dev list and on JIRAs, and we chose a single design we want to implement. It requires a lot of code change, so I don't see how splitting it further (IMHO this is atomic enough) would be beneficial. And even if you split the stuff, it would add huge organizational overhead, because the number of team members/contributors that can work on those tasks is limited.

> I don't know what you mean exactly. But 23 issues are almost examples
> except YARN integration tasks. If you leave here, I have to take cover
> YARN tasks. Should I wait someone? Am I touching core module
> aggressively?

This is not a discussion about skill; I wanted to emphasize that you can very well work on other JIRAs instead of blocking our work on graph/messaging. And 23 is at least 22 more than the average of the rest of the team. Think about that: would there be issues for newcomers? Yes, there would! But why are you assigning them to yourself when you're not actively working on them? YARN is just a single umbrella issue that is "yours"; there is work blocked on maven coding (HAMA-671) and there is also a patch from me in HAMA-672 that has been pending review since 20/11/12 (4 months!), so don't tell me that you work on those things actively in your "full-time open sourcer" career.

> By the way, can you answer about this question - Is it really
> technical conflicts? or emotional conflicts?

If someone is usually emotional about things, it is you. Technically speaking, should we branch out such (big) refactoring issues to work on our own, or do you want to brew your own soup on trunk and have us merge all the stuff together?
In case you want to, please fork your own playground Hama and do all the stuff you want; if something emerges successfully, feel free to slice a patch and emit a JIRA.

> So I think we need to cut release as often as possible.

Sorry Edward, but our releases have been a disaster so far. I've only been here since 0.3.0, but none of them was scalable, well documented, or well tested. I have no problem with taking more time for a product, as I don't feel the need to deliver half-baked stuff to people who are neither using it nor providing any feedback (which is the sad reality in many other open source projects as well). So in my opinion we have to iterate on our own and not with official releases. "It is done when it's done" is the usual standard, and I don't think deviating from it will give any advantages besides pissed-off users who can't get Hama to work like it should.

Also, your changes on the wiki recently:

> However, if no one responds to your patches for 3 days, you can commit then
> review later.

Who in the community has voted for that rule, or do you make the rules here? You can't talk about community in the same sentence as changing rules for everybody just because you like it. Where was the need to commit HAMA-745 without review? Why did you change that testcase? This is just the tip of the iceberg of changes you are making to trunk without the agreement of the community. We established a community process during the incubation (it was even written in the charter when graduating), so why do we not stick to it instead of laying out the rules for your own needs or those of your employer?

> Regarding branches, maybe we all are not familiar with online
> collaboration (or don't want to collaborate anymore). If we want to
> walk own ways, why we need to be in here together?

Branching is something that is perfectly legal when something needs to be developed in parallel to ongoing work. We don't have much ongoing work, do we?
So I don't think branching is usually needed when working on small projects, because issues can be solved by communication. But if you commit / plan stuff to trunk without coordinating that with the people (YOU KNOW) who are currently working on it, then it is just a bad move.

> In HAMA-704, I wanted to remove only message map to reduce memory
> consumption. I still don't want to talk about disk-based vertices and
> Spilling Queue at the moment. With this, I wanted to release 0.6.1
> 'partitioning issue fixed and quick executable examples' version ASAP.

You can't say B without saying A. The problems are much deeper than you think they are. The memory consumption is not a problem of the message map, but a twofold problem: vertices that are kept in memory although they don't need to be, and a not very scalable messaging system. I have been telling you that since the time we added the graph module, but it has fallen on deaf ears with you for more than a year. Yeah, and you know what? This requires a lot of changes. If you had invested the time to work with us on the root of all the issues instead of doing strange stuff, e.g. the partitioning jobs (in the hours I wasted telling you about the technical downsides of it I could've built another Hadoop in FORTRAN), we could've gotten a release out months ago and worked on other things.

> If we want to sort partitioned data using messaging system, idea
> should be collected.

The idea is there and the idea works, but I guess you're not following the JIRAs you are +1'ing? Suraj is already working on the second part of the idea, which we divided in two, and instead of cockfighting with each other we should work together to make this happen. And not as fast as possible because you want to roll out a release for your employer, but because we want to improve the framework radically and have enough time to test it thoroughly with various configurations and not just an Oracle BDA.

> P.S., These comments are never helpful in developing community.

It is something that needs to be discussed throughout the whole project, and not on a single private mailing list. Community development doesn't start with +1'ing and smiling at everything just to keep people on board. Truth hurts, but it is necessary for something to evolve.

Community starts with people who have a vision for making a project better; it will develop by itself once the project is stable enough and has a bigger user base. You know, developers are users too. If I can't run a graph job with 1 GB of Wikipedia links on my laptop, this project is not very likely to be something I want to develop on. So our first responsibility is to make our project run perfectly smoothly, and nothing else. And that is something that must be discussed with people who want to develop but can't, and we need these people.

And to be honest again, we haven't had many people other than GSoC students who get a shitton of money for developing stuff and then walk away again. I count myself in now as well, mea culpa.
