I admit I didn't provide enough information when opening the JIRA tickets, and I didn't review patches carefully.
For example, I saw the memory issue before release 0.6, so I opened HAMA-596 "Optimize memory usage of graph job". Someone uploaded a patch there, so I dropped a +1 without review. But the problem still wasn't fixed, so I opened the same issue again as HAMA-704 "Optimization of memory usage during message processing". Again, someone uploaded a patch there, so I dropped a +1 without review. Now the examples don't work? So I reported that error here, and I've started to read the recent changes now. That's all. The problem seems to come from our review culture.

---

> If you would have invested the time to work with us on the root of all
> issues instead of doing strange stuff e.G. like the partitioning jobs (in
> the hours I wasted to tell you about the technical downsides of it I
> could've built another Hadoop in FORTRAN) we could've gotten a release out
> months ago and work on other things.

Thomas, I really don't know why you are saying this. Let's not blame each other anymore. What do you want from me?

On Fri, Mar 15, 2013 at 2:47 AM, Thomas Jungblut <[email protected]> wrote:
>> As you know, we have a problem of lack of team members and contributors.
>> So we should break down every tasks as small as possible.
>
> Where was this task not broken into pieces?
> There are at least two tasks:
>
> - Improve GraphJobRunner memory consumption (HAMA-704, even reviewed on
> reviewboard with huge memory savings)
> - Implement SpillingQueue / SortedSpillingQueue (HAMA-644, HAMA-723
> whatever else)
>
> This is the change we talked about on the dev list and on JIRAs very
> extensively and chose a single design we want to implement. This requires a
> lot of code change, so I don't see how splitting that smaller (IMHO this is
> atomic enough) would be beneficial. And even if you split the stuff, it
> would add huge organizational overhead, because we lack of team
> members/contributors that can work on those tasks is limited.
>
>> I don't know what you mean exactly.
>> But 23 issues are almost examples
>> except YARN integration tasks. If you leave here, I have to take cover
>> YARN tasks. Should I wait someone? Am I touching core module
>> aggressively?
>
> It is not about a skill discussion here, but I wanted to emphasize that you
> can very well work on other JIRAs instead of blocking our work on
> graph/messaging. And 23 is at least 22 more than the average of the rest of
> the team, think about that: would there be issues for newcomers? Yes there
> would! But why are you assigning them to yourself when you're not working
> actively on them?
>
> YARN is just a single umbrella issue that is "yours", there is work blocked
> on maven coding (HAMA-671) and also there is a pending patch review since
> 20/11/12 (4 months!) from me in HAMA-672, so don't tell me that you work on
> that things actively in your "full-time open sourcer" career.
>
>> By the way, can you answer about this question - Is it really
>> technical conflicts? or emotional conflicts?
>
> If someone is usually emotional about things, it is you. Technically
> speaking, should we branch out such (big) refactoring issues to work on our
> own, or do you want to brew your own soup on trunk and have us merge all
> the stuff together? In case you want to please fork your own playground
> Hama and do all the stuff you want, if something emerges successfuly feel
> free to slice a patch and emit a JIRA.
>
>> So I think we need to cut release as often as possible.
>
> Sorry Edward, but our releases have been a disaster so far. I'm only here
> since 0.3.0, but none of it was either scalable, nor good documented and
> well tested. I have no problem with taking more time for a product, as I
> don't feel the need to deliver half-baked stuff to people who are not using
> it anyways nor providing any feedback there (which is sad reality in many
> other open source projects as well). So in my opinion we have to iterate on
> our own and not with official releases. "It is done, when it's done" is the
> usual standard and I don't think deviating from it will give any advantages
> besides pissed off users getting Hama not to work like it should.
>
> Also your changes on the wiki recently:
>
>> However, if no one responds to your patches for 3 days, you can commit then
>> review later.
>
> Who in the community has voted for that rule, or do you make the rules
> here? You can't talk about community in the same sentence as changing rules
> for everybody just because you like that.
> Where was the need to commit HAMA-745 without review? Why did you change
> that testcase? This is just the "tip" of the iceberg of changes you are
> doing to the trunk without the agreement of the community. We established a
> community process during the incubation (that was even written on the
> charter when graduating), so why do we not stick to it instead of laying
> out the rules for self-needs / or that of your employee?
>
>> Regarding branches, maybe we all are not familiar with online
>> collaboration (or don't want to collaborate anymore). If we want to
>> walk own ways, why we need to be in here together?
>
> Branching is something that is perfectly legal when something needs to be
> developed in parallel to ongoing work. We don't have much ongoing work do
> we? So I don't think branching is usually need when working on small
> projects, because issues can be solved by communication. But if you commit
> / plan stuff to trunk without coordinating that with people (YOU KNOW) that
> are currently working on it, then it is just a bad move.
>
>> In HAMA-704, I wanted to remove only message map to reduce memory
>> consumption. I still don't want to talk about disk-based vertices and
>> Spilling Queue at the moment. With this, I wanted to release 0.6.1
>> 'partitioning issue fixed and quick executable examples' version ASAP.
>
> You can't say B without saying A. The problems are much deeper than you
> think they are.
> The message consumption is not a problem of the message
> map, but a two fold problem of vertices that are in memory although they
> don't need to and a not very scalable messaging system. I told you that
> since the time we added the graph module, but I still fall on deaf ears
> with you since more than a year.
> Yea and tell you what? This requires a lot of changes.
>
> If you would have invested the time to work with us on the root of all
> issues instead of doing strange stuff e.G. like the partitioning jobs (in
> the hours I wasted to tell you about the technical downsides of it I
> could've built another Hadoop in FORTRAN) we could've gotten a release out
> months ago and work on other things.
>
>> If we want to sort partitioned data using messaging system, idea
>> should be collected.
>
> The idea is there and the idea works, but I guess you're not following the
> JIRA's you are +1'ing to?
> Suraj is already working on the second part of the idea we divided by two
> and instead of cock fighting with each other we should work together to
> make this happening. And not as fast as possible because you want to roll
> out a release for your employee, but because we want to improve the
> framework radically and have enough time to test it throughoutly with
> various configurations and not just a Oracle BDA.
>
>> P.S., These comments are never helpful in developing community.
>
> It is something that needs to be discussed throughout the whole project,
> and not on a single private mailing list. Community development doesn't
> start with +1'ing and smiling to everything just to keep people on board.
> Truth hurts, but is necessary to evolve something. Community starts with
> people who have a vision in making a project better, it will develop for
> itself when it is stable enough and has a bigger user base, you know-
> developers are users too.
> If I can't run a graph job with 1gb of wikipedia
> links on my laptop, this project is not very likely to be something I want
> to develop on. So our first responsibility is to make our project running
> perfectly smooth and nothing else. And that is something that must be
> discussed with people who want to develop, but can't- and we need these
> people.
> And to be honest again, we didn't had much other people than GSoC students
> that get a shitton of money for developing stuff and then walking away
> again? I count myself in now as well, mea culpa.

--
Best Regards, Edward J. Yoon
@eddieyoon
