On Nov 16, 2008, at 1:09 PM, Peter Veentjer wrote:

I have had a quick look at the source code of Hadoop and it appears
that there are some issues with the JMM. In some places it is done
correctly, in some places partially, and in some places it is incorrect.

That is believable. Clearly some of the problems have been fixed, but Hadoop is moving fast and new code is being added every week. There have been bug reports on concurrency stuff that turned out to be false positives. *smile* It is amazing how much testing code gets when you run it on 20,000 nodes 24x7 and even rare cases have been hit in production.

There are also some design issues with concurrency, and I think
the Hadoop project could benefit from an overall solution instead of just
putting out small fires.

Yeah, we've talked about it for a long time. (See HADOOP-869.) The particularly problematic parts of the call graph in the JobTracker are when the lower levels call back into the higher layers. We've had to be careful to preserve lock order in those cases to avoid deadlock.
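
To illustrate the lock-order point, here is a minimal sketch. It is not actual JobTracker code: the Tracker and Task classes and all of their methods are hypothetical stand-ins for a higher layer and a lower layer. The assumption is a fixed rule that the higher layer's lock is always taken before the lower layer's lock, including on the path where the lower layer calls back up.

```java
import java.util.ArrayList;
import java.util.List;

public class LockOrderSketch {
    // Hypothetical higher layer. Rule: its lock is always acquired first.
    static class Tracker {
        private final List<String> finished = new ArrayList<>();

        // Top-down path: tracker lock (via synchronized), then task lock.
        synchronized void kill(Task task) {
            if (task.markDone()) {
                finished.add(task.id);
            }
        }

        // Called from the lower layer's callback.
        synchronized void taskFinished(String taskId) {
            finished.add(taskId);
        }

        synchronized int finishedCount() {
            return finished.size();
        }
    }

    // Hypothetical lower layer. Rule: its lock is always acquired second.
    static class Task {
        private final String id;
        private final Tracker tracker;
        private boolean done;

        Task(String id, Tracker tracker) {
            this.id = id;
            this.tracker = tracker;
        }

        // Returns true only for the first caller that marks the task done.
        synchronized boolean markDone() {
            boolean changed = !done;
            done = true;
            return changed;
        }

        // Bottom-up path: take the tracker lock BEFORE the task lock, so the
        // order matches kill(). Locking the task first here would allow a
        // deadlock against a concurrent kill() on the same task.
        void complete() {
            synchronized (tracker) {
                if (markDone()) {
                    tracker.taskFinished(id);
                }
            }
        }
    }

    // Runs both paths concurrently over the same tasks and returns how many
    // tasks were recorded as finished.
    static int demo(int n) {
        Tracker tracker = new Tracker();
        List<Task> tasks = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            tasks.add(new Task("task-" + i, tracker));
        }
        Thread topDown = new Thread(() -> { for (Task t : tasks) tracker.kill(t); });
        Thread bottomUp = new Thread(() -> { for (Task t : tasks) t.complete(); });
        topDown.start();
        bottomUp.start();
        try {
            topDown.join();
            bottomUp.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return tracker.finishedCount();
    }

    public static void main(String[] args) {
        System.out.println("finished tasks: " + demo(1000));
    }
}
```

Because both paths acquire the locks in the same order, neither thread can hold one lock while waiting on the other in the reverse order, and each task is recorded exactly once no matter which thread gets there first.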

So who are the guys to get in touch with?

The people on this list are exactly the right guys.

Together with the Hadoop developers I want to further improve the
quality of this very interesting project.

Jump right in. *Smile*

-- Owen
