Peter Veentjer wrote:
Hi Ted,

One of the easy-to-find problems is spinning on a non-volatile
variable without changing the value in the loop and without additional
synchronization (at least I didn't find any in a few seconds).

examples:
AgentControllerSocketListener.closing
HttpConnector.stopMe
IndexUpdateReducer.closed
SecundaryNodeName.shouldRun
voTask.hasNext

These can all be fixed by making the field volatile.
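
To illustrate the pattern, here is a minimal sketch in Java. The class and method names (StopFlagSketch, Worker, stop) are hypothetical and not taken from the Hadoop source; only the field name stopMe echoes the list above. Without volatile on the flag, the Java Memory Model does not guarantee that the spinning thread ever sees the write made by another thread, so the loop may never exit.

```java
public class StopFlagSketch {

    // Hypothetical worker that spins until another thread asks it to stop.
    static class Worker implements Runnable {
        // volatile guarantees that a write from the stopping thread becomes
        // visible to the spinning thread. Without it, the JIT is allowed to
        // hoist the read out of the loop and spin forever.
        private volatile boolean stopMe = false;

        // Only touched by the worker thread itself.
        private long iterations = 0;

        public void stop() {
            stopMe = true;
        }

        @Override
        public void run() {
            while (!stopMe) {
                iterations++; // stand-in for real work
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Worker worker = new Worker();
        Thread thread = new Thread(worker);
        thread.start();

        Thread.sleep(50);   // let the worker spin for a moment
        worker.stop();      // volatile write, visible to the worker
        thread.join(2000);  // wait up to two seconds for it to finish

        System.out.println("worker terminated: " + !thread.isAlive());
    }
}
```

A java.util.concurrent.atomic.AtomicBoolean would work equally well here; volatile is simply the smallest change to an existing boolean field.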

These are the easy ones that can be found with static analysis tools,
but I bet there are a lot more that are harder to find.

One of the problems with concurrency issues is that they are hard to test for: it's hard to create tests that show the problem exists. Another is that the main services (namenode, secondary namenode, etc.) all run, in production, in their own processes, so they can get away with concurrency risks and static shared code that aren't so appealing in shared processes.


I think the Hadoop project would benefit from a structural approach to
solving these problems instead of just fixing these bugs. That is what
I want to help with, but I can't do it without the support of the
lead developers of the Hadoop community.

1. I don't see anyone being against this, though you would have to start with education. For example, it took me a bit to work out that you were using JMM as an acronym for Java Memory Model.

2. I think we'd need to prioritise where the biggest risks are.

One of the things we need to agree upon is, for example,
making fields that are set only in the constructor final. This makes
analysis a lot easier.


It does, but it also makes subclassing trickier, as subclassed instances don't get a look in or an opportunity to override the values. Even if they have methods you can use to evaluate the subclassed values, the fact that these are called from the parent's constructor makes them a risk all on their own.
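
A small sketch of that constructor risk, in Java. All names here (ConstructorOverrideSketch, Base, Derived, choosePort) are made up for illustration and do not come from Hadoop. The parent's constructor calls an overridable method to compute a final field, but the subclass's own fields have not been assigned yet when that call runs, so the override sees the default value.

```java
public class ConstructorOverrideSketch {

    // Hypothetical parent class with a final field set in the constructor.
    static class Base {
        private final int port;

        Base() {
            // Risky: calls an overridable method before any subclass
            // constructor body has run.
            this.port = choosePort();
        }

        protected int choosePort() {
            return 8080;
        }

        int getPort() {
            return port;
        }
    }

    // Hypothetical subclass that tries to supply its own value.
    static class Derived extends Base {
        private final int customPort;

        Derived(int customPort) {
            super();                      // Base() runs first and calls choosePort()
            this.customPort = customPort; // assigned only after Base() has returned
        }

        @Override
        protected int choosePort() {
            // When called from Base(), customPort still holds its default, 0.
            return customPort;
        }
    }

    public static void main(String[] args) {
        Base plain = new Base();
        Base derived = new Derived(9090);

        System.out.println(plain.getPort());   // prints 8080
        System.out.println(derived.getPort()); // prints 0, not 9090
    }
}
```

Passing the value up through a parent constructor parameter avoids the problem, at the cost of changing the parent's constructor signature.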
