Re: Next steps after 2.2.1 release

Gregg Wonderly Mon, 08 Apr 2013 06:11:43 -0700

On 4/7/2013 7:03 PM, Greg Trasuk wrote:

I'm honestly and truly not passing judgement on the quality of the code. I honestly don't know if it's good or bad. I have to confess that, given that Jini was written as a top-level project at Sun, sponsored by Bill Joy, when Sun was at the top of its game, and the Jini project team was a "who's-who" of distributed computing pioneers, the idea that it's riddled with concurrency bugs surprises me. But mainly, I'm still trying to answer that question - "How do I know if it's good?"

Here's what I'm doing:

- I'm attempting to run the tests from "tags/2.2.0" against the "2.2" branch. When I have confidence in the "2.2" branch, I'll publish the results, ask anyone else who's interested to test it, and then call for a release on "2.2.1"
- After that, the developers need to reach consensus about how to move forward.

Cheers,

Greg.

This is an important issue to address. I know a lot of people here probably don't participate on the Concurrency-interest mailing list, which has a wide range of discussion about the JLS vs the JMM and what the JIT compilers actually do to code these days.

The number one issue that you need to understand is that the optimizer is working against you more and more these days if you don't have the JMM details exactly right. Statements are being reordered more and more, including actual "assignments", which can expose uninitialized data items in "racy" concurrent code. The latest example is the Thread.setName()/Thread.getName() pair. They are almost always going to be accessed by "other threads", yet there is no synchronization on them, not even "visibility" control with volatile. What this means is that if setName() and getName() are being called in a racy environment, setName() can assign the array that is created to copy the characters into before the arraycopy of the data occurs, potentially exposing an uninitialized name to getName().
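To make the setName()/getName() pattern concrete, here is a minimal sketch. The classes RacyName and SafeName are hypothetical and are not the JDK's Thread source; they only mirror the shape described above. In RacyName the field is a plain reference, so nothing orders the element writes before the reference write as seen by another thread. In SafeName the field is volatile, so a reader that sees the new reference also sees the copied characters.

```java
// Hypothetical holder classes, NOT the JDK's Thread implementation.
final class RacyName {
    // Plain field: a racing reader has no happens-before edge with the writer.
    private char[] name = new char[0];

    void setName(String s) {
        char[] copy = new char[s.length()];
        s.getChars(0, s.length(), copy, 0); // fill the array
        name = copy; // another thread may observe this reference before the contents
    }

    String getName() {
        return new String(name);
    }
}

final class SafeName {
    // Volatile field: the write below happens-before any read that sees it.
    private volatile char[] name = new char[0];

    void setName(String s) {
        char[] copy = new char[s.length()];
        s.getChars(0, s.length(), copy, 0);
        name = copy; // volatile write publishes the filled array
    }

    String getName() {
        return new String(name); // volatile read
    }
}
```

Single-threaded, both classes behave identically, which is why this kind of bug tends to stay hidden until a stress run on different hardware.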

There are literally hundreds of places in the JDK that still have these kinds of races going on, and no one at Oracle, based on how people are acting, appears to be responsible for dealing with it. The Jini code has many of the same issues, which appear seemingly at random in stress cases on "slower" or "faster" hardware, depending on the issue.

When you haven't got sharing and visibility covered correctly, the JIT code rewrites can make execution order play a big part in confusing what you "see" happening versus what the "code" says, to you, should happen.

There are some very simple things you can do to get the JIT out of the picture. One of these is to actually open the source up in an IDE and declare every field final. If that doesn't compile due to 'mutation' of values, change those fields to 'volatile' so that it will compile again. Then run your tests. You will have greatly diminished reordering and visibility issues, so that you can just get to the simple "was it set correctly, before it was read" and "did we provide the correct atomicity for that update" kinds of questions that will help you understand things better when code is misbehaving.
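As a rough illustration of the final-then-volatile pass, consider the sketch below. ServiceEntry and its fields are made up for the example and are not taken from the Jini code. Fields that are never reassigned become final, the one that is mutated becomes volatile, and the counter shows the remaining "atomicity" question: volatile alone does not make a read-modify-write atomic, so the sketch assumes an AtomicInteger is acceptable there.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical class used only to illustrate the field-by-field pass.
final class ServiceEntry {
    private final String id;       // never reassigned, so it can be final
    private volatile long expiry;  // reassigned after construction, so volatile
    // count++ on a volatile int would still be a racy read-modify-write,
    // so the counter uses an atomic type instead.
    private final AtomicInteger renewals = new AtomicInteger();

    ServiceEntry(String id, long expiry) {
        this.id = id;
        this.expiry = expiry;
    }

    String id() {
        return id;
    }

    long expiry() {
        return expiry; // volatile read: sees the latest renew()
    }

    void renew(long newExpiry) {
        expiry = newExpiry;          // volatile write
        renewals.incrementAndGet();  // atomic increment
    }

    int renewals() {
        return renewals.get();
    }
}
```

With visibility handled this way, a failing test points at ordering or atomicity in the logic rather than at what the JIT did with a plain field.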

This is the kind of thing that Peter has been working through, because the usage of the code in real life has not continued in the same way that it did when the code was written, and the JMM in JDK5 has literally broken so much software, all over the planet, that used to work quite well, because there previously wasn't a formal definition of "happens before". Now that there is, the compiler optimizations are against you if you don't get it right. The behaviors you will experience, because of reorderings that are targeted at all-out performance (minimizing traffic in and out of the CPU through memory subsystems), can create completely unexpected results. Intra-thread semantics are kept correct, but inter-thread execution will just seem intangible because stuff will not be happening in the order the "code" says it should.
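A small sketch of what a "happens before" edge buys you, using a hypothetical HandOff class that is not from the Jini code base. The writer stores a plain field and then sets a volatile flag; the reader spins on the flag and then reads the plain field. Because the volatile write happens-before the volatile read that observes it, the reader is guaranteed to see the payload. Remove the volatile keyword and the JMM allows the reader to spin forever or to return a stale value.

```java
// Hypothetical example class illustrating a volatile hand-off.
final class HandOff {
    private int payload;            // plain field, ordered by the volatile flag
    private volatile boolean ready; // volatile write/read forms the happens-before edge

    void publish(int value) {
        payload = value; // ordinary write, ordered before the volatile write below
        ready = true;    // volatile write
    }

    int awaitPayload() {
        while (!ready) {   // volatile read on each iteration
            Thread.yield();
        }
        return payload;    // guaranteed to see the value written before ready = true
    }

    // Runs one writer/reader pair and returns what the reader saw.
    static int demo() {
        HandOff h = new HandOff();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> seen[0] = h.awaitPayload());
        reader.start();
        h.publish(42);
        try {
            reader.join(); // join() also makes the reader's write to seen[0] visible here
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```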


Gregg Wonderly
