On 15/03/16 19:15, Kim Barrett wrote:
>> On Mar 15, 2016, at 12:18 AM, Andrew Hughes <gnu.and...@redhat.com> wrote:
>
> I’ll probably have more to say later; just responding to one point here.
>
>>>> 2. A number of optimisations in GCC 6 lead to a broken JVM. We need to
>>>> add -fno-delete-null-pointer-checks and -fno-lifetime-dse to get a
>>>> working JVM.
>>>
>>> That's very disturbing!
>>
>> Andrew Haley (CCed) has more details on the need for these options,
>> as he diagnosed the reason for the crash, with the help of the GCC
>> developers. From what I understand of it, it is a case of more
>> aggressive optimisations in the new GCC running into problems with
>> code that doesn't strictly conform to the specification and exhibit
>> undefined behaviour.
>
> That is my suspicion too, though without more detail of the failures it’s
> hard to completely discount the possibility of compiler bugs.

They weren't compiler bugs: I analyzed the code and I am sure that the
code in HotSpot isn't valid C++.

We need -fno-lifetime-dse because we write to a field of an object in
operator new, before the constructor runs.  This is in
Node::operator new.  It's been partially fixed in JDK 9 by 8034812,
but an illegal write remains if assertions are turned on.  The bug
remains in JDK 8.  It might be that there are no similar bugs
elsewhere in HotSpot, but it would take more time than I had to prove
this.

We dereference null pointers a lot.  I would very much like to clean
all of these out but I didn't detect much enthusiasm from the HotSpot
team.

>> The need for -flifetime-dse is covered in
>> comment #47 of the bug for this [0]; "an object field [is] being
>> accessed before the object is constructed, in breach of
>> C++98 [class.cdtor]”.
>
> Thanks for the pointer to the redhat bug for tracking this work:
> https://bugzilla.redhat.com/show_bug.cgi?id=1306558
>
> [Though a lot of comments there aren't visible to me.]
>
> This comment is quite worrisome.
> https://bugzilla.redhat.com/show_bug.cgi?id=1306558#c6
>   I very strongly suspect that -fno-strict-aliasing is broken in this
>   version of GCC.
>
> Is that still thought to be a concern?

No.  I was wrong.

> And any more information about why -fno-delete-null-pointer-checks
> matters?

As I mentioned above, we dereference null pointers a lot.  For
example, Register rax is defined as (RegisterImpl*)0.  So, if we do
something like

  guarantee(reg->is_valid(), "must be");
  if (reg == rax)
    stuff...

GCC is quite within its rights to delete the call to "stuff".  And
it will.

As I said, I would very much like to clean this stuff up, but I'd need
support from the HotSpot team, and at the moment I feel that this is
lacking.  Even if we do get rid of it, problems will remain for old
versions of OpenJDK for years.

HotSpot is a million lines of code, more or less.  We've found this
kind of problem in several places.
Auditing to show that we don't have such problems is a huge job, but
we should do it.  In the meantime, we should just consider some
compiler options to be a defence against an increasingly aggressive
compiler, and err on the side of safety.  We're not losing significant
performance because these optimizations are new and in many cases
simply delete code we want.

Andrew.
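
For illustration only, here is a minimal sketch of the null-register
pattern mentioned above and one possible well-defined alternative.
The names RegImpl, Reg, reg_table, kRax, kRcx and is_rax are
hypothetical and are not the real HotSpot declarations, and assert()
stands in for guarantee().  The idea, stated as an assumption rather
than as the HotSpot fix, is that if every register is the address of a
real object, the member call is valid and the comparison that follows
cannot be folded away.

```cpp
#include <cassert>

// Hypothetical stand-in for HotSpot's RegisterImpl; not the real class.
class RegImpl {
 public:
  explicit constexpr RegImpl(int encoding) : _encoding(encoding) {}
  bool is_valid() const { return 0 <= _encoding && _encoding < 16; }
  int encoding() const { return _encoding; }
 private:
  int _encoding;
};
typedef const RegImpl* Reg;

// Problematic shape from the message (shown only as a comment):
//   Register rax = (RegisterImpl*)0;
//   guarantee(reg->is_valid(), "must be");  // member call through reg
//   if (reg == rax) stuff...                // compiler may assume reg != 0

// Assumed alternative: each register is the address of a real object,
// so no register value is a null pointer.
static const RegImpl reg_table[] = { RegImpl(0), RegImpl(1) };
static const Reg kRax = &reg_table[0];
static const Reg kRcx = &reg_table[1];

// assert() stands in for HotSpot's guarantee() in this sketch.
bool is_rax(Reg reg) {
  assert(reg->is_valid());  // well defined: reg points at a real object
  return reg == kRax;       // the compiler cannot assume this is false
}
```

With this layout is_rax(kRax) is true and is_rax(kRcx) is false, and
neither result depends on -fno-delete-null-pointer-checks.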