Thanks Dan! /Erik
On 5 Jul 2019, at 21:53, Daniel D. Daugherty <daniel.daughe...@oracle.com> wrote: >> I'll file a follow up bug after the dust settles for 8227117. > > I filed the following: > > JDK-8227338 templateInterpreter.cpp: copy_table() needs to be safer > https://bugs.openjdk.java.net/browse/JDK-8227338 > > Dan > > >> On 7/5/19 1:07 PM, Daniel D. Daugherty wrote: >>> On 7/4/19 3:10 AM, Erik Österlund wrote: >>> Hi Dan, >>> >>> Thanks for picking this up. The change looks good. >> >> Thanks! Of course, just the size of the comment below makes me wonder >> what I got myself into... :-) And I was so happy that the non-logging >> part of the fix was an else-statement with _one_ line... >> >> >>> However, when reviewing this, I looked at the code for actually restoring >>> the table (ignore/notice safepoints). It copies the dispatch table for the >>> interpreter. There is a comment stating it is important the copying is >>> atomic for MT-safety, and I can definitely see why. However, the copying >>> the line after that comment is in fact not atomic. >> >> Actually, the comment doesn't mention 'atomic', but that's probably >> because the code and the comment are very, very old. It mentions >> 'word wise for MT safety' and I agree that 'atomic' is what the >> person likely meant... >> >> The history: >> >> $ sgv src/share/vm/interpreter/templateInterpreter.cpp | grep 'The copy has >> to occur word wise for MT safety' >> 1.1 // Copy non-overlapping tables. The copy has to occur word wise >> for MT safety. >> >> $ sp -r1.1 src/share/vm/interpreter/templateInterpreter.cpp >> src/share/vm/interpreter/SCCS/s.templateInterpreter.cpp: >> >> D 1.1 07/08/29 13:42:26 sgoldman 1 0 00600/00000/00000 >> MRs: >> COMMENTS: >> 6571248 - continuation_for is specialized for template interpreter >> >> Hmmm... I expected that comment to be even older... ahhhh... a little >> more poking around and I found: >> >> $ sgv -r1.147 src/share/vm/interpreter/interpreter.cpp | grep 'The copy has >> to occur word wise for MT safety' >> 1.147 // Copy non-overlapping tables. The copy has to occur word wise >> for MT safety. >> >> $ sp -r1.147 src/share/vm/interpreter/interpreter.cpp >> src/share/vm/interpreter/SCCS/s.interpreter.cpp: >> >> D 1.147 99/02/17 10:14:36 steffen 235 233 00008/00002/00762 >> MRs: >> COMMENTS: >> >> This makes more sense (timeline wise) and dates back to when all >> of the interpreter was in vm/interpreter/interpreter.cpp. >> >> >>> Here is the copying code in templateInterpreter.cpp: >>> >>> static inline void copy_table(address* from, address* to, int size) { >>> // Copy non-overlapping tables. The copy has to occur word wise for MT >>> safety. >>> while (size-- > 0) *to++ = *from++; >>> } >>> >>> Copying using a loop of non-volatile loads and stores can and definitely >>> will on some compilers turn into memcpy calls instead as the compiler >>> (correctly) considers that an equivalent transformation. >> >> Yet another C++ compiler optimization land mine... sigh... >> >> >>> And memcpy does not guarantee atomicity. Indeed on some platforms it is not >>> atomic. On some platforms it will even enjoy out-of-thin-air values. >> >> That last bit is scary... >> >> >>> Perhaps Copy::disjoint_words_atomic() would be a better choice for atomic >>> word copying. If not, at the very least we should use Atomic::load/store >>> here. >> >> Copy::disjoint_words_atomic() sounds appealing... >> >> For those folks that aren't familiar with this part of safepointing... >> >> SafepointSynchronize::arm_safepoint() calls Interpreter::notice_safepoints() >> which calls calls copy_table(). So we're not at a safepoint yet, and, in >> fact, >> we're trying to bring those pesky JavaThreads to a safepoint... >> >> SafepointSynchronize::disarm_safepoint() calls >> Interpreter::ignore_safepoints() >> which also calls copy_table(). However, we did that before we have woken the >> JavaThreads that are blocked for the safepoint so that use of copy_table is >> safe: >> >> >> // Release threads lock, so threads can be created/destroyed again. >> Threads_lock->unlock(); >> >> // Wake threads after local state is correctly set. >> _wait_barrier->disarm(); >> } >> >> The 'Threads_lock->unlock()' should synchronize memory so that the restored >> table should be properly synced out to memory... >> >> >>> Having said that, the fix for that issue seems like a separate RFE, because >>> it has been sitting there for a lot longer than TLH has been around. >> >> Yes I would like to keep the copy_table() issue for a separate bug (not RFE). >> I'll file a follow up bug after the dust settles for 8227117. >> >> Thanks again for the review! >> >> Dan >> >>> >>> Thanks, >>> /Erik >>> >>>> On 2019-07-04 04:04, Daniel D. Daugherty wrote: >>>> Greetings, >>>> >>>> Robbin recently discovered this issue with Thread Local Handshakes. Since >>>> he's not available at the moment, I'm handling the issue: >>>> >>>> JDK-8227117 normal interpreter table is not restored after single >>>> stepping with TLH >>>> https://bugs.openjdk.java.net/browse/JDK-8227117 >>>> >>>> When using Thread Local Handshakes, the normal interpreter table is >>>> not restored after single stepping. This issue is caused by the >>>> VM_ChangeSingleStep VM-op relying on SafepointSynchronize::end() to >>>> restore the normal interpreter table for the "off" case. >>>> >>>> Prior to Thread Local Handshakes, this was a valid assumption to make. >>>> SafepointSynchronize::end() has been refactored into >>>> disarm_safepoint() and it only calls Interpreter::ignore_safepoints() >>>> on the global safepoint branch. That matches up with the call to >>>> Interpreter::notice_safepoints() that is also on the global safepoint >>>> branch. >>>> >>>> The solution is for the VM_ChangeSingleStep VM-op for the "off" case >>>> to call Interpreter::ignore_safepoints() directly. >>>> >>>> Here's the webrev URL: >>>> >>>> http://cr.openjdk.java.net/~dcubed/8227117-webrev/0_for_jdk14/ >>>> >>>> The fix is just a small addition to VM_ChangeSingleStep::doit(): >>>> >>>> if (_on) { >>>> Interpreter::notice_safepoints(); >>>> + } else { >>>> + Interpreter::ignore_safepoints(); >>>> } >>>> >>>> Everything else is just new logging support for future debugging of >>>> interpreter table management and single stepping. >>>> >>>> Tested this fix with Mach5 Tier[1-3] on the standard Oracle platforms. >>>> Mach5 Tier[4-6] on standard Oracle platforms is running now. >>>> >>>> Thanks, in advance, for questions, comments or suggestions. >>>> >>>> Dan >>>> >> >> >