On Fri, 12 Jul 2024 05:57:30 GMT, Axel Boldt-Christmas <abold...@openjdk.org> 
wrote:

>> When inflating a monitor the `ObjectMonitor*` is written directly over the 
>> `markWord` and any overwritten data is displaced into a displaced 
>> `markWord`. This is problematic for concurrent GCs which needs extra care or 
>> looser semantics to use this displaced data. In Lilliput this data also 
>> contains the klass forcing this to be something that the GC has to take into 
>> account everywhere.
>> 
>> This patch introduces an alternative solution where locking only uses the 
>> lock bits of the `markWord` and inflation does not override and displace the 
>> `markWord`. This is done by keeping associations between objects and 
>> `ObjectMonitor*` in an external hash table. Different caching techniques are 
>> used to speedup lookups from compiled code.
>> 
>> A diagnostic VM option is introduced called `UseObjectMonitorTable`. It is 
>> only supported in combination with the LM_LIGHTWEIGHT locking mode (the 
>> default). 
>> 
>> This patch has been evaluated to be performance neutral when 
>> `UseObjectMonitorTable` is turned off (the default). 
>> 
>> Below is a more detailed explanation of this change and how `LM_LIGHTWEIGHT` 
>> and `UseObjectMonitorTable` works.
>> 
>> # Cleanups
>> 
>> Cleaned up displaced header usage for:
>>   * BasicLock
>>     * Contains some Zero changes
>>     * Renames one exported JVMCI field
>>   * ObjectMonitor
>>     * Updates comments and tests consistencies
>> 
>> # Refactoring
>> 
>> `ObjectMonitor::enter` has been refactored an a 
>> `ObjectMonitorContentionMark` witness object has been introduced to the 
>> signatures. Which signals that the contentions reference counter is being 
>> held. More details are given below in the section about deflation.
>> 
>> The initial purpose of this was to allow `UseObjectMonitorTable` to interact 
>> more seamlessly with the `ObjectMonitor::enter` code. 
>> 
>> _There is even more `ObjectMonitor` refactoring which can be done here to 
>> create a more understandable and enforceable API. There are a handful of 
>> invariants / assumptions which are not always explicitly asserted which 
>> could be trivially abstracted and verified by the type system by using 
>> similar witness objects._
>> 
>> # LightweightSynchronizer
>> 
>> Working on adapting and incorporating the following section as a comment in 
>> the source code
>> 
>> ## Fast Locking
>> 
>>   CAS on locking bits in markWord. 
>>   0b00 (Fast Locked) <--> 0b01 (Unlocked)
>> 
>>   When locking and 0b00 (Fast Locked) is observed, it may be beneficial to 
>> avoid inflating by spinning a bit.
>> 
>>   If 0b10 (Inflated) is observed or there is to...
>
> Axel Boldt-Christmas has updated the pull request incrementally with one 
> additional commit since the last revision:
> 
>   Update arguments.cpp

Here comes my first-pass review of the shared code.
(Man, I hope we can get rid of UOMT soon, again...)

src/hotspot/share/oops/instanceKlass.cpp line 1090:

> 1088: 
> 1089:     // Step 2
> 1090:     // If we were to use wait() instead of waitUninterruptibly() then

This is a nice correction (even though, the actual call below is 
wait_uninterruptibly() ;-) ), but seems totally unrelated.

src/hotspot/share/oops/markWord.cpp line 27:

> 25: #include "precompiled.hpp"
> 26: #include "oops/markWord.hpp"
> 27: #include "runtime/basicLock.inline.hpp"

I don't think this include is needed (at least not by the changed code parts, I 
haven't checked existing code).

src/hotspot/share/runtime/arguments.cpp line 1820:

> 1818:     warning("New lightweight locking not supported on this platform");
> 1819:   }
> 1820:   if (UseObjectMonitorTable) {

Uhm, wait a second. That list of platforms covers all existing platforms 
anyway, so the whole block could be removed? Or is there a deeper meaning here 
that I don't understand?

src/hotspot/share/runtime/basicLock.cpp line 37:

> 35:     if (mon != nullptr) {
> 36:       mon->print_on(st);
> 37:     }

I am not sure if we wanted to do this, but we know the owner, therefore we 
could also look-up the OM from the table, and print it. It wouldn't have all 
that much to do with the BasicLock, though.

src/hotspot/share/runtime/basicLock.inline.hpp line 45:

> 43:   return reinterpret_cast<ObjectMonitor*>(get_metadata());
> 44: #else
> 45:   // Other platforms does not make use of the cache yet,

If it's not used, why does it matter to special case the code here?

src/hotspot/share/runtime/lightweightSynchronizer.cpp line 28:

> 26: 
> 27: #include "classfile/vmSymbols.hpp"
> 28: #include "javaThread.inline.hpp"

This include is incorrect (and my IDE says it's not needed).

src/hotspot/share/runtime/lightweightSynchronizer.cpp line 31:

> 29: #include "jfrfiles/jfrEventClasses.hpp"
> 30: #include "logging/log.hpp"
> 31: #include "logging/logStream.hpp"

Include of logStream.hpp not needed?

src/hotspot/share/runtime/lightweightSynchronizer.cpp line 58:

> 56: 
> 57: //
> 58: // Lightweight synchronization.

This comment doesn't really say anything. Either remove it, or add a nice 
summary of how LW locking and OM table stuff works.

src/hotspot/share/runtime/lightweightSynchronizer.cpp line 80:

> 78: 
> 79:   ConcurrentTable* _table;
> 80:   volatile size_t _table_count;

Looks like a misnomer to me. We only have one table, but we do have N 
entries/nodes. This is counted when new nodes are allocated or old nodes are 
freed. Consider renaming this to '_entry_count' or '_node_count'? I'm actually 
a bit surprised if ConcurrentHashTable doesn't already track this...

src/hotspot/share/runtime/lightweightSynchronizer.cpp line 88:

> 86: 
> 87:   public:
> 88:     Lookup(oop obj) : _obj(obj) {}

Make explicit?

src/hotspot/share/runtime/lightweightSynchronizer.cpp line 97:

> 95: 
> 96:     bool equals(ObjectMonitor** value) {
> 97:       // The entry is going to be removed soon.

What does this comment mean?

src/hotspot/share/runtime/lightweightSynchronizer.cpp line 112:

> 110: 
> 111:   public:
> 112:     LookupMonitor(ObjectMonitor* monitor) : _monitor(monitor) {}

Make explicit?

src/hotspot/share/runtime/lightweightSynchronizer.cpp line 159:

> 157:   static size_t min_log_size() {
> 158:     // ~= log(AvgMonitorsPerThreadEstimate default)
> 159:     return 10;

Uh wait - are we assuming that threads hold 1024 monitors *on average* ? Isn't 
this a bit excessive? I would have thought maybe 8 monitors/thread. Yes there 
are workloads that are bonkers. Or maybe the comment/flag name does not say 
what I think it says.

Or why not use AvgMonitorsPerThreadEstimate directly?

src/hotspot/share/runtime/lightweightSynchronizer.cpp line 349:

> 347:   assert(LockingMode == LM_LIGHTWEIGHT, "must be");
> 348: 
> 349:   if (try_read) {

All the callers seem to pass try_read = true. Why do we have the branch at all?

src/hotspot/share/runtime/lightweightSynchronizer.cpp line 401:

> 399: 
> 400:   if (inserted) {
> 401:     // Hopefully the performance counters are allocated on distinct

It doesn't look like the counters are on distinct cache lines (see 
objectMonitor.hpp, lines 212ff). If this is a concern, file a bug to 
investigate it later? The comment here is a bit misplaced, IMO.

src/hotspot/share/runtime/lightweightSynchronizer.cpp line 473:

> 471:   int _length;
> 472: 
> 473:   void do_oop(oop* o) final {

C++ always provides something to learn - C++ has got a final keyword! :-) Looks 
like a reasonable use of it here, though.

src/hotspot/share/runtime/lightweightSynchronizer.cpp line 477:

> 475:     if (obj->mark_acquire().has_monitor()) {
> 476:       if (_length > 0 && _contended_oops[_length-1] == obj) {
> 477:         // assert(VM_Version::supports_recursive_lightweight_locking(), 
> "must be");

Uncomment or remove assert?

src/hotspot/share/runtime/lightweightSynchronizer.cpp line 554:

> 552:   bool _no_safepoint;
> 553:   union {
> 554:     struct {} _dummy;

Uhh ... Why does this need to be wrapped in a union and struct?

src/hotspot/share/runtime/lightweightSynchronizer.cpp line 563:

> 561:     assert(locking_thread == current || 
> locking_thread->is_obj_deopt_suspend(), "locking_thread may not run 
> concurrently");
> 562:     if (_no_safepoint) {
> 563:       ::new (&_nsv) NoSafepointVerifier();

I'm thinking that it might be easier and cleaner to just re-do what the 
NoSafepointVerifier does? It just calls thread->inc/dec
_no_safepoint_count().

src/hotspot/share/runtime/lightweightSynchronizer.cpp line 748:

> 746:   }
> 747: 
> 748:   // Fast-locking does not use the 'lock' argument.

I believe the comment is outdated.

src/hotspot/share/runtime/lightweightSynchronizer.cpp line 969:

> 967: 
> 968:   for (;;) {
> 969:   // Fetch the monitor from the table

Wrong intendation.

src/hotspot/share/runtime/lightweightSynchronizer.cpp line 1157:

> 1155:   // enter can block for safepoints; clear the unhandled object oop
> 1156:   PauseNoSafepointVerifier pnsv(&nsv);
> 1157:   object = nullptr;

What is the point of that statement? object is not an out-arg (afaict), and not 
used subsequently.

src/hotspot/share/runtime/lightweightSynchronizer.hpp line 68:

> 66:   static void exit(oop object, JavaThread* current);
> 67: 
> 68:   static ObjectMonitor* inflate_into_object_header(Thread* current, 
> JavaThread* inflating_thread, oop object, const 
> ObjectSynchronizer::InflateCause cause);

My IDE flags this with a warning 'Parameter 'cause' is const-qualified in the 
function declaration; const-qualification of parameters only has an effect in 
function definitions' *shrugs*

src/hotspot/share/runtime/lockStack.inline.hpp line 232:

> 230:   oop obj = monitor->object_peek();
> 231:   assert(obj != nullptr, "must be alive");
> 232:   assert(monitor == 
> LightweightSynchronizer::get_monitor_from_table(JavaThread::current(), obj), 
> "must be exist in table");

"must be exist in table" -> "must exist in table"

src/hotspot/share/runtime/objectMonitor.cpp line 56:

> 54: #include "runtime/safepointMechanism.inline.hpp"
> 55: #include "runtime/sharedRuntime.hpp"
> 56: #include "runtime/synchronizer.hpp"

This include is not used.

src/hotspot/share/runtime/objectMonitor.hpp line 193:

> 191:   ObjectWaiter* volatile _WaitSet;  // LL of threads wait()ing on the 
> monitor
> 192:   volatile int  _waiters;           // number of waiting threads
> 193:  private:

You can now also remove the 'private:' here

src/hotspot/share/runtime/synchronizer.cpp line 390:

> 388: 
> 389: static bool useHeavyMonitors() {
> 390: #if defined(X86) || defined(AARCH64) || defined(PPC64) || 
> defined(RISCV64) || defined(S390)

Why are those if-defs here? Why is ARM excluded?

-------------

Changes requested by rkennke (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/20067#pullrequestreview-2174478048
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1675695457
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1675696406
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1675704824
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1675707735
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1675711809
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1675744474
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1675745048
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1676111067
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1675773683
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1675747483
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1675765460
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1675766088
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1675781420
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1675791687
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1675799897
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1675803217
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1675805690
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1675824394
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1675832868
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1675854207
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1675876915
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1675932005
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1675936943
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1676107048
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1676112375
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1676125325
PR Review Comment: https://git.openjdk.org/jdk/pull/20067#discussion_r1676140201

Reply via email to