On Fri, 23 Apr 2021 19:48:47 GMT, Kim Barrett <kbarr...@openjdk.org> wrote:
> Please review this change to the String Deduplication facility. It is being > changed to use OopStorage to hold weak references to relevant objects, > rather than bespoke data structures requiring dedicated processing phases by > supporting GCs. > > (The Shenandoah update was contributed by Zhengyu Gu.) > > This change significantly simplifies the interface between a given GC and > the String Deduplication facility, which should make it much easier for > other GCs to opt in. However, this change does not alter the set of GCs > providing support; currently only G1 and Shenandoah support string > deduplication. Adding support by other GCs will be in followup RFEs. > > Reviewing via the diffs might not be all that useful for some parts, as > several files have been essentially completely replaced, and a number of > files have been added or eliminated. The full webrev might be a better > place to look. > > The major changes are in gc/shared/stringdedup/* and in the supporting > collectors, but there are also some smaller changes in other places, most > notably classfile/{stringTable,javaClasses}. > > This change is additionally labeled for review by core-libs, although there > are no source changes there. This change injects a byte field of bits into > java.lang.String, using one of the previously unused padding bytes in that > class. (There were two unused bytes, now only one.) > > Testing: > mach5 tier1-2 with and without -XX:+UseStringDeduplication > > Locally (linux-x64) ran all of the existing tests that use string > deduplication with both G1 and Shenandoah. Note that > TestStringDeduplicationInterned.java is disabled for shenandoah, as it > currently fails, for reasons I haven't figured out but suspect are test > related. > > - gc/stringdedup/ -- these used to be in gc/g1 > - runtime/cds/SharedStringsDedup.java > - runtime/cds/appcds/cacheObject/DifferentHeapSizes.java > - runtime/cds/appcds/sharedStrings/* > > shenandoah-only: > - gc/shenandoah/jvmti/TestHeapDump.java > - gc/shenandoah/TestStringDedup.java > - gc/shenandoah/TestStringDedupStress.java > > Performance tested baseline, baseline + stringdedup, new with stringdedup, > finding no significant differences. I looked at the runtime code, which looks fine. I didn't read the GC code. src/hotspot/share/classfile/javaClasses.inline.hpp line 77: > 75: > 76: uint8_t* java_lang_String::flags_addr(oop java_string) { > 77: assert(_initialized, "Mut be initialized"); typo: must src/hotspot/share/classfile/stringTable.cpp line 784: > 782: SharedStringIterator iter(f); > 783: _shared_table.iterate(&iter); > 784: } So with this change (somewhere below) do you no longer deduplicate strings from the shared string table? ie // The CDS archive does not include the string deduplication table. Only the string // table is saved in the archive. The shared strings from CDS archive need to be // added to the string deduplication table before deduplication occurs. That is // done in the beginning of the StringDedupThread (see StringDedupThread::do_deduplication()). void StringDedupThread::deduplicate_shared_strings(StringDedupStat* stat) { StringDedupSharedClosure sharedStringDedup(stat); StringTable::shared_oops_do(&sharedStringDedup); } Asking @iklam to have a look then. src/hotspot/share/runtime/globals.hpp line 1994: > 1992: > \ > 1993: product(uint64_t, StringDeduplicationHashSeed, 0, DIAGNOSTIC, > \ > 1994: "Seed for the table hashing function; 0 requests computed > seed") \ Should these two really be develop() options? src/hotspot/share/runtime/mutexLocker.cpp line 239: > 237: def(StringDedupQueue_lock , PaddedMonitor, leaf, true, > _safepoint_check_never); > 238: def(StringDedupTable_lock , PaddedMutex , leaf + 1, true, > _safepoint_check_never); > 239: } Why weren't these duplicate definitions? This is disturbing. ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3662