On 8/31/2020 11:31 AM, Ioi Lam wrote:
Hi Lois,
Thanks for the review. Your comments has led me to discover a couple
of pretty serious issues, which hopefully I have fixed. Also the code
is cleaner now, with much fewer control-flow changes than the last
webrev.
http://cr.openjdk.java.net/~iklam/jdk16/8244778-archive-full-module-graph.v03/
http://cr.openjdk.java.net/~iklam/jdk16/8244778-archive-full-module-graph.v03.delta/
Hi Ioi,
Looks very good, thanks for making those changes. One minor comment
here then another comment below when you discuss the addition of
check_cds_restrictions().
Minor nit in moduleEntry.cpp & packageEntry.cpp when dealing with the
ModuleEntry's reads list and a PackageEntry's exports list. The names
of the methods to write and read those arrays is somewhat confusing.
ModuleEntry::write_archived_entry_array
ModuleEntry::read_archived_entry_array
At first I thought you were reading/writing an array of archived
entries, not the array within an archived entry itself. I was trying to
think of a better name. Please consider adding a comment at line #400 &
line #417 ahead of those methods in moduleEntry.cpp to indicate that
they are used for both reading/writing a ModuleEntry's reads list and a
PackageEntry's exports list.
Please see my comments in-line.
On 8/25/20 7:58 AM, Lois Foltan wrote:
Hi Ioi,
Changes looks really good. Comments interspersed below.
Thanks,
Lois
On 8/12/2020 6:06 PM, Ioi Lam wrote:
Hi Lois,
Thanks for the comments. I have an updated webrev
http://cr.openjdk.java.net/~iklam/jdk16/8244778-archive-full-module-graph.v02/
http://cr.openjdk.java.net/~iklam/jdk16/8244778-archive-full-module-graph.v02.delta/
Here are the general notes on the changes. Replies to your questions
are in-line below.
(1) Integrated updates in the Java code from Alan Bateman. Here are
Alan's
notes:
The archive of the boot layer is as before, except that
archiving is
skipped if there are split packages or incubator modules.
Incubating
modules aren't resolved by default so we shouldn't have them in the
boot layer by default anyway.
I've dropped the module finders from the boot layer archive. I've
left the IllegalAccessLogger.Builder in the acrhive for now (even
though it is not the boot layer). We should be able to remove that
once the JEP to disallow illegal access by default is in.
Related is that I don't like the archive for the module graph
(ArchivedModuleGraph, pre-dates this RFE) including the set of
packages to export/open for illegal access as they aren't part
of the module graph. I've left it for now but we can also remove
that once we disallow illegal access by default (as those sets
will be empty).
The archive of built-in class loaders is now in one object
jdk.internal.loader.ArchivedClassLoaders which I think will be a
bit more maintainable. I've also drop the ucp field from the
AppClassLoader as the changes to BuiltinClassLoader means is no
longer needs to duplicated.
There's one remaining issue in ModuleBootstrap.boot where it has
fix
an app class loader value (ModuleLayer.CLV). Ideally the
initialization
of the built-in class loaders would do this but we are kinda forced
to separate the archiving of the built-in class loaders from the
boot
layer. I might look at this again some time.
(2) Moved code from classLoaderData.cpp -> classLoaderSharedData.cpp
(3) Reverted unnecessary changes in JavaClasses::compute_offset
(4) Minor clean up to use QuickSort::sort() instead of qsort()
(5) Moved the C-side of module initialization for platform/system
loaders to inside java.lang.System::initPhase2(), so this happens
at the same time as without full module archiving.
(6) Moved the use of Module_lock to so all modules in a class loader
are restored atomically. See ArchivedClassLoaderData::restore()
This fixed a bug where test/jdk/com/sun/jdi/ModulesTest.java
would fail as it sees a partially restored set of modules.
On 8/7/20 12:06 PM, Lois Foltan wrote:
Hi Ioi,
Overall looks promising. I have some review comments below, but
not a complete review. I concentrated on the JVM side primarily
how the archived module graph is read in, how the ModuleEntry and
PackageEntry tables are created from the archive for the 3 builtin
loaders and potential timing issues during start up.
On 7/22/2020 3:36 PM, Ioi Lam wrote:
https://bugs.openjdk.java.net/browse/JDK-8244778
http://cr.openjdk.java.net/~iklam/jdk16/8244778-archive-full-module-graph.v01/
Please review this patch that stores the full module graph in the CDS
archive heap. This reduces the initialization time of the basic
JVM by
about 22%:
$ perf stat -r 100 bin/java -version
before: 98,219,329 instructions 0.03971 secs elapsed (+- 0.44%)
after: 55,835,815 instructions 0.03109 secs elapsed (+- 0.65%)
[1] Start with ModuleBootstrap.java. The current implementation is
quite restrictive: the archived module graph is used only when no
module options are specified.
See ModuleBootstrap.mayUseArchivedBootLayer().
We can probably support options such as main module and module
path
in a future RFE.
Yes, I noticed the restrictions. The JBS issue discusses the
possibility of using the dumped module graph in the event that the
value of the options are the same. I think for this implementation
to be viable and have a positive impact you should consider
comparing at least the options --add-modules, --add-exports and
--add-reads.
I agree. I want to keep the changes minimal in this RFE, and will
add more support for other use cases in follow-on RFEs. Instead of
requiring these options to be unspecified, we can relax the
restriction such that these options must be the same between archive
dump time and run time.
Sounds good!
[2] In the current JDK implementation, there is no single object
that represents "the module graph". Most of the information
is stored in the archive bootLayer object, but a few additional
restoration operations need to be performed:
+ See ModuleBootstrap.getArchivedBootLayer()
+ Some static fields need to be archived/restored in
Module.java, BuiltinClassLoader.java, ClassLoaders.java
and BootLoader.java
[3] I ran into a complication with two loader instances of
PlatformClassLoader and AppClassLoader. They are stored in
multiple tables inside the module graph (e.g.,
BuiltinClassLoader$LoadedModule) so I cannot easily recreate
them at runtime.
However, these two loaders contain information specific to the
dump time VM lifecycle (such as the classes that were loaded
during CDS dumping) that need to be scrubbed. I couldn't find an
elegant way of doing this, so I added a private
"resetArchivedStates"
method to a few classes. They are called inside
HeapShared::reset_archived_object_states().
[4] Related native data structures (PackageEntry and ModuleEntry)
are also archived. Start with classLoaderData.cpp
Passes mach5 tiers 1-4. I will test with additional tiers.
Thanks
- Ioi
classfile/classLoader.cpp
- line #1644 pulling the javabase_moduleEntry() check outside of
the Module_lock maybe problematic if after you check it another
thread initializes in the time between the check and the
obtaining of the Module_lock on line #1645.
Fixed.
That looks good. I think it is fine that you are checking if
java.base is defined via a call to javabase_moduleEntry() because you
are just trying to determine if a ModuleEntry should be created or
not. In most cases though using ModuleEntryTable::javabase_defined()
is the correct way to ensure that both the ModuleEntry for java.base
has been created and that the ModuleEntry has been injected in the
corresponding java.lang.Module oop.
classfile/classLoader.hpp
- somewhere in ArchivedClassLoaderData there should be an assert to
make sure that the CLD, whose package entries and module entries
are being archived is not a "_has_class_mirror_holder" CLD for a
weakly defined hidden class. Those dedicated CLDs should never
have package entries or module entries.
I added ArchivedClassLoaderData::assert_valid()
classfile/moduleEntry.cpp
- line #400, typo "conver" --> "convert"
- line #500, maybe sort if n is greater than 1 instead of 0 (same
comment for packageEntry.cpp line #270)
Fixed
classfile/systemDictionary.cpp
- It looks like the PackageEntry and ModuleEntry tables for the
system and platform class loaders are added within
SystemDictionary::compute_java_loaders() which is called from
Thread::create_vm() but after the call in Thread::create_vm() to
call_initPhase2 which is when Universe::_module_initialized is set
to true. Did I read this correctly? If yes, then I think you have
to be sure that Universe::_module_initialized is not set until the
module graph for the 3 builtin loaders is loaded from the archive.
I moved the code to be called by ModuleBootstrap::boot() so it
should now happen inside call_initPhase2.
Yes, that looks good.
memory/filemap.cpp
- line #237 typo "modue" --> "module"
Fixed
- Since the CDS support for capturing the module graph does not
archive unnamed modules, I assume
Modules::set_bootloader_unnamed_module() is still called. Is this
true or does the unnamed module for the boot loader get archived?
The unnamed module for the boot loader is not archived.
Modules::set_bootloader_unnamed_module() wasn't called in my last
webrev. Thanks for catching it.
I added a call to BootLoader.getUnnamedModule() in
ModuleBootstrap::boot() to trigger <clinit> of BootLoader, which
will call into the VM for Modules::set_bootloader_unnamed_module().
Looks good.
Clarification/timing questions:
Here's an overall problem I am facing:
The module graph is archived after the module system has fully
started up. This means that for the boot loader, we only know the
full set of modules/packages, but we don't know which part is the
subset that was initialized during early VM bootstraping (e.g., when
ModuleEntryTable::javabase_defined() == false).
So the behavior is a bit different during the early bootstrap phase
(all the way before we reach ModuleBootstrap::boot()):
ClassLoaderData::the_null_class_loader_data()->modules() and
ClassLoaderData::the_null_class_loader_data()->packages() are
already populated before a single class is loaded.
If this is the case then, at the point when a ModuleEntry is created
for java.base using the archived module graph, there should be a
assertion that java_lang_Class::_fixup_module_field_list is NULL. To
confirm no class has been loaded before java.base is defined. Maybe
in ClassLoaderDataShared::serialize where you restore the boot
loader's archived modules?
Thanks for pointing this out. It turned out that in the last webrev
(v02), I had a bug where the module of the primitive classes were not
initialized. Now I have changed the initialization code to behave the
same whether archived full module graph is used or not. The only
differences are:
[1] ClassLoader::create_javabase():
ModuleEntryTable::javabase_moduleEntry() may be inited by CDS.
[2] When archived full module graph is used,
ModuleEntryTable::patch_javabase_entries is called from
Modules::define_archived_modules.
(java_lang_Class::_fixup_module_field_list is used within this call.)
I also added a new test case: cds/PrimitiveClassMirrors.java
Good test, I'm glad you added that!
This difference doesn't seem to make a practical difference. E.g.,
none of our code seems to assume that "before any classes in the
java.util package is loaded,
ClassLoaderData::the_null_class_loader_data()->packages() must not
contain an entry for java.util".
I think we have two choices when the archived module graph is used:
[1] We require that the state of the module system must, at every
step of VM initialization, be identical to that of a VM that doesn't
use an archived module graph.
[2] We make sure that the VM/JDK bootstrap code can tolerate the
existence of module/packages that are added earlier than a VM that
doesn't use an archived module graph.
I tried doing a version of [1] and found that to be too difficult.
[2] seems much simpler and is the approach I am using now.
I think [2] is reasonable.
oops/instanceKlass.cpp
- line #2545, comment "TO DO -- point it to the archived
PackageEntry (JDK-8249262)"
are you thinking that since the module graph is read in ahead of
time that it can be set when an InstanceKlass is created? There is
a point before java.base is defined that InstanceKlass::set_package
takes into account that could be a timing issue.
I think it will work as if another class in the same package has
already been defined.
- There are some checks in modules.cpp that are valuable to have
during bootstrapping. For example, packages that are encountered
during bootstrapping before java.base is defined are put into the
java.base module but then verified after java.base is defined via
verify_javabase_packages. So at what point in the bootstrapping
process does the boot loader's archived module's become known?
Could classes have been defined to the VM before this point? Then
their packages will have to be verified to be sure that they are
indeed packages within java.base. Consider looking at other checks
that occur within classfile/modules.cpp as well.
As mentioned above, calling verify_javabase_packages() at run time
will fail, as we have loaded all packages for the boot loader, not
just those for java.base.
I think not calling verify_javabase_packages works because as you
stated above no classes have been loaded before java.base is defined
which is not the same situation as running without the archived
module graph.
A couple of additional comments:
- ModuleEntry's reads list and PackageEntry's exports list. We had
hoped eventually to change these lists from being a c heap allocated
GrowableArray over to a ResourceHashTable for faster lookup. Doing
that as a separate RFE might help CDS archiving not to have to
archive those lists as an Array?
CDS cannot archive ResourceHashTable either. We have CompactHashtable
which can be archived, but it cannot be modified at run time. I think
the export list can be modified at run time with
java.lang.Module::addExports(), so we probably need to do it like:
- if a Module was archived, check its CompactHashtable first
- if not found, check the ResourceHashTable
This would make the start-up a little faster (no more copying from the
array lists into the hashtable, but would make the code more
complicated. I need to investigate to see if it's worthwhile.
- moduleEntry.cpp and packageEntry.cpp: Both methods
"compare_module_by_name" and "compare_package_by_name" should have an
assert if the names are equal. No ModuleEntryTable or
PackageEntryTable should ever have 2 same named modules or packages.
I tried adding:
static int compare_package_by_name(PackageEntry* a, PackageEntry* b) {
assert(a != b && a->name() != b->name(), "array should not contain
duplicated entries");
return a->name()->fast_compare(b->name());
}
but this caused an assert, because our QuickSort implementation would
sometimes compare the same element!
#7 0x00007ffff659238b in QuickSort::partition<true, PackageEntry*,
int (*)(void const*, void const*)> (array=0x80043fcb0,
pivot=248, length=496, comparator=0x7ffff6590a5a
<compare_package_by_name(PackageEntry*, PackageEntry*)>)
at /jdk2/gil/open/src/hotspot/share/utilities/quickSort.hpp:76
76 for ( ; comparator(array[left_index], pivot_val) < 0;
++left_index) {
(gdb) p array[left_index]
$1 = (PackageEntry *) 0x7ffff0439d00
(gdb) p pivot_val
$2 = (PackageEntry *) 0x7ffff0439d00
(gdb) p pivot
$3 = 248
(gdb) p left_index
$4 = 248
So I ended up with:
static int compare_package_by_name(PackageEntry* a, PackageEntry* b) {
assert(a == b || a->name() != b->name(), "no duplicated names");
return a->name()->fast_compare(b->name());
}
Looks good to me!
- Another timing clarification question for me. I assume that the
module graph is dumped post module initialization when Java has
completely defined the module graph to the JVM, is this correct?
Yes
My concern is that there could be a window post module initialization
and pre archiving the module graph where another thread could define
a new module or add Module readability or add Package exportability.
So that the module graph you are dumping is not the same module graph
at the end of module initialization. Is this a concern?
We don't allow arbitrary code to be executed during -Xshare:dump, so
the module graph shouldn't be changed after module initialization has
finished.
I've added Modules::check_cds_restrictions() to check for this.
A question about this because a user's program can define modules post
module initialization via ModuleDescriptor.newModule(). See for
example, tests within
open/test/hotspot/jtreg/runtime/module/AccessCheck. So all of these
tests would trigger check_cds_restrictions() if -Xshare:dump was turned
on. Is that a concern?
Thanks,
Lois
I also added asserts to make sure that none of the classes used by the
archived module graph can be modified by JVMTI. All these classes are
loaded in the "early" phase of JVMTI, and we would disable CDS if a
JVMTI agent is registered to modify classes in this phase, so we
should be safe, but the asserts will ensure that. I updated the
ReplaceCriticalClassesForSubgraphs.java test and added a new test
RedefineClassesInModuleGraph.java.
Thanks
- Ioi
Thanks,
Lois
Well, verify_javabase_packages() was called once and it succeeded,
but that was during CDS dump time. So we know at least we have
verified this once :-)
Thanks
- Ioi
I may have more review comments as I continue to look at this!
Thanks,
Lois