Hi Ioi,
Changes looks really good. Comments interspersed below.
Thanks,
Lois
On 8/12/2020 6:06 PM, Ioi Lam wrote:
Hi Lois,
Thanks for the comments. I have an updated webrev
http://cr.openjdk.java.net/~iklam/jdk16/8244778-archive-full-module-graph.v02/
http://cr.openjdk.java.net/~iklam/jdk16/8244778-archive-full-module-graph.v02.delta/
Here are the general notes on the changes. Replies to your questions
are in-line below.
(1) Integrated updates in the Java code from Alan Bateman. Here are
Alan's
notes:
The archive of the boot layer is as before, except that archiving is
skipped if there are split packages or incubator modules. Incubating
modules aren't resolved by default so we shouldn't have them in the
boot layer by default anyway.
I've dropped the module finders from the boot layer archive. I've
left the IllegalAccessLogger.Builder in the acrhive for now (even
though it is not the boot layer). We should be able to remove that
once the JEP to disallow illegal access by default is in.
Related is that I don't like the archive for the module graph
(ArchivedModuleGraph, pre-dates this RFE) including the set of
packages to export/open for illegal access as they aren't part
of the module graph. I've left it for now but we can also remove
that once we disallow illegal access by default (as those sets
will be empty).
The archive of built-in class loaders is now in one object
jdk.internal.loader.ArchivedClassLoaders which I think will be a
bit more maintainable. I've also drop the ucp field from the
AppClassLoader as the changes to BuiltinClassLoader means is no
longer needs to duplicated.
There's one remaining issue in ModuleBootstrap.boot where it has fix
an app class loader value (ModuleLayer.CLV). Ideally the
initialization
of the built-in class loaders would do this but we are kinda forced
to separate the archiving of the built-in class loaders from the boot
layer. I might look at this again some time.
(2) Moved code from classLoaderData.cpp -> classLoaderSharedData.cpp
(3) Reverted unnecessary changes in JavaClasses::compute_offset
(4) Minor clean up to use QuickSort::sort() instead of qsort()
(5) Moved the C-side of module initialization for platform/system
loaders to inside java.lang.System::initPhase2(), so this happens
at the same time as without full module archiving.
(6) Moved the use of Module_lock to so all modules in a class loader
are restored atomically. See ArchivedClassLoaderData::restore()
This fixed a bug where test/jdk/com/sun/jdi/ModulesTest.java
would fail as it sees a partially restored set of modules.
On 8/7/20 12:06 PM, Lois Foltan wrote:
Hi Ioi,
Overall looks promising. I have some review comments below, but not
a complete review. I concentrated on the JVM side primarily how the
archived module graph is read in, how the ModuleEntry and
PackageEntry tables are created from the archive for the 3 builtin
loaders and potential timing issues during start up.
On 7/22/2020 3:36 PM, Ioi Lam wrote:
https://bugs.openjdk.java.net/browse/JDK-8244778
http://cr.openjdk.java.net/~iklam/jdk16/8244778-archive-full-module-graph.v01/
Please review this patch that stores the full module graph in the CDS
archive heap. This reduces the initialization time of the basic JVM by
about 22%:
$ perf stat -r 100 bin/java -version
before: 98,219,329 instructions 0.03971 secs elapsed (+- 0.44%)
after: 55,835,815 instructions 0.03109 secs elapsed (+- 0.65%)
[1] Start with ModuleBootstrap.java. The current implementation is
quite restrictive: the archived module graph is used only when no
module options are specified.
See ModuleBootstrap.mayUseArchivedBootLayer().
We can probably support options such as main module and module path
in a future RFE.
Yes, I noticed the restrictions. The JBS issue discusses the
possibility of using the dumped module graph in the event that the
value of the options are the same. I think for this implementation
to be viable and have a positive impact you should consider comparing
at least the options --add-modules, --add-exports and --add-reads.
I agree. I want to keep the changes minimal in this RFE, and will add
more support for other use cases in follow-on RFEs. Instead of
requiring these options to be unspecified, we can relax the
restriction such that these options must be the same between archive
dump time and run time.
Sounds good!
[2] In the current JDK implementation, there is no single object
that represents "the module graph". Most of the information
is stored in the archive bootLayer object, but a few additional
restoration operations need to be performed:
+ See ModuleBootstrap.getArchivedBootLayer()
+ Some static fields need to be archived/restored in
Module.java, BuiltinClassLoader.java, ClassLoaders.java
and BootLoader.java
[3] I ran into a complication with two loader instances of
PlatformClassLoader and AppClassLoader. They are stored in
multiple tables inside the module graph (e.g.,
BuiltinClassLoader$LoadedModule) so I cannot easily recreate
them at runtime.
However, these two loaders contain information specific to the
dump time VM lifecycle (such as the classes that were loaded
during CDS dumping) that need to be scrubbed. I couldn't find an
elegant way of doing this, so I added a private
"resetArchivedStates"
method to a few classes. They are called inside
HeapShared::reset_archived_object_states().
[4] Related native data structures (PackageEntry and ModuleEntry)
are also archived. Start with classLoaderData.cpp
Passes mach5 tiers 1-4. I will test with additional tiers.
Thanks
- Ioi
classfile/classLoader.cpp
- line #1644 pulling the javabase_moduleEntry() check outside of the
Module_lock maybe problematic if after you check it another
thread initializes in the time between the check and the obtaining
of the Module_lock on line #1645.
Fixed.
That looks good. I think it is fine that you are checking if java.base
is defined via a call to javabase_moduleEntry() because you are just
trying to determine if a ModuleEntry should be created or not. In most
cases though using ModuleEntryTable::javabase_defined() is the correct
way to ensure that both the ModuleEntry for java.base has been created
and that the ModuleEntry has been injected in the corresponding
java.lang.Module oop.
classfile/classLoader.hpp
- somewhere in ArchivedClassLoaderData there should be an assert to
make sure that the CLD, whose package entries and module entries are
being archived is not a "_has_class_mirror_holder" CLD for a weakly
defined hidden class. Those dedicated CLDs should never have package
entries or module entries.
I added ArchivedClassLoaderData::assert_valid()
classfile/moduleEntry.cpp
- line #400, typo "conver" --> "convert"
- line #500, maybe sort if n is greater than 1 instead of 0 (same
comment for packageEntry.cpp line #270)
Fixed
classfile/systemDictionary.cpp
- It looks like the PackageEntry and ModuleEntry tables for the
system and platform class loaders are added within
SystemDictionary::compute_java_loaders() which is called from
Thread::create_vm() but after the call in Thread::create_vm() to
call_initPhase2 which is when Universe::_module_initialized is set to
true. Did I read this correctly? If yes, then I think you have to
be sure that Universe::_module_initialized is not set until the
module graph for the 3 builtin loaders is loaded from the archive.
I moved the code to be called by ModuleBootstrap::boot() so it should
now happen inside call_initPhase2.
Yes, that looks good.
memory/filemap.cpp
- line #237 typo "modue" --> "module"
Fixed
- Since the CDS support for capturing the module graph does not
archive unnamed modules, I assume
Modules::set_bootloader_unnamed_module() is still called. Is this
true or does the unnamed module for the boot loader get archived?
The unnamed module for the boot loader is not archived.
Modules::set_bootloader_unnamed_module() wasn't called in my last
webrev. Thanks for catching it.
I added a call to BootLoader.getUnnamedModule() in
ModuleBootstrap::boot() to trigger <clinit> of BootLoader, which will
call into the VM for Modules::set_bootloader_unnamed_module().
Looks good.
Clarification/timing questions:
Here's an overall problem I am facing:
The module graph is archived after the module system has fully started
up. This means that for the boot loader, we only know the full set of
modules/packages, but we don't know which part is the subset that was
initialized during early VM bootstraping (e.g., when
ModuleEntryTable::javabase_defined() == false).
So the behavior is a bit different during the early bootstrap phase
(all the way before we reach ModuleBootstrap::boot()):
ClassLoaderData::the_null_class_loader_data()->modules() and
ClassLoaderData::the_null_class_loader_data()->packages() are already
populated before a single class is loaded.
If this is the case then, at the point when a ModuleEntry is created for
java.base using the archived module graph, there should be a assertion
that java_lang_Class::_fixup_module_field_list is NULL. To confirm no
class has been loaded before java.base is defined. Maybe in
ClassLoaderDataShared::serialize where you restore the boot loader's
archived modules?
This difference doesn't seem to make a practical difference. E.g.,
none of our code seems to assume that "before any classes in the
java.util package is loaded,
ClassLoaderData::the_null_class_loader_data()->packages() must not
contain an entry for java.util".
I think we have two choices when the archived module graph is used:
[1] We require that the state of the module system must, at every step
of VM initialization, be identical to that of a VM that doesn't use an
archived module graph.
[2] We make sure that the VM/JDK bootstrap code can tolerate the
existence of module/packages that are added earlier than a VM that
doesn't use an archived module graph.
I tried doing a version of [1] and found that to be too difficult. [2]
seems much simpler and is the approach I am using now.
I think [2] is reasonable.
oops/instanceKlass.cpp
- line #2545, comment "TO DO -- point it to the archived
PackageEntry (JDK-8249262)"
are you thinking that since the module graph is read in ahead of time
that it can be set when an InstanceKlass is created? There is a
point before java.base is defined that InstanceKlass::set_package
takes into account that could be a timing issue.
I think it will work as if another class in the same package has
already been defined.
- There are some checks in modules.cpp that are valuable to have
during bootstrapping. For example, packages that are encountered
during bootstrapping before java.base is defined are put into the
java.base module but then verified after java.base is defined via
verify_javabase_packages. So at what point in the bootstrapping
process does the boot loader's archived module's become known? Could
classes have been defined to the VM before this point? Then their
packages will have to be verified to be sure that they are indeed
packages within java.base. Consider looking at other checks that
occur within classfile/modules.cpp as well.
As mentioned above, calling verify_javabase_packages() at run time
will fail, as we have loaded all packages for the boot loader, not
just those for java.base.
I think not calling verify_javabase_packages works because as you stated
above no classes have been loaded before java.base is defined which is
not the same situation as running without the archived module graph.
A couple of additional comments:
- ModuleEntry's reads list and PackageEntry's exports list. We had
hoped eventually to change these lists from being a c heap allocated
GrowableArray over to a ResourceHashTable for faster lookup. Doing that
as a separate RFE might help CDS archiving not to have to archive those
lists as an Array?
- moduleEntry.cpp and packageEntry.cpp: Both methods
"compare_module_by_name" and "compare_package_by_name" should have an
assert if the names are equal. No ModuleEntryTable or PackageEntryTable
should ever have 2 same named modules or packages.
- Another timing clarification question for me. I assume that the
module graph is dumped post module initialization when Java has
completely defined the module graph to the JVM, is this correct? My
concern is that there could be a window post module initialization and
pre archiving the module graph where another thread could define a new
module or add Module readability or add Package exportability. So that
the module graph you are dumping is not the same module graph at the end
of module initialization. Is this a concern?
Thanks,
Lois
Well, verify_javabase_packages() was called once and it succeeded, but
that was during CDS dump time. So we know at least we have verified
this once :-)
Thanks
- Ioi
I may have more review comments as I continue to look at this!
Thanks,
Lois