Hi Thomas,

Thanks for looking into this in detail.

Actually the problem is not as bad as you think (I hope ....). First of all, this patch is intended for -Xshare:dump only (aka static CDS archive). Dynamic CDS archive (-XX:ArchiveClassesAtExit) is much harder to make deterministic because of concurrent thread execution. I've updated the RFE title to "Generation of classes.jsa with -Xshare:dump is not deterministic".

Also, the objects are not archived at the address allocated in the Metaspace. Instead, we copy them linearly into the archive and we have control on the order of copying:

1. All classes are loaded by a single thread. JIT is also disabled (so it cannot
   trigger constant pool resolution, etc, in the background)
2. Classes are loaded in deterministic order as specified in SharedClassListFile.
3. As as result of 2, Symbols are also created in a deterministic order

So, as long as I can guarantee that Symbols are allocated in monotonically increasing addresses, I can write them into the archive at deterministic locations.

All MetaspaceObjs are copied into the archive via a depth-first search inside ArchiveCompactor::iterate_roots(). The objects are copied as they are walked, e,g. InstanceKlass -> methods -> methodA, methodB, ....

Currently the VM lays out the classes deterministically. (For example, InstanceKlass::methods() is sorted by ascending names. Fields are ordered according to the field type and their order of appearance in the classfile.) This means that ArchiveCompactor::iterate_roots() will always walk the MetaspaceObjs in the same order.

In essence, iterate_roots() doesn't care where the source MetaspaceObjs are. It will probably work if every MetaspaceObjs is allocated at a random address. So hopefully this will not tie your hands in your Metaspace work.

The only restriction is the ordering of Symbols, which is the main fix in this patch. Now I allocate them in a VirtualSpace that's incrementally committed. It doesn't matter where this VistualSpace is located.

I've added some comments in iterate_roots() to explain what's happening, but I didn't want to wring a long essay. Do you think the comments are sufficient?

I updated the test case for uncompressed oops/klasses. I also run with a smaller -XX:MetaspaceSize which has the side effect of triggering GC. This found a problem with the archived heap objects, which I also fixed.


http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v03/
http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v03.delta/

Thanks
- Ioi


On 4/27/20 2:14 AM, Thomas Stüfe wrote:
Rethinking this a bit more I realize you need not addresses growing monotonously but deterministic allocation: given a sequence of Metaspace allocation operations (Metaspace::allocate(), Metaspace::deallocate(), and collection of class loaders), the pointers returned by Metaspace::allocate() should come in the same order each time that sequence is repeated for a new VM. This invalidates some of my arguments in my last mail, but not all.

I also thought about restrictions this places on the callers.
- class loader collection are triggered by GCs. Can be be sure that this happens at exactly the same point at each run? Some GCs do class unloading concurrently, which adds a nondeterministic timing factor. - classes may be loaded concurrently by multiple threads, adding a timing factor. - You may have classes which are implicitly created like hidden classes for lambdas, or reflection glue classes. Their creation may not be deterministic. Even though they are not put into the archive, they live in class space too and their allocation mixes up things.

Also, requiring Metaspace allocation to be deterministic requires each part of it being deterministic (e.g. the deallocation block management). E.g. we never could base any decision on the numerical form of an address, which is location dependent and can vary between VM runs.

I really think reproducable builds are valuable, but my fear is that relying on Metaspace for deterministic allocation would be too fragile.

Thanks again, Thomas




On Mon, Apr 27, 2020 at 9:58 AM Thomas Stüfe <thomas.stu...@gmail.com <mailto:thomas.stu...@gmail.com>> wrote:

    Hi Ioi,

    Please don't do this :)

    First off, how would this work when dumping with
    UseCompressedClassPointers off? In that case allocation would be
    relegated to non-class metaspace which cannot guarantee that kind
    of address stability.

    Even in class space, I do not think you can guarantee addresses
    growing monotonously. Class unloading could happens during dump
    time, so space may be returned to class freelist and later reused.
    Metadata can be prematurely deallocated, e.g. if a class load
    error occurs or byte code is rewritten by some agent. Remainder of
    Metachunks are used up in a delayed fashion. All these cases will
    present you with pointers which are not growing monotonously.

    I also believe this problem of non-deterministic placement is not
    limited to Symbols, but that you should see it for Klass
    structures too, albeit rarely. I believe the fact that you do not
    see this is an accident, or we are not looking that closely. E.g.
    if we were to change the frequency at which we retrieve MetaBlocks
    from the freeblocklist (see
    
http://hg.openjdk.java.net/jdk/jdk/file/534855a30ef5/src/hotspot/share/memory/metaspace/spaceManager.cpp#l414),
    you would get more reuse of deallocated blocks and would certainly
    see more volatility in the addresses.

    This is all true with the current implementation; the upcoming new
    one uses a buddy style allocator behind the scenes where it is by
    no means guaranteed that the first chunks get used first. I think
    this is what still happens, by sheer accident, but I am hesitant
    to promise such a behavior in the future. It removes freedom from
    the implementation in a lot of ways.

    Small examples, we might want to shepherd certain allocations into
    separate parts of class space (e.g. Klass structures from hidden
    classes) to minimize fragmentation. Or add a mode, for testing,
    where we would allocate Klass at the very end of ccs, or at
    certain "round" addresses, to shake loose errors in the calling
    layers which rely too much on how Klass pointers look like.

    Bottomline I think the assumption that ccs allocates in
    monotonously ascending order is not even correct today, and may
    break very easily, and we should not rely on this.

    I think instead of misusing ccs for this, it would be cleaner to
    just allocate a large C heap area as backing storage for the
    symbols? How much space are we talking about? If memory is a
    concern, we could just reserve a range and commit it manually as
    we go.

    Or could we not order the placement of Klass and Symbol at dump
    time? Dump time is not that time critical, no?

    Thanks & Sorry,

    Thomas



    On Mon, Apr 27, 2020 at 7:31 AM Ioi Lam <ioi....@oracle.com
    <mailto:ioi....@oracle.com>> wrote:

        https://bugs.openjdk.java.net/browse/JDK-8241071
        
http://cr.openjdk.java.net/~iklam/jdk15/8241071-deterministic-cds-archive.v02/

        The goal is to for "java -Xshare:dump" to produce
        deterministic contents in
        the CDS archive that depend only on the contents of
        $JAVA_HOME/lib/modules.

        Problems:
        [1] Symbols in the CDS archive may have non-deterministic
        order because
             Arena allocation is non-deterministic.
        [2] The contents of the CDS shared heap region may be
        randomized due to
             ImmutableCollections.SALT32L.

        Fixes:
        [1] With -Xshare:dump, allocate Symbols from the class space
        (64-bit only).
             See changes in symbol.cpp for details.
        [2] When running the VM with -Xshare:dump,
        ImmutableCollections.SALT32L is
             initialized with a deterministic seed. NOTE: this affects
        ONLY when the
             VM is running with the special flag -Xshare:dump to dump
        the CDS
        archive.
             It does NOT affect normal execution of Java programs.

        ---
        I also cleaned up the -Xlog:cds output and print out the CRC
        of each
        CDS region, to help diagnose why two CDS archives may have
        different
        contents.

        Thanks
        - Ioi


Reply via email to