[gwt-contrib] Re: RFC: sharded linking

Alex Moffat Wed, 10 Feb 2010 06:49:46 -0800

I've replied before but don't see it here; if it turns up, ignore this
dupe.


I don't maintain any linkers but I have experimented with multi-
machine builds. The current Precompile, CompilePerms, and Link
implementation has the nice feature that the CompilePerms step does
not require access to the source code being compiled. This makes it
very, very much easier to deploy additional CompilePerms workers as
they don't need to check out source code etc. I like the plan for
being able to perform some linking in parallel but I wouldn't like to
lose the ability to deploy a useful CompilePerms worker that does not
need source code access. If parsing Java, creating the AST, and
generating artifacts may need to be parallelized for some builds, then
I'd like that to be done in an additional step, so that people could
choose whether or not to run it on multiple machines while still being
able to run the CompilePerms steps on multiple machines.

On Feb 9, 4:31 pm, Lex Spoon <sp...@google.com> wrote:
> This is a design doc about speeding up the link phase of GWT.  If you don't
> maintain a linker, and if you don't have a multi-machine GWT build, then
> none of this should matter to you.  If you do maintain a linker, let's make
> sure your linker can be updated with the proposed changes.  If you do have a
> multi-machine build, or if you have some ideas about them, then perhaps you
> can help us get the best speed benefit possible out of this.
>
> I want to speed up linking for multi-machine builds in two ways:
>
> 1. Allow more parts of linking to run in parallel.  In particular, anything
> that happens once per permutation and does not need information from other
> permutations can run in parallel.  As an example, the iframe linker chunks
> the JavaScript of each permutation into multiple <script> tags.  That work
> can happen in parallel once the linker API supports it.
>
> 2. Link does a lot of Java serialization for its artifacts, but the majority
> of the artifacts in a compile are emitted artifacts that have no structure.
>  They are just a named bag of bits, from the compiler's perspective.  It
> would help if such artifacts did not need a round of Java serialization on
> the Link node and could instead be bulk copied.
>
> === Transition ===
>
> The compiler will support two compilation modes: maximal sharding and
> simulated sharding.  Maximal sharding is used when all linkers support it
> and the Precompile/CompilePerms/Link entry points are used.  Simulated
> sharding is used when either some linker can't shard or when the Compiler
> entry point is used.
>
> Linkers individually indicate whether they implement the sharding or
> non-sharding API. This allows linkers to be updated one by one and to leave
> the non-sharding API behind once they do. It does not cause trouble with
> other linkers, because in practice linkers are highly independent.  I've
> looked at as many linkers as I could find to verify this.  Occasionally one
> linker depends on another; in such a case they'll have to be updated in
> tandem, but the need for that should be rare.
>
> By default, a linker is assumed to want the legacy non-sharding API. For
> such a linker, it isn't safe to assume that its generators or its associated
> artifacts can be safely serialized and then deserialized on a different
> computer.
>
> The non-sharding API will be deprecated.  After the sharding API has been
> out for one GWT release cycle, support for non-shardable linkers will be
> dropped.
>
> === Maximal sharding ===
>
> Currently, Precompile parses Java into ASTs and runs
> generators. CompilePerms then runs one copy for each permutation, in
> parallel. Each instance optimizes the AST for one permutation and then
> converts it into JavaScript plus some additional artifacts. Finally, Link
> takes the JavaScript and all the produced artifacts, runs the individual
> linkers, and produces the final output. In summary, the three stages are:
>
> current Precompile:
>
>    - parse Java and run generators
>    - output: number of permutations, AST, generated artifacts
>
> current CompilePerms:
>
>    - input: permutation id, AST
>    - compile one permutation to JavaScript
>    - output: JavaScript, generated artifacts
>
> current Link:
>
>    - input: JavaScript from all permutations, generated artifacts
>    - run linkers on all artifacts
>    - emit EmittedArtifacts into the final output
>
> With maximal sharding, Precompile does no work except to count the number of
> permutations. Each CompilePerms instance parses Java into ASTs, runs generators,
> and optimizes for a specific permutation. Additionally,
> each CompilePerms instance also runs the shardable part of linkers on the
> results for that permutation. It then "thins" the artifacts (see below) and
> emits them. Finally, Link takes these results from the CompilePerms
> instances, runs the final, non-shardable part of each linker, and emits all
> the artifacts designated as emitted artifacts.  In summary, the
> maximal-sharding staging looks like this:
>
> new Precompile:
>
>    - output: number of permutations
>
> new CompilePerms:
>
>    - input: permutation id
>    - compile one permutation to JavaScript, including running generators
>    - run the on-shard part of linkers
>    - thin down the resulting artifacts, as defined below
>    - output: JavaScript and the thinned down set of artifacts
>
> new Link:
>
>    - input: JavaScript and transferable artifacts from each permutation
>    - run the final part of linkers, which can add more files to the final
>    output
>    - output: resulting emitted artifacts
>
> === Simulated Sharding ===
>
> Simulated sharding uses the in-trunk compiler staging, but runs the linkers
> as much as possible as if they were using the maximal sharding staging. The
> sequence is the same whether the Compiler entry point is used or the
> Precompile/CompilePerms/Link trio of entry points is used. Under
> simulated sharding, the Precompile and CompilePerms steps run exactly as in
> trunk. The Link stage, however, runs the linkers in a careful order so as to
> use the sharded API for those linkers that have been updated:
>
>    - For each compiled permutation, run the on-shard part of
>    all shardable linkers. For each permutation, start with a fresh set of
>    artifacts so that the linkers don't see each other's output.
>    - Combine all of the resulting artifacts.
>    - Run the non-shardable linkers on those artifacts.
>    - Thin the artifacts, as defined below
>    - Run the final part of all shardable linkers.
>    - Emit the "output" and "extra" files.
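
The Link-stage ordering listed above could be sketched roughly as follows. This is only an illustration under assumptions: SimLinker, TraceLinker, and SimulatedLink are hypothetical names (not GWT classes), artifacts are modeled as plain strings, and thinning is passed in as a function rather than implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for a linker; not the GWT Linker class.
interface SimLinker {
  boolean shardable();

  // onShard is only meaningful for shardable linkers.
  List<String> link(List<String> artifacts, boolean onShard);
}

// Toy linker that records when it ran by appending a marker artifact.
class TraceLinker implements SimLinker {
  private final String name;
  private final boolean shardable;

  TraceLinker(String name, boolean shardable) {
    this.name = name;
    this.shardable = shardable;
  }

  @Override
  public boolean shardable() {
    return shardable;
  }

  @Override
  public List<String> link(List<String> artifacts, boolean onShard) {
    List<String> out = new ArrayList<>(artifacts);
    String phase = !shardable ? "legacy" : (onShard ? "shard" : "final");
    out.add(name + ":" + phase);
    return out;
  }
}

class SimulatedLink {
  static List<String> run(List<List<String>> permutations,
                          List<SimLinker> linkers,
                          UnaryOperator<List<String>> thin) {
    List<String> current = new ArrayList<>();
    // 1. On-shard part of shardable linkers, fresh artifact set per permutation.
    for (List<String> permutation : permutations) {
      List<String> fresh = new ArrayList<>(permutation);
      for (SimLinker linker : linkers) {
        if (linker.shardable()) {
          fresh = linker.link(fresh, true);
        }
      }
      // 2. Combine the per-permutation results.
      current.addAll(fresh);
    }
    // 3. Non-shardable (legacy) linkers see the combined set.
    for (SimLinker linker : linkers) {
      if (!linker.shardable()) {
        current = linker.link(current, false);
      }
    }
    // 4. Thin the artifacts (supplied by the caller in this sketch).
    current = thin.apply(current);
    // 5. Final part of the shardable linkers.
    for (SimLinker linker : linkers) {
      if (linker.shardable()) {
        current = linker.link(current, false);
      }
    }
    // 6. Emitting the "output" and "extra" files is left to the caller.
    return current;
  }
}
```

With one shardable linker "A", one legacy linker "B", and two permutations, the markers come out in the order shard, shard, legacy, final.
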
>
> === Development mode ===
>
> Development mode does not generate any compiled permutations. Thus, it does
> not run the per-permutation part of linkers. It does, however, need to run
> the final-link part of linkers. It should do this just after the places it
> calls link() or relink().
>
> === Detailed API changes ===
>
>    - Linkers that are updated to be shardable are annotated with a new
>    annotation @Shardable
>    - The Linker.link() method has a new boolean parameter, indicating
>    whether it is running on a shard or on the final node.
>    - BinaryEmittedArtifact is added as a final subclass of EmittedArtifact,
>    indicating an artifact with no internal structure.  The compiler can bulk
>    copy such artifacts rather than using Java serialization.
>    - There is a new annotation @Transferable that can be added to artifacts.
>    Artifacts without this annotation are subject to thinning, described
>    below.
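
To make the first two API points concrete, here is a minimal sketch of what a shardable linker might look like. Everything in it is a hypothetical stand-in (ShardableSketch, SketchLinker, the string artifacts, and the parameter name onShard are assumptions for illustration); it is not the proposed GWT signature.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the proposed @Shardable annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface ShardableSketch {}

// Hypothetical linker base type whose link() takes the extra boolean.
abstract class SketchLinker {
  abstract List<String> link(List<String> artifacts, boolean onShard);

  // A linker without the annotation is treated as legacy (non-sharding).
  static boolean isShardable(Class<? extends SketchLinker> type) {
    return type.isAnnotationPresent(ShardableSketch.class);
  }
}

@ShardableSketch
class ChunkingLinker extends SketchLinker {
  @Override
  List<String> link(List<String> artifacts, boolean onShard) {
    List<String> out = new ArrayList<>(artifacts);
    if (onShard) {
      // Per-permutation work that needs nothing from other permutations.
      out.add("chunked-script");
    } else {
      // Work that needs the results of every permutation.
      out.add("selection-script");
    }
    return out;
  }
}

// No annotation, so the compiler would fall back to the legacy path.
class LegacyLinker extends SketchLinker {
  @Override
  List<String> link(List<String> artifacts, boolean onShard) {
    List<String> out = new ArrayList<>(artifacts);
    out.add("legacy-output");
    return out;
  }
}
```

The annotation check mirrors the "legacy by default" rule: a linker has to opt in before the compiler may run it on a shard.
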
>
> === Thinning of an artifact set ===
>
> After the sharded part of a linker runs, the resulting artifact set is
> thinned down, so as to minimize the amount sent back to the Link node and to
> minimize the amount of deserialization that Link has to do. Thinning an
> artifact set does two things:
>
>    - All EmittedArtifacts are replaced by a BinaryEmittedArtifact, thus
>    discarding any fields that the EmittedArtifact might have had.
>    - All other artifacts are discarded, except ones annotated with
>    @Transferable
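
The two thinning rules above might be sketched like this. The types are hypothetical stand-ins (SketchArtifact, SketchEmitted, SketchBinaryEmitted, TransferableSketch, and the example artifacts are assumptions), and artifact contents are modeled as a plain byte array.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the proposed @Transferable annotation.
@Retention(RetentionPolicy.RUNTIME)
@interface TransferableSketch {}

class SketchArtifact {}

// Stand-in for EmittedArtifact: a named bag of bits.
class SketchEmitted extends SketchArtifact {
  final String path;
  final byte[] contents;

  SketchEmitted(String path, byte[] contents) {
    this.path = path;
    this.contents = contents;
  }
}

// An emitted artifact that carries an extra field, lost by thinning.
class ScriptWithStats extends SketchEmitted {
  final int statementCount;

  ScriptWithStats(String path, byte[] contents, int statementCount) {
    super(path, contents);
    this.statementCount = statementCount;
  }
}

// Stand-in for BinaryEmittedArtifact: final, no internal structure.
final class SketchBinaryEmitted extends SketchEmitted {
  SketchBinaryEmitted(String path, byte[] contents) {
    super(path, contents);
  }
}

@TransferableSketch
class PermutationNote extends SketchArtifact {
  final int permutationId;

  PermutationNote(int permutationId) {
    this.permutationId = permutationId;
  }
}

// Not emitted and not transferable, so thinning drops it.
class ScratchNote extends SketchArtifact {}

class Thinner {
  static List<SketchArtifact> thin(List<SketchArtifact> artifacts) {
    List<SketchArtifact> result = new ArrayList<>();
    for (SketchArtifact artifact : artifacts) {
      if (artifact instanceof SketchEmitted) {
        // Rule 1: keep only the name and the bytes.
        SketchEmitted emitted = (SketchEmitted) artifact;
        result.add(new SketchBinaryEmitted(emitted.path, emitted.contents));
      } else if (artifact.getClass().isAnnotationPresent(TransferableSketch.class)) {
        // Rule 2: other artifacts survive only if marked transferable.
        result.add(artifact);
      }
    }
    return result;
  }
}
```

Thinning a set holding a ScriptWithStats, a PermutationNote, and a ScratchNote leaves two artifacts: a SketchBinaryEmitted with the same path and bytes, and the PermutationNote.
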
>
> === Order of linkers ===
>
> Whenever the compiler runs a number of linkers, it runs them in the order
> implied by the PRE, PRIMARY, and POST annotations.  This is true both on the
> shards and off them, and with both the shardable and non-shardable link()
> methods.
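
As a rough illustration of that ordering rule, the sketch below sorts linkers by a PRE/PRIMARY/POST tag before running them. SketchOrder, OrderedLinker, and LinkerOrdering are hypothetical names, and keeping the input order among linkers with the same tag is an assumption of this sketch; the text above does not say how such ties are resolved.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Declared in run order, so the natural enum ordering is PRE < PRIMARY < POST.
enum SketchOrder { PRE, PRIMARY, POST }

// Hypothetical linker descriptor: a name plus its declared order.
class OrderedLinker {
  final String name;
  final SketchOrder order;

  OrderedLinker(String name, SketchOrder order) {
    this.name = name;
    this.order = order;
  }
}

class LinkerOrdering {
  // Returns linker names in the order they would be run.
  static List<String> runOrder(List<OrderedLinker> linkers) {
    List<OrderedLinker> sorted = new ArrayList<>(linkers);
    // List.sort is stable, so same-tag linkers keep their input order.
    sorted.sort(Comparator.comparing((OrderedLinker linker) -> linker.order));
    List<String> names = new ArrayList<>();
    for (OrderedLinker linker : sorted) {
      names.add(linker.name);
    }
    return names;
  }
}
```

The same ordering function would be applied wherever a batch of linkers runs, whether that is the on-shard pass or the final pass.
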

-- 
http://groups.google.com/group/Google-Web-Toolkit-Contributors
