Re: [gwt-contrib] RFC: sharded linking

2010-02-12 Thread Matt Mastracci
On 2010-02-12, at 1:15 PM, Ray Cromwell wrote:

 On Thu, Feb 11, 2010 at 4:43 PM, Scott Blum sco...@google.com wrote:

 - I dislike the whole transition period followed by having to forcibly
 update all linkers, unless there's a really compelling reason to do so.
 
 In general, I'd agree, but the number of linkers in the wild appears
 to be small, this may be a case of trying to preserve an API that only
 5 or 10 people in the world are using.

+1. I've written a handful of custom linkers (including one in the public 
gwt-firefox-extension project), but I'm used to updating them between GWT 
releases to work around subtle changes in the linker contract (i.e. the 
evolution of hosted mode, various global-variable changes, etc.).

I'd rather have a clean linker system that changes from version to version than 
an awkward one with a lot of legacy interfaces.

Matt.

-- 
http://groups.google.com/group/Google-Web-Toolkit-Contributors


Re: [gwt-contrib] RFC: sharded linking

2010-02-12 Thread Lex Spoon
On Thu, Feb 11, 2010 at 7:43 PM, Scott Blum sco...@google.com wrote:

 I have a few comments, but first I wanted to raise the point that I'm not
 sure why we're having this argument about maximally sharded Precompiles at
 all.  For one thing, it's already implemented, and optional, via
 -XshardPrecompile.  I can't think of any reason to muck with this, or why
 it would have any relevance to sharded linking.  Can we just table that part
 for now, or is there something I'm missing?


There are still two modes, but there's no more need for an explicit
argument.  For Compiler, precompile is never sharded.  For the three-stage
entry points, full sharding happens iff all linkers are shardable.
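The rule above (full sharding iff every linker is shardable) can be pictured roughly as follows. The `Linker`/`isShardable()` names are illustrative stand-ins for this sketch, not the actual GWT API:

```java
// Rough sketch of the mode decision described above: the three-stage entry
// points use full sharding iff every configured linker is shardable.
// Linker/isShardable are illustrative stand-ins, not the real GWT API.
import java.util.List;

public class ShardingDecision {
    interface Linker {
        boolean isShardable();
    }

    static boolean useFullSharding(List<Linker> linkers) {
        for (Linker l : linkers) {
            if (!l.isShardable()) {
                return false; // one legacy linker forces the unsharded mode
            }
        }
        return true;
    }
}
```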



 - I'm not sure why development mode wouldn't run a sharded link first.
  Wouldn't it make sense if development mode works just like production
 compile, it just runs a single development mode permutation shard link
 before running the final link?


Sure, we can do that. Note, though, that they will be running against an
empty ArtifactSet, because there aren't any compiles for them to look at.
 Thus, they won't typically do anything.



 2) Instead of trying to do automatic thinning, we just let the linkers
 themselves do the thinning.  For example, one of the most
 serialization-expensive things we do is serialize/deserialize symbolMaps.  To
 avoid this, we update SymbolMapsLinker to do most of its work during
 sharding, and update IFrameLinker (et al) to remove the CompilationResult
 during the sharded link so it never gets sent across to the final link.


In addition to the other issues pointed out, note that this adds ordering
constraints among the linkers.  Any linker that deletes something must run
after every linker that wants to look at it.  Your example wouldn't work as
is, because it would mean no POST linker can look at CompilationResults.  It
also wouldn't work to put the deletion in a POST linker, for the same
reason.  We'd have to work out a way for the deletions to happen last, after
all the normal linkage activity.

Suppose, continuing that idea, we add a POSTPOST order that is used only for
deletion.  If it's really only for deletion, then the usual link() API is
overly general, because it lets linkers both add and remove artifacts during
POSTPOST, which is not desired.  So, we want a POSTPOST API that is only for
deletion.  Linkers somehow or another mark artifacts for deletion, but not
anything else.  At this point, though, isn't it pretty much the same as the
automated thinning in the initial proposal?
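One way to picture such a deletion-only phase: linkers can only mark artifacts, and the runner applies all the deletions once every ordinary link phase has finished, so no linker can miss an artifact another one deleted. A minimal sketch, with made-up names (nothing here is the real GWT API, and artifacts are simplified to strings):

```java
// Minimal sketch of a deletion-only POSTPOST phase: during the phase,
// linkers may only *mark* artifacts for deletion; the runner removes the
// whole marked set at the very end, after all normal linkage activity.
// All names are invented for illustration.
import java.util.HashSet;
import java.util.Set;

public class PostPostThinning {
    private final Set<String> marked = new HashSet<>();

    /** Called by linkers during POSTPOST; marking is the only operation. */
    void markForDeletion(String artifact) {
        marked.add(artifact);
    }

    /** Applied once, after all POST linkers have run. */
    Set<String> apply(Set<String> artifacts) {
        Set<String> thinned = new HashSet<>(artifacts);
        thinned.removeAll(marked);
        return thinned;
    }
}
```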


 The pros to this idea are (I think) that you don't break anyone... instead
 you opt in to the optimization.  If you don't do anything, it should still
 work, but maybe slower than it could.

The proposal that started this thread also does not break anyone.

Lex


Re: [gwt-contrib] RFC: sharded linking

2010-02-11 Thread Scott Blum
I have a few comments, but first I wanted to raise the point that I'm not
sure why we're having this argument about maximally sharded Precompiles at
all.  For one thing, it's already implemented, and optional, via
-XshardPrecompile.  I can't think of any reason to muck with this, or why
it would have any relevance to sharded linking.  Can we just table that part
for now, or is there something I'm missing?


Okay, so now on to sharded linking itself.  Here's what I love:

- Love the overall goals: do more work in parallel and eliminate
serialization overhead.
- Love the idea of simulated sharding because it enforces consistency.
- Love that the linkers all run in the same order.

Here's what I don't love:

- I'm not sure why development mode wouldn't run a sharded link first.
 Wouldn't it make sense if development mode works just like production
compile, it just runs a single development mode permutation shard link
before running the final link?

- I dislike the whole transition period followed by having to forcibly
update all linkers, unless there's a really compelling reason to do so.
 Maybe I'm missing some use cases, but I don't see what problems result from
having some linkers run early and others run late.  As Lex noted, all the
linkers are largely independent of each other and mostly won't step on each
other's toes.

- It seems unnecessary to have to annotate Artifacts to say which ones are
transferable, because I thought we already mandated that all Artifacts have
to be transferable.

I have in mind a different proposal that I believe addresses the same goals,
but in a less-disruptive fashion.  Please feel free to poke holes in it:

1) Linker was made an abstract class specifically so that it could be
extended later.  I propose simply adding a new method linkSharded() with
the same semantics as link().  Linkers that don't override this method
would simply do nothing on the shards and possibly lose out on the
opportunity to shard work.  Linkers that can effectively do some work on
shards would override this method to do so.  (We might also have a
relinkSharded() for development mode.)

2) Instead of trying to do automatic thinning, we just let the linkers
themselves do the thinning.  For example, one of the most
serialization-expensive things we do is serialize/deserialize symbolMaps.  To
avoid this, we update SymbolMapsLinker to do most of its work during
sharding, and update IFrameLinker (et al) to remove the CompilationResult
during the sharded link so it never gets sent across to the final link.
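Taken together, proposals 1 and 2 might look something like the following sketch. The class shapes and artifact names are invented for illustration, with artifacts simplified to strings; the real Linker API operates on ArtifactSets:

```java
// Illustrative sketch of proposals 1 and 2 combined. Artifacts are
// simplified to strings; none of this is the actual GWT Linker API.
import java.util.HashSet;
import java.util.Set;

public abstract class SketchLinker {
    /** Existing final-link entry point. */
    public abstract Set<String> link(Set<String> artifacts);

    /** Proposal 1: a new hook whose default no-op keeps old linkers working. */
    public Set<String> linkSharded(Set<String> artifacts) {
        return artifacts; // opt in by overriding; doing nothing is safe
    }

    /**
     * Proposal 2: a linker that does its work per shard and then drops the
     * expensive input so it is never serialized across to the final link.
     */
    public static class ThinningLinker extends SketchLinker {
        @Override
        public Set<String> linkSharded(Set<String> artifacts) {
            Set<String> out = new HashSet<>(artifacts);
            if (out.remove("CompilationResult")) {
                out.add("symbolMap"); // the heavy work happens on the shard
            }
            return out;
        }

        @Override
        public Set<String> link(Set<String> artifacts) {
            return artifacts; // nothing left to do at the final link
        }
    }
}
```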

The pros to this idea are (I think) that you don't break anyone... instead
you opt in to the optimization.  If you don't do anything, it should still
work, but maybe slower than it could.

The cons are... well, maybe it's too simplistic and I'm missing some of the
corner cases or ways this could break down.

Thoughts?
Scott
