Re: [gwt-contrib] RFC: sharded linking
On 2010-02-12, at 1:15 PM, Ray Cromwell wrote:

> On Thu, Feb 11, 2010 at 4:43 PM, Scott Blum sco...@google.com wrote:
>> - I dislike the whole transition period followed by having to forcibly update all linkers, unless there's a really compelling reason to do so.
>
> In general, I'd agree, but the number of linkers in the wild appears to be small; this may be a case of trying to preserve an API that only 5 or 10 people in the world are using.

+1. I've written a handful of custom linkers (including one in the public gwt-firefox-extension project), but I'm used to updating them between GWT releases to work around subtle changes in the linker contract (e.g., the evolution of hosted mode, various global variable changes, etc.). I'd rather have a clean linker system that changes from version to version than an awkward one with a lot of legacy interfaces.

Matt.

--
http://groups.google.com/group/Google-Web-Toolkit-Contributors

Re: [gwt-contrib] RFC: sharded linking
On Thu, Feb 11, 2010 at 7:43 PM, Scott Blum sco...@google.com wrote:

> I have a few comments, but first I wanted to raise the point that I'm not sure why we're having this argument about maximally sharded Precompiles at all. For one thing, it's already implemented, and optional, via -XshardPrecompile. I can't think of any reason to muck with this, or why it would have any relevance to sharded linking. Can we just table that part for now, or is there something I'm missing?

There are still two modes, but there's no more need for an explicit argument. For Compiler, precompile is never sharded. For the three-stage entry points, full sharding happens iff all linkers are shardable.

> - I'm not sure why development mode wouldn't run a sharded link first. Wouldn't it make sense if development mode works just like production compile, it just runs a single development mode permutation shard link before running the final link?

Sure, we can do that. Note, though, that the linkers will be running against an empty ArtifactSet, because there aren't any compiles for them to look at. Thus, they won't typically do anything.

> 2) Instead of trying to do automatic thinning, we just let the linkers themselves do the thinning. For example, one of the most serialization-expensive things we do is serialize/deserialize symbolMaps. To avoid this, we update SymbolMapsLinker to do most of its work during sharding, and update IFrameLinker (et al) to remove the CompilationResult during the sharded link so it never gets sent across to the final link.

In addition to the other issues pointed out, note that this adds ordering constraints among the linkers. Any linker that deletes something must run after every linker that wants to look at it. Your example wouldn't work as is, because it would mean no POST linker can look at CompilationResults. It also wouldn't work to put the deletion in a POST linker, for the same reason.

We'd have to work out a way for the deletions to happen last, after all the normal linkage activity. Suppose, continuing that idea, we add a POSTPOST order that is used only for deletion. If it's really only for deletion, then the usual link() API is overly general, because it lets linkers both add and remove artifacts during POSTPOST, which is not desired. So, we want a POSTPOST API that is only for deletion: linkers somehow mark artifacts for deletion, but do nothing else. At this point, though, isn't it pretty much the same as the automated thinning in the initial proposal?

> The pros to this idea are (I think) that you don't break anyone... instead you opt-in to the optimization. If you don't do anything, it should still work, but maybe slower than it could.

The proposal that started this thread also does not break anyone.

Lex

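The ordering constraint and the deletion-only phase described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `DeferredDeletionSketch`, `ShardLinker`, and `runShardLink` are hypothetical names, artifacts are modeled as plain strings, and none of this is the real GWT linker API. The point is that linkers only mark artifacts, and the removal is applied once, after every linker has run, so a later (POST-style) linker can still inspect an artifact that an earlier linker marked.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model; these are NOT the real GWT linker classes.
class DeferredDeletionSketch {

  /** A linker may add artifacts and may mark artifacts the final link does not need. */
  interface ShardLinker {
    void link(Set<String> artifacts, Set<String> markedForDeletion);
  }

  /** Runs every linker in order, then applies all deletions in one last step. */
  static Set<String> runShardLink(Set<String> input, List<ShardLinker> linkers) {
    Set<String> artifacts = new LinkedHashSet<>(input);
    Set<String> marked = new LinkedHashSet<>();
    for (ShardLinker linker : linkers) {
      linker.link(artifacts, marked);
    }
    // Deletion happens last, so no linker loses access to a marked artifact.
    artifacts.removeAll(marked);
    return artifacts;
  }

  static Set<String> demo() {
    // Primary-style linker: emits a script and marks the compilation result.
    ShardLinker primary = (artifacts, marked) -> {
      artifacts.add("script.js");
      marked.add("CompilationResult");
    };
    // POST-style linker: runs later but can still see the marked artifact.
    ShardLinker post = (artifacts, marked) -> {
      if (artifacts.contains("CompilationResult")) {
        artifacts.add("symbolMap");
      }
    };
    Set<String> input = new LinkedHashSet<>();
    input.add("CompilationResult");
    return runShardLink(input, Arrays.asList(primary, post));
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints [script.js, symbolMap]
  }
}
```

If the primary linker removed the artifact directly instead of marking it, the POST-style linker would find nothing to work on, which is the ordering problem raised in the message.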
Re: [gwt-contrib] RFC: sharded linking
I have a few comments, but first I wanted to raise the point that I'm not sure why we're having this argument about maximally sharded Precompiles at all. For one thing, it's already implemented, and optional, via -XshardPrecompile. I can't think of any reason to muck with this, or why it would have any relevance to sharded linking. Can we just table that part for now, or is there something I'm missing?

Okay, so now on to sharded linking itself.

Here's what I love:
- Love the overall goals: do more work in parallel and eliminate serialization overhead.
- Love the idea of simulated sharding because it enforces consistency.
- Love that the linkers all run in the same order.

Here's what I don't love:
- I'm not sure why development mode wouldn't run a sharded link first. Wouldn't it make sense if development mode works just like production compile, it just runs a single development mode permutation shard link before running the final link?
- I dislike the whole transition period followed by having to forcibly update all linkers, unless there's a really compelling reason to do so. Maybe I'm missing some use cases, but I don't see what problems result from having some linkers run early and others run late. As Lex noted, all the linkers are largely independent of each other and mostly won't step on each other's toes.
- It seems unnecessary to have to annotate Artifacts to say which ones are transferable, because I thought we already mandated that all Artifacts have to be transferable.

I have in mind a different proposal that I believe addresses the same goals, but in a less-disruptive fashion. Please feel free to poke holes in it:

1) Linker was made an abstract class specifically so that it could be extended later. I propose simply adding a new method linkSharded() with the same semantics as link(). Linkers that don't override this method would simply do nothing on the shards and possibly lose out on the opportunity to shard work. Linkers that can effectively do some work on shards would override this method to do so. (We might also have a relinkSharded() for development mode.)

2) Instead of trying to do automatic thinning, we just let the linkers themselves do the thinning. For example, one of the most serialization-expensive things we do is serialize/deserialize symbolMaps. To avoid this, we update SymbolMapsLinker to do most of its work during sharding, and update IFrameLinker (et al) to remove the CompilationResult during the sharded link so it never gets sent across to the final link.

The pros to this idea are (I think) that you don't break anyone... instead you opt-in to the optimization. If you don't do anything, it should still work, but maybe slower than it could.

The cons are... well, maybe it's too simplistic and I'm missing some of the corner cases, or ways this could break down.

Thoughts?

Scott

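A rough sketch of the opt-in linkSharded() idea from this message, under loud assumptions: `SketchLinker` is a hypothetical stand-in for the abstract Linker class, artifacts are plain strings, and the signatures here are invented for illustration and do not match the real GWT API. The default linkSharded() does nothing, so a linker that never overrides it keeps working unchanged, while a shard-aware linker does its work per shard and thins the artifact set before the final link.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model; these are NOT the real GWT linker classes.
class LinkShardedSketch {

  /** Stand-in for an abstract linker base class. */
  abstract static class SketchLinker {
    /** Final link: every linker implements this, as today. */
    abstract Set<String> link(Set<String> artifacts);

    /** Per-shard link: the default is a no-op, so legacy linkers are unaffected. */
    Set<String> linkSharded(Set<String> artifacts) {
      return artifacts;
    }
  }

  /** Legacy linker: never overrides linkSharded(), does all work in the final link. */
  static class LegacyLinker extends SketchLinker {
    @Override
    Set<String> link(Set<String> artifacts) {
      Set<String> out = new LinkedHashSet<>(artifacts);
      out.add("legacy-output");
      return out;
    }
  }

  /** Opt-in linker: works on the shard and drops the heavy artifact there. */
  static class ShardAwareLinker extends SketchLinker {
    @Override
    Set<String> linkSharded(Set<String> artifacts) {
      Set<String> out = new LinkedHashSet<>(artifacts);
      if (out.remove("CompilationResult")) {
        out.add("symbolMap"); // small derived artifact replaces the large one
      }
      return out;
    }

    @Override
    Set<String> link(Set<String> artifacts) {
      return artifacts; // nothing left to do in the final link
    }
  }

  /** One shard link followed by the final link, with linkers in the same order. */
  static Set<String> compile(List<SketchLinker> linkers, Set<String> shardArtifacts) {
    Set<String> current = shardArtifacts;
    for (SketchLinker linker : linkers) {
      current = linker.linkSharded(current);
    }
    // Only what survives the shard link would be serialized to the final link.
    for (SketchLinker linker : linkers) {
      current = linker.link(current);
    }
    return current;
  }

  static Set<String> demo() {
    Set<String> shard = new LinkedHashSet<>();
    shard.add("CompilationResult");
    List<SketchLinker> linkers =
        Arrays.asList(new ShardAwareLinker(), new LegacyLinker());
    return compile(linkers, shard);
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints [symbolMap, legacy-output]
  }
}
```

The sketch only works because the legacy linker here does not need the removed artifact; a linker that did would depend on running before the removal.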