John,

From a practicality standpoint I'm a little nervous about this plan to make RPC calls out of a Java process to a native process to fetch a parse tree for transformations that have to occur in real time. I don't think the motivating factor here is to accept all inputs that browsers can. Gadget developers will tailor their markup to the platform, as they have done already. I would greatly prefer us to pick one 'good' parser and stick with it for all the manageability and consumability benefits that come with that decision. Perhaps I'm missing something here?

-Louis

On Mon, Aug 11, 2008 at 11:59 AM, John Hjelmstad <[EMAIL PROTECTED]> wrote:

> On Fri, Aug 8, 2008 at 6:10 AM, Ben Laurie <[EMAIL PROTECTED]> wrote:
>
> > [+google-caja-discuss]
> >
> > On Thu, Aug 7, 2008 at 9:27 PM, John Hjelmstad <[EMAIL PROTECTED]> wrote:
> > > On Thu, Aug 7, 2008 at 3:20 AM, Ben Laurie <[EMAIL PROTECTED]> wrote:
> > >
> > >> On Wed, Aug 6, 2008 at 11:34 PM, John Hjelmstad <[EMAIL PROTECTED]> wrote:
> > >> > This proposal effectively enables the renderer to become a multi-pass
> > >> > compiler for gadget content (essentially, arbitrary web content). Such a
> > >> > compiler can provide several benefits: static optimization of gadget content
> > >> > (auto-proxying of images, whitespace/comment removal, consolidation of CSS
> > >> > blocks), security benefits (caja et al), new functionality (annotation of
> > >> > content for stats, document analysis, container-specific features), etc. To
> > >> > my knowledge no such infrastructure exists today (with the possible
> > >> > exception of Caja itself, which I'd like to dovetail with this work).
> > >>
> > >> Caja clearly provides a large chunk of the code you'd need for this.
> > >> I'd like to hear how we'd manage to avoid duplication between the two
> > >> projects.
> > >>
> > >> A generalised framework for manipulating content sounds like a great
> > >> idea, but probably should not live in either of the two projects (Caja
> > >> and Shindig) but rather should be shared by both of them, I suspect.
> > >
> > > I agree on both counts. As I mentioned, the piece of this idea that I expect
> > > to change the most is the parse tree, and Caja's .parser.html and
> > > .parser.css packages contain much of what I've thrown in here as a base.
> > >
> > > My key requirements are:
> > > * Lightweight framework.
> > > * Parser modularity, mostly for HTML parsers (to re-use the good work done
> > >   by WebKit or Gecko.. CSS/JS can come direct from Caja I'd bet)
> > > * Automatic maintenance of DOM<->String conversion.
> > > * Easy to manipulate structure.
> >
> > I'm not sure what the value of parser modularity is? If the resulting
> > tree is different, then that's a problem for people processing the
> > tree. And if it is not, then why do we care?
>
> IMO the value of parser modularity is that the lenient parsers native to
> browsers can be used in place of those that might not accept all inputs. One
> could (and I'd like to) adapt WebKit or Gecko's parsing code into a server
> that runs parallel to Shindig and provides a "local RPC" service for parsing
> semi-structured HTML. The resulting tree for WebKit's parser might be
> different than that for an XHTML parser, Gecko's parser, etc, but if the
> algorithm implemented atop it is rule-based rather than strict-structure
> based that should be fine, no?
>
> > >
> > > I'd love to see both projects share the same base syntax tree
> > > representations. I considered .parser.html(.DomTree) and .parser.css for
> > > these, but at the moment these appeared to be a little more tied to Caja's
> > > lexer/parser implementation than I preferred (though I admit
> > > AbstractParseTreeNode contains most of what's needed).
> > >
> > > To be sure, I don't see this as an end-all-be-all transformation system in
> > > any way. I'd just like to put *something* reasonable in place that we can
> > > play with, provide some benefit, and enhance into a truly sophisticated
> > > vision of document rewriting.
> > >
> > >> > c. Add Gadget.getParsedContent().
> > >> >   i. Returns a mutable GadgetContentParseTree used to manipulate Gadget
> > >> >      Contents.
> > >> >   ii. Mutable tree calls back to the Gadget object indicating when any
> > >> >       change is made, and emits an error if setContent() has been called in the
> > >> >       interim.
> > >>
> > >> In Caja we have been moving towards immutable trees...
> > >
> > > Interested to hear more about this. The whole idea is for the gadget's tree
> > > representation to be modifiable. Doing that with immutable trees to me
> > > suggests that a rewriter would have to create a completely new tree and set
> > > it as a representation of new content. That's convenient as far as the
> > > Gadget's maintenance of String<->Tree representations is concerned... but
> > > seems pretty heavyweight for many types of edits: in-situ modifications of
> > > text, content reordering, etc. That's particularly so in a single-threaded
> > > (viz rewriting) environment.
> >
> > Never having been entirely sold on the concept, I'll let those on the
> > Caja team who advocate immutability explain why.
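
For readers following the thread, the Gadget.getParsedContent() behavior quoted above (a mutable tree that calls back to the Gadget on every change and raises an error if setContent() was called in the interim) might look roughly like the sketch below. This is only an illustration under stated assumptions, not Shindig's actual code: the version counter, appendText(), onTreeChanged(), and the single-text-node "parse" are all hypothetical stand-ins, and only the names Gadget, getParsedContent(), setContent(), and GadgetContentParseTree come from the proposal.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; real parsing is replaced by a flat list of text nodes.
final class GadgetSketch {

  /** Mutable tree that reports every change back to its owning Gadget. */
  static final class GadgetContentParseTree {
    private final Gadget owner;
    private final long version; // content version this tree was built from (assumption)
    private final List<String> nodes = new ArrayList<>();

    GadgetContentParseTree(Gadget owner, long version, String content) {
      this.owner = owner;
      this.version = version;
      nodes.add(content); // placeholder for a real parse
    }

    /** Hypothetical edit operation: append a text node. */
    void appendText(String text) {
      checkFresh();
      nodes.add(text);
      owner.onTreeChanged(this); // callback so the String form stays in sync
    }

    String serialize() {
      return String.join("", nodes);
    }

    // Emits an error if setContent() was called after this tree was handed out.
    private void checkFresh() {
      if (owner.contentVersion() != version) {
        throw new IllegalStateException("setContent() was called; tree is stale");
      }
    }
  }

  static final class Gadget {
    private String content;
    private long contentVersion = 0;
    private GadgetContentParseTree tree;

    Gadget(String content) {
      this.content = content;
    }

    void setContent(String newContent) {
      content = newContent;
      contentVersion++; // invalidates any tree handed out earlier
      tree = null;
    }

    String getContent() {
      return content;
    }

    GadgetContentParseTree getParsedContent() {
      if (tree == null) {
        tree = new GadgetContentParseTree(this, contentVersion, content);
      }
      return tree;
    }

    long contentVersion() {
      return contentVersion;
    }

    // Called by the tree after each edit; re-serializes to keep both forms consistent.
    void onTreeChanged(GadgetContentParseTree changed) {
      content = changed.serialize();
    }
  }
}
```

In this sketch an edit made through the tree updates the String form immediately, while a tree obtained before a setContent() call throws on its next mutation rather than silently overwriting the new content. An immutable-tree design, as mentioned for Caja, would instead have the rewriter build a new tree and hand it back to the Gadget.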

