Any suggestions on experiments appreciated. The Shindig guys pointed out that one of the problems with the cacheability of cajoled output is that they are caching parse trees at a number of stages, not code. They would like to use our parsers. Our CSS parser is already better than anything else they've tried, but NekoHTML is much faster than our HTML parser, even though it doesn't do as good a job of mimicking browser behavior around tag soup. Parsing speed is a real concern, though, since their cache miss rate is high enough that parse time is a non-trivial cost.
If we can give them an HTML parser that performs comparably to NekoHTML, they're willing to rework their stack to use it, which should make it easier to get the cajoler working with existing processing stages and to make sure that the cajoler sees proper file position information. My profiling from a while back shows that parsing is a small portion of the time consumed by the cajoler, but that did not cover time spent traversing and mutating parse trees.

The operations they perform are:
(1) parse if content is uncached or comes from a URL cache shared by multiple services
(2) clone if content is in a cache
(3) traverse the DOM, usually using a pre-order traversal
(4) replace some nodes, add some, and remove others
(5) serialize the DOM to a string

The vast majority of their requests are by non-developers, so they want to optimize for the case where they do not need to report errors with detailed position information to the end user. There are a number of reasons why they do not make logs available, one being that they mix user data in during early stages, so they are unlikely to change this policy. We assumed during initial design that the parsers would mostly be used by the cajoler, and that cajoling would be done once by a caching gadget store which would have a way to report feedback to the gadget developer, but neither of those assumptions turns out to hold.

I'm going to do some profiling and try a number of experiments to see what improves parsing performance. It may be worth trying some of the following:
(1) add a parsing mode which does not store real file positions, to prevent unnecessary object creation. This will break token adjacency checks in some parsers, but those are fixable.
(2) rework the char producer to present lexers with multiple characters at a time
(3) store token indices in FilePositions so that if messages need to be reported, we can reconstruct the file positions
(4) rework lexers to create fewer strings
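To make the parse/clone/traverse/mutate/serialize cycle concrete, here is a minimal sketch using the standard javax.xml DOM APIs as a stand-in for Shindig's actual parse trees. The class name, cache shape, and the "visited" mutation are all hypothetical illustration, not Shindig's code; the point is only to show where clone, traversal, and serialization costs arise.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class PipelineSketch {
  // Hypothetical cache keyed by content; Shindig really caches trees at several stages.
  static final Map<String, Document> CACHE = new HashMap<>();

  static Document parseOrClone(String content) throws Exception {
    Document cached = CACHE.get(content);
    if (cached != null) {
      // (2) clone on a cache hit so callers can mutate freely.
      return (Document) cached.cloneNode(true);
    }
    // (1) parse on a cache miss.
    Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
        .parse(new InputSource(new StringReader(content)));
    CACHE.put(content, doc);
    return (Document) doc.cloneNode(true);
  }

  // (3) pre-order traversal that (4) mutates some nodes along the way.
  static void preOrder(Node n) {
    if (n.getNodeType() == Node.ELEMENT_NODE) {
      ((Element) n).setAttribute("visited", "true");
    }
    for (Node c = n.getFirstChild(); c != null; c = c.getNextSibling()) {
      preOrder(c);
    }
  }

  // (5) serialize the DOM back to a string.
  static String serialize(Document doc) throws Exception {
    StringWriter out = new StringWriter();
    Transformer t = TransformerFactory.newInstance().newTransformer();
    t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
    t.transform(new DOMSource(doc), new StreamResult(out));
    return out.toString();
  }

  public static void main(String[] args) throws Exception {
    Document d = parseOrClone("<div><p>hi</p></div>");
    preOrder(d.getDocumentElement());
    System.out.println(serialize(d));
  }
}
```

Since steps (2)-(5) run on every request while (1) only runs on misses, clone, traversal, and serialization speed matter at least as much as raw parse speed for the cached path.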
(5) rework DomTree to improve construction and clone speed
(6) rework AbstractParseTreeNode traversal schemes, mutation methods, and child list consistency checks
(7) rework ParseTreeNodes.newNodeInstance
(8) rework synthetic attribute lists
(9) profile per-node memory overhead
(10) rework HTML & CSS tree serialization
(11) rework Escaping
(12) memoize identifier normalization and string escaping
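For experiment (12), the idea is that identifiers and short string literals repeat heavily across gadgets, so caching the escaped form should hit often. A minimal sketch, with a stand-in escaping routine (the real Escaping class has its own API, and an unbounded map would need eviction in production):

```java
import java.util.concurrent.ConcurrentHashMap;

public class EscapeMemo {
  // Hypothetical memo table from raw string to its escaped form.
  private static final ConcurrentHashMap<String, String> MEMO = new ConcurrentHashMap<>();

  // Stand-in for an expensive escaping routine such as JS string escaping.
  static String escapeJsString(String s) {
    StringBuilder sb = new StringBuilder(s.length() + 8);
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      switch (c) {
        case '\\': sb.append("\\\\"); break;
        case '"':  sb.append("\\\""); break;
        case '\n': sb.append("\\n");  break;
        default:   sb.append(c);
      }
    }
    return sb.toString();
  }

  // Memoized wrapper: repeated identifiers pay the escaping cost only once.
  static String escapeMemoized(String s) {
    return MEMO.computeIfAbsent(s, EscapeMemo::escapeJsString);
  }
}
```

The same pattern would apply to identifier normalization: key on the raw identifier, memoize the normalized form, and fall through to the existing routine on a miss.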
