Any suggestions on experiments appreciated.

The Shindig guys pointed out that one of the problems with the
cacheability of cajoled output is that they cache parse trees at a
number of stages, not code.
They would like to use our parsers.  Our CSS parser is already better
than anything else they've tried, but NekoHTML is much faster than our
HTML parser, even though it doesn't do as good a job of mimicking
browser behavior around tag soup.
Parsing speed is a real concern, though, since their cache miss rate
is high enough that parsing time is non-trivial.

If we can give them an HTML parser that performs comparably to
NekoHTML, they're willing to rework their stack to use it which should
make it easier to get the cajoler working with existing processing
stages, and to make sure that the cajoler sees proper file position
information.  My profiling from a while back shows that parsing is a
small portion of the time consumed by the cajoler, but that did not
cover time occupied by traversing and mutating parse trees.

The operations they perform are:
  (1) parse if content is uncached or comes from a URL cache shared by
multiple services
  (2) clone if content is in a cache
  (3) traverse the DOM usually using a pre-order traversal
  (4) replace some nodes, add some, and remove others
  (5) serialize the DOM to a string.
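The cache-hit path through steps (2)-(5) can be sketched as follows.  This uses a hypothetical Node class, not Shindig's or Caja's actual DOM types, just to make the clone / pre-order rewrite / serialize sequence concrete:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal tree node; a stand-in for the real DomTree types.
final class Node {
    final String name;          // tag name, or "#text" for text nodes
    final String text;          // text content for text nodes
    final List<Node> children = new ArrayList<>();

    Node(String name, String text) { this.name = name; this.text = text; }

    // (2) deep clone so the cached tree is never mutated in place
    Node deepClone() {
        Node copy = new Node(name, text);
        for (Node c : children) copy.children.add(c.deepClone());
        return copy;
    }

    // (3)+(4) pre-order traversal that keeps or drops child nodes
    void rewrite(java.util.function.Predicate<Node> keep) {
        children.removeIf(c -> !keep.test(c));
        for (Node c : children) c.rewrite(keep);
    }

    // (5) serialize the tree back to markup
    void serialize(StringBuilder sb) {
        if (name.equals("#text")) { sb.append(text); return; }
        sb.append('<').append(name).append('>');
        for (Node c : children) c.serialize(sb);
        sb.append("</").append(name).append('>');
    }
}

public class CacheHitPath {
    public static void main(String[] args) {
        // Pretend this tree came out of the shared parse cache.
        Node cached = new Node("div", null);
        cached.children.add(new Node("script", null));
        cached.children.add(new Node("#text", "hello"));

        Node working = cached.deepClone();              // (2)
        working.rewrite(n -> !n.name.equals("script")); // (3)+(4): strip scripts
        StringBuilder sb = new StringBuilder();
        working.serialize(sb);                          // (5)
        System.out.println(sb);                         // <div>hello</div>
        System.out.println(cached.children.size());     // 2: cached copy untouched
    }
}
```

Note that the clone in step (2) exists only so the cached tree survives the destructive rewrites in step (4); any speedup to cloning or mutation helps every cache hit, not just misses.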

The vast majority of their requests come from non-developers, so they
want to optimize for the case where they do not need to report errors
with detailed position information to the end user.  There are a
number of reasons why they do not make logs available, one being that
they mix user data in during early stages, so they are unlikely to
change this policy.
We assumed during initial design that the parsers would mostly be used
by the cajoler, and that cajoling would be done once by a caching
gadget store which would have a way to report feedback to the gadget
developer, but neither of those assumptions turns out to hold.

I'm going to do some profiling and try a number of experiments to see
what improves parsing performance.  It may be worth trying some of the
following:
(1) add a parsing mode which does not store real file positions to
prevent unnecessary object creation.  This will break token adjacency
checks in some parsers, but those are fixable.
(2) rework the char producer to present lexers with multiple
characters at a time
(3) store token indices in FilePositions so that if messages need to
be reported, we can reconstruct the file positions.
(4) rework lexers to create fewer strings.
(5) rework DomTree to improve construction and clone speed
(6) rework AbstractParseTreeNode traversal schemes, mutation methods,
and child list consistency checks
(7) rework ParseTreeNodes.newNodeInstance
(8) rework synthetic attribute lists
(9) profile per-node memory overhead
(10) rework HTML & CSS tree serialization
(11) rework Escaping
(12) memoize identifier normalization and string escaping.
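To make experiment (12) concrete, here is a sketch of memoized string escaping.  The escapeJsString helper is a stand-in for illustration, not Caja's actual Escaping API; the point is that identifiers and strings which recur across a document get escaped once and thereafter cost a single hash lookup:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MemoizedEscaper {
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    // Stand-in escaper (hypothetical, not the real Escaping code):
    // wraps the input in quotes and backslash-escapes the usual suspects.
    static String escapeJsString(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 2);
        sb.append('"');
        for (int i = 0; i < s.length(); i++) {
            char ch = s.charAt(i);
            switch (ch) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                default:   sb.append(ch);
            }
        }
        return sb.append('"').toString();
    }

    String escape(String s) {
        // computeIfAbsent escapes each distinct string at most once
        return cache.computeIfAbsent(s, MemoizedEscaper::escapeJsString);
    }

    public static void main(String[] args) {
        MemoizedEscaper e = new MemoizedEscaper();
        System.out.println(e.escape("say \"hi\""));          // "say \"hi\""
        System.out.println(e.escape("x") == e.escape("x"));  // true: cached instance
    }
}
```

Whether this wins depends on how skewed the distribution of escaped strings is; if most are unique, the cache is pure overhead, which is exactly the kind of thing the profiling should settle first.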
