At 09:04 AM 4/15/01 +1000, Martin Sevior wrote:
>With a header in place getEdittableBounds() searches for the end of the
>editable region by skipping through the SectionLayout classes until it
>bumps up against a HdrFtrSection. Then it finds the last block, then the
>last run in the block, and returns the docPosition of the run. That's the
>end of the editable region.

Cool!

>On the other hand getBounds() skips through all
>the Frags in the document until it runs out of them, then finds the
>position of the last one.

Yep.  Note, however, that in *both* cases you still walk the entire frag 
list, which gets slower and slower as documents balloon.  

Jeff's original code to calculate DocPositions just walks through all 
Frags from the beginning of the document, adding lengths.  It's not fast, 
but it's guaranteed to be reliable, which is good, since any math errors 
here would be devastating.  (Think about it.)
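The walk can be sketched roughly as follows.  (Names here are hypothetical stand-ins; the real pf_Frag and piece table carry far more state than this toy version.)

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a document fragment; the real AbiWord
// pf_Frag is much richer than this.
struct Frag {
    size_t length;  // characters covered by this fragment
    Frag*  next;    // singly-linked, like the piece-table frag list
};

// Walk from the head of the frag list, summing lengths until we reach
// the target frag.  O(n) in the number of frags, but trivially correct:
// there is no cached state that can drift out of sync with the document.
size_t getFragPosition(const Frag* head, const Frag* target) {
    size_t pos = 0;
    for (const Frag* f = head; f != nullptr; f = f->next) {
        if (f == target)
            return pos;
        pos += f->length;
    }
    return pos;  // target not found: position past the end
}
```

The correctness argument is exactly the "not fast, but reliable" trade-off: every query recomputes from scratch, so an edit anywhere in the document can never leave a stale answer behind.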

At the time, the two of us discussed a few caching strategies which could 
speed up this operation for common cases, but didn't have enough data on 
actual use cases to choose the right one.  For example:

  - cache the last lookup results
  - cache the overall doc size
  - cache aggregate lengths of a strux frag's contents (walk vs. skip)
  - etc. 

The first is the simplest, because it speeds up "repeat" lookups and doesn't 
need to interact with the editing code at all.  The second would speed up 
getBounds, and allow walking from either end of the document, at the cost of 
minimal interaction on edit.  The third enables more speedups, but requires a 
lot more edit-time interaction, and could theoretically slow down typing.  I'm 
sure that clever folks with a good profiler could come up with even more 
alternatives to these three.  
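To make the first strategy concrete, here's a minimal sketch (again with hypothetical names, not the actual AbiWord API): remember the result of the last lookup, and clear it on every edit.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a document fragment.
struct Frag {
    size_t length;
    Frag*  next;
};

// Simplest caching strategy: remember the last lookup.  A repeated
// query for the same frag is O(1); anything else falls back to the
// full O(n) walk.  Every edit must invalidate the cache, but nothing
// else about the editing code has to change.
class PositionCache {
public:
    size_t lookup(const Frag* head, const Frag* target) {
        if (target == m_lastFrag)
            return m_lastPos;                    // cache hit
        size_t pos = 0;                          // cache miss: full walk
        for (const Frag* f = head; f && f != target; f = f->next)
            pos += f->length;
        m_lastFrag = target;                     // remember for next time
        m_lastPos  = pos;
        return pos;
    }
    void invalidate() { m_lastFrag = nullptr; }  // call on every edit

private:
    const Frag* m_lastFrag = nullptr;
    size_t      m_lastPos  = 0;
};
```

Note that this cache is only "non-brittle" because invalidate() throws everything away: the hard part of the fancier strategies is updating cached lengths incrementally without ever getting them wrong.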

Caching code often tends to be tricky to debug and maintain, and at the time 
we had other things to get working (like typing, for instance).  Thus, we 
agreed to file the TODO bug mentioned earlier in this thread to come back 
and speed this up later.  Sounds like now is the time, huh?  :-)

Any volunteers?  If so, I'd suggest that any patches or commits in this area 
be accompanied by:

  - before and after profile results
  - a theoretical explanation of *why* it's faster 
  - an explanation of why *this* cache isn't brittle

Thanks,
Paul

PS:  For critical caches like this, it might also be worth stealing a trick 
from the Excel development team and adding some DEBUG-only code which compares 
the results of *both* approaches:

  - the cache, and 
  - the full-walk it supersedes

thereby detecting any runtime errors in the caching code.  It shouldn't make 
debug builds any slower than they already are, and it's a *wonderful* way to 
isolate any problems that might arise.  
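That cross-check pattern might look something like this toy example (an incrementally maintained total checked against a from-scratch recomputation in debug builds only):

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Toy model of the Excel-style cross-check: a cached running total
// verified against the full recomputation it replaces.
struct Doc {
    std::vector<size_t> lengths;
    size_t cachedTotal = 0;   // maintained incrementally on each edit

    void append(size_t len) {
        lengths.push_back(len);
        cachedTotal += len;   // the "fast" incremental update
    }

    size_t totalLength() const {
#ifndef NDEBUG
        // Debug-only: recompute from scratch and make sure the cache
        // agrees, so any caching bug is caught the moment it happens.
        size_t full = std::accumulate(lengths.begin(), lengths.end(),
                                      size_t{0});
        assert(full == cachedTotal);
#endif
        return cachedTotal;   // release builds pay only for the cache
    }
};
```

Since the assert compiles away under NDEBUG, release builds see only the cached path, while debug builds flag any divergence at the exact call site where it first appears.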
