Re: [Groff] caching result of charinfo::get_flags

Werner LEMBERG Mon, 20 Dec 2010 23:36:11 -0800

> > Maybe the look-up algorithm of `get_flags' (without caching) could
> > also be optimized.  IIUC it currently does not sort/merge the
> > ranges and instead checks all of them linearly.
>
> Certainly, but for the current state, I think this isn't necessary.
> It might be worth looking at eventually, since everything which
> makes GNU troff faster is good...
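
For reference, the sort/merge idea mentioned above could look roughly
like the sketch below.  This is not groff code: `FlagRange',
`RangeTable', and their members are hypothetical names, and the sketch
assumes that the flags of all ranges containing a code point are
simply OR'ed together.  Overlapping ranges are split into disjoint,
sorted segments once, so each look-up is a binary search instead of a
linear scan.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a flagged code-point range; these names
// do not come from the groff sources.
struct FlagRange {
  int lo;          // first code point (inclusive)
  int hi;          // last code point (inclusive)
  unsigned flags;  // flag bits that apply to [lo, hi]
};

class RangeTable {
public:
  void add(int lo, int hi, unsigned flags) {
    assert(lo <= hi);
    raw_.push_back({lo, hi, flags});
    dirty_ = true;  // segments are rebuilt lazily on the next look-up
  }

  // Returns the OR of the flags of all ranges containing `code'.
  unsigned get_flags(int code) {
    if (dirty_)
      rebuild();
    // Find the first segment that starts after `code', then step back.
    auto it = std::upper_bound(
        segs_.begin(), segs_.end(), code,
        [](int c, const FlagRange &r) { return c < r.lo; });
    if (it == segs_.begin())
      return 0;
    --it;
    return code <= it->hi ? it->flags : 0;
  }

private:
  // Split the raw ranges at every boundary, OR the flags per piece,
  // and merge adjacent pieces that carry identical flags.
  void rebuild() {
    std::vector<long> cuts;
    for (const FlagRange &r : raw_) {
      cuts.push_back(r.lo);
      cuts.push_back(static_cast<long>(r.hi) + 1);
    }
    std::sort(cuts.begin(), cuts.end());
    cuts.erase(std::unique(cuts.begin(), cuts.end()), cuts.end());

    segs_.clear();
    for (std::size_t i = 0; i + 1 < cuts.size(); ++i) {
      unsigned f = 0;
      for (const FlagRange &r : raw_)
        if (r.lo <= cuts[i] && cuts[i] <= r.hi)
          f |= r.flags;
      if (f == 0)
        continue;  // gap not covered by any range
      int lo = static_cast<int>(cuts[i]);
      int hi = static_cast<int>(cuts[i + 1] - 1);
      if (!segs_.empty() && segs_.back().flags == f
          && segs_.back().hi + 1 == lo)
        segs_.back().hi = hi;
      else
        segs_.push_back({lo, hi, f});
    }
    dirty_ = false;
  }

  std::vector<FlagRange> raw_;   // ranges as added
  std::vector<FlagRange> segs_;  // disjoint, sorted by lo
  bool dirty_ = false;
};
```

The rebuild step is quadratic in the number of ranges, which should
not matter as long as ranges are added rarely and looked up often.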


BTW, I've just profiled GNU troff with the file `bash.1', version 2.05,
from the linuxjm project (at 217kByte it is about 30 times larger than
`gprof.1'), and the profile shows a completely different hot spot:

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 43.14      0.22     0.22    78453     0.00     0.00  unicode_decompose_ptable::lookup
  5.88      0.25     0.03   695361     0.00     0.00  token::next
  5.88      0.28     0.03     9941     0.00     0.00  file_iterator::fill
  3.92      0.30     0.02   202744     0.00     0.00  tfont::get_width
  3.92      0.32     0.02   108869     0.00     0.00  read_long_escape_name
  ...

Doing the same for the English bash.1 (version 4.1, about 276kByte), I
get this:

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 11.11      0.03     0.03   915059     0.00     0.00  token::next
 11.11      0.06     0.03   406191     0.00     0.00  font::get_width
  7.41      0.08     0.02   609951     0.00     0.00  glyph_to_unicode
  7.41      0.10     0.02   154972     0.00     0.00  symbol::symbol
  7.41      0.12     0.02   150763     0.00     0.00  string_iterator::fill
  ...

The timings are from a normal build (-O2).

Bruno, who has worked a lot on groff's Unicode support, already
pointed out in a comment in ptable.cpp that groff's `mythical
Aho-Hopcroft-Ullman hash function' can be improved; see

  http://www.haible.de/bruno/hashfunc.html

While this is of virtually no importance for Latin man pages,
non-Latin man pages (CJK, Russian, Greek, etc.), which contain
zillions of \[uXXXX] entries, would benefit a lot...
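
To illustrate what a replacement string hash might look like, here is
a minimal sketch using 32-bit FNV-1a.  This is a swapped-in, well-known
function chosen only for illustration; it is neither the function
currently in ptable.cpp nor necessarily the one recommended on the
page above.  The names `fnv1a' and `bucket_index' are hypothetical, and
the sample key in the note below is merely assumed to resemble what
the decomposition table stores.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// 32-bit FNV-1a over a NUL-terminated string: XOR each byte into the
// hash, then multiply by the FNV prime.  Every input byte affects all
// later state, so keys differing in a single character get distinct
// hash values when they have the same length.
inline std::uint32_t fnv1a(const char *s) {
  assert(s != nullptr);
  std::uint32_t h = 2166136261u;  // FNV offset basis
  for (; *s != '\0'; ++s) {
    h ^= static_cast<unsigned char>(*s);
    h *= 16777619u;               // FNV prime
  }
  return h;
}

// Hypothetical helper: map a key to a slot in a table of `table_size'
// entries (how the table resolves collisions is left out here).
inline std::size_t bucket_index(const char *key, std::size_t table_size) {
  assert(table_size > 0);
  return fnv1a(key) % table_size;
}
```

A call such as bucket_index("00C0", 1021) would then pick the slot for
one key.  Whether this actually beats the current function for the
keys groff uses would have to be measured with the same profiling run
as above.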


    Werner
