Scintilla's use of PropSet is very simple (32 keys are inserted by
SciTE) and hasn't appeared important in profiles. SciTE's use is
heavier, but the main problem there is not hashed access but file
pattern based properties, which require exhaustive searches. Say the
file 'hash.xx' has been opened and SciTE goes looking for a lexer as
specified in one of the lexer.<filepattern> properties: lexer.*.cxx,
lexer.*.hh;*.xx and lexer.*.js;*.html. It can't look up a particular
hash because the <filepattern> may contain more than one pattern.
Therefore it looks at every single property, checking for the prefix
"lexer." and then whether the filename matches the rest of the key
name. Various improvements could be made to the data structure to
address this, such as linking together all the keys with a particular
prefix. While the time taken by these searches is noticeable in
profiles, it has never been slow enough to be worth fixing.
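
To make the cost concrete, here is a minimal sketch of that exhaustive
search. It is not SciTE's actual code: MatchesPattern and
FindByFilePattern are hypothetical names, std::map stands in for
PropSet, and only the '*' wildcard is handled.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: true if name matches pattern, where '*' matches
// any run of characters (including none). No other wildcards.
bool MatchesPattern(const std::string &pattern, const std::string &name) {
    size_t p = 0, n = 0, mark = 0;
    size_t star = std::string::npos;
    while (n < name.size()) {
        if (p < pattern.size() && pattern[p] == '*') {
            star = p++;          // remember the star, try matching nothing
            mark = n;
        } else if (p < pattern.size() && pattern[p] == name[n]) {
            ++p;
            ++n;
        } else if (star != std::string::npos) {
            p = star + 1;        // backtrack: let the star eat one more char
            n = ++mark;
        } else {
            return false;
        }
    }
    while (p < pattern.size() && pattern[p] == '*')
        ++p;
    return p == pattern.size();
}

// Hypothetical lookup: every property is visited because the part of the
// key after the prefix is a ';' separated pattern list, not a hashable name.
std::string FindByFilePattern(const std::map<std::string, std::string> &props,
                              const std::string &prefix,
                              const std::string &filename) {
    for (std::map<std::string, std::string>::const_iterator it = props.begin();
         it != props.end(); ++it) {
        const std::string &key = it->first;
        if (key.compare(0, prefix.size(), prefix) != 0)
            continue;            // not a "lexer." style key
        std::istringstream patterns(key.substr(prefix.size()));
        std::string pattern;
        while (std::getline(patterns, pattern, ';')) {
            if (MatchesPattern(pattern, filename))
                return it->second;
        }
    }
    return std::string();        // no pattern matched
}
```

Calling FindByFilePattern(props, "lexer.", "hash.xx") walks the whole
property set before or until it reaches lexer.*.hh;*.xx, which is why
the cost grows with the total number of properties rather than with
the number of lexer keys.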

Robert:

2) I packaged it in a very contained way - just replacing the previous
[inline] definition of PropSet::HashString in propset.h

  About the only negative I have is that the code is longer and more
complex. For SinkWorld I used Python's string hash, which is 7 lines
of code.
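
For a sense of scale, a hash in that style looks roughly like the
sketch below. It is modelled on the classic CPython 2.x string hash as
best recalled, not copied from SinkWorld; the function name is borrowed
from PropSet purely for illustration, and unsigned arithmetic is an
assumption made here so that overflow is well defined.

```cpp
#include <cstddef>
#include <string>

// Sketch of a multiplicative string hash in the style of CPython 2.x.
// Assumption: unsigned wrap-around replaces Python's signed long maths.
unsigned int HashString(const std::string &s) {
    if (s.empty())
        return 0;
    unsigned int x = static_cast<unsigned char>(s[0]) << 7;
    for (size_t i = 0; i < s.size(); ++i)
        x = (1000003u * x) ^ static_cast<unsigned char>(s[i]);
    x ^= static_cast<unsigned int>(s.size());
    return x;
}
```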

3) The legalities are discussed at some length by the author at

http://www.azillionmonkeys.com/qed/weblicense.html

By stripping the already quite sparse comments, I believe the borrowed
"raw source code" qualifies under the author's "derivative" license,
and Scintilla's own licensing terms do not invalidate this.

  Interpreting the conditions is fun but it looks OK.

Properly balancing size and performance is the trick here - since
PropSet gets used in a one-size-fits-all way (with large sets but with
space still mattering), I have used 256 buckets as a compromise.
Going larger just gets better (my own heavily instrumented use of this
algorithm was with 4096 buckets), so upping this might make sense.
This is really up to Neil, who *may* want to make it a configurable
option so that, say, SciTE could use a larger number like 4096, while
more constrained uses of Scintilla would use 256 (or even 32).

  Where the level of use is uncertain, I like hash tables that can
grow, like SinkWorld's Dictionary, which doubles in size once it is
2/3 full.
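
A minimal sketch of that growth policy, assuming a chained table and
std::hash for the hash function. GrowableTable and its members are
hypothetical names for illustration; this is not SinkWorld's
Dictionary nor a proposed PropSet replacement.

```cpp
#include <cstddef>
#include <functional>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Hypothetical chained hash table that doubles its bucket count once the
// number of entries reaches 2/3 of the number of buckets.
class GrowableTable {
    typedef std::pair<std::string, std::string> Entry;
    std::vector<std::list<Entry> > buckets;
    size_t used;

    static size_t Index(const std::string &key, size_t bucketCount) {
        return std::hash<std::string>()(key) % bucketCount;
    }

    void Grow() {
        // Rehash every entry into a table with twice as many buckets.
        std::vector<std::list<Entry> > bigger(buckets.size() * 2);
        for (size_t b = 0; b < buckets.size(); ++b) {
            for (const Entry &e : buckets[b])
                bigger[Index(e.first, bigger.size())].push_back(e);
        }
        buckets.swap(bigger);
    }

public:
    // The starting size is the caller's choice, e.g. 32 for a constrained
    // client or 4096 for a heavy one.
    explicit GrowableTable(size_t initialBuckets = 32)
        : buckets(initialBuckets ? initialBuckets : 1), used(0) {}

    void Set(const std::string &key, const std::string &value) {
        std::list<Entry> &chain = buckets[Index(key, buckets.size())];
        for (Entry &e : chain) {
            if (e.first == key) {
                e.second = value;   // overwrite, no growth needed
                return;
            }
        }
        chain.push_back(Entry(key, value));
        ++used;
        if (used * 3 >= buckets.size() * 2)
            Grow();
    }

    std::string Get(const std::string &key) const {
        const std::list<Entry> &chain = buckets[Index(key, buckets.size())];
        for (const Entry &e : chain) {
            if (e.first == key)
                return e.second;
        }
        return std::string();       // missing keys read as empty
    }

    size_t BucketCount() const { return buckets.size(); }
    size_t Size() const { return used; }
};
```

With this policy the starting bucket count matters much less, since a
small table only pays for more buckets once it actually fills up.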

  Scintilla only uses simple key names (no filepatterns) and integer
values, so it could actually use a very simple implementation, moving
the full featured hash table into SciTE. I wouldn't mind shrinking the
'helper library' parts of Scintilla (SString, PropSet, WordList, ...)
that are statically linked into client code, since they complicate
using Scintilla.

  Neil
_______________________________________________
Scite-interest mailing list
[email protected]
http://mailman.lyra.org/mailman/listinfo/scite-interest
