On 10/10/2013 05:23, Marvin Humphrey wrote:
I suspect that having TextTermStepper move away from mutating may have
minor performance implications.  I ran some benchmarks and found some slight
degradation from 0.3 to master and from master to cfish-string-prep1 as of
fde94c411c7c73ad35171bb19295f781ed48e0dd -- results below.  However, I still
support merging the branch, just with the note that this may now be a
hotspot to look into when refactoring at some point in the future.

Which commit on the new branch did you benchmark exactly? I added back some of the optimizations in c69fb741a5d016455b56de8ca3890c33f55ce464 (S_write_terms_and_postings in PostingPool) shortly after fde94c411c7c73ad35171bb19295f781ed48e0dd.

The part of the indexing code that's still affected by the TextTermStepper changes should be PostPool_Refill. This code loops over a Lexicon and repeatedly reads terms via Lex_Get_Term. With immutable strings, a new String is allocated for each term, whereas the old mutating code could reuse a single buffer. I can't see an easy way to speed that up, but the performance degradation shouldn't be too bad.
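
To illustrate where the extra cost comes from, here is a toy sketch in plain C. It is not Lucy or Clownfish code: ToyBuf and all the toy_* functions are hypothetical names made up for this example, and the "lexicon" is just an array of C strings. It only counts allocations under the two approaches, assuming the mutable variant grows its buffer solely when a term doesn't fit.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reusable buffer, standing in for a mutable string. */
typedef struct {
    char  *ptr;
    size_t cap;
} ToyBuf;

/* Mutable style: copy the term into a caller-owned buffer and grow it
 * only when the term does not fit. Returns 1 if it had to allocate. */
static size_t
toy_read_into(ToyBuf *buf, const char *term) {
    size_t need   = strlen(term) + 1;
    size_t allocs = 0;
    if (need > buf->cap) {
        char *grown = realloc(buf->ptr, need);
        if (grown == NULL) { abort(); }
        buf->ptr = grown;
        buf->cap = need;
        allocs   = 1;
    }
    memcpy(buf->ptr, term, need);
    return allocs;
}

/* Immutable style: every term read yields a freshly allocated copy. */
static char*
toy_read_new(const char *term) {
    size_t need = strlen(term) + 1;
    char  *copy = malloc(need);
    if (copy == NULL) { abort(); }
    memcpy(copy, term, need);
    return copy;
}

/* Loop over all terms, reusing one buffer; return allocation count. */
size_t
toy_count_allocs_mutable(const char **terms, size_t count) {
    ToyBuf buf    = { NULL, 0 };
    size_t allocs = 0;
    assert(terms != NULL);
    for (size_t i = 0; i < count; i++) {
        allocs += toy_read_into(&buf, terms[i]);
    }
    free(buf.ptr);
    return allocs;
}

/* Loop over all terms, allocating per term; return allocation count. */
size_t
toy_count_allocs_immutable(const char **terms, size_t count) {
    size_t allocs = 0;
    assert(terms != NULL);
    for (size_t i = 0; i < count; i++) {
        char *term = toy_read_new(terms[i]);
        allocs++;
        free(term);
    }
    return allocs;
}
```

In the mutable variant the allocation count is bounded by how often a longer term shows up, while the immutable variant always pays one allocation per term. How much that matters in practice depends on the allocator and on everything else the loop does, so this says nothing about the size of the actual slowdown.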

Nick
