Re: bytecount as String and prefix length

Marvin Humphrey Tue, 01 Nov 2005 20:52:51 -0800


On Nov 1, 2005, at 9:51 AM, Doug Cutting wrote:

Another approach might be to, instead of converting to UTF-8 to strings right away, change things to convert lazily, if at all.
During index merging such conversion should never be needed.

!!

There ought to be some gains possible there, then. No predictions as to how much, though.

You needn't do this systematically throughout Lucene, but only where it makes a big difference. For example, if you could avoid strings in SegmentMerger.mergeTermInfos() it might make a huge difference. This might be as simple as changing SegmentMergeInfo to use a TermBuffer instead of a Term. Does that make sense?

Abundant sense. I'm not as familiar with SegmentMerger as I am with other parts of the org.apache.lucene.index package, because I haven't ported it yet. But conceptually I understand exactly why this should require fewer resources.
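
To make the idea concrete, here is a rough sketch of lazy conversion. It assumes term text is kept as raw UTF-8 bytes and decoded to a String only when somebody asks for one. LazyTermText and all of its methods are hypothetical names made up for illustration; this is not Lucene's TermBuffer or SegmentMergeInfo, only the general shape of the technique.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical illustration only; not Lucene's actual TermBuffer. */
final class LazyTermText implements Comparable<LazyTermText> {
    private final byte[] utf8;   // raw term bytes, as they might be read from an index file
    private String cached;       // decoded form, built only on demand
    private int decodeCount;     // how many times a String was actually built

    LazyTermText(byte[] utf8) {
        this.utf8 = Arrays.copyOf(utf8, utf8.length);
    }

    static LazyTermText of(String s) {
        return new LazyTermText(s.getBytes(StandardCharsets.UTF_8));
    }

    /** Orders terms by unsigned byte value without creating any String. */
    @Override
    public int compareTo(LazyTermText other) {
        int n = Math.min(utf8.length, other.utf8.length);
        for (int i = 0; i < n; i++) {
            int a = utf8[i] & 0xFF;
            int b = other.utf8[i] & 0xFF;
            if (a != b) {
                return a - b;
            }
        }
        return utf8.length - other.utf8.length;
    }

    /** Number of leading bytes shared with another term (a byte count, not a char count). */
    int sharedPrefixLength(LazyTermText other) {
        int n = Math.min(utf8.length, other.utf8.length);
        int i = 0;
        while (i < n && utf8[i] == other.utf8[i]) {
            i++;
        }
        return i;
    }

    /** Decodes to a String on first use and caches the result. */
    String text() {
        if (cached == null) {
            cached = new String(utf8, StandardCharsets.UTF_8);
            decodeCount++;
        }
        return cached;
    }

    int decodeCount() {
        return decodeCount;
    }
}
```

A merge loop built on something like this could compare terms and compute shared prefixes without ever calling text(). One caveat: comparing UTF-8 bytes orders terms by code point, which differs from Java's String.compareTo (UTF-16 code unit order) for characters outside the Basic Multilingual Plane, so the index's sort order would have to be defined accordingly.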


I'll take a swing at SegmentMerger and submit a comprehensive diff.

Thanks for the suggestions,

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/


