Another approach might be to convert UTF-8 to strings lazily, if at
all, instead of converting right away. During index
merging such conversion should never be needed. You needn't do this
systematically throughout Lucene, but only where it makes a big
difference. For example, if you could avoid strings in
SegmentMerger.mergeTermInfos() it might make a huge difference. This
might be as simple as changing SegmentMergeInfo to use a TermBuffer
instead of a Term. Does that make sense?
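
To illustrate the lazy idea, here is a rough sketch, not Lucene code.
LazyTermText is a hypothetical class invented for this example; it is
not TermBuffer and makes no claim about how TermBuffer is implemented.
It assumes term text arrives as raw UTF-8 bytes, compares terms on
those bytes, and builds a String only if someone asks for one. Note
that unsigned byte order matches Unicode code point order, which can
differ from String.compareTo() for characters outside the BMP.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Lucene: keeps a term's text as raw
// UTF-8 bytes and decodes to a String only on demand.
final class LazyTermText implements Comparable<LazyTermText> {
    private final byte[] utf8;  // bytes as they would be read from the index
    private String decoded;     // stays null until text() is first called
    private int decodeCount;    // for illustration: how often we decoded

    LazyTermText(byte[] utf8) {
        this.utf8 = utf8.clone();
    }

    static LazyTermText ofString(String s) {
        return new LazyTermText(s.getBytes(StandardCharsets.UTF_8));
    }

    // Compares unsigned bytes directly; no String is created here.
    @Override
    public int compareTo(LazyTermText other) {
        int n = Math.min(utf8.length, other.utf8.length);
        for (int i = 0; i < n; i++) {
            int diff = (utf8[i] & 0xFF) - (other.utf8[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return utf8.length - other.utf8.length;
    }

    // Decodes at most once, and only when a caller needs the String.
    String text() {
        if (decoded == null) {
            decoded = new String(utf8, StandardCharsets.UTF_8);
            decodeCount++;
        }
        return decoded;
    }

    int decodeCount() {
        return decodeCount;
    }

    public static void main(String[] args) {
        LazyTermText a = ofString("apple");
        LazyTermText b = ofString("banana");
        // A merge loop would only need the ordering, so nothing is decoded.
        System.out.println("a before b: " + (a.compareTo(b) < 0));
        System.out.println("decodes so far: " + (a.decodeCount() + b.decodeCount()));
    }
}
```

A merge that only orders and copies terms would call compareTo() and
never text(), so under these assumptions it would allocate no strings.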
Doug