Another approach might be, instead of converting UTF-8 to strings right away, to convert lazily, if at all. During index merging such conversion should never be needed. You needn't do this systematically throughout Lucene, only where it makes a big difference. For example, if you could avoid strings in SegmentMerger.mergeTermInfos() it might make a huge difference. This might be as simple as changing SegmentMergeInfo to use a TermBuffer instead of a Term. Does that make sense?
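
To make the idea concrete, here is a rough sketch in Java. It is only an illustration under assumptions: LazyTerm is a made-up class, not Lucene's TermBuffer or Term, and it assumes term text arrives as standard UTF-8 bytes. It keeps the raw bytes, orders terms by comparing those bytes, and builds a String only if someone asks for one.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical class for illustration only; not part of Lucene.
final class LazyTerm implements Comparable<LazyTerm> {
    private final String field;
    private final byte[] utf8;   // term text as raw UTF-8 bytes (assumed encoding)
    private String text;         // decoded on first request, then cached

    LazyTerm(String field, byte[] utf8) {
        this.field = field;
        this.utf8 = utf8.clone();
    }

    // Orders by field, then by unsigned byte value. No String is created here.
    @Override
    public int compareTo(LazyTerm other) {
        int c = field.compareTo(other.field);
        if (c != 0) {
            return c;
        }
        int n = Math.min(utf8.length, other.utf8.length);
        for (int i = 0; i < n; i++) {
            int d = (utf8[i] & 0xFF) - (other.utf8[i] & 0xFF);
            if (d != 0) {
                return d;
            }
        }
        return utf8.length - other.utf8.length;
    }

    // Decodes lazily; a merge that only compares terms never pays for this.
    String text() {
        if (text == null) {
            text = new String(utf8, StandardCharsets.UTF_8);
        }
        return text;
    }

    // True once text() has been called at least once.
    boolean isDecoded() {
        return text != null;
    }
}
```

A merge loop built on something like this could order and copy terms without ever calling text(). One caveat: unsigned byte order of standard UTF-8 follows code point order, which can differ from String.compareTo() for characters outside the BMP, so a real change would need to confirm that the existing term order is preserved.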

Doug

