In theory, you could use the codec API directly, adding "chunks" of pre-sorted terms, and then fake up a SegmentInfo to make it look like some kind of degenerate segment, and then merge them?
But it's gonna be a lot of work to do that :)

Merging FSTs sounds cool!

Mike McCandless

http://blog.mikemccandless.com

On Tue, Jun 14, 2011 at 8:18 AM, Dawid Weiss <[email protected]> wrote:
>> So actually it would work if you just enum'd the terms yourself, after
>> indexing and optimizing. And this does amount to an external sort, I
>> think!
>
> Yep. I was just curious if there's a way to do it without the overhead
> of creating fields, documents, etc. If I have a spare minute I'll try
> to write a merge sort from disk splits. It'd be neat to write FST
> merging too (so that, given two FSTs, you could merge them into one by
> creating a new FST and adding sequences in order from one or the other
> source).
>
> Dawid

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
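
For reference, the "merge sort from disk splits" idea in the quoted message could look roughly like the sketch below. It is plain Python rather than anything from Lucene: the function names (write_sorted_chunks, merge_sorted_chunks, external_sort) are made up for illustration, terms are assumed to be newline-free strings, and ordering is by Unicode code point. The merged, de-duplicated stream is the kind of in-order input one would then feed to whatever builds the final FST.

```python
import heapq
import os
import tempfile


def write_sorted_chunks(terms, chunk_size, directory):
    """Split terms into chunks, sort each chunk in memory, write one term per line."""
    paths = []
    for start in range(0, len(terms), chunk_size):
        # De-duplicate and sort within the chunk; assumes terms contain no newlines.
        chunk = sorted(set(terms[start:start + chunk_size]))
        path = os.path.join(directory, f"chunk-{len(paths)}.txt")
        with open(path, "w", encoding="utf-8") as f:
            f.writelines(term + "\n" for term in chunk)
        paths.append(path)
    return paths


def merge_sorted_chunks(paths):
    """Yield unique terms in sorted order by k-way merging the sorted chunk files."""
    files = [open(p, encoding="utf-8") for p in paths]
    try:
        # Each stream lazily reads one chunk file, so only one line per chunk
        # needs to be held in memory at a time.
        streams = [(line.rstrip("\n") for line in f) for f in files]
        previous = None
        for term in heapq.merge(*streams):
            if term != previous:  # drop duplicates that span chunks
                yield term
                previous = term
    finally:
        for f in files:
            f.close()


def external_sort(terms, chunk_size=2):
    """Sort and de-duplicate terms via on-disk sorted chunks (hypothetical helper)."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = write_sorted_chunks(terms, chunk_size, tmp)
        return list(merge_sorted_chunks(paths))
```

The same merge loop would apply to two existing FSTs if each can be enumerated in sorted order: walk both enumerations, always take the smaller next sequence, and add it to the new FST.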
