Yes, all postings for the entire doc are held in RAM data structures.
You could write your own indexing chain to change this behavior,
but I don't think that's an easy task.
Mike McCandless
http://blog.mikemccandless.com
On Thu, Feb 20, 2014 at 4:02 PM, Igor Shalyminov wrote:
Mike, thank you!
So eventually this amount of data must stay entirely in RAM (as postings)
before flushing to disk?
Can it be hacked? :)
The documents themselves (that I will deliver to the user) are of a regular size,
but features that I generate grow combinatorially in size and blow the index up
i
Yes, in 4.x IndexWriter now takes an Iterable that enumerates the
fields one at a time.
You can also pass a Reader to a Field.
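
To make the streaming idea concrete, here is a minimal plain-Java sketch with no Lucene dependency. Everything in it (SketchField, lazyFields, invert, the whitespace tokenizer) is a hypothetical stand-in and not Lucene API; in Lucene 4.x the corresponding pieces would presumably be an Iterable of IndexableField passed to IndexWriter.addDocument and a Reader-valued field. It creates fields only when the consumer asks for them and reads each value from a Reader, yet the map of term positions still grows with every token of the document.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.TreeMap;

// Hypothetical stand-in for a field whose value is a Reader; not a Lucene class.
final class SketchField {
    final String name;
    final Reader value;

    SketchField(String name, Reader value) {
        this.name = name;
        this.value = value;
    }
}

final class LazyFieldsSketch {

    // Fields are produced on demand: field i exists only once the consumer pulls it.
    static Iterable<SketchField> lazyFields(int count) {
        return () -> new Iterator<SketchField>() {
            private int next = 0;

            @Override
            public boolean hasNext() {
                return next < count;
            }

            @Override
            public SketchField next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                int i = next++;
                // Toy field text; a real feature generator would go here.
                return new SketchField("feature" + i, new StringReader("shared f" + i + " shared"));
            }
        };
    }

    // Toy inverter: maps "field:term" to its positions within that field.
    // The input is streamed, but this map keeps one entry per token occurrence.
    static Map<String, List<Integer>> invert(Iterable<SketchField> fields) {
        Map<String, List<Integer>> postings = new TreeMap<>();
        for (SketchField field : fields) {
            int position = 0;
            StringBuilder token = new StringBuilder();
            try (Reader reader = field.value) {
                int c;
                while ((c = reader.read()) != -1) {
                    if (Character.isWhitespace(c)) {
                        position = emit(postings, field.name, token, position);
                    } else {
                        token.append((char) c);
                    }
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            emit(postings, field.name, token, position); // flush the last token
        }
        return postings;
    }

    // Records the buffered token (if any) and returns the next position.
    private static int emit(Map<String, List<Integer>> postings, String fieldName,
                            StringBuilder token, int position) {
        if (token.length() == 0) {
            return position;
        }
        postings.computeIfAbsent(fieldName + ":" + token, k -> new ArrayList<>()).add(position);
        token.setLength(0);
        return position + 1;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> postings = invert(lazyFields(3));
        int occurrences = 0;
        for (List<Integer> positions : postings.values()) {
            occurrences += positions.size();
        }
        System.out.println(postings.size() + " terms, " + occurrences + " positions held in memory");
    }
}
```

Only one SketchField is alive at a time here, but the postings map is never released until the whole document has been consumed, which is the part that streaming the input does not help with.
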
That said, there will still be massive RAM required by IW to hold the
inverted postings for that one document, likely much more RAM than the
original document's String co