On Wed, 2014-12-10 at 15:27 +0100, Michael McCandless wrote:
> No, Lucene does not store numeric type nor multi-valued-ness today;
> it's frustrating.

At least I now know not to dig too deep for non-existing answers,
thanks. Our current code requires the user to be explicit about how the
content of the fields should be treated. Until a more fundamental
change, such as LUCENE-6005, arrives, we will leave it at that.
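
A minimal sketch of what such an explicit field-to-type mapping could
look like, using only the JDK. The class name, the enum values and the
"field=TYPE" spec format are hypothetical illustrations; they are not
Lucene API and not the code of the tool discussed here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FieldTypeMapping {
    // Hypothetical enum: the kinds of DocValues a stored field could be mapped to.
    enum DvType { NUMERIC_LONG, NUMERIC_DOUBLE, SORTED, SORTED_SET }

    // Parses a user-supplied spec such as "price=NUMERIC_DOUBLE,tags=SORTED_SET".
    // Nothing is guessed: every field to convert must be listed with its type.
    static Map<String, DvType> parse(String spec) {
        Map<String, DvType> mapping = new LinkedHashMap<>();
        for (String entry : spec.split(",")) {
            String[] parts = entry.trim().split("=", 2);
            if (parts.length != 2 || parts[0].isEmpty()) {
                throw new IllegalArgumentException("Expected field=TYPE, got: " + entry);
            }
            // valueOf throws IllegalArgumentException for an unknown type name.
            mapping.put(parts[0].trim(), DvType.valueOf(parts[1].trim()));
        }
        return mapping;
    }

    public static void main(String[] args) {
        Map<String, DvType> mapping = parse("price=NUMERIC_DOUBLE, tags=SORTED_SET");
        System.out.println(mapping);
    }
}
```

Failing fast on a malformed or unknown entry keeps the tool from
silently treating a multi-valued field as single-valued, which the
index itself cannot tell us.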

> In the meantime, maybe you could model your tool after
> UninvertingReader?  It faces the same issue (lack of schema) and lets
> the user specify the type.

Yes, that is what we're doing. Unfortunately we cannot use the
UninvertingReader directly due to its restrictions on facet structure
size: we have too many references in our shards, so it hits an internal
16M(?) limit.

Unfortunately our current mapping code from stored multi-valued String to
DocValues seems to be very slow: it took nearly 2 days to convert a
single-segment 900GB index, whereas a standard optimize takes only 8 hours.

> Also, see (the confusingly named) TestDemoParallelLeafReader?  It lets
> you partially reindex, e.g. derive new indexed fields or DV fields,
> etc., from existing stored/DV fields, in an NRT manner.

Thanks for the pointer. As far as I can see, the demo is very explicit
about the type of DocValues being long, so no auto-guessing there. It's
a very interesting idea though, with seamless DV-enabling.

Thank you,
Toke Eskildsen, State and University Library, Denmark
