Michael McCandless created LUCENE-7989:
------------------------------------------
Summary: Add computed (at segment flush) doc values fields
Key: LUCENE-7989
URL: https://issues.apache.org/jira/browse/LUCENE-7989
Project: Lucene - Core
Issue Type: Improvement
Reporter: Michael McCandless
This is a failed experiment but I thought I'd open an issue and post the patch
in case it inspires others.
It adds a new feature to Lucene that lets you provide a function (set via
{{IndexWriterConfig}}) that is invoked at segment flush time to create a new
doc values field as a function of all other doc values fields in that segment.
The newly created field is "first class", i.e. it behaves as if you had indexed
actual doc values fields on your documents, so it can participate in index sort,
etc. The interesting thing is that the function has access to all other documents
that made it into the flushed segment (by pulling doc values iterators for that
segment).
Anyway, I got the feature working, and it's a surprisingly small core code
change. But I had a very specific use case in mind: to "coalesce" documents by
their families while sorting them by another field. I realized that even
though the feature works, I cannot use it for this particular use case,
since the coalescing would break during merge (it's not just a simple "merge
sort"). The test case I added, which simulates my use case, fails on those
seeds / test multipliers that trigger merging of the random index.
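
To illustrate why a merge can undo the coalescing, here is a minimal sketch in
plain Java (no Lucene classes). It is not the patch. It assumes, purely for
illustration, that the computed field is the minimum of a value field across
each family within the flushed segment, and that the segment is sorted by that
key. The names CoalesceSketch, Doc, familyMin, flush, merge and isCoalesced are
all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class CoalesceSketch {
    // Hypothetical document: a family id, a sortable value, and the computed key.
    public static final class Doc {
        public final String family;
        public final long value;
        public final long familyMin; // computed at "flush", per segment only
        public Doc(String family, long value, long familyMin) {
            this.family = family;
            this.value = value;
            this.familyMin = familyMin;
        }
    }

    // Assumed sort: computed key first, then family, then the raw value.
    public static final Comparator<Doc> SORT =
        Comparator.comparingLong((Doc d) -> d.familyMin)
            .thenComparing((Doc d) -> d.family)
            .thenComparingLong((Doc d) -> d.value);

    public static Doc doc(String family, long value) {
        return new Doc(family, value, 0L);
    }

    // "Flush": compute familyMin from the docs in this segment only, then sort.
    public static List<Doc> flush(List<Doc> raw) {
        Map<String, Long> min = new HashMap<>();
        for (Doc d : raw) {
            min.merge(d.family, d.value, Math::min);
        }
        List<Doc> out = new ArrayList<>();
        for (Doc d : raw) {
            out.add(new Doc(d.family, d.value, min.get(d.family)));
        }
        out.sort(SORT);
        return out;
    }

    // "Merge": a plain merge sort of two sorted segments that reuses the
    // stored per-segment keys instead of recomputing them.
    public static List<Doc> merge(List<Doc> a, List<Doc> b) {
        List<Doc> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.size() && j < b.size()) {
            out.add(SORT.compare(a.get(i), b.get(j)) <= 0 ? a.get(i++) : b.get(j++));
        }
        while (i < a.size()) out.add(a.get(i++));
        while (j < b.size()) out.add(b.get(j++));
        return out;
    }

    // True if every family occupies a single contiguous run of documents.
    public static boolean isCoalesced(List<Doc> docs) {
        Set<String> seen = new HashSet<>();
        String prev = null;
        for (Doc d : docs) {
            if (!d.family.equals(prev)) {
                if (!seen.add(d.family)) return false; // family reappeared later
                prev = d.family;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Doc> seg1 = flush(Arrays.asList(doc("A", 1), doc("B", 3), doc("A", 7)));
        List<Doc> seg2 = flush(Arrays.asList(doc("A", 5), doc("B", 4)));
        System.out.println(isCoalesced(seg1));              // true
        System.out.println(isCoalesced(seg2));              // true
        System.out.println(isCoalesced(merge(seg1, seg2))); // false
    }
}
```

In this sketch family A gets key 1 in the first segment and key 5 in the
second, so the merged order is A, A, B, B, A: each segment is coalesced on its
own, but the merged result is not, because the stored keys were computed from
only one segment's documents.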
I'll post a patch but I don't plan to push this any further!