[
https://issues.apache.org/jira/browse/LUCENE-4627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13558696#comment-13558696
]
Shai Erera commented on LUCENE-4627:
------------------------------------
So, LUCENE-4602 went out without the migration layer. Instead, it provides
FacetsPayloadMigrationReader, which does a one-time migration of payload data to
DocValues.
The DOCS_ONLY case doesn't require migration per se. It is ok for old indexes
to retain the existing drill-down terms' posting lists with positions, as this
information won't be accessed during search. New documents will be added w/
DOCS_ONLY, and thus positions will be removed from old segments, as they are
merged.
The other side of the migration story is migrating how facets are encoded in
the DocValues. E.g. today they are encoded using dgap+vint. This can be solved
in two ways, both by the user:
* Provide a CategoryListDataMigratingReader, which will read the data using the
old IntDecoder and encode using the new IntEncoder. If required, e.g. if we
come up with a better encoder in LUCENE-4609, we can provide such a utility.
* The user can keep the data encoded as-is, and write a special
CategoryListParams which returns a CategoryListIterator, whose setNextReader
loads the appropriate IntDecoder for that reader. Since the reader provides
information about which version of Lucene created it, it should be an easy task.
** Hmmm ... but as soon as old and new segments are merged, the data in the
DocValues will be mixed, so that's a viable solution only as long as segments
aren't merged.
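To make the first option concrete, here is a minimal standalone sketch of what "decode with the old decoder, re-encode with the new encoder" could look like. It does not use the Lucene facet classes: DGapVIntSketch, encodeFixed and migrate are hypothetical names, the low-order-first vint byte layout is an assumption rather than the layout the facet module's encoders necessarily use, and the fixed-width "new" encoding is only a stand-in for whatever encoder might replace dgap+vint.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.function.Function;

// Hypothetical illustration only; not part of Lucene.
final class DGapVIntSketch {

    /** dgap+vint: store the gap to the previous ordinal, 7 bits per byte. */
    public static byte[] encode(int[] sortedOrdinals) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int prev = 0;
        for (int ord : sortedOrdinals) {
            int gap = ord - prev; // non-negative because the input is sorted
            prev = ord;
            while ((gap & ~0x7F) != 0) {
                out.write((gap & 0x7F) | 0x80); // high bit set: more bytes follow
                gap >>>= 7;
            }
            out.write(gap);
        }
        return out.toByteArray();
    }

    /** Reverses encode(): reads vints and sums the gaps back into ordinals. */
    public static int[] decode(byte[] bytes) {
        int[] result = new int[bytes.length]; // at most one value per byte
        int count = 0, prev = 0, pos = 0;
        while (pos < bytes.length) {
            int gap = 0, shift = 0;
            byte b;
            do {
                b = bytes[pos++];
                gap |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            prev += gap;
            result[count++] = prev;
        }
        return Arrays.copyOf(result, count);
    }

    /** Stand-in for a "new" encoding: four bytes per ordinal, no gaps. */
    public static byte[] encodeFixed(int[] ordinals) {
        ByteBuffer buf = ByteBuffer.allocate(4 * ordinals.length);
        for (int ord : ordinals) {
            buf.putInt(ord);
        }
        return buf.array();
    }

    /** The migration step: old decoder in, new encoder out. */
    public static byte[] migrate(byte[] oldBytes,
                                 Function<byte[], int[]> oldDecoder,
                                 Function<int[], byte[]> newEncoder) {
        return newEncoder.apply(oldDecoder.apply(oldBytes));
    }

    public static void main(String[] args) {
        int[] ordinals = {3, 10, 10, 300, 70000};
        byte[] old = encode(ordinals);
        byte[] migrated = migrate(old, DGapVIntSketch::decode, DGapVIntSketch::encodeFixed);
        System.out.println(Arrays.toString(decode(old)));
        System.out.println(old.length + " bytes -> " + migrated.length + " bytes");
    }
}
```

The second option would keep the stored bytes untouched and pick the decode function per segment instead, which is exactly why it stops working once a merge writes old and new values into the same segment.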
I think that if we introduce another index break for facets, we should
revisit this. For now, I don't think much needs to be done here.
> Migration layer for facets
> --------------------------
>
> Key: LUCENE-4627
> URL: https://issues.apache.org/jira/browse/LUCENE-4627
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/facet
> Reporter: Shai Erera
>
> Spin-off from LUCENE-4602 (and LUCENE-4623). It will be good if we can develop
> some migration layer so that users don't need to re-index their content when
> we change how facets are written in the index. Currently the two open issues
> are cut over to DV and index drill-down terms w/ DOCS_ONLY, but in the future
> there could be other changes.
> I don't think that this layer needs to be very heavy. Something in the form
> of a FacetsAtomicReaderWrapper. For instance, to support the DV migration, we
> can implement a PayloadFacetsAtomicReader which translates the payload to DV
> API (i.e. its docValues() API will actually read from the payload).
> We'd need some API on IW I think to initialize that reader, so that data can
> be migrated from payload to DV during segment merges.