[ https://issues.apache.org/jira/browse/LUCENE-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530667#comment-13530667 ]
Shai Erera commented on LUCENE-4602:
------------------------------------
Awesome!
I think we should then focus on making the cut to DV!
First though, we should have a migration plan. I would prefer that we not
force all our existing customers to do a one-time index upgrade; I'm not at all
sure people will be thrilled with the idea. Just to clarify, our (i.e. my)
customers are not the ones that hold the indexes. They are the ones that develop
the products and will need to face the *real* end customers and tell them that
they need to upgrade their potentially uber-large indexes ...
So what I was thinking is: could the CLI abstraction layer be used to fetch
the ordinals from either payloads or DV, based on the segment's version? And
could we also migrate segments gradually, e.g. as they are merged? Would we need
a FacetsAtomicReader for that, which initializes the CLI per the segment version?
Is it even possible to move payload data to DV during segment merging? I.e.,
even in the one-time upgrade...
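
To make the per-segment idea concrete, here is a minimal sketch of the
dispatch only. Everything in it is an assumption for illustration:
SegmentOrdinals, Segment, OrdinalsSource, and FIRST_DV_VERSION are
hypothetical names, not Lucene classes, and the "segment" is a toy in-memory
structure rather than a real reader. It does not address whether payload data
can actually be rewritten into DV at merge time.

```java
import java.util.Arrays;

// Sketch only: every type here is a hypothetical stand-in, not a Lucene class.
class SegmentOrdinals {

    /** Per-document facet ordinals, wherever they happen to be stored. */
    interface OrdinalsSource {
        int[] ordinals(int docId);
        String storage();
    }

    /** Toy segment: a format version plus in-memory ordinals, so the sketch runs without an index. */
    static final class Segment {
        final int formatVersion;
        final int[][] ordsByDoc;

        Segment(int formatVersion, int[][] ordsByDoc) {
            this.formatVersion = formatVersion;
            this.ordsByDoc = ordsByDoc;
        }
    }

    /** Assumed cutover point: segments written at or after this version keep ordinals in DV. */
    static final int FIRST_DV_VERSION = 2;

    /** Chooses the source once per segment, based only on the segment's version. */
    static OrdinalsSource forSegment(Segment seg) {
        final String storage = seg.formatVersion >= FIRST_DV_VERSION ? "docvalues" : "payload";
        return new OrdinalsSource() {
            @Override public int[] ordinals(int docId) {
                // Real code would decode the byte[] from a payload or from a DV field here.
                return seg.ordsByDoc[docId];
            }
            @Override public String storage() {
                return storage;
            }
        };
    }

    public static void main(String[] args) {
        Segment old = new Segment(1, new int[][] {{3, 7}});          // written before the cutover
        Segment merged = new Segment(2, new int[][] {{3, 7}, {5}});  // rewritten by a later merge
        for (Segment seg : new Segment[] {old, merged}) {
            OrdinalsSource src = forSegment(seg);
            System.out.println(src.storage() + " " + Arrays.toString(src.ordinals(0)));
        }
    }
}
```

The point of the sketch is that callers only ever see OrdinalsSource, so an
index could hold old and new segments side by side while merges gradually
replace the old ones.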
Please don't mention re-indexing :).
> Use DocValues to store per-doc facet ord
> ----------------------------------------
>
> Key: LUCENE-4602
> URL: https://issues.apache.org/jira/browse/LUCENE-4602
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Michael McCandless
> Attachments: LUCENE-4602.patch, LUCENE-4602.patch
>
>
> Spinoff from LUCENE-4600
> DocValues can be used to hold the byte[] encoding all facet ords for
> the document, instead of payloads. I made a hacked up approximation
> of in-RAM DV (see CachedCountingFacetsCollector in the patch) and the
> gains were somewhat surprisingly large:
> {noformat}
>     Task    QPS base  StdDev    QPS comp  StdDev    Pct diff
> HighTerm        0.53  (0.9%)        1.00  (2.5%)    87.3% (  83% -  91%)
>  LowTerm        7.59  (0.6%)       26.75 (12.9%)   252.6% ( 237% - 267%)
>  MedTerm        3.35  (0.7%)       12.71  (9.0%)   279.8% ( 268% - 291%)
> {noformat}
> I didn't think payloads were THAT slow; I think it must be the advance
> implementation?
> We need to separately test on-disk DV to make sure it's at least
> on-par with payloads (but hopefully faster) and if so ... we should
> cutover facets to using DV.