[ 
https://issues.apache.org/jira/browse/LUCENE-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530923#comment-13530923
 ] 

Shai Erera commented on LUCENE-4602:
------------------------------------

Who said anything about blocking progress? All I'm saying is that before we 
release these improvements, we need to have a migration plan. This is not 
just about my customers. Judging from the questions that pop up here and there 
on the list, other people use this module too.

Sure, we can tell everyone to re-index, but that's not how I prefer to work. I 
don't think that cutting over to DV is the only migration we should talk about. 
E.g. LUCENE-4623 would also require migration, as would any future change to 
how we decide to store/encode facets.

It would be good if we could think about a layer that provides that 
migration. Today we have Codecs, and Lucene guarantees that old segments will be 
read with old Codec versions (per our back-compat policy). What I would like to 
develop is something similar, which can read facets from old segments in the 
old way and ultimately, when segments are merged, migrate the data to the new 
format. Then we can tell customers that if they didn't migrate their indexes 
when Lucene 6.0 is released, they have to addIndexes or forceMerge or something.
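
To make the idea concrete, here is a minimal, self-contained sketch of such a 
layer. It is not Lucene's API: every name in it (FacetMigrationSketch, 
OrdinalsFormat, Segment, readOrds, merge) is hypothetical, and the two byte 
layouts are invented for illustration only. Each segment remembers the format 
version it was written with, reads go through the matching decoder, and a 
merge re-encodes everything in the current format.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class FacetMigrationSketch {

    /** Hypothetical per-version encoding of one document's facet ordinals. */
    interface OrdinalsFormat {
        int[] decode(byte[] stored);
        byte[] encode(int[] ords);
    }

    /** Invented "old" layout: every ordinal as a fixed 4-byte big-endian int. */
    static final OrdinalsFormat V1 = new OrdinalsFormat() {
        public int[] decode(byte[] stored) {
            ByteBuffer buf = ByteBuffer.wrap(stored);
            int[] ords = new int[stored.length / 4];
            for (int i = 0; i < ords.length; i++) ords[i] = buf.getInt();
            return ords;
        }
        public byte[] encode(int[] ords) {
            ByteBuffer buf = ByteBuffer.allocate(ords.length * 4);
            for (int ord : ords) buf.putInt(ord);
            return buf.array();
        }
    };

    /** Invented "new" layout: sorted ordinals, delta-coded, 7 bits per byte. */
    static final OrdinalsFormat V2 = new OrdinalsFormat() {
        public int[] decode(byte[] stored) {
            List<Integer> out = new ArrayList<>();
            int value = 0, shift = 0, prev = 0;
            for (byte b : stored) {
                value |= (b & 0x7F) << shift;
                if ((b & 0x80) != 0) {
                    shift += 7;            // high bit set: more bytes follow
                } else {
                    prev += value;         // undo the delta coding
                    out.add(prev);
                    value = 0;
                    shift = 0;
                }
            }
            return out.stream().mapToInt(Integer::intValue).toArray();
        }
        public byte[] encode(int[] ords) {
            int[] sorted = ords.clone();
            Arrays.sort(sorted);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            int prev = 0;
            for (int ord : sorted) {
                int delta = ord - prev;
                while ((delta & ~0x7F) != 0) {
                    out.write((delta & 0x7F) | 0x80);
                    delta >>>= 7;
                }
                out.write(delta);
                prev = ord;
            }
            return out.toByteArray();
        }
    };

    static final int CURRENT_VERSION = 2;
    static final Map<Integer, OrdinalsFormat> FORMATS = Map.of(1, V1, 2, V2);

    /** A segment records the format version its facet data was written with. */
    static final class Segment {
        final int formatVersion;
        final List<byte[]> docs;
        Segment(int formatVersion, List<byte[]> docs) {
            this.formatVersion = formatVersion;
            this.docs = docs;
        }
    }

    /** Old segments are read "the old way", new ones the new way. */
    static int[] readOrds(Segment seg, int doc) {
        return FORMATS.get(seg.formatVersion).decode(seg.docs.get(doc));
    }

    /** Merging decodes with each source format and rewrites in the current one. */
    static Segment merge(List<Segment> segments) {
        OrdinalsFormat current = FORMATS.get(CURRENT_VERSION);
        List<byte[]> merged = new ArrayList<>();
        for (Segment seg : segments) {
            for (int doc = 0; doc < seg.docs.size(); doc++) {
                merged.add(current.encode(readOrds(seg, doc)));
            }
        }
        return new Segment(CURRENT_VERSION, merged);
    }

    public static void main(String[] args) {
        Segment oldSeg = new Segment(1, List.of(V1.encode(new int[] {3, 17, 300})));
        Segment newSeg = new Segment(2, List.of(V2.encode(new int[] {5, 9})));
        Segment merged = merge(List.of(oldSeg, newSeg));
        System.out.println("merged version: " + merged.formatVersion);
        System.out.println("doc 0 ords: " + Arrays.toString(readOrds(merged, 0)));
    }
}
```

With a design like this, an index that was never explicitly migrated stays 
readable, and a forceMerge (or addIndexes into a fresh index) is enough to 
bring everything to the current format.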

I know that this module has the @lucene.experimental tag on it all over the 
place, but I don't treat it as experimental at all. I would prefer that you 
help me develop this migration layer, even if just by contributing ideas, 
rather than tell me that it's my problem and that I get paid to solve it :).

I don't think that we should release code that breaks all apps out there and 
forces them to reindex, unless the changes are really non-migratable. But in 
this case, I think migration should be easy? If you want to chime in, I'll open a 
separate issue to discuss this.
                
> Use DocValues to store per-doc facet ord
> ----------------------------------------
>
>                 Key: LUCENE-4602
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4602
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>         Attachments: LUCENE-4602.patch, LUCENE-4602.patch
>
>
> Spinoff from LUCENE-4600
> DocValues can be used to hold the byte[] encoding all facet ords for
> the document, instead of payloads.  I made a hacked up approximation
> of in-RAM DV (see CachedCountingFacetsCollector in the patch) and the
> gains were somewhat surprisingly large:
> {noformat}
>                     Task    QPS base      StdDev    QPS comp      StdDev                Pct diff
>                 HighTerm        0.53      (0.9%)        1.00      (2.5%)   87.3% (  83% -   91%)
>                  LowTerm        7.59      (0.6%)       26.75     (12.9%)  252.6% ( 237% -  267%)
>                  MedTerm        3.35      (0.7%)       12.71      (9.0%)  279.8% ( 268% -  291%)
> {noformat}
> I didn't think payloads were THAT slow; I think it must be the advance
> implementation?
> We need to separately test on-disk DV to make sure it's at least
> on-par with payloads (but hopefully faster) and if so ... we should
> cutover facets to using DV.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
