[ 
https://issues.apache.org/jira/browse/LUCENE-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13713685#comment-13713685
 ] 

Robert Muir commented on LUCENE-5122:
-------------------------------------

I think the current setup is just the way it is because it was easy: originally 
this codec only worked with two primitives of numeric[] and byte[] and built 
support for Sorted on top of combinations of these.

It's worth a benchmark to see what the overhead really is in practice; I'll look 
at it.

I also suspect we aren't doing the best thing for SortedSet (addressing into a 
large packedints stream). You guys dug into this for Lucene's faceting before, 
and I think the result of that was that delta-encoded vbyte lists per document 
were the fastest... so I've been wanting to try to run some benchmarks here with 
that in mind too...
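
For illustration only, here is a minimal sketch of what "delta-encoded vbyte
lists per document" could look like: each document's ascending ordinals are
stored as gaps from the previous ordinal, and each gap is written 7 bits at a
time with a continuation bit. The class and method names are hypothetical, and
this is not the code used by Lucene's faceting module or by any codec.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical sketch of a per-document delta + vByte ordinal list. */
final class DeltaVByteSketch {

  /** Encodes ascending, non-negative ordinals as vByte-coded gaps. */
  static byte[] encode(long[] sortedOrds) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    long previous = 0;
    for (long ord : sortedOrds) {
      long delta = ord - previous; // non-negative because the input is ascending
      while ((delta & ~0x7FL) != 0) {
        // low 7 bits plus a continuation bit: more bytes follow
        out.write((int) ((delta & 0x7F) | 0x80));
        delta >>>= 7;
      }
      out.write((int) delta); // final byte has the high bit clear
      previous = ord;
    }
    return out.toByteArray();
  }

  /** Decodes {@code count} ordinals written by {@link #encode}. */
  static long[] decode(byte[] bytes, int count) {
    long[] ords = new long[count];
    int pos = 0;
    long previous = 0;
    for (int i = 0; i < count; i++) {
      long delta = 0;
      int shift = 0;
      byte b;
      do {
        b = bytes[pos++];
        delta |= (long) (b & 0x7F) << shift;
        shift += 7;
      } while ((b & 0x80) != 0);
      previous += delta; // undo the delta encoding
      ords[i] = previous;
    }
    return ords;
  }
}
```

With this layout a document whose ordinals are close together costs roughly
one byte per value, at the price of having to decode a document's list
sequentially rather than addressing into it.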

                
> DiskDV probably shouldnt use BlockPackedReader for SortedDV doc-to-ord
> ----------------------------------------------------------------------
>
>                 Key: LUCENE-5122
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5122
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Robert Muir
>
> I don't think "blocking" provides any benefit here in general. We can assume 
> the ordinals are essentially random, and since SortedDV is single-valued, it's 
> probably better to just use the simpler packedints directly.
> I guess the only case where it would help is if you sorted your segments by 
> that DV field. But it seems kind of weird/esoteric to sort your index by a 
> deref'ed string value; e.g. I don't think it's even supported by SortingMP.
> For the SortedSet "ord stream", this can exceed 2B values, so for now I think 
> it should stay as BlockPackedReader, but it could use a large block size...
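
The argument in the description above can be made concrete with a rough size
estimate. The sketch below compares packing every value at the width of the
global maximum against packing each block as offsets from that block's own
minimum. It is a simplified model with hypothetical names: it assumes
non-negative values, ignores per-block headers, and does not reproduce the
actual BlockPackedReader format.

```java
/** Hypothetical size model for direct vs. per-block packing. */
final class BlockingSketch {

  /** Bits needed to represent any value in [0, maxValue], at least 1. */
  static int bitsRequired(long maxValue) {
    return Math.max(1, 64 - Long.numberOfLeadingZeros(maxValue));
  }

  /** Total bits when every value uses the width of the global maximum. */
  static long directBits(long[] values) {
    long max = 0;
    for (long v : values) {
      max = Math.max(max, v);
    }
    return (long) values.length * bitsRequired(max);
  }

  /** Total bits when each block stores offsets from its own minimum. */
  static long blockedBits(long[] values, int blockSize) {
    long total = 0;
    for (int start = 0; start < values.length; start += blockSize) {
      int end = Math.min(values.length, start + blockSize);
      long min = Long.MAX_VALUE;
      long max = Long.MIN_VALUE;
      for (int i = start; i < end; i++) {
        min = Math.min(min, values[i]);
        max = Math.max(max, values[i]);
      }
      // width is driven by the spread inside this block only
      total += (long) (end - start) * bitsRequired(max - min);
    }
    return total;
  }
}
```

In this model, ordinals that arrive in sorted order have a small spread inside
each block, so blocking saves bits. Ordinals in effectively random order span
nearly the full range in every block, so blocking saves little and only adds
an extra level of indirection on reads.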
