[
https://issues.apache.org/jira/browse/LUCENE-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13713685#comment-13713685
]
Robert Muir commented on LUCENE-5122:
-------------------------------------
I think the current setup is the way it is simply because it was easy: originally
this codec worked with only two primitives, numeric[] and byte[], and built
support for Sorted on top of combinations of these.
It's worth a benchmark to see what the overhead really is in practice; I'll look
at it.
I also suspect we aren't doing the best thing for SortedSet (addressing into a
large packedints stream). You guys dug into this for Lucene's faceting before,
and I think the result was that delta-encoded vbyte lists per document were
the fastest, so I've been wanting to run some benchmarks here with that in
mind too.
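To make the "delta-encoded vbyte list per document" idea concrete, here is a
minimal sketch, not the faceting module's or any codec's actual code. The class
and method names (DeltaVByteOrds, encode, decode) are hypothetical, and it
assumes each document's ordinals are non-negative and already sorted ascending,
so every gap fits the usual 7-bits-per-byte variable-length encoding.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not a Lucene class.
final class DeltaVByteOrds {

    // Writes one non-negative value using 7 payload bits per byte;
    // the high bit of a byte is set when more bytes follow.
    static void writeVLong(ByteArrayOutputStream out, long v) {
        while ((v & ~0x7FL) != 0) {
            out.write((int) ((v & 0x7F) | 0x80));
            v >>>= 7;
        }
        out.write((int) v);
    }

    // Encodes one document's ordinals (assumed sorted ascending) as gaps
    // from the previous ordinal, each gap written as a vbyte value.
    static byte[] encode(long[] sortedOrds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long prev = 0;
        for (long ord : sortedOrds) {
            writeVLong(out, ord - prev);
            prev = ord;
        }
        return out.toByteArray();
    }

    // Decodes the whole list sequentially by summing the gaps.
    static long[] decode(byte[] bytes) {
        List<Long> ords = new ArrayList<>();
        int pos = 0;
        long prev = 0;
        while (pos < bytes.length) {
            long delta = 0;
            int shift = 0;
            byte b;
            do {
                b = bytes[pos++];
                delta |= ((long) (b & 0x7F)) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            prev += delta;
            ords.add(prev);
        }
        long[] result = new long[ords.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = ords.get(i);
        }
        return result;
    }
}
```

The trade-off in this sketch is that a document's ordinals can only be read
front to back, which matches how a per-document ord list is normally consumed;
whether it actually beats addressing into a packed stream is exactly what the
benchmark would have to show.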
> DiskDV probably shouldnt use BlockPackedReader for SortedDV doc-to-ord
> ----------------------------------------------------------------------
>
> Key: LUCENE-5122
> URL: https://issues.apache.org/jira/browse/LUCENE-5122
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Robert Muir
>
> I don't think "blocking" provides any benefit here in general. We can assume
> the ordinals are essentially random, and since SortedDV is single-valued, it's
> probably better to just use the simpler packedints directly?
> I guess the only case where it would help is if you sorted your segments by
> that DV field. But it seems kind of weird/esoteric to sort your index by a
> deref'ed string value; e.g., I don't think it's even supported by SortingMP.
> For the SortedSet "ord stream", this can exceed 2B values, so for now I think
> it should stay as blockpackedreader. But it could use a large block size...
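The argument in the quoted description can be sketched with a rough size
estimate. This is an illustration under stated assumptions, not Lucene's
PackedInts or BlockPackedReader code: the class and method names are
hypothetical, ordinals are assumed non-negative, and the blocked estimate
ignores per-block header cost and models each block as storing values relative
to its own minimum.

```java
// Hypothetical illustration only; not a Lucene class.
final class PackedBitsEstimate {

    // Bits needed to represent any value in [0, maxValue], at least 1.
    static int bitsRequired(long maxValue) {
        return Math.max(1, 64 - Long.numberOfLeadingZeros(maxValue));
    }

    // Plain packed ints: one fixed bit width for the whole array.
    static long directBits(long[] ords) {
        long max = 0;
        for (long o : ords) {
            max = Math.max(max, o);
        }
        return (long) ords.length * bitsRequired(max);
    }

    // Blocked: each block gets its own width, sized to (max - min) in that block.
    static long blockedBits(long[] ords, int blockSize) {
        long total = 0;
        for (int start = 0; start < ords.length; start += blockSize) {
            int end = Math.min(ords.length, start + blockSize);
            long min = Long.MAX_VALUE;
            long max = Long.MIN_VALUE;
            for (int i = start; i < end; i++) {
                min = Math.min(min, ords[i]);
                max = Math.max(max, ords[i]);
            }
            total += (long) (end - start) * bitsRequired(max - min);
        }
        return total;
    }
}
```

When ordinals within a block span nearly the full ordinal range, as they would
if they are essentially random, the per-block width ends up the same as the
global width and blocking saves nothing. Only when doc order correlates with
the field, for example an index sorted by it, do the per-block ranges shrink.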