[
https://issues.apache.org/jira/browse/LUCENE-5722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14015457#comment-14015457
]
Robert Muir commented on LUCENE-5722:
-------------------------------------
+1 to the current patch; this is just more speed on top of the previous
improvements today.
With my sort test, it's an additional 7-10% (on top of the previous commit,
which was similar). With a microbenchmark of NumericDocValues the improvement
is much more substantial (it seems to be about 25%).
To continue further, after this one is committed I want to exploit
this slice API for packed ints: instead of clone()'ing the whole file in DV we
just slice() what we need, remove the offset adjustments in the packed ints
decoder, and actually get more safety (a read past EOF if you screw up, instead
of reading into another field's packed ints or whatever).
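A minimal sketch of the safety argument, using plain java.nio rather than Lucene's IndexInput. The class FieldSliceDemo, its methods, and the one-byte-per-value layout are hypothetical and only illustrate the idea: with a slice per field, index 0 is the field's first byte, so the decoder needs no offset arithmetic, and an out-of-range index fails instead of returning a neighbouring field's data.

```java
import java.nio.ByteBuffer;

/** Hypothetical illustration; not Lucene code. */
final class FieldSliceDemo {

    /** Returns a view covering only [start, start + length) of the file. */
    static ByteBuffer sliceField(ByteBuffer file, int start, int length) {
        ByteBuffer dup = file.duplicate(); // independent position/limit, shared bytes
        dup.limit(start + length);
        dup.position(start);
        return dup.slice();                // index 0 of the slice == `start` in file
    }

    /** Clone-style read: the caller must add the field's start offset itself. */
    static int readWithOffset(ByteBuffer file, int fieldStart, int index) {
        // Nothing stops `index` from running into the next field's bytes.
        return file.get(fieldStart + index) & 0xFF;
    }

    /** Slice-style read: no offset adjustment, bounds enforced by the buffer. */
    static int readFromSlice(ByteBuffer field, int index) {
        // Throws IndexOutOfBoundsException if index is outside the field.
        return field.get(index) & 0xFF;
    }
}
```

With two three-byte fields stored back to back, readWithOffset(file, 0, 3) quietly returns the first byte of the second field, while readFromSlice on the first field's slice throws for the same index.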
In parallel I will begin work on backporting the slice() API to 4.x; it has
baked for a while and I think it is good to go. I'll start on this now.
> Speed up MMapDirectory.seek()
> -----------------------------
>
> Key: LUCENE-5722
> URL: https://issues.apache.org/jira/browse/LUCENE-5722
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/store
> Reporter: Robert Muir
> Fix For: 4.9, 5.0
>
> Attachments: LUCENE-5722-multiseek.patch, LUCENE-5722.patch,
> LUCENE-5722.patch, LUCENE-5722.patch, LUCENE-5722.patch, LUCENE-5722.patch
>
>
> For traditional Lucene access, which is mostly sequential with an occasional
> advance(), I think this method gets drowned out in the noise.
> But for access like docvalues, it's important. Unfortunately seek() is complex
> today because of mapping multiple buffers.
> However, the very common case is that only one map is used for a given clone
> or slice.
> When there is the possibility to use only a single mapped buffer, we should
> instead take advantage of ByteBuffer.slice(), which will adjust the internal
> mmap address and remove the offset calculation. Furthermore, we don't need the
> shift/mask or even the negative check, as they are then all handled by the
> ByteBuffer API: seek is a one-liner (with try/catch, of course, to convert
> exceptions).
> This makes docvalues access 20% faster; I haven't tested conjunctions or
> anything like that.
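For readers who want to see the shape of the single-buffer case described in the issue, here is a rough sketch. It is not the patch itself: SingleBufferInput and its constructor are hypothetical names, and a heap ByteBuffer stands in for the mapped one. It assumes the whole clone or slice fits in one buffer, so the position always fits in an int.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

/** Hypothetical single-buffer input; a sketch, not Lucene's class. */
final class SingleBufferInput {
    private final ByteBuffer buf;

    SingleBufferInput(ByteBuffer source, int offset, int length) {
        ByteBuffer dup = source.duplicate();
        dup.limit(offset + length);
        dup.position(offset);
        // After slice(), position 0 is `offset` in the source, so seek()
        // needs no offset calculation of its own.
        this.buf = dup.slice();
    }

    /** The "one-liner": negative and too-large positions are rejected by ByteBuffer. */
    void seek(long pos) throws IOException {
        try {
            // Assumes pos fits in an int (single buffer); a larger long would wrap.
            buf.position((int) pos);
        } catch (IllegalArgumentException e) {
            throw new EOFException("seek out of bounds: " + pos);
        }
    }

    byte readByte() throws IOException {
        try {
            return buf.get();
        } catch (BufferUnderflowException e) {
            throw new EOFException("read past EOF");
        }
    }
}
```

The multi-buffer path would still need the shift/mask to pick a buffer; the sketch only covers the common case where one mapping is enough.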