[
https://issues.apache.org/jira/browse/LUCENE-5722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Uwe Schindler updated LUCENE-5722:
----------------------------------
Attachment: LUCENE-5722.patch
After digging further, I found that the bug in LUCENE-5658 is not related to
this issue. The problem was a bug in the offset calculation of the single-buffer
special case. The test created a slice from a two-buffer input that started at
the buffer boundary. The result was a slice with one buffer, so the
optimization applied. But Robert's patch did not apply the
chunkSizeMask, so the offset was still the one from the two-buffer case.
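
To illustrate the masking step, here is a minimal sketch. It is not the code
from the patch: the class SliceOffsetSketch and the method
sliceWithinOneBuffer are hypothetical names, and the layout (equally sized
chunks of 2^chunkSizePower bytes) is an assumption. Only standard
java.nio.ByteBuffer calls are used.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; this is not Lucene's buildSlice.
final class SliceOffsetSketch {

    // Returns a view of `length` bytes starting at absolute `offset`,
    // assuming the whole range lies inside a single chunk.
    static ByteBuffer sliceWithinOneBuffer(ByteBuffer[] buffers, int chunkSizePower,
                                           long offset, int length) {
        long chunkSizeMask = (1L << chunkSizePower) - 1L;

        // Which chunk the slice starts in.
        int bufferIndex = (int) (offset >>> chunkSizePower);

        // Offset relative to the start of that chunk. Skipping this mask
        // leaves the absolute, multi-buffer offset in place.
        int offsetInBuffer = (int) (offset & chunkSizeMask);

        ByteBuffer dup = buffers[bufferIndex].duplicate();
        if (length < 0 || offsetInBuffer + (long) length > dup.capacity()) {
            throw new IllegalArgumentException("range does not fit in one buffer");
        }
        dup.position(offsetInBuffer);
        dup.limit(offsetInBuffer + length);

        // slice() rebases the view so that position 0 is the slice start.
        return dup.slice();
    }
}
```

With two 16-byte chunks (chunkSizePower = 4), a slice starting at offset 16
sits exactly at the buffer boundary: the masked offset is 0 in buffer 1,
whereas the unmasked value 16 would point past the data of that buffer.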
The attached patch also contains the test for LUCENE-5658, which passes, of
course. I also added a test for the special case Robert has seen.
This patch still uses the extra 0-byte buffer at the end. We could improve this
by handling that case and using only a single-buffer IndexInput, but the added
complexity in buildSlice is not worth it.
I think we can test performance now.
In addition, there may be another improvement for the default impl's seek. But
we should check this separately. I will upload a separate patch tomorrow.
> Speed up MMapDirectory.seek()
> -----------------------------
>
> Key: LUCENE-5722
> URL: https://issues.apache.org/jira/browse/LUCENE-5722
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Robert Muir
> Attachments: LUCENE-5722.patch, LUCENE-5722.patch, LUCENE-5722.patch,
> LUCENE-5722.patch
>
>
> For traditional Lucene access, which is mostly sequential with an occasional
> advance(), I think this method gets drowned out in noise.
> But for access like docvalues, it's important. Unfortunately, seek() is complex
> today because of mapping multiple buffers.
> However, the very common case is that only one map is used for a given clone
> or slice.
> When there is the possibility to use only a single mapped buffer, we should
> instead take advantage of ByteBuffer.slice(), which will adjust the internal
> mmap address and remove the offset calculation. Furthermore, we don't need the
> shift/mask or even the negative check, as they are then all handled by the
> ByteBuffer API: seek is a one-liner (with try/catch of course to convert
> exceptions).
> This makes docvalues access 20% faster; I haven't tested conjunctions or
> anything like that.
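
The single-buffer seek described in the quoted text can be sketched roughly as
follows. This is an illustration under assumptions, not the patch itself: the
class SingleBufferInputSketch is a hypothetical name, a heap or mapped
ByteBuffer is passed in by the caller, and mapping every failure to
EOFException (plus the use of Math.toIntExact to reject positions beyond the
int range) is my own simplification.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

// Hypothetical illustration only; this is not Lucene's implementation.
final class SingleBufferInputSketch {

    // A slice(): position 0 is the start of this input, limit is its length.
    private final ByteBuffer buffer;

    SingleBufferInputSketch(ByteBuffer source, int offset, int length) {
        ByteBuffer dup = source.duplicate();
        dup.position(offset);
        dup.limit(offset + length);
        this.buffer = dup.slice();
    }

    // No shift, mask, or explicit negative check: ByteBuffer.position()
    // rejects negative values and values past the limit by itself.
    void seek(long pos) throws IOException {
        try {
            buffer.position(Math.toIntExact(pos));
        } catch (IllegalArgumentException | ArithmeticException e) {
            throw new EOFException("seek outside of input: pos=" + pos);
        }
    }

    byte readByte() throws IOException {
        try {
            return buffer.get();
        } catch (BufferUnderflowException e) {
            throw new EOFException("read past EOF");
        }
    }
}
```

Because the slice already starts at the right address, a seek to position p
inside the input is just buffer.position(p); the bounds checks come from the
ByteBuffer API and are converted to an IOException in the catch block.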