[ https://issues.apache.org/jira/browse/LUCENE-5703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Adrien Grand updated LUCENE-5703:
---------------------------------
Attachment: LUCENE-5703.patch
Here is a patch that switches BinaryDocValues to the discussed API, and does
the same for Sorted(Set)DocValues.lookupOrd for consistency.
- The default codec as well as memory, direct and disk no longer allocate the
byte[] in BinaryDocValues.get.
- The default codec takes advantage of the maximum length of binary terms,
which is exposed in the metadata, so that it never has to resize the BytesRef
that stores the term.
- Old codecs (lucene40, lucene42) have moved to the new API but still allocate
the byte[] on the fly.
- Fixed grouping and comparators to not assume they own the bytes.
- Removed the two tests from BaseDocValuesFormatTestCase that ensured that
each return value had its own bytes.
Tests pass (I have already run the whole suite 6 times) and I'll run benchmarks
soon to make sure this doesn't introduce a performance regression.
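
To illustrate what "not assuming they own the bytes" means for consumers, here
is a minimal, self-contained sketch. It is not the code from the patch and does
not use Lucene classes: Slice, ReusingBinaryValues and ReuseDemo are
hypothetical stand-ins, and the in-memory packed array only simulates reading
a value from the index. The sketch assumes a producer that sizes one buffer
from the maximum value length and returns the same slice on every call.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical (bytes, offset, length) slice; a stand-in for illustration, not Lucene's BytesRef. */
final class Slice {
    byte[] bytes;
    int offset;
    int length;

    String utf8() {
        return new String(bytes, offset, length, StandardCharsets.UTF_8);
    }

    /** Copies the referenced bytes so that the caller owns them. */
    byte[] toOwnedBytes() {
        return Arrays.copyOfRange(bytes, offset, offset + length);
    }
}

/** Hypothetical producer: values packed in one array, one scratch slice reused across calls. */
final class ReusingBinaryValues {
    private final byte[] data;   // all values, concatenated
    private final int[] starts;  // starts[i]..starts[i + 1] delimits value i
    private final Slice scratch = new Slice();

    ReusingBinaryValues(String... values) {
        starts = new int[values.length + 1];
        byte[][] encoded = new byte[values.length][];
        int maxLength = 0;
        for (int i = 0; i < values.length; i++) {
            encoded[i] = values[i].getBytes(StandardCharsets.UTF_8);
            starts[i + 1] = starts[i] + encoded[i].length;
            maxLength = Math.max(maxLength, encoded[i].length);
        }
        data = new byte[starts[values.length]];
        for (int i = 0; i < values.length; i++) {
            System.arraycopy(encoded[i], 0, data, starts[i], encoded[i].length);
        }
        // Sized once from the maximum length, so get() never has to grow it.
        scratch.bytes = new byte[maxLength];
    }

    /** The returned slice is only valid until the next call to get(). */
    Slice get(int docId) {
        int length = starts[docId + 1] - starts[docId];
        // Stands in for reading the value from the index into the preallocated buffer.
        System.arraycopy(data, starts[docId], scratch.bytes, 0, length);
        scratch.offset = 0;
        scratch.length = length;
        return scratch;
    }
}

final class ReuseDemo {
    public static void main(String[] args) {
        ReusingBinaryValues dv = new ReusingBinaryValues("a", "bcd", "ef");

        Slice aliased = dv.get(0);               // does not own the bytes
        byte[] owned = dv.get(0).toOwnedBytes(); // safe to keep across calls

        dv.get(1); // overwrites the shared scratch buffer

        System.out.println(aliased.utf8());                            // now "bcd"
        System.out.println(new String(owned, StandardCharsets.UTF_8)); // still "a"
    }
}
```

A consumer that only inspects the value before the next call can use the
returned slice directly; one that keeps values around (as grouping or a
comparator might) needs its own copy.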
> Don't allocate/copy bytes all the time in binary DV producers
> -------------------------------------------------------------
>
> Key: LUCENE-5703
> URL: https://issues.apache.org/jira/browse/LUCENE-5703
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Assignee: Adrien Grand
> Fix For: 4.9, 5.0
>
> Attachments: LUCENE-5703.patch
>
>
> Our binary doc values producers keep on creating new {{byte[]}} arrays and
> copying bytes when a value is requested, which likely doesn't help
> performance. This has been done because of the way fieldcache consumers used
> the API, but we should try to fix it in 5.0.