[ https://issues.apache.org/jira/browse/LUCENE-7399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Adrien Grand updated LUCENE-7399:
---------------------------------
Attachment: LUCENE-7399.patch
Here is a new patch. I fixed assertHistogram so that it is called inside an
assertion, and added the suggested docs.
bq. Maybe the visitor should also take BytesRef? Codec impls could read a whole
byte[] values block in at once
I am not sure codecs could leverage this. I think a serious codec impl would
use prefix compression to save space, so it could not read a large byte[] in
one go anyway: at every iteration it would need to concatenate the shared
prefix with the suffix that is specific to the value.
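To illustrate the point about prefix compression, here is a minimal sketch of
what decoding such a block might look like. It is not Lucene code: the class
PrefixBlockSketch, the ValueVisitor interface and the block layout (one common
prefix plus fixed-length suffixes) are assumptions made for this example only.

```java
public class PrefixBlockSketch {

  /** Hypothetical callback, standing in for whatever visitor a codec would call. */
  public interface ValueVisitor {
    void visit(int docID, byte[] packedValue);
  }

  /**
   * Visits every value of a prefix-compressed block. The shared prefix is
   * copied into a scratch buffer once, but the suffix still has to be copied
   * for each value, so the block cannot be handed over as one large byte[].
   */
  public static void visitBlock(
      byte[] commonPrefix, byte[][] suffixes, int[] docIDs, ValueVisitor visitor) {
    // Assumption: all suffixes in a block have the same length.
    int suffixLength = suffixes.length == 0 ? 0 : suffixes[0].length;
    byte[] scratch = new byte[commonPrefix.length + suffixLength];
    System.arraycopy(commonPrefix, 0, scratch, 0, commonPrefix.length);
    for (int i = 0; i < suffixes.length; i++) {
      // Per-value work: rebuild the full value as prefix + suffix.
      System.arraycopy(suffixes[i], 0, scratch, commonPrefix.length, suffixLength);
      // The scratch buffer is reused, so a visitor must copy it to keep it.
      visitor.visit(docIDs[i], scratch);
    }
  }
}
```

With a layout like this, the visitor sees one reassembled value at a time,
which is why a BytesRef pointing into a larger block would not obviously help.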
bq. We could also fix BKDWriter.writeCommonPrefixes to save the copy there,
though that's just once per leaf block.
I remember trying it out and it didn't help.
bq. Have you tweaked 20 to see if that's a good value? Sorting BKD points is
rather costly since when we swap, we swap whole values (docID, maybe ord, then
the byte[] value for this field).
I remember tweaking it a long time ago when I worked on this Sorter
abstraction, and values in [20,50] looked fine when sorting a simple int[]
(where both comparisons and swaps are cheap), so I picked 20 to err on the safe
side. It's true that it might be different with points, which have costly swaps.
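For readers who want to experiment with the threshold, here is a minimal
sketch of a compare/swap based sorter that falls back to insertion sort on
small ranges. It is not the actual Sorter implementation: the class name
ThresholdSorterSketch, the sortInts helper and the simple quicksort
partitioning are assumptions chosen to keep the example short.

```java
public abstract class ThresholdSorterSketch {

  private final int threshold;

  protected ThresholdSorterSketch(int threshold) {
    this.threshold = threshold;
  }

  /** Compares the entries at slots i and j. */
  protected abstract int compare(int i, int j);

  /** Swaps the entries at slots i and j; this is the costly part for points. */
  protected abstract void swap(int i, int j);

  /** Sorts the range [from, to). */
  public void sort(int from, int to) {
    while (to - from > threshold) {
      int p = partition(from, to);
      // Recurse into the smaller side and loop on the larger one.
      if (p - from < to - p - 1) {
        sort(from, p);
        from = p + 1;
      } else {
        sort(p + 1, to);
        to = p;
      }
    }
    insertionSort(from, to); // ranges at or below the threshold end up here
  }

  private void insertionSort(int from, int to) {
    for (int i = from + 1; i < to; i++) {
      for (int j = i; j > from && compare(j - 1, j) > 0; j--) {
        swap(j - 1, j);
      }
    }
  }

  private int partition(int from, int to) {
    int last = to - 1;
    swap(from + (to - from) / 2, last); // use the middle entry as the pivot
    int store = from;
    for (int i = from; i < last; i++) {
      if (compare(i, last) < 0) {
        swap(i, store++);
      }
    }
    swap(store, last);
    return store;
  }

  /** Convenience helper for trying different thresholds on a plain int[]. */
  public static void sortInts(final int[] a, int threshold) {
    new ThresholdSorterSketch(threshold) {
      @Override protected int compare(int i, int j) { return Integer.compare(a[i], a[j]); }
      @Override protected void swap(int i, int j) { int t = a[i]; a[i] = a[j]; a[j] = t; }
    }.sort(0, a.length);
  }
}
```

A subclass whose swap moves a docID, an ord and a byte[] value would pay much
more per swap than sortInts does, so the best threshold there could differ.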
> Speed up flush of points v2
> ---------------------------
>
> Key: LUCENE-7399
> URL: https://issues.apache.org/jira/browse/LUCENE-7399
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Adrien Grand
> Assignee: Adrien Grand
> Priority: Minor
> Attachments: LUCENE-7399.patch, LUCENE-7399.patch
>
>
> There are improvements we can make on top of LUCENE-7396 to get even better
> flush performance.