[
https://issues.apache.org/jira/browse/LUCENE-6153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14262300#comment-14262300
]
Robert Muir commented on LUCENE-6153:
-------------------------------------
{quote}
It would be nice for these to be consistent.
{quote}
+1
{quote}
Can we have a constant for default block size = 1024. Also might as well have
constants for whatever 1 << 14 and 128 are, but that can be a follow on issue.
{quote}
Here I'm not sure I agree: a named constant tells us what the parameter is
(Java has no named arguments, which is annoying), but it adds unnecessary
indirection, and you lose locality of reference when reading the code. I would
rather do it like this:
{noformat}
return new CompressingStoredFieldsFormat("Lucene50StoredFieldsFast", // codec name
                                         CompressionMode.FAST,       // lz4
                                         1 << 14,                    // block size
                                         128,                        // maximum number of docs per block (to avoid worst cases)
                                         1024                        // chunk size (index interval as number of blocks)
                                         );
{noformat}
> randomize stored fields/vectors index blocksize
> -----------------------------------------------
>
> Key: LUCENE-6153
> URL: https://issues.apache.org/jira/browse/LUCENE-6153
> Project: Lucene - Core
> Issue Type: Test
> Reporter: Robert Muir
> Attachments: LUCENE-6153.patch
>
>
> The Compressing impls compress documents into chunks. We then record index
> data for every N chunks, which is binary searched to find the start of the
> chunk. Today N is always 1024.
> This means to test the stored fields index well, we need to index thousands
> and thousands of documents. But if we randomize the parameter, we can test it
> more effectively by setting it to very low values (e.g. 5) in tests.
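
To illustrate why a small interval helps testing, here is a minimal sketch of a
sparse index that records one entry every N chunks and binary searches those
entries. It is not Lucene's actual implementation: the class name
SparseChunkIndex, its methods, and the in-memory int[] layout are all
hypothetical and chosen only for illustration.

```java
import java.util.Arrays;

/** Hypothetical sparse index: one entry for every blockSize-th chunk. */
final class SparseChunkIndex {
    private final int[] chunkStartDocs;   // first doc ID of each chunk, strictly ascending
    private final int[] indexedStartDocs; // first doc ID of every blockSize-th chunk
    private final int blockSize;          // the "N" from the issue description

    SparseChunkIndex(int[] chunkStartDocs, int blockSize) {
        if (blockSize < 1) {
            throw new IllegalArgumentException("blockSize must be >= 1");
        }
        this.chunkStartDocs = chunkStartDocs.clone();
        this.blockSize = blockSize;
        // Round up so a trailing partial block still gets an entry.
        int entries = (chunkStartDocs.length + blockSize - 1) / blockSize;
        this.indexedStartDocs = new int[entries];
        for (int i = 0; i < entries; i++) {
            indexedStartDocs[i] = chunkStartDocs[i * blockSize];
        }
    }

    /** Number of index entries; 1 means the binary search is trivial. */
    int indexEntryCount() {
        return indexedStartDocs.length;
    }

    /** Returns the ordinal of the chunk that contains docId. */
    int chunkFor(int docId) {
        if (chunkStartDocs.length == 0 || docId < chunkStartDocs[0]) {
            throw new IllegalArgumentException("docId precedes the first chunk");
        }
        // Binary search for the last index entry whose start is <= docId.
        int pos = Arrays.binarySearch(indexedStartDocs, docId);
        int entry = pos >= 0 ? pos : -pos - 2;
        // Then scan linearly inside that block of at most blockSize chunks.
        int chunk = entry * blockSize;
        int end = Math.min(chunk + blockSize, chunkStartDocs.length);
        while (chunk + 1 < end && chunkStartDocs[chunk + 1] <= docId) {
            chunk++;
        }
        return chunk;
    }
}
```

With only 10 chunks, an interval of 1024 produces a single index entry, so the
binary search path is never really exercised. An interval of 3 produces 4
entries from the same data, which is the effect the issue is after.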
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)