[ 
https://issues.apache.org/jira/browse/LUCENE-5188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adrien Grand updated LUCENE-5188:
---------------------------------

    Attachment: LUCENE-5188.patch

Here is a patch that slices large chunks (>= twice the configured chunk size) 
into several LZ4 blocks (of chunkSize bytes each). The LZ4 blocks will be 
decompressed as needed so that you don't end up decompressing everything if you 
only need the first field of your document.

A nice side effect of this patch is that it also reduces memory pressure 
when working with big documents (LUCENE-4955): since big documents are sliced 
into fixed-size blocks, it is no longer necessary to allocate a byte[] the 
size of the whole document (potentially several MB) to decompress it.
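
To make the idea concrete, here is a minimal sketch of the slicing scheme 
described above. It is not the code from the patch: the class SlicedChunk, its 
methods and the blocksDecompressed counter are hypothetical names used for 
illustration only, and java.util.zip's DEFLATE stands in for LZ4 so that the 
example has no dependencies. The sketch slices every input, whereas the patch 
only slices chunks that are at least twice the configured chunk size (see 
shouldSlice).

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical sketch, not Lucene code. DEFLATE is a stand-in for LZ4. */
class SlicedChunk {
    private final List<byte[]> compressedBlocks = new ArrayList<>();
    private final int blockSize;
    private final int totalLength;
    /** Counts block decompressions, for illustration only. */
    int blocksDecompressed = 0;

    /** Threshold from the message: slice only chunks >= 2 * chunkSize. */
    static boolean shouldSlice(long length, long chunkSize) {
        return length >= 2 * chunkSize;
    }

    /** Compresses data as independent blocks of blockSize bytes (last may be shorter). */
    SlicedChunk(byte[] data, int blockSize) {
        if (blockSize <= 0) throw new IllegalArgumentException("blockSize must be > 0");
        this.blockSize = blockSize;
        this.totalLength = data.length;
        for (int off = 0; off < data.length; off += blockSize) {
            int len = Math.min(blockSize, data.length - off);
            compressedBlocks.add(compress(data, off, len));
        }
    }

    /** Returns bytes [offset, offset + length), decompressing only the blocks that overlap it. */
    byte[] read(int offset, int length) {
        if (offset < 0 || length < 0 || (long) offset + length > totalLength) {
            throw new IllegalArgumentException("range out of bounds");
        }
        byte[] out = new byte[length];
        // One reusable buffer of blockSize bytes, never one of the whole document size.
        byte[] buffer = new byte[blockSize];
        int written = 0;
        while (written < length) {
            int pos = offset + written;
            int block = pos / blockSize;
            int blockStart = block * blockSize;
            int blockLen = Math.min(blockSize, totalLength - blockStart);
            decompress(compressedBlocks.get(block), buffer, blockLen);
            blocksDecompressed++;
            int inBlock = pos - blockStart;
            int n = Math.min(length - written, blockLen - inBlock);
            System.arraycopy(buffer, inBlock, out, written, n);
            written += n;
        }
        return out;
    }

    private static byte[] compress(byte[] data, int off, int len) {
        Deflater deflater = new Deflater();
        deflater.setInput(data, off, len);
        deflater.finish();
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        byte[] tmp = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(tmp);
            bos.write(tmp, 0, n);
        }
        deflater.end();
        return bos.toByteArray();
    }

    private static void decompress(byte[] compressed, byte[] dest, int expectedLen) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            int n = 0;
            while (n < expectedLen && !inflater.finished()) {
                int r = inflater.inflate(dest, n, expectedLen - n);
                if (r == 0 && (inflater.needsInput() || inflater.needsDictionary())) break;
                n += r;
            }
            if (n != expectedLen) throw new IllegalStateException("corrupt block");
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();
        }
    }
}
```

With a 10000-byte document and 1024-byte blocks, calling read(0, 100) touches 
only the first block, which mirrors the "only need the first field" case, and 
the working buffer stays at blockSize bytes regardless of the document size.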
                
> Make CompressingStoredFieldsFormat more friendly to StoredFieldVisitors
> -----------------------------------------------------------------------
>
>                 Key: LUCENE-5188
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5188
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>         Attachments: LUCENE-5188.patch
>
>
> The way CompressingStoredFieldsFormat works is that it first decompresses 
> data and then consults the StoredFieldVisitor. This is a bit wasteful in case 
> documents are big and only the first field of a document is of interest so 
> maybe we could decompress and consult the StoredFieldVisitor in a more 
> streaming fashion.

