[ 
https://issues.apache.org/jira/browse/LUCENE-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12848305#action_12848305
 ] 

Michael McCandless commented on LUCENE-2340:
--------------------------------------------

bq. This can be problematic and causes a big overhead when using a large 
blockSize (e.g., 1024), on small segments or on rare-term posting lists.

The block is "shared" across postings, so a rare posting list in an otherwise 
big segment should be fine?

Small segments will indeed be wasteful, but they'll presumably quickly be 
merged away.

bq. The new implementation of SimpleIntBlockIndex* is even more silly than the 
previous one, and stores a vint at the beginning of each block to record the 
length of the block.

Would other, less-silly impls also need to do this?  I.e., the thing I want to 
avoid is foisting onto all block-based codecs the need to track the size of 
every block...

> FixedIntBlockIndexOutput encodes unnecessary integers at the end of a list
> --------------------------------------------------------------------------
>
>                 Key: LUCENE-2340
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2340
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: Flex Branch
>            Reporter: Renaud Delbru
>            Priority: Minor
>             Fix For: Flex Branch
>
>         Attachments: LUCENE-1458-FixedIntBlockIndexOutput.patch, 
> LUCENE-1458-FixedIntBlockIndexOutput.patch
>
>
> At closing time, the current FixedIntBlockIndexOutput flushes a full block of 
> blockSize integers even if there are only a few integers in the block.
> This can be problematic and causes significant overhead when using a large 
> blockSize (e.g., 1024) on small segments or on rare-term posting lists. 
> One solution would be to add a secondary flushBlock method with an additional 
> parameter: the valid length of the buffer. This method would be called only in 
> the FixedIntBlockIndexOutput#close() method.
> How this final, partial block of integers is encoded is left to subclasses.
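
For illustration only, the proposal in the issue description could look 
roughly like the sketch below. It is not the attached patch and not Lucene's 
actual API: SketchFixedIntBlockOutput, ListBackedOutput, write() and the 
counters are hypothetical names, and the "encoding" is just a copy into a 
list. It shows one buffer shared by all postings, full blocks going through 
flushBlock(), and only close() calling flushBlock(int validLength).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a fixed-size int block writer (not Lucene's class).
abstract class SketchFixedIntBlockOutput {
    protected final int[] buffer;
    private int upto; // number of valid ints currently buffered

    SketchFixedIntBlockOutput(int blockSize) {
        this.buffer = new int[blockSize];
    }

    // Encode a completely filled block.
    protected abstract void flushBlock();

    // Encode the final, partially filled block; only buffer[0..validLength)
    // holds meaningful values. How it is encoded is up to the subclass.
    protected abstract void flushBlock(int validLength);

    // Values from every posting list go into the same shared buffer.
    void write(int value) {
        buffer[upto++] = value;
        if (upto == buffer.length) {
            flushBlock();
            upto = 0;
        }
    }

    // Only here can a block be partial, so only here is the
    // secondary method called.
    void close() {
        if (upto > 0) {
            flushBlock(upto);
            upto = 0;
        }
    }
}

// Toy subclass: "encodes" by appending raw ints to a list.
class ListBackedOutput extends SketchFixedIntBlockOutput {
    final List<Integer> encoded = new ArrayList<>();
    int fullBlocks = 0;
    int lastPartialLength = 0;

    ListBackedOutput(int blockSize) {
        super(blockSize);
    }

    @Override
    protected void flushBlock() {
        for (int v : buffer) {
            encoded.add(v);
        }
        fullBlocks++;
    }

    @Override
    protected void flushBlock(int validLength) {
        // Write only the valid prefix instead of padding to blockSize.
        for (int i = 0; i < validLength; i++) {
            encoded.add(buffer[i]);
        }
        lastPartialLength = validLength;
    }
}
```

With a block size of 4 and 6 values written, this sketch emits one full block 
and a final block of 2 values, rather than 8. It does not address how a 
reader would learn the length of that final block, which is the 
size-tracking concern raised in the comment above.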
