[ 
https://issues.apache.org/jira/browse/LUCENE-5583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964245#comment-13964245
 ] 

Adrien Grand commented on LUCENE-5583:
--------------------------------------

I didn't think much about the thread-local, I just thought it would be nice to 
minimize the number of buffer instances. I will make it work like copyBytes for 
consistency.

> Should BufferedChecksumIndexInput have its own buffer?
> ------------------------------------------------------
>
>                 Key: LUCENE-5583
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5583
>             Project: Lucene - Core
>          Issue Type: Bug
>    Affects Versions: 4.8
>            Reporter: Adrien Grand
>            Priority: Blocker
>             Fix For: 4.8, 5.0
>
>         Attachments: LUCENE-5583.patch
>
>
> I was playing with on-the-fly checksum verification and this made me stumble 
> upon an issue with {{BufferedChecksumIndexInput}}.
> I have some code that skips over a {{DataInput}} by reading bytes into 
> /dev/null, eg.
> {code}
>   private static final byte[] SKIP_BUFFER = new byte[1024];
>   private static void skipBytes(DataInput in, long numBytes) throws 
> IOException {
>     assert numBytes >= 0;
>     for (long skipped = 0; skipped < numBytes; ) {
>       final int toRead = (int) Math.min(numBytes - skipped, 
> SKIP_BUFFER.length);
>       in.readBytes(SKIP_BUFFER, 0, toRead);
>       skipped += toRead;
>     }
>   }
> {code}
> It is fine to read into this static buffer, even from multiple threads, since 
> the content that is read doesn't matter here. However, it breaks with 
> {{BufferedChecksumIndexInput}} because of the way that it updates the 
> checksum:
> {code}
>   @Override
>   public void readBytes(byte[] b, int offset, int len)
>     throws IOException {
>     main.readBytes(b, offset, len);
>     digest.update(b, offset, len);
>   }
> {code}
> If you are unlucky enough so that a concurrent call to {{skipBytes}} started 
> modifying the content of {{b}} before the call to {{digest.update(b, offset, 
> len)}} finished, then your checksum will be wrong.
> I think we should make {{BufferedChecksumIndexInput}} read into a private 
> buffer first instead of relying on the user-provided buffer.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org

Reply via email to