[ https://issues.apache.org/jira/browse/LUCENE-5583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964154#comment-13964154 ]
Uwe Schindler commented on LUCENE-5583:
---------------------------------------

+1 on having skipBytes().

One other thing: currently ChecksumIndexInput throws UnsupportedOperationException if you try to seek, but in a subclass we suddenly allow it again (forward only). Maybe we should move that code up to ChecksumIndexInput and document that seeking only works forward and may be costly (because it has to read the skipped bytes)? In that case we would implement seek with skipBytes(), too. This would also allow using ChecksumIndexInput for other codec parts while merging.

> Should BufferedChecksumIndexInput have its own buffer?
> ------------------------------------------------------
>
>                 Key: LUCENE-5583
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5583
>             Project: Lucene - Core
>          Issue Type: Bug
>    Affects Versions: 4.8
>            Reporter: Adrien Grand
>
> I was playing with on-the-fly checksum verification and this made me stumble
> upon an issue with {{BufferedChecksumIndexInput}}.
> I have some code that skips over a {{DataInput}} by reading bytes into
> /dev/null, e.g.:
> {code}
>   private static final byte[] SKIP_BUFFER = new byte[1024];
>   private static void skipBytes(DataInput in, long numBytes) throws IOException {
>     assert numBytes >= 0;
>     for (long skipped = 0; skipped < numBytes; ) {
>       final int toRead = (int) Math.min(numBytes - skipped, SKIP_BUFFER.length);
>       in.readBytes(SKIP_BUFFER, 0, toRead);
>       skipped += toRead;
>     }
>   }
> {code}
> It is fine to read into this static buffer, even from multiple threads, since
> the content that is read doesn't matter here. However, it breaks with
> {{BufferedChecksumIndexInput}} because of the way that it updates the
> checksum:
> {code}
>   @Override
>   public void readBytes(byte[] b, int offset, int len)
>       throws IOException {
>     main.readBytes(b, offset, len);
>     digest.update(b, offset, len);
>   }
> {code}
> If you are unlucky enough that a concurrent call to {{skipBytes}} starts
> modifying the content of {{b}} before the call to {{digest.update(b, offset,
> len)}} has finished, then your checksum will be wrong.
> I think we should make {{BufferedChecksumIndexInput}} read into a private
> buffer first instead of relying on the user-provided buffer.
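
For illustration, the two ideas discussed in this thread (updating the checksum from a private buffer, and a forward-only seek built on skipBytes()) could be sketched together roughly as below. This is only a minimal sketch under assumptions, not the actual Lucene code: the class name ForwardOnlyChecksumInput is hypothetical, a java.nio.ByteBuffer stands in for the wrapped input so the example runs without Lucene, CRC32 is assumed as the digest, and IllegalStateException for backward seeks is an arbitrary choice.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Hypothetical stand-in, NOT Lucene's ChecksumIndexInput. A ByteBuffer plays
// the role of the wrapped input so the sketch is self-contained.
final class ForwardOnlyChecksumInput {
  private final ByteBuffer main;                 // wrapped input (assumption)
  private final CRC32 digest = new CRC32();      // digest choice is an assumption
  private final byte[] buffer = new byte[1024];  // private per-instance buffer, never shared

  ForwardOnlyChecksumInput(byte[] data) {
    this.main = ByteBuffer.wrap(data);
  }

  long getFilePointer() {
    return main.position();
  }

  long getChecksum() {
    return digest.getValue();
  }

  // Bytes are read into the private buffer first, so the digest never looks at
  // the caller's array and cannot be affected by other threads writing to it.
  void readBytes(byte[] b, int offset, int len) {
    while (len > 0) {
      int chunk = Math.min(len, buffer.length);
      main.get(buffer, 0, chunk);  // throws BufferUnderflowException past the end
      digest.update(buffer, 0, chunk);
      System.arraycopy(buffer, 0, b, offset, chunk);
      offset += chunk;
      len -= chunk;
    }
  }

  // Skipping still has to read, because skipped bytes are part of the checksum.
  void skipBytes(long numBytes) {
    if (numBytes < 0) {
      throw new IllegalArgumentException("numBytes must be >= 0: " + numBytes);
    }
    while (numBytes > 0) {
      int chunk = (int) Math.min(numBytes, buffer.length);
      main.get(buffer, 0, chunk);
      digest.update(buffer, 0, chunk);
      numBytes -= chunk;
    }
  }

  // Forward-only seek implemented on top of skipBytes(); cost is linear in the
  // distance skipped. The exception type for backward seeks is an assumption.
  void seek(long pos) {
    long skip = pos - getFilePointer();
    if (skip < 0) {
      throw new IllegalStateException("cannot seek backwards: pos=" + pos);
    }
    skipBytes(skip);
  }
}
```

With something like this, callers would no longer need a shared static skip buffer: they could call skipBytes() or a forward seek() on the input itself, and the checksum would cover the skipped bytes without depending on any caller-owned array. The extra copy in readBytes() is the price paid for that isolation.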