[
https://issues.apache.org/jira/browse/AVRO-541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Doug Cutting updated AVRO-541:
------------------------------
Attachment: AVRO-541.patch
Here's a patch (not for commit) that makes this fail deterministically, every time, by hardwiring the random seed to a value that triggers the failure.
Some observations:
- failures appear to happen in roughly 1 in 16 runs
- the observed failures always occurred when appending a compressed file to an
uncompressed one
- in this particular failure, the final bytes of an appended block, just before
the sync marker, are incorrect. These bytes should be '355 205 335 356
236 r' but are instead 'w y b p d D 215 252 270 335 324 335'.
I found this by looking for the value that fails the unit test:
expected:<{"stringField": "dwvpxfdknqocdbppkpjfkmkmppcowqcmw", "longField":
-4115970600535328707}> but was:<{"stringField":
"dwvpxfdknqocdbppkpjfkmkmppcowqcmw", "longField": -125568963}>
One can scan the file for "dwv..." to find where this value should be. Fortunately
the bug is in an uncompressed file, build/test/test-null-A.avro, so to find what
the bytes for "longField" should be, one can look for "dwv..." there.
Note that the sync marker, which is unique per file, is found immediately after
the null byte that follows the schema text at the head of the file.
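The scan-for-bytes step above can be sketched as a simple search over the raw file contents (the helper below is illustrative, not part of Avro; the pattern is the string from this report):

```java
public class FindBytes {
    // Naive scan for the first occurrence of pattern in data; returns -1 if absent.
    static int indexOf(byte[] data, byte[] pattern) {
        outer:
        for (int i = 0; i <= data.length - pattern.length; i++) {
            for (int j = 0; j < pattern.length; j++) {
                if (data[i + j] != pattern[j]) continue outer;
            }
            return i;
        }
        return -1;
    }

    public static void main(String[] args) throws Exception {
        // To scan the real file, read it whole, e.g.:
        // byte[] data = java.nio.file.Files.readAllBytes(
        //     java.nio.file.Paths.get("build/test/test-null-A.avro"));
        byte[] data = "..dwv..".getBytes("US-ASCII");
        System.out.println(indexOf(data, "dwv".getBytes("US-ASCII"))); // prints 2
    }
}
```

Once the record is located, the bytes following the string value are the "longField" zig-zag varint, which can be compared against the corrupted output.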
So it appears that, for some reason, the uncompressed data buffer that's
appended in this case is both too long and contains some incorrect data at its
end. I have no idea yet why.
Scott, as the author of much of this, do you have any idea?
> Java: TestDataFileConcat sometimes fails
> ----------------------------------------
>
> Key: AVRO-541
> URL: https://issues.apache.org/jira/browse/AVRO-541
> Project: Avro
> Issue Type: Bug
> Components: java
> Reporter: Doug Cutting
> Priority: Critical
> Fix For: 1.4.0
>
> Attachments: AVRO-541.patch
>
>
> TestDataFileConcat intermittently fails with:
> {code}
> Testcase: testConcateateFiles[5] took 0.032 sec
> Caused an ERROR
> java.io.IOException: Block read partially, the data may be corrupt
> org.apache.avro.AvroRuntimeException: java.io.IOException: Block read
> partially, the data may be corrupt
> at
> org.apache.avro.file.DataFileStream.hasNext(DataFileStream.java:173)
> at org.apache.avro.file.DataFileStream.next(DataFileStream.java:193)
> at
> org.apache.avro.TestDataFileConcat.testConcateateFiles(TestDataFileConcat.java:141)
> Caused by: java.io.IOException: Block read partially, the data may be corrupt
> at
> org.apache.avro.file.DataFileStream.hasNext(DataFileStream.java:157)
> {code}