I'm getting the error below when reading an Avro file on Amazon EMR Hadoop. It
does not occur on any recent Apache Hadoop build.
Exception org.apache.avro.AvroRuntimeException: java.io.IOException: Invalid sync!
org.apache.avro.AvroRuntimeException: java.io.IOException: Invalid sync!
    at org.apache.avro.file.DataFileStream.hasNext(DataFileStream.java:176)
    at Abc.readAvroFile(Abc.java:28)
    at Abc.main(Abc.java:65)
Caused by: java.io.IOException: Invalid sync!
    at org.apache.avro.file.DataFileStream.nextRawBlock(DataFileStream.java:258)
    at org.apache.avro.file.DataFileStream.hasNext(DataFileStream.java:164)
    ... 2 more
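The source code that throws the "Invalid sync!" exception (DataFileStream.java,
line 258 in the trace) compares the bytes read after each data block against
the file's sync marker, which suggests a low-level I/O problem: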
{code}
244   DataBlock nextRawBlock(DataBlock reuse) throws IOException {
245     if (!hasNextBlock()) {
246       throw new NoSuchElementException();
247     }
248     if (reuse == null || reuse.data.length < (int) blockSize) {
249       reuse = new DataBlock(blockRemaining, (int) blockSize);
250     } else {
251       reuse.numEntries = blockRemaining;
252       reuse.blockSize = (int) blockSize;
253     }
254     // throws if it can't read the size requested
255     vin.readFixed(reuse.data, 0, reuse.blockSize);
256     vin.readFixed(syncBuffer);
257     if (!Arrays.equals(syncBuffer, sync))
258       throw new IOException("Invalid sync!");
259     availableBlock = false;
260     return reuse;
261   }
{code}
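To make the check at lines 256-258 easier to reason about, here is a minimal standalone sketch of the same idea. It is not Avro's code: the class SyncCheckSketch, its methods, and the simplified block layout (a 4-byte length, the payload, then a 16-byte sync marker) are all made up for illustration. It only shows that once the stream is shifted by a single byte, the bytes read where the marker should be no longer match, and the reader fails with the same message.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical illustration only; this is not the Avro container format.
class SyncCheckSketch {
    static final int SYNC_SIZE = 16; // Avro sync markers are 16 bytes long

    // Writes each payload as [int length][payload bytes][sync marker].
    static byte[] writeBlocks(byte[] sync, byte[]... payloads) {
        int total = 0;
        for (byte[] p : payloads) {
            total += 4 + p.length + sync.length;
        }
        ByteBuffer buf = ByteBuffer.allocate(total);
        for (byte[] p : payloads) {
            buf.putInt(p.length).put(p).put(sync);
        }
        return buf.array();
    }

    // Reads blocks back and verifies the marker after each one,
    // mirroring the Arrays.equals(syncBuffer, sync) check quoted above.
    static int readBlocks(byte[] file, byte[] sync) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(file));
        byte[] syncBuffer = new byte[SYNC_SIZE];
        int blocks = 0;
        while (in.available() > 0) {
            int size = in.readInt();
            byte[] data = new byte[size];
            in.readFully(data);       // throws if it can't read the size requested
            in.readFully(syncBuffer); // the next 16 bytes should be the marker
            if (!Arrays.equals(syncBuffer, sync)) {
                throw new IOException("Invalid sync!");
            }
            blocks++;
        }
        return blocks;
    }

    // Returns a short description instead of throwing, for easy printing.
    static String describeRead(byte[] file, byte[] sync) {
        try {
            return "ok: " + readBlocks(file, sync) + " blocks";
        } catch (IOException e) {
            return "failed: " + e.getMessage();
        }
    }

    // Simulates a byte lost somewhere between the file and the reader.
    static byte[] dropByte(byte[] file, int index) {
        byte[] out = new byte[file.length - 1];
        System.arraycopy(file, 0, out, 0, index);
        System.arraycopy(file, index + 1, out, index, file.length - index - 1);
        return out;
    }

    public static void main(String[] args) {
        byte[] sync = new byte[SYNC_SIZE];
        for (int i = 0; i < sync.length; i++) {
            sync[i] = (byte) i;
        }
        byte[] file = writeBlocks(sync, new byte[] {1, 2, 3, 4, 5}, new byte[] {6, 7, 8});

        System.out.println(describeRead(file, sync));              // ok: 2 blocks
        System.out.println(describeRead(dropByte(file, 4), sync)); // failed: Invalid sync!
    }
}
```

In the second call the first payload byte is missing, so the reader consumes one byte of the marker as payload and the comparison fails. That is consistent with, but does not prove, the idea that the EMR read path is delivering different bytes than the Apache build does.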
It looks like this commit from Doug Cutting removed those error messages:
http://www.mail-archive.com/[email protected]/msg00218.html
Does anyone have any idea what could cause these errors?
Thanks,
Matt