Hi,
I was finally able to reproduce an error that had been driving me crazy for the
last few weeks: I get reproducible data corruption when reading a file from the
DFS. Reading the file locally seems to work fine, and making sure to read at
most 2048 bytes at a time seems to avoid the bug as well. This happens on the
latest release candidate 0.18.0 and also on older versions (I was using the
CVS version of 0.17.2 before).
Here is the code I'm using:
// initialize ... DistributedFileSystem
FSDataInputStream in = remoteFileSystem.open(new Path(feoutput));
byte[] buffer = new byte[4];
while (in.available() >= 1) {
    in.read(buffer, 0, 4);                           // 4-byte record id
    int id = BinConverter.byteArrayToInt(buffer, 0);
    in.read(buffer, 0, 4);                           // 4-byte payload size
    int size = BinConverter.byteArrayToInt(buffer, 0);
    log.info("ID %d -> %d", id, size);
    byte[] aaa = new byte[size];

    // Only one of the three variants below is active at a time:

    // Variant 1: read the payload in chunks of at most 2048 bytes.
    // THIS WORKS!
    while (size > 0) {
        int read = in.read(aaa, aaa.length - size, Math.min(2048, size));
        size -= read;
    }

    // Variant 2: skip over the payload (works too, but I want the data).
    in.skip((long) size);

    // Variant 3: read the payload in one call. THIS DOESN'T!
    in.read(aaa, 0, size);
}
in.close();
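
For comparison, since FSDataInputStream extends java.io.DataInputStream, I
would expect readFully() to behave like the working chunked loop, because it
loops internally until the buffer is completely filled (or throws
EOFException). This is just a sketch of the same record walk, not code I have
verified against the bug:

// Sketch: same record layout (4-byte id, 4-byte size, then `size`
// payload bytes), but using DataInputStream.readFully, which never
// returns a partially filled buffer.
FSDataInputStream in2 = remoteFileSystem.open(new Path(feoutput));
byte[] header = new byte[4];
while (in2.available() >= 1) {
    in2.readFully(header, 0, 4);
    int id = BinConverter.byteArrayToInt(header, 0);
    in2.readFully(header, 0, 4);
    int size = BinConverter.byteArrayToInt(header, 0);
    byte[] payload = new byte[size];
    in2.readFully(payload, 0, size);   // fills completely or throws
    // ... process id / payload ...
}
in2.close();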
Interestingly, variant 1 works, and so does the skip, but when reading the
data in all at once I get the following data corruption after a while:
..........
08/08/17 17:38:25 INFO indexer.PCreateClusters: ID 46126 -> 1322
08/08/17 17:38:25 INFO indexer.PCreateClusters: ID 46127 -> 1547
08/08/17 17:38:25 INFO indexer.PCreateClusters: ID 46128 -> 1470
08/08/17 17:38:25 INFO indexer.PCreateClusters: ID 46129 -> 675
08/08/17 17:38:25 INFO indexer.PCreateClusters: ID 46130 -> 1666
08/08/17 17:38:25 INFO indexer.PCreateClusters: ID 46131 -> 765
08/08/17 17:38:25 INFO indexer.PCreateClusters: ID 46132 -> 574
08/08/17 17:38:25 INFO indexer.PCreateClusters: ID 46133 -> 1761
08/08/17 17:38:25 INFO indexer.PCreateClusters: ID 46134 -> 937
08/08/17 17:38:25 INFO indexer.PCreateClusters: ID 46135 -> 942
vs: (here it fails)
......
08/08/17 17:35:36 INFO indexer.PCreateClusters: ID 46133 -> 1761
08/08/17 17:35:36 INFO indexer.PCreateClusters: ID 46134 -> 937
08/08/17 17:35:36 INFO indexer.PCreateClusters: ID 1240951364 -> -2045632292
java.lang.NegativeArraySizeException
.................
Any ideas on how this can be fixed? I suspect a buffer is filled incorrectly in
some cases when the size of a single read call exceeds a certain limit.
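
In case it helps to narrow this down, here is roughly how I would compare the
two read paths byte for byte at a size past the suspected limit. This is only
a sketch: the 64 KB size is just a guess, and a single read() is in principle
allowed to return fewer bytes than requested.

// Sketch: open the file twice and compare a chunked read against a
// single large read of the same region.
int size = 64 * 1024;                  // guessed size past the limit
FSDataInputStream a = remoteFileSystem.open(new Path(feoutput));
FSDataInputStream b = remoteFileSystem.open(new Path(feoutput));
byte[] chunked = new byte[size];
byte[] single = new byte[size];
int left = size;
while (left > 0) {                     // the chunked path that works for me
    int r = a.read(chunked, size - left, Math.min(2048, left));
    if (r < 0) break;                  // EOF guard
    left -= r;
}
int got = b.read(single, 0, size);     // the single call that seems to fail
log.info("single read returned %d of %d, identical=%b",
        got, size, java.util.Arrays.equals(chunked, single));
a.close();
b.close();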
Thanks for your help,
Thibaut