Hi,
I’m trying to run TeraSort with the latest Crail (v1.2-rc2-1-g8a739dd), and 
I’m getting the error below.

(Job aborted due to stage failure: Task 36 in stage 1.0 failed 4 times, most 
recent failure: Lost task 36.3 in stage 1.0)

There is never a getBlock call for that fd (19318) in that task. I also see 
that getBlock is called 6 times on the previous fd (19153), each time with a 
different position. Is that wrong? Could the namenode be hitting a collision, 
or be stuck? I also only see these tasks (36.x) running on one executor.
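
For anyone wanting to reproduce the counts above, here is a minimal sketch of 
how the getBlock calls can be tallied per fd from an executor log. The function 
name tally_getblock is hypothetical, and the regular expressions assume the log 
format shown in the excerpt in this message. Records may be wrapped across 
lines, so whitespace is collapsed before matching.

```python
import re
from collections import defaultdict


def tally_getblock(log_text):
    """Return (positions per fd, fds that were opened but never fetched).

    Assumes the crail client log format seen in this message; this is an
    illustrative helper, not part of Crail.
    """
    # Records may be wrapped mid-line, so collapse all whitespace first.
    flat = " ".join(log_text.split())

    # fds for which an input stream was opened.
    opened = {
        int(fd)
        for fd in re.findall(r"CoreInputStream: open, path \S+, fd (\d+)", flat)
    }

    # Every getBlock RPC, grouped by fd, in log order.
    positions = defaultdict(list)
    for fd, pos in re.findall(
        r"RPC: getBlock, fd (\d+), token \d+, position (\d+)", flat
    ):
        positions[int(fd)].append(int(pos))

    never_fetched = sorted(opened - set(positions))
    return dict(positions), never_fetched
```

Reading a log file (for example a hypothetical executor.log) and passing its 
contents to tally_getblock gives the list of positions requested for each fd, 
plus the fds that were opened without any getBlock call in that log.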

BTW, I should note that I’m not running with

com.ibm.crail.terasort.sorter.CrailShuffleNativeRadixSorter
or
com.ibm.crail.terasort.serializer.F22Serializer

as I couldn’t get them to run without errors: I get an “NYI” assertion error 
when either one is used. Would this matter?


20/01/09 10:34:35 INFO crail: lookupDirectory: path /spark/shuffle/shuffle_0/part_36/1-4-35352996

20/01/09 10:34:35 DEBUG crail: RPC: getFile, writeable false

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: lookup: name /spark/shuffle/shuffle_0/part_36/1-4-35352996, success, fd 19318

20/01/09 10:34:35 INFO crail: CoreInputStream: open, path /spark/shuffle/shuffle_0/part_36/1-4-35352996, fd 19318, streamId 836, isDir false, readHint 4754948

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 DEBUG crail: RPC: getBlock, fd 19153, token 0, position 2097152, capacity 7070730

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 DEBUG crail: RPC: getBlock, fd 19153, token 0, position 3145728, capacity 7070730

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 DEBUG crail: RPC: getBlock, fd 19153, token 0, position 4194304, capacity 7070730

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: lookupDirectory: path /spark/shuffle/shuffle_0/part_54/1-3-35352997

20/01/09 10:34:35 DEBUG crail: RPC: getFile, writeable false

20/01/09 10:34:35 INFO crail: lookup: name /spark/shuffle/shuffle_0/part_54/1-3-35352997, success, fd 19079

20/01/09 10:34:35 INFO crail: CoreInputStream: open, path /spark/shuffle/shuffle_0/part_54/1-3-35352997, fd 19079, streamId 837, isDir false, readHint 7086206

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 DEBUG crail: RPC: getBlock, fd 19079, token 0, position 1048576, capacity 7086206

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 DEBUG crail: RPC: getBlock, fd 19153, token 0, position 5242880, capacity 7070730

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: lookupDirectory: path /spark/shuffle/shuffle_0/part_36/3-1-35352995

20/01/09 10:34:35 DEBUG crail: RPC: getFile, writeable false

20/01/09 10:34:35 INFO crail: lookup: name /spark/shuffle/shuffle_0/part_36/3-1-35352995, success, fd 18715

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: CoreInputStream: open, path /spark/shuffle/shuffle_0/part_36/3-1-35352995, fd 18715, streamId 838, isDir false, readHint 9487318

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 DEBUG crail: RPC: getBlock, fd 19153, token 0, position 6291456, capacity 7070730

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 DEBUG crail: RPC: getBlock, fd 18715, token 0, position 1048576, capacity 9487318

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 DEBUG crail: RPC: getBlock, fd 19079, token 0, position 2097152, capacity 7086206

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 DEBUG crail: RPC: getBlock, fd 19079, token 0, position 3145728, capacity 7086206

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 DEBUG crail: RPC: getBlock, fd 18715, token 0, position 2097152, capacity 9487318

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 DEBUG crail: RPC: getBlock, fd 18715, token 0, position 3145728, capacity 9487318

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: EndpointCache hit /192.168.2.100:4420, fsId 0

20/01/09 10:34:35 INFO crail: lookupDirectory: path /spark/shuffle/shuffle_0/part_55/1-4-35352996

20/01/09 10:34:35 DEBUG crail: RPC: getFile, writeable false

20/01/09 10:34:35 INFO crail: lookup: name /spark/shuffle/shuffle_0/part_55/1-4-35352996, success, fd 19337

20/01/09 10:34:35 INFO crail: CoreInputStream: open, path /spark/shuffle/shuffle_0/part_55/1-4-35352996, fd 19337, streamId 839, isDir false, readHint 4764488



Regards,

           David

C: 714-476-2692

