cause of RPC error?

AlexG Thu, 04 Feb 2016 13:35:21 -0800

I am simply trying to load an RDD from disk with

    transposeRowsRDD.avro(baseInputFname).rdd.map(<some pointwise function here, not important>)

and I get this error in my log:


16/02/04 11:44:07 ERROR TaskSchedulerImpl: Lost executor 7 on nid00788: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.

When I check the log for that node (I guess this is what it means by
driver?) I see:

16/02/04 11:43:55 INFO TorrentBroadcast: Started reading broadcast variable 152
16/02/04 11:43:55 INFO MemoryStore: Block broadcast_152_piece0 stored as bytes in memory (estimated size 19.3 KB, free 8.8 GB)
16/02/04 11:43:55 INFO TorrentBroadcast: Reading broadcast variable 152 took 4 ms
16/02/04 11:43:55 INFO MemoryStore: Block broadcast_152 stored as values in memory (estimated size 364.3 KB, free 8.8 GB)
16/02/04 11:43:56 INFO MemoryStore: Block rdd_1634_1637 stored as values in memory (estimated size 24.0 B, free 8.8 GB)
16/02/04 11:43:56 INFO Executor: Finished task 1637.0 in stage 0.0 (TID 1637). 2602 bytes result sent to driver
16/02/04 11:43:56 INFO CoarseGrainedExecutorBackend: Got assigned task 1643
16/02/04 11:43:56 INFO Executor: Running task 1643.0 in stage 0.0 (TID 1643)
16/02/04 11:43:56 INFO CacheManager: Partition rdd_1634_1643 not found, computing it
16/02/04 11:43:56 INFO HadoopRDD: Input split: file:/global/cscratch1/sd/gittens/CFSRA/CFSRAparquetTranspose/CFSRAparquetTranspose0/part-r-00007-ddeb3951-d7da-4926-a16c-d54d71850131.avro:134217728+33554432

So the executor seems to have crashed without emitting any error message,
even though it had plenty of memory free (I also grepped for WARNING
messages and didn't see any). Any idea what might be happening, or how to
debug it? Several other executors are lost in the same way. I'm using Spark
in standalone mode.
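
One caveat on the grep step, with a small sketch: the log lines quoted above label their level as a bare token (INFO, ERROR), and the scheduler's hint says to look for WARN messages, so a search for the literal string WARNING would not match lines at that level. The Python helper below is only an illustration of filtering such logs by level. The name `find_problem_lines` and the regular expression are my own assumptions, written against the line layout visible in the excerpts ("yy/MM/dd HH:mm:ss LEVEL Logger: message"); a differently configured log4j pattern would need a different expression.

```python
import re

# Assumed layout, taken from the excerpts above:
# "yy/MM/dd HH:mm:ss LEVEL Logger: message"
LOG_LINE = re.compile(
    r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (\w+) ([^:]+): (.*)$"
)


def find_problem_lines(lines, levels=("WARN", "ERROR", "FATAL")):
    """Return (timestamp, level, logger, message) tuples for the given levels.

    Hypothetical helper for illustration. Lines that do not start with a
    timestamp (for example, wrapped continuations) are skipped.
    """
    hits = []
    for line in lines:
        m = LOG_LINE.match(line.rstrip("\n"))
        if m and m.group(2) in levels:
            hits.append(m.groups())
    return hits


# Two lines copied from the logs in this message.
sample = [
    "16/02/04 11:43:56 INFO Executor: Running task 1643.0 in stage 0.0 (TID 1643)",
    "16/02/04 11:44:07 ERROR TaskSchedulerImpl: Lost executor 7 on nid00788: "
    "Remote RPC client disassociated.",
]

for hit in find_problem_lines(sample):
    print(hit)
```

To run it over a real log, something like `find_problem_lines(open(path))` should work, where `path` is whichever executor or driver log file is being inspected.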

