My Spark SQL jobs keep failing with the error "Container exited with
a non-zero exit code 143".
I know that it is caused by memory usage exceeding the limit set by
spark.yarn.executor.memoryOverhead. As shown below, a memory allocation
request failed at 18/11/08 17:36:05, and the executor then received SIGTERM.
Can a Spark executor avoid being killed like this?
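
For reference, exit code 143 follows the usual convention that a process killed by signal N exits with 128 + N. A minimal sketch of that arithmetic, assuming a POSIX host (the variable names are only illustrative):

```python
import signal

# Convention: a process terminated by signal N reports exit code 128 + N.
exit_code = 143
signal_number = exit_code - 128

# Look up the symbolic name for that signal number on this platform.
signal_name = signal.Signals(signal_number).name

print(f"exit code {exit_code} -> signal {signal_number} ({signal_name})")
```

So the exit code itself only says the container was sent SIGTERM; it does not say which limit was exceeded.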

My conf:
--master yarn-client \
--driver-memory 10G \
--executor-memory 10G \
--executor-cores 5 \
--num-executors 12 \
--conf "spark.executor.extraJavaOptions= -XX:MaxPermSize=256M" \
--conf "spark.sql.shuffle.partitions=200" \
--conf "spark.scheduler.mode=FAIR" \
--conf "spark.yarn.executor.memoryOverhead=2048" \
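
As a rough sketch of what these options ask YARN for: each executor container is requested as executor memory plus memory overhead. The figures come from the options above. The default-overhead formula (10% of executor memory, minimum 384 MB) is the documented Spark 2.x default and is shown only for comparison. YARN may also round requests up to its minimum allocation size, which is not modeled here.

```python
# Values taken from the submit options above, in MB.
executor_memory_mb = 10 * 1024   # --executor-memory 10G
memory_overhead_mb = 2048        # spark.yarn.executor.memoryOverhead=2048
num_executors = 12               # --num-executors 12

# Per-executor container request: JVM heap plus off-heap overhead.
container_request_mb = executor_memory_mb + memory_overhead_mb

# Overhead Spark 2.x would use if the setting were left unset:
# max(384 MB, 10% of executor memory).
default_overhead_mb = max(384, executor_memory_mb // 10)

# Total across all executors (the driver is not included).
total_request_mb = container_request_mb * num_executors

print("per-executor container (MB):", container_request_mb)
print("default overhead if unset (MB):", default_overhead_mb)
print("all executors (MB):", total_request_mb)
```

With these settings each executor gets about 12 GB in total, of which 2 GB is available for anything outside the JVM heap.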

18/11/08 17:35:52 INFO [Executor task launch worker for task 13694]
FileScanRDD: Reading File path: hdfs://
range: 134217728-268435456, partition values: [20180103]
18/11/08 17:35:52 INFO [Executor task launch worker for task 13700]
FileScanRDD: Reading File path: hdfs://
range: 402653184-536870912, partition values: [20180104]
18/11/08 17:35:52 INFO [Executor task launch worker for task 13688]
FileScanRDD: Reading File path: hdfs://
range: 134217728-268435456, partition values: [20180101]
18/11/08 17:35:52 INFO [Executor task launch worker for task 13694]
TorrentBroadcast: Started reading broadcast variable 135
18/11/08 17:35:52 INFO [Executor task launch worker for task 13694]
MemoryStore: Block broadcast_135_piece0 stored as bytes in memory
(estimated size 27.2 KB, free 1822.3 MB)
18/11/08 17:35:52 INFO [Executor task launch worker for task 13694]
TorrentBroadcast: Reading broadcast variable 135 took 3 ms
18/11/08 17:35:52 INFO [Executor task launch worker for task 13694]
MemoryStore: Block broadcast_135 stored as values in memory (estimated size
365.6 KB, free 1821.9 MB)
18/11/08 17:36:00 INFO [Executor task launch worker for task 13700]
ShuffleExternalSorter: Thread 1100 spilling sort data of 580.0 MB to disk
(0  time so far)
18/11/08 17:36:03 INFO [Executor task launch worker for task 13688]
ShuffleExternalSorter: Thread 1098 spilling sort data of 580.0 MB to disk
(0  time so far)
18/11/08 17:36:05 WARN [Executor task launch worker for task 13694]
TaskMemoryManager: Failed to allocate a page (67108864 bytes), try again.
18/11/08 17:36:05 INFO [Executor task launch worker for task 13694]
ShuffleExternalSorter: Thread 1099 spilling sort data of 514.0 MB to disk
(0  time so far)
18/11/08 17:36:05 ERROR [SIGTERM handler] CoarseGrainedExecutorBackend:
18/11/08 17:36:05 INFO [Thread-2] DiskBlockManager: Shutdown hook called
18/11/08 17:36:10 INFO [Thread-2] ShutdownHookManager: Shutdown hook called
18/11/08 17:36:10 INFO [Thread-2] ShutdownHookManager: Deleting directory
18/11/08 17:36:10 INFO [Thread-2] ShutdownHookManager: Deleting directory
18/11/08 17:36:10 INFO [Thread-2] ShutdownHookManager: Deleting directory
18/11/08 17:36:10 INFO [Thread-2] ShutdownHookManager: Deleting directory
18/11/08 17:36:10 INFO [Thread-2] ShutdownHookManager: Deleting directory
18/11/08 17:36:10 INFO [Thread-2] ShutdownHookManager: Deleting directory
18/11/08 17:36:10 INFO [Thread-2] ShutdownHookManager: Deleting directory
18/11/08 17:36:10 INFO [Thread-2] ShutdownHookManager: Deleting directory
18/11/08 17:36:10 INFO [Thread-2] ShutdownHookManager: Deleting directory
18/11/08 17:36:10 INFO [Thread-2] ShutdownHookManager: Deleting directory
18/11/08 17:36:10 INFO [Thread-2] ShutdownHookManager: Deleting directory
18/11/08 17:36:10 INFO [Thread-2] ShutdownHookManager: Deleting directory
18/11/08 17:36:10 INFO [Thread-2] ShutdownHookManager: Deleting directory
18/11/08 17:36:10 INFO [Thread-2] ShutdownHookManager: Deleting directory
