Hello,

We are running Spark 1.4.1 on YARN, with the following environment:

java.vendor=Oracle Corporation
java.runtime.version=1.7.0_40-b43
datanucleus-core-3.2.10.jar
datanucleus-api-jdo-3.2.6.jar
datanucleus-rdbms-3.2.9.jar
Task details from the Spark UI for two of the stuck tasks:

Task 1
  Index/ID/Attempt:             1672060
  Status:                       RUNNING
  Locality Level:               RACK_LOCAL
  Executor ID / Host:           56 / foo1.net
  Launch Time:                  2015/11/14 15:28:56
  Duration:                     5.1 h
  GC Time:                      -
  Input Size / Records:         0.0 B (hadoop) / 0
  Write Time:                   -
  Shuffle Write Size / Records: 0.0 B / 0
  Errors:                       -

Task 2
  Index/ID/Attempt:             130176 0
  Status:                       RUNNING
  Locality Level:               RACK_LOCAL
  Executor ID / Host:           19 / foo2.net
  Launch Time:                  2015/11/14 03:26:22
  Duration:                     16.8 h
  GC Time:                      82 ms
  Input Size / Records:         0.0 B (hadoop) / 1659204
  Write Time:                   -
  Shuffle Write Size / Records: 0.0 B / 0
  Errors:                       -

Our Spark jobs had been running fine until now. Suddenly we saw a few lone executors that received 0 records to process and got stuck indefinitely. We killed some jobs that had been running for 16+ hours.

This looks like a Spark bug. Is anyone aware of a known issue in this version of Spark?