Hi, I am getting an out-of-memory error in the worker log of my streaming jobs every couple of hours, after which the worker dies. There is no shuffle, no aggregation, and no caching in the job; it's just a transformation. I'm not able to identify where the problem is: the driver or the executor. And why does the whole worker die after the OOM? Shouldn't only the streaming job die? Am I missing something?
Driver memory: 2g
Executor memory: 4g
Spark version: 2.4
Kafka Direct Stream
Spark Standalone Cluster

20/05/06 12:52:20 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
20/05/06 12:53:03 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[ExecutorRunner for app-20200506124717-10226/0,5,main]
java.lang.OutOfMemoryError: Java heap space
        at org.apache.xerces.util.XMLStringBuffer.append(Unknown Source)
        at org.apache.xerces.impl.XMLEntityScanner.scanData(Unknown Source)
        at org.apache.xerces.impl.XMLScanner.scanComment(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanComment(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
        at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2480)
        at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2468)
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2539)
        at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2492)
        at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2405)
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1143)
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1115)
        at org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopConfigurations(SparkHadoopUtil.scala:464)
        at org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:436)
        at org.apache.spark.deploy.SparkHadoopUtil.newConfiguration(SparkHadoopUtil.scala:114)
        at org.apache.spark.SecurityManager.<init>(SecurityManager.scala:114)
        at org.apache.spark.deploy.worker.ExecutorRunner.org$apache$spark$deploy$worker$ExecutorRunner$$fetchAndRunExecutor(ExecutorRunner.scala:149)
        at org.apache.spark.deploy.worker.ExecutorRunner$$anon$1.run(ExecutorRunner.scala:73)
20/05/06 12:53:38 INFO DriverRunner: Worker shutting down, killing driver driver-20200505181719-1187
20/05/06 12:53:38 INFO DriverRunner: Killing driver process!

Regards,
Hrishi
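P.S. To narrow down which JVM is running out of heap, I'm thinking of turning on heap dumps on OOM. A sketch of the settings I plan to try (the dump paths are placeholders from my setup, not anything prescribed):

```properties
# spark-defaults.conf (sketch) -- my current sizes plus heap-dump diagnostics
spark.driver.memory              2g
spark.executor.memory            4g
# Standard HotSpot flags: write an .hprof file when the driver or executor JVM
# hits OutOfMemoryError, for inspection with jmap / Eclipse MAT.
# The /tmp paths below are placeholders for my environment.
spark.driver.extraJavaOptions    -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/driver.hprof
spark.executor.extraJavaOptions  -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/executor.hprof
```

One thing I noticed: the stack trace above is in the Worker daemon's own JVM (ExecutorRunner), and if I understand the standalone docs correctly, that daemon's heap is sized separately via SPARK_DAEMON_MEMORY in conf/spark-env.sh, not by the executor/driver settings above. Does that mean the Worker daemon heap itself is what's exhausted?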