These errors don't give any real clue. I can't figure out why this
OutOfMemoryError is being thrown.


20/05/08 15:36:55 INFO Worker: Asked to kill driver
driver-20200508153502-1291

20/05/08 15:36:55 INFO DriverRunner: Killing driver process!

20/05/08 15:36:55 INFO CommandUtils: Redirection to
/grid/1/spark/work/driver-20200508153502-1291/stderr closed: Stream closed

20/05/08 15:36:55 INFO CommandUtils: Redirection to
/grid/1/spark/work/driver-20200508153502-1291/stdout closed: Stream closed

20/05/08 15:36:55 INFO ExternalShuffleBlockResolver: Application
app-20200508153654-11776 removed, cleanupLocalDirs = true

20/05/08 15:36:55 INFO Worker: Driver driver-20200508153502-1291 was killed
by user

*20/05/08 15:43:06 WARN AbstractChannelHandlerContext: An exception
'java.lang.OutOfMemoryError: Java heap space' [enable DEBUG level for full
stacktrace] was thrown by a user handler's exceptionCaught() method while
handling the following exception:*

*java.lang.OutOfMemoryError: Java heap space*

*20/05/08 15:43:23 ERROR SparkUncaughtExceptionHandler: Uncaught exception
in thread Thread[dispatcher-event-loop-6,5,main]*

*java.lang.OutOfMemoryError: Java heap space*

*20/05/08 15:43:17 WARN AbstractChannelHandlerContext: An exception
'java.lang.OutOfMemoryError: Java heap space' [enable DEBUG level for full
stacktrace] was thrown by a user handler's exceptionCaught() method while
handling the following exception:*

*java.lang.OutOfMemoryError: Java heap space*

20/05/08 15:43:33 INFO ExecutorRunner: Killing process!

20/05/08 15:43:33 INFO ExecutorRunner: Killing process!

20/05/08 15:43:33 INFO ExecutorRunner: Killing process!

20/05/08 15:43:33 INFO ShutdownHookManager: Shutdown hook called

20/05/08 15:43:33 INFO ShutdownHookManager: Deleting directory
/grid/1/spark/local/spark-e045e069-e126-4cff-9512-d36ad30ee922


On Thu, May 7, 2020 at 10:16 PM Hrishikesh Mishra <sd.hri...@gmail.com>
wrote:

> It's only happening for the Hadoop config. The exception traces are different
> each time it dies, and the jobs run for a couple of hours before the worker dies.
>
> Another Reason:
>
> *20/05/02 02:26:34 ERROR SparkUncaughtExceptionHandler: Uncaught exception
> in thread Thread[ExecutorRunner for app-20200501213234-9846/3,5,main]*
>
> *java.lang.OutOfMemoryError: Java heap space*
>
> * at org.apache.xerces.xni.XMLString.toString(Unknown Source)*
>
> at org.apache.xerces.parsers.AbstractDOMParser.characters(Unknown Source)
>
> at org.apache.xerces.xinclude.XIncludeHandler.characters(Unknown Source)
>
> at
> org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanContent(Unknown
> Source)
>
> at
> org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown
> Source)
>
> at
> org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown
> Source)
>
> at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
>
> at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
>
> at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
>
> at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
>
> at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
>
> at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
>
> at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2480)
>
> at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2468)
>
> at
> org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2539)
>
> at
> org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2492)
>
> at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2405)
>
> at org.apache.hadoop.conf.Configuration.set(Configuration.java:1143)
>
> at org.apache.hadoop.conf.Configuration.set(Configuration.java:1115)
>
> at
> org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopConfigurations(SparkHadoopUtil.scala:464)
>
> at
> org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:436)
>
> at
> org.apache.spark.deploy.SparkHadoopUtil.newConfiguration(SparkHadoopUtil.scala:114)
>
> at org.apache.spark.SecurityManager.<init>(SecurityManager.scala:114)
>
> at org.apache.spark.deploy.worker.ExecutorRunner.org
> $apache$spark$deploy$worker$ExecutorRunner$$fetchAndRunExecutor(ExecutorRunner.scala:149)
>
> at
> org.apache.spark.deploy.worker.ExecutorRunner$$anon$1.run(ExecutorRunner.scala:73)
>
> *20/05/02 02:26:37 ERROR SparkUncaughtExceptionHandler: Uncaught exception
> in thread Thread[dispatcher-event-loop-3,5,main]*
>
> *java.lang.OutOfMemoryError: Java heap space*
>
> * at java.lang.Class.newInstance(Class.java:411)*
>
> at
> sun.reflect.MethodAccessorGenerator$1.run(MethodAccessorGenerator.java:403)
>
> at
> sun.reflect.MethodAccessorGenerator$1.run(MethodAccessorGenerator.java:394)
>
> at java.security.AccessController.doPrivileged(Native Method)
>
> at
> sun.reflect.MethodAccessorGenerator.generate(MethodAccessorGenerator.java:393)
>
> at
> sun.reflect.MethodAccessorGenerator.generateSerializationConstructor(MethodAccessorGenerator.java:112)
>
> at
> sun.reflect.ReflectionFactory.generateConstructor(ReflectionFactory.java:398)
>
> at
> sun.reflect.ReflectionFactory.newConstructorForSerialization(ReflectionFactory.java:360)
>
> at
> java.io.ObjectStreamClass.getSerializableConstructor(ObjectStreamClass.java:1520)
>
> at java.io.ObjectStreamClass.access$1500(ObjectStreamClass.java:79)
>
> at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:507)
>
> at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:482)
>
> at java.security.AccessController.doPrivileged(Native Method)
>
> at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:482)
>
> at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:379)
>
> at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:478)
>
> at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:379)
>
> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1134)
>
> at
> java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
>
> at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
>
> at
> java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
>
> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
>
> at
> java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
>
> at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
>
> at
> java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
>
> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
>
> at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
>
> at
> org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:43)
>
> at
> org.apache.spark.rpc.netty.RequestMessage.serialize(NettyRpcEnv.scala:565)
>
> at org.apache.spark.rpc.netty.NettyRpcEnv.send(NettyRpcEnv.scala:193)
>
> at
> org.apache.spark.rpc.netty.NettyRpcEndpointRef.send(NettyRpcEnv.scala:528)
>
> at org.apache.spark.deploy.worker.Worker.org
> $apache$spark$deploy$worker$Worker$$sendToMaster(Worker.scala:658)
>
> *20/05/02 02:26:34 ERROR SparkUncaughtExceptionHandler: Uncaught exception
> in thread Thread[spark-shuffle-directory-cleaner-4-1,5,main]*
>
> *java.lang.OutOfMemoryError: Java heap space*
>
> * at java.io.UnixFileSystem.resolve(UnixFileSystem.java:108)*
>
> * at java.io.File.<init>(File.java:262)*
>
> * at java.io.File.listFiles(File.java:1253)*
>
> at
> org.apache.spark.network.util.JavaUtils.listFilesSafely(JavaUtils.java:177)
>
> at
> org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingJavaIO(JavaUtils.java:140)
>
> at
> org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:118)
>
> at
> org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingJavaIO(JavaUtils.java:128)
>
> at
> org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:118)
>
> at
> org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.deleteNonShuffleFiles(ExternalShuffleBlockResolver.java:269)
>
> at
> org.apache.spark.network.shuffle.ExternalShuffleBlockResolver.lambda$executorRemoved$1(ExternalShuffleBlockResolver.java:235)
>
> at
> org.apache.spark.network.shuffle.ExternalShuffleBlockResolver$$Lambda$19/1657523067.run(Unknown
> Source)
>
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>
> at
> io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
>
> at java.lang.Thread.run(Thread.java:748)
>
> 20/05/02 02:27:03 INFO ExecutorRunner: Killing pro
>
>
>
> Another Reason
>
> 20/05/02 22:15:21 INFO DriverRunner: Copying user jar
> http://XX.XX.XXX.19:90/jar/hc-job-1.0-SNAPSHOT.jar to
> /grid/1/spark/work/driver-20200502221520-1101/hc-job-1.0-SNAPSHOT.jar
> *20/05/02 22:15:50 WARN TransportChannelHandler: Exception in connection
> from /XX.XX.XXX.19:7077*
> *java.lang.OutOfMemoryError: Java heap space*
> * at java.util.Arrays.copyOf(Arrays.java:3332)*
> * at
> java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)*
> * at
> java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)*
> at java.lang.StringBuilder.append(StringBuilder.java:136)
> at java.io.ObjectStreamField.getClassSignature(ObjectStreamField.java:322)
> at java.io.ObjectStreamField.<init>(ObjectStreamField.java:140)
> at
> java.io.ObjectStreamClass.getDefaultSerialFields(ObjectStreamClass.java:1789)
> at java.io.ObjectStreamClass.getSerialFields(ObjectStreamClass.java:1705)
> at java.io.ObjectStreamClass.access$800(ObjectStreamClass.java:79)
> at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:496)
> at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:482)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:482)
> at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:379)
> at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:669)
> at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1883)
> at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1749)
> at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2040)
> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1571)
> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:431)
> at
> org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
> at
> org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:108)
> at
> org.apache.spark.rpc.netty.NettyRpcEnv$$anonfun$deserialize$1$$anonfun$apply$1.apply(NettyRpcEnv.scala:271)
> at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
> at
> org.apache.spark.rpc.netty.NettyRpcEnv.deserialize(NettyRpcEnv.scala:320)
> at
> org.apache.spark.rpc.netty.NettyRpcEnv$$anonfun$deserialize$1.apply(NettyRpcEnv.scala:270)
> at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
> at
> org.apache.spark.rpc.netty.NettyRpcEnv.deserialize(NettyRpcEnv.scala:269)
> at org.apache.spark.rpc.netty.RequestMessage$.apply(NettyRpcEnv.scala:611)
> at
> org.apache.spark.rpc.netty.NettyRpcHandler.internalReceive(NettyRpcEnv.scala:662)
> at
> org.apache.spark.rpc.netty.NettyRpcHandler.receive(NettyRpcEnv.scala:654)
> at
> org.apache.spark.network.server.TransportRequestHandler.processOneWayMessage(TransportRequestHandler.java:275)
> *20/05/02 22:15:50 ERROR SparkUncaughtExceptionHandler: Uncaught exception
> in thread Thread[DriverRunner for driver-20200502221520-1100,5,main]*
> *java.lang.OutOfMemoryError: Java heap space*
> * at
> org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2627)*
> at
> org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2492)
> at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2405)
> at org.apache.hadoop.conf.Configuration.set(Configuration.java:1143)
> at org.apache.hadoop.conf.Configuration.set(Configuration.java:1115)
> at
> org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopConfigurations(SparkHadoopUtil.scala:464)
> at
> org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:436)
> at
> org.apache.spark.deploy.SparkHadoopUtil.newConfiguration(SparkHadoopUtil.scala:114)
> at
> org.apache.spark.deploy.worker.DriverRunner.downloadUserJar(DriverRunner.scala:160)
> at
> org.apache.spark.deploy.worker.DriverRunner.prepareAndRunDriver(DriverRunner.scala:173)
> at
> org.apache.spark.deploy.worker.DriverRunner$$anon$1.run(DriverRunner.scala:92)
> *20/05/02 22:15:51 ERROR SparkUncaughtExceptionHandler: Uncaught exception
> in thread Thread[dispatcher-event-loop-7,5,main]*
> *java.lang.OutOfMemoryError: Java heap space*
> * at org.apache.spark.deploy.worker.Worker.receive(Worker.scala:443)*
> * at
> org.apache.spark.rpc.netty.Inbox$$anonfun$process$1.apply$mcV$sp(Inbox.scala:117)*
> * at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:205)*
> at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:101)
> at
> org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:221)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> 20/05/02 22:16:05 INFO ExecutorRunner: Killing process!
>
>
>
>
> On Thu, May 7, 2020 at 7:48 PM Jeff Evans <jeffrey.wayne.ev...@gmail.com>
> wrote:
>
>> You might want to double check your Hadoop config files.  From the stack
>> trace it looks like this is happening when simply trying to load
>> configuration (XML files).  Make sure they're well formed.
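>>
>> A minimal, untested sketch of such a check (not from this thread; the
>> HADOOP_CONF_DIR fallback path and the file names are assumptions) that parses
>> the config XMLs through the same JAXP entry point shown in the stack trace:
>>
>> import java.io.File
>> import javax.xml.parsers.DocumentBuilderFactory
>>
>> object CheckHadoopConf {
>>   def main(args: Array[String]): Unit = {
>>     // Directory holding core-site.xml etc.; the fallback path is a guess.
>>     val confDir = sys.env.getOrElse("HADOOP_CONF_DIR", "/etc/hadoop/conf")
>>     for (name <- Seq("core-site.xml", "hdfs-site.xml", "yarn-site.xml")) {
>>       val file = new File(confDir, name)
>>       if (!file.exists()) {
>>         println(s"$name: not found in $confDir")
>>       } else {
>>         // Fresh builder per file; parse() throws on malformed XML.
>>         val builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
>>         try {
>>           builder.parse(file)
>>           println(s"$name: well-formed (${file.length()} bytes)")
>>         } catch {
>>           case e: Exception => println(s"$name: NOT well-formed -> ${e.getMessage}")
>>         }
>>       }
>>     }
>>   }
>> }
>>
>> An unexpectedly large byte count would also be worth a look, since the trace
>> shows the OOM hitting while the parser is buffering file contents.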
>>
>> On Thu, May 7, 2020 at 6:12 AM Hrishikesh Mishra <sd.hri...@gmail.com>
>> wrote:
>>
>>> Hi
>>>
>>> I am getting an out-of-memory error in the worker log of my streaming jobs
>>> every couple of hours, after which the worker dies. There is no shuffle, no
>>> aggregation and no caching in the job; it's just a transformation (a rough
>>> sketch of the job shape follows the settings below).
>>> I'm not able to identify where the problem is, in the driver or the executor.
>>> And why does the worker die after the OOM? It's the streaming job that should
>>> die. Am I missing something?
>>>
>>> Driver Memory:  2g
>>> Executor memory: 4g
>>>
>>> Spark Version:  2.4
>>> Kafka Direct Stream
>>> Spark Standalone Cluster.
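>>>
>>> For reference, a rough sketch of the job shape described above (class name,
>>> broker address, topic, group id and batch interval are made-up placeholders,
>>> not the real job):
>>>
>>> import org.apache.kafka.common.serialization.StringDeserializer
>>> import org.apache.spark.SparkConf
>>> import org.apache.spark.streaming.{Seconds, StreamingContext}
>>> import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
>>> import org.apache.spark.streaming.kafka010.KafkaUtils
>>> import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
>>>
>>> object TransformOnlyJob {                             // placeholder name
>>>   def main(args: Array[String]): Unit = {
>>>     val conf = new SparkConf().setAppName("transform-only-job")
>>>     val ssc = new StreamingContext(conf, Seconds(30)) // placeholder interval
>>>
>>>     val kafkaParams = Map[String, Object](
>>>       "bootstrap.servers" -> "broker:9092",           // placeholder broker
>>>       "key.deserializer" -> classOf[StringDeserializer],
>>>       "value.deserializer" -> classOf[StringDeserializer],
>>>       "group.id" -> "example-group",                  // placeholder group id
>>>       "auto.offset.reset" -> "latest",
>>>       "enable.auto.commit" -> (false: java.lang.Boolean)
>>>     )
>>>
>>>     val stream = KafkaUtils.createDirectStream[String, String](
>>>       ssc, PreferConsistent, Subscribe[String, String](Seq("events"), kafkaParams))
>>>
>>>     // No shuffle, no aggregation, no caching: a plain per-record transformation
>>>     // followed by a side-effecting write in each partition.
>>>     stream.map(record => record.value())
>>>       .foreachRDD(rdd => rdd.foreachPartition(_.foreach(println)))
>>>
>>>     ssc.start()
>>>     ssc.awaitTermination()
>>>   }
>>> }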
>>>
>>>
>>> 20/05/06 12:52:20 INFO SecurityManager: SecurityManager: authentication
>>> disabled; ui acls disabled; users  with view permissions: Set(root); groups
>>> with view permissions: Set(); users  with modify permissions: Set(root);
>>> groups with modify permissions: Set()
>>>
>>> 20/05/06 12:53:03 ERROR SparkUncaughtExceptionHandler: Uncaught
>>> exception in thread Thread[ExecutorRunner for
>>> app-20200506124717-10226/0,5,main]
>>>
>>> java.lang.OutOfMemoryError: Java heap space
>>>
>>> at org.apache.xerces.util.XMLStringBuffer.append(Unknown Source)
>>>
>>> at org.apache.xerces.impl.XMLEntityScanner.scanData(Unknown Source)
>>>
>>> at org.apache.xerces.impl.XMLScanner.scanComment(Unknown Source)
>>>
>>> at
>>> org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanComment(Unknown
>>> Source)
>>>
>>> at
>>> org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown
>>> Source)
>>>
>>> at
>>> org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown
>>> Source)
>>>
>>> at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
>>>
>>> at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
>>>
>>> at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
>>>
>>> at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
>>>
>>> at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
>>>
>>> at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
>>>
>>> at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2480)
>>>
>>> at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2468)
>>>
>>> at
>>> org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2539)
>>>
>>> at
>>> org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2492)
>>>
>>> at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2405)
>>>
>>> at org.apache.hadoop.conf.Configuration.set(Configuration.java:1143)
>>>
>>> at org.apache.hadoop.conf.Configuration.set(Configuration.java:1115)
>>>
>>> at
>>> org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopConfigurations(SparkHadoopUtil.scala:464)
>>>
>>> at
>>> org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:436)
>>>
>>> at
>>> org.apache.spark.deploy.SparkHadoopUtil.newConfiguration(SparkHadoopUtil.scala:114)
>>>
>>> at org.apache.spark.SecurityManager.<init>(SecurityManager.scala:114)
>>>
>>> at org.apache.spark.deploy.worker.ExecutorRunner.org
>>> $apache$spark$deploy$worker$ExecutorRunner$$fetchAndRunExecutor(ExecutorRunner.scala:149)
>>>
>>> at
>>> org.apache.spark.deploy.worker.ExecutorRunner$$anon$1.run(ExecutorRunner.scala:73)
>>>
>>> 20/05/06 12:53:38 INFO DriverRunner: Worker shutting down, killing
>>> driver driver-20200505181719-1187
>>>
>>> 20/05/06 12:53:38 INFO DriverRunner: Killing driver process!
>>>
>>>
>>>
>>>
>>> Regards
>>> Hrishi
>>>
>>
