This looks like the driver-side error log; the executor logs should contain far more warnings/errors, probably with the underlying stack traces.
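(How you pull the executor logs depends on your cluster manager, so this is only an illustration: on YARN something like "yarn logs -applicationId <application id>" once the app has stopped, or the stderr links on the Executors tab of the Spark UI while it is running. Adjust for your actual setup.)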
I'd also suggest excluding external dependencies wherever possible while experimenting/investigating. If you suspect Apache Spark itself, I'd rather stick with writing to Kafka during the investigation instead of switching to Delta Lake, which adds another external dependency and makes it harder to find where the root cause is.

Your dependencies are a bit odd. Could you please remove the manually added spark-sql-kafka jars and try out "--packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.1" in spark-submit/spark-shell instead, so that the matching transitive dependencies (kafka-clients, commons-pool2) get pulled in automatically?
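A minimal sketch of what I mean (the class name and application jar below are just placeholders for whatever you actually submit; keep the rest of your command as it is):

# --class value and the app jar below are placeholders
spark-submit \
  --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.1 \
  --class com.example.YourStreamingApp \
  your-streaming-app.jar

With --packages, spark-submit resolves the Kafka connector together with the kafka-clients and commons-pool2 versions it was built against, so the hand-picked kafka-clients-0.10.2.2.jar and the streaming assembly jar shouldn't be needed anymore.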

On Fri, Jan 22, 2021 at 5:03 AM gshen <gshe...@gmail.com> wrote:

> I am now testing to stream into a Delta table. Interestingly I have
> gotten it working within a community version of Databricks, which leads me
> to think there might be something to do with my dependencies. I am
> checkpointing to ADLS Gen2 adding the following dependencies:
>
> delta-core_2.12-0.7.0.jar
> hadoop-azure-3.2.1.jar
> hadoop-azure-datalake-3.2.1.jar
> wildfly-openssl-java-1.1.3.Final.jar
> spark-sql-kafka-0-10_2.12-3.0.1.jar
> spark-streaming-kafka-0-10-assembly_2.12-3.0.1.jar
> commons-pool2-2.8.0.jar
> kafka-clients-0.10.2.2.jar
>
> Here's a more detailed stack trace:
>
> {"LOG_LEVEL":"WARN", "LOG_MESSAGE":"2021-01-21 18:58:23,867 [org.apache.spark.scheduler.TaskSetManager]
> Lost task 0.0 in stage 1.0 (TID 3, 10.1.88.2, executor 1): org.apache.spark.util.TaskCompletionListenerException: Self-suppression not permitted
>   at org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:145)
>   at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:124)
>   at org.apache.spark.scheduler.Task.run(Task.scala:143)
>   at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446)
>   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> "}
>
> {"LOG_LEVEL":"ERROR", "LOG_MESSAGE":"2021-01-21 18:58:26,283 [org.apache.spark.scheduler.TaskSetManager]
> Task 0 in stage 1.0 failed 4 times; aborting job"}
>
> {"LOG_LEVEL":"ERROR", "LOG_MESSAGE":"2021-01-21 18:58:26,373 [org.apache.spark.sql.execution.datasources.FileFormatWriter]
> Aborting job 6115425c-9740-4e47-b2a1-e646c131e763."}
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times,
> most recent failure: Lost task 0.3 in stage 1.0 (TID 11, 10.1.28.7, executor 2): org.apache.spark.util.TaskCompletionListenerException: Self-suppression not permitted
>   at org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:145)
>   at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:124)
>   at org.apache.spark.scheduler.Task.run(Task.scala:143)
>   at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446)
>   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Driver stacktrace:
>   at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2059)
>   at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2008)
>   at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2007)
>   at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
>   at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
>   at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2007)
>   at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:973)
>   at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:973)
>   at scala.Option.foreach(Option.scala:407)
>   at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:973)
>   at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2239)
>   at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2188)
>   at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2177)
>   at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
>   at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:775)
>   at org.apache.spark.SparkContext.runJob(SparkContext.scala:2099)
>   at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:195)
>   at org.apache.spark.sql.delta.files.TransactionalWrite.$anonfun$writeFiles$1(TransactionalWrite.scala:162)
>   at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:100)
>   at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:160)
>   at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:87)
>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
>   at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
>   at org.apache.spark.sql.delta.files.TransactionalWrite.writeFiles(TransactionalWrite.scala:134)
>   at org.apache.spark.sql.delta.files.TransactionalWrite.writeFiles$(TransactionalWrite.scala:116)
>   at org.apache.spark.sql.delta.OptimisticTransaction.writeFiles(OptimisticTransaction.scala:80)
>   at org.apache.spark.sql.delta.files.TransactionalWrite.writeFiles(TransactionalWrite.scala:107)
>   at org.apache.spark.sql.delta.files.TransactionalWrite.writeFiles$(TransactionalWrite.scala:106)
>   at org.apache.spark.sql.delta.OptimisticTransaction.writeFiles(OptimisticTransaction.scala:80)
>   at org.apache.spark.sql.delta.sources.DeltaSink.$anonfun$addBatch$1(DeltaSink.scala:99)
>   at org.apache.spark.sql.delta.sources.DeltaSink.$anonfun$addBatch$1$adapted(DeltaSink.scala:54)
>   at org.apache.spark.sql.delta.DeltaLog.withNewTransaction(DeltaLog.scala:188)
>   at org.apache.spark.sql.delta.sources.DeltaSink.addBatch(DeltaSink.scala:54)
>   at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runBatch$16(MicroBatchExecution.scala:572)
>   at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:100)
>   at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:160)
>   at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:87)
>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
>   at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
>   at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runBatch$15(MicroBatchExecution.scala:570)
>   at org.apache.spark.sql.execution.streaming.ProgressReporter.reportTimeTaken(ProgressReporter.scala:352)
>   at org.apache.spark.sql.execution.streaming.ProgressReporter.reportTimeTaken$(ProgressReporter.scala:350)
>   at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:69)
>   at org.apache.spark.sql.execution.streaming.MicroBatchExecution.runBatch(MicroBatchExecution.scala:570)
>   at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runActivatedStream$2(MicroBatchExecution.scala:223)
>   at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
>   at org.apache.spark.sql.execution.streaming.ProgressReporter.reportTimeTaken(ProgressReporter.scala:352)
>   at org.apache.spark.sql.execution.streaming.ProgressReporter.reportTimeTaken$(ProgressReporter.scala:350)
>   at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:69)
>   at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runActivatedStream$1(MicroBatchExecution.scala:191)
>   at org.apache.spark.sql.execution.streaming.ProcessingTimeExecutor.execute(TriggerExecutor.scala:57)
>   at org.apache.spark.sql.execution.streaming.MicroBatchExecution.runActivatedStream(MicroBatchExecution.scala:185)
>   at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:334)
>   at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:245)
> Caused by: org.apache.spark.util.TaskCompletionListenerException: Self-suppression not permitted
>   at org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:145)
>   at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:124)
>   at org.apache.spark.scheduler.Task.run(Task.scala:143)
>   at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446)
>   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
>
> {"LOG_LEVEL":"ERROR", "LOG_MESSAGE":"2021-01-21 18:58:26,389 [org.apache.spark.sql.execution.streaming.MicroBatchExecution]
> Query [id = 94e815d4-294b-4d9c-bcd4-9c30c2557c0a, runId = 4652cb0b-7f88-4a8b-bfc1-aebb961249bb] terminated with error"}
> org.apache.spark.SparkException: Job aborted.
>   at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:226)
>   at org.apache.spark.sql.delta.files.TransactionalWrite.$anonfun$writeFiles$1(TransactionalWrite.scala:162)
>   at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:100)
>   at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:160)
>   at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:87)
>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
>   at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
>   at org.apache.spark.sql.delta.files.TransactionalWrite.writeFiles(TransactionalWrite.scala:134)
>   at org.apache.spark.sql.delta.files.TransactionalWrite.writeFiles$(TransactionalWrite.scala:116)
>   at org.apache.spark.sql.delta.OptimisticTransaction.writeFiles(OptimisticTransaction.scala:80)
>   at org.apache.spark.sql.delta.files.TransactionalWrite.writeFiles(TransactionalWrite.scala:107)
>   at org.apache.spark.sql.delta.files.TransactionalWrite.writeFiles$(TransactionalWrite.scala:106)
>   at org.apache.spark.sql.delta.OptimisticTransaction.writeFiles(OptimisticTransaction.scala:80)
>   at org.apache.spark.sql.delta.sources.DeltaSink.$anonfun$addBatch$1(DeltaSink.scala:99)
>   at org.apache.spark.sql.delta.sources.DeltaSink.$anonfun$addBatch$1$adapted(DeltaSink.scala:54)
>   at org.apache.spark.sql.delta.DeltaLog.withNewTransaction(DeltaLog.scala:188)
>   at org.apache.spark.sql.delta.sources.DeltaSink.addBatch(DeltaSink.scala:54)
>   at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runBatch$16(MicroBatchExecution.scala:572)
>   at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:100)
>   at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:160)
>   at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:87)
>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:764)
>   at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
>   at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runBatch$15(MicroBatchExecution.scala:570)
>   at org.apache.spark.sql.execution.streaming.ProgressReporter.reportTimeTaken(ProgressReporter.scala:352)
>   at org.apache.spark.sql.execution.streaming.ProgressReporter.reportTimeTaken$(ProgressReporter.scala:350)
>   at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:69)
>   at org.apache.spark.sql.execution.streaming.MicroBatchExecution.runBatch(MicroBatchExecution.scala:570)
>   at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runActivatedStream$2(MicroBatchExecution.scala:223)
>   at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
>   at org.apache.spark.sql.execution.streaming.ProgressReporter.reportTimeTaken(ProgressReporter.scala:352)
>   at org.apache.spark.sql.execution.streaming.ProgressReporter.reportTimeTaken$(ProgressReporter.scala:350)
>   at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:69)
>   at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runActivatedStream$1(MicroBatchExecution.scala:191)
>   at org.apache.spark.sql.execution.streaming.ProcessingTimeExecutor.execute(TriggerExecutor.scala:57)
>   at org.apache.spark.sql.execution.streaming.MicroBatchExecution.runActivatedStream(MicroBatchExecution.scala:185)
>   at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:334)
>   at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:245)
> Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times,
> most recent failure: Lost task 0.3 in stage 1.0 (TID 11, 10.1.28.7, executor 2): org.apache.spark.util.TaskCompletionListenerException: Self-suppression not permitted
>   at org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:145)
>   at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:124)
>   at org.apache.spark.scheduler.Task.run(Task.scala:143)
>   at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446)
>   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Driver stacktrace:
>   at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2059)
>   at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2008)
>   at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2007)
>   at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
>   at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
>   at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2007)
>   at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:973)
>   at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:973)
>   at scala.Option.foreach(Option.scala:407)
>   at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:973)
>   at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2239)
>   at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2188)
>   at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2177)
>   at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
>   at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:775)
>   at org.apache.spark.SparkContext.runJob(SparkContext.scala:2099)
>   at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:195)
>   ... 37 more
> Caused by: org.apache.spark.util.TaskCompletionListenerException: Self-suppression not permitted
>   at org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:145)
>   at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:124)
>   at org.apache.spark.scheduler.Task.run(Task.scala:143)
>   at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:446)
>   at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:449)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
>
> {"LOG_LEVEL":"WARN", "LOG_MESSAGE":"2021-01-21 18:58:26,462 [org.apache.spark.scheduler.TaskSetManager]
> Lost task 82.3 in stage 1.0 (TID 12, 10.1.88.2, executor 1): TaskKilled (Stage cancelled)"}
> {"LOG_LEVEL":"WARN", "LOG_MESSAGE":"2021-01-21 18:58:26,559 [org.apache.spark.scheduler.TaskSetManager]
> Lost task 1.3 in stage 1.0 (TID 13, 10.1.28.7, executor 2): TaskKilled (Stage cancelled)"}
> {"LOG_LEVEL":"ERROR", "LOG_MESSAGE":"2021-01-21 18:58:26,623 [spark-job] Something went wrong. Exception: Job aborted.
>
>
> --
> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>