[jira] [Commented] (SPARK-23416) flaky test: org.apache.spark.sql.kafka010.KafkaSourceStressForDontFailOnDataLossSuite.stress test for failOnDataLoss=false
[ https://issues.apache.org/jira/browse/SPARK-23416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16473110#comment-16473110 ] Dongjoon Hyun commented on SPARK-23416: --- FYI. https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90536 (branch-2.3) > flaky test: > org.apache.spark.sql.kafka010.KafkaSourceStressForDontFailOnDataLossSuite.stress > test for failOnDataLoss=false > -- > > Key: SPARK-23416 > URL: https://issues.apache.org/jira/browse/SPARK-23416 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 2.4.0 >Reporter: Jose Torres >Priority: Minor > Fix For: 2.3.0 > > > I suspect this is a race condition latent in the DataSourceV2 write path, or > at least the interaction of that write path with StreamTest. > [https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87241/testReport/org.apache.spark.sql.kafka010/KafkaSourceStressForDontFailOnDataLossSuite/stress_test_for_failOnDataLoss_false/] > h3. Error Message > org.apache.spark.sql.streaming.StreamingQueryException: Query [id = > 16b2a2b1-acdd-44ec-902f-531169193169, runId = > 9567facb-e305-4554-8622-830519002edb] terminated with exception: Writing job > aborted. > h3. Stacktrace > sbt.ForkMain$ForkError: > org.apache.spark.sql.streaming.StreamingQueryException: Query [id = > 16b2a2b1-acdd-44ec-902f-531169193169, runId = > 9567facb-e305-4554-8622-830519002edb] terminated with exception: Writing job > aborted. at > org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:295) > at > org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:189) > Caused by: sbt.ForkMain$ForkError: org.apache.spark.SparkException: Writing > job aborted. at > org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec.doExecute(WriteToDataSourceV2.scala:108) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > at > org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152) at > org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127) at > org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:247) > at > org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:294) > at > org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collectFromPlan(Dataset.scala:3272) > at org.apache.spark.sql.Dataset$$anonfun$collect$1.apply(Dataset.scala:2722) > at org.apache.spark.sql.Dataset$$anonfun$collect$1.apply(Dataset.scala:2722) > at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3253) at > org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77) > at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3252) at > org.apache.spark.sql.Dataset.collect(Dataset.scala:2722) at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch$3$$anonfun$apply$15.apply(MicroBatchExecution.scala:488) > at > org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch$3.apply(MicroBatchExecution.scala:483) > at > org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:271) > at > org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution.org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch(MicroBatchExecution.scala:482) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply$mcV$sp(MicroBatchExecution.scala:133) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply(MicroBatchExecution.scala:121) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply(MicroBatchExecution.scala:121) > at > org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:271) > at
[jira] [Commented] (SPARK-23416) flaky test: org.apache.spark.sql.kafka010.KafkaSourceStressForDontFailOnDataLossSuite.stress test for failOnDataLoss=false
[ https://issues.apache.org/jira/browse/SPARK-23416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363386#comment-16363386 ] Apache Spark commented on SPARK-23416: -- User 'cloud-fan' has created a pull request for this issue: https://github.com/apache/spark/pull/20605 > flaky test: > org.apache.spark.sql.kafka010.KafkaSourceStressForDontFailOnDataLossSuite.stress > test for failOnDataLoss=false > -- > > Key: SPARK-23416 > URL: https://issues.apache.org/jira/browse/SPARK-23416 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 2.4.0 >Reporter: Jose Torres >Priority: Minor > > I suspect this is a race condition latent in the DataSourceV2 write path, or > at least the interaction of that write path with StreamTest. > [https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87241/testReport/org.apache.spark.sql.kafka010/KafkaSourceStressForDontFailOnDataLossSuite/stress_test_for_failOnDataLoss_false/] > h3. Error Message > org.apache.spark.sql.streaming.StreamingQueryException: Query [id = > 16b2a2b1-acdd-44ec-902f-531169193169, runId = > 9567facb-e305-4554-8622-830519002edb] terminated with exception: Writing job > aborted. > h3. Stacktrace > sbt.ForkMain$ForkError: > org.apache.spark.sql.streaming.StreamingQueryException: Query [id = > 16b2a2b1-acdd-44ec-902f-531169193169, runId = > 9567facb-e305-4554-8622-830519002edb] terminated with exception: Writing job > aborted. at > org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:295) > at > org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:189) > Caused by: sbt.ForkMain$ForkError: org.apache.spark.SparkException: Writing > job aborted. at > org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec.doExecute(WriteToDataSourceV2.scala:108) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > at > org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152) at > org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127) at > org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:247) > at > org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:294) > at > org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collectFromPlan(Dataset.scala:3272) > at org.apache.spark.sql.Dataset$$anonfun$collect$1.apply(Dataset.scala:2722) > at org.apache.spark.sql.Dataset$$anonfun$collect$1.apply(Dataset.scala:2722) > at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3253) at > org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77) > at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3252) at > org.apache.spark.sql.Dataset.collect(Dataset.scala:2722) at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch$3$$anonfun$apply$15.apply(MicroBatchExecution.scala:488) > at > org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch$3.apply(MicroBatchExecution.scala:483) > at > org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:271) > at > org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution.org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch(MicroBatchExecution.scala:482) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply$mcV$sp(MicroBatchExecution.scala:133) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply(MicroBatchExecution.scala:121) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply(MicroBatchExecution.scala:121) > at > org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:271) > at >
[jira] [Commented] (SPARK-23416) flaky test: org.apache.spark.sql.kafka010.KafkaSourceStressForDontFailOnDataLossSuite.stress test for failOnDataLoss=false
[ https://issues.apache.org/jira/browse/SPARK-23416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362950#comment-16362950 ] Apache Spark commented on SPARK-23416: -- User 'jose-torres' has created a pull request for this issue: https://github.com/apache/spark/pull/20602 > flaky test: > org.apache.spark.sql.kafka010.KafkaSourceStressForDontFailOnDataLossSuite.stress > test for failOnDataLoss=false > -- > > Key: SPARK-23416 > URL: https://issues.apache.org/jira/browse/SPARK-23416 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 2.4.0 >Reporter: Jose Torres >Priority: Minor > > I suspect this is a race condition latent in the DataSourceV2 write path, or > at least the interaction of that write path with StreamTest. > [https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87241/testReport/org.apache.spark.sql.kafka010/KafkaSourceStressForDontFailOnDataLossSuite/stress_test_for_failOnDataLoss_false/] > h3. Error Message > org.apache.spark.sql.streaming.StreamingQueryException: Query [id = > 16b2a2b1-acdd-44ec-902f-531169193169, runId = > 9567facb-e305-4554-8622-830519002edb] terminated with exception: Writing job > aborted. > h3. Stacktrace > sbt.ForkMain$ForkError: > org.apache.spark.sql.streaming.StreamingQueryException: Query [id = > 16b2a2b1-acdd-44ec-902f-531169193169, runId = > 9567facb-e305-4554-8622-830519002edb] terminated with exception: Writing job > aborted. at > org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:295) > at > org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:189) > Caused by: sbt.ForkMain$ForkError: org.apache.spark.SparkException: Writing > job aborted. at > org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec.doExecute(WriteToDataSourceV2.scala:108) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > at > org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152) at > org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127) at > org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:247) > at > org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:294) > at > org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collectFromPlan(Dataset.scala:3272) > at org.apache.spark.sql.Dataset$$anonfun$collect$1.apply(Dataset.scala:2722) > at org.apache.spark.sql.Dataset$$anonfun$collect$1.apply(Dataset.scala:2722) > at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3253) at > org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77) > at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3252) at > org.apache.spark.sql.Dataset.collect(Dataset.scala:2722) at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch$3$$anonfun$apply$15.apply(MicroBatchExecution.scala:488) > at > org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch$3.apply(MicroBatchExecution.scala:483) > at > org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:271) > at > org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution.org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch(MicroBatchExecution.scala:482) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply$mcV$sp(MicroBatchExecution.scala:133) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply(MicroBatchExecution.scala:121) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply(MicroBatchExecution.scala:121) > at > org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:271) > at >
[jira] [Commented] (SPARK-23416) flaky test: org.apache.spark.sql.kafka010.KafkaSourceStressForDontFailOnDataLossSuite.stress test for failOnDataLoss=false
[ https://issues.apache.org/jira/browse/SPARK-23416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362941#comment-16362941 ] Jose Torres commented on SPARK-23416: - I think I see the problem. * StreamExecution.stop() works by interrupting the stream execution thread. This is not safe in general, and can throw any variety of exceptions. * StreamExecution.isInterruptedByStop() solves this problem by implementing a whitelist of exceptions which indicate the stop() happened. * The v2 write path adds calls to ThreadUtils.awaitResult(), which weren't in the V1 write path and (if the interrupt happens to fall in them) throw a new exception which isn't accounted for. I'm going to write a PR to add another whitelist entry, but this is quite fragile. > flaky test: > org.apache.spark.sql.kafka010.KafkaSourceStressForDontFailOnDataLossSuite.stress > test for failOnDataLoss=false > -- > > Key: SPARK-23416 > URL: https://issues.apache.org/jira/browse/SPARK-23416 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 2.4.0 >Reporter: Jose Torres >Priority: Minor > > I suspect this is a race condition latent in the DataSourceV2 write path, or > at least the interaction of that write path with StreamTest. > [https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87241/testReport/org.apache.spark.sql.kafka010/KafkaSourceStressForDontFailOnDataLossSuite/stress_test_for_failOnDataLoss_false/] > h3. Error Message > org.apache.spark.sql.streaming.StreamingQueryException: Query [id = > 16b2a2b1-acdd-44ec-902f-531169193169, runId = > 9567facb-e305-4554-8622-830519002edb] terminated with exception: Writing job > aborted. > h3. Stacktrace > sbt.ForkMain$ForkError: > org.apache.spark.sql.streaming.StreamingQueryException: Query [id = > 16b2a2b1-acdd-44ec-902f-531169193169, runId = > 9567facb-e305-4554-8622-830519002edb] terminated with exception: Writing job > aborted. at > org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:295) > at > org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:189) > Caused by: sbt.ForkMain$ForkError: org.apache.spark.SparkException: Writing > job aborted. at > org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec.doExecute(WriteToDataSourceV2.scala:108) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > at > org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152) at > org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127) at > org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:247) > at > org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:294) > at > org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collectFromPlan(Dataset.scala:3272) > at org.apache.spark.sql.Dataset$$anonfun$collect$1.apply(Dataset.scala:2722) > at org.apache.spark.sql.Dataset$$anonfun$collect$1.apply(Dataset.scala:2722) > at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3253) at > org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77) > at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3252) at > org.apache.spark.sql.Dataset.collect(Dataset.scala:2722) at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch$3$$anonfun$apply$15.apply(MicroBatchExecution.scala:488) > at > org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch$3.apply(MicroBatchExecution.scala:483) > at > org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:271) > at > org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution.org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch(MicroBatchExecution.scala:482) > at >
[jira] [Commented] (SPARK-23416) flaky test: org.apache.spark.sql.kafka010.KafkaSourceStressForDontFailOnDataLossSuite.stress test for failOnDataLoss=false
[ https://issues.apache.org/jira/browse/SPARK-23416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362920#comment-16362920 ] Marco Gaido commented on SPARK-23416: - I see this failing also with this stacktrace: {code:java} sbt.ForkMain$ForkError: org.apache.spark.sql.streaming.StreamingQueryException: Query memory [id = cca87cf7-0532-41af-b757-0948ec294c0c, runId = c1830af6-1715-4947-bd76-a1a63482280b] terminated with exception: null at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:295) at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:189) Caused by: sbt.ForkMain$ForkError: java.lang.InterruptedException: null at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326) at scala.concurrent.impl.Promise$DefaultPromise.tryAwait(Promise.scala:208) at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:218) at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223) at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:201) at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75) at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:92) at org.apache.spark.rpc.RpcEndpointRef.askSync(RpcEndpointRef.scala:76) at org.apache.spark.sql.execution.streaming.continuous.ContinuousExecution.runContinuous(ContinuousExecution.scala:271) at org.apache.spark.sql.execution.streaming.continuous.ContinuousExecution.runActivatedStream(ContinuousExecution.scala:89) at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:279) ... 1 more {code} https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87401/testReport/org.apache.spark.sql.kafka010/KafkaContinuousSourceStressForDontFailOnDataLossSuite/stress_test_for_failOnDataLoss_false/ > flaky test: > org.apache.spark.sql.kafka010.KafkaSourceStressForDontFailOnDataLossSuite.stress > test for failOnDataLoss=false > -- > > Key: SPARK-23416 > URL: https://issues.apache.org/jira/browse/SPARK-23416 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 2.4.0 >Reporter: Jose Torres >Priority: Minor > > I suspect this is a race condition latent in the DataSourceV2 write path, or > at least the interaction of that write path with StreamTest. > [https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87241/testReport/org.apache.spark.sql.kafka010/KafkaSourceStressForDontFailOnDataLossSuite/stress_test_for_failOnDataLoss_false/] > h3. Error Message > org.apache.spark.sql.streaming.StreamingQueryException: Query [id = > 16b2a2b1-acdd-44ec-902f-531169193169, runId = > 9567facb-e305-4554-8622-830519002edb] terminated with exception: Writing job > aborted. > h3. Stacktrace > sbt.ForkMain$ForkError: > org.apache.spark.sql.streaming.StreamingQueryException: Query [id = > 16b2a2b1-acdd-44ec-902f-531169193169, runId = > 9567facb-e305-4554-8622-830519002edb] terminated with exception: Writing job > aborted. at > org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:295) > at > org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:189) > Caused by: sbt.ForkMain$ForkError: org.apache.spark.SparkException: Writing > job aborted. at > org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec.doExecute(WriteToDataSourceV2.scala:108) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > at > org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152) at > org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127) at > org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:247) > at > org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:294) > at > org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collectFromPlan(Dataset.scala:3272) > at org.apache.spark.sql.Dataset$$anonfun$collect$1.apply(Dataset.scala:2722) > at
[jira] [Commented] (SPARK-23416) flaky test: org.apache.spark.sql.kafka010.KafkaSourceStressForDontFailOnDataLossSuite.stress test for failOnDataLoss=false
[ https://issues.apache.org/jira/browse/SPARK-23416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362913#comment-16362913 ] Dongjoon Hyun commented on SPARK-23416: --- Thank you for filing this! > flaky test: > org.apache.spark.sql.kafka010.KafkaSourceStressForDontFailOnDataLossSuite.stress > test for failOnDataLoss=false > -- > > Key: SPARK-23416 > URL: https://issues.apache.org/jira/browse/SPARK-23416 > Project: Spark > Issue Type: Bug > Components: Structured Streaming >Affects Versions: 2.4.0 >Reporter: Jose Torres >Priority: Minor > > I suspect this is a race condition latent in the DataSourceV2 write path, or > at least the interaction of that write path with StreamTest. > [https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87241/testReport/org.apache.spark.sql.kafka010/KafkaSourceStressForDontFailOnDataLossSuite/stress_test_for_failOnDataLoss_false/] > h3. Error Message > org.apache.spark.sql.streaming.StreamingQueryException: Query [id = > 16b2a2b1-acdd-44ec-902f-531169193169, runId = > 9567facb-e305-4554-8622-830519002edb] terminated with exception: Writing job > aborted. > h3. Stacktrace > sbt.ForkMain$ForkError: > org.apache.spark.sql.streaming.StreamingQueryException: Query [id = > 16b2a2b1-acdd-44ec-902f-531169193169, runId = > 9567facb-e305-4554-8622-830519002edb] terminated with exception: Writing job > aborted. at > org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:295) > at > org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:189) > Caused by: sbt.ForkMain$ForkError: org.apache.spark.SparkException: Writing > job aborted. at > org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec.doExecute(WriteToDataSourceV2.scala:108) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127) > at > org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > at > org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152) at > org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127) at > org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:247) > at > org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:294) > at > org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collectFromPlan(Dataset.scala:3272) > at org.apache.spark.sql.Dataset$$anonfun$collect$1.apply(Dataset.scala:2722) > at org.apache.spark.sql.Dataset$$anonfun$collect$1.apply(Dataset.scala:2722) > at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3253) at > org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77) > at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3252) at > org.apache.spark.sql.Dataset.collect(Dataset.scala:2722) at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch$3$$anonfun$apply$15.apply(MicroBatchExecution.scala:488) > at > org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch$3.apply(MicroBatchExecution.scala:483) > at > org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:271) > at > org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution.org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch(MicroBatchExecution.scala:482) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply$mcV$sp(MicroBatchExecution.scala:133) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply(MicroBatchExecution.scala:121) > at > org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply(MicroBatchExecution.scala:121) > at > org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:271) > at >