Patrick McGloin created SPARK-23565:
---------------------------------------

             Summary: Improved error message for when the number of sources for 
a query changes
                 Key: SPARK-23565
                 URL: https://issues.apache.org/jira/browse/SPARK-23565
             Project: Spark
          Issue Type: Improvement
          Components: Structured Streaming
    Affects Versions: 2.3.0
            Reporter: Patrick McGloin


If you change the number of sources for a Structured Streaming query then you 
will get an assertion error as the number of sources in the checkpoint does not 
match the number of sources in the query that is starting.  This can happen if, 
for example, you add a union to the input of the query.  This is of course 
correct but the error is a bit cryptic and requires investigation.

Suggestion for a more informative error message =>

The number of sources for this query has changed.  There was [x] sources in the 
checkpoint offsets and now there are [y] sources requested by the query.  
Cannot continue.

This is the current message.

02-03-2018 13:14:22 ERROR StreamExecution:91 - Query ORPositionsState to Kafka 
[id = 35f71e63-dbd0-49e9-98b2-a4c72a7da80e, runId = 
d4439aca-549c-4ef6-872e-29fbfde1df78] terminated with error 
java.lang.AssertionError: assertion failed at 
scala.Predef$.assert(Predef.scala:156) at 
org.apache.spark.sql.execution.streaming.OffsetSeq.toStreamProgress(OffsetSeq.scala:38)
 at 
org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$populateStartOffsets(StreamExecution.scala:429)
 at 
org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1$$anonfun$apply$mcZ$sp$1.apply$mcV$sp(StreamExecution.scala:297)
 at 
org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1$$anonfun$apply$mcZ$sp$1.apply(StreamExecution.scala:294)
 at 
org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1$$anonfun$apply$mcZ$sp$1.apply(StreamExecution.scala:294)
 at 
org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:279)
 at 
org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58)
 at 
org.apache.spark.sql.execution.streaming.StreamExecution$$anonfun$org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches$1.apply$mcZ$sp(StreamExecution.scala:294)
 at 
org.apache.spark.sql.execution.streaming.ProcessingTimeExecutor.execute(TriggerExecutor.scala:56)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to