GitHub user viirya commented on the issue:

    https://github.com/apache/spark/pull/14803
  
    >
    * What error is printed (if any) if an invalid partition directory is created midstream.
    
    The error is:
    
        [info]   org.apache.spark.sql.streaming.StreamingQueryException: Query query-14 terminated with exception: assertion failed: Conflicting partition column names detected:
        [info] 
        [info]  Partition column name list #0: partition2
        [info]  Partition column name list #1: partition
        [info] 
        [info] For partitioned table directories, data files should only live in leaf directories.
        [info] And directories at the same level should have the same partition column name.
        [info] Please check the following directories for unexpected files or inconsistent partition column names:
        [info] 
        [info]  file:/root/repos/spark-1/target/tmp/streaming.src-c3a9895d-7be1-4ded-9154-7a24026513d7/partition2=bar
        [info]  file:/root/repos/spark-1/target/tmp/streaming.src-c3a9895d-7be1-4ded-9154-7a24026513d7/partition=bar
        [info]  file:/root/repos/spark-1/target/tmp/streaming.src-c3a9895d-7be1-4ded-9154-7a24026513d7/partition=foo
        [info]   at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runBatches(StreamExecution.scala:211)
        [info]   at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:124)
        [info]   Cause: java.lang.AssertionError: assertion failed: Conflicting partition column names detected:
        [info] 
        [info]  Partition column name list #0: partition2
        [info]  Partition column name list #1: partition
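
    For illustration only: the rule stated in the message above (directories at the same level should have the same partition column name) can be sketched as a small standalone check. This is a simplified Python sketch, not Spark's implementation; the function names `partition_column_names` and `find_conflicts` and the base path `/tmp/streaming.src` are hypothetical and chosen just for this example.

```python
from collections import defaultdict


def partition_column_names(relative_path):
    """Return the column names encoded in a path such as 'a=1/b=2' -> ('a', 'b')."""
    segments = relative_path.strip("/").split("/")
    return tuple(seg.split("=", 1)[0] for seg in segments if "=" in seg)


def find_conflicts(base, leaf_dirs):
    """Group leaf directories by their partition column name list.

    Returns an empty dict when every directory uses the same list,
    otherwise a dict mapping each distinct name list to its directories.
    """
    groups = defaultdict(list)
    for d in leaf_dirs:
        relative = d[len(base):] if d.startswith(base) else d
        groups[partition_column_names(relative)].append(d)
    return dict(groups) if len(groups) > 1 else {}


# Hypothetical base path; the directory names mirror the ones in the log.
base = "/tmp/streaming.src"
conflicts = find_conflicts(
    base,
    [base + "/partition2=bar", base + "/partition=bar", base + "/partition=foo"],
)
for index, (names, dirs) in enumerate(conflicts.items()):
    print(f"Partition column name list #{index}: {', '.join(names)}")
    for d in dirs:
        print(f"    {d}")
```

    With the three directories from the log, the sketch finds two distinct name lists (`partition2` and `partition`), which is the situation the assertion reports. The real check in Spark also deals with details this sketch ignores, such as value type inference and nested base paths.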
    
    >
    * Are we okay if all of the data disappears (that has already been processed) and then new data arrives?
    
    I extended the added test to cover this case. It is okay, if I understand your point correctly.


