Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/19269
Several things to discuss:
1. Since Spark can't disable speculation at runtime, there is currently not
much benefit in providing an interface for a data source to disable
speculation: a data source can check the Spark conf at the beginning and throw
an exception if speculation is enabled (see the first sketch after this list).
We can add it later via a mix-in trait.
2. The only contract Spark needs is: data written/committed by tasks must not
be visible to data source readers until the job-level commit. It may, however,
be visible to others, such as other writing tasks, so it's possible for data
sources to implement "abort the output of another writer" (see the second
sketch after this list).
3. The `WriteCommitMessage` can include statistics (it's an empty
interface), so data sources can aggregate statistics on the driver side (see
the third sketch after this list).
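
For point 1, a minimal sketch of what a data source could do today, assuming it
has access to the active `SparkSession`; the `assertNoSpeculation` helper and
its enclosing object are illustrative names, not part of this PR:

```scala
import org.apache.spark.sql.SparkSession

object SpeculationCheck {
  // Fail fast at write-planning time if task speculation is enabled.
  // "spark.speculation" is the standard Spark conf key for speculation.
  def assertNoSpeculation(spark: SparkSession): Unit = {
    val speculationEnabled =
      spark.sparkContext.getConf.getBoolean("spark.speculation", defaultValue = false)
    if (speculationEnabled) {
      throw new UnsupportedOperationException(
        "This data source cannot tolerate speculative tasks; please disable spark.speculation.")
    }
  }
}
```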
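
For point 2, one way to picture the contract is a staging-directory protocol:
tasks write only under a staging path, the job-level commit is what makes files
visible to readers, and any writer can clean up another task's staged output
before that. This is only an illustration of the contract, not the PR's API;
all names are made up and it uses the local filesystem for brevity:

```scala
import java.nio.file.{Files, Path, StandardCopyOption}

class StagingCommitProtocol(stagingDir: Path, finalDir: Path) {
  // Task-level output: visible to other writers (it lives in the shared
  // staging directory), but never to readers, who only look at finalDir.
  def taskOutputPath(taskId: Int, attemptId: Int): Path =
    stagingDir.resolve(s"task-$taskId-attempt-$attemptId.data")

  // "Abort the output of another writer": any writer may delete a staged file.
  def abortTask(taskId: Int, attemptId: Int): Unit =
    Files.deleteIfExists(taskOutputPath(taskId, attemptId))

  // Job-level commit: only now does output become visible to readers.
  def commitJob(committedTaskFiles: Seq[Path]): Unit =
    committedTaskFiles.foreach { staged =>
      Files.move(staged, finalDir.resolve(staged.getFileName),
        StandardCopyOption.ATOMIC_MOVE)
    }
}
```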
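
For point 3, a sketch of how a data source might piggyback statistics on its
commit messages and aggregate them on the driver. The `WriteCommitMessage`
trait below is just a stand-in for the empty interface discussed above, and the
other names are illustrative:

```scala
object StatsCommitExample {
  // Stand-in for the empty marker interface.
  trait WriteCommitMessage extends Serializable

  // A per-task commit message that also carries write statistics.
  case class TaskWriteStats(rowsWritten: Long, bytesWritten: Long) extends WriteCommitMessage

  // Driver-side job commit: aggregate the statistics from all tasks, then make
  // the output visible to readers (details elided).
  def commitJob(messages: Seq[WriteCommitMessage]): Unit = {
    val stats = messages.collect { case s: TaskWriteStats => s }
    val totalRows  = stats.map(_.rowsWritten).sum
    val totalBytes = stats.map(_.bytesWritten).sum
    println(s"Committed $totalRows rows ($totalBytes bytes)")
    // ... finalize the job output here ...
  }
}
```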
cc @steveloughran @rdblue