[GitHub] spark pull request #22547: [SPARK-25528][SQL] data source V2 read side API r...

rdblue Fri, 19 Oct 2018 15:29:26 -0700

Github user rdblue commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22547#discussion_r226796934
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala ---
    @@ -173,12 +185,17 @@ object DataSourceV2Relation {
           source: DataSourceV2,
           options: Map[String, String],
           tableIdent: Option[TableIdentifier] = None,
    -      userSpecifiedSchema: Option[StructType] = None): DataSourceV2Relation = {
    -    val readSupport = source.createReadSupport(options, userSpecifiedSchema)
    -    val output = readSupport.fullSchema().toAttributes
    +      userSpecifiedSchema: Option[StructType] = None): Option[DataSourceV2Relation] = {
    --- End diff --
    
    This shouldn't return an Option. A relation is not only a read-side 
structure; it is also used in write-side logical plans as the target of a 
write. Validation rules like PreprocessTableInsertion validate the write 
DataFrame against the relation's schema. That's why the relation has a 
newWriteSupport method.
    
    Creating a relation from a Table should always work, even if the table 
isn't readable or isn't writable. Analysis can be done later to validate 
whether the plan that contains a relation can actually use the table.
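
    As a rough illustration of that separation, here is a minimal sketch. 
Every name in it (Table, SupportsRead, SupportsWrite, Relation, Scan, Append, 
CheckCapabilities, and the two example tables) is a hypothetical stand-in 
chosen for this example, not Spark's actual API. Creating the relation is 
total, and a separate check decides whether a given plan can use the table.

```scala
// Hypothetical stand-ins for illustration only; these are NOT Spark's classes.
trait Table {
  def name: String
  def schema: Seq[String] // column names stand in for a real StructType
}
trait SupportsRead extends Table
trait SupportsWrite extends Table

// A relation only wraps a table and exposes its schema.
final case class Relation(table: Table) {
  def output: Seq[String] = table.schema
}

object Relation {
  // Always succeeds: no Option, no capability check at creation time.
  def create(table: Table): Relation = Relation(table)
}

// Two toy logical plans: one reads the relation, one writes to it.
sealed trait Plan
final case class Scan(relation: Relation) extends Plan
final case class Append(target: Relation, columns: Seq[String]) extends Plan

// A later "analysis" step validates that the plan can actually use the table.
object CheckCapabilities {
  def validate(plan: Plan): Either[String, Plan] = plan match {
    case Scan(Relation(t)) if !t.isInstanceOf[SupportsRead] =>
      Left(s"Table ${t.name} is not readable")
    case Append(Relation(t), _) if !t.isInstanceOf[SupportsWrite] =>
      Left(s"Table ${t.name} is not writable")
    case Append(rel, cols) if cols != rel.output =>
      Left(s"Cannot write columns $cols to ${rel.table.name} with schema ${rel.output}")
    case ok =>
      Right(ok)
  }
}

// Example tables (assumed for this sketch): one write-only, one read-only.
object WriteOnlySink extends SupportsWrite {
  val name = "sink"
  val schema = Seq("id", "data")
}
object ReadOnlySource extends SupportsRead {
  val name = "source"
  val schema = Seq("id", "data")
}
```

    With this shape, Relation.create(WriteOnlySink) succeeds, 
Append(Relation.create(WriteOnlySink), Seq("id", "data")) passes validation 
because the written columns match the relation's schema, and only a Scan of 
the same relation is rejected by the later check.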


