Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22547#discussion_r226796934
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceV2Relation.scala
---
@@ -173,12 +185,17 @@ object DataSourceV2Relation {
source: DataSourceV2,
options: Map[String, String],
tableIdent: Option[TableIdentifier] = None,
- userSpecifiedSchema: Option[StructType] = None): DataSourceV2Relation = {
- val readSupport = source.createReadSupport(options, userSpecifiedSchema)
- val output = readSupport.fullSchema().toAttributes
+ userSpecifiedSchema: Option[StructType] = None): Option[DataSourceV2Relation] = {
--- End diff --
This shouldn't return an option. A relation is not only a read-side structure;
it is also used in write-side logical plans as the target of a write.
Validation rules like PreprocessTableInsertion validate the write dataframe
against the relation's schema. That's why the relation has a newWriteSupport
method.
Creating a relation from a Table should always work, even if the table
isn't readable or isn't writable. Analysis can be done later to validate
whether the plan that contains a relation can actually use the table.
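
To illustrate the separation I have in mind, here is a minimal sketch. It does not use Spark's actual classes: Table, Relation, Scan, Append, and Analysis.validate are all hypothetical, simplified stand-ins, and the column check is only a rough analogue of what a rule like PreprocessTableInsertion does. The point is only that constructing the relation never fails, and that readability, writability, and schema checks are applied to the plan that uses it.

```scala
// Hypothetical stand-ins for illustration only; these are not Spark's classes.
final case class Table(
    name: String,
    schema: Seq[String],
    readable: Boolean,
    writable: Boolean)

// Creating a relation always succeeds: it only records the table and exposes
// its schema. No read or write capability is required at this point.
final case class Relation(table: Table) {
  def output: Seq[String] = table.schema
}

sealed trait Plan
final case class Scan(relation: Relation) extends Plan
final case class Append(target: Relation, queryColumns: Seq[String]) extends Plan

object Analysis {
  // Capability and schema checks run later, against the plan that contains
  // the relation, instead of when the relation is constructed.
  def validate(plan: Plan): Either[String, Plan] = plan match {
    case Scan(r) if !r.table.readable =>
      Left("Table " + r.table.name + " is not readable")
    case Append(r, _) if !r.table.writable =>
      Left("Table " + r.table.name + " is not writable")
    case Append(r, cols) if cols != r.output =>
      // Assumed check: the write must match the relation's columns exactly.
      Left("Cannot write to " + r.table.name + ": expected columns " +
        r.output.mkString(", "))
    case ok =>
      Right(ok)
  }
}
```

With these definitions, a Relation for a write-only table can still be built and used as the target of an Append. Analysis.validate returns a Left only when a plan tries to Scan that table or to append columns that do not match the relation's schema.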