cloud-fan commented on a change in pull request #25651: [SPARK-28948][SQL] Support passing all Table metadata in TableProvider
URL: https://github.com/apache/spark/pull/25651#discussion_r323808342
##########
File path: external/avro/src/main/scala/org/apache/spark/sql/v2/avro/AvroDataSourceV2.scala
##########
@@ -35,7 +36,10 @@ class AvroDataSourceV2 extends FileDataSourceV2 {
     AvroTable(tableName, sparkSession, options, paths, None, fallbackFileFormat)
   }

-  override def getTable(options: CaseInsensitiveStringMap, schema: StructType): Table = {
+  override def getTable(
+      options: CaseInsensitiveStringMap,
+      schema: StructType,
+      partitions: Array[Transform]): Table = {
Review comment:
But we do have a problem here: table properties are case-sensitive, while scan options are case-insensitive.
Think about 2 cases:
1. `spark.read.format("myFormat").options(...).schema(...).load()`.
We need to get the table with the user-specified options and schema. When scanning the table, we need to use the user-specified options as scan options. The problem is that `DataFrameReader.options` specifies both table properties and scan options in this case.
2. `CREATE TABLE t USING myFormat TABLEPROP ...` and then
`spark.read.options(...).table("t")`
In this case, `DataFrameReader.options` only specifies scan options.
Ideally, `TableProvider.getTable` takes table properties, which should be case-sensitive. However, `DataFrameReader.options` also specifies scan options, which should be case-insensitive.
I don't have a good idea now. Maybe it's OK to treat this as a special kind of table that accepts case-insensitive table properties.
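To make the lossiness concrete, here is a minimal sketch in plain Java. It uses a `TreeMap` with `String.CASE_INSENSITIVE_ORDER` as a stand-in for Spark's `CaseInsensitiveStringMap`; the property names (`Path`/`path`) are made up for illustration:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class CaseSensitivityDemo {
    public static void main(String[] args) {
        // Case-sensitive table properties: "Path" and "path" are distinct keys.
        Map<String, String> tableProps = new HashMap<>();
        tableProps.put("Path", "/data/a");
        tableProps.put("path", "/data/b");

        // A case-insensitive view of the same entries (similar in spirit to
        // CaseInsensitiveStringMap): the two keys collide, and one value
        // silently shadows the other.
        Map<String, String> scanOptions = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        scanOptions.putAll(tableProps);

        System.out.println(tableProps.size());  // prints 2
        System.out.println(scanOptions.size()); // prints 1 -- the fold is lossy
    }
}
```

So once `DataFrameReader.options` is folded into a case-insensitive map, we can no longer recover case-sensitive table properties from it, which is exactly the conflict between the two cases above.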
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]