Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/21667#discussion_r199860192
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceUtils.scala ---
@@ -42,65 +38,9 @@ object DataSourceUtils {
   /**
    * Verify if the schema is supported in datasource. This verification should be done
-   * in a driver side, e.g., `prepareWrite`, `buildReader`, and `buildReaderWithPartitionValues`
-   * in `FileFormat`.
-   *
-   * Unsupported data types of csv, json, orc, and parquet are as follows;
-   *   csv -> R/W: Interval, Null, Array, Map, Struct
-   *   json -> W: Interval
-   *   orc -> W: Interval, Null
-   *   parquet -> R/W: Interval, Null
+   * in a driver side.
--- End diff --
`FileFormat` is internal, so this isn't about the public API; it's just a design choice.
Generally it's fine to have a central place for business logic covering different cases. However, here we can't access all `FileFormat` implementations: Hive ORC lives in the Hive module. So the only choice left is to dispatch the business logic into the implementations.
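
A minimal sketch of what dispatching into the implementations could look like. The `supportDataType` hook, its signature, and the `verifySchema` helper below are illustrative assumptions, not the actual patch:

```scala
import org.apache.spark.sql.types._

// Hypothetical hook: each FileFormat implementation (including Hive ORC in
// the Hive module) declares which data types it supports, so no central
// registry needs visibility into every implementation.
trait SchemaSupport {
  def supportDataType(dataType: DataType, isReadPath: Boolean): Boolean
}

object CsvSchemaSupport extends SchemaSupport {
  // Per the removed doc comment, CSV can neither read nor write interval,
  // null, or nested types.
  override def supportDataType(dataType: DataType, isReadPath: Boolean): Boolean =
    dataType match {
      case CalendarIntervalType | NullType => false
      case _: ArrayType | _: MapType | _: StructType => false
      case _ => true
    }
}

object DataSourceUtils {
  // Driver-side verification now just delegates to the implementation; a
  // real check would also recurse into nested field types.
  def verifySchema(format: SchemaSupport, schema: StructType, isReadPath: Boolean): Unit =
    schema.foreach { field =>
      if (!format.supportDataType(field.dataType, isReadPath)) {
        throw new UnsupportedOperationException(
          s"Data source does not support ${field.dataType.catalogString} data type.")
      }
    }
}
```

With this shape, the per-format rules listed in the removed doc comment live next to each format instead of in one central table.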
---