Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21667#discussion_r199860192
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceUtils.scala ---
    @@ -42,65 +38,9 @@ object DataSourceUtils {
     
       /**
       * Verify if the schema is supported in datasource. This verification should be done
    -   * in a driver side, e.g., `prepareWrite`, `buildReader`, and `buildReaderWithPartitionValues`
    -   * in `FileFormat`.
    -   *
    -   * Unsupported data types of csv, json, orc, and parquet are as follows;
    -   *  csv -> R/W: Interval, Null, Array, Map, Struct
    -   *  json -> W: Interval
    -   *  orc -> W: Interval, Null
    -   *  parquet -> R/W: Interval, Null
    +   * in a driver side.
    --- End diff ---
    
    `FileFormat` is internal, so this isn't about the public API; it's just a design choice.
    
    Generally it's fine to have a central place for business logic that covers different cases. However, here we can't access all `FileFormat` implementations: the Hive ORC one lives in the Hive module, which sql/core can't depend on. So the only choice is to dispatch the business logic into the implementations themselves.
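    
    For illustration only, a minimal sketch of what that dispatch could look like, assuming a hypothetical per-format `supportDataType` hook (the trait and object names, the `formatName` field, and the error message are made up for this sketch, not the actual API):
    
    ```scala
    import org.apache.spark.sql.types._
    
    // Hypothetical sketch: each FileFormat implementation declares which data
    // types it supports, so the check no longer needs a central table that
    // knows about every format (including Hive ORC, which lives in the Hive
    // module and is not visible from sql/core).
    trait FormatWithSupportedTypes {
      def formatName: String
    
      // Per-implementation hook: can this format handle the given type?
      def supportDataType(dataType: DataType): Boolean
    
      // Shared driver-side verification, written once against the hook.
      final def verifySchema(schema: StructType): Unit = {
        schema.foreach { field =>
          if (!supportDataType(field.dataType)) {
            throw new UnsupportedOperationException(
              s"$formatName data source does not support " +
                s"${field.dataType.catalogString} data type.")
          }
        }
      }
    }
    
    // Example implementation mirroring the CSV row of the removed doc comment:
    // CSV cannot handle Interval, Null, Array, Map, or Struct.
    object CsvFormatSketch extends FormatWithSupportedTypes {
      override val formatName: String = "CSV"
    
      override def supportDataType(dataType: DataType): Boolean = dataType match {
        case _: CalendarIntervalType | _: NullType => false
        case _: ArrayType | _: MapType | _: StructType => false
        case _ => true
      }
    }
    ```
    
    With this shape, the driver-side check only needs the format instance it is handed, and Hive's ORC implementation can supply its own rules from the Hive module without sql/core referencing it.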

