GitHub user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19124#discussion_r137121898
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala ---
    @@ -145,15 +146,27 @@ class DetermineTableStats(session: SparkSession) extends Rule[LogicalPlan] {
      * `PreprocessTableInsertion`.
      */
     object HiveAnalysis extends Rule[LogicalPlan] {
    +  private def checkFieldNames(table: CatalogTable): Unit = {
    --- End diff --
    
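    The logic is slightly different: the Hive version dispatches on the table's SerDe class (`table.storage.serde`), while the data source version matches on the provider name (`table.provider`).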
    
    **HiveStrategies**
    ```
      private def checkFieldNames(table: CatalogTable): Unit = {
        val serde = table.storage.serde
        if (serde == HiveSerDe.sourceToSerDe("orc").get.serde) {
          OrcFileFormat.checkFieldNames(table.dataSchema)
        } else if (serde == HiveSerDe.sourceToSerDe("parquet").get.serde) {
          ParquetSchemaConverter.checkFieldNames(table.dataSchema)
        }
      }
    ```
    
    **DataSourceStrategies**
    ```
      private def checkFieldNames(table: CatalogTable): Unit = {
        table.provider.get.toLowerCase(Locale.ROOT) match {
          case "parquet" =>
            ParquetSchemaConverter.checkFieldNames(table.dataSchema)
          case "orc" =>
            OrcFileFormat.checkFieldNames(table.dataSchema)
          case _ =>
        }
      }
    ```
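
    To make the difference concrete, here is a minimal, self-contained sketch of the two dispatch styles. `Table` and `Storage` are hypothetical stand-ins for `CatalogTable` and its storage format, not Spark classes, and the two SerDe class names are assumed to be what `HiveSerDe.sourceToSerDe` returns for "orc" and "parquet". Instead of calling the real checkers, each function just reports which one it would pick. Unlike the snippet above, it does not call `.get` on the provider, so a missing provider falls through instead of throwing.

    ```scala
    import java.util.Locale

    // Hypothetical stand-ins for CatalogTable and its storage format (not Spark classes).
    final case class Storage(serde: Option[String])
    final case class Table(provider: Option[String], storage: Storage)

    object CheckFieldNamesSketch {
      // Assumed SerDe class names for the "orc" and "parquet" sources.
      val OrcSerde = "org.apache.hadoop.hive.ql.io.orc.OrcSerde"
      val ParquetSerde = "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"

      // HiveStrategies style: decide by comparing the table's SerDe class.
      def hiveStyle(t: Table): Option[String] = t.storage.serde match {
        case Some(OrcSerde) => Some("orc")
        case Some(ParquetSerde) => Some("parquet")
        case _ => None
      }

      // DataSourceStrategies style: decide by the lower-cased provider name.
      def dataSourceStyle(t: Table): Option[String] =
        t.provider.map(_.toLowerCase(Locale.ROOT)) match {
          case Some(p @ ("parquet" | "orc")) => Some(p)
          case _ => None
        }

      def main(args: Array[String]): Unit = {
        // A Hive SerDe table stored as ORC, and a native data source ORC table.
        val hiveOrc = Table(Some("hive"), Storage(Some(OrcSerde)))
        val nativeOrc = Table(Some("ORC"), Storage(None))
        println(hiveStyle(hiveOrc) -> dataSourceStyle(hiveOrc))
        println(hiveStyle(nativeOrc) -> dataSourceStyle(nativeOrc))
      }
    }
    ```

    Under these assumptions, a table whose provider is "hive" but whose SerDe is ORC is only caught by the SerDe-based check, and a table with provider "ORC" and no SerDe is only caught by the provider-based check, so the two rules cannot simply share one implementation.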

