Github user budde commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16944#discussion_r103051158
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala ---
    @@ -226,6 +258,41 @@ private[hive] class HiveMetastoreCatalog(sparkSession: SparkSession) extends Log
         result.copy(expectedOutputAttributes = Some(metastoreRelation.output))
       }
     
    +  private def inferSchema(
    +      metastoreSchema: StructType,
    +      options: Map[String, String],
    +      fileFormat: FileFormat,
    +      fileType: String,
    +      fileIndex: FileIndex): Option[StructType] = {
    +    val inferred = fileFormat.inferSchema(
    +      sparkSession,
    +      options,
    +      fileIndex.listFiles(Nil).flatMap(_.files))
    +    if (fileType.equals("parquet")) {
    +      inferred.map(ParquetFileFormat.mergeMetastoreParquetSchema(metastoreSchema, _))
    +    } else {
    +      inferred
    +    }
    +  }
    +
    +  private def updateCatalogTable(
    +      catalogTable: CatalogTable,
    +      inferredSchema: Option[StructType]): Option[CatalogTable] = try {
    +    inferredSchema.flatMap { schema =>
    +      logInfo(s"Saving case-sensitive schema for table ${catalogTable.identifier.table}")
    +      val updatedTable = catalogTable.copy(schema = schema)
    +      val catalog = sparkSession.sharedState.externalCatalog
    +      catalog.alterTable(updatedTable)
    +      Option(catalog.getTable(
    --- End diff --
    
    I think that should be fine. I had some concerns about the way 
```HiveExternalCatalog``` mutates the raw ```CatalogTable``` returned by the 
metastore, which I think is what pushed me towards fetching the table again. 
I really don't think that matters, though, since the original 
```catalogTable``` was retrieved from ```HiveExternalCatalog``` as well.
    
    I'll just use ```updatedTable``` here.
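
A minimal sketch of the change described above, assuming simplified stand-ins: `Table` and `InMemoryCatalog` are hypothetical placeholders for `CatalogTable` and the external catalog, not Spark's real classes, and the `try` wrapper from the diff is left out because its handler is cut off in the quoted hunk. It only illustrates returning the copy that was just written instead of calling `getTable` a second time.

```scala
// Hypothetical stand-in for CatalogTable: just a name and a list of column names.
final case class Table(name: String, schema: Seq[String])

// Hypothetical stand-in for the external catalog. It stores tables unchanged,
// which is an assumption; the real HiveExternalCatalog may rewrite what it stores.
final class InMemoryCatalog {
  private var tables = Map.empty[String, Table]
  def alterTable(table: Table): Unit = tables += (table.name -> table)
  def getTable(name: String): Table = tables(name)
}

object UpdateSketch {
  // Persist the inferred schema, then hand back the updated copy directly
  // rather than fetching the table from the catalog again.
  def updateCatalogTable(
      catalog: InMemoryCatalog,
      table: Table,
      inferredSchema: Option[Seq[String]]): Option[Table] =
    inferredSchema.map { schema =>
      val updatedTable = table.copy(schema = schema)
      catalog.alterTable(updatedTable)
      updatedTable
    }
}
```

With these stand-ins the returned value and the stored value are equal by construction. Whether that also holds for the real catalog is the concern raised in the comment above, so this sketch does not settle it.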

