GitHub user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14071#discussion_r70003918
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala ---
    @@ -939,42 +940,33 @@ class SparkSqlAstBuilder(conf: SQLConf) extends AstBuilder {
         // to include the partition columns here explicitly
         val schema = cols ++ partitionCols
     
    -    // Storage format
    -    val defaultStorage: CatalogStorageFormat = {
    -      val defaultStorageType = conf.getConfString("hive.default.fileformat", "textfile")
    -      val defaultHiveSerde = HiveSerDe.sourceToSerDe(defaultStorageType, conf)
    -      CatalogStorageFormat(
    -        locationUri = None,
    -        inputFormat = defaultHiveSerde.flatMap(_.inputFormat)
    -          .orElse(Some("org.apache.hadoop.mapred.TextInputFormat")),
    -        outputFormat = defaultHiveSerde.flatMap(_.outputFormat)
    -          .orElse(Some("org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat")),
    -        // Note: Keep this unspecified because we use the presence of the serde to decide
    -        // whether to convert a table created by CTAS to a datasource table.
    -        serde = None,
    -        compressed = false,
    -        serdeProperties = Map())
    -    }
         validateRowFormatFileFormat(ctx.rowFormat, ctx.createFileFormat, ctx)
    -    val fileStorage = Option(ctx.createFileFormat).map(visitCreateFileFormat)
    -      .getOrElse(CatalogStorageFormat.empty)
    -    val rowStorage = Option(ctx.rowFormat).map(visitRowFormat)
    -      .getOrElse(CatalogStorageFormat.empty)
    -    val location = Option(ctx.locationSpec).map(visitLocationSpec)
    +    var storage = CatalogStorageFormat(
    +      locationUri = Option(ctx.locationSpec).map(visitLocationSpec),
    +      provider = Some("hive"),
    +      properties = Map.empty)
    +    Option(ctx.createFileFormat).foreach(ctx => storage = getFileFormat(ctx, storage))
    +    Option(ctx.rowFormat).foreach(ctx => storage = getRowFormat(ctx, storage))
    --- End diff --
    
    Originally, the value of `serde` was set using the following logic:
    ```
    serde = rowStorage.serde.orElse(fileStorage.serde).orElse(defaultStorage.serde)
    ```
    Now we are missing the last case, right?
    
    In addition, maybe we can add a comment to document this. `getFileFormat` might first assign an initial value; `getRowFormat` could then overwrite it. If neither sets it, we use the default value.
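    
    To make the precedence concrete, here is a minimal sketch of the two ways to express it. `Storage` is a hypothetical, simplified stand-in for `CatalogStorageFormat` that keeps only the `serde` field, and `SerdeResolution` and its method names are invented for illustration; none of this is the PR's actual code.
    
    ```scala
    // Hypothetical stand-in for CatalogStorageFormat: only the serde field is modelled.
    final case class Storage(serde: Option[String] = None)
    
    object SerdeResolution {
      // Precedence as in the removed code: row format, then file format, then default.
      def resolve(row: Storage, file: Storage, default: Storage): Option[String] =
        row.serde.orElse(file.serde).orElse(default.serde)
    
      // Sequential form: start from the default, then let the file format and
      // the row format each overwrite the value if they define one.
      def resolveSequentially(row: Storage, file: Storage, default: Storage): Option[String] =
        Seq(file, row).foldLeft(default.serde)((acc, s) => s.serde.orElse(acc))
    }
    ```
    
    Both forms return the same result for any combination of inputs, so a sequential rewrite keeps the old behavior only if it still starts from (or falls back to) the default.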


