[ 
https://issues.apache.org/jira/browse/SPARK-21687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16308153#comment-16308153
 ] 

Takeshi Yamamuro edited comment on SPARK-21687 at 1/2/18 10:26 PM:
-------------------------------------------------------------------

I feel this makes some sense (but since this is not a bug, there is less 
opportunity to backport it to 2.3). cc: [~dongjoon]


was (Author: maropu):
I feel this make some sense (But, this is a not bug, so less opportunity to 
backport to 2.3) cc: [~dongjoon]

> Spark SQL should set createTime for Hive partition
> --------------------------------------------------
>
>                 Key: SPARK-21687
>                 URL: https://issues.apache.org/jira/browse/SPARK-21687
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.1.0, 2.2.0
>            Reporter: Chaozhong Yang
>            Priority: Minor
>
> In Spark SQL, we often use `insert overwrite table t partition(p=xx)` to 
> create a partition for a partitioned table. `createTime` is important 
> information for managing the data lifecycle, e.g. TTL.
> However, we found that Spark SQL doesn't call `setCreateTime` in 
> `HiveClientImpl#toHivePartition`, as follows:
> {code:scala}
> def toHivePartition(
>       p: CatalogTablePartition,
>       ht: HiveTable): HivePartition = {
>     val tpart = new org.apache.hadoop.hive.metastore.api.Partition
>     val partValues = ht.getPartCols.asScala.map { hc =>
>       p.spec.get(hc.getName).getOrElse {
>         throw new IllegalArgumentException(
>           s"Partition spec is missing a value for column '${hc.getName}': ${p.spec}")
>       }
>     }
>     val storageDesc = new StorageDescriptor
>     val serdeInfo = new SerDeInfo
>     p.storage.locationUri.map(CatalogUtils.URIToString(_)).foreach(storageDesc.setLocation)
>     p.storage.inputFormat.foreach(storageDesc.setInputFormat)
>     p.storage.outputFormat.foreach(storageDesc.setOutputFormat)
>     p.storage.serde.foreach(serdeInfo.setSerializationLib)
>     serdeInfo.setParameters(p.storage.properties.asJava)
>     storageDesc.setSerdeInfo(serdeInfo)
>     tpart.setDbName(ht.getDbName)
>     tpart.setTableName(ht.getTableName)
>     tpart.setValues(partValues.asJava)
>     tpart.setSd(storageDesc)
>     new HivePartition(ht, tpart)
>   }
> {code}
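
For illustration only, here is a minimal sketch of the kind of call the description says is missing. It is not the actual patch. `PartitionStub` is a hypothetical stand-in for `org.apache.hadoop.hive.metastore.api.Partition`, so the snippet runs without Hive on the classpath, and `toMetastoreSeconds` and `withCreateTime` are made-up helper names. It assumes the metastore `createTime` field holds epoch seconds as an `Int`. Where the timestamp should come from (the current time or a value carried on the catalog partition) is left open.

```scala
import java.util.concurrent.TimeUnit

// Hypothetical stand-in for org.apache.hadoop.hive.metastore.api.Partition,
// reduced to the single field discussed in this issue.
final class PartitionStub {
  private var createTime: Int = 0
  def setCreateTime(seconds: Int): Unit = { createTime = seconds }
  def getCreateTime: Int = createTime
}

object CreateTimeSketch {
  // Convert epoch milliseconds to epoch seconds.
  // Assumption: the metastore stores createTime in seconds as an Int.
  def toMetastoreSeconds(epochMillis: Long): Int =
    TimeUnit.MILLISECONDS.toSeconds(epochMillis).toInt

  // Set createTime on the partition object and return it, mirroring where
  // such a call could sit next to the other tpart.setXxx calls.
  def withCreateTime(tpart: PartitionStub, epochMillis: Long): PartitionStub = {
    tpart.setCreateTime(toMetastoreSeconds(epochMillis))
    tpart
  }
}
```

In `toHivePartition`, the equivalent would presumably be a single `tpart.setCreateTime(...)` call placed before `new HivePartition(ht, tpart)`.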



