[
https://issues.apache.org/jira/browse/SPARK-30967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
NITISH SHARMA updated SPARK-30967:
----------------------------------
Summary: Achieve LAST_ACCESS_TIME column update in TBLS table of hive
metastore on hive table access through pyspark (was: Achieve LAST_ACCESS_TIME
column update in TBLS table of hive metastore on table access )
> Achieve LAST_ACCESS_TIME column update in TBLS table of hive metastore on
> hive table access through pyspark
> -----------------------------------------------------------------------------------------------------------
>
> Key: SPARK-30967
> URL: https://issues.apache.org/jira/browse/SPARK-30967
> Project: Spark
> Issue Type: Question
> Components: Spark Shell
> Affects Versions: 2.4.5
> Reporter: NITISH SHARMA
> Priority: Critical
>
> I have a requirement to update LAST_ACCESS_TIME in the TBLS table of the
> Hive metastore whenever a table is accessed through Spark. I set the property
> below in hive-site.xml; Hive honors it and updates LAST_ACCESS_TIME every
> time the table is accessed.
> <property>
>   <name>hive.exec.pre.hooks</name>
>   <value>org.apache.hadoop.hive.ql.hooks.UpdateInputAccessTimeHook$PreExec</value>
> </property>
> However, I want to achieve the same thing using pyspark/spark-shell, but Spark
> does not honor this Hive hook property. Is there an alternative approach for
> updating LAST_ACCESS_TIME in the Hive metastore on table access through
> Spark?
> I passed the property like this:
> spark-sql -e 'set spark.hadoop.hive.exec.post.hooks=org.apache.hadoop.hive.ql.hooks.UpdateInputAccessTimeHook$PreExec; select * from db.table;'
> I also put the same property in /etc/spark/conf/hive-site.xml.
>
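One possible application-side workaround, not confirmed anywhere in this issue, is to record the access from the pyspark job itself instead of relying on a Hive hook. The sketch below assumes that LAST_ACCESS_TIME holds epoch seconds and that the caller supplies a `touch` callback that knows how to write that value to the metastore. Both `read_with_access_touch` and `touch` are hypothetical names introduced here for illustration; only `spark.table` is an existing SparkSession method.

```python
import time
from typing import Any, Callable


def read_with_access_touch(
    spark: Any,
    table: str,
    touch: Callable[[str, int], None],
) -> Any:
    """Read `table` through Spark and report the access time to `touch`.

    `touch` is a hypothetical, caller-supplied callback; how it persists
    the timestamp to the metastore is deliberately left open here.
    """
    # Assumption: the metastore column stores whole seconds since the epoch.
    access_time = int(time.time())
    df = spark.table(table)  # standard SparkSession.table lookup
    touch(table, access_time)
    return df
```

In a job this would be called as `read_with_access_touch(spark, "db.table", my_touch)`, where `my_touch` is whatever mechanism the deployment allows for updating the metastore. Keeping the update behind a callback means the read path stays unchanged if that mechanism is later replaced.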