[
https://issues.apache.org/jira/browse/SPARK-11777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15010944#comment-15010944
]
Stanislav Hadjiiski commented on SPARK-11777:
---------------------------------------------
I guess that when HiveContext is used, Spark carries out all of its updates to a
specific table through Hive. According to the documentation, this is one of the
cases in which you have to issue a REFRESH on the metastore. I assume that when
creating a new table, Spark 'knows' it has to INVALIDATE the metadata, but it
does not REFRESH it on overwrite.
An acceptable resolution would be one of the following:
* saveAsTable issues a metastore refresh (either always, or when requested by
the user with a boolean flag)
* some convenient way to issue Impala statements through the Spark API (Spark
SQL and Hive can't issue a metastore refresh, at least to the best of my
knowledge)
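
Until one of these exists, the REFRESH could be sent to Impala directly from
the Spark driver over plain JDBC, bypassing Spark SQL entirely. The following
is only a minimal sketch, not something Spark provides: the object name
ImpalaRefresh and its methods are hypothetical, and it assumes a JDBC driver
that can talk to Impala (for example the Hive JDBC driver) is on the driver's
classpath.

```scala
import java.sql.{Connection, DriverManager}

// Hypothetical helper, not part of Spark: sends an Impala REFRESH over JDBC.
object ImpalaRefresh {

  // Builds the REFRESH statement. The name is restricted to "table" or
  // "db.table" made of letters, digits and underscores, so that arbitrary
  // SQL cannot be smuggled in through the table name.
  def refreshStatement(table: String): String = {
    require(
      table.matches("[A-Za-z0-9_]+(\\.[A-Za-z0-9_]+)?"),
      s"unexpected table name: $table")
    s"REFRESH $table"
  }

  // Runs the REFRESH on an already opened connection to Impala.
  def refresh(conn: Connection, table: String): Unit = {
    val stmt = conn.createStatement()
    try stmt.execute(refreshStatement(table))
    finally stmt.close()
  }

  // Opens a connection, refreshes the table and closes the connection.
  // The JDBC URL is supplied by the caller; its exact form depends on the
  // driver and the cluster's security settings (an assumption of this sketch).
  def refreshVia(jdbcUrl: String, table: String): Unit = {
    val conn = DriverManager.getConnection(jdbcUrl)
    try refresh(conn, table)
    finally conn.close()
  }
}
```

It would be called right after the overwrite, e.g.
ImpalaRefresh.refreshVia(url, "db_name.table"), where url might look like
"jdbc:hive2://impala-host:21050/;auth=noSasl" on an unsecured cluster (host,
port and auth setting are placeholders to adapt).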
> HiveContext.saveAsTable does not update the metastore on overwrite
> ------------------------------------------------------------------
>
> Key: SPARK-11777
> URL: https://issues.apache.org/jira/browse/SPARK-11777
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.5.1
> Reporter: Stanislav Hadjiiski
>
> Consider the following code:
> {quote}
> import org.apache.spark.sql.SaveMode
> case class Bean(cdata: String)
> val hiveContext = new org.apache.spark.sql.hive.HiveContext(sparkContext)
> val df = hiveContext.createDataFrame(Bean("test10") :: Bean("test20") :: Nil)
> df.write.mode(SaveMode.Overwrite).saveAsTable("db_name.table")
> {quote}
> This works as expected: if the table does not exist it is created, otherwise
> its content is replaced. However, the data is accessible through Impala (i.e.
> outside of the Spark environment) only in the first case. To get it working
> after an overwrite, a
> {quote}
> REFRESH db_name.table
> {quote}
> should be issued in impala-shell. Neither
> {quote}
> hiveContext.refreshTable("db_name.table")
> {quote}
> nor
> {quote}
> hiveContext.sql("REFRESH TABLE db_name.table")
> {quote}
> fixes the issue. The same applies if the {{default}} database is used (and
> {{db_name.}} is omitted everywhere).