[ https://issues.apache.org/jira/browse/IMPALA-4364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17164006#comment-17164006 ]
Vihang Karajgaonkar commented on IMPALA-4364:
---------------------------------------------
The issue is that catalogd doesn't fetch the partition objects while executing
the refresh. Instead, it fetches only the partition names and adds or removes
the partitions that are missing on either side, so a change to the metadata of
an existing partition, such as a new location, is not picked up. See this for
reference:
[https://github.com/apache/impala/blob/09727a8d5105cc73bbb53e1be9038972b0b65bb3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L1229]
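To illustrate the behaviour described above, here is a minimal sketch of a
name-only reconciliation. It is a simplified model, not Impala's actual Java
code: the function name refresh_by_names, the dict-based tables, and the
example paths are all assumptions made up for this illustration.

```python
def refresh_by_names(cached, metastore):
    """Reconcile two {partition_name: location} maps by name only.

    Hypothetical model of a name-based refresh: partitions are added or
    dropped, but a partition present on both sides is left untouched.
    """
    result = dict(cached)
    # Drop partitions that no longer exist in the metastore.
    for name in set(cached) - set(metastore):
        del result[name]
    # Add partitions that the cache has not seen yet.
    for name in set(metastore) - set(cached):
        result[name] = metastore[name]
    return result


# Made-up example data: year=2016 was moved with SET LOCATION outside Impala.
cached = {"year=2016": "hdfs://nn/warehouse/t/year=2016"}
metastore = {
    "year=2016": "hdfs://nn/archive/t/year=2016",
    "year=2017": "hdfs://nn/warehouse/t/year=2017",
}
refreshed = refresh_by_names(cached, metastore)
```

In this model the new partition year=2017 shows up after the refresh, while
year=2016 keeps its stale location because its name did not change.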
Supporting "refresh table_name location" is possible but isn't generic enough
in my opinion. For example, it would not handle the case where you add a
column or update the comment of the partition from Spark SQL.
Another option could be a session-specific query option that makes the refresh
load the altered partitions as well, instead of just updating the partition
list.
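A rough sketch of how such an opt-in refresh could behave, assuming each
partition is compared as a whole object rather than by name. Everything here
is hypothetical: the Partition class, the refresh function, and the
reload_altered flag are illustration names, not an existing Impala query
option or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    """Hypothetical partition record; only two attributes are modelled."""
    location: str
    comment: str = ""


def refresh(cached, metastore, reload_altered=False):
    """Reconcile {name: Partition} maps.

    With reload_altered=False this is the name-only behaviour. With
    reload_altered=True, partitions whose attributes differ are replaced
    by the metastore copy as well.
    """
    # Keep only partitions that still exist in the metastore.
    result = {name: p for name, p in cached.items() if name in metastore}
    for name, fresh in metastore.items():
        if name not in result:
            result[name] = fresh          # newly added partition
        elif reload_altered and result[name] != fresh:
            result[name] = fresh          # altered location, comment, ...
    return result


# Made-up example data.
cached = {"year=2016": Partition("hdfs://nn/warehouse/t/year=2016")}
metastore = {"year=2016": Partition("hdfs://nn/archive/t/year=2016", "moved")}

default_result = refresh(cached, metastore)
opt_in_result = refresh(cached, metastore, reload_altered=True)
```

Comparing whole objects covers a changed comment as well as a changed
location, which is the generality argument made above. The trade-off is that
the full partition objects have to be fetched, which is why the sketch keeps
it behind an opt-in flag.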
> REFRESH does not pick up ALTER TABLE...PARTITION...SET LOCATION changes
> -----------------------------------------------------------------------
>
> Key: IMPALA-4364
> URL: https://issues.apache.org/jira/browse/IMPALA-4364
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Affects Versions: Impala 2.6.0
> Reporter: Jacob Evan Beard
> Priority: Major
> Labels: usability
>
> AFAIK the REFRESH command should pick up all changes to a table made by ALTER
> TABLE from outside of Impala (e.g. Spark SQL), however REFRESH does not pick
> up changes from ALTER TABLE...PARTITION...SET LOCATION, which seems to
> require an INVALIDATE METADATA instead.