[ 
https://issues.apache.org/jira/browse/IMPALA-4364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17164006#comment-17164006
 ] 

Vihang Karajgaonkar commented on IMPALA-4364:
---------------------------------------------

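The issue is that catalogd does not fetch the partition objects while executing 
the refresh. Instead, it fetches only the partition names, then adds the 
partitions missing from the table and removes the ones that no longer exist. 
A partition whose name is unchanged is never reloaded, so a new location on an 
existing partition goes unnoticed. See this for reference: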

[https://github.com/apache/impala/blob/09727a8d5105cc73bbb53e1be9038972b0b65bb3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L1229]
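
To make the behaviour described above concrete, here is a minimal sketch of a name-only partition diff. It is an illustration under stated assumptions, not the actual HdfsTable code: the class PartitionNameDiff, the methods toAdd and toRemove, and the sample partition names and paths are all hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; this is NOT Impala's HdfsTable implementation.
public class PartitionNameDiff {

    // Names present in the metastore but absent from the cached table.
    public static Set<String> toAdd(Set<String> cachedNames, Set<String> metastoreNames) {
        Set<String> added = new HashSet<>(metastoreNames);
        added.removeAll(cachedNames);
        return added;
    }

    // Names present in the cached table but no longer in the metastore.
    public static Set<String> toRemove(Set<String> cachedNames, Set<String> metastoreNames) {
        Set<String> removed = new HashSet<>(cachedNames);
        removed.removeAll(metastoreNames);
        return removed;
    }

    public static void main(String[] args) {
        // Partition name -> location, as cached before the external ALTER (made-up values).
        Map<String, String> cached = new HashMap<>();
        cached.put("year=2019", "/data/t/year=2019");
        cached.put("year=2020", "/data/t/year=2020");

        // Same partition names, but one location was changed outside Impala.
        Map<String, String> metastore = new HashMap<>(cached);
        metastore.put("year=2020", "/data/moved/year=2020");

        // Only the names are compared, so the location change is invisible.
        System.out.println("to add: " + toAdd(cached.keySet(), metastore.keySet()));       // to add: []
        System.out.println("to remove: " + toRemove(cached.keySet(), metastore.keySet())); // to remove: []
    }
}
```

Both sets come back empty even though the location of year=2020 differs, which is the same symptom as in the issue description: a diff based on names alone has nothing to reload.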

A "refresh table_name location" command is possible but isn't generic enough, 
in my opinion. For example, it would not handle the case where you add a 
column or update the comment of a partition from Spark SQL.

Another option could be a session-specific query option that makes the 
refresh load the altered partitions as well, instead of just updating the 
partition list.

 

> REFRESH does not pick up ALTER TABLE...PARTITION...SET LOCATION changes
> -----------------------------------------------------------------------
>
>                 Key: IMPALA-4364
>                 URL: https://issues.apache.org/jira/browse/IMPALA-4364
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>    Affects Versions: Impala 2.6.0
>            Reporter: Jacob Evan Beard
>            Priority: Major
>              Labels: usability
>
> AFAIK the REFRESH command should pick up all changes to a table made by ALTER 
> TABLE from outside of Impala (e.g. Spark SQL), however REFRESH does not pick 
> up changes from ALTER TABLE...PARTITION...SET LOCATION, which seems to 
> require an INVALIDATE METADATA instead.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
