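Hey Fawze, RECOVER PARTITIONS is cheaper to execute, but it only picks up partition directories that Impala does not know about yet, so it works once for each new partition. Files added later to an already-registered partition are not detected by it. If you keep adding files to existing partitions, per-partition REFRESH is the best bet.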
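For illustration, here is a rough sketch of how the Spark job's driver could build the two statements. The table name `events` and the helper names are made up, and I am assuming the year/month/day partition columns are integers. As far as I know, the PARTITION clause on REFRESH needs Impala 2.7 or later, so check your version first.

```python
def recover_partitions_sql(table: str) -> str:
    """Statement that registers partition directories Impala has not seen yet."""
    return f"ALTER TABLE {table} RECOVER PARTITIONS"


def refresh_partition_sql(table: str, year: int, month: int, day: int) -> str:
    """Statement that reloads file metadata for one already-known partition."""
    # Assumes integer partition columns named year, month and day.
    return f"REFRESH {table} PARTITION (year={year}, month={month}, day={day})"


if __name__ == "__main__":
    # 'events' is a hypothetical table name.
    print(recover_partitions_sql("events"))
    print(refresh_partition_sql("events", 2019, 2, 6))
```

The idea would be to run the first statement when the job creates a new day directory, and the second one whenever it appends files to a day that Impala already knows about.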
HTH

On Wed, 6 Feb 2019 at 09:27, Fawze Abujaber <fawz...@gmail.com> wrote:
>
> Hi Community,
>
> I'm all the time working to enhance our impala usage and resource
> consumption, and here i would like to think which to use between alter table
> recover partitions and refresh statement, in terms of running time and
> resources, specially that refresh can be run on specific partitions, i have
> spark job that adding files at the HDFS partitioned by year,month and day.
>
> To automatically detect new partition directories added through Hive or HDFS
> operations:
>
> In CDH 5.5 / Impala 2.3 and higher, the RECOVER PARTITIONS clause scans a
> partitioned table to detect if any new partition directories were added
> outside of Impala, such as by Hive ALTER TABLE statements or by hdfs dfs or
> hadoop fs commands. The RECOVER PARTITIONS clause automatically recognizes
> any data files present in these new directories, the same as the REFRESH
> statement does.
>
>
> --
> Take Care
> Fawze Abujaber