Hello community,

I have a Spark job that writes Parquet files partitioned by year, month,
and day. On top of these Parquet files I'm creating different Impala
external tables, and a retention job cleans up the Parquet files on a
daily basis. On the Impala side I'm running a daily ALTER TABLE xxxxxx
RECOVER PARTITIONS and an hourly REFRESH xxxx. When I run SHOW
PARTITIONS I still see the old partitions, and they show up with zero
size and zero files.
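For reference, the maintenance statements look roughly like this
(mytable stands in for the real table name):

    -- daily: pick up partition directories newly written by Spark
    ALTER TABLE mytable RECOVER PARTITIONS;

    -- hourly: reload file metadata for the partitions Impala already knows
    REFRESH mytable;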

1) Are these stale partitions impacting Hive Metastore performance and
the memory used by the metastore?

2) Is there a way to drop these partitions without running ALTER TABLE
xxx DROP IF EXISTS PARTITION for each one? I believe a DROP and CREATE
of the table would also do it, but that is a heavy solution.
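For context, the per-partition cleanup I'd like to avoid would be one
statement per stale partition, something like this (the date values are
just examples):

    -- repeated for every partition the retention job has emptied
    ALTER TABLE mytable DROP IF EXISTS PARTITION (year=2018, month=1, day=1);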

3) Does ALTER TABLE xxxx RECOVER PARTITIONS only look at newly created
partitions, or does it also notice partitions whose directories were
deleted?

-- 
Take Care
Fawze Abujaber
