Hello Community, I'm investigating an issue we are running into with Impala DDL statements, which sometimes take 6-9 minutes to complete.
We have around 144 Impala tables partitioned by YYYY/MM/DD. We keep between 3 and 13 months of data, depending on the table, and we run 3 different DDL statements:

- ALTER TABLE ... RECOVER PARTITIONS every 20 minutes, to detect new data generated by a Spark job and written into HDFS.
- DROP and CREATE TABLE twice a day, to detect schema changes in the data.

I don't see the issue occurring on a specific table or a specific Impala daemon. On the other side, we have 450 Hive tables on which we run the same DDL statements using Hive.

I've been trying to find a way to investigate this, with no success. For example, I want to check the size of the metadata stored at each daemon, to see whether my issue is related to metadata size or not, but I'm not aware of how to check this.

Any suggestions on how to investigate this issue are much appreciated.

--
Take Care
Fawze Abujaber
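For reference, the recurring DDL described above would look roughly like this. This is only a sketch: the table name, columns, partition columns, and HDFS location are hypothetical placeholders, not our actual schema.

```sql
-- Every 20 minutes: detect new partitions written to HDFS by the Spark job
ALTER TABLE events RECOVER PARTITIONS;

-- Twice a day: recreate the table to pick up schema changes
-- (hypothetical schema and location, for illustration only)
DROP TABLE IF EXISTS events;
CREATE EXTERNAL TABLE events (
  id BIGINT,
  payload STRING
)
PARTITIONED BY (year INT, month INT, day INT)
STORED AS PARQUET
LOCATION '/data/events';
```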
