Sorry, I forgot to mention this: I'm using Impala 2.10 with Cloudera Manager 5.13.0.
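To put the proposed change in numbers, here is a rough sketch of the partition counts before and after. It assumes that 6 months of daily partitions is about 183 days, and that every one of the 400 tables gets the full x100 fan-out, which is a worst case since only the large tables would actually be changed. The function name and constants are illustrative only.

```python
def estimate_partitions(tables, retention_days, fanout=1):
    """Rough partition count: one partition per table, per day, per fan-out value."""
    return tables * retention_days * fanout


# Assumption: 6 months of retention is roughly 183 daily partitions per table.
RETENTION_DAYS = 183

# Today: 400 tables partitioned by year/month/day.
current = estimate_partitions(400, RETENTION_DAYS)

# Worst case: all 400 tables also partitioned by the first 2 account digits (x100).
worst_case = estimate_partitions(400, RETENTION_DAYS, fanout=100)

print(f"current partitions:    {current:,}")
print(f"worst-case partitions: {worst_case:,}")
```

Lowering the first argument to the number of tables that really get the extra partition column gives a more realistic figure.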
On Fri, Oct 19, 2018 at 9:30 PM Bharath Vissapragada <[email protected]> wrote:

> What version of Impala are you on?
>
> On Fri, Oct 19, 2018 at 11:14 AM Fawze Abujaber <[email protected]> wrote:
>
>> Hello Community,
>>
>> I have 400 Impala tables that are partitioned by year, month, and day,
>> and the retention for these tables is 6 months.
>>
>> I would like to add another partition level to these tables, using the
>> first 2 digits of the account, meaning I will increase the number of
>> partitions of each table by a factor of 100.
>>
>> Of course, I will review these tables and make sure I do this for the
>> large tables only.
>>
>> Is there a limit on the number of partitions per table? Theoretically
>> no, but I'm interested to know the best practices. I know this will
>> impact the metastore and the catalog server.
>>
>> What I'm looking for is:
>>
>> 1- How can I check the size of the metadata that each Impala node
>> stores, and that the catalog server stores as a whole?
>>
>> 2- Is there a linear relationship between the number of
>> tables/partitions and the memory needed for the metastore and catalog
>> server? In other words, if I make the change mentioned above, what
>> memory changes should I make for the metastore, catalog, and Impala
>> daemon to minimize the impact?
>>
>> 3- Is there a relationship between the DDL statements that I will run
>> (mainly DROP partitions) and the memory of the metastore, catalog, and
>> Impala daemon?
>>
>> 4- Is there any metric in Cloudera Manager that I can use to learn
>> about the partitions and their impact on the 3 roles mentioned?
>>
>> 5- On a side note, on 200 of my Impala tables I have to run
>> ALTER TABLE xxxx RECOVER PARTITIONS every 20 minutes, and DROP/CREATE
>> the tables twice a day. Which actions can I take to reduce the running
>> time of these operations?
>>
>> I'm interested to know the actions that I can take in terms of:
>>
>> A) Number of Impala daemons in the cluster (adding more nodes).
>> B) Number of nodes that can act as coordinator (I'm using a VIP for
>> the coordinator and I can drop and add nodes to this VIP).
>> C) The Impala daemon memory limit.
>> D) The catalog role memory and the Hive metastore memory.
>>
>> --
>> Take Care
>> Fawze Abujaber
>>
>

--
Take Care
Fawze Abujaber
