1. show partition enhanced

I think it would be more useful to see the data volume of each partition. Furthermore, the data volume of each day is an important metric that cannot be computed now.

To sum up, if the partition columns are "day, hour", shall we focus on these two questions:

Q1: How can we find the data volume of each hour?
------ We can aggregate the data volume of every segment belonging to that partition; this is easy.

Q2: How can we find the data volume of each day?
------ Maybe add an option to "show partitions", like "show partitions groupby DAY"?
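
A minimal sketch of the roll-up behind Q1 and Q2, written in Python for illustration rather than as CarbonData code. The `Segment` record, its field names, and the `volume_by` helper are all hypothetical. It assumes each segment belongs to exactly one (day, hour) partition and that its data size is already known.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class Segment(NamedTuple):
    """Hypothetical segment record; field names are illustrative only."""
    segment_id: str
    day: str        # value of the "day" partition column
    hour: int       # value of the "hour" partition column
    data_size: int  # data volume of the segment, in bytes


def volume_by(segments: Iterable[Segment], columns: Tuple[str, ...]) -> Dict[tuple, int]:
    """Sum segment sizes grouped by the given partition columns."""
    totals: Dict[tuple, int] = defaultdict(int)
    for seg in segments:
        # Build the grouping key from the requested partition columns only.
        key = tuple(getattr(seg, col) for col in columns)
        totals[key] += seg.data_size
    return dict(totals)


# Made-up sample data, only to exercise the helper.
segments = [
    Segment("0", "20200219", 13, 100),
    Segment("1", "20200219", 13, 50),
    Segment("2", "20200219", 14, 200),
    Segment("3", "20200220", 0, 70),
]

per_hour = volume_by(segments, ("day", "hour"))  # Q1: volume of each hour
per_day = volume_by(segments, ("day",))          # Q2: the "groupby DAY" roll-up
```

Grouping by a prefix of the partition columns is what a "groupby DAY" option would amount to: the per-day total is the sum of the per-hour totals for that day.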
2. show load delay

Shall we add an option (dryRun = true) to the "insert stage" command to output statistics about the stages that have not been loaded yet? The output could be:

| dtm-20200219/hh=13 | incompletely loaded, there are still 200 stages waiting for loading |
| dtm-20200219/hh=14 | completely loaded |
| dtm-20200219/hh=15 | completely loaded |
| dtm-20200219/hh=16 | completely loaded |
| dtm-20200219/hh=17 | incompletely loaded, there are still 1800 stages waiting for loading |

3. show segment enhanced

Coarse-grained statistics only may be better. Shall we just show the partition each segment belongs to, rather than outputting min(collect_time)?

--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
