Sergey Soldatov created PHOENIX-4544: ----------------------------------------
Summary: Update statistics inconsistent behavior Key: PHOENIX-4544 URL: https://issues.apache.org/jira/browse/PHOENIX-4544 Project: Phoenix Issue Type: Bug Affects Versions: 5.x Reporter: Romil Choksi Update statistics may not generate the stats information for all dependent indexes. And this behavior may depend on whether the command executed synchronously or asynchronously. I have a table GIGANTIC_TABLE with ~500k rows with global index I1 and local index I2. If async is turned on (the default value): {noformat} 0: jdbc:phoenix:> update statistics GIGANTIC_TABLE ALL; No rows affected (0.081 seconds) 0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='I1' AND COLUMN_FAMILY='0'; +-------------------------------+ | COUNT(GUIDE_POSTS_ROW_COUNT) | +-------------------------------+ | 5 | +-------------------------------+ 1 row selected (0.009 seconds) 0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='GIGANTIC_TABLE' AND COLUMN_FAMILY='0'; +-------------------------------+ | COUNT(GUIDE_POSTS_ROW_COUNT) | +-------------------------------+ | 520 | +-------------------------------+ 1 row selected (0.014 seconds) 0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='GIGANTIC_TABLE' AND COLUMN_FAMILY='L#0'; +-------------------------------+ | COUNT(GUIDE_POSTS_ROW_COUNT) | +-------------------------------+ | 0 | +-------------------------------+ 1 row selected (0.008 seconds) 0: jdbc:phoenix:> {noformat} As we can see there is no records for local index I2. But if we run statistics for indexes: {noformat} 0: jdbc:phoenix:> update statistics GIGANTIC_TABLE INDEX; No rows affected (0.036 seconds) 0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='GIGANTIC_TABLE' AND COLUMN_FAMILY='L#0'; +-------------------------------+ | COUNT(GUIDE_POSTS_ROW_COUNT) | +-------------------------------+ | 20 | +-------------------------------+ 1 row selected (0.007 seconds) {noformat} the statistic for local index is generated correctly. Now we turn async off: {noformat} 0: jdbc:phoenix:> delete from SYSTEM.STATS; 547 rows affected (0.079 seconds) 0: jdbc:phoenix:> update statistics GIGANTIC_TABLE ALL; 999,998 rows affected (4.671 seconds) 0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='GIGANTIC_TABLE' AND COLUMN_FAMILY='0'; +-------------------------------+ | COUNT(GUIDE_POSTS_ROW_COUNT) | +-------------------------------+ | 520 | +-------------------------------+ 1 row selected (0.04 seconds) 0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='GIGANTIC_TABLE' AND COLUMN_FAMILY='L#0'; +-------------------------------+ | COUNT(GUIDE_POSTS_ROW_COUNT) | +-------------------------------+ | 20 | +-------------------------------+ 1 row selected (0.012 seconds) 0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='I1' AND COLUMN_FAMILY='0'; +-------------------------------+ | COUNT(GUIDE_POSTS_ROW_COUNT) | +-------------------------------+ | 0 | +-------------------------------+ 1 row selected (0.011 seconds) {noformat} As we can see we got statistics for the table itself and local index. But not for the global index. Moreover, if we try to update statistics for indexes: {noformat} 0: jdbc:phoenix:> update statistics GIGANTIC_TABLE INDEX; 499,999 rows affected (0.332 seconds) 0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='I1' AND COLUMN_FAMILY='0'; +-------------------------------+ | COUNT(GUIDE_POSTS_ROW_COUNT) | +-------------------------------+ | 0 | +-------------------------------+ 1 row selected (0.009 seconds) {noformat} So, still no records for global index. But if we delete statistics first and run update for indexes: {noformat} 0: jdbc:phoenix:> delete from SYSTEM.STATS; 541 rows affected (0.024 seconds) 0: jdbc:phoenix:> update statistics GIGANTIC_TABLE INDEX; 999,998 rows affected (0.41 seconds) 0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='I1' AND COLUMN_FAMILY='0'; +-------------------------------+ | COUNT(GUIDE_POSTS_ROW_COUNT) | +-------------------------------+ | 5 | +-------------------------------+ 1 row selected (0.01 seconds) 0: jdbc:phoenix:> select count(GUIDE_POSTS_ROW_COUNT) from SYSTEM.STATS WHERE PHYSICAL_NAME='GIGANTIC_TABLE' AND COLUMN_FAMILY='L#0'; +-------------------------------+ | COUNT(GUIDE_POSTS_ROW_COUNT) | +-------------------------------+ | 20 | +-------------------------------+ 1 row selected (0.01 seconds) {noformat} than we got statistics for both local and global indexes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)