Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/17521 )
Change subject: IMPALA-10700: Add query options to skip deleting stats ...................................................................... Patch Set 1: (4 comments) http://gerrit.cloudera.org:8080/#/c/17521/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17521/1//COMMIT_MSG@16 PS1, Line 16: to nit: remove 'to' http://gerrit.cloudera.org:8080/#/c/17521/1//COMMIT_MSG@24 PS1, Line 24: 1. SKIP_DELETE_TABLE_STATS nit: would be good to specify the default option value in the commit message (helps out the docs team). http://gerrit.cloudera.org:8080/#/c/17521/1/be/src/service/query-options.h File be/src/service/query-options.h: http://gerrit.cloudera.org:8080/#/c/17521/1/be/src/service/query-options.h@248 PS1, Line 248: QUERY_OPT_FN(skip_delete_table_stats, SKIP_DELETE_TABLE_STATS,\ : TQueryOptionLevel::ADVANCED)\ : QUERY_OPT_FN(skip_delete_column_stats, SKIP_DELETE_COLUMN_STATS,\ : TQueryOptionLevel::ADVANCED) > nit: Can we mention TRUNCATE in the names since these only affects TRUNCATE Agree with adding TRUNCATE in the name. Also why not just DELETE_TABLE_STATS_IN_TRUNCATE and leave the default as True ? Do we really need the SKIP prefix ? makes it verbose. I am curious why we need 2 separate options. One could get into a state where if one of these is enabled, the table stats may not be consistent with the column stats e.g the NDV is greater than row count. Even though this can occur in practice, I think if the pre-truncate state was a consistent state, then the post-truncate should also remain so and it would be useful to not give users more knobs than needed. Any thoughts ? http://gerrit.cloudera.org:8080/#/c/17521/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java: http://gerrit.cloudera.org:8080/#/c/17521/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@2323 PS1, Line 2323: if (!params.isSetSkip_delete_column_stats()) { This is pre-existing behavior but I would have expected the stats deletion to be done before the table truncate operation since the metadata has dependency on the data. The truncateIcebergTable() does do it in the right sequence. But since this is existing behavior, I am ok with not changing it. -- To view, visit http://gerrit.cloudera.org:8080/17521 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9400c3586b4bdf46d9b4056ea1023aabae8cc519 Gerrit-Change-Number: 17521 Gerrit-PatchSet: 1 Gerrit-Owner: Vihang Karajgaonkar <[email protected]> Gerrit-Reviewer: Aman Sinha <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Comment-Date: Thu, 27 May 2021 06:51:20 +0000 Gerrit-HasComments: Yes
