Arnab Karmakar has posted comments on this change. ( http://gerrit.cloudera.org:8080/23536 )
Change subject: IMPALA-13066: Extend SHOW CREATE TABLE to include stats and partitions ...................................................................... Patch Set 11: (7 comments) http://gerrit.cloudera.org:8080/#/c/23536/10/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/23536/10/common/thrift/ImpalaService.thrift@1075 PS10, Line 1075: SHOW_CREATE_TABLE_PARTITION_LIMIT = 198 > The name is too generic that it seems a limit on all statements, including Done http://gerrit.cloudera.org:8080/#/c/23536/10/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java File fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java: http://gerrit.cloudera.org:8080/#/c/23536/10/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java@180 PS10, Line 180: // Remove storage handler and internal ids > It seems this is the only difference between 'createTableProps' and 'alterT Done http://gerrit.cloudera.org:8080/#/c/23536/10/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java@494 PS10, Line 494: // Filter properties into two maps: one for CREATE TABLE, one for ALTER TABLE : Map<String, String> createTableProps = Maps.newLinkedHashMap(); : M > nit: we can simplify this to only use remove(). Done http://gerrit.cloudera.org:8080/#/c/23536/10/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java@658 PS10, Line 658: } : if (numRows >= 0) { : allTblProps.put("numRows", Long.toString(numRows)); : } : : // Add STATS_GENERATED_VIA_STATS_TASK if present in HMS parameters : if (msTbl != null && msTbl.getParameters() != null) { : String statsGenerated = msTbl.getParameters().get("STATS_GENERATED_VIA_STATS_TASK"); : if (statsGenerated != null) { : allTblProps.put("STATS_GENERATED_VIA_STATS_TASK", statsGenerated); : } : } : : / > Do we still need these if "numRows" is not removed in filterTblProperties() FeTable.NUM_ROWS = "numRows" and StatsSetupConst.ROW_COUNT are two different property keys. These line use table.getNumRows() which is the cached in-memory value from the FeTable interface and falls back to HMS if needed. If the in-memory value is negative (unknown), it tries to get the value from HMS using StatsSetupConst.ROW_COUNT http://gerrit.cloudera.org:8080/#/c/23536/10/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java@691 PS10, Line 691: out.append(warnings); : } : : return out.toString(); : } : : /** : * Appends ALTER TABLE ... SET COLUMN STATS statements for all non-partition columns : * that have statistics. : * @param table The table to generate column stats for : * @param out StringBuilder to append the statements to : * @return true if any column stats were appended, false otherwise : */ : private static boolean appendColumnStatsStatements(FeTable table, StringBuilder out) { : int numClusterCols = table.getNumClusteringCols(); : boolean hasColumnStats = false; : : for (int i = numClusterCols; i < table.getColumns().size(); i++) { : Column c = table.getColumns().get(i); : ColumnStats s = c.getStats(); : if (s == null) continue; : boolean isFixed = c.getType() != null && c.getType().isFixedLengthType(); : : List<String> kvs = new ArrayList<>(); : // Always include NDV and nulls (may be -1 for unknown) : kvs.add("'numDVs'='" + s.getNumDistinctValues() + "'"); : kvs.add("'numNulls'='" + s.getNumNulls() + "'"); : // Include size stats only for variable-length types : if (!isFixed) { : kvs.add("'maxSize'='" + s.getMaxSize() + "'"); : double avg = s.getAvgSize(); : String avgStr = (Math.rint(avg) == avg) ? : Long.toString((long) avg) : Double.toString(avg); : kvs.add("'avgSize'='" + avgStr + "'"); : } : // Include boolean-specific counts (may be -1 for non-boolean types) : kvs.add("'numTrues'='" + s.getNumTrues() + "'"); : > Can we extract these into a method which returns 'hasColumnStats'? The curr Ive added new helper methods for refactoring the method: appendColumnStatsStatements() appendPartitionStatements() appendAddPartitionStatements() appendPartitionPropertiesStatements() http://gerrit.cloudera.org:8080/#/c/23536/10/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java@735 PS10, Line 735: > I think we can move "numClusterCols > 0" to the check on L731. "numClusterC Done http://gerrit.cloudera.org:8080/#/c/23536/10/tests/metadata/test_show_create_table.py File tests/metadata/test_show_create_table.py: http://gerrit.cloudera.org:8080/#/c/23536/10/tests/metadata/test_show_create_table.py@261 PS10, Line 261: def __get_partition_properties(self, sql_str): : """ Extract properties from partition-level SET TBLPROPERTIES statements. : Handles statements like: : ALTER TABLE ... PARTITION (p=1) SET TBLPROPERTIES ('key'='value', ...) > I think the problem is due to missing a space between "TBLPROPERTIES" and " Ive added a space between TBLPROPERTIES and "(" now so properties_map_regex() is working now. -- To view, visit http://gerrit.cloudera.org:8080/23536 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I87950ae9d9bb73cb2a435cf5bcad076df1570dc2 Gerrit-Change-Number: 23536 Gerrit-PatchSet: 11 Gerrit-Owner: Arnab Karmakar <[email protected]> Gerrit-Reviewer: Arnab Karmakar <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Comment-Date: Thu, 06 Nov 2025 18:17:53 +0000 Gerrit-HasComments: Yes
