Arnab Karmakar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23536 )

Change subject: IMPALA-13066: Extend SHOW CREATE TABLE to include stats and 
partitions
......................................................................


Patch Set 11:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/23536/10/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/23536/10/common/thrift/ImpalaService.thrift@1075
PS10, Line 1075:   SHOW_CREATE_TABLE_PARTITION_LIMIT = 198
> The name is too generic that it seems a limit on all statements, including
Done


http://gerrit.cloudera.org:8080/#/c/23536/10/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
File fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java:

http://gerrit.cloudera.org:8080/#/c/23536/10/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java@180
PS10, Line 180:       // Remove storage handler and internal ids
> It seems this is the only difference between 'createTableProps' and 'alterT
Done


http://gerrit.cloudera.org:8080/#/c/23536/10/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java@494
PS10, Line 494:     // Filter properties into two maps: one for CREATE TABLE, 
one for ALTER TABLE
              :     Map<String, String> createTableProps = 
Maps.newLinkedHashMap();
              :     M
> nit: we can simplify this to only use remove().
Done


http://gerrit.cloudera.org:8080/#/c/23536/10/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java@658
PS10, Line 658:     }
              :     if (numRows >= 0) {
              :       allTblProps.put("numRows", Long.toString(numRows));
              :     }
              :
              :     // Add STATS_GENERATED_VIA_STATS_TASK if present in HMS 
parameters
              :     if (msTbl != null && msTbl.getParameters() != null) {
              :       String statsGenerated = 
msTbl.getParameters().get("STATS_GENERATED_VIA_STATS_TASK");
              :       if (statsGenerated != null) {
              :         allTblProps.put("STATS_GENERATED_VIA_STATS_TASK", 
statsGenerated);
              :       }
              :     }
              :
              :     /
> Do we still need these if "numRows" is not removed in filterTblProperties()
FeTable.NUM_ROWS = "numRows" and StatsSetupConst.ROW_COUNT are two different 
property keys.
These line use table.getNumRows() which is the cached in-memory value from the 
FeTable interface and falls back to HMS if needed. If the in-memory value is 
negative (unknown), it tries to get the value from HMS using 
StatsSetupConst.ROW_COUNT


http://gerrit.cloudera.org:8080/#/c/23536/10/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java@691
PS10, Line 691:       out.append(warnings);
              :     }
              :
              :     return out.toString();
              :   }
              :
              :   /**
              :    * Appends ALTER TABLE ... SET COLUMN STATS statements for 
all non-partition columns
              :    * that have statistics.
              :    * @param table The table to generate column stats for
              :    * @param out StringBuilder to append the statements to
              :    * @return true if any column stats were appended, false 
otherwise
              :    */
              :   private static boolean appendColumnStatsStatements(FeTable 
table, StringBuilder out) {
              :     int numClusterCols = table.getNumClusteringCols();
              :     boolean hasColumnStats = false;
              :
              :     for (int i = numClusterCols; i < table.getColumns().size(); 
i++) {
              :       Column c = table.getColumns().get(i);
              :       ColumnStats s = c.getStats();
              :       if (s == null) continue;
              :       boolean isFixed = c.getType() != null && 
c.getType().isFixedLengthType();
              :
              :       List<String> kvs = new ArrayList<>();
              :       // Always include NDV and nulls (may be -1 for unknown)
              :       kvs.add("'numDVs'='" + s.getNumDistinctValues() + "'");
              :       kvs.add("'numNulls'='" + s.getNumNulls() + "'");
              :       // Include size stats only for variable-length types
              :       if (!isFixed) {
              :         kvs.add("'maxSize'='" + s.getMaxSize() + "'");
              :         double avg = s.getAvgSize();
              :         String avgStr = (Math.rint(avg) == avg) ?
              :           Long.toString((long) avg) : Double.toString(avg);
              :         kvs.add("'avgSize'='" + avgStr + "'");
              :       }
              :       // Include boolean-specific counts (may be -1 for 
non-boolean types)
              :       kvs.add("'numTrues'='" + s.getNumTrues() + "'");
              :
> Can we extract these into a method which returns 'hasColumnStats'? The curr
Ive added new helper methods for refactoring the method:
appendColumnStatsStatements()
appendPartitionStatements()
appendAddPartitionStatements()
appendPartitionPropertiesStatements()


http://gerrit.cloudera.org:8080/#/c/23536/10/fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java@735
PS10, Line 735:
> I think we can move "numClusterCols > 0" to the check on L731. "numClusterC
Done


http://gerrit.cloudera.org:8080/#/c/23536/10/tests/metadata/test_show_create_table.py
File tests/metadata/test_show_create_table.py:

http://gerrit.cloudera.org:8080/#/c/23536/10/tests/metadata/test_show_create_table.py@261
PS10, Line 261:   def __get_partition_properties(self, sql_str):
              :     """ Extract properties from partition-level SET 
TBLPROPERTIES statements.
              :     Handles statements like:
              :     ALTER TABLE ... PARTITION (p=1) SET TBLPROPERTIES 
('key'='value', ...)
> I think the problem is due to missing a space between "TBLPROPERTIES" and "
Ive added a space between TBLPROPERTIES and "(" now so properties_map_regex() 
is working now.



--
To view, visit http://gerrit.cloudera.org:8080/23536
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I87950ae9d9bb73cb2a435cf5bcad076df1570dc2
Gerrit-Change-Number: 23536
Gerrit-PatchSet: 11
Gerrit-Owner: Arnab Karmakar <[email protected]>
Gerrit-Reviewer: Arnab Karmakar <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Comment-Date: Thu, 06 Nov 2025 18:17:53 +0000
Gerrit-HasComments: Yes

Reply via email to