Dimitris Tsirogiannis has posted comments on this change. Change subject: IMPALA-2848: Simplify creation of a new Kudu table ......................................................................
Patch Set 4: (20 comments) Initial pass. Haven't gone through the tests yet. http://gerrit.cloudera.org:8080/#/c/2984/4/fe/src/main/java/com/cloudera/impala/analysis/ColumnDef.java File fe/src/main/java/com/cloudera/impala/analysis/ColumnDef.java: Line 51: private final boolean isPrimaryKey_; You may want to add a comment. This is specific to kudu tables only. Line 178: mapByColumnNames( maybe "getNameToColumnDefinitionMap()" http://gerrit.cloudera.org:8080/#/c/2984/4/fe/src/main/java/com/cloudera/impala/analysis/CreateTableAsSelectStmt.java File fe/src/main/java/com/cloudera/impala/analysis/CreateTableAsSelectStmt.java: Line 195: : I thought this was only for Kudu tables. http://gerrit.cloudera.org:8080/#/c/2984/4/fe/src/main/java/com/cloudera/impala/analysis/CreateTableStmt.java File fe/src/main/java/com/cloudera/impala/analysis/CreateTableStmt.java: Line 104: those are not included here Out of curiosity, why is that the case? Line 179: throwIfPrimaryKeysWereSpecified Can't you use the generic throwIfNotNullOrNotEmpty here as well? Line 211: // TODO: Find out what is creating a directory in HDFS and stop doing that. Kudu : // tables shouldn't have HDFS dirs. Is this still relevant? Maybe create a JIRA and reference it here? Line 221: throwIfPrimaryKeysWereSpecified( Same comment as in L179 Line 233: else { I would put the entire else block in a separate function. Line 265: if (!getDistributeParams().isEmpty()) { : Map<String, ColumnDef> primaryKeyColDefsByName : = ColumnDef.mapByColumnNames(getPrimaryKeyColumnDefs()); : Map<String, DistributeParam> hashParamsByColName = Maps.newHashMap(); : for (DistributeParam distributeParam: getDistributeParams()) { : List<String> colNames; : if (distributeParam.getColumnNames().isEmpty()) { : colNames = ColumnDef.toColumnNames(getPrimaryKeyColumnDefs()); : } else { : colNames = distributeParam.getColumnNames(); : } : for (String colName : colNames) { : ColumnDef columnDef = primaryKeyColDefsByName.get(colName); : if (columnDef == null) { : throw new AnalysisException(String.format("Column '%s' in '%s' is not " + : "a key column. Only key columns can be used in DISTRIBUTE BY.", : colName, distributeParam.toSql())); : } : if (distributeParam.getType() == Type.HASH) { : DistributeParam otherDistributeParam = hashParamsByColName.get(colName); : if (otherDistributeParam == null) { : hashParamsByColName.put(colName, distributeParam); : } else { : throw new AnalysisException(String.format("Column '%s' used in '%s' is " + : "already used in '%s'", colName, distributeParam.toSql(), : otherDistributeParam.toSql())); : } : } : distributeParam.getColumnDefs().add(columnDef); : } : distributeParam.analyze(analyzer); : } : } Can you briefly comment at the beginning of that block what you're doing here? Line 314: isKeyType nit: maybe rename to isSupportedKeyType http://gerrit.cloudera.org:8080/#/c/2984/4/fe/src/main/java/com/cloudera/impala/analysis/DistributeParam.java File fe/src/main/java/com/cloudera/impala/analysis/DistributeParam.java: Line 82: // The ColumnDefs that the colNames_ refer to. The caller of analyze() must set these : // before calling. : private final List<ColumnDef> colDefs_ = Lists.newArrayList(); nit: can you move it below colNames_? Line 89: for (String name: colNames) { : colNames_.add(name.toLowerCase()); : } single line Line 98: Preconditions.checkState(!colDefs_.isEmpty()); : Preconditions.checkState(colNames_.isEmpty() || colNames_.size() == colDefs_.size()); Not sure I follow those checks. Is it ok to have colDefs_ which are not empty and colNames_ which are empty? Line 101: Integer.MAX_VALUE Interesting, can Kudu really handle that many buckets? :) Line 106: longValue() Can't you use that in L101 to avoid the creation of a BigDecimal instance? Line 127: Preconditions.checkState(colType.isIntegerType() || colType.isStringType()); I believe you already have a function for that check. http://gerrit.cloudera.org:8080/#/c/2984/4/fe/src/main/java/com/cloudera/impala/analysis/TableDef.java File fe/src/main/java/com/cloudera/impala/analysis/TableDef.java: Line 48: private List<ColumnDef> primaryKeyColDefs_ = Lists.newArrayList(); final? http://gerrit.cloudera.org:8080/#/c/2984/4/fe/src/main/java/com/cloudera/impala/analysis/TableName.java File fe/src/main/java/com/cloudera/impala/analysis/TableName.java: Line 73: if (db_ == null) { : return tbl_; I don't think a fully qualified name can have only a table nane; http://gerrit.cloudera.org:8080/#/c/2984/4/fe/src/main/java/com/cloudera/impala/catalog/KuduTable.java File fe/src/main/java/com/cloudera/impala/catalog/KuduTable.java: Line 155: TableLoadingException parseColumnType throws a TableLoadingException? this is weird... http://gerrit.cloudera.org:8080/#/c/2984/4/fe/src/main/java/com/cloudera/impala/util/KuduUtil.java File fe/src/main/java/com/cloudera/impala/util/KuduUtil.java: Line 190: provide an : * explicit name. That description is kind of awkward given that the input params are a database and a table name which are used to create a name :) -- To view, visit http://gerrit.cloudera.org:8080/2984 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Iba5714941a81bad7a410c850e4037637cffaf8ab Gerrit-PatchSet: 4 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Casey Ching <[email protected]> Gerrit-Reviewer: Casey Ching <[email protected]> Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]> Gerrit-HasComments: Yes
