Dimitris Tsirogiannis has posted comments on this change. Change subject: Simplify creating external Kudu tables and add DROP DATABASE CASCADE ......................................................................
Patch Set 9: (26 comments) Flushing out some comments. Haven't looked at the tests yet. http://gerrit.cloudera.org:8080/#/c/2617/9/fe/src/main/java/com/cloudera/impala/analysis/ColumnDef.java File fe/src/main/java/com/cloudera/impala/analysis/ColumnDef.java: Line 125: this. You can get rid off 'this'. http://gerrit.cloudera.org:8080/#/c/2617/9/fe/src/main/java/com/cloudera/impala/analysis/CreateTableStmt.java File fe/src/main/java/com/cloudera/impala/analysis/CreateTableStmt.java: Line 151: RowFormat getRowFormat() { Private? public? Line 153: fileFormat_ == null Out of curiosity, when does this happen? Line 193: getRowFormat() rowFormat Line 228: // The old way of specifying a Kudu table was to use the storage handler table : // property. The new way is to specify the file format. Analysis is simpler if : // the old way is rejected. Not sure if this comment is particularly useful. Line 303: analyzeKudu maybe analyzeKuduStmt or analyzeKuduProperties? http://gerrit.cloudera.org:8080/#/c/2617/9/fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/com/cloudera/impala/catalog/CatalogServiceCatalog.java: Line 153: address nit: addresses http://gerrit.cloudera.org:8080/#/c/2617/9/fe/src/main/java/com/cloudera/impala/catalog/delegates/KuduDdlDelegate.java File fe/src/main/java/com/cloudera/impala/catalog/delegates/KuduDdlDelegate.java: Line 66: private String getTableName() { : return msTbl_.getTableName(); : } Single line? Line 96: if (Strings.isNullOrEmpty(masterAddrs)) { : throw new ImpalaRuntimeException(String.format( : "Table property '%s' is required when catalogd startup flag " + : "-kudu_master_addrs is not used.", KuduTable.KEY_MASTER_ADDRESSES)); : } Do we need this check this here? Shouldn't we be checking the default master addrs when we initialize the catalog? Line 119: explicitly or implicitly What does that mean? Do you mean fully qualified (e.g. db.table_name)? Line 128: new ArrayList<>(2) Lists.newArrayList(2); Line 183: if (kudu.tableExists(kuduTableName)) { What will happen if I try to create a managed kudu table and the kudu table name in tblproperties is not specified, i.e. kuduTableName is null or empty? Line 234: getKuduTableName() Isn't that the same as the kuduTableName? http://gerrit.cloudera.org:8080/#/c/2617/9/fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java File fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java: Line 1115: fetched "from the Hive MetaStore". Line 1115: The metadata contained in the Metastore Table class is : // enough to identify the format. A direct RPC against the Metastore is used to avoid : // loading unnecessary partition data and statistics as would be done through the : // CatalogServiceCatalog. I think you can remove this part of the comment. Line 1119: new ArrayList<>(); Use guava: Lists.newArrayList(); also in L1120 Line 1130: assume that any table not in its cache also doesn't require the : // use of a DdlDelegate Maybe I've missed something, but I am not sure the code below reflects that comment. From what I see, you try to create a ddl delegate for every table that is simply not loaded in Impala. Line 1132: Users can issue : // an REFRESH command to load the database metadata before dropping to ensure : // delegates will be used when needed. Isn't that what is essentially happening in L1148? Line 1145: if (!incompleteTableNames.isEmpty()) { : MetaStoreClient msClient = catalog_.getMetaStoreClient(); : try { : msTables.addAll(msClient.getHiveClient().getTableObjectsByName( : db.getName(), incompleteTableNames)); : } catch (TException e) { : throw new ImpalaRuntimeException( : String.format(HMS_RPC_ERROR_FORMAT_STR, "getTableObjectsByName"), e); : } finally { : msClient.release(); : } : } I think this block will only be executed in the db != null case. Why don't you move it inside the if block above? I think it will make it a bit easier to follow. Line 1158: createDdlDelegate(msTable) My understanding is that msTables contains all the tables that haven't been loaded in Impala. That might include both kudu and non-kudu tables, right? So, is it ok to try to create a ddl delegate and then call dropTable on it? Line 1542: DdlDelegate ddlDelegate = createDdlDelegate(newTable) : .setDistributeParams(distributeBy); : ddlDelegate.createTable(); It feels weird to be calling a DdlDelegate even for tables that don't need one, even though this turns out to be a no-op. Can you leave a TODO to clean this up? Also, don't we support IF NOT EXISTS for Kudu tables? We don't seem to be taking this into account here. Line 2731: sd != null In which cases sd is null? http://gerrit.cloudera.org:8080/#/c/2617/9/fe/src/main/java/com/cloudera/impala/service/JniCatalog.java File fe/src/main/java/com/cloudera/impala/service/JniCatalog.java: Line 78: defaultKuduMasterAddrs Why 'default'? Couldn't it be any Kudu master addrs specified through the starting arguments? http://gerrit.cloudera.org:8080/#/c/2617/9/fe/src/main/java/com/cloudera/impala/util/KuduClient.java File fe/src/main/java/com/cloudera/impala/util/KuduClient.java: Line 34: public class KuduClient I wish we could keep it in sync with the org.kududb.client API. Maybe if they their client was implementing a specific interface we could catch diverging APIs faster. Line 46: nit: maybe "Kudu table" here and in all the other error messages? http://gerrit.cloudera.org:8080/#/c/2617/9/fe/src/main/java/com/cloudera/impala/util/KuduUtil.java File fe/src/main/java/com/cloudera/impala/util/KuduUtil.java: Line 238: Kudu type %s is not supported in Impala Is that error message accurate? I think Impala supports a superset of the data types supported by Kudu. -- To view, visit http://gerrit.cloudera.org:8080/2617 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic141102818b6dad3016181b179a14024d0ff709d Gerrit-PatchSet: 9 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Casey Ching <[email protected]> Gerrit-Reviewer: Casey Ching <[email protected]> Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-HasComments: Yes
