Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/10869 )
Change subject: IMPALA-5035 Impala should not load INDEX_TABLEs from HMS ...................................................................... Patch Set 5: (4 comments) In addition to the one implementation concern, I'm not sure that I agree with the behavior change here. It seems inconsistent in a few ways: 1) for other unsupported table types, we still list them, and fail to load. Why should index tables be treated differently than unsupported serdes? Won't this produce confusing behavior if someone attempts to create a table with a conflicting name? 2) Seems inconsistent with Hive as well. 'SHOW TABLES IN functional;' in hive does show hive_index_tbl. Why introduce the inconsistency between engines? Perhaps we'd be better off just throwing a more useful exception like "Hive index tables are not supported" if we think that's better than the 'Unsupported table type: INDEX_TABLE' message? http://gerrit.cloudera.org:8080/#/c/10869/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10869/5//COMMIT_MSG@28 PS5, Line 28: egion; //external table it seems you have some typos here. Did you not just copy-paste this from the shell? http://gerrit.cloudera.org:8080/#/c/10869/5//COMMIT_MSG@29 PS5, Line 29: > show create talbe hive_index_tbl; here too http://gerrit.cloudera.org:8080/#/c/10869/5//COMMIT_MSG@31 PS5, Line 31: ERROR: AnalysisException: Table does not exist: hive_index_tbl what is the behavior if you try to create a table with the same name that conflicts with a Hive index table? http://gerrit.cloudera.org:8080/#/c/10869/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/10869/5/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1055 PS5, Line 1055: List<org.apache.hadoop.hive.metastore.api.Table> tableList = : msClient.getHiveClient().getTableObjectsByName(dbName, tableNameList); I'm worried about the performance of this in metastores with large numbers of tables. Loading the full table object at this stage is a lot more expensive than just the names. In a recent case I was working on with a production cluster there were many millions of tables and putting this here would be very harmful to performance. -- To view, visit http://gerrit.cloudera.org:8080/10869 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6e2ccbeee3e612eab96d89a211d15fc119b08d70 Gerrit-Change-Number: 10869 Gerrit-PatchSet: 5 Gerrit-Owner: Peikai Zheng <[email protected]> Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-Comment-Date: Mon, 09 Jul 2018 17:53:32 +0000 Gerrit-HasComments: Yes
