Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/14811 )
Change subject: IMPALA-8046: Support CREATE TABLE from an ORC file ...................................................................... Patch Set 3: (10 comments) http://gerrit.cloudera.org:8080/#/c/14811/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14811/3//COMMIT_MSG@19 PS3, Line 19: CREATE TABLE tablename LIKE ORC '/path/to/file' CREATE TABLE LIKE without file format name always means Parquet? It would be good to write this down in the commit message, and probably somewhere in the tests. http://gerrit.cloudera.org:8080/#/c/14811/3/fe/src/main/java/org/apache/impala/analysis/ORCSchemeExtractor.java File fe/src/main/java/org/apache/impala/analysis/ORCSchemeExtractor.java: http://gerrit.cloudera.org:8080/#/c/14811/3/fe/src/main/java/org/apache/impala/analysis/ORCSchemeExtractor.java@35 PS3, Line 35: import org.apache.orc.OrcFile; : import org.apache.orc.OrcFile.ReaderOptions; : import org.apache.orc.Reader; : import org.apache.orc.TypeDescription; : import org.apache.orc.TypeDescription.Category; Did we already have this dependency? It would be good to know where do we get the orc java dependencies from, e.g. does it come from Hive? In that case our java Orc version depends on the Hive version, while c++ Orc version depends on native-toolchain. http://gerrit.cloudera.org:8080/#/c/14811/3/fe/src/main/java/org/apache/impala/analysis/ORCSchemeExtractor.java@80 PS3, Line 80: case BYTE: return Type.TINYINT; I saw different type names in https://orc.apache.org/docs/types.html, e.g. this was tiny int. Maybe the Orc types are mapped to java types by the library? If yes, then this could be mentioned in a comment. http://gerrit.cloudera.org:8080/#/c/14811/3/fe/src/main/java/org/apache/impala/analysis/ORCSchemeExtractor.java@81 PS3, Line 81: case CHAR: return Type.CHAR; For CHAR(N) we should need the N parameter too. I also didn't see any test for this. http://gerrit.cloudera.org:8080/#/c/14811/3/fe/src/main/java/org/apache/impala/analysis/ORCSchemeExtractor.java@83 PS3, Line 83: case DECIMAL: return Type.DECIMAL; This seems incomplete to me, we should need precision and scale too. http://gerrit.cloudera.org:8080/#/c/14811/3/fe/src/main/java/org/apache/impala/analysis/ORCSchemeExtractor.java@91 PS3, Line 91: case VARCHAR: return Type.VARCHAR; Same as for CHAR(N). http://gerrit.cloudera.org:8080/#/c/14811/3/fe/src/main/java/org/apache/impala/util/FileAnalysisUtil.java File fe/src/main/java/org/apache/impala/util/FileAnalysisUtil.java: http://gerrit.cloudera.org:8080/#/c/14811/3/fe/src/main/java/org/apache/impala/util/FileAnalysisUtil.java@1 PS3, Line 1: package org.apache.impala.util; Missing apache license (this also led to the build failure) http://gerrit.cloudera.org:8080/#/c/14811/3/testdata/workloads/functional-query/queries/QueryTest/create-table-like-file-orc.test File testdata/workloads/functional-query/queries/QueryTest/create-table-like-file-orc.test: http://gerrit.cloudera.org:8080/#/c/14811/3/testdata/workloads/functional-query/queries/QueryTest/create-table-like-file-orc.test@58 PS3, Line 58: nit: extra space http://gerrit.cloudera.org:8080/#/c/14811/3/testdata/workloads/functional-query/queries/QueryTest/create-table-like-file-orc.test@74 PS3, Line 74: ==== : ---- QUERY : drop table allcomplextypes_clone_orc : ---- RESULTS : 'Table has been dropped.' Why is this needed? Other tables are not dropped + it is done automatically where we use unique_database. http://gerrit.cloudera.org:8080/#/c/14811/3/testdata/workloads/functional-query/queries/QueryTest/create-table-like-table.test File testdata/workloads/functional-query/queries/QueryTest/create-table-like-table.test: http://gerrit.cloudera.org:8080/#/c/14811/3/testdata/workloads/functional-query/queries/QueryTest/create-table-like-table.test@255 PS3, Line 255: drop table if exists allcomplextypes_clone We shouldn't need this - test_create_table_like_table() uses a unique_database, so no table exists in it and it will be automatically cleaned up after the test run. -- To view, visit http://gerrit.cloudera.org:8080/14811 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I77cd84cda2ed86516937a67eb320fd41e3f1cf2d Gerrit-Change-Number: 14811 Gerrit-PatchSet: 3 Gerrit-Owner: Norbert Luksa <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Daniel Becker <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Fri, 29 Nov 2019 14:39:46 +0000 Gerrit-HasComments: Yes
