Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14811 )

Change subject: IMPALA-8046: Support CREATE TABLE from an ORC file
......................................................................


Patch Set 3:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/14811/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14811/3//COMMIT_MSG@19
PS3, Line 19: CREATE TABLE tablename LIKE ORC '/path/to/file'
CREATE TABLE LIKE without file format name always means Parquet? It would be 
good to write this down in the commit message, and probably somewhere in the 
tests.


http://gerrit.cloudera.org:8080/#/c/14811/3/fe/src/main/java/org/apache/impala/analysis/ORCSchemeExtractor.java
File fe/src/main/java/org/apache/impala/analysis/ORCSchemeExtractor.java:

http://gerrit.cloudera.org:8080/#/c/14811/3/fe/src/main/java/org/apache/impala/analysis/ORCSchemeExtractor.java@35
PS3, Line 35: import org.apache.orc.OrcFile;
            : import org.apache.orc.OrcFile.ReaderOptions;
            : import org.apache.orc.Reader;
            : import org.apache.orc.TypeDescription;
            : import org.apache.orc.TypeDescription.Category;
Did we already have this dependency? It would be good to know where do we get 
the orc java dependencies from, e.g. does it come from Hive? In that case our 
java Orc version depends on the Hive version, while c++ Orc version depends on 
native-toolchain.


http://gerrit.cloudera.org:8080/#/c/14811/3/fe/src/main/java/org/apache/impala/analysis/ORCSchemeExtractor.java@80
PS3, Line 80:       case BYTE: return Type.TINYINT;
I saw different type names in https://orc.apache.org/docs/types.html, e.g. this 
was tiny int. Maybe the Orc types are mapped to java types by the library? If 
yes, then this could be mentioned in a comment.


http://gerrit.cloudera.org:8080/#/c/14811/3/fe/src/main/java/org/apache/impala/analysis/ORCSchemeExtractor.java@81
PS3, Line 81:       case CHAR: return Type.CHAR;
For CHAR(N) we should need the N parameter too. I also didn't see any test for 
this.


http://gerrit.cloudera.org:8080/#/c/14811/3/fe/src/main/java/org/apache/impala/analysis/ORCSchemeExtractor.java@83
PS3, Line 83:       case DECIMAL: return Type.DECIMAL;
This seems incomplete to me, we should need precision and scale too.


http://gerrit.cloudera.org:8080/#/c/14811/3/fe/src/main/java/org/apache/impala/analysis/ORCSchemeExtractor.java@91
PS3, Line 91:       case VARCHAR: return Type.VARCHAR;
Same as for CHAR(N).


http://gerrit.cloudera.org:8080/#/c/14811/3/fe/src/main/java/org/apache/impala/util/FileAnalysisUtil.java
File fe/src/main/java/org/apache/impala/util/FileAnalysisUtil.java:

http://gerrit.cloudera.org:8080/#/c/14811/3/fe/src/main/java/org/apache/impala/util/FileAnalysisUtil.java@1
PS3, Line 1: package org.apache.impala.util;
Missing apache license (this also led to the build failure)


http://gerrit.cloudera.org:8080/#/c/14811/3/testdata/workloads/functional-query/queries/QueryTest/create-table-like-file-orc.test
File 
testdata/workloads/functional-query/queries/QueryTest/create-table-like-file-orc.test:

http://gerrit.cloudera.org:8080/#/c/14811/3/testdata/workloads/functional-query/queries/QueryTest/create-table-like-file-orc.test@58
PS3, Line 58:
nit: extra space


http://gerrit.cloudera.org:8080/#/c/14811/3/testdata/workloads/functional-query/queries/QueryTest/create-table-like-file-orc.test@74
PS3, Line 74: ====
            : ---- QUERY
            : drop table allcomplextypes_clone_orc
            : ---- RESULTS
            : 'Table has been dropped.'
Why is this needed? Other tables are not dropped + it is done automatically 
where we use unique_database.


http://gerrit.cloudera.org:8080/#/c/14811/3/testdata/workloads/functional-query/queries/QueryTest/create-table-like-table.test
File 
testdata/workloads/functional-query/queries/QueryTest/create-table-like-table.test:

http://gerrit.cloudera.org:8080/#/c/14811/3/testdata/workloads/functional-query/queries/QueryTest/create-table-like-table.test@255
PS3, Line 255: drop table if exists allcomplextypes_clone
We shouldn't need this - test_create_table_like_table() uses a unique_database, 
so no table exists in it and it will be automatically cleaned up after the test 
run.



--
To view, visit http://gerrit.cloudera.org:8080/14811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I77cd84cda2ed86516937a67eb320fd41e3f1cf2d
Gerrit-Change-Number: 14811
Gerrit-PatchSet: 3
Gerrit-Owner: Norbert Luksa <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Fri, 29 Nov 2019 14:39:46 +0000
Gerrit-HasComments: Yes

Reply via email to