[Impala-ASF-CR] IMPALA-9741: Supported query Icebreg table by impala

wangsheng (Code Review) Wed, 15 Jul 2020 23:26:48 -0700

wangsheng has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16143 )


Change subject: IMPALA-9741: Supported query Icebreg table by impala
......................................................................


Patch Set 8:

(13 comments)

Done!

http://gerrit.cloudera.org:8080/#/c/16143/6/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

http://gerrit.cloudera.org:8080/#/c/16143/6/common/thrift/CatalogObjects.thrift@512
PS6, Line 512: column_to_sourc
> nit: column_to_source_id ?
Done


http://gerrit.cloudera.org:8080/#/c/16143/6/common/thrift/CatalogObjects.thrift@515
PS6, Line 515: source_id_to_partition
> The mapping is reversed. Name it "source_id_to_partition" ?
Done


http://gerrit.cloudera.org:8080/#/c/16143/6/common/thrift/CatalogObjects.thrift@516
PS6, Line 516: map<string,THdfsFileDesc> path_md5_to_file
> Please follow the above conventions for naming maps.
Done


http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java
File fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java:

http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java@28
PS6, Line 28:   // The id of the source column in the Iceberg table schema. The 
source column is
            :   // used as the input for this partition field.
> Might worth rewording it a bit:
Done


http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java
File fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java:

http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java@88
PS6, Line 88:         if (table_ instanceof FeIcebergTable) {
            :           if (((FeIcebergTable) 
table_).getSourceIdToPartitionMap().isEmpty()) {
            :             notPartitioned = true;
            :           }
> Probably we should treat all Iceberg tables as not partitioned, since it's
Yes, you are right, we treated iceberg table as unpartitioned hdfs table, but 
iceberg table still has it's own partition info, we get this info by 'show 
partitions xxx.iceberg_table_test' like this:

+--------------+-----------+----------+------------+---------------------------+
| Partition Id | Source Id | Field Id | Field Name | Field Partition Transform |
+--------------+-----------+----------+------------+---------------------------+
| 0            | 2         | 1000     | sex        | IDENTITY                  |
| 0            | 3         | 1001     | action     | IDENTITY                  |
+--------------+-----------+----------+------------+---------------------------+

If I set 'notPartitioned' as true, even if getPartitionColToSourceIdMap() is 
not empty, how can I get the iceberg partition info? 'show partitions 
xxx.iceberg_table_test' will always return AnalysisException.


http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@66
PS6, Line 66: getPathMD5ToFi
> nit: getPartitionToFileDescMap
Done


http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@219
PS6, Line 219: isPartitioned(Fe
> nit: isPartitioned?
Done


http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@258
PS6, Line 258: PartitionColToSourceId
> It returns a mapping from source ids to partition columns, therefore please
Done


http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@271
PS6, Line 271: getColumnToSourc
> nit: getColumnToSourceIdMap?
Done


http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@305
PS6, Line 305:   
> nit: wrong indentation
Done


http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/util/IcebergUtil.java
File fe/src/main/java/org/apache/impala/util/IcebergUtil.java:

http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@114
PS6, Line 114:     if ("PARQUET".equalsIgnoreCase(format)) return 
TIcebergFileFormat.PARQUET;
             :     return null;
             :   }
             :
             :   /**
             :    * Build TIceb
> How about:
Done


http://gerrit.cloudera.org:8080/#/c/16143/6/testdata/bin/generate-schema-statements.py
File testdata/bin/generate-schema-statements.py:

http://gerrit.cloudera.org:8080/#/c/16143/6/testdata/bin/generate-schema-statements.py@193
PS6, Line 193:   }
> You probably don't need to modify this file. I think adding HUDIPARQUET to
Done


http://gerrit.cloudera.org:8080/#/c/16143/6/testdata/bin/generate-schema-statements.py@766
PS6, Line 766:
> flake8: E501 line too long (94 > 90 characters)
Done



--
To view, visit http://gerrit.cloudera.org:8080/16143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
Gerrit-Change-Number: 16143
Gerrit-PatchSet: 8
Gerrit-Owner: wangsheng <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: wangsheng <[email protected]>
Gerrit-Comment-Date: Thu, 16 Jul 2020 06:26:20 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-9741: Supported query Icebreg table by impala

Reply via email to