Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/15797 )
Change subject: IMPALA-9688: Support create iceberg table by impala
......................................................................

Patch Set 11:

(5 comments)

Thanks a ton for the changes and tests!

http://gerrit.cloudera.org:8080/#/c/15797/17/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
File fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java:

http://gerrit.cloudera.org:8080/#/c/15797/17/fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java@569
PS17, Line 569: "a managed Iceberg"

http://gerrit.cloudera.org:8080/#/c/15797/11/fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java
File fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java:

http://gerrit.cloudera.org:8080/#/c/15797/11/fe/src/main/java/org/apache/impala/catalog/HdfsFileFormat.java@236
PS11, Line 236: case ICEBERG:
> Yes, Avro, ORC, or Parquet are all splittable. But we found that it's diffi
Do you have a design doc about it? E.g. a google doc that you could share with the community via an email to the dev@impala channel? We'd be happy to review it.

I thought we'd just use Iceberg as a library to process the metadata, prune partitions, and select the data files that Impala needs to scan. But probably I'm missing something since I don't really know Iceberg.

http://gerrit.cloudera.org:8080/#/c/15797/17/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/15797/17/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@64
PS17, Line 64: icebergTableName_
Shouldn't this be renamed to icebergTableLocation_ here and in the thrift structure as well? Or is it an hdfs location for "File System Tables" and table name for "Metastore Tables"?

http://gerrit.cloudera.org:8080/#/c/15797/17/fe/src/main/java/org/apache/impala/util/IcebergUtil.java
File fe/src/main/java/org/apache/impala/util/IcebergUtil.java:

http://gerrit.cloudera.org:8080/#/c/15797/17/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@24
PS17, Line 24: import org.apache.iceberg.hadoop.HadoopTables;
: import org.apache.iceberg.PartitionField;
So initially it'll only support "File System Tables", and not "Metastore Tables"?

https://iceberg.apache.org/spec/#file-system-tables
https://iceberg.apache.org/spec/#metastore-tables

http://gerrit.cloudera.org:8080/#/c/15797/17/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@66
PS17, Line 66:
nit: tableLocation? Or "tableIdentifier" if we'll support "Metastore Tables" later? But looking at Iceberg's code, "Metastore Tables" may not even be a thing yet.

--
To view, visit http://gerrit.cloudera.org:8080/15797
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8d85db4c904a8c758c4cfb4f19cfbdab7e6ea284
Gerrit-Change-Number: 15797
Gerrit-PatchSet: 11
Gerrit-Owner: wangsheng <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: wangsheng <[email protected]>
Gerrit-Comment-Date: Mon, 11 May 2020 15:34:40 +0000
Gerrit-HasComments: Yes
