wangsheng has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16446 )

Change subject: IMPALA-10164: Supporting HadoopCatalog for Iceberg table
......................................................................


Patch Set 15:

> Hi WangSheng,
 >
 > Yes, it's what I meant, I only have minor corrections, answered
 > inline:
 >
 > > 1. We prohibit explicitly setting the table LOCATION when using
 > 'hadoop.catalog'
 > Yes, I think we should prohibit it and let Iceberg decide the
 > location.
 >
 > > 2. When execute 'SHOW CREATE TABLE', table location is
 > 'hdfs://test-warehouse/ice-catalog/default/ice_t'
 > If we prohibit setting table location, then we can't output it in
 > SHOW CREATE TABLE.
 >
 > > 3. When execute 'DESCRIBE FORMATTED', table location is
 > 'hdfs://test-warehouse/ice-catalog/default/ice_t', and
 > 'iceberg.catalog_location' is 'hdfs://test-warehouse/ice-catalog'
 > Yes.
 >
 > > 4. When execute 'DROP TABLE', we only delete 
 > > 'hdfs://test-warehouse/ice-catalog/default/ice_t',
 > and reserve 'hdfs://test-warehouse/ice-catalog'
 > Yes.
 >
 > > 5. 'iceberg.catalog_location' is necessary for both managed and
 > external 'hadoop.catalog' Iceberg table
 > Yes. Maybe later we could also have a default location somewhere
 > under the warehouse directory. But for now I think explicit catalog
 > locations are better.
 >
 > > 6. When loading managed table, we use 
 > > load('iceberg.catalog_location',TableIdentifier(db.tbl))
 > Yes.
 >
 > > 7. When loading external table, we use 
 > > load('iceberg.catalog_location',TableIdentifier('iceberg.table_name'))
 > Yes, maybe rename 'iceberg.table_name' to 'iceberg.table_identifier'?
 > And the value of it should be a dot-separated string of names that
 > was originally used to create the Iceberg table. Iceberg can
 > already make table identifiers from dot-separated strings:
 > https://github.com/apache/iceberg/blob/1a797cd986c9193f2e422fec295aaf25ce7e1916/api/src/main/java/org/apache/iceberg/catalog/TableIdentifier.java#L47
 >
 > >
 > > Does my understanding correct?
 > >
 > > And still some questions:
 > >
 > > 1. When creating external 'hadoop.catalog' Iceberg table, are we
 > load table by 'iceberg.catalog_location' and 'iceberg.table_name'?
 > If so, both these two properties are necessary.
 > So instead of 'iceberg.table_name' we could have 'iceberg.table_identifier',
 > just to use Iceberg terminology and semantics. Yeah, probably we
 > want it to be a required property for EXTERNAL hadoop catalog
 > tables. I just wonder if it would make sense to infer it from
 > 'db_name.tbl_name'. But let's start with it as a required property,
 > then later we'll se if inferring it makes sense and add support for
 > it in a future patch.
 >
 > > 2. Can we set the table LOCATION for 'hadoop.catalog' external
 > table? If not, how do we define table LOCATION? Using
 > 'iceberg.catalog_location'+/db/external_tbl? Or use
 > 'iceberg.catalog_location'+'iceberg.table_name'?
 > I think we should not set LOCATION. Iceberg uses catalog location +
 > table identifier to find and load tables, so I think we should
 > provide those:
 > 'iceberg.catalog_location'+'iceberg.table_identifier'
 > And if Iceberg can load the table with the given information, we
 > can set the actual table location. If the table is not found, we
 > should raise an error.

Hi Zoltan, thanks for your careful review and patient reply. I've already 
modify code according  to your advice. A Jenkins test has already been 
submitted: https://jenkins.impala.io/job/pre-review-test/721/


--
To view, visit http://gerrit.cloudera.org:8080/16446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic1893c50a633ca22d4bca6726c9937b026f5d5ef
Gerrit-Change-Number: 16446
Gerrit-PatchSet: 15
Gerrit-Owner: wangsheng <sky...@163.com>
Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: wangsheng <sky...@163.com>
Gerrit-Comment-Date: Mon, 28 Sep 2020 07:45:19 +0000
Gerrit-HasComments: No

Reply via email to