Hey Yong, I've created a design doc about write support: https://docs.google.com/document/d/1_KL0YptDKwhiXvJyx4Vb-yZjggrPQAW2yjeGV4C0vMU/edit
We don't have an upstream release of Impala that supports Iceberg, but you can check out and build Impala master:
https://cwiki.apache.org/confluence/display/IMPALA/Building+Impala

The Iceberg support is still under development and the syntax may change, see:
https://lists.apache.org/thread.html/re89c80c8218439a2a431fc4c0d2530522841c86858290a4bf36b9805%40%3Cdev.impala.apache.org%3E

Therefore we don't have user docs for it yet, but you can take a look at our tests to see how you can create Iceberg tables with the current DDL:
https://github.com/apache/impala/blob/master/testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test

Please note that there are other *.test files for Iceberg in the QueryTest directory as well.

I hope this helps.

Cheers,
Zoltan

On Tue, Jun 8, 2021 at 4:07 AM yong.sunny <yong.su...@163.com> wrote:

> Hi Zoltan,
>
> It looks like you have worked on the Iceberg integration with Impala. Is there
> any doc that introduces how to run Iceberg in Impala so that I can play with it?
> I am also wondering if there is a design doc.
>
> Thanks and best regards,
> Yong
>
> On 05/27/2021 16:54, Zoltán Borók-Nagy <borokna...@cloudera.com> wrote:
> Hi Yong Yang,
>
> It is supported by Iceberg, and this is exactly how Impala works.
> I.e. Impala's Parquet writer writes the data files, then we use Iceberg's
> API to append them to the table.
> You can find the relevant code here:
>
> https://github.com/apache/impala/blob/822e8373d1f1737865899b80862c2be7b07cc950/fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java#L197-L271
>
> For inserting files we use Iceberg's AppendFiles class. For overwriting a
> table/partitions we use Iceberg's ReplacePartitions class.
>
> One important thing that you need to do when writing the Parquet data
> files is to fill the 'field_id' member
> <https://github.com/apache/parquet-format/blob/473a3a7710f992b01af79095757d71e1fc68ef62/src/main/thrift/parquet.thrift#L398>
> of each schema element with the corresponding Iceberg Column ID.
>
> Cheers,
> Zoltan
>
>
> On Thu, May 27, 2021 at 7:39 AM Peter Vary <pv...@cloudera.com.invalid>
> wrote:
>
>> Hi Yong Yang,
>>
>> Your message ended up in my spam folder, with a claim that many messages
>> from @163.com are spam, but your question seems legitimate.
>>
>> With the Java API you can add Parquet files to Iceberg tables,
>> provided the files conform to the specification.
>>
>> For Parquet, take a look here:
>> http://iceberg.apache.org/spec/#parquet
>>
>> For the Java API, take a look here: https://iceberg.apache.org/api/
>>
>> Thanks, Peter
>>
>> On Wed, 19 May 2021, 18:44 yong.sunny, <yong.su...@163.com> wrote:
>>
>>> Hi Iceberg Devs,
>>>
>>> I am new to Iceberg, and I have a question about the Iceberg
>>> manifest/manifest list/metadata API.
>>> I am wondering if the following is supported:
>>> 1. The Parquet files are written by other apps.
>>> 2. The Iceberg APIs are used to create the manifest file/manifest
>>> list/metadata (snapshot). The application doing that would be a
>>> standalone one, i.e. not running in Flink or Spark.
>>>
>>> Could you please tell me if that is supported, and if it is,
>>> which APIs I should use?
>>>
>>> If I posted the question in the wrong channel, please tell me which one I
>>> should use.
>>>
>>> Thanks and best regards,
>>> Yong Yang
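
Editor's sketch: the 'field_id' requirement from Zoltan's May 27 reply can be illustrated with a small toy model. This is not Impala's or Iceberg's code and does not use a real Parquet library; `SchemaElement`, `assign_field_ids` and the column IDs below are hypothetical names and made-up values, and only a flat (non-nested) schema is assumed.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class SchemaElement:
    """Hypothetical stand-in for a Parquet schema element (name + optional field_id)."""
    name: str
    field_id: Optional[int] = None


def assign_field_ids(elements: List[SchemaElement],
                     iceberg_column_ids: Dict[str, int]) -> List[SchemaElement]:
    """Return copies of the elements with field_id set to the matching Iceberg column ID.

    Raises KeyError if any element has no column ID, since a file with a
    missing field_id would not line up with the table schema.
    """
    missing = [e.name for e in elements if e.name not in iceberg_column_ids]
    if missing:
        raise KeyError(f"no Iceberg column ID for: {missing}")
    return [replace(e, field_id=iceberg_column_ids[e.name]) for e in elements]


# Made-up column IDs, standing in for those recorded in an Iceberg table schema.
iceberg_ids = {"id": 1, "name": 2, "ts": 3}
parquet_schema = [SchemaElement("id"), SchemaElement("name"), SchemaElement("ts")]
with_ids = assign_field_ids(parquet_schema, iceberg_ids)
```

In a real writer the IDs would come from the Iceberg table schema and be written into the Parquet footer; the sketch only shows the name-to-ID mapping step, failing early when a column has no ID.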