Hi Zoltan, it looks like you have worked on the Iceberg integration with Impala. Is there any documentation that introduces how to run Iceberg in Impala, so that I can try it out? I am also wondering whether there is a design doc.
Thanks and best regards,
Yong

On 05/27/2021 16:54, Zoltán Borók-Nagy wrote:

Hi Yong Yang,

It is supported by Iceberg, and this is exactly how Impala works: Impala's Parquet writer writes the data files, then we use Iceberg's API to append them to the table. You can find the relevant code here:
https://github.com/apache/impala/blob/822e8373d1f1737865899b80862c2be7b07cc950/fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java#L197-L271

For inserting files we use Iceberg's AppendFiles class. For overwriting a table or partitions we use Iceberg's ReplacePartitions class.

One important thing you need to do when writing the Parquet data files is to fill the 'field_id' member of each schema element with the corresponding Iceberg column ID.

Cheers,
Zoltan

On Thu, May 27, 2021 at 7:39 AM Peter Vary <pv...@cloudera.com.invalid> wrote:

Hi Yong Yang,

Your message ended up in my spam folder, with a warning that many messages from @163.com are spam, but your question seems legitimate.

With the Java API you can add Parquet files to Iceberg tables, as long as the files conform to the specification.
For Parquet, take a look here: http://iceberg.apache.org/spec/#parquet
For the Java API, take a look here: https://iceberg.apache.org/api/

Thanks,
Peter

On Wed, 19 May 2021, 18:44 yong.sunny, <yong.su...@163.com> wrote:

Hi Iceberg Devs,

I am new to Iceberg, and I have a question about the Iceberg manifest / manifest list / metadata API. I am wondering if the following is supported:

1. The Parquet files are written by other applications.
2. The Iceberg APIs are used to create the manifest file, manifest list, and metadata (snapshot).

The application doing this would be standalone, that is, not running inside Flink or Spark.

Could you please tell me whether that is supported and, if so, which APIs I should use? If I have posted this question in the wrong channel, please tell me which one I should use.
Thanks and Best regards, Yong Yang
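
For readers following the thread: the 'field_id' requirement Zoltan mentions can be pictured with a small toy model. This is only a sketch under stated assumptions, not Impala's or Iceberg's actual code. `SchemaElement` and `assignFieldIds` are hypothetical stand-ins (not real Parquet or Iceberg classes), the schema is assumed to be flat, and columns are assumed to be matched to Iceberg column IDs by name.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FieldIdSketch {
    // Hypothetical stand-in for one element of a Parquet file schema.
    static final class SchemaElement {
        final String name;
        final Integer fieldId; // corresponds to the Iceberg column ID

        SchemaElement(String name, Integer fieldId) {
            this.name = name;
            this.fieldId = fieldId;
        }
    }

    // Builds schema elements for the given Parquet columns, filling field_id
    // from a name -> Iceberg column ID map (assumption: flat schema, match by name).
    static List<SchemaElement> assignFieldIds(List<String> parquetColumns,
                                              Map<String, Integer> icebergColumnIds) {
        List<SchemaElement> elements = new ArrayList<>();
        for (String column : parquetColumns) {
            Integer id = icebergColumnIds.get(column);
            if (id == null) {
                // Refuse to produce an element without a field_id.
                throw new IllegalArgumentException(
                        "No Iceberg column ID for Parquet column: " + column);
            }
            elements.add(new SchemaElement(column, id));
        }
        return elements;
    }

    public static void main(String[] args) {
        // Illustrative IDs only; real IDs come from the Iceberg table schema.
        Map<String, Integer> ids = new LinkedHashMap<>();
        ids.put("id", 1);
        ids.put("name", 2);
        for (SchemaElement e : assignFieldIds(List.of("id", "name"), ids)) {
            System.out.println(e.name + " -> field_id " + e.fieldId);
        }
    }
}
```

In a real writer the IDs would be taken from the Iceberg table's schema and written into the Parquet footer. Once the files exist, they would be registered through the AppendFiles or ReplacePartitions operations mentioned above.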