Agreed on that - it will be very cool to have Iceberg supported at this
(query) level.
On 3/30/23 11:42 AM, Ian Maxon wrote:
This will be great to support and the changes required are not invasive or
radical in any way, so it's a win on both fronts.
Really looking forward to this being accepted and merged in.
On Mar 30, 2023 at 03:53:15, Hari Kishore Chaparala <[email protected]>
wrote:
Initiating the discussion thread proposing a new external dataset feature
in AsterixDB.
*Feature:* External dataset support for reading Apache Iceberg tables
*Details:* Apache Iceberg is a table format for huge analytic tables. It
allows time travel queries, partitioning, and fast query planning from its
efficient tree-like metadata format, among several other features
(https://iceberg.apache.org/docs/latest/). As part of Iceberg-AsterixDB
integration, we first plan to support reading from Iceberg format version 1
tables with AsterixDB as the query engine, utilizing our inherent read
parallelization. The Iceberg table details will be specified in the
external dataset DDL, and all queries will fetch the data from the latest
Iceberg table snapshot. At present, the AWS S3 and HDFS adapters can be used to
read Iceberg tables with data files in Parquet format.
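
To illustrate what "the latest Iceberg table snapshot" refers to, here is a
minimal Python sketch. It is not the AsterixDB implementation. It only assumes
the table-metadata field names from the Iceberg table spec
("current-snapshot-id", "snapshots", "snapshot-id", "manifest-list"); the
sample metadata, the bucket paths, and the helper name current_snapshot are
hypothetical and exist purely for illustration.

```python
import json

# Hypothetical, hand-written excerpt of an Iceberg table metadata file.
# Only the fields needed to resolve the current snapshot are included.
SAMPLE_METADATA = json.loads("""
{
  "format-version": 1,
  "location": "s3://example-bucket/warehouse/db/events",
  "current-snapshot-id": 2,
  "snapshots": [
    {"snapshot-id": 1, "timestamp-ms": 1680000000000,
     "manifest-list": "s3://example-bucket/warehouse/db/events/metadata/snap-1.avro"},
    {"snapshot-id": 2, "timestamp-ms": 1680100000000,
     "manifest-list": "s3://example-bucket/warehouse/db/events/metadata/snap-2.avro"}
  ]
}
""")


def current_snapshot(metadata):
    """Return the snapshot entry referenced by 'current-snapshot-id'.

    Returns None when the table has no current snapshot (the id is
    missing, null, or -1).
    """
    current_id = metadata.get("current-snapshot-id")
    if current_id is None or current_id == -1:
        return None
    for snapshot in metadata.get("snapshots", []):
        if snapshot["snapshot-id"] == current_id:
            return snapshot
    raise ValueError("current-snapshot-id %r not found in snapshots" % current_id)


if __name__ == "__main__":
    snap = current_snapshot(SAMPLE_METADATA)
    # The manifest list of this snapshot leads to the manifests, which in
    # turn list the Parquet data files a reader would scan.
    print(snap["snapshot-id"], snap["manifest-list"])
```

A reader that always resolves the snapshot this way never sees older table
states, which matches the "latest snapshot only" behaviour described above;
time travel would instead select a snapshot by id or timestamp.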
*Changeset*: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/17419
*APE*: https://cwiki.apache.org/confluence/display/ASTERIXDB/APE+1%3A+Iceberg+API+Integration