Re: Hive table compatibility for Iceberg readers

2022-03-09 Thread Walaa Eldin Moustafa
The union type conversion PR is up: https://github.com/apache/iceberg/pull/4242.

Thanks,
Walaa

On Fri, Feb 11, 2022 at 8:53 AM Walaa Eldin Moustafa wrote:
> Thanks Ryan! Yes there is an active discussion on the PR on the spec aspect.
>
> On Fri, Feb 11, 2022 at 8:47 AM Ryan Blue wrote:
> >

Re: Hive table compatibility for Iceberg readers

2022-02-11 Thread Walaa Eldin Moustafa
Thanks Ryan! Yes there is an active discussion on the PR on the spec aspect.

On Fri, Feb 11, 2022 at 8:47 AM Ryan Blue wrote:
> Sounds great. Thanks for the update! That PR is on my list to take a look at, but I still recommend starting with the spec changes. For example, how should default

Re: Hive table compatibility for Iceberg readers

2022-02-11 Thread Ryan Blue
Sounds great. Thanks for the update! That PR is on my list to take a look at, but I still recommend starting with the spec changes. For example, how should default values be stored in Iceberg metadata for each type? Currently, the spec changes just mention defaults without going into detail about h

Re: Hive table compatibility for Iceberg readers

2022-02-09 Thread Walaa Eldin Moustafa
Thanks Ryan and Owen! Glad we have converged on this. Next steps for us:

* Continuing the discussion on the default value PR (already ongoing [1]).
* Filing the union type conversion PR (ETA end of next week).
* Moving listing-based Hive table scan using Iceberg to a separate repo (likely open sou

Re: Hive table compatibility for Iceberg readers

2022-02-02 Thread Ryan Blue
Walaa, thanks for this list. I think most of these are definitely useful. I think the best one to focus on first is the default values, since those will make Iceberg tables behave more like standard SQL tables, which is the goal. I'm really curious to learn more about #1, but I don't think that I

Re: Hive table compatibility for Iceberg readers

2022-01-31 Thread Owen O'Malley
On Thu, Jan 27, 2022 at 10:26 PM Walaa Eldin Moustafa wrote:
> *2. Iceberg schema lower casing:* Before Iceberg, when users read Hive tables from Spark, the returned schema is lowercase since Hive stores all metadata in lowercase mode. If users move to Iceberg, such readers could break once
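
To illustrate the lowercasing concern quoted above, here is a minimal sketch in plain Python. It does not use the Hive, Spark, or Iceberg APIs. The column names and the `resolve` helper are hypothetical, and the sketch assumes the Iceberg-side schema keeps mixed-case names while the Hive-side schema is reported in lowercase.

```python
# Hypothetical column names; they do not come from the thread.
hive_schema = ["memberid", "pagekey"]      # lowercase, as Hive metadata would report
iceberg_schema = ["memberId", "pageKey"]   # assumed to keep the original casing


def resolve(column, schema, case_sensitive=True):
    """Return the schema's own name for `column`, or None if it cannot be resolved."""
    if case_sensitive:
        return column if column in schema else None
    matches = [name for name in schema if name.lower() == column.lower()]
    # Zero or several case-insensitive matches are both treated as unresolved.
    return matches[0] if len(matches) == 1 else None


# A reader written against the lowercase Hive schema asks for "memberid".
reader_column = "memberid"
hive_result = resolve(reader_column, hive_schema)
strict_result = resolve(reader_column, iceberg_schema)
relaxed_result = resolve(reader_column, iceberg_schema, case_sensitive=False)

print(hive_result, strict_result, relaxed_result)
```

Under these assumptions, the lookup succeeds against the lowercase schema, fails under exact matching against the mixed-case schema, and succeeds again once names are compared case-insensitively.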

Re: Hive table compatibility for Iceberg readers

2022-01-31 Thread Walaa Eldin Moustafa
Hi everyone, bumping up this thread. Does it help to discuss each feature individually? I am guessing (4. Default value support) is straightforward since it is on the roadmap already. If not, let us start with this. As for the next step, it would be great if we can focus on (1. Reading tables wit

Hive table compatibility for Iceberg readers

2022-01-27 Thread Walaa Eldin Moustafa
Hi Iceberg community, We have been working on converting our tables from the Hive table format to Iceberg. In order to achieve that switch transparently, we have introduced a number of Hive table features and compatibility modes in Iceberg, and connected them to Spark DataSource API. At a high lev