Thanks Eduard and Ryan.

I use Spark on a K8s cluster to write Parquet to S3 and then add an
external table in the Hive Metastore for that Parquet data. When I move
to Iceberg, I would prefer to keep using the Hive Metastore, since it is
my centralized metastore for batch and streaming datasets. I don't see
the Hive Metastore covered in the Iceberg AWS integration docs at
https://iceberg.apache.org/aws/. Is there another link for that?
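
For concreteness, here is a rough sketch of the setup I have in mind. It
only builds the Spark properties for an Iceberg catalog backed by the Hive
Metastore with an S3 warehouse, assuming the catalog property names from
the Iceberg Spark configuration docs. The catalog name, thrift URI, and
bucket are placeholders, and whether `io-impl` with S3FileIO works with a
Hive catalog in my Iceberg version is an assumption on my side.

```python
def hive_catalog_conf(catalog, metastore_uri, warehouse):
    """Build Spark properties for an Iceberg catalog backed by the Hive Metastore."""
    prefix = f"spark.sql.catalog.{catalog}"
    return {
        # Iceberg's Spark catalog implementation
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        # Track table metadata locations in the Hive Metastore
        f"{prefix}.type": "hive",
        f"{prefix}.uri": metastore_uri,
        # Default location for new tables (an S3 path here)
        f"{prefix}.warehouse": warehouse,
        # Assumption: use the S3FileIO class mentioned in this thread
        f"{prefix}.io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
    }


# Placeholder values, not real endpoints
conf = hive_catalog_conf(
    "hive_prod",
    "thrift://metastore-host:9083",
    "s3://my-bucket/warehouse",
)

# Print the properties as spark-submit / spark-sql flags
for key, value in conf.items():
    print(f"--conf {key}={value}")
```

The same dictionary could be passed to a SparkSession builder instead of
the command line.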

Most of the examples use Spark SQL to write/read Iceberg. For example,
there is no equivalent of SQL "MERGE INTO" in the Spark API. Is Spark SQL
preferred over the Spark DataFrame/Dataset API in Iceberg? If so, could
you clarify the rationale behind that? I personally feel the Spark API is
more developer-friendly and scalable. Thanks very much!


On Mon, Aug 9, 2021 at 8:53 AM Ryan Blue <b...@tabular.io> wrote:

> Lian,
>
> Iceberg tables work great in S3. When creating the table, just pass the
> `LOCATION` clause with an S3 path, or set your catalog's warehouse location
> to S3 so tables are automatically created there.
>
> The only restriction for S3 is that you need a metastore to track the
> table metadata location because S3 doesn't have a way to implement a
> metadata commit. For a metastore, there are implementations backed by the
> Hive MetaStore, Glue/DynamoDB, and Nessie. And the upcoming release adds
> support for DynamoDB without Glue and JDBC.
>
> Ryan
>
> On Mon, Aug 9, 2021 at 2:24 AM Eduard Tudenhoefner <edu...@dremio.com>
> wrote:
>
>> Lian you can have a look at https://iceberg.apache.org/aws/. It should
>> contain all the info that you need. The codebase contains an *S3FileIO* class,
>> which is an implementation that is backed by S3.
>>
>> On Mon, Aug 9, 2021 at 7:37 AM Lian Jiang <jiangok2...@gmail.com> wrote:
>>
>>> I am reading https://iceberg.apache.org/spark-writes/#spark-writes and
>>> wondering if it is possible to create an iceberg table on S3. This guide
>>> seems to say only write to a hive table (backed up by HDFS if I understand
>>> correctly). Hudi and Delta can write to s3 with a specified S3 path. How
>>> can I do it using iceberg? Thanks for any clarification.
>>>
>>>
>>>
>
> --
> Ryan Blue
> Tabular
>

