Russell is right. The property you're trying to set is a table property, so it needs to be set on the table itself.
We don't currently support overriding arbitrary table properties in write
options, mainly because we want to encourage people to set their
configuration on the table instead of in jobs. That's a best practice that I
highly recommend so you don't need to configure every job that writes to the
table, and so you can make changes and have them automatically take effect
without recompiling your write job.

On Fri, Mar 5, 2021 at 8:44 AM Russell Spitzer <russell.spit...@gmail.com> wrote:

> I believe those are currently only respected as table properties and not
> as "spark write" properties, although there is a case to be made that we
> should accept them there as well. You can alter your table so that it
> contains those properties, and new files will be created with the
> compression you would like.
>
> On Mar 5, 2021, at 7:15 AM, Javier Sanchez Beltran
> <jabelt...@expediagroup.com.INVALID> wrote:
>
> Hello Iceberg team!
>
> I have been researching Apache Iceberg to see how it would work in our
> environment. We are still trying things out. We would like to use the
> Parquet format with the SNAPPY compression type.
>
> I already tried changing these two properties to SNAPPY, but it didn't
> work (https://iceberg.apache.org/configuration/):
>
>     write.avro.compression-codec:    gzip -> SNAPPY
>     write.parquet.compression-codec: gzip -> SNAPPY
>
> Like this:
>
>     dataset
>         .writeStream()
>         .format("iceberg")
>         .outputMode("append")
>         .option("write.parquet.compression-codec", "SNAPPY")
>         .option("write.avro.compression-codec", "SNAPPY")
>         …start()
>
> Did I do something wrong? Or do we need to take care of the SNAPPY
> compression implementation ourselves?
>
> Thank you in advance,
> Javier.

--
Ryan Blue
Software Engineer
Netflix
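[Editor's note: setting the codecs as table properties, as Russell suggests, can be done once with an ALTER TABLE statement. Below is a minimal sketch, assuming Spark 3 with an Iceberg catalog already configured; the table name db.events is hypothetical and not from the thread.]

    import org.apache.spark.sql.SparkSession;

    public class SetIcebergCompression {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("set-iceberg-compression")
                    .getOrCreate();

            // Set the codecs once on the table; every subsequent write to it
            // (batch or streaming) picks them up without any job changes.
            spark.sql(
                "ALTER TABLE db.events SET TBLPROPERTIES ("
                    + "'write.parquet.compression-codec' = 'snappy', "
                    + "'write.avro.compression-codec' = 'snappy')");
        }
    }

[With the properties set on the table, the writeStream() call quoted above should produce Snappy-compressed data files with the two .option(...) lines removed.]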