[
https://issues.apache.org/jira/browse/SPARK-45855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Robertson updated SPARK-45855:
----------------------------------
Description:
Hi,
We've discovered that code which worked in Spark 3.3.0 no longer works in
3.4.0. I can't find anything in the release notes that explains the change, so
I wonder if this is a bug.
Thank you for looking.
Here we're using our own custom codec, but we noticed we can't set gzip either.
{{SparkConf conf = spark.sparkContext().conf();}}
{{conf.set("hive.exec.compress.output", "true");}}
{{conf.set("mapred.output.compression.codec", D2Codec.class.getName());}}
{{spark.sql("CREATE TABLE b AS SELECT id FROM a");}}
This creates the table but writes uncompressed files, whereas Spark 3.3.0
wrote compressed files.
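For anyone trying to reproduce this with the gzip case, one way to confirm the
symptom is to look at the first two bytes of an output file: gzip streams start
with the magic number 0x1f 0x8b. The following is only a minimal sketch under
some assumptions: GzipCheck and looksGzipped are hypothetical names (not part
of Spark or Hive), the table is stored as plain text files, the file has been
copied to a local filesystem first, and Java 11+ is available for readNBytes.
It says nothing about output from a custom codec such as D2Codec, which may use
a different header.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical diagnostic helper, not part of Spark: checks for the gzip magic number. */
final class GzipCheck {
    private GzipCheck() {}

    /** Returns true if the given bytes begin with the gzip magic number 0x1f 0x8b. */
    static boolean looksGzipped(byte[] header) {
        return header != null
            && header.length >= 2
            && (header[0] & 0xff) == 0x1f
            && (header[1] & 0xff) == 0x8b;
    }

    /** Reads the first two bytes of a local file and applies the same check. */
    static boolean looksGzipped(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            // readNBytes returns fewer than 2 bytes only if the file is shorter than that.
            return looksGzipped(in.readNBytes(2));
        }
    }
}
```

Calling GzipCheck.looksGzipped(path) on a part file written by each Spark
version should then show whether the gzip setting took effect for that run.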
Any advice is appreciated and I can help run tests. We run Spark on K8S using
the stackable.tech distribution.
was:
Hi,
We've discovered code that worked in Spark 3.3.0 doesn't in 3.4.0. I can't find
anything in the release notes to indicate why, so I wonder if this is a bug.
Thank you for looking.
Here we're using our own custom codec, but we noticed we can't set gzip either.
{{ SparkConf conf = spark.sparkContext().conf();}}
{{ conf.set("hive.exec.compress.output", "true");}}
{{ conf.set("mapred.output.compression.codec", D2Codec.class.getName()); }}
{{ spark.sql("CREATE TABLE b AS SELECT id FROM a");}}
Any advice is appreciated and I can help run tests. We run Spark on K8S using
the stackable.tech distribution.
> Unable to set codec for Hive CTAS
> ---------------------------------
>
> Key: SPARK-45855
> URL: https://issues.apache.org/jira/browse/SPARK-45855
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.4.0
> Environment: Spark 3.4.0
> Stackable.tech release 23.7.0 which runs spark on K8s.
> Reporter: Tim Robertson
> Priority: Major