dongjoon-hyun commented on code in PR #44786:
URL: https://github.com/apache/spark/pull/44786#discussion_r1459618369
##########
connector/avro/src/main/scala/org/apache/spark/sql/avro/AvroUtils.scala:
##########
@@ -110,10 +110,12 @@ private[sql] object AvroUtils extends Logging {
       case compressed =>
         job.getConfiguration.setBoolean("mapred.output.compress", true)
         job.getConfiguration.set(AvroJob.CONF_OUTPUT_CODEC, compressed.getCodecName)
-        if (compressed == DEFLATE) {
-          val deflateLevel = sqlConf.avroDeflateLevel
-          logInfo(s"Compressing Avro output using the $codecName codec at level $deflateLevel")
-          job.getConfiguration.setInt(AvroOutputFormat.DEFLATE_LEVEL_KEY, deflateLevel)
+        if (compressed.getSupportCompressionLevel) {
+          val level = sqlConf.getConfString(s"spark.sql.avro.$codecName.level",
+            compressed.getDefaultCompressionLevel.toString)
+          logInfo(s"Compressing Avro output using the $codecName codec at level $level")
+          val s = if (compressed == ZSTANDARD) "zstd" else codecName
Review Comment:
@beliefer May I ask the reason for this special case? To me it does not seem
necessary, because Avro's actual codec name is `zstandard`, not `zstd`.
**AVRO REPO**
https://github.com/apache/avro/blob/8d610fb5c7d3958256801848dbd80d6f9d3c556b/lang/java/avro/src/main/java/org/apache/avro/file/DataFileConstants.java#L41
```
public static final String ZSTANDARD_CODEC = "zstandard";
```
**SPARK REPO**
https://github.com/apache/spark/blob/39f8e1a5953b5897f893151d24dc585a80c0c8a0/connector/avro/src/main/java/org/apache/spark/sql/avro/AvroCompressionCodec.java#L36
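
To make the point concrete, here is a minimal sketch, not the PR's code, of deriving the level key directly from the codec name so that no `zstd` alias is needed. The key pattern `spark.sql.avro.$codecName.level` is taken from the diff above. `CodecLevelSketch`, `Codec`, `levelKey`, `resolveLevel`, the default levels, and the plain `Map` standing in for `SQLConf` are all hypothetical and used for illustration only.

```scala
// Hypothetical stand-in for Spark's AvroCompressionCodec; names and defaults are illustrative.
object CodecLevelSketch {
  // codecName is the name Avro itself uses; defaultLevel is None when no level applies.
  sealed abstract class Codec(val codecName: String, val defaultLevel: Option[Int])
  case object Uncompressed extends Codec("null", None)
  case object Deflate extends Codec("deflate", Some(-1))
  case object Zstandard extends Codec("zstandard", Some(3))

  // Build the config key from the Avro codec name as-is, with no "zstd" special case.
  def levelKey(codec: Codec): String =
    s"spark.sql.avro.${codec.codecName}.level"

  // Resolve the level from a Map standing in for SQLConf, falling back to the codec default.
  // Returns None for codecs that do not support a compression level.
  def resolveLevel(codec: Codec, conf: Map[String, String]): Option[Int] =
    codec.defaultLevel.map { default =>
      conf.get(levelKey(codec)).map(_.toInt).getOrElse(default)
    }
}
```

With this shape, `CodecLevelSketch.levelKey(CodecLevelSketch.Zstandard)` yields a key containing `zstandard`, the same string as Avro's `ZSTANDARD_CODEC` constant, so the user-facing key and the Avro codec name stay aligned.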