Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/17024#discussion_r102418860
--- Diff: core/src/main/scala/org/apache/spark/io/CompressionCodec.scala ---
@@ -95,6 +95,7 @@ private[spark] object CompressionCodec {
   val FALLBACK_COMPRESSION_CODEC = "snappy"
   val DEFAULT_COMPRESSION_CODEC = "lz4"
   val ALL_COMPRESSION_CODECS = shortCompressionCodecNames.values.toSeq
+  val ALL_COMPRESSION_CODECS_SHORT: Set[String] = shortCompressionCodecNames.keySet
--- End diff ---
Instead of exposing this and supporting only short codec names for
checkpointing, the pattern should be the same as in the rest of the Spark
code when dealing with codecs:
```
conf.getOption("spark.checkpoint.compress.codec").map { c =>
  logInfo(s"Compressing checkpoint using $c.")
  // Wrap the file stream with the codec so both branches yield an OutputStream.
  CompressionCodec.createCodec(conf, c).compressedOutputStream(fileStream)
}.getOrElse(fileStream)
```
This will ensure that support for checkpoint compression is in line with the
rest of Spark: both short codec names and fully qualified class names are
accepted, and there is no need to introduce a special 'none' value.
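For illustration, both name forms resolve through
`CompressionCodec.createCodec`. A minimal sketch (it has to live inside the
`org.apache.spark` package, since `CompressionCodec` is `private[spark]`):
```scala
import org.apache.spark.SparkConf
import org.apache.spark.io.CompressionCodec

val conf = new SparkConf()
// Short name, resolved via the shortCompressionCodecNames map:
val byShortName = CompressionCodec.createCodec(conf, "lz4")
// Fully qualified class name, instantiated reflectively:
val byClassName = CompressionCodec.createCodec(conf, "org.apache.spark.io.LZ4CompressionCodec")
```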
Note: you will need to change `fileStream` to a `lazy val` so that, if codec
creation throws an exception, we don't leave a dangling stream around (and the
visibility of `fileStream` stays limited to that block).
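To make the note concrete, here is a minimal sketch of how the pieces fit
together; the helper name `checkpointOutputStream` and the `open` parameter
are illustrative, not from this PR:
```scala
import java.io.OutputStream
import org.apache.spark.SparkConf
import org.apache.spark.io.CompressionCodec

// `open` stands in for however the checkpoint file stream is actually opened.
def checkpointOutputStream(conf: SparkConf, open: () => OutputStream): OutputStream = {
  lazy val fileStream = open() // not opened until first dereference
  conf.getOption("spark.checkpoint.compress.codec").map { c =>
    // createCodec runs before fileStream is forced, so an unknown codec fails
    // without ever opening (and leaking) the underlying stream.
    CompressionCodec.createCodec(conf, c).compressedOutputStream(fileStream)
  }.getOrElse(fileStream)
}
```
Since `fileStream` is local to the method body, nothing outside it can touch
the raw stream, which is the visibility scoping mentioned above.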