kl0u commented on a change in pull request #11307: [FLINK-16371] [BulkWriter]
Fix Hadoop Compression BulkWriter
URL: https://github.com/apache/flink/pull/11307#discussion_r389642181
##########
File path:
flink-formats/flink-compress/src/main/java/org/apache/flink/formats/compress/CompressWriterFactory.java
##########
@@ -42,39 +47,57 @@
  private Extractor<IN> extractor;
  private CompressionCodec hadoopCodec;
+ private String hadoopCodecName;
+ private Map<String, String> hadoopConfigurationMap;
+ private String codecExtension;
  public CompressWriterFactory(Extractor<IN> extractor) {
-     this.extractor = Preconditions.checkNotNull(extractor, "extractor cannot be null");
+     this.extractor = checkNotNull(extractor, "Extractor cannot be null");
+     this.hadoopConfigurationMap = new HashMap<>();
  }
  public CompressWriterFactory<IN> withHadoopCompression(String hadoopCodecName) {
      return withHadoopCompression(hadoopCodecName, new Configuration());
  }
  public CompressWriterFactory<IN> withHadoopCompression(String hadoopCodecName, Configuration hadoopConfiguration) {
-     return withHadoopCompression(new CompressionCodecFactory(hadoopConfiguration).getCodecByName(hadoopCodecName));
- }
+     CompressionCodec codec = new CompressionCodecFactory(hadoopConfiguration).getCodecByName(hadoopCodecName);
Review comment:
I think it is more reasonable here to throw an `IOException` rather than an
`NPE`. I would also move the check into a separate method, for example:
```
private void validateCompressionCodecConfig(String hadoopCodecName, Configuration hadoopConfiguration) throws IOException {
    CompressionCodec codec = new CompressionCodecFactory(hadoopConfiguration).getCodecByName(hadoopCodecName);
    if (codec == null) {
        throw new IOException("Unable to load the provided Hadoop codec [" + hadoopCodecName + "]");
    }
}
```
What do you think, @zenfenan?
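
To make the intent concrete, here is a minimal, self-contained sketch of the same pattern without the Hadoop dependency. Everything in it is hypothetical and only for illustration: `CodecLookupSketch`, `KNOWN_CODECS`, and the `String`-based `getCodecByName` are stand-ins I made up, assuming that the real `CompressionCodecFactory#getCodecByName` returns `null` for an unknown codec name. It is not the PR's code.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class CodecLookupSketch {

    // Hypothetical stand-in for the codec registry: codec name -> file extension.
    private static final Map<String, String> KNOWN_CODECS = new HashMap<>();

    static {
        KNOWN_CODECS.put("Gzip", ".gz");
        KNOWN_CODECS.put("Bzip2", ".bz2");
    }

    // Mimics the assumed lookup contract: returns null for an unknown name.
    static String getCodecByName(String name) {
        return KNOWN_CODECS.get(name);
    }

    // Fails fast with a descriptive IOException instead of letting a null
    // propagate and surface later as a NullPointerException.
    static String validateCodec(String name) throws IOException {
        String codec = getCodecByName(name);
        if (codec == null) {
            throw new IOException("Unable to load the provided Hadoop codec [" + name + "]");
        }
        return codec;
    }

    public static void main(String[] args) {
        try {
            System.out.println("Gzip -> " + validateCodec("Gzip"));
            validateCodec("NoSuchCodec"); // unknown name, throws IOException
        } catch (IOException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

With this shape, a misspelled codec name is reported at the point where the codec is configured, and the message names the offending codec, which is harder to get from a bare `NPE`.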