ggershinsky commented on code in PR #959:
URL: https://github.com/apache/parquet-mr/pull/959#discussion_r984394753


##########
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/CodecFactory.java:
##########
@@ -244,16 +272,60 @@ protected CompressionCodec getCodec(CompressionCodecName 
codecName) {
     }
   }
 
+  /**
+   * Modified for https://issues.apache.org/jira/browse/PARQUET-2126
+   * This releases all cached instances of all compressors and decompressors 
created by all threads that share
+   * this CodeFactory instance.
+   * Note: A problem might occur if release() were called while some codec 
instances were still in use, but it
+   * would not make sense to call close() or release() on a shared 
CodecFactory while some threads are still
+   * actively using it. The usage pattern should be:

Review Comment:
   It would be also good to have a look at an example of such callers, eg in 
Apache Spark, to start analyzing the implications.



##########
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/CodecFactory.java:
##########
@@ -244,16 +272,60 @@ protected CompressionCodec getCodec(CompressionCodecName 
codecName) {
     }
   }
 
+  /**
+   * Modified for https://issues.apache.org/jira/browse/PARQUET-2126
+   * This releases all cached instances of all compressors and decompressors 
created by all threads that share
+   * this CodeFactory instance.
+   * Note: A problem might occur if release() were called while some codec 
instances were still in use, but it
+   * would not make sense to call close() or release() on a shared 
CodecFactory while some threads are still
+   * actively using it. The usage pattern should be:

Review Comment:
   regarding the code parts that call this release() method. For those parts 
inside the parquet-mr codebase, can  some of them implement/enforce this 
pattern fully internally? I'd guess most/all of them eventually pass this 
responsibility to the "app" code above parquet-mr API; so probably this pattern 
documentation should be moved/copied/referenced in these APIs. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to