leesf commented on a change in pull request #4013:
URL: https://github.com/apache/hudi/pull/4013#discussion_r754042366



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/util/ParquetUtils.java
##########
@@ -283,17 +283,37 @@ public Boolean apply(String recordKey) {
 
   /**
    * Parse min/max statistics stored in parquet footers for all columns.
+   * ParquetRead.readFooter is not a thread safe method.
+   *
+   * @param conf hadoop conf.
+   * @param parquetFilePath file to be read.
+   * @param cols cols which need to collect statistics.
+   * @param useLock if use lock when read parquet footer.
+   * @return a HoodieColumnRangeMetadata instance.
    */
-  public Collection<HoodieColumnRangeMetadata<Comparable>> readRangeFromParquetMetadata(Configuration conf, Path parquetFilePath, List<String> cols) {
-    ParquetMetadata metadata = readMetadata(conf, parquetFilePath);
+  public Collection<HoodieColumnRangeMetadata<Comparable>> readRangeFromParquetMetadata(
+      Configuration conf,
+      Path parquetFilePath,
+      List<String> cols,
+      boolean useLock) {

Review comment:
       I'm curious about this change. Could you point me to documentation showing
that reading the footer is not thread-safe and that this affects the result?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

