yui2010 commented on a change in pull request #2427:
URL: https://github.com/apache/hudi/pull/2427#discussion_r555027450



##########
File path: 
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/storage/HoodieHFileWriter.java
##########
@@ -121,17 +121,10 @@ public void writeAvro(String recordKey, IndexedRecord object) throws IOException
 
     if (hfileConfig.useBloomFilter()) {
       hfileConfig.getBloomFilter().add(recordKey);
-      if (minRecordKey != null) {
-        minRecordKey = minRecordKey.compareTo(recordKey) <= 0 ? minRecordKey : recordKey;
-      } else {
+      if (minRecordKey == null) {

Review comment:
       Hi @vinothchandar, thanks for reviewing. It is not computing the min/max here; it only uses the first recordKey and the last recordKey as the min/max (`HoodieSortedMergeHandle`/`BaseSparkCommitActionExecutor` already order the input records by recordKey). This is similar to how HBase stores the key range (firstKey/lastKey):
   https://github.com/apache/hbase/blob/rel/1.2.3/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java#L292
   We could also read the min/max recordKey natively from the HFile via `HFileReaderV3#getFirstKey()` and `HFileReaderV3#getLastRowKey()` from the load-on-open section.

   I think we can keep the current implementation (putting the min/max in the FileInfo map). We may want to add more properties later, for example a recordCount, so that we can choose between seekTo and loadAll in `HoodieHFileReader#filterRowKeys`.
    




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

