lamber-ken edited a comment on issue #1469: [HUDI-686] Implement BloomIndexV2 
that does not depend on memory caching
URL: https://github.com/apache/incubator-hudi/pull/1469#issuecomment-612499511
 
 
   Hi @vinothchandar, based on your branch, the main updates are:
   - Rebase master branch
   - Add TestHoodieBloomIndexV2.java
   - Add DeltaTimer.java
   - Fix an implicit bug that causes input records to be repeated
   
   **Bug fix**
   In the double-check stage (`HoodieBloomIndexV2.LazyKeyChecker#computeNext`),
   when the target file does not contain the record key, the lookup should
   return `Option.empty()`.
   
   **Previous**
   ```
   Option<HoodieRecord<T>> ret = fileIdOpt.map(fileId -> {
     if (currHandle == null || !currHandle.getFileId().equals(fileId)) {
       currHandle = new HoodieKeyLookupHandle<>(config, table, Pair.of(record.getPartitionPath(), fileId));
     }

     Option<HoodieRecordLocation> location = currHandle.containsKey(record.getRecordKey())
         ? Option.of(new HoodieRecordLocation(currHandle.getBaseInstantTime(), currHandle.getFileId()))
         : Option.empty();
     return Option.of(getTaggedRecord(record, location));
   }).orElse(Option.of(record));
   ```
   
   **Changes**
   ```
   Option<HoodieRecord<T>> recordOpt = fileIdOpt.map((Function<String, Option<HoodieRecord<T>>>) fileId -> {
     DeltaTimer deltaTimer = new DeltaTimer();
     if (currHandle == null || !currHandle.getFileId().equals(fileId)) {
       currHandle = new HoodieKeyLookupHandle<>(config, table, Pair.of(record.getPartitionPath(), fileId));
     }
     totalReadTimeMs += deltaTimer.deltaTime();

     if (currHandle.containsKey(record.getRecordKey())) {
       HoodieRecordLocation recordLocation = new HoodieRecordLocation(currHandle.getBaseInstantTime(), currHandle.getFileId());
       return Option.of(getTaggedRecord(record, recordLocation));
     } else {
       return Option.empty();
     }
   }).orElse(Option.of(record));
   ```
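
To make the difference concrete, here is a minimal, self-contained sketch that uses `java.util.Optional` in place of Hudi's `Option`. Everything in it is a hypothetical stand-in and not part of this PR: the class `KeyCheckSketch`, the helpers `tag`, `checkPrevious` and `checkFixed`, and the in-memory `files` map that plays the role of `HoodieKeyLookupHandle#containsKey`. Records are modelled as plain strings.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;

public class KeyCheckSketch {
    // Hypothetical stand-in for getTaggedRecord: "key@fileId" when a location
    // is known, the bare key otherwise.
    static String tag(String key, Optional<String> fileId) {
        return fileId.map(f -> key + "@" + f).orElse(key);
    }

    // Mirrors the "Previous" snippet: a candidate file that does not contain
    // the key still yields the (untagged) record.
    static Optional<String> checkPrevious(String key, Optional<String> fileIdOpt,
                                          Map<String, Set<String>> files) {
        return fileIdOpt.<Optional<String>>map(fileId -> {
            Optional<String> location = files.getOrDefault(fileId, Set.of()).contains(key)
                ? Optional.of(fileId)
                : Optional.empty();
            return Optional.of(tag(key, location));
        }).orElse(Optional.of(key));
    }

    // Mirrors the "Changes" snippet: a candidate file that does not contain
    // the key yields nothing at all.
    static Optional<String> checkFixed(String key, Optional<String> fileIdOpt,
                                       Map<String, Set<String>> files) {
        return fileIdOpt.<Optional<String>>map(fileId -> {
            if (files.getOrDefault(fileId, Set.of()).contains(key)) {
                return Optional.of(tag(key, Optional.of(fileId)));
            }
            return Optional.empty();
        }).orElse(Optional.of(key));
    }

    public static void main(String[] args) {
        // Assumed layout: file f1 holds key "a", file f2 holds key "k1".
        Map<String, Set<String>> files = Map.of("f1", Set.of("a"), "f2", Set.of("k1"));

        // Candidate f1 does not contain "k1".
        System.out.println(checkPrevious("k1", Optional.of("f1"), files)); // Optional[k1]
        System.out.println(checkFixed("k1", Optional.of("f1"), files));    // Optional.empty

        // Candidate f2 contains "k1": both variants tag the record.
        System.out.println(checkFixed("k1", Optional.of("f2"), files));    // Optional[k1@f2]
    }
}
```

Both variants still pass the record through untouched when there is no candidate file at all; they differ only for a candidate file that turns out not to contain the key.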
   

