yui2010 commented on a change in pull request #2427:
URL: https://github.com/apache/hudi/pull/2427#discussion_r555027450
##########
File path:
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/storage/HoodieHFileWriter.java
##########
@@ -121,17 +121,10 @@ public void writeAvro(String recordKey, IndexedRecord object) throws IOException
if (hfileConfig.useBloomFilter()) {
hfileConfig.getBloomFilter().add(recordKey);
- if (minRecordKey != null) {
- minRecordKey = minRecordKey.compareTo(recordKey) <= 0 ? minRecordKey : recordKey;
- } else {
+ if (minRecordKey == null) {
Review comment:
Hi @vinothchandar, thanks for reviewing. It is not computing the min/max;
it only uses the first recordKey and the last recordKey as the min/max
(HoodieSortedMergeHandle/BaseSparkCommitActionExecutor already order the
input records by recordKey). This is similar to how HBase stores the key
range (firstKey/lastKey), see
https://github.com/apache/hbase/blob/rel/1.2.3/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java#L292.
We could actually also get the min/max recordKey from the HFile natively,
through `HFileReaderV3#getFirstKey()` and `HFileReaderV3#getLastRowKey()`
in the load-on-open section.
Still, I think we can keep the current implementation (put min/max in the
FileInfo map); maybe we will add more properties later, for example a
recordCount so we can choose between seekTo and loadAll in
`HoodieHFileReader#filterRowKeys`.
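To make the idea concrete, here is a minimal standalone sketch (not the actual HoodieHFileWriter code; the class and method names are hypothetical) of the first-key/last-key approach: when records arrive already sorted by recordKey, the first key seen is the min and the last key seen is the max, so no per-record `compareTo` is needed.

```java
// Hypothetical sketch of tracking the key range of a sorted write stream.
// Assumes the caller writes records in ascending recordKey order, as
// HoodieSortedMergeHandle/BaseSparkCommitActionExecutor guarantee.
public class SortedKeyRangeTracker {
  private String minRecordKey; // first key written
  private String maxRecordKey; // last key written

  public void onWrite(String recordKey) {
    if (minRecordKey == null) {
      minRecordKey = recordKey; // first record is the minimum
    }
    maxRecordKey = recordKey;   // last record seen so far is the maximum
  }

  public String getMinRecordKey() {
    return minRecordKey;
  }

  public String getMaxRecordKey() {
    return maxRecordKey;
  }
}
```

The min/max pair could then be written into the HFile's FileInfo map at close time, alongside any future properties such as a recordCount.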
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]