prashantwason commented on a change in pull request #1804:
URL: https://github.com/apache/hudi/pull/1804#discussion_r458319202
##########
File path:
hudi-common/src/main/java/org/apache/hudi/common/table/log/block/HoodieLogBlock.java
##########
@@ -110,7 +110,7 @@ public long getLogBlockLength() {
* Type of the log block WARNING: This enum is serialized as the ordinal.
Only add new enums at the end.
*/
public enum HoodieLogBlockType {
- COMMAND_BLOCK, DELETE_BLOCK, CORRUPT_BLOCK, AVRO_DATA_BLOCK
+ COMMAND_BLOCK, DELETE_BLOCK, CORRUPT_BLOCK, AVRO_DATA_BLOCK,
HFILE_DATA_BLOCK
Review comment:
Yes, a separate DELETE block is not required for HFile. The delete
functionality is implemented independently of the data blocks, which only store
record updates.
A DELETE_BLOCK stores the keys of records that have since been deleted. While
reading the log blocks (HoodieMergedLogRecordScanner), if a DELETE block is
encountered we save an EmptyPayload as a delete marker for each such record.
Records carrying this marker won't be written out (compaction) or processed
(RealtimeRecordReader), thereby representing a delete.
>> So, we might have to fetch all values and resolve to the latest one to
find if the value represents delete or active.
Deleted records are never saved; only the deleted keys are stored within the
DELETE block.
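To illustrate the merge behaviour described above, here is a minimal,
self-contained sketch. It is not the actual Hudi API: the block
representation, the `scan` method, and the `EMPTY_PAYLOAD` sentinel
(standing in for Hudi's EmptyHoodieRecordPayload) are all illustrative
assumptions. It only shows how a DELETE block of keys overrides earlier
data-block entries with a delete marker during scanning.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified sketch of the merged-log-record scan: data blocks carry
// full (key, payload) records, delete blocks carry only keys.
public class DeleteBlockSketch {
    // Sentinel standing in for Hudi's EmptyHoodieRecordPayload delete marker.
    static final String EMPTY_PAYLOAD = null;

    // Each block is {type, contents}; returns key -> latest payload,
    // where EMPTY_PAYLOAD means "deleted".
    static Map<String, String> scan(List<Object[]> blocks) {
        Map<String, String> records = new HashMap<>();
        for (Object[] block : blocks) {
            String type = (String) block[0];
            if (type.equals("DATA_BLOCK")) {
                // Data blocks store record updates as (key, payload) pairs.
                @SuppressWarnings("unchecked")
                Map<String, String> updates = (Map<String, String>) block[1];
                records.putAll(updates);
            } else if (type.equals("DELETE_BLOCK")) {
                // Delete blocks store only keys; each key is replaced
                // with the empty payload, i.e. a delete marker.
                @SuppressWarnings("unchecked")
                List<String> deletedKeys = (List<String>) block[1];
                for (String key : deletedKeys) {
                    records.put(key, EMPTY_PAYLOAD);
                }
            }
        }
        return records;
    }

    public static void main(String[] args) {
        List<Object[]> blocks = new ArrayList<>();
        blocks.add(new Object[]{"DATA_BLOCK", Map.of("k1", "v1", "k2", "v2")});
        blocks.add(new Object[]{"DELETE_BLOCK", List.of("k1")});
        Map<String, String> merged = scan(blocks);
        // k1 resolves to the delete marker; compaction would skip it.
        System.out.println(merged.get("k1") == EMPTY_PAYLOAD); // true
        System.out.println(merged.get("k2")); // v2
    }
}
```

Because only keys are stored, the scanner never needs to compare payload
values to detect a delete; the marker itself carries that information.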
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]