prashantwason commented on a change in pull request #1804:
URL: https://github.com/apache/hudi/pull/1804#discussion_r458319202



##########
File path: hudi-common/src/main/java/org/apache/hudi/common/table/log/block/HoodieLogBlock.java
##########
@@ -110,7 +110,7 @@ public long getLogBlockLength() {
    * Type of the log block WARNING: This enum is serialized as the ordinal. Only add new enums at the end.
    */
   public enum HoodieLogBlockType {
-    COMMAND_BLOCK, DELETE_BLOCK, CORRUPT_BLOCK, AVRO_DATA_BLOCK
+    COMMAND_BLOCK, DELETE_BLOCK, CORRUPT_BLOCK, AVRO_DATA_BLOCK, HFILE_DATA_BLOCK

Review comment:
       Yes, a separate DELETE block is not required for HFile. The delete functionality is implemented independently of the data blocks, which only save record updates.
   
   DELETE_BLOCK saves the record keys which have been deleted. While reading the log blocks (HoodieMergedLogRecordScanner), if a DELETE block is encountered we save an EmptyPayload, which represents a delete marker for the record. Such records won't be written out (compaction) or processed (RealtimeRecordReader), thereby representing a delete.
   
   >> So, we might have to fetch all values and resolve to the latest one to find if the value represents delete or active.

   Deleted records are never saved. Only the deleted keys are saved within the DELETE block.
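
   The behavior described above can be sketched roughly as follows. This is a simplified, hypothetical illustration (the class and method names here are not Hudi's actual APIs): data blocks contribute full records keyed by record key, a DELETE block contributes only keys, and each deleted key is replaced with an empty (tombstone) payload so that later stages skip it.

   ```java
   import java.util.LinkedHashMap;
   import java.util.List;
   import java.util.Map;

   // Hypothetical sketch of merged-log-record scanning; not Hudi's real classes.
   public class MergedScannerSketch {

     // Merge data-block records with the keys from a DELETE block.
     // A null value stands in for the "EmptyPayload" delete marker:
     // such entries would not be written out during compaction.
     static Map<String, String> merge(List<Map.Entry<String, String>> dataRecords,
                                      List<String> deletedKeys) {
       Map<String, String> merged = new LinkedHashMap<>();
       for (Map.Entry<String, String> record : dataRecords) {
         merged.put(record.getKey(), record.getValue());
       }
       for (String key : deletedKeys) {
         // Only the key is known; no record value is ever stored for a delete.
         merged.put(key, null);
       }
       return merged;
     }

     public static void main(String[] args) {
       List<Map.Entry<String, String>> data = List.of(
           Map.entry("k1", "v1"),
           Map.entry("k2", "v2"));
       Map<String, String> merged = merge(data, List.of("k2"));
       merged.forEach((k, v) ->
           System.out.println(k + "=" + (v == null ? "<deleted>" : v)));
     }
   }
   ```

   The point of the sketch is that resolving a delete never requires fetching and comparing record values: the presence of the key in a (later) DELETE block is by itself sufficient to mark the record as deleted.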




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]