hudi-bot opened a new issue, #17408:
URL: https://github.com/apache/hudi/issues/17408

   The Flink writer does not yet support record positions for updates and 
deletes; support will be added later.
   
   ## JIRA info
   
   - Link: https://issues.apache.org/jira/browse/HUDI-9192
   - Type: Sub-task
   - Parent: https://issues.apache.org/jira/browse/HUDI-9075
   - Fix version(s):
     - 1.2.0
   
   
   ---
   
   
   ## Comments
   
   30/Apr/25 09:09;geserdugarov;While researching this task, I raised more 
questions than I answered about how to implement it properly.
   
    
   
   To begin with, the Flink integration doesn't support record position 
processing at all, even for `HoodieRecord`s. I also didn't find any explicit 
description of how the "log record positions" feature should work, so we have 
to use the Spark implementation as a reference.
   
   Writing log record positions and using them were split across 
different PRs and implemented by different developers:
   
   1) additional property in log block header (without writing): 
[https://github.com/apache/hudi/pull/9376]
   
   2) writing log record positions in log block header: 
[https://github.com/apache/hudi/pull/9581]
   
   3) using log record positions from log block headers in file group 
readers: [https://github.com/apache/hudi/pull/9819]
   
    
   
   The confusing part is that we write log record positions as a set in 
`HoodieLogBlock::addRecordPositionsIfRequired`:
   
   
[https://github.com/apache/hudi/blob/6f84c401b3a809997be1573b0d04e8106fd87fac/hudi-common/src/main/java/org/apache/hudi/common/table/log/block/HoodieLogBlock.java#L390-L392]
   
   But later extract them as a list in 
`PositionBasedFileGroupRecordBuffer::extractRecordPositions`:
   
   
[https://github.com/apache/hudi/blob/6f84c401b3a809997be1573b0d04e8106fd87fac/hudi-common/src/main/java/org/apache/hudi/common/table/read/PositionBasedFileGroupRecordBuffer.java#L305-L307]
   
   and use them as a list in 
`PositionBasedFileGroupRecordBuffer::processDataBlock`:
   
   
[https://github.com/apache/hudi/blob/6f84c401b3a809997be1573b0d04e8106fd87fac/hudi-common/src/main/java/org/apache/hudi/common/table/read/PositionBasedFileGroupRecordBuffer.java#L132-L136]
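
To illustrate why a set-on-write, list-on-read mismatch can matter, here is a minimal Java sketch. It is not Hudi code: the class and method names are hypothetical, and a sorted `TreeSet` is only an assumed stand-in for however the log block header actually stores the positions. It shows that a list read back from a set lines up with the records index by index only when the original positions were unique and already in the set's iteration order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class PositionRoundTrip {

  // Writer side (hypothetical): positions are collected into a set,
  // so duplicates collapse and the original record order is not kept.
  // Assumption: the set iterates in ascending order (TreeSet).
  static Set<Long> collectForHeader(List<Long> positionsInRecordOrder) {
    return new TreeSet<>(positionsInRecordOrder);
  }

  // Reader side (hypothetical): the set is turned back into a list.
  static List<Long> extractFromHeader(Set<Long> headerPositions) {
    return new ArrayList<>(headerPositions);
  }

  // True only if the list read back matches the record order index by index.
  static boolean alignsWithRecords(List<Long> positionsInRecordOrder) {
    List<Long> readBack = extractFromHeader(collectForHeader(positionsInRecordOrder));
    return readBack.equals(positionsInRecordOrder);
  }

  public static void main(String[] args) {
    System.out.println(alignsWithRecords(List.of(0L, 3L, 7L))); // ascending, unique: true
    System.out.println(alignsWithRecords(List.of(7L, 0L, 3L))); // out of order: false
    System.out.println(alignsWithRecords(List.of(3L, 3L, 7L))); // duplicate position: false
  }
}
```

Under these assumptions, a reader that pairs the i-th extracted position with the i-th record in a block is only safe if the writer guarantees unique, ascending positions within that block.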
   
    ;;;
   
   ---
   
   30/Apr/25 11:51;geserdugarov;While checking the cases in which log record 
positions are written, I found another issue: the index configuration is not 
persisted:
   
   https://github.com/apache/hudi/issues/13241;;;
   
   ---
   
   30/Apr/25 12:03;geserdugarov;A lot of open questions need to be clarified 
first, so I have postponed this task for now and unassigned myself so that 
anyone can pick it up.;;;


