luoyuxia opened a new issue, #2371:
URL: https://github.com/apache/fluss/issues/2371

   ### Search before asking
   
   - [x] I searched in the [issues](https://github.com/apache/fluss/issues) and 
found nothing similar.
   
   
   ### Fluss version
   
   0.8.0 (latest release)
   
   ### Please describe the bug 🐞
   
   For tables using the First Row Merge Engine, the Tiering Service (and 
potentially any future batch reader) can fall into an infinite loop and fail to 
advance if the log contains empty records at the end of a scan range.
   
   **Scenario & Root Cause**
   Assume we have a table with a log offset range of [0, 10]:
   
   Offset 0: Contains a valid record.
   
   Offsets 1 to 10: These are "empty records" (e.g., rows filtered out by the merge engine or by internal log management), so they carry no readable data, yet the log end offset still advances past them.
   
   When the Tiering Service tries to scan the range [0, 10]:
   
   1. It successfully reads the record at offset 0.
   
   2. It continues to scan [1, 10]. However, since these offsets hold only empty records, the `ScanRecords` result returned to the Tiering Service contains no data.
   
   3. **The trap:** Currently, the Tiering Service (and the underlying reader) only updates its scan position (current offset) based on the records it actually reads.
   
   4. Since no records are returned from [1, 10], the `next_offset` / `last_read_offset` is never updated to 10.
   
   5. Consequently, the Tiering Service assumes there is still data to be read 
within the range and will repeatedly poll [1, 10], leading to a permanent hang.
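   The hang described in the steps above can be sketched as a minimal simulation (all names here are hypothetical, not the actual Fluss internals): a poller that advances its offset only via the records it receives never moves past a run of empty records.

   ```java
   import java.util.List;

   public class TieringLoopSketch {

       // Simulated fetch over [from, to]: only offset 0 holds a visible
       // record; offsets 1..10 are "empty" (e.g., dropped by the merge
       // engine), mirroring the scenario in this issue.
       static List<Long> fetch(long from, long to) {
           return from == 0 ? List.of(0L) : List.of();
       }

       public static void main(String[] args) {
           long logEndOffset = 10;
           long current = 0;
           int polls = 0;
           // Cap the polls so the sketch terminates; the real service
           // would keep repolling [1, 10] indefinitely.
           while (current < logEndOffset && polls < 5) {
               List<Long> records = fetch(current, logEndOffset);
               polls++;
               if (!records.isEmpty()) {
                   // Buggy behavior: the offset advances only through
                   // records actually read.
                   current = records.get(records.size() - 1) + 1;
               }
               // On an empty poll, `current` stays at 1 forever.
           }
           System.out.println("stuck at offset " + current + " after " + polls + " polls");
       }
   }
   ```

   Running this prints `stuck at offset 1 after 5 polls`: the scan reads offset 0, then repolls [1, 10] without ever advancing.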
   
   ### Solution
   
   I expect `ScanRecords` in `fluss-client` to expose the next fetch offset to the caller, so that the Tiering Service (and any batch reader) can advance its scan position even when a poll returns no records.
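   A rough sketch of the idea (the field and accessor names below are assumptions for illustration, not the actual `fluss-client` API): `ScanRecords` carries the next fetch offset reported by the server, which remains meaningful even when the batch is empty.

   ```java
   import java.util.Collections;
   import java.util.List;

   final class ScanRecordsSketch {
       private final List<String> records;
       private final long nextFetchOffset; // hypothetical new field

       ScanRecordsSketch(List<String> records, long nextFetchOffset) {
           this.records = records;
           this.nextFetchOffset = nextFetchOffset;
       }

       boolean isEmpty() {
           return records.isEmpty();
       }

       // Proposed accessor: valid even when no records were returned,
       // so an empty poll over [1, 10] still reports offset 10.
       long nextFetchOffset() {
           return nextFetchOffset;
       }

       public static void main(String[] args) {
           ScanRecordsSketch empty =
                   new ScanRecordsSketch(Collections.emptyList(), 10L);
           // The caller advances to the reported offset instead of
           // re-polling the same empty range.
           long current = empty.nextFetchOffset();
           System.out.println("advance to " + current);
       }
   }
   ```

   With this, the Tiering Service would update its scan position from every poll result, empty or not, and mark the range [1, 10] as consumed.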
   
   ### Are you willing to submit a PR?
   
   - [ ] I'm willing to submit a PR!

