leekeiabstraction opened a new pull request, #2061: URL: https://github.com/apache/fluss/pull/2061
### Purpose

Improve the internals of LogScanner by lazily deserialising from the record stream / Arrow buffer.

Linked issue: close #2041

### Brief change log

Change the signatures and implementations that currently return List<ScanRecord> to return CloseableIterator<ScanRecord>. This includes (not exhaustive):

- LogFetchCollector's Map<TableBucket, List<ScanRecord>> collectFetch(LogFetchBuffer) to Map<TableBucket, CloseableIterator<ScanRecord>> collectFetch(LogFetchBuffer)
- LogFetcher's Map<TableBucket, List<ScanRecord>> collectFetch(LogFetchBuffer) to Map<TableBucket, CloseableIterator<ScanRecord>> collectFetch(LogFetchBuffer)

Lazy initialisation of CompletedFetch and lazy deserialisation of ScanRecord.

Resources are closed in:

- FetchCollector#ScanRecordIterator.close() and makeNext() (to ensure that concatenated CloseableIterators are closed once they are fully consumed)
- LogScannerImpl.close(), where LogScannerImpl tracks unclosed CloseableIterators and closes them
- ScanRecords.close()
- FlinkRecordsWithSplitIds.recycle()
- TieringSplitReader.fetch()

### Tests

TODO: Address failing IT test cases and add additional tests.

### API and Format

TODO: Decide LogScanner/ScanRecords interface changes. See https://github.com/apache/fluss/issues/2041#issuecomment-3592642088

### Documentation

TODO: Update documentation
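For reviewers, the closing behaviour described in the change log (concatenated CloseableIterators that are opened lazily and closed as soon as they are consumed, plus an explicit close() for early termination) could look roughly like the sketch below. This is not the code in this PR: `LazyConcatSketch`, `tracked`, and `ConcatIterator` are hypothetical names, and the nested `CloseableIterator` is a simplified stand-in whose shape is assumed rather than taken from Fluss.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

class LazyConcatSketch {
    /** Hypothetical stand-in for Fluss's CloseableIterator; the real interface may differ. */
    interface CloseableIterator<T> extends Iterator<T>, AutoCloseable {
        @Override
        void close();
    }

    /** Wraps a list as a closeable iterator and appends its name to closedLog on close(). */
    static <T> CloseableIterator<T> tracked(String name, List<T> items, List<String> closedLog) {
        Iterator<T> it = items.iterator();
        return new CloseableIterator<T>() {
            private boolean closed;
            public boolean hasNext() { return !closed && it.hasNext(); }
            public T next() {
                if (!hasNext()) throw new NoSuchElementException();
                return it.next();
            }
            public void close() {
                if (!closed) { closed = true; closedLog.add(name); }
            }
        };
    }

    /** Concatenates lazily created iterators and closes each one once it is exhausted. */
    static final class ConcatIterator<T> implements CloseableIterator<T> {
        private final Iterator<Supplier<CloseableIterator<T>>> sources;
        private CloseableIterator<T> current;
        private boolean closed;

        ConcatIterator(List<Supplier<CloseableIterator<T>>> sources) {
            this.sources = sources.iterator();
        }

        public boolean hasNext() {
            if (closed) return false;
            while (true) {
                if (current != null) {
                    if (current.hasNext()) return true;
                    current.close();          // release resources as soon as a source is drained
                    current = null;
                }
                if (!sources.hasNext()) return false;
                current = sources.next().get(); // source is only created when it is needed
            }
        }

        public T next() {
            if (!hasNext()) throw new NoSuchElementException();
            return current.next();
        }

        public void close() {
            closed = true;                    // early termination: close whatever is still open
            if (current != null) { current.close(); current = null; }
        }
    }

    public static void main(String[] args) {
        List<String> closedLog = new ArrayList<>();
        List<Supplier<CloseableIterator<Integer>>> sources = new ArrayList<>();
        sources.add(() -> tracked("a", Arrays.asList(1, 2), closedLog));
        sources.add(() -> tracked("b", Arrays.asList(3), closedLog));

        // Full consumption: every source is closed without an explicit close().
        ConcatIterator<Integer> full = new ConcatIterator<>(sources);
        List<Integer> out = new ArrayList<>();
        while (full.hasNext()) out.add(full.next());
        System.out.println(out + " closed=" + closedLog); // [1, 2, 3] closed=[a, b]

        // Early termination: only the source that was opened gets closed.
        closedLog.clear();
        ConcatIterator<Integer> partial = new ConcatIterator<>(sources);
        partial.next();
        partial.close();
        System.out.println("closed=" + closedLog);        // closed=[a]
    }
}
```

The second case is the one an owner such as LogScannerImpl would need to cover by tracking iterators that the caller never drained.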
