wwj6591812 opened a new issue, #6847: URL: https://github.com/apache/paimon/issues/6847
### Search before asking

- [x] I searched in the [issues](https://github.com/apache/paimon/issues) and found nothing similar.

### Motivation

#### 1. Problem Background

In our company's Flink Session-based OLAP environment, we observed that a very basic Flink SQL query with a LIMIT clause against a Paimon table took unexpectedly long (over 20 seconds) to complete.

(1) The query is as follows:

```sql
select * from `paimon_na61`.`sample`.`s_holo_mainse_rank_xfc_all_features_swift_parsed` limit 10;
```

(2) The table location is: http://bs.alibaba-inc.com/fsutil?path=dfs%3A//na61dfsalake1--cn-zhangjiakou/alake/omega_na61/sample.db/s_holo_mainse_rank_xfc_all_features_swift_parsed/&isdir=true

#### 2. Root Cause Analysis

Through code review and by attaching to the live Java process with Arthas, we identified the following:

1. The primary bottleneck is the `SnapshotReaderImpl#generateSplits` method.

   (1) We only need to select 10 records, but the `List<Split>` is far too large.

   <img width="1652" height="1580" alt="Image" src="https://github.com/user-attachments/assets/ecb0b910-2186-446e-9a05-51c68226f1bc" />
   <img width="1698" height="288" alt="Image" src="https://github.com/user-attachments/assets/13c6f2bb-2406-4ffe-9a02-622d36231d06" />

   The number of files reported by the `manifests` system table is consistent with this large scale.

   <img width="2214" height="420" alt="Image" src="https://github.com/user-attachments/assets/0aa864c3-27ef-499a-b52f-d558c06177f8" />

   (2) The main time cost (Arthas profile): http://ha3.oss-cn-hangzhou-zmf.aliyuncs.com/052894/paimon_sample_table_slow.arthas.html

   <img width="3004" height="1446" alt="Image" src="https://github.com/user-attachments/assets/6083f7e0-30d2-4108-80cf-7a13d36e98ae" />

2. Upon further investigation, we found that when executing a SELECT query, the current implementation of `AbstractFileStoreScan#plan` does not take the LIMIT clause of the SQL into account.
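To make the idea concrete, here is a minimal, hypothetical sketch of limit-aware split generation; the class and method names below are illustrative stand-ins, not Paimon's actual API. For an append-only table, rows need no merging, so enumeration can stop as soon as the accumulated row count of the selected files covers the LIMIT, instead of materializing a split for every file:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of limit pushdown into split generation (not Paimon's real API). */
public class LimitAwareSplitGeneration {

    /** Minimal stand-in for a data split; only the row count matters here. */
    record FileSplit(String path, long rowCount) {}

    /**
     * For append-only tables, any prefix of the file list is a valid source of
     * LIMIT n rows, so we can stop early instead of building every split.
     */
    static List<FileSplit> generateSplits(List<FileSplit> allFiles, long limit) {
        List<FileSplit> result = new ArrayList<>();
        long rows = 0;
        for (FileSplit file : allFiles) {
            result.add(file);
            rows += file.rowCount();
            if (rows >= limit) {
                break; // enough rows to satisfy the LIMIT; skip the remaining files
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<FileSplit> files = List.of(
                new FileSplit("f0.orc", 4),
                new FileSplit("f1.orc", 4),
                new FileSplit("f2.orc", 4),
                new FileSplit("f3.orc", 4));
        // LIMIT 10 is already covered by the first three files (12 rows >= 10).
        System.out.println(generateSplits(files, 10).size());
    }
}
```

Note this early-exit reasoning only holds when no merge is required; for PK tables, files in the same bucket may contain overlapping keys, which is why that case is more complex.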
This can cause the `splits` variable (`List<DataSplit>`) within `SnapshotReaderImpl#read()` to become excessively large. Consequently, when the table consists of a large number of files, `SnapshotReaderImpl#generateSplits` takes a very long time to execute.

#### 3. Conclusion and Plan

To resolve this, we propose pushing the LIMIT clause down into the `AbstractFileStoreScan` logic. We plan to create two separate pull requests (PRs) for this effort: one to support append-only tables and another for primary key (PK) tables. Since the logic for PK tables is significantly more complex than for append-only tables, we will implement support for append-only tables first.

### Solution

_No response_

### Anything else?

_No response_

### Are you willing to submit a PR?

- [x] I'm willing to submit a PR!

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
