wwj6591812 opened a new issue, #6847: URL: https://github.com/apache/paimon/issues/6847
### Search before asking

- [x] I searched in the [issues](https://github.com/apache/paimon/issues) and found nothing similar.

### Motivation

#### 1. Problem Background

In our company's Flink Session-based OLAP environment, we observed that a very basic Flink SQL query with a LIMIT clause against a Paimon table took unexpectedly long (over 20 seconds) to complete.

(1) The query is as follows:

```sql
select * from `paimon_na61`.`sample`.`s_holo_mainse_rank_xfc_all_features_swift_parsed` limit 10;
```

(2) The table location is: http://bs.alibaba-inc.com/fsutil?path=dfs%3A//na61dfsalake1--cn-zhangjiakou/alake/omega_na61/sample.db/s_holo_mainse_rank_xfc_all_features_swift_parsed/&isdir=true

#### 2. Root Cause Analysis

Through code review and by attaching to the live Java process with Arthas, we identified the following:

1. The primary bottleneck is the `SnapshotReaderImpl#generateSplits` method.

   (1) We only need to select 10 records, but the `List<Split>` is far too large.

   <img width="1652" height="1580" alt="Image" src="https://github.com/user-attachments/assets/ecb0b910-2186-446e-9a05-51c68226f1bc" />
   <img width="1698" height="288" alt="Image" src="https://github.com/user-attachments/assets/13c6f2bb-2406-4ffe-9a02-622d36231d06" />

   The number of files reported by the `manifests` system table is consistent with this large scale.

   <img width="2214" height="420" alt="Image" src="https://github.com/user-attachments/assets/0aa864c3-27ef-499a-b52f-d558c06177f8" />

   (2) The main time cost (Arthas profile): http://ha3.oss-cn-hangzhou-zmf.aliyuncs.com/052894/paimon_sample_table_slow.arthas.html

   <img width="3004" height="1446" alt="Image" src="https://github.com/user-attachments/assets/6083f7e0-30d2-4108-80cf-7a13d36e98ae" />

2. Upon further investigation, we found that when executing a SELECT query, the current implementation of `AbstractFileStoreScan#plan` does not take the LIMIT clause of the SQL into account.
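To make the idea concrete, here is a minimal, hypothetical sketch of limit-aware split generation; the class and method names below are illustrative stand-ins, not Paimon's actual API. For an append-only table, rows need no merging, so enumeration can stop as soon as the accumulated row count of the selected files covers the LIMIT, instead of materializing a split for every file:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of limit pushdown into split generation (not Paimon's real API). */
public class LimitAwareSplitGeneration {

    /** Minimal stand-in for a data split; only the row count matters here. */
    record FileSplit(String path, long rowCount) {}

    /**
     * For append-only tables, any prefix of the file list is a valid source of
     * LIMIT n rows, so we can stop early instead of building every split.
     */
    static List<FileSplit> generateSplits(List<FileSplit> allFiles, long limit) {
        List<FileSplit> result = new ArrayList<>();
        long rows = 0;
        for (FileSplit file : allFiles) {
            result.add(file);
            rows += file.rowCount();
            if (rows >= limit) {
                break; // enough rows to satisfy the LIMIT; skip the remaining files
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<FileSplit> files = List.of(
                new FileSplit("f0.orc", 4),
                new FileSplit("f1.orc", 4),
                new FileSplit("f2.orc", 4),
                new FileSplit("f3.orc", 4));
        // LIMIT 10 is already covered by the first three files (12 rows >= 10).
        System.out.println(generateSplits(files, 10).size());
    }
}
```

Note this early-exit reasoning only holds when no merge is required; for PK tables, files in the same bucket may contain overlapping keys, which is why that case is more complex.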
This can cause the `splits` variable (`List<DataSplit>`) within `SnapshotReaderImpl#read()` to become excessively large. Consequently, when the table consists of a large number of files, `SnapshotReaderImpl#generateSplits` takes a very long time to execute.

#### 3. Conclusion and Plan

To resolve this, we propose pushing the LIMIT clause down into the `AbstractFileStoreScan` logic. We plan to create two separate pull requests (PRs) for this effort: one to support append-only tables and another for primary key (PK) tables. Since the logic for PK tables is significantly more complex than for append-only tables, we will implement support for append-only tables first.

### Solution

_No response_

### Anything else?

_No response_

### Are you willing to submit a PR?

- [x] I'm willing to submit a PR!

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
