wwj6591812 commented on PR #6914:
URL: https://github.com/apache/paimon/pull/6914#issuecomment-3755146184

   > Hi @wwj6591812
   > 
   > 1. We can optimize the bucket path function (this is the performance bottleneck) and test its performance to see if the optimization effect can be achieved without pushing down the limit to scan.
   > 2. If the performance improvement is not obvious, it is necessary to merge the two methods, postFilterManifestEntries and limitPushManifestEntries, and only keep one, postFilterManifestEntries.
   
   Hi, I added a cache and tested it.
   
   I. Test results:
   1. Append table
   <img width="2008" height="968" alt="image" src="https://github.com/user-attachments/assets/98b5748a-b324-489c-8c2a-2447c45f355e" />
   
   2. PK table
   <img width="2004" height="956" alt="image" src="https://github.com/user-attachments/assets/38d9cd64-30f7-41ba-b388-9af93329fe20" />
   
   
   II. Based on the tests, I've decided the following:
   1. Performance can degrade when the cache hit rate is low. Therefore, instead of adding a cache for bucketPath construction, I've moved this logic out of the for loop to reduce the number of calls to FileStorePathFactory#bucketPath.
   2. I've merged the limitPushManifestEntries method into postFilterManifestEntries, keeping only the latter.
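
For readers following along, here is a minimal sketch of the general idea in point 1: compute the bucket path once per (partition, bucket) group instead of once per entry. It is not the code from this PR. The `Entry` class, the `bucketPath` stand-in, the path layout, and the call counter are all hypothetical and only illustrate how hoisting the call out of the per-entry loop reduces the number of invocations.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BucketPathHoistSketch {

    /** Hypothetical stand-in for a manifest entry; not Paimon's ManifestEntry. */
    static final class Entry {
        final String partition;
        final int bucket;
        final String fileName;

        Entry(String partition, int bucket, String fileName) {
            this.partition = partition;
            this.bucket = bucket;
            this.fileName = fileName;
        }
    }

    /** Call counter so the two variants can be compared. */
    static int bucketPathCalls = 0;

    /** Hypothetical stand-in for FileStorePathFactory#bucketPath; the layout is made up. */
    static String bucketPath(String partition, int bucket) {
        bucketPathCalls++;
        return "/warehouse/t/" + partition + "/bucket-" + bucket;
    }

    /** Before: bucketPath is called once for every entry. */
    static List<String> pathsPerEntry(List<Entry> entries) {
        List<String> result = new ArrayList<>();
        for (Entry e : entries) {
            result.add(bucketPath(e.partition, e.bucket) + "/" + e.fileName);
        }
        return result;
    }

    /** After: entries are grouped first, and bucketPath is called once per group. */
    static List<String> pathsPerBucket(List<Entry> entries) {
        // Assumes partition strings never contain '#', so the key is unambiguous.
        Map<String, List<Entry>> groups = new LinkedHashMap<>();
        for (Entry e : entries) {
            groups.computeIfAbsent(e.partition + "#" + e.bucket, k -> new ArrayList<>()).add(e);
        }
        List<String> result = new ArrayList<>();
        for (List<Entry> group : groups.values()) {
            Entry first = group.get(0);
            // Computed outside the inner loop: one call per group.
            String dir = bucketPath(first.partition, first.bucket);
            for (Entry e : group) {
                result.add(dir + "/" + e.fileName);
            }
        }
        return result;
    }
}
```

Unlike a cache, this approach does not depend on a hit rate: the number of calls is bounded by the number of distinct (partition, bucket) pairs regardless of how the entries are distributed.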

