[GitHub] [incubator-paimon] Dkbei opened a new issue, #1253: [Bug] Symptom oom is displayed when hive reads data using limit

via GitHub Sun, 28 May 2023 22:26:57 -0700


Dkbei opened a new issue, #1253:
URL: https://github.com/apache/incubator-paimon/issues/1253


   ### Search before asking
   
   - [X] I searched in the 
[issues](https://github.com/apache/incubator-paimon/issues) and found nothing 
similar.
   
   
   ### Paimon version
   
   Scenario description:
   1. The partition contains 17 million records
   2. 1 bucket
   3. Query script: select * from dwd.paimon_table_test where dt='20211231' 
limit 10;
   4. Number of files in the partition: 583; average file size: 2.6 MB
   
   Abnormal information:
   <img width="1727" alt="image" 
src="https://github.com/apache/incubator-paimon/assets/38800374/452798e0-d893-42cd-99ab-222402e51cf1">
   
   When the query uses LIMIT, Hive does not serve it with a fetch task. 
Instead, the data is read through a MapReduce job.
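   
   For reference, a minimal sketch of the Hive session settings that govern 
fetch-task conversion, usable to check which plan Hive picks for the query 
above. `hive.fetch.task.conversion` and `hive.fetch.task.conversion.threshold` 
are standard Hive properties. The 2 GB threshold is an illustrative assumption, 
chosen because the partition is roughly 583 x 2.6 MB, or about 1.5 GB. Whether 
these settings take effect for a Paimon table read through its storage handler 
has not been verified here.
   
   ```sql
   -- Allow simple SELECT / filter / LIMIT queries to be converted to a fetch task.
   SET hive.fetch.task.conversion=more;
   -- Illustrative value (2 GB, an assumption): input above the threshold is not converted.
   SET hive.fetch.task.conversion.threshold=2147483648;
   -- Inspect the plan: if the conversion applies, it should contain only a
   -- Fetch Operator stage and no MapReduce stage.
   EXPLAIN SELECT * FROM dwd.paimon_table_test WHERE dt='20211231' LIMIT 10;
   ```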
   
   ### Compute Engine
   
   Hive: CDH 6.3.2, Hive 2.1.1
   Paimon: master branch
   
   
   ### Minimal reproduce step
   
   Write a large amount of data from a Hive table into the Paimon table, so 
that many small files are generated.
   
   ### What doesn't meet your expectations?
   
   A query with LIMIT should not cause an OOM. It should be served directly 
by a fetch task.
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [X] I'm willing to submit a PR!


