Re: [PR] [AMORO-3239] Fix stack overflow caused by reading too many partitions in the filter [amoro]

via GitHub Fri, 11 Oct 2024 21:01:21 -0700


majin1102 commented on PR #3240:
URL: https://github.com/apache/amoro/pull/3240#issuecomment-2408341191


   > > Is this a bad use case for Iceberg expressions?
   > > I mean, a partition filter is what we would need in this case, especially 
for so many partitions and a large amount of data per partition. What if we do 
not use Iceberg expressions here and use a set to filter, or something else? 
Could that solve the problem?
   > 
   > Yes, constructing an Iceberg expression with too many conditions is a bad 
case. We can filter the data files ourselves rather than passing the filter to 
the Iceberg scan. That would not improve plan performance, but it would still 
save some memory in our optimizing plan process.
   > 
   > We can improve this case in another PR.
   > 
   > > On the other hand, if we do not filter partitions, the evaluation stage 
is somewhat insignificant; we could eliminate the pending partitions in 
pendingInput to save DB storage.
   > 
   > Yes, we may drop the partition set in the pending state to save DB storage 
in the current implementation.
   
   ‘optimizer.ignore-filter-partition-count’
   I think this parameter is hard to describe in the documentation, since it 
appears to be a temporary solution and is not very general.
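
For illustration only, the set-based filtering idea raised in the quoted discussion could look roughly like the sketch below. It is not the PR's code: `PartitionSetFilter`, `FileEntry`, and `filterByPartitions` are hypothetical names, and `FileEntry` is a stand-in for whatever file type the planner actually iterates over. The point is that membership in a `HashSet` is checked per file, so no nested expression tree is built and recursion depth does not grow with the number of partitions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; none of these names come from Amoro or Iceberg.
class PartitionSetFilter {

    // Stand-in for a scanned data file: a path plus its partition key.
    static final class FileEntry {
        final String path;
        final String partition;

        FileEntry(String path, String partition) {
            this.path = path;
            this.partition = partition;
        }
    }

    // Keep only the files whose partition is in the pending set.
    // Each lookup is a hash probe, and the loop is iterative, so the
    // cost per file does not depend on how many partitions are pending.
    static List<FileEntry> filterByPartitions(
            Iterable<FileEntry> files, Set<String> pendingPartitions) {
        List<FileEntry> kept = new ArrayList<>();
        for (FileEntry file : files) {
            if (pendingPartitions.contains(file.partition)) {
                kept.add(file);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<FileEntry> files = Arrays.asList(
                new FileEntry("a.parquet", "dt=2024-10-01"),
                new FileEntry("b.parquet", "dt=2024-10-02"),
                new FileEntry("c.parquet", "dt=2024-10-03"));
        Set<String> pending =
                new HashSet<>(Arrays.asList("dt=2024-10-01", "dt=2024-10-03"));

        for (FileEntry file : filterByPartitions(files, pending)) {
            System.out.println(file.path + " (" + file.partition + ")");
        }
    }
}
```

As noted in the quoted reply, filtering this way happens after the scan has produced the files, so it would not be expected to speed up planning; it only avoids building a very large filter expression.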


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@amoro.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
