[
https://issues.apache.org/jira/browse/PHOENIX-6436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17314644#comment-17314644
]
ASF GitHub Bot commented on PHOENIX-6436:
-----------------------------------------
lhofhansl commented on pull request #1189:
URL: https://github.com/apache/phoenix/pull/1189#issuecomment-813118325
It turns out that for large sets Trino actually does the topN itself, even
though it has to pull in all the data. So fixing this here is less important
now, since I'm limiting topN pushdown in Trino. The estimate is still wrong, though.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
> OrderedResultIterator overestimates memory requirements.
> --------------------------------------------------------
>
> Key: PHOENIX-6436
> URL: https://issues.apache.org/jira/browse/PHOENIX-6436
> Project: Phoenix
> Issue Type: Wish
> Reporter: Lars Hofhansl
> Priority: Major
>
> Just came across this.
> The size estimation is: {{(limit + offset) * estimatedEntrySize}},
> using just the passed limit and offset, and this estimate is applied to each
> single scan.
> This is far too pessimistic when a large limit is passed merely as a safety
> measure.
> Suppose you pass a limit of 10,000,000. That is the overall limit, but Phoenix
> will apply it to every scan (at least one per involved region) and reserve that
> much memory from the pool.
> I'm not sure what a better estimate would be. Ideally we'd divide by the number
> of involved regions (with some fudge factor), or use a size estimate of the region.
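To make the overestimation concrete, here is a minimal sketch contrasting the current per-scan estimate with a hypothetical per-region split. The class and method names (`currentEstimate`, `perRegionEstimate`) are illustrative only, not Phoenix API; the per-region variant simply divides the limit evenly across the scans (rounding up) so the scans together still cover the full limit.

```java
// Sketch only: illustrates the memory-estimate arithmetic discussed above.
// None of these names exist in Phoenix; they are hypothetical.
public class MemoryEstimateSketch {

    // Current behavior: every scan reserves (limit + offset) * estimatedEntrySize.
    static long currentEstimate(long limit, long offset, long entrySize) {
        return (limit + offset) * entrySize;
    }

    // Alternative (assumption): split the limit evenly across the involved
    // regions, rounding up so the per-scan shares sum to at least the limit.
    static long perRegionEstimate(long limit, long offset, long entrySize, int regions) {
        long perScanRows = (limit + regions - 1) / regions + offset;
        return perScanRows * entrySize;
    }

    public static void main(String[] args) {
        long limit = 10_000_000L;  // the "safety" limit from the description
        long entrySize = 100L;     // assumed estimated entry size in bytes
        int regions = 50;          // assumed number of involved regions

        // Per scan: 10,000,000 * 100 = 1 GB reserved, for *each* of 50 scans.
        System.out.println(currentEstimate(limit, 0L, entrySize));
        // Per scan with the split: 200,000 * 100 = 20 MB reserved.
        System.out.println(perRegionEstimate(limit, 0L, entrySize, regions));
    }
}
```

With 50 regions, the current scheme would try to reserve roughly 50 GB in aggregate for a query that can return at most 1 GB of rows, which is the pessimism the report describes.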
--
This message was sent by Atlassian Jira
(v8.3.4#803005)