[ 
https://issues.apache.org/jira/browse/HIVE-5705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-5705:
-----------------------------------

    Description: 
Right now, if TopN overruns the memory threshold, it disables itself if it 
couldn't directly exclude rows as they are sent; for this purpose it doesn't 
count rows that were initially put in the heap and later superseded (evicted). 
That's reasonable in most cases, but if N is relatively small and map output is 
large, the cost could still be worth it even if rows don't get excluded 
immediately and are only evicted after being stored for some time. We'd pay for 
some memory copies but emit far fewer rows.

  was:
Right now, if TopN overruns the memory threshold, it disables itself if it 
couldn't directly exclude rows as they are sent; for this purpose it doesn't 
count rows that were initially put in the heap and later superseded (evicted). 
That's reasonable in most cases, but if N is relatively small and map output is 
large, the cost could still be worth it even if rows don't get excluded. We'd 
pay for some memory copies but emit far fewer rows.


> TopN might use better heuristic for disable
> -------------------------------------------
>
>                 Key: HIVE-5705
>                 URL: https://issues.apache.org/jira/browse/HIVE-5705
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Sergey Shelukhin
>            Priority: Minor
>
> Right now, if TopN overruns the memory threshold, it disables itself if it 
> couldn't directly exclude rows as they are sent; for this purpose it doesn't 
> count rows that were initially put in the heap and later superseded (evicted). 
> That's reasonable in most cases, but if N is relatively small and map output 
> is large, the cost could still be worth it even if rows don't get excluded 
> immediately and are only evicted after being stored for some time. We'd pay 
> for some memory copies but emit far fewer rows.
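
To make the difference between the two heuristics concrete, here is a rough 
sketch in Python. It is not Hive's Java code: the names run_top_n, TopNStats 
and should_disable, the min_saved_ratio parameter, and the "saved rows below a 
fraction of rows seen" rule are all assumptions made for illustration. It only 
shows how counting evictions alongside direct exclusions can change the 
decision to disable.

```python
import heapq
from dataclasses import dataclass


@dataclass
class TopNStats:
    seen: int = 0       # rows offered to the filter
    excluded: int = 0   # rows rejected on arrival, never stored
    evicted: int = 0    # rows stored first, then pushed out by a better row


def run_top_n(keys, n):
    """Keep the n smallest keys; return (sorted retained keys, stats)."""
    heap = []  # max-heap via negated keys: -heap[0] is the worst retained key
    stats = TopNStats()
    for key in keys:
        stats.seen += 1
        if len(heap) < n:
            heapq.heappush(heap, -key)
        elif key >= -heap[0]:
            stats.excluded += 1            # cheap case: the row is never copied
        else:
            heapq.heapreplace(heap, -key)  # costs a copy, but still drops a row
            stats.evicted += 1
    return sorted(-k for k in heap), stats


def should_disable(stats, count_evictions, min_saved_ratio=0.1):
    """Hypothetical rule: disable when too few rows were saved from emission."""
    saved = stats.excluded + (stats.evicted if count_evictions else 0)
    return saved < min_saved_ratio * stats.seen


# Worst case for direct exclusion: keys arrive in descending order, so every
# new row beats the heap and nothing is ever rejected on arrival.
kept, stats = run_top_n(range(1000, 0, -1), 10)
print(kept, stats)
print(should_disable(stats, count_evictions=False))  # exclusions only
print(should_disable(stats, count_evictions=True))   # exclusions + evictions
```

With this input no row is excluded directly, so a rule that looks only at 
exclusions turns the filter off, even though 990 of the 1000 rows are evicted 
and never emitted. Counting evictions keeps the filter on in this case.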



--
This message was sent by Atlassian JIRA
(v6.1#6144)
