[ 
https://issues.apache.org/jira/browse/ASTERIXDB-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Taewoo Kim closed ASTERIXDB-1628.
---------------------------------
    Resolution: Fixed

> The number of partitions in External Hash-Groupby is calculated improperly 
> for smaller data size.
> -------------------------------------------------------------------------------------------------
>
>                 Key: ASTERIXDB-1628
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1628
>             Project: Apache AsterixDB
>          Issue Type: Bug
>            Reporter: Taewoo Kim
>            Assignee: Taewoo Kim
>              Labels: soon
>
> If the number of frames required for a data (e.g., external file), say A,  is 
> slightly larger than the number of available frames (= memory budget), say B, 
> then the number of partitions may be calculated as 1 and it will cause the 
> infinite cycles during the merge phase.
> If the number of partition is 1, the current code assumes that there is no 
> spilling due to the out of memory budget and the output of the build phase is 
> directly generated as the final output. 
> But, if A > B, then a spill would happen and once a partition is spilled to 
> the disk, it can't be generated as the final output. So, the merge process 
> goes to the next round that just creates only one partition again and tries 
> to generate some as final output. But, it can't. Thus, an infinite cycle 
> begins.
> The resolution is that if A > B, we should not set the number of partition as 
> one.   



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to