[ 
https://issues.apache.org/jira/browse/ASTERIXDB-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15476210#comment-15476210
 ] 

ASF subversion and git services commented on ASTERIXDB-1628:
------------------------------------------------------------

Commit b0dc27e8dfb8a0cc909874fe1bbaaffa97ddfc29 in asterixdb's branch 
refs/heads/master from [~wangsaeu]
[ https://git-wip-us.apache.org/repos/asf?p=asterixdb.git;h=b0dc27e ]

ASTERIXDB-1628: Fixed an issue in External Hash Group By

 - The number of partitions in External Hash Group By is now
   properly calculated by considering a corner case.

Change-Id: I8901d2b64659fb0d2b97d73f45a9fe113232e860
Reviewed-on: https://asterix-gerrit.ics.uci.edu/1144
Tested-by: Jenkins <jenk...@fulliautomatix.ics.uci.edu>
Integration-Tests: Jenkins <jenk...@fulliautomatix.ics.uci.edu>
Reviewed-by: Taewoo Kim <wangs...@yahoo.com>


> The number of partitions in External Hash-Groupby is calculated improperly 
> for smaller data size.
> -------------------------------------------------------------------------------------------------
>
>                 Key: ASTERIXDB-1628
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1628
>             Project: Apache AsterixDB
>          Issue Type: Bug
>            Reporter: Taewoo Kim
>            Assignee: Taewoo Kim
>              Labels: soon
>
> If the number of frames required for the input data (e.g., an external 
> file), say A, is slightly larger than the number of available frames 
> (= memory budget), say B, then the number of partitions may be calculated 
> as 1, which causes an infinite loop during the merge phase.
> If the number of partitions is 1, the current code assumes that no spilling 
> occurs (i.e., the data fits in the memory budget) and that the output of the 
> build phase can be emitted directly as the final output. 
> However, if A > B, a spill does happen, and once a partition has been 
> spilled to disk, it cannot be emitted as the final output. The merge process 
> therefore goes to the next round, which again creates only one partition and 
> again tries to emit it as the final output, but cannot. Thus, an infinite 
> loop begins.
> The resolution is that if A > B, the number of partitions must not be set 
> to one.
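
The corner case described above can be illustrated with a minimal sketch.
This is not the AsterixDB code: the function names and the partition formula
(overflowing frames divided by the frames usable per partition) are
assumptions chosen only to reproduce the "A slightly larger than B yields 1"
behavior, and the guard shows one way the stated resolution could be applied.

```python
import math


def naive_num_partitions(data_frames: int, memory_frames: int) -> int:
    """Hypothetical partition-count formula (an assumption, not the real one).

    data_frames   -- A: frames needed to hold the input data
    memory_frames -- B: frames available in the memory budget
    """
    if memory_frames < 2:
        raise ValueError("memory budget must be at least 2 frames")
    if data_frames <= memory_frames:
        # Everything fits in memory: a single partition, no spill.
        return 1
    # Frames that do not fit, divided by the frames usable per partition
    # (one frame is assumed to be reserved, e.g. for output).
    overflow = data_frames - memory_frames
    return max(1, math.ceil(overflow / (memory_frames - 1)))


def guarded_num_partitions(data_frames: int, memory_frames: int) -> int:
    """Same formula, but never returns 1 when the data exceeds the budget."""
    n = naive_num_partitions(data_frames, memory_frames)
    if data_frames > memory_frames and n < 2:
        # A > B means a spill will happen, so a single partition would be
        # re-created on every merge round. Force at least two partitions.
        return 2
    return n
```

With A = 101 and B = 100, the unguarded formula yields one partition even
though a spill is unavoidable, while the guarded version yields two. When the
data fits in memory (A <= B), both versions still return one partition.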



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
