[
https://issues.apache.org/jira/browse/ASTERIXDB-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Taewoo Kim closed ASTERIXDB-1628.
---------------------------------
Resolution: Fixed
> The number of partitions in External Hash-Groupby is calculated improperly
> for smaller data size.
> -------------------------------------------------------------------------------------------------
>
> Key: ASTERIXDB-1628
> URL: https://issues.apache.org/jira/browse/ASTERIXDB-1628
> Project: Apache AsterixDB
> Issue Type: Bug
> Reporter: Taewoo Kim
> Assignee: Taewoo Kim
> Labels: soon
>
> If the number of frames required for a data (e.g., external file), say A, is
> slightly larger than the number of available frames (= memory budget), say B,
> then the number of partitions may be calculated as 1 and it will cause the
> infinite cycles during the merge phase.
> If the number of partition is 1, the current code assumes that there is no
> spilling due to the out of memory budget and the output of the build phase is
> directly generated as the final output.
> But, if A > B, then a spill would happen and once a partition is spilled to
> the disk, it can't be generated as the final output. So, the merge process
> goes to the next round that just creates only one partition again and tries
> to generate some as final output. But, it can't. Thus, an infinite cycle
> begins.
> The resolution is that if A > B, we should not set the number of partition as
> one.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)