[ https://issues.apache.org/jira/browse/ASTERIXDB-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15476210#comment-15476210 ]
ASF subversion and git services commented on ASTERIXDB-1628:
------------------------------------------------------------

Commit b0dc27e8dfb8a0cc909874fe1bbaaffa97ddfc29 in asterixdb's branch refs/heads/master from [~wangsaeu]
[ https://git-wip-us.apache.org/repos/asf?p=asterixdb.git;h=b0dc27e ]

ASTERIXDB-1628: Fixed an issue in External Hash Group by

 - The number of partitions in External Hash Group By is now properly
   calculated by considering a corner case.

Change-Id: I8901d2b64659fb0d2b97d73f45a9fe113232e860
Reviewed-on: https://asterix-gerrit.ics.uci.edu/1144
Tested-by: Jenkins <jenk...@fulliautomatix.ics.uci.edu>
Integration-Tests: Jenkins <jenk...@fulliautomatix.ics.uci.edu>
Reviewed-by: Taewoo Kim <wangs...@yahoo.com>


> The number of partitions in External Hash-Groupby is calculated improperly
> for smaller data size.
> -------------------------------------------------------------------------------------------------
>
>                 Key: ASTERIXDB-1628
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1628
>             Project: Apache AsterixDB
>          Issue Type: Bug
>            Reporter: Taewoo Kim
>            Assignee: Taewoo Kim
>              Labels: soon
>
> If the number of frames required for a dataset (e.g., an external file), say A,
> is slightly larger than the number of available frames (= memory budget), say B,
> then the number of partitions may be calculated as 1, which causes an
> infinite cycle during the merge phase.
> If the number of partitions is 1, the current code assumes that no spilling
> occurs from exceeding the memory budget, and the output of the build phase is
> generated directly as the final output.
> But if A > B, a spill does happen, and once a partition is spilled to
> disk, it can't be generated as the final output. So the merge process
> goes to the next round, which again creates only one partition and tries
> to generate it as the final output. But it can't. Thus, an infinite cycle
> begins.
> The resolution is that if A > B, the number of partitions should not be set
> to one.
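
The corner case in the description can be illustrated with a minimal Java sketch. This is not the code from the commit: the class name PartitionCountSketch, both method names, and the integer-division estimate are assumptions made up for illustration. A from the description maps to inputFrames and B to budgetFrames.

```java
// Hypothetical sketch only; NOT the actual AsterixDB/Hyracks implementation.
class PartitionCountSketch {

    // Assumed stand-in for a partition estimate that rounds down.
    // When inputFrames (A) is only slightly larger than budgetFrames (B),
    // the integer division yields 1, which is the problematic case.
    static int naivePartitions(long inputFrames, long budgetFrames) {
        long estimate = inputFrames / budgetFrames;
        return (int) Math.max(1, estimate);
    }

    // Same estimate, plus the guard the issue asks for:
    // if A > B a spill is unavoidable, so never return a single partition.
    static int guardedPartitions(long inputFrames, long budgetFrames) {
        int partitions = naivePartitions(inputFrames, budgetFrames);
        if (inputFrames > budgetFrames && partitions < 2) {
            return 2;
        }
        return partitions;
    }
}
```

With A = 105 and B = 100, naivePartitions returns 1 although the data does not fit in memory, while guardedPartitions returns 2. With A = 80 and B = 100, both return 1, which is safe because nothing needs to spill.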