[ 
https://issues.apache.org/jira/browse/ASTERIXDB-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421927#comment-15421927
 ] 

Taewoo Kim commented on ASTERIXDB-1556:
---------------------------------------

I just uploaded a patch set that solves the priority #1 issue: it lets the 
system admin specify the size, in bytes, of the hash table used in the 
external hash group-by. If the size is not given in the config file, the hash 
table defaults to 50% of the "compiler.groupmemory" setting. If the size is 
greater than "compiler.groupmemory", a run-time exception is thrown when the 
external hash group-by operator is opened. In my change, we don't throw the 
exception at compilation time, since there might be optimizations for this 
value during compilation. Also, the sum of the frames used for the hash table 
and the data table in the external hash group-by now conforms to the budget.

https://asterix-gerrit.ics.uci.edu/#/c/1056/4
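
The sizing rule described above can be sketched roughly as follows. This is 
only an illustration, not the code in the patch set: the class name, method 
names, and the frame-based rounding are all assumptions made for the example. 
Only the 50% default, the run-time check against "compiler.groupmemory", and 
the requirement that both tables fit in one budget come from the comment.

```java
// Hypothetical sketch; none of these names come from AsterixDB itself.
final class GroupByMemoryBudget {

    // Resolve the hash-table size in bytes. A null value models "not given in
    // the config file", in which case half of the group memory is used.
    static long resolveHashTableBytes(long groupMemoryBytes, Long configuredHashTableBytes) {
        if (configuredHashTableBytes == null) {
            return groupMemoryBytes / 2;
        }
        return configuredHashTableBytes;
    }

    // Check performed when the operator is opened (run time), not at compile time.
    static void checkAtOpen(long groupMemoryBytes, long hashTableBytes) {
        if (hashTableBytes > groupMemoryBytes) {
            throw new IllegalStateException("hash-table size " + hashTableBytes
                    + " exceeds group memory " + groupMemoryBytes);
        }
    }

    // Total number of whole frames available in the group-by budget.
    static long totalFrames(long groupMemoryBytes, int frameSizeBytes) {
        return groupMemoryBytes / frameSizeBytes;
    }

    // Frames charged to the hash table, rounded up to whole frames (assumption).
    static long hashTableFrames(long hashTableBytes, int frameSizeBytes) {
        return (hashTableBytes + frameSizeBytes - 1) / frameSizeBytes;
    }

    // Frames left for the data table, so that hash table + data table never
    // exceed the total budget.
    static long dataTableFrames(long groupMemoryBytes, long hashTableBytes, int frameSizeBytes) {
        checkAtOpen(groupMemoryBytes, hashTableBytes);
        return totalFrames(groupMemoryBytes, frameSizeBytes)
                - hashTableFrames(hashTableBytes, frameSizeBytes);
    }
}
```

For example, assuming a 32 MB group memory and 32 KB frames, the budget is 
1024 frames; with no configured size the hash table takes 512 frames and the 
data table gets the remaining 512.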




> Hash Table used by External hash group-by doesn't conform to the budget.
> ------------------------------------------------------------------------
>
>                 Key: ASTERIXDB-1556
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1556
>             Project: Apache AsterixDB
>          Issue Type: Bug
>            Reporter: Taewoo Kim
>            Assignee: Taewoo Kim
>         Attachments: 2wayjoin.pdf, 2wayjoin.rtf, 2wayjoinplan.rtf, 
> 3wayjoin.pdf, 3wayjoin.rtf, 3wayjoinplan.rtf
>
>
> When we enable prefix-based fuzzy-join and apply a multi-way fuzzy-join (more 
> than 2-way), the system generates an out-of-memory exception. 
> Since a fuzzy-join is written as 30-40 lines of AQL code and this AQL is 
> translated into a massive number of operators (more than 200 operators in the 
> plan for a 3-way fuzzy join), it could generate an out-of-memory exception.
> /// Update: as the discussion progressed, we found that the hash table in the 
> external hash group-by doesn't conform to the frame limit. So, an 
> out-of-memory exception happens during the execution of an external hash 
> group-by operator.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
