[ https://issues.apache.org/jira/browse/ASTERIXDB-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15672016#comment-15672016 ]

Taewoo Kim commented on ASTERIXDB-1556:
---------------------------------------

Discussion with [~dtabass] and [[email protected]]: 

1) To solve the atomicity issue that arises when an insertion into the Data 
Table succeeds but the corresponding insertion into the Hash Table fails, 
there are two options: split the hash-table insertion into two steps (reserve 
space, then write), or undo the last insertion into the Data Table. The 
latter (undoing) is preferable in terms of code complexity. 
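A minimal sketch of the undo approach, using hypothetical names (not AsterixDB's actual classes): the data-table insertion happens first, and if the hash-table insertion then fails for lack of space, the data-table insertion is rolled back so the two structures stay consistent.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the "undo the last Data Table insertion" scheme.
// The real operator works on frames and byte offsets; lists stand in here.
class GroupByTables {
    final List<int[]> dataTable = new ArrayList<>();
    final List<Integer> hashTable = new ArrayList<>();
    final int hashBudget; // max entries the hash table may hold

    GroupByTables(int hashBudget) {
        this.hashBudget = hashBudget;
    }

    boolean insert(int[] tuple) {
        dataTable.add(tuple);                 // step 1: Data Table insert
        int tupleIndex = dataTable.size() - 1;
        if (hashTable.size() >= hashBudget) { // step 2 cannot succeed
            dataTable.remove(tupleIndex);     // undo step 1
            return false;                     // caller may spill and retry
        }
        hashTable.add(tupleIndex);            // step 2: Hash Table insert
        return true;
    }
}
```

On failure the caller sees a clean "nothing happened" state, which is what makes this simpler than pre-reserving hash-table space before every data-table write.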

2) Regarding the victim-selection policy when we can't allocate a frame for 
the Data Table: a partition that has already been spilled should not be 
chosen again unless the incoming tuple is being inserted into that partition. 
After spilling, only one frame remains for that partition, and repeatedly 
spilling a partial frame (rather than writing a full frame to disk) is not 
desirable. 
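The policy above can be sketched as follows; the method and parameter names are hypothetical, and "largest partition wins" is an assumed tie-breaker, not necessarily AsterixDB's:

```java
// Hypothetical sketch of the victim-selection policy: skip partitions that
// have already spilled (they hold only one frame), unless the incoming
// tuple targets that partition anyway.
class VictimSelector {
    // Returns the index of the partition to spill, or -1 if none qualifies.
    static int selectVictim(int[] framesPerPartition, boolean[] spilled,
                            int incomingPartition) {
        int victim = -1;
        int maxFrames = 0;
        for (int p = 0; p < framesPerPartition.length; p++) {
            // A spilled partition is eligible again only when the incoming
            // tuple is headed for it.
            if (spilled[p] && p != incomingPartition) {
                continue;
            }
            if (framesPerPartition[p] > maxFrames) {
                maxFrames = framesPerPartition[p];
                victim = p;
            }
        }
        return victim;
    }
}
```

For example, with frame counts {3, 5, 1} and only partition 1 spilled, partition 1 is skipped (and partition 0 chosen) unless the incoming tuple belongs to partition 1.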

> Hash Table used by External hash group-by doesn't conform to the budget.
> ------------------------------------------------------------------------
>
>                 Key: ASTERIXDB-1556
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1556
>             Project: Apache AsterixDB
>          Issue Type: Bug
>            Reporter: Taewoo Kim
>            Assignee: Taewoo Kim
>            Priority: Critical
>              Labels: soon
>         Attachments: 2wayjoin.pdf, 2wayjoin.rtf, 2wayjoinplan.rtf, 
> 3wayjoin.pdf, 3wayjoin.rtf, 3wayjoinplan.rtf
>
>
> When we enable the prefix-based fuzzy join and apply a multi-way fuzzy join 
> (more than 2-way), the system generates an out-of-memory exception. 
> Since a fuzzy join is created using 30-40 lines of AQL code and this AQL is 
> translated into a massive number of operators (more than 200 operators in 
> the plan for a 3-way fuzzy join), it can generate an out-of-memory exception.
> /// Update: as the discussion progressed, we found that the hash table in 
> the external hash group-by does not conform to the frame limit, so an 
> out-of-memory exception occurs during the execution of an external hash 
> group-by operator.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)