[ 
https://issues.apache.org/jira/browse/ASTERIXDB-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15425608#comment-15425608
 ] 

Taewoo Kim commented on ASTERIXDB-1556:
---------------------------------------

[~dtabass] suggested an idea for garbage collection, and I fully agree with 
it. It is feasible without changing the current structure. The missing piece 
supplied by [~dtabass] is written in red. Here are the steps for the garbage 
collection:

#1. Allocate a new frame.
#2. Read a content frame of the hash table.
#3. Read a slot's information and check the slot's used count. If it is 
greater than zero (meaning that the slot is in use), copy the slot into the 
newly allocated frame and update the corresponding h() value pointer in the 
header frame to point to this new location. {color:red}*We can find the h() 
value of the slot using the first tuple pointer in the slot*.{color} If the 
count is zero, reset the corresponding h() value pointer in the header frame, 
again using the first tuple pointer in the slot. 
#4. Once a content frame has been fully read, deallocate that content frame. 
#5. Repeat #2 - #4. When the newly allocated frame becomes full, allocate 
another new frame and continue.
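
The steps above can be sketched roughly as follows. This is only an 
illustration under simplifying assumptions, not the actual hash table code: 
the class HashTableGcSketch, its fields, FRAME_CAPACITY, the integer tuple 
keys, and the modulo-based h() are all hypothetical stand-ins, and it assumes 
at most one slot per h() value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of the hash table; not the AsterixDB classes.
class HashTableGcSketch {
    static final int FRAME_CAPACITY = 2; // slots per content frame (assumed)
    static final int NO_SLOT = -1;       // header entry that points nowhere

    static final class Slot {
        final int usedCount;         // number of entries in use in this slot
        final int firstTuplePointer; // here: an index into tupleKeys
        Slot(int usedCount, int firstTuplePointer) {
            this.usedCount = usedCount;
            this.firstTuplePointer = firstTuplePointer;
        }
    }

    final int[] tupleKeys;   // stand-in for the stored tuples
    final int[] headerFrame; // h() value -> slot location, or NO_SLOT
    List<List<Slot>> contentFrames = new ArrayList<>();

    HashTableGcSketch(int[] tupleKeys, int headerSize) {
        this.tupleKeys = tupleKeys;
        this.headerFrame = new int[headerSize];
        Arrays.fill(headerFrame, NO_SLOT);
    }

    // Recover the h() value of a slot from its first tuple pointer.
    int h(int firstTuplePointer) {
        return Math.floorMod(tupleKeys[firstTuplePointer], headerFrame.length);
    }

    // Setup helper: append a slot and record its location in the header.
    void addSlot(int usedCount, int firstTuplePointer) {
        if (contentFrames.isEmpty()
                || contentFrames.get(contentFrames.size() - 1).size() == FRAME_CAPACITY) {
            contentFrames.add(new ArrayList<>());
        }
        List<Slot> last = contentFrames.get(contentFrames.size() - 1);
        last.add(new Slot(usedCount, firstTuplePointer));
        headerFrame[h(firstTuplePointer)] =
                (contentFrames.size() - 1) * FRAME_CAPACITY + last.size() - 1;
    }

    void collectGarbage() {
        List<List<Slot>> compacted = new ArrayList<>();
        List<Slot> target = new ArrayList<>();              // #1: allocate a new frame
        compacted.add(target);
        for (int f = 0; f < contentFrames.size(); f++) {
            List<Slot> source = contentFrames.get(f);       // #2: read a content frame
            for (Slot slot : source) {                      // #3: inspect each slot
                int hash = h(slot.firstTuplePointer);
                if (slot.usedCount > 0) {
                    if (target.size() == FRAME_CAPACITY) {  // #5: new frame is full
                        target = new ArrayList<>();
                        compacted.add(target);
                    }
                    target.add(slot);
                    headerFrame[hash] =
                            (compacted.size() - 1) * FRAME_CAPACITY + target.size() - 1;
                } else {
                    headerFrame[hash] = NO_SLOT;            // unused slot: reset pointer
                }
            }
            contentFrames.set(f, null);                     // #4: release the read frame
        }
        contentFrames = compacted;
    }
}
```

In this sketch each old content frame is released as soon as it has been 
read, so the extra memory held at any time is about one frame.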

> Hash Table used by External hash group-by doesn't conform to the budget.
> ------------------------------------------------------------------------
>
>                 Key: ASTERIXDB-1556
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1556
>             Project: Apache AsterixDB
>          Issue Type: Bug
>            Reporter: Taewoo Kim
>            Assignee: Taewoo Kim
>              Labels: soon
>         Attachments: 2wayjoin.pdf, 2wayjoin.rtf, 2wayjoinplan.rtf, 
> 3wayjoin.pdf, 3wayjoin.rtf, 3wayjoinplan.rtf
>
>
> When we enable prefix-based fuzzy-join and apply a multi-way fuzzy-join (more 
> than 2 ways), the system generates an out-of-memory exception. 
> Since a fuzzy-join is created using 30-40 lines of AQL code and this AQL is 
> translated into a massive number of operators (more than 200 operators in the 
> plan for a 3-way fuzzy join), it could generate an out-of-memory exception.
> /// Update: as the discussion progressed, we found that the hash table in the 
> external hash group-by doesn't conform to the frame limit. So, an 
> out-of-memory exception happens during the execution of an external hash 
> group-by operator.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
