[ https://issues.apache.org/jira/browse/ASTERIXDB-1556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15410502#comment-15410502 ]
Michael J. Carey commented on ASTERIXDB-1556:
---------------------------------------------

I don't think step (4) makes sense or is needed. If the sum of the space (D+H) exceeds the budget, invoke the algorithm's current spilling logic - end of change. We needn't change the spilling policy itself, not logically - we just have to change the definition of "too full" to consider the space being used to be D+H instead of D alone. The rest of the logic should remain unchanged. Anything more than that seems like unnecessary complexity. (Not sure what it would accomplish.) Steps (1)-(3) make perfect sense and sound good/right to me.

If you want to clean this up even more, budget-wise, perhaps you could slightly change the logic to first ask the Hash Table how many frames it would need to add one entry. Its answer could be 0 (which would almost always be the case), 1, or 2. You could then pass that info in to the Data Table buffer manager (i.e., tell it how big the insert will cause the total amount of HT space to be) so that it knows what the total impact of the operation would be on space used - and then it could make the more global decision itself.

Could you draw a picture of how memory is used when all this is happening and put it in the docs somewhere? One thing I am uncertain about is how memory looks with multiple partitions, and I would like to be sure we've got things under proper control in that respect. (I am wondering how things are set up to make spilling fairly efficient/painless.)

> Prefix-based multi-way Fuzzy-join generates an exception.
> ---------------------------------------------------------
>
>                 Key: ASTERIXDB-1556
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1556
>             Project: Apache AsterixDB
>          Issue Type: Bug
>            Reporter: Taewoo Kim
>            Assignee: Taewoo Kim
>         Attachments: 2wayjoin.pdf, 2wayjoin.rtf, 2wayjoinplan.rtf, 3wayjoin.pdf, 3wayjoin.rtf, 3wayjoinplan.rtf
>
>
> When we enable prefix-based fuzzy-join and apply the multi-way fuzzy-join (> 2), the system generates an out-of-memory exception.
> Since a fuzzy-join is created using 30-40 lines of AQL code and this AQL is translated into a massive number of operators (more than 200 operators in the plan for a 3-way fuzzy join), it could generate an out-of-memory exception.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
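
For reference, here is a minimal Java sketch of the budget check described in the comment: ask the hash table (H) how many frames one more entry would need, pass the resulting total H size to the data table (D) buffer manager, and let it decide whether D+H would exceed the budget. Every name in it (JoinMemoryBudgetSketch, HashTableSketch, DataTableSketch, insertOrSpill) and the fixed per-frame capacities are assumptions made for illustration, not AsterixDB or Hyracks classes. This toy hash table only ever answers 0 or 1, whereas the comment notes the real answer could be 0, 1, or 2.

```java
public class JoinMemoryBudgetSketch {

    /** Hypothetical hash table (H): grows by one frame whenever its frames are full. */
    static final class HashTableSketch {
        private final int entriesPerFrame;
        private int entries = 0;
        private int frames = 0;

        HashTableSketch(int entriesPerFrame) {
            this.entriesPerFrame = entriesPerFrame;
        }

        /** Frames H would have to allocate to hold one more entry (0 or 1 in this sketch). */
        int framesNeededForInsert() {
            return entries + 1 > frames * entriesPerFrame ? 1 : 0;
        }

        void insert() {
            frames += framesNeededForInsert();
            entries++;
        }

        int frames() {
            return frames;
        }
    }

    /** Hypothetical data table buffer manager (D) that owns the combined D+H check. */
    static final class DataTableSketch {
        private final int budgetFrames;
        private final int tuplesPerFrame;
        private int tuples = 0;
        private int frames = 0;

        DataTableSketch(int budgetFrames, int tuplesPerFrame) {
            this.budgetFrames = budgetFrames;
            this.tuplesPerFrame = tuplesPerFrame;
        }

        /**
         * Tries to store one tuple. hashFramesAfterInsert is the total size H
         * would have after the matching hash table insert. Returns false when
         * D+H would exceed the budget, so the caller should spill.
         */
        boolean tryInsert(int hashFramesAfterInsert) {
            int extraDataFrames = tuples + 1 > frames * tuplesPerFrame ? 1 : 0;
            int dataFramesAfterInsert = frames + extraDataFrames;
            if (dataFramesAfterInsert + hashFramesAfterInsert > budgetFrames) {
                return false; // "too full" now means D+H, not D alone
            }
            frames = dataFramesAfterInsert;
            tuples++;
            return true;
        }

        int frames() {
            return frames;
        }
    }

    /** Inserts one tuple into D and H, or reports that the existing spilling logic should run. */
    static boolean insertOrSpill(DataTableSketch d, HashTableSketch h) {
        int hashFramesAfterInsert = h.frames() + h.framesNeededForInsert();
        if (!d.tryInsert(hashFramesAfterInsert)) {
            return false; // hand over to the unchanged spilling policy
        }
        h.insert();
        return true;
    }

    public static void main(String[] args) {
        DataTableSketch d = new DataTableSketch(4, 2); // budget of 4 frames for D+H
        HashTableSketch h = new HashTableSketch(2);
        int inserted = 0;
        while (insertOrSpill(d, h)) {
            inserted++;
        }
        System.out.println(inserted + " tuples fit, D=" + d.frames() + " H=" + h.frames());
    }
}
```

In this sketch the spilling policy itself is untouched: insertOrSpill only changes when it is triggered, which is the one change the comment argues for.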