[ 
https://issues.apache.org/jira/browse/DERBY-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16097260#comment-16097260
 ] 

Harshvardhan Gupta edited comment on DERBY-6938 at 7/22/17 11:41 AM:
---------------------------------------------------------------------

Hi Bryan,

The Optimizer produces memory and cost estimates for all the possible access 
paths that it can build. Derby will only consider those HASH Joins to go 
through which it _predicts_ will fit into memory and rejects all others 
(including those which will be very efficient for some queries as opposed to 
loop based joins).

However, the memory predictions of Derby may not match memory requirement at 
execution and HASH table may become larger than what Derby predicted. DERBY-106 
was an effort to mitigate the OOM exceptions arising out of this behaviour. So 
although HASH tables do spill to disk but only for those access path go forward 
to execution where Derby predicts that the HASH Tables will not spill to disk.

The above observations of mine match with the comment on DERBY-1259 which was 
written a decade ago.

Coming to the resolution of this problem, we need to investigate how to measure 
the cost of execution when Derby will eventually allow those HASH based access 
paths that it predicts will spill to disk instead of simply ignoring them.

Also, interesting is the observation that Derby currently allows hash based 
joins when users specify their join strategies via Query Hints. 



was (Author: harshvardhan145):
Hi Bryan,

The Optimizer produces memory and cost estimates for all the possible access 
paths that it can build. Derby will only consider those HASH Joins to go 
through which it _predicts _ will fit into memory and rejects all others 
(including those which will be very efficient for some queries as opposed to 
loop based joins).

However, the memory predictions of Derby may not match memory requirement at 
execution and HASH table may become larger than what Derby predicted. DERBY-106 
was an effort to mitigate the OOM exceptions arising out of this behaviour. So 
although HASH tables do spill to disk but only for those access path go forward 
to execution where Derby predicts that the HASH Tables will not spill to disk.

The above observations of mine match with the comment on DERBY-1259 which was 
written a decade ago.

Coming to the resolution of this problem, we need to investigate how to measure 
the cost of execution when Derby will eventually allow those HASH based access 
paths that it predicts will spill to disk instead of simply ignoring them.

Also, interesting is the observation that Derby currently allows hash based 
joins when users specify their join strategies via Query Hints. 


>  Obtain cardinality estimates and true estimates for base tables as well as 
> for intermediate results for queries involving multiple joins. 
> -------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-6938
>                 URL: https://issues.apache.org/jira/browse/DERBY-6938
>             Project: Derby
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Harshvardhan Gupta
>            Assignee: Harshvardhan Gupta
>         Attachments: explain.txt
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to