>> the role reversal happens locally in later iterations of the hash join, right, based on the relative sizes of on-disk hash partitions? That's right.
>>I think the key thing that we need to be consistent about is which is the "basic" outer and inner >> - which in the case of a parallel hash join would be which is the initial build dataset and which is the initial probe dataset. Got it! Thanks! Best, Yingyi On Sun, Jan 10, 2016 at 3:59 PM, Mike Carey <[email protected]> wrote: > Just to be (or get) clear - the role reversal happens locally in later > iterations of the hash join, right, based on the relative sizes of on-disk > hash partitions? I think the key thing that we need to be consistent about > is which is the "basic" outer and inner - which in the case of a parallel > hash join would be which is the initial build dataset and which is the > initial probe dataset. > > > On 1/8/16 9:08 AM, Yingyi Bu wrote: > >> In left-outer hash join, if the the probe branch is locally clustered (or >> sorted) by a column superset of the join key, >> the output will still be locally clustered. >> Inner hash join couldn't maintain that because of the "role reversal" >> optimization in the runtime. >> >> Best, >> Yingyi >> >> On Thu, Jan 7, 2016 at 10:07 PM, Taewoo Kim <[email protected]> wrote: >> >> Interesting. Can you be more specific? >>> >>> Best, >>> Taewoo >>> >>> On Thu, Jan 7, 2016 at 6:36 PM, Yingyi Bu <[email protected]> wrote: >>> >>> Sorry, to be more precise: >>>> Left-outer hash join cannot preserve all local data properties for its >>>> probe branch (because spilling can happen) but can preserve (or >>>> >>> downgrade) >>> >>>> some when certain conditions meet. >>>> >>>> On Thu, Jan 7, 2016 at 6:28 PM, Yingyi Bu <[email protected]> wrote: >>>> >>>> In the logical/physical query plan, I think it is statically >>>>> >>>> determined. >>> >>>> However, that doesn't mean the execution is faithful to that >>>>> >>>> probe/build >>> >>>> decision because we have the "role reversal" optimization for inner >>>>> >>>> hash >>> >>>> joins:-) >>>>> (That's also why our inner hash join cannot maintain any local data >>>>> property from its probe branch, but left-outer hash can preserve that.) >>>>> >>>>> Best, >>>>> Yingyi >>>>> >>>>> >>>>> On Thu, Jan 7, 2016 at 5:34 PM, Mike Carey <[email protected]> wrote: >>>>> >>>>> Also, do we (separately want to make sure that our hash join behavior >>>>>> >>>>> is >>> >>>> comparable - i.e., that the initial build/probe decision is statically >>>>>> determined from the query? (I think we do want that, and I think it >>>>>> >>>>> is >>> >>>> in >>>> >>>>> fact that way - but I'm not 100% sure, as its been awhile since that >>>>>> >>>>> was >>> >>>> discussed, and it's not in-cache for me. :-)) >>>>>> >>>>>> >>>>>> On 1/7/16 3:22 PM, Taewoo Kim wrote: >>>>>> >>>>>> Thanks Yingyi. Yes. If there is an equality condition and if we can't >>>>>>> transform a join into an index-nested loop join, then a hybrid hash >>>>>>> >>>>>> join >>>> >>>>> will be used. >>>>>>> >>>>>>> Best, >>>>>>> Taewoo >>>>>>> >>>>>>> On Thu, Jan 7, 2016 at 3:14 PM, Yingyi Bu <[email protected]> >>>>>>> >>>>>> wrote: >>> >>>> +1! >>>>>>> >>>>>>>> 3. We only try to use applicable indexes from the inner branch. So, >>>>>>>>> >>>>>>>> if >>>> >>>>> there are no applicable indexes from the inner branch, we abort >>>>>>>>>> transforming a join into an index-nested-loop join. >>>>>>>>>> >>>>>>>>>> "index-nested-loop join" -> "hybrid hash join"? >>>>>>>>> >>>>>>>> Thanks! >>>>>>>> >>>>>>>> Yingyi >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Thu, Jan 7, 2016 at 3:01 PM, Taewoo Kim <[email protected]> >>>>>>>> >>>>>>> wrote: >>>> >>>>> Hello dev, >>>>>>>> >>>>>>>>> Regarding this issue (ASTERIXDB-1249), I would like to make an >>>>>>>>> index-join >>>>>>>>> hint clarification. Let's start with an example query other than a >>>>>>>>> self-join query in this issue. >>>>>>>>> >>>>>>>>> for $c in dataset('Customers') >>>>>>>>> >>>>>>>>> for $o in dataset('Orders') >>>>>>>>> >>>>>>>>> where $c.cid /*+ indexnl */ = $o.cid >>>>>>>>> >>>>>>>>> order by $c.cid, $o.oid >>>>>>>>> >>>>>>>>> return {"cid":$c.cid, "oid": $o.oid} >>>>>>>>> >>>>>>>>> Right now, in the master branch, the first dataset (e.g., >>>>>>>>> >>>>>>>> Customers) >>> >>>> becomes the outer branch and the second dataset (e.g., Orders) >>>>>>>>> >>>>>>>> becomes >>>> >>>>> the >>>>>>>> >>>>>>>> inner branch. And, when the optimizer tries to honor the given >>>>>>>>> >>>>>>>> indexnl >>>> >>>>> hint >>>>>>>> >>>>>>>> (transforming a join into an index-nested-loop join), if there are >>>>>>>>> applicable indexes from the inner branch (e.g., Orders), then it is >>>>>>>>> going >>>>>>>>> to use one of those indexes. If there are no applicable indexes >>>>>>>>> >>>>>>>> from >>> >>>> the >>>>>>>>> inner branch, it tries to use indexes from the outer branch (e.g., >>>>>>>>> Customers). We are going to change the last part; we will not use >>>>>>>>> indexes >>>>>>>>> from the outer branch. So, the following are refined rules for >>>>>>>>> >>>>>>>>> transforming >>>>>>>> >>>>>>>> a join into an index-nested-loop join. >>>>>>>>> >>>>>>>>> 1. The first dataset in a join (the first parameter of the given >>>>>>>>> >>>>>>>> join) >>>> >>>>> becomes the outer branch. >>>>>>>>> 2. The second dataset in a join (the second parameter of the given >>>>>>>>> join) >>>>>>>>> becomes the inner branch. >>>>>>>>> 3. We only try to use applicable indexes from the inner branch. So, >>>>>>>>> >>>>>>>> if >>>> >>>>> there are no applicable indexes from the inner branch, we abort >>>>>>>>> transforming a join into an index-nested-loop join. >>>>>>>>> 4. Variable order in the given join predicate is not important. It >>>>>>>>> >>>>>>>> can >>>> >>>>> be >>>>>>>>> either outer.fieldA = inner.fieldB or inner.fieldB = outer.fieldA. >>>>>>>>> >>>>>>>>> So, for the left-outer join and inner join altogether, the left >>>>>>>>> >>>>>>>> subtree >>>> >>>>> is >>>>>>>> >>>>>>>> the probing side and the right subtree is the index side. So, this >>>>>>>>> >>>>>>>> can >>>> >>>>> be >>>>>>>>> applied to the self-join case, too just like the following query in >>>>>>>>> this >>>>>>>>> issue. In the following query, $t1 becomes the outer and $t2 >>>>>>>>> >>>>>>>> becomes >>> >>>> the >>>>>>>>> inner. >>>>>>>>> >>>>>>>>> for $t1 in dataset('TweetMessages') >>>>>>>>> >>>>>>>>> for $t2 in dataset('TweetMessages') >>>>>>>>> >>>>>>>>> let $c := $t1.countA + 20 >>>>>>>>> >>>>>>>>> where $c /* +indexnl */= $t2.countB >>>>>>>>> >>>>>>>>> order by $t2.tweetid >>>>>>>>> >>>>>>>>> return {"tweetid2": $t2.tweetid, "count2":$t2.countB}; >>>>>>>>> >>>>>>>>> Thank you. Any opinion would be appreciated before I finalize this >>>>>>>>> >>>>>>>> fix. >>>> >>>>> Best, >>>>>>>>> Taewoo >>>>>>>>> >>>>>>>>> ---------- Forwarded message ---------- >>>>>>>>> From: Yingyi Bu (JIRA) <[email protected]> >>>>>>>>> Date: Wed, Jan 6, 2016 at 10:23 AM >>>>>>>>> Subject: [jira] [Commented] (ASTERIXDB-1249) Self index join >>>>>>>>> >>>>>>>> chooses >>> >>>> wrong >>>>>>>> >>>>>>>> probe/index branch >>>>>>>>> To: [email protected] >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> [ >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>> https://issues.apache.org/jira/browse/ASTERIXDB-1249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15085992#comment-15085992 >>> >>>> ] >>>>>>>>> >>>>>>>>> Yingyi Bu commented on ASTERIXDB-1249: >>>>>>>>> -------------------------------------- >>>>>>>>> >>>>>>>>> That's right. Basically in AcessMethod implementations, e.g.: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>> https://github.com/apache/incubator-asterixdb/blob/master/asterix-algebra/src/main/java/org/apache/asterix/optimizer/rules/am/RTreeAccessMethod.java#L124 >>> >>>> Choosing probe/index branch is only based on dataset names, instead >>>>>>>>> >>>>>>>> of >>>> >>>>> being based on the join condition. >>>>>>>>> >>>>>>>>> Self index join chooses wrong probe/index branch >>>>>>>>> >>>>>>>>>> ------------------------------------------------ >>>>>>>>>> >>>>>>>>>> Key: ASTERIXDB-1249 >>>>>>>>>> URL: >>>>>>>>>> >>>>>>>>>> https://issues.apache.org/jira/browse/ASTERIXDB-1249 >>>>>>>>> >>>>>>>>> Project: Apache AsterixDB >>>>>>>>>> Issue Type: Bug >>>>>>>>>> Components: Optimizer >>>>>>>>>> Reporter: Yingyi Bu >>>>>>>>>> Assignee: Taewoo Kim >>>>>>>>>> >>>>>>>>>> DDLs: >>>>>>>>>> {noformat} >>>>>>>>>> drop dataverse test if exists; >>>>>>>>>> create dataverse test; >>>>>>>>>> use dataverse test; >>>>>>>>>> create type TwitterUserType as closed { >>>>>>>>>> screen-name: string, >>>>>>>>>> lang: string, >>>>>>>>>> friends-count: int64, >>>>>>>>>> statuses-count: int64, >>>>>>>>>> name: string, >>>>>>>>>> followers-count: int64 >>>>>>>>>> } >>>>>>>>>> create type TweetMessageType as closed { >>>>>>>>>> tweetid: int64, >>>>>>>>>> user: TwitterUserType, >>>>>>>>>> sender-location: point, >>>>>>>>>> send-time: datetime, >>>>>>>>>> referred-topics: {{ string }}, >>>>>>>>>> message-text: string, >>>>>>>>>> countA: int64, >>>>>>>>>> countB: int64 >>>>>>>>>> } >>>>>>>>>> create dataset TweetMessages(TweetMessageType) >>>>>>>>>> primary key tweetid; >>>>>>>>>> create index twmSndLocIx on TweetMessages(sender-location) type >>>>>>>>>> >>>>>>>>> rtree; >>>> >>>>> create index msgCountAIx on TweetMessages(countA) type btree; >>>>>>>>>> create index msgCountBIx on TweetMessages(countB) type btree; >>>>>>>>>> create index msgTextIx on TweetMessages(message-text) type >>>>>>>>>> >>>>>>>>> keyword; >>> >>>> {noformat} >>>>>>>>>> Query: >>>>>>>>>> {noformat} >>>>>>>>>> for $t1 in dataset('TweetMessages') >>>>>>>>>> for $t2 in dataset('TweetMessages') >>>>>>>>>> let $n := create-circle($t1.sender-location, 0.5) >>>>>>>>>> where spatial-intersect($t2.sender-location, $n) >>>>>>>>>> order by $t2.tweetid >>>>>>>>>> return {"tweetid2":$t2.tweetid, "loc2":$t2.sender-location}; >>>>>>>>>> {noformat} >>>>>>>>>> Optimized plan: >>>>>>>>>> {noformat} >>>>>>>>>> distribute result [%0->$$10] >>>>>>>>>> -- DISTRIBUTE_RESULT |PARTITIONED| >>>>>>>>>> exchange >>>>>>>>>> -- ONE_TO_ONE_EXCHANGE |PARTITIONED| >>>>>>>>>> project ([$$10]) >>>>>>>>>> -- STREAM_PROJECT |PARTITIONED| >>>>>>>>>> assign [$$10] <- [function-call: >>>>>>>>>> >>>>>>>>>> asterix:closed-record-constructor, >>>>>>>>> Args:[AString: {tweetid2}, %0->$$15, AString: {loc2}, %0->$$13]] >>>>>>>>> >>>>>>>>> -- ASSIGN |PARTITIONED| >>>>>>>>>> exchange >>>>>>>>>> -- SORT_MERGE_EXCHANGE [$$15(ASC) ] |PARTITIONED| >>>>>>>>>> order (ASC, %0->$$15) >>>>>>>>>> -- STABLE_SORT [$$15(ASC)] |PARTITIONED| >>>>>>>>>> exchange >>>>>>>>>> -- ONE_TO_ONE_EXCHANGE |PARTITIONED| >>>>>>>>>> project ([$$13, $$15]) >>>>>>>>>> -- STREAM_PROJECT |PARTITIONED| >>>>>>>>>> select (function-call: >>>>>>>>>> asterix:spatial-intersect, >>>>>>>>>> >>>>>>>>>> Args:[%0->$$13, function-call: asterix:create-circle, >>>>>>>>> >>>>>>>>> Args:[function-call: >>>>>>>> >>>>>>>> asterix:field-access-by-index, Args:[%0->$$0, AInt32: {2}], >>>>>>>>> >>>>>>>> ADouble: >>> >>>> {0.5}]]) >>>>>>>>> >>>>>>>>> -- STREAM_SELECT |PARTITIONED| >>>>>>>>>> project ([$$0, $$13, $$15]) >>>>>>>>>> -- STREAM_PROJECT |PARTITIONED| >>>>>>>>>> exchange >>>>>>>>>> -- ONE_TO_ONE_EXCHANGE |PARTITIONED| >>>>>>>>>> unnest-map [$$14, $$0] <- function-call: >>>>>>>>>> >>>>>>>>>> asterix:index-search, Args:[AString: {TweetMessages}, AInt32: {0}, >>>>>>>>> >>>>>>>>> AString: >>>>>>>> >>>>>>>> {test}, AString: {TweetMessages}, ABoolean: {true}, ABoolean: >>>>>>>>> >>>>>>>> {false}, >>>> >>>>> ABoolean: {false}, AInt32: {1}, %0->$$27, AInt32: {1}, %0->$$27, >>>>>>>>> >>>>>>>> TRUE, >>>> >>>>> TRUE, TRUE] >>>>>>>>> >>>>>>>>> -- BTREE_SEARCH |PARTITIONED| >>>>>>>>>> exchange >>>>>>>>>> -- ONE_TO_ONE_EXCHANGE |PARTITIONED| >>>>>>>>>> order (ASC, %0->$$27) >>>>>>>>>> -- STABLE_SORT [$$27(ASC)] >>>>>>>>>> >>>>>>>>> |PARTITIONED| >>> >>>> exchange >>>>>>>>>> -- ONE_TO_ONE_EXCHANGE >>>>>>>>>> |PARTITIONED| >>>>>>>>>> project ([$$27, $$13, $$15]) >>>>>>>>>> -- STREAM_PROJECT |PARTITIONED| >>>>>>>>>> exchange >>>>>>>>>> -- ONE_TO_ONE_EXCHANGE >>>>>>>>>> >>>>>>>>> |PARTITIONED| >>>> >>>>> unnest-map [$$23, $$24, $$25, >>>>>>>>>> >>>>>>>>> $$26, >>>> >>>>> $$27] <- function-call: asterix:index-search, Args:[AString: >>>>>>>>> >>>>>>>>> {twmSndLocIx}, >>>>>>>> >>>>>>>> AInt32: {1}, AString: {test}, AString: {TweetMessages}, ABoolean: >>>>>>>>> {true}, >>>>>>>>> ABoolean: {false}, ABoolean: {true}, AInt32: {4}, %0->$$19, >>>>>>>>> >>>>>>>> %0->$$20, >>> >>>> %0->$$21, %0->$$22] >>>>>>>>> >>>>>>>>> -- RTREE_SEARCH |PARTITIONED| >>>>>>>>>> exchange >>>>>>>>>> -- BROADCAST_EXCHANGE >>>>>>>>>> >>>>>>>>>> |PARTITIONED| >>>>>>>>> assign [$$19, $$20, $$21, >>>>>>>>> >>>>>>>> $$22] >>>> >>>>> <- >>>>>>>>> [function-call: asterix:create-mbr, Args:[%0->$$13, AInt32: {2}, >>>>>>>>> AInt32: >>>>>>>>> {0}], function-call: asterix:create-mbr, Args:[%0->$$13, AInt32: >>>>>>>>> >>>>>>>> {2}, >>> >>>> AInt32: {1}], function-call: asterix:create-mbr, Args:[%0->$$13, >>>>>>>>> AInt32: >>>>>>>>> {2}, AInt32: {2}], function-call: asterix:create-mbr, >>>>>>>>> >>>>>>>> Args:[%0->$$13, >>> >>>> AInt32: {2}, AInt32: {3}]] >>>>>>>>> >>>>>>>>> -- ASSIGN |PARTITIONED| >>>>>>>>>> project ([$$13, $$15]) >>>>>>>>>> -- STREAM_PROJECT >>>>>>>>>> >>>>>>>>>> |PARTITIONED| >>>>>>>>> assign [$$13] <- >>>>>>>>> [function-call: asterix:field-access-by-index, Args:[%0->$$1, >>>>>>>>> >>>>>>>> AInt32: >>> >>>> {2}]] >>>>>>>> >>>>>>>> -- ASSIGN |PARTITIONED| >>>>>>>>> >>>>>>>>>> exchange >>>>>>>>>> -- >>>>>>>>>> >>>>>>>>> ONE_TO_ONE_EXCHANGE >>> >>>> |PARTITIONED| >>>>>>>>> >>>>>>>>> data-scan >>>>>>>>>> >>>>>>>>> []<-[$$15, >>> >>>> $$1] >>>>>>>>>> >>>>>>>>>> <- test:TweetMessages >>>>>>>>> >>>>>>>>> -- DATASOURCE_SCAN >>>>>>>>>> >>>>>>>>>> |PARTITIONED| >>>>>>>>> >>>>>>>>> exchange >>>>>>>>>> -- >>>>>>>>>> ONE_TO_ONE_EXCHANGE >>>>>>>>>> >>>>>>>>>> |PARTITIONED| >>>>>>>>> >>>>>>>>> empty-tuple-source >>>> >>>>> -- >>>>>>>>>> EMPTY_TUPLE_SOURCE >>>>>>>>>> >>>>>>>>>> |PARTITIONED| >>>>>>>>> >>>>>>>>> {noformat} >>>>>>>>>> The optimized plan is incorrect --- the index search doesn't use >>>>>>>>>> >>>>>>>>> the >>> >>>> right join condition and hence the result is different from >>>>>>>>> >>>>>>>> expected. >>> >>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> This message was sent by Atlassian JIRA >>>>>>>>> (v6.3.4#6332) >>>>>>>>> >>>>>>>>> >>>>>>>>> >
