If I remember correctly, we eliminated the FieldAccessNested function in favor of chained FieldAccessByName/ByIndex calls. @Steven, correct me if I am wrong.
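To make the equivalence concrete, here is a minimal Python sketch. It is not AsterixDB code: records are modeled as plain dicts, the sample values are made up, and the helper names (get_item, field_access_by_name, field_access_nested) are hypothetical stand-ins for the functions discussed in this thread. It only shows that a nested access over a path yields the same value as a chain of single-step accesses.

```python
from functools import reduce


def get_item(items, index):
    """Hypothetical stand-in for get-item: list element, or None if out of range."""
    if isinstance(items, list) and 0 <= index < len(items):
        return items[index]
    return None


def field_access_by_name(record, name):
    """Hypothetical single-step access: one named field, or None if absent."""
    if isinstance(record, dict):
        return record.get(name)
    return None


def field_access_nested(record, path):
    """Hypothetical nested access: fold single-step accesses over the path."""
    return reduce(field_access_by_name, path, record)


# Made-up sample data shaped like $geo[0].coordinates.coordinates in the query.
geo_list = [{"coordinates": {"coordinates": [1.0, 2.0]}}]
first = get_item(geo_list, 0)

# Chained form: two single-step calls, as in the optimized plan.
chained = field_access_by_name(
    field_access_by_name(first, "coordinates"), "coordinates"
)

# Nested form: one call over the whole path.
nested = field_access_nested(first, ["coordinates", "coordinates"])

assert chained == nested
```

Under these assumptions the nested form saves function calls but adds no expressive power, which matches the idea that it can be dropped in favor of chaining.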
> On Jun 24, 2017, at 18:00, Yingyi Bu <[email protected]> wrote:
>
> Hi Wail,
>
> $22 should be a harmless bug -- it's related to the ordering of rules.
> For $19: we could potentially have a rule for that.
>
> Best,
> Yingyi
>
> On Sat, Jun 24, 2017 at 5:50 PM, Wail Alkowaileet <[email protected]> wrote:
>
>> Hi Devs,
>>
>> I have few questions about the query optimizer.
>>
>> *- Given the query:*
>> use dataverse TwitterDataverse
>>
>> for $x in dataset Tweets
>> where $x.name = "trump"
>> let $geo := $x.geo
>> group by $name:=$x.name with $geo
>> return {"name": $name, "geo":$geo[0].coordinates.coordinates}
>>
>> *- Logical Plan:*
>> distribute result [$$10] -- |UNPARTITIONED|
>> project ([$$10]) -- |UNPARTITIONED|
>> assign [$$10] <- [{"name": $$name, "geo": get-item($$9, 0).getField("coordinates").getField("coordinates")}] -- |UNPARTITIONED|
>> group by ([$$name := $$x.getField("name")]) decor ([]) {
>> aggregate [$$9] <- [listify($$geo)] -- |UNPARTITIONED|
>> nested tuple source -- |UNPARTITIONED|
>> } -- |UNPARTITIONED|
>> assign [$$geo] <- [$$x.getField("geo")] -- |UNPARTITIONED|
>> select (eq($$x.getField("name"), "Alice")) -- |UNPARTITIONED|
>> unnest $$x <- dataset("Tweets") -- |UNPARTITIONED|
>> empty-tuple-source -- |UNPARTITIONED|
>>
>> *- Optimized Logical Plan:*
>> distribute result [$$10]
>> -- DISTRIBUTE_RESULT |PARTITIONED|
>> exchange
>> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
>> project ([$$10])
>> -- STREAM_PROJECT |PARTITIONED|
>> assign [$$10] <- [{"name": $$name, "geo": $$19.getField("coordinates")}]
>> -- ASSIGN |PARTITIONED|
>> project ([$$name, $$19])
>> -- STREAM_PROJECT |PARTITIONED|
>> assign [$$19, $$22] <- [get-item($$9, 0).getField("coordinates"), get-item($$9, 0)]
>> -- ASSIGN |PARTITIONED|
>> exchange
>> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
>> group by ([$$name := $$15]) decor ([]) {
>> aggregate [$$9] <- [listify($$geo)]
>> -- AGGREGATE |LOCAL|
>> nested tuple source
>> -- NESTED_TUPLE_SOURCE |LOCAL|
>> }
>> -- PRE_CLUSTERED_GROUP_BY[$$15] |PARTITIONED|
>> exchange
>> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
>> order (ASC, $$15)
>> -- STABLE_SORT [$$15(ASC)] |PARTITIONED|
>> exchange
>> -- HASH_PARTITION_EXCHANGE [$$15] |PARTITIONED|
>> select (eq($$15, "Alice"))
>> -- STREAM_SELECT |PARTITIONED|
>> project ([$$geo, $$15])
>> -- STREAM_PROJECT |PARTITIONED|
>> assign [$$geo, $$15] <- [$$x.getField("geo"), $$x.getField("name")]
>> -- ASSIGN |PARTITIONED|
>> project ([$$x])
>> -- STREAM_PROJECT |PARTITIONED|
>> exchange
>> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
>> data-scan []<-[$$16, $$x] <- TwitterDataverse.Tweets
>> -- DATASOURCE_SCAN |PARTITIONED|
>> exchange
>> -- ONE_TO_ONE_EXCHANGE |PARTITIONED|
>> empty-tuple-source
>> -- EMPTY_TUPLE_SOURCE |PARTITIONED|
>>
>> *- Questions:*
>> $$22:
>>
>> - Why the variable $22 is produced ? Although there is no use for it. Is
>> it just a harmless bug or there's some intuition I might be missing?
>>
>> $$19:
>>
>> - It seems (sometimes) getField function calls are splitted. Is there a
>> reason why is that the case? (There's another example that reproduces the
>> same behavior)
>> - That leads to my next question, I see no rule for "FieldAccessNested"
>> which can be exploited here to save few function calls. Can this function
>> interfere with other functions/access methods?
>>
>>
>> --
>>
>> *Regards,.*
>> Wail Alkowaileet
>>

Best regards,
Ildar
