Bikramjeet Vig has posted comments on this change.

Change subject: IMPALA-5788: Fix agg node crash when grouping by 
nondeterministic exprs
......................................................................


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7714/3/be/src/exec/partitioned-aggregation-node.cc
File be/src/exec/partitioned-aggregation-node.cc:

Line 1156: 
> nit: to get more code on the screen, i think we can do without blank lines 
will do in the next patch set.
Thanks


Line 1160:   // partition index.
> in the case where we do repartition (and prior to IMPALA-2708), what happen
Yes, in case of repartition, the rows do get shuffled around in each iteration. 
I think the non-deterministic nature of the group by exprs justify this 
behavior. Although it makes it harder to be specific about what the meaningful 
way is.

I discussed this with Tim, and he pointed out that it might become prohibitive 
to figure out if the query is non-deterministic, specially in case a 
non-deterministic UDF is used as a group by expr. We can also put a check to 
identify if the rows hash to a different partition on each iteration, but that 
would add code to the hot path and doesn't seem worth it.

For the last part of your question, I am not aware of any queries that do this, 
but this special case might very well exist, if not intentionally, in someone's 
workload. This way we ensure it continues to behave the same.

- Request Tim to pitch in if I missed something. Thanks


http://gerrit.cloudera.org:8080/#/c/7714/3/be/src/exec/partitioned-aggregation-node.h
File be/src/exec/partitioned-aggregation-node.h:

PS3, Line 338: all pointers in this vector will point to a single
             :   /// in-memory partition
> outside of the context of this bug fix, it's hard to see why things would b
You are absolutely right. This is very specific to our bug fix. would it make 
sense if I add the Jira ID in the comment? This would be able to provide more 
context in case some one trying to understand this part needs it.


-- 
To view, visit http://gerrit.cloudera.org:8080/7714
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibdb09239577b3f0a19d710b0d148e882b0b73e23
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to