Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5788: Fix agg node crash when grouping by nondeterministic exprs
Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7714/3/be/src/exec/partitioned-aggregation-node.cc
File be/src/exec/partitioned-aggregation-node.cc:

Line 1160: // partition index.
> Yes, in case of repartition, the rows do get shuffled around in each iterat
Repartitioning should work OK, in that each repartitioning step will reduce the size of the partitions: each row ends up in an arbitrary partition. The end result is pretty much arbitrary, though.

I thought this was a reasonable solution. I would be OK with a solution that failed queries with nondeterministic grouping functions, but I couldn't think of a simpler solution that met the following two constraints:

* Guarantees that a nondeterministic UDF or builtin can't crash Impala.
* Doesn't impose overhead on the "fast path".

Runtime checks for whether each row mapped to the right partition would impose some runtime overhead. We could avoid that by codegen'ing a different version of the probe function for the case when we're processing a single spilled partition, but then that would add codegen overhead.

--
To view, visit http://gerrit.cloudera.org:8080/7714
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibdb09239577b3f0a19d710b0d148e882b0b73e23
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes
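
For readers following the thread, here is a toy sketch of the per-row runtime check discussed in the comment above. It is not Impala code: GroupingExpr, PartitionOf, Partition, CountMisroutedRows and TotalMisrouted are hypothetical names invented for this illustration, and the "hash" is a plain modulo. It only shows why re-evaluating a nondeterministic grouping expression can send a row to a different partition than the one it was originally placed in, and what a check for that would have to do on every row.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for a grouping expression; not Impala's expr API.
using GroupingExpr = std::function<int64_t(int64_t row)>;

constexpr size_t kNumPartitions = 4;

// Toy "hash": maps a grouping key to a partition index.
size_t PartitionOf(int64_t key) {
  return static_cast<size_t>(static_cast<uint64_t>(key) % kNumPartitions);
}

// First pass: evaluate the expression once per row and bucket the row.
std::vector<std::vector<int64_t>> Partition(const std::vector<int64_t>& rows,
                                            const GroupingExpr& expr) {
  std::vector<std::vector<int64_t>> parts(kNumPartitions);
  for (int64_t row : rows) parts[PartitionOf(expr(row))].push_back(row);
  return parts;
}

// Second pass over one partition: re-evaluate the expression and count the
// rows that no longer map to the partition they were read from. This is the
// extra per-row work a runtime check would add.
size_t CountMisroutedRows(const std::vector<int64_t>& partition_rows,
                          size_t expected_idx, const GroupingExpr& expr) {
  size_t misrouted = 0;
  for (int64_t row : partition_rows) {
    if (PartitionOf(expr(row)) != expected_idx) ++misrouted;
  }
  return misrouted;
}

// Partitions the rows, then re-checks every partition.
size_t TotalMisrouted(const std::vector<int64_t>& rows,
                      const GroupingExpr& expr) {
  const auto parts = Partition(rows, expr);
  size_t total = 0;
  for (size_t i = 0; i < parts.size(); ++i) {
    total += CountMisroutedRows(parts[i], i, expr);
  }
  return total;
}
```

With a deterministic expression such as the identity, TotalMisrouted returns 0 for any input. With an expression that carries hidden state (for example, a lambda that adds an incrementing counter to the row value), the second evaluation can disagree with the first, so some rows fail the check. Code that assumes the two evaluations always agree can then break on such rows.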
