HeartSaVioR opened a new pull request #25830: [SPARK-29140][SQL] Handle BinaryType of parameter properly in HashAggregateExec
URL: https://github.com/apache/spark/pull/25830

### What changes were proposed in this pull request?

This patch fixes an issue introduced by [SPARK-21870](http://issues.apache.org/jira/browse/SPARK-21870): when generating code for a parameter type, Spark doesn't account for how the JVM names array types. There is at least one such case: Spark should emit `byte[]` for BinaryType, but it emits `[B` instead, and the generated code fails to compile.

Below is the generated code that failed compilation (line 380):

```
/* 380 */   private void agg_doAggregate_count_0([B agg_expr_1_1, boolean agg_exprIsNull_1_1, org.apache.spark.sql.catalyst.InternalRow agg_unsafeRowAggBuffer_1) throws java.io.IOException {
/* 381 */     // evaluate aggregate function for count
/* 382 */     boolean agg_isNull_26 = false;
/* 383 */     long agg_value_28 = -1L;
/* 384 */     if (!false && agg_exprIsNull_1_1) {
/* 385 */       long agg_value_31 = agg_unsafeRowAggBuffer_1.getLong(1);
/* 386 */       agg_isNull_26 = false;
/* 387 */       agg_value_28 = agg_value_31;
/* 388 */     } else {
/* 389 */       long agg_value_33 = agg_unsafeRowAggBuffer_1.getLong(1);
/* 390 */
/* 391 */       long agg_value_32 = -1L;
/* 392 */
/* 393 */       agg_value_32 = agg_value_33 + 1L;
/* 394 */       agg_isNull_26 = false;
/* 395 */       agg_value_28 = agg_value_32;
/* 396 */     }
/* 397 */     // update unsafe row buffer
/* 398 */     agg_unsafeRowAggBuffer_1.setLong(1, agg_value_28);
/* 399 */   }
```

No existing test for HashAggregateExec specifically covered this case, but the randomized test in ObjectHashAggregateSuite could hit it, which is why ObjectHashAggregateSuite is flaky.

### Why are the changes needed?

Without the fix, code generated by HashAggregateExec may fail to compile.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Added a new UT.
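
For readers unfamiliar with the `[B` versus `byte[]` distinction in the first section, the following standalone Java sketch shows where such a name can come from. It is only an illustration of JVM naming, not the change made in this PR: `ParamTypeNames` and `sourceTypeName` are hypothetical names, and the PR description does not say which reflection call Spark's code generator uses.

```java
// Illustration only: ParamTypeNames and sourceTypeName are hypothetical,
// not part of Spark.
final class ParamTypeNames {

    // Returns a type name that is valid in Java source for the given class.
    // Class.getName() gives the JVM-internal form for arrays (e.g. "[B"),
    // which cannot be used as a parameter type in source code.
    // Class.getCanonicalName() gives the source form (e.g. "byte[]"), but
    // returns null for local and anonymous classes, so fall back to getName().
    static String sourceTypeName(Class<?> cls) {
        String canonical = cls.getCanonicalName();
        return canonical != null ? canonical : cls.getName();
    }

    public static void main(String[] args) {
        // JVM-internal name of a byte array: not valid as a source-level type.
        System.out.println(byte[].class.getName());        // [B
        // Source-level name of the same type.
        System.out.println(sourceTypeName(byte[].class));  // byte[]
        // Non-array types have the same name in both forms.
        System.out.println(sourceTypeName(long.class));    // long
    }
}
```

A method signature built from the first form reproduces the kind of declaration seen on line 380, which a Java compiler rejects; the second form compiles.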
