[
https://issues.apache.org/jira/browse/SPARK-20184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976897#comment-15976897
]
Kazuaki Ishizaki commented on SPARK-20184:
------------------------------------------
The root cause is overhead in the Java code generated by {{HashAggregateExec}}.
When I changed {{AggUtils.scala}} to always use {{SortAggregateExec}}, the elapsed
time dropped to the same as without whole-stage codegen.
A quick workaround is to disable {{HashAggregateExec}} when the number of
aggregation operations exceeds a pre-defined threshold. Is this an appropriate
way? Or are there better approaches?
cc: [~liancheng]
{code}
/* 217 */ private void agg_doAggregateWithKeys() throws java.io.IOException {
/* 218 */ agg_hashMap = agg_plan.createHashMap();
/* 219 */
/* 220 */ while (inputadapter_input.hasNext() && !stopEarly()) {
/* 221 */ InternalRow inputadapter_row = (InternalRow) inputadapter_input.next();
/* 222 */ long inputadapter_value = inputadapter_row.getLong(0);
/* 223 */ long inputadapter_value1 = inputadapter_row.getLong(1);
/* 224 */ boolean inputadapter_isNull2 = inputadapter_row.isNullAt(2);
/* 225 */ long inputadapter_value2 = inputadapter_isNull2 ? -1L : (inputadapter_row.getLong(2));
/* 226 */ boolean inputadapter_isNull3 = inputadapter_row.isNullAt(3);
/* 227 */ long inputadapter_value3 = inputadapter_isNull3 ? -1L : (inputadapter_row.getLong(3));
/* 228 */ boolean inputadapter_isNull4 = inputadapter_row.isNullAt(4);
...
/* 422 */ if (agg_fastAggBuffer != null) {
/* 423 */ // update fast row
/* 424 */
/* 425 */ } else {
/* 426 */ // update unsafe row
/* 427 */
/* 428 */ // common sub-expressions
/* 429 */
/* 430 */ // evaluate aggregate function
/* 431 */ boolean agg_isNull168 = true;
/* 432 */ long agg_value248 = -1L;
/* 433 */
/* 434 */ boolean agg_isNull170 = agg_unsafeRowAggBuffer.isNullAt(0);
/* 435 */ long agg_value250 = agg_isNull170 ? -1L : (agg_unsafeRowAggBuffer.getLong(0));
/* 436 */ boolean agg_isNull169 = agg_isNull170;
/* 437 */ long agg_value249 = agg_value250;
/* 438 */ if (agg_isNull169) {
/* 439 */ boolean agg_isNull171 = false;
/* 440 */ long agg_value251 = -1L;
/* 441 */ if (!false) {
/* 442 */ agg_value251 = (long) 0;
/* 443 */ }
/* 444 */ if (!agg_isNull171) {
/* 445 */ agg_isNull169 = false;
/* 446 */ agg_value249 = agg_value251;
/* 447 */ }
/* 448 */ }
/* 449 */
/* 450 */ if (!inputadapter_isNull2) {
/* 451 */ agg_isNull168 = false; // resultCode could change nullability.
/* 452 */ agg_value248 = agg_value249 + inputadapter_value2;
/* 453 */
/* 454 */ }
/* 455 */ boolean agg_isNull167 = agg_isNull168;
/* 456 */ long agg_value247 = agg_value248;
/* 457 */ if (agg_isNull167) {
/* 458 */ boolean agg_isNull174 = agg_unsafeRowAggBuffer.isNullAt(0);
/* 459 */ long agg_value254 = agg_isNull174 ? -1L : (agg_unsafeRowAggBuffer.getLong(0));
/* 460 */ if (!agg_isNull174) {
/* 461 */ agg_isNull167 = false;
/* 462 */ agg_value247 = agg_value254;
/* 463 */ }
/* 464 */ }
/* 465 */ boolean agg_isNull176 = true;
/* 466 */ long agg_value256 = -1L;
...
/* 3136 */ if (!inputadapter_isNull81) {
/* 3137 */ agg_isNull800 = false; // resultCode could change nullability.
/* 3138 */ agg_value880 = agg_value881 + inputadapter_value81;
/* 3139 */
/* 3140 */ }
/* 3141 */ boolean agg_isNull799 = agg_isNull800;
/* 3142 */ long agg_value879 = agg_value880;
/* 3143 */ if (agg_isNull799) {
/* 3144 */ boolean agg_isNull806 = agg_unsafeRowAggBuffer.isNullAt(79);
/* 3145 */ long agg_value886 = agg_isNull806 ? -1L : (agg_unsafeRowAggBuffer.getLong(79));
/* 3146 */ if (!agg_isNull806) {
/* 3147 */ agg_isNull799 = false;
/* 3148 */ agg_value879 = agg_value886;
/* 3149 */ }
/* 3150 */ }
/* 3151 */ // update unsafe row buffer
/* 3152 */ if (!agg_isNull167) {
/* 3153 */ agg_unsafeRowAggBuffer.setLong(0, agg_value247);
/* 3154 */ } else {
/* 3155 */ agg_unsafeRowAggBuffer.setNullAt(0);
/* 3156 */ }
...
/* 3626 */ if (!agg_isNull799) {
/* 3627 */ agg_unsafeRowAggBuffer.setLong(79, agg_value879);
/* 3628 */ } else {
/* 3629 */ agg_unsafeRowAggBuffer.setNullAt(79);
/* 3630 */ }
/* 3631 */
/* 3632 */ }
/* 3633 */ if (shouldStop()) return;
/* 3634 */ }
/* 3635 */
/* 3636 */ agg_mapIter = agg_plan.finishAggregate(agg_hashMap, agg_sorter, agg_peakMemory, agg_spillSize);
/* 3637 */ }
{code}
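A minimal sketch of the threshold-based fallback proposed above (the class name, method, and threshold value here are illustrative only, not actual Spark API; the real change would live in the Scala planner code in {{AggUtils.scala}}):

```java
public class AggPlanChooser {
    // Hypothetical cutoff; Spark has no such constant today. The idea is
    // that beyond this many aggregate expressions, the generated
    // HashAggregateExec method grows too large to JIT-compile well.
    static final int MAX_HASH_AGG_EXPRS = 50;

    // Pick the sort-based aggregate once the expression count exceeds
    // the threshold, otherwise keep the (normally faster) hash aggregate.
    static String choosePlan(int numAggregateExpressions) {
        return numAggregateExpressions > MAX_HASH_AGG_EXPRS
            ? "SortAggregateExec"
            : "HashAggregateExec";
    }

    public static void main(String[] args) {
        System.out.println(choosePlan(20)); // few sums: hash aggregate
        System.out.println(choosePlan(80)); // many sums: sort aggregate
    }
}
```

A fixed cutoff is only a stopgap; a real fix would also need to weigh the cost of sorting against the codegen overhead.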
> performance regression for complex/long sql when enable whole stage codegen
> ---------------------------------------------------------------------------
>
> Key: SPARK-20184
> URL: https://issues.apache.org/jira/browse/SPARK-20184
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 1.6.0, 2.1.0
> Reporter: Fei Wang
>
> The performance of the following SQL gets much worse in Spark 2.x with
> whole-stage codegen on, compared with codegen off.
> SELECT
> sum(COUNTER_57)
> ,sum(COUNTER_71)
> ,sum(COUNTER_3)
> ,sum(COUNTER_70)
> ,sum(COUNTER_66)
> ,sum(COUNTER_75)
> ,sum(COUNTER_69)
> ,sum(COUNTER_55)
> ,sum(COUNTER_63)
> ,sum(COUNTER_68)
> ,sum(COUNTER_56)
> ,sum(COUNTER_37)
> ,sum(COUNTER_51)
> ,sum(COUNTER_42)
> ,sum(COUNTER_43)
> ,sum(COUNTER_1)
> ,sum(COUNTER_76)
> ,sum(COUNTER_54)
> ,sum(COUNTER_44)
> ,sum(COUNTER_46)
> ,DIM_1
> ,DIM_2
> ,DIM_3
> FROM aggtable group by DIM_1, DIM_2, DIM_3 limit 100;
> aggtable has about 35,000,000 rows.
> whole-stage codegen on (spark.sql.codegen.wholeStage = true): 40s
> whole-stage codegen off (spark.sql.codegen.wholeStage = false): 6s
> After some analysis I think this is related to the huge Java method (a
> method of thousands of lines) generated by codegen.
> And if I set -XX:-DontCompileHugeMethods, performance gets much
> better (about 7s).
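For anyone wanting to try the reporter's JVM flag without rebuilding Spark, it can be passed through Spark's standard extra-Java-options settings in spark-defaults.conf (these two configuration keys are real Spark settings; whether the flag actually helps is workload-dependent, per the reporter's measurement above):

{code}
spark.driver.extraJavaOptions    -XX:-DontCompileHugeMethods
spark.executor.extraJavaOptions  -XX:-DontCompileHugeMethods
{code}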
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)