[
https://issues.apache.org/jira/browse/SPARK-20184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976897#comment-15976897
]
Kazuaki Ishizaki commented on SPARK-20184:
------------------------------------------
The root cause is overhead in the Java code generated by {{HashAggregateExec}}.
When I changed {{AggUtils.scala}} to always use {{SortAggregateExec}}, the elapsed
time dropped to the same as without whole-stage codegen.
A quick workaround is to disable {{HashAggregateExec}} when the number of
aggregation operations exceeds a pre-defined threshold. Is this an appropriate
way? Or are there better approaches?
cc: [~liancheng]
{code}
/* 217 */ private void agg_doAggregateWithKeys() throws java.io.IOException {
/* 218 */ agg_hashMap = agg_plan.createHashMap();
/* 219 */
/* 220 */ while (inputadapter_input.hasNext() && !stopEarly()) {
/* 221 */ InternalRow inputadapter_row = (InternalRow) inputadapter_input.next();
/* 222 */ long inputadapter_value = inputadapter_row.getLong(0);
/* 223 */ long inputadapter_value1 = inputadapter_row.getLong(1);
/* 224 */ boolean inputadapter_isNull2 = inputadapter_row.isNullAt(2);
/* 225 */ long inputadapter_value2 = inputadapter_isNull2 ? -1L : (inputadapter_row.getLong(2));
/* 226 */ boolean inputadapter_isNull3 = inputadapter_row.isNullAt(3);
/* 227 */ long inputadapter_value3 = inputadapter_isNull3 ? -1L : (inputadapter_row.getLong(3));
/* 228 */ boolean inputadapter_isNull4 = inputadapter_row.isNullAt(4);
...
/* 422 */ if (agg_fastAggBuffer != null) {
/* 423 */ // update fast row
/* 424 */
/* 425 */ } else {
/* 426 */ // update unsafe row
/* 427 */
/* 428 */ // common sub-expressions
/* 429 */
/* 430 */ // evaluate aggregate function
/* 431 */ boolean agg_isNull168 = true;
/* 432 */ long agg_value248 = -1L;
/* 433 */
/* 434 */ boolean agg_isNull170 = agg_unsafeRowAggBuffer.isNullAt(0);
/* 435 */ long agg_value250 = agg_isNull170 ? -1L : (agg_unsafeRowAggBuffer.getLong(0));
/* 436 */ boolean agg_isNull169 = agg_isNull170;
/* 437 */ long agg_value249 = agg_value250;
/* 438 */ if (agg_isNull169) {
/* 439 */ boolean agg_isNull171 = false;
/* 440 */ long agg_value251 = -1L;
/* 441 */ if (!false) {
/* 442 */ agg_value251 = (long) 0;
/* 443 */ }
/* 444 */ if (!agg_isNull171) {
/* 445 */ agg_isNull169 = false;
/* 446 */ agg_value249 = agg_value251;
/* 447 */ }
/* 448 */ }
/* 449 */
/* 450 */ if (!inputadapter_isNull2) {
/* 451 */ agg_isNull168 = false; // resultCode could change nullability.
/* 452 */ agg_value248 = agg_value249 + inputadapter_value2;
/* 453 */
/* 454 */ }
/* 455 */ boolean agg_isNull167 = agg_isNull168;
/* 456 */ long agg_value247 = agg_value248;
/* 457 */ if (agg_isNull167) {
/* 458 */ boolean agg_isNull174 = agg_unsafeRowAggBuffer.isNullAt(0);
/* 459 */ long agg_value254 = agg_isNull174 ? -1L : (agg_unsafeRowAggBuffer.getLong(0));
/* 460 */ if (!agg_isNull174) {
/* 461 */ agg_isNull167 = false;
/* 462 */ agg_value247 = agg_value254;
/* 463 */ }
/* 464 */ }
/* 465 */ boolean agg_isNull176 = true;
/* 466 */ long agg_value256 = -1L;
...
/* 3136 */ if (!inputadapter_isNull81) {
/* 3137 */ agg_isNull800 = false; // resultCode could change nullability.
/* 3138 */ agg_value880 = agg_value881 + inputadapter_value81;
/* 3139 */
/* 3140 */ }
/* 3141 */ boolean agg_isNull799 = agg_isNull800;
/* 3142 */ long agg_value879 = agg_value880;
/* 3143 */ if (agg_isNull799) {
/* 3144 */ boolean agg_isNull806 = agg_unsafeRowAggBuffer.isNullAt(79);
/* 3145 */ long agg_value886 = agg_isNull806 ? -1L : (agg_unsafeRowAggBuffer.getLong(79));
/* 3146 */ if (!agg_isNull806) {
/* 3147 */ agg_isNull799 = false;
/* 3148 */ agg_value879 = agg_value886;
/* 3149 */ }
/* 3150 */ }
/* 3151 */ // update unsafe row buffer
/* 3152 */ if (!agg_isNull167) {
/* 3153 */ agg_unsafeRowAggBuffer.setLong(0, agg_value247);
/* 3154 */ } else {
/* 3155 */ agg_unsafeRowAggBuffer.setNullAt(0);
/* 3156 */ }
...
/* 3626 */ if (!agg_isNull799) {
/* 3627 */ agg_unsafeRowAggBuffer.setLong(79, agg_value879);
/* 3628 */ } else {
/* 3629 */ agg_unsafeRowAggBuffer.setNullAt(79);
/* 3630 */ }
/* 3631 */
/* 3632 */ }
/* 3633 */ if (shouldStop()) return;
/* 3634 */ }
/* 3635 */
/* 3636 */ agg_mapIter = agg_plan.finishAggregate(agg_hashMap, agg_sorter, agg_peakMemory, agg_spillSize);
/* 3637 */ }
{code}
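A minimal sketch of the threshold-based fallback proposed above (the class name, method, and threshold value here are illustrative only, not actual Spark API; the real change would live in the Scala planner code in {{AggUtils.scala}}):

```java
public class AggPlanChooser {
    // Hypothetical cutoff; Spark has no such constant today. The idea is
    // that beyond this many aggregate expressions, the generated
    // HashAggregateExec method grows too large to JIT-compile well.
    static final int MAX_HASH_AGG_EXPRS = 50;

    // Pick the sort-based aggregate once the expression count exceeds
    // the threshold, otherwise keep the (normally faster) hash aggregate.
    static String choosePlan(int numAggregateExpressions) {
        return numAggregateExpressions > MAX_HASH_AGG_EXPRS
            ? "SortAggregateExec"
            : "HashAggregateExec";
    }

    public static void main(String[] args) {
        System.out.println(choosePlan(20)); // few sums: hash aggregate
        System.out.println(choosePlan(80)); // many sums: sort aggregate
    }
}
```

A fixed cutoff is only a stopgap; a real fix would also need to weigh the cost of sorting against the codegen overhead.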
> performance regression for complex/long sql when enable whole stage codegen
> ---------------------------------------------------------------------------
>
> Key: SPARK-20184
> URL: https://issues.apache.org/jira/browse/SPARK-20184
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 1.6.0, 2.1.0
> Reporter: Fei Wang
>
> The performance of the following SQL gets much worse in Spark 2.x with
> whole-stage codegen on, compared with codegen off.
> SELECT
> sum(COUNTER_57)
> ,sum(COUNTER_71)
> ,sum(COUNTER_3)
> ,sum(COUNTER_70)
> ,sum(COUNTER_66)
> ,sum(COUNTER_75)
> ,sum(COUNTER_69)
> ,sum(COUNTER_55)
> ,sum(COUNTER_63)
> ,sum(COUNTER_68)
> ,sum(COUNTER_56)
> ,sum(COUNTER_37)
> ,sum(COUNTER_51)
> ,sum(COUNTER_42)
> ,sum(COUNTER_43)
> ,sum(COUNTER_1)
> ,sum(COUNTER_76)
> ,sum(COUNTER_54)
> ,sum(COUNTER_44)
> ,sum(COUNTER_46)
> ,DIM_1
> ,DIM_2
> ,DIM_3
> FROM aggtable group by DIM_1, DIM_2, DIM_3 limit 100;
> aggtable has about 35,000,000 rows.
> whole-stage codegen on (spark.sql.codegen.wholeStage = true): 40s
> whole-stage codegen off (spark.sql.codegen.wholeStage = false): 6s
> After some analysis I think this is related to the huge Java method (a
> method of thousands of lines) generated by codegen.
> And if I set -XX:-DontCompileHugeMethods, performance gets much
> better (about 7s).
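For anyone wanting to try the reporter's JVM flag without rebuilding Spark, it can be passed through Spark's standard extra-Java-options settings in spark-defaults.conf (these two configuration keys are real Spark settings; whether the flag actually helps is workload-dependent, per the reporter's measurement above):

{code}
spark.driver.extraJavaOptions    -XX:-DontCompileHugeMethods
spark.executor.extraJavaOptions  -XX:-DontCompileHugeMethods
{code}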
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)