lgbo-ustc opened a new issue, #6923:
URL: https://github.com/apache/incubator-gluten/issues/6923

   ### Backend
   
   CH (ClickHouse)
   
   ### Bug description
   
   The following plan gets stuck at the broadcast join:
   ```
   Execute InsertIntoHadoopFsRelationCommand (41)
   +- FakeRowAdaptor (40)
      +- AdaptiveSparkPlan (39)
         +- == Current Plan ==
            Project (21)
            +- BroadcastHashJoin LeftOuter BuildRight (20)
               :- Project (10)
               :  +- Filter (9)
               :     +- HashAggregate (8)
               :        +- ShuffleQueryStage (7)
               :           +- ColumnarExchange (6)
               :              +- ^ CHHashAggregateExecTransformer (4)
               :                 +- ^ ProjectExecTransformer (3)
               :                    +- ^ FilterExecTransformer (2)
               :                       +- ^ Scan orc algo.bigolive_tieba_meta_actions (1)
               +- BroadcastQueryStage (19)
                  +- ColumnarBroadcastExchange (18)
                     +- AQEShuffleRead (17)
                           +- ShuffleQueryStage (16), Statistics(sizeInBytes=0.0 B, rowCount=3.85E+8)
                           +- ColumnarExchange (15)
                              +- ^ ProjectExecTransformer (13)
                                 +- ^ FilterExecTransformer (12)
                                    +- ^ Scan orc bigolive.tbl_tieba_post_index_orc (11)
   ```
   
   The jstack output indicates that `CHSparkPlanExecApi#createBroadcastRelation` never finishes; the `broadcast-exchange-0` thread stays busy in `ArrayOps.flatten`:
   ```
   "broadcast-exchange-0" #1239 daemon prio=5 os_prio=0 tid=0x00007f91e04d4000 nid=0x39c74 runnable [0x00007f917bed9000]
      java.lang.Thread.State: RUNNABLE
        at scala.collection.mutable.ArrayBuilder$ofByte.ensureSize(ArrayBuilder.scala:156)
        at scala.collection.mutable.ArrayBuilder$ofByte.$plus$plus$eq(ArrayBuilder.scala:170)
        at scala.collection.mutable.ArrayBuilder$ofByte.$plus$plus$eq(ArrayBuilder.scala:132)
        at scala.collection.mutable.ArrayOps.$anonfun$flatten$2(ArrayOps.scala:97)
        at scala.collection.mutable.ArrayOps$$Lambda$4888/334145315.apply(Unknown Source)
        at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
        at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
        at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
        at scala.collection.mutable.ArrayOps.flatten(ArrayOps.scala:96)
        at scala.collection.mutable.ArrayOps.flatten$(ArrayOps.scala:93)
        at scala.collection.mutable.ArrayOps$ofRef.flatten(ArrayOps.scala:198)
        at org.apache.gluten.backendsapi.clickhouse.CHSparkPlanExecApi.createBroadcastRelation(CHSparkPlanExecApi.scala:549)
        at org.apache.spark.sql.execution.ColumnarBroadcastExchangeExec.$anonfun$relationFuture$2(ColumnarBroadcastExchangeExec.scala:77)
        at org.apache.spark.sql.execution.ColumnarBroadcastExchangeExec$$Lambda$4773/16750917.apply(Unknown Source)
        at org.apache.gluten.utils.Arm$.withResource(Arm.scala:25)
        at org.apache.gluten.metrics.GlutenTimeMetric$.millis(GlutenTimeMetric.scala:37)
        at org.apache.spark.sql.execution.ColumnarBroadcastExchangeExec.$anonfun$relationFuture$1(ColumnarBroadcastExchangeExec.scala:65)
        at org.apache.spark.sql.execution.ColumnarBroadcastExchangeExec$$Lambda$4771/1427892145.apply(Unknown Source)
        at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withThreadLocalCaptured$1(SQLExecution.scala:191)
        at org.apache.spark.sql.execution.SQLExecution$$$Lambda$4772/2090377899.call(Unknown Source)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
   ```
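
   The hot frames are `ArrayBuilder$ofByte.ensureSize` under `ArrayOps.flatten`, which suggests the time goes into repeatedly growing and copying one large byte array. A minimal sketch of an alternative, not the Gluten implementation: sum the lengths as a `Long`, fail fast if the result cannot fit into a single JVM array, and copy each batch exactly once into a pre-sized buffer. The names `ConcatBatches` and `concat` are hypothetical.

   ```scala
   object ConcatBatches {
     /** Concatenate byte batches into one pre-sized array.
       * Throws IllegalArgumentException when the total length cannot be
       * addressed by a single JVM array (indices are Int). The real VM
       * limit can be a few elements below Int.MaxValue.
       */
     def concat(batches: Array[Array[Byte]]): Array[Byte] = {
       // Accumulate as Long so the sum itself cannot overflow Int.
       val total: Long = batches.foldLeft(0L)(_ + _.length)
       require(total <= Int.MaxValue, s"total size $total bytes exceeds a single array")

       val out = new Array[Byte](total.toInt)
       var offset = 0
       batches.foreach { b =>
         // One bulk copy per batch, no intermediate growth of the buffer.
         System.arraycopy(b, 0, out, offset, b.length)
         offset += b.length
       }
       out
     }
   }
   ```

   With inputs as large as the ones in this issue, the `require` would reject the data up front instead of leaving the thread spinning.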
   
   ```scala
       val countsAndBytes =
         CHExecUtil.buildSideRDD(dataSize, newChild).collect
   
       val batches = countsAndBytes.map(_._2)
   ```
   Logging the length of every batch in `batches` above, and then each batch as it is appended, gives the following output:
   ```
   24/08/19 17:30:36.497 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batches 65
   24/08/19 17:30:36.498 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57210043
   24/08/19 17:30:36.498 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57215399
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57189072
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57187601
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57248053
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57190351
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57201526
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57177384
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57178695
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57216690
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57134557
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57209132
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57170909
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57230892
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57182134
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57203839
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57208612
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57185228
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 55939583
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57229297
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 55963855
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 54730324
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 54787825
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 54733666
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 54734577
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 54754959
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 54735946
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 55988446
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57205039
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 54765405
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 54724117
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 54752422
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 55978202
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 56011322
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 54780992
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 54776088
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 55949531
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 54739639
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 54770459
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 54744992
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 55944937
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 56018268
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57224039
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 54770620
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 54780548
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 54757812
   24/08/19 17:30:36.499 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 54778430
   24/08/19 17:30:36.500 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 55957218
   24/08/19 17:30:36.500 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 55963794
   24/08/19 17:30:36.500 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 55907256
   24/08/19 17:30:36.500 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 54761124
   24/08/19 17:30:36.500 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 56019156
   24/08/19 17:30:36.500 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57210387
   24/08/19 17:30:36.500 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57173746
   24/08/19 17:30:36.500 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 56014911
   24/08/19 17:30:36.500 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57242276
   24/08/19 17:30:36.500 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 55996245
   24/08/19 17:30:36.500 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57254362
   24/08/19 17:30:36.500 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57234617
   24/08/19 17:30:36.500 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 56010776
   24/08/19 17:30:36.500 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57194361
   24/08/19 17:30:36.500 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57191818
   24/08/19 17:30:36.500 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 56010023
   24/08/19 17:30:36.500 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 56009939
   24/08/19 17:30:36.500 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx batch len 57167904
   24/08/19 17:30:36.501 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx add batch 57210043
   24/08/19 17:30:37.480 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx add batch 57215399
   24/08/19 17:30:38.567 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx add batch 57189072
   24/08/19 17:30:39.674 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx add batch 57187601
   24/08/19 17:30:40.564 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx add batch 57248053
   24/08/19 17:30:41.918 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx add batch 57190351
   24/08/19 17:30:42.803 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx add batch 57201526
   24/08/19 17:30:43.699 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx add batch 57177384
   24/08/19 17:30:44.590 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx add batch 57178695
   24/08/19 17:30:45.749 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx add batch 57216690
   24/08/19 17:30:47.684 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx add batch 57134557
   24/08/19 17:30:49.329 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx add batch 57209132
   24/08/19 17:30:51.023 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx add batch 57170909
   24/08/19 17:30:52.667 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx add batch 57230892
   24/08/19 17:30:54.341 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx add batch 57182134
   24/08/19 17:30:55.991 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx add batch 57203839
   24/08/19 17:30:57.649 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx add batch 57208612
   24/08/19 17:30:59.333 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx add batch 57185228
   24/08/19 17:31:01.043 ERROR [broadcast-exchange-0] CHSparkPlanExecApi: xxx add batch 55939583
   ```
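
   A rough check of the sizes logged above, as a sketch: the smallest of the 65 batches is 54724117 bytes, so the flattened result needs at least 65 × 54724117 bytes, which is already more than a single `Array[Byte]` can address (`Int.MaxValue` elements). The names `BroadcastSizeCheck`, `totalBytes` and `fitsInSingleArray` are illustrative only.

   ```scala
   object BroadcastSizeCheck {
     // Upper bound on JVM array length; array indices are Int.
     val MaxArrayLength: Long = Int.MaxValue.toLong // 2147483647

     /** Sum batch lengths as Long so the total cannot overflow Int. */
     def totalBytes(batchLengths: Seq[Int]): Long =
       batchLengths.foldLeft(0L)(_ + _)

     /** True when all batches could be flattened into one byte array. */
     def fitsInSingleArray(batchLengths: Seq[Int]): Boolean =
       totalBytes(batchLengths) <= MaxArrayLength
   }

   // Lower bound for this issue: 65 batches, each at least 54724117 bytes.
   val lowerBound = Seq.fill(65)(54724117)
   BroadcastSizeCheck.totalBytes(lowerBound)        // 3557067605
   BroadcastSizeCheck.fitsInSingleArray(lowerBound) // false
   ```

   So even the lower bound is about 3.5 GB, and no single flattened byte array can hold the collected batches.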
   
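   `spark.sql.autoBroadcastJoinThreshold` does not seem to take effect here. The plan reports `sizeInBytes=0.0 B` for ShuffleQueryStage (16) despite `rowCount=3.85E+8`, which may be why the join was still planned as a broadcast.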
   
   ### Spark version
   
   None
   
   ### Spark configurations
   
   _No response_
   
   ### System information
   
   _No response_
   
   ### Relevant logs
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
