zhztheplayer opened a new issue, #5362: URL: https://github.com/apache/incubator-gluten/issues/5362
### Backend

VL (Velox)

### Bug description

Starts from https://github.com/apache/incubator-gluten/pull/5360.

The error log:

```
- TPC-H q2 *** FAILED ***
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 38.0 failed 1 times, most recent failure: Lost task 0.0 in stage 38.0 (TID 1881) (6497b7236305 executor driver): org.apache.gluten.exception.GlutenException: java.lang.RuntimeException: Exception: VeloxRuntimeError
Error Source: RUNTIME
Error Code: INVALID_STATE
Reason: (1 vs. 0)
Retriable: False
Expression: partitionKey.size() == partitionKeysHandle.size()
Context: Split [Hive: file:/__w/incubator-gluten/incubator-gluten/gluten-iceberg/target/scala-2.12/test-classes/tpch-data-iceberg-velox/default/part/data/p_brand=Brand%2313/00000-2-eba45061-8a65-4cf0-b705-e731e6e86a9a-00004.parquet 4 - 3136] Task Gluten_Stage_38_TID_1881
Top-Level Context: Same as context.
Function: testFilters
File: /__w/incubator-gluten/incubator-gluten/ep/build-velox/build/velox_ep/velox/connectors/hive/HiveConnectorUtil.cpp
Line: 579
Stack trace:
# 0  _ZN8facebook5velox7process10StackTraceC1Ei
# 1  _ZN8facebook5velox14VeloxExceptionC1EPKcmS3_St17basic_string_viewIcSt11char_traitsIcEES7_S7_S7_bNS1_4TypeES7_
# 2  _ZN8facebook5velox6detail14veloxCheckFailINS0_17VeloxRuntimeErrorERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEvRKNS1_18VeloxCheckFailArgsET0_
# 3  _ZN8facebook5velox9connector4hive11testFiltersEPKNS0_6common8ScanSpecEPKNS0_4dwio6common6ReaderERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKSt13unordered_mapISH_St8optionalISH_ESt4hashISH_ESt8equal_toISH_ESaISt4pairISI_SM_EEERKSK_ISH_St10shared_ptrINS2_16HiveColumnHandleEESO_SQ_SaISR_ISI_SZ_EEE
# 4  _ZN8facebook5velox9connector4hive11SplitReader19checkIfSplitIsEmptyERNS0_4dwio6common17RuntimeStatisticsE
# 5  _ZN8facebook5velox9connector4hive7iceberg18IcebergSplitReader12prepareSplitESt10shared_ptrINS0_6common14MetadataFilterEERNS0_4dwio6common17RuntimeStatisticsE
24/04/11 04:56:12 ERROR TaskResources: Task 1887 failed by error:
# 6  _ZN8facebook5velox9connector4hive14HiveDataSource8addSplitESt10shared_ptrINS1_14ConnectorSplitEE
# 7  _ZN8facebook5velox4exec9TableScan9getOutputEv
# 8  _ZN8facebook5velox4exec6Driver11runInternalERSt10shared_ptrIS2_ERS3_INS1_13BlockingStateEERS3_INS0_9RowVectorEE
# 9  _ZN8facebook5velox4exec6Driver4nextERSt10shared_ptrINS1_13BlockingStateEE
# 10 _ZN8facebook5velox4exec4Task4nextEPN5folly10SemiFutureINS3_4UnitEEE
# 11 _ZN6gluten24WholeStageResultIterator4nextEv
# 12 Java_org_apache_gluten_vectorized_ColumnarBatchOutIterator_nativeHasNext
# 13 0x00007f1295af42e8

        at org.apache.gluten.vectorized.GeneralOutIterator.hasNext(GeneralOutIterator.java:39)
        at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:45)
        at org.apache.gluten.utils.InvocationFlowProtection.hasNext(Iterators.scala:135)
        at org.apache.gluten.utils.IteratorCompleter.hasNext(Iterators.scala:69)
org.apache.spark.TaskKilledException
        at org.apache.spark.TaskContextImpl.killTaskIfInterrupted(TaskContextImpl.scala:216)
        at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:36)
        at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
        at org.apache.spark.shuffle.ColumnarShuffleWriter.internalWrite(ColumnarShuffleWriter.scala:117)
        at org.apache.spark.shuffle.ColumnarShuffleWriter.write(ColumnarShuffleWriter.scala:235)
        at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
        at org.apache.spark.scheduler.Task.run(Task.scala:131)
        at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1491)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)
W20240411 04:56:12.305471 311047 WholeStageResultIterator.cc:374] Not found node id: 0
        at org.apache.gluten.utils.PayloadCloser.hasNext(Iterators.scala:35)
        at org.apache.gluten.utils.PipelineTimeAccumulator.hasNext(Iterators.scala:98)
        at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
        at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
        at org.apache.spark.shuffle.ColumnarShuffleWriter.internalWrite(ColumnarShuffleWriter.scala:117)
        at org.apache.spark.shuffle.ColumnarShuffleWriter.write(ColumnarShuffleWriter.scala:235)
        at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
        at org.apache.spark.scheduler.Task.run(Task.scala:131)
        at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1491)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.RuntimeException: Exception: VeloxRuntimeError
```

### Spark version

None

### Spark configurations

_No response_

### System information

_No response_

### Relevant logs

_No response_
