ayushi-agarwal commented on issue #7656: URL: https://github.com/apache/incubator-gluten/issues/7656#issuecomment-2546290228
The query uses the Velox writer to write Parquet data, and we are hitting the error below.

Executor configs (1 executor per node):
memoryOverhead: [amount: 1024]
cores: [amount: 16]
memory: [amount: 13312]
offHeap: [amount: 80896]

This is what the plan looks like for the stage. Many scan, filter, and project operators fall back to Spark because from_json is not supported yet. There are 23000 tasks spawned for this stage. It is a non-partitioned write. How can the configuration be tweaked to make this job work?

Executor logs:
I20241216 17:07:15.881199 292073 WholeStageResultIterator.cc:234] Spill[root/root]: successfully reclaimed total 0B with shrunken 0B and spilled 0B.
I20241216 17:07:15.888181 292073 WholeStageResultIterator.cc:230] Spill[root/root]: trying to request spill for 8.00MB.
I20241216 17:07:15.888319 292073 WholeStageResultIterator.cc:234] Spill[root/root]: successfully reclaimed total 0B with shrunken 0B and spilled 0B.
I20241216 17:07:15.888373 292073 WholeStageResultIterator.cc:230] Spill[root/root]: trying to request spill for 8.00MB.
I20241216 17:07:15.888398 292073 WholeStageResultIterator.cc:234] Spill[root/root]: successfully reclaimed total 0B with shrunken 0B and spilled 0B.
I20241216 17:07:15.888561 292073 WholeStageResultIterator.cc:230] Spill[root/root]: trying to request spill for 509.40MB.
I20241216 17:07:15.888585 292073 WholeStageResultIterator.cc:234] Spill[root/root]: successfully reclaimed total 0B with shrunken 0B and spilled 0B.
I20241216 17:07:15.888605 292073 WholeStageResultIterator.cc:230] Spill[root/root]: trying to request spill for 509.40MB.
I20241216 17:07:15.888621 292073 WholeStageResultIterator.cc:234] Spill[root/root]: successfully reclaimed total 0B with shrunken 0B and spilled 0B.
24/12/16 17:07:15 ERROR ManagedReservationListener: Error reserving memory from target org.apache.gluten.memory.memtarget.ThrowOnOomMemoryTarget$OutOfMemoryException: Not enough spark off-heap execution memory. Acquired: 8.0 MiB, granted: 2.0 MiB.
Try tweaking config option spark.memory.offHeap.size to get larger space to run this application (if spark.gluten.memory.dynamic.offHeap.sizing.enabled is not enabled). Current config settings: spark.gluten.memory.offHeap.size.in.bytes=75.0 GiB spark.gluten.memory.task.offHeap.size.in.bytes=4.7 GiB spark.gluten.memory.conservative.task.offHeap.size.in.bytes=2.3 GiB spark.memory.offHeap.enabled=true spark.gluten.memory.dynamic.offHeap.sizing.enabled=false Memory consumer stats: Task.28642: Current used bytes: 4.7 GiB, peak bytes: N/A \- Gluten.Tree.1251: Current used bytes: 4.7 GiB, peak bytes: 4.7 GiB \- root.1251: Current used bytes: 4.7 GiB, peak bytes: 4.7 GiB +- ArrowContextInstance.272: Current used bytes: 2.9 GiB, peak bytes: 4.3 GiB +- RowToColumnar.272: Current used bytes: 1696.0 MiB, peak bytes: 1698.0 MiB | \- single: Current used bytes: 1696.0 MiB, peak bytes: 1696.0 MiB | +- root: Current used bytes: 1695.8 MiB, peak bytes: 1696.0 MiB | | \- default_leaf: Current used bytes: 1695.8 MiB, peak bytes: 1695.8 MiB | \- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B +- VeloxWriter.272: Current used bytes: 112.0 MiB, peak bytes: 184.0 MiB | \- single: Current used bytes: 112.0 MiB, peak bytes: 184.0 MiB | +- root: Current used bytes: 111.5 MiB, peak bytes: 184.0 MiB | | +- datasource.272: Current used bytes: 111.5 MiB, peak bytes: 184.0 MiB | | | \- .general: Current used bytes: 111.5 MiB, peak bytes: 176.7 MiB | | \- default_leaf: Current used bytes: 0.0 B, peak bytes: 0.0 B | \- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B +- ColumnarToRow.393: Current used bytes: 64.0 MiB, peak bytes: 64.0 MiB | \- single: Current used bytes: 64.0 MiB, peak bytes: 64.0 MiB | +- root: Current used bytes: 64.0 MiB, peak bytes: 64.0 MiB | | \- default_leaf: Current used bytes: 64.0 MiB, peak bytes: 64.0 MiB | \- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B +- NativePlanEvaluator-1332.0: Current used 
bytes: 4.0 MiB, peak bytes: 16.0 MiB | \- single: Current used bytes: 4.0 MiB, peak bytes: 16.0 MiB | +- root: Current used bytes: 900.9 KiB, peak bytes: 15.0 MiB | | +- task.Gluten_Stage_26_TID_28642_VTID_1332: Current used bytes: 899.4 KiB, peak bytes: 14.0 MiB | | | +- node.1: Current used bytes: 482.5 KiB, peak bytes: 2.0 MiB | | | | \- op.1.0.0.FilterProject: Current used bytes: 482.5 KiB, peak bytes: 1443.3 KiB | | | +- node.3: Current used bytes: 294.4 KiB, peak bytes: 11.0 MiB | | | | \- op.3.0.0.FilterProject: Current used bytes: 294.4 KiB, peak bytes: 10.9 MiB | | | +- node.2: Current used bytes: 122.5 KiB, peak bytes: 1024.0 KiB | | | | \- op.2.0.0.FilterProject: Current used bytes: 122.5 KiB, peak bytes: 380.0 KiB | | | \- node.0: Current used bytes: 0.0 B, peak bytes: 0.0 B | | | \- op.0.0.0.ValueStream: Current used bytes: 0.0 B, peak bytes: 0.0 B | | \- default_leaf: Current used bytes: 1536.0 B, peak bytes: 1664.0 B | \- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B +- NativePlanEvaluator-1331.0: Current used bytes: 2.0 MiB, peak bytes: 8.0 MiB | \- single: Current used bytes: 2.0 MiB, peak bytes: 8.0 MiB | +- root: Current used bytes: 120.0 KiB, peak bytes: 2.0 MiB | | +- task.Gluten_Stage_26_TID_28642_VTID_1331: Current used bytes: 120.0 KiB, peak bytes: 2.0 MiB | | | +- node.1: Current used bytes: 96.0 KiB, peak bytes: 1024.0 KiB | | | | \- op.1.0.0.Unnest: Current used bytes: 96.0 KiB, peak bytes: 96.0 KiB | | | +- node.2: Current used bytes: 24.0 KiB, peak bytes: 1024.0 KiB | | | | \- op.2.0.0.FilterProject: Current used bytes: 24.0 KiB, peak bytes: 24.0 KiB | | | \- node.0: Current used bytes: 0.0 B, peak bytes: 0.0 B | | | \- op.0.0.0.ValueStream: Current used bytes: 0.0 B, peak bytes: 0.0 B | | \- default_leaf: Current used bytes: 0.0 B, peak bytes: 0.0 B | \- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B +- IteratorMetrics.1155.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 0.0 B +- 
VeloxWriter.272.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 55.2 MiB +- RowToColumnar.272.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 391.2 MiB +- NativePlanEvaluator-1331.0.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 2.4 MiB +- IndicatorVectorBase#init.1251.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 0.0 B +- ColumnarToRow.393.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 19.2 MiB +- IteratorMetrics.1155: Current used bytes: 0.0 B, peak bytes: 0.0 B | \- single: Current used bytes: 0.0 B, peak bytes: 0.0 B | +- root: Current used bytes: 0.0 B, peak bytes: 0.0 B | | \- default_leaf: Current used bytes: 0.0 B, peak bytes: 0.0 B | \- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B +- NativePlanEvaluator-1332.0.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 4.8 MiB \- IndicatorVectorBase#init.1251: Current used bytes: 0.0 B, peak bytes: 0.0 B \- single: Current used bytes: 0.0 B, peak bytes: 0.0 B +- root: Current used bytes: 0.0 B, peak bytes: 0.0 B | \- default_leaf: Current used bytes: 0.0 B, peak bytes: 0.0 B \- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B at org.apache.gluten.memory.memtarget.ThrowOnOomMemoryTarget.borrow(ThrowOnOomMemoryTarget.java:105) at org.apache.gluten.memory.listener.ManagedReservationListener.reserve(ManagedReservationListener.java:49) at org.apache.gluten.vectorized.NativeRowToColumnarJniWrapper.nativeConvertRowToColumnar(Native Method) Task logs: org.apache.spark.SparkException: Task failed while writing rows. 
at org.apache.spark.sql.errors.QueryExecutionErrors$.taskFailedWhileWritingRowsError(QueryExecutionErrors.scala:654) at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:448) at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$write$22(FileFormatWriter.scala:346) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.run(Task.scala:136) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1505) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:829) Caused by: org.apache.gluten.exception.GlutenException: org.apache.gluten.exception.GlutenException: Exception: VeloxRuntimeError Error Source: RUNTIME Error Code: INVALID_STATE Reason: Operator::getOutput failed for [operator: ValueStream, plan node ID: 0]: Error during calling Java code from native code: org.apache.gluten.memory.memtarget.ThrowOnOomMemoryTarget$OutOfMemoryException: Not enough spark off-heap execution memory. Acquired: 3.9 GiB, granted: 2.7 GiB. Try tweaking config option spark.memory.offHeap.size to get larger space to run this application (if spark.gluten.memory.dynamic.offHeap.sizing.enabled is not enabled). 
Current config settings: spark.gluten.memory.offHeap.size.in.bytes=79.0 GiB spark.gluten.memory.task.offHeap.size.in.bytes=4.9 GiB spark.gluten.memory.conservative.task.offHeap.size.in.bytes=2.5 GiB spark.memory.offHeap.enabled=true spark.gluten.memory.dynamic.offHeap.sizing.enabled=false Memory consumer stats: Task.28426: Current used bytes: 2.3 GiB, peak bytes: N/A \- Gluten.Tree.1304: Current used bytes: 2.3 GiB, peak bytes: 4.9 GiB \- root.1304: Current used bytes: 2.3 GiB, peak bytes: 4.9 GiB +- ArrowContextInstance.276: Current used bytes: 2000.0 MiB, peak bytes: 4.6 GiB +- RowToColumnar.276: Current used bytes: 152.0 MiB, peak bytes: 1952.0 MiB | \- single: Current used bytes: 152.0 MiB, peak bytes: 1952.0 MiB | +- root: Current used bytes: 151.6 MiB, peak bytes: 1952.0 MiB | | \- default_leaf: Current used bytes: 151.6 MiB, peak bytes: 1950.3 MiB | \- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B +- VeloxWriter.276: Current used bytes: 88.0 MiB, peak bytes: 176.0 MiB | \- single: Current used bytes: 88.0 MiB, peak bytes: 176.0 MiB | +- root: Current used bytes: 82.2 MiB, peak bytes: 176.0 MiB | | +- datasource.276: Current used bytes: 82.2 MiB, peak bytes: 176.0 MiB | | | \- .general: Current used bytes: 82.2 MiB, peak bytes: 175.8 MiB | | \- default_leaf: Current used bytes: 0.0 B, peak bytes: 0.0 B | \- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B +- ColumnarToRow.413: Current used bytes: 64.0 MiB, peak bytes: 64.0 MiB | \- single: Current used bytes: 64.0 MiB, peak bytes: 64.0 MiB | +- root: Current used bytes: 64.0 MiB, peak bytes: 64.0 MiB | | \- default_leaf: Current used bytes: 64.0 MiB, peak bytes: 64.0 MiB | \- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B +- NativePlanEvaluator-1381.0: Current used bytes: 4.0 MiB, peak bytes: 16.0 MiB | \- single: Current used bytes: 4.0 MiB, peak bytes: 16.0 MiB | +- root: Current used bytes: 722.1 KiB, peak bytes: 16.0 MiB | | +- 
task.Gluten_Stage_26_TID_28426_VTID_1381: Current used bytes: 720.6 KiB, peak bytes: 15.0 MiB | | | +- node.1: Current used bytes: 369.0 KiB, peak bytes: 2.0 MiB | | | | \- op.1.0.0.FilterProject: Current used bytes: 369.0 KiB, peak bytes: 1523.0 KiB | | | +- node.3: Current used bytes: 229.1 KiB, peak bytes: 12.0 MiB | | | | \- op.3.0.0.FilterProject: Current used bytes: 229.1 KiB, peak bytes: 11.2 MiB | | | +- node.2: Current used bytes: 122.5 KiB, peak bytes: 1024.0 KiB | | | | \- op.2.0.0.FilterProject: Current used bytes: 122.5 KiB, peak bytes: 380.0 KiB | | | \- node.0: Current used bytes: 0.0 B, peak bytes: 0.0 B | | | \- op.0.0.0.ValueStream: Current used bytes: 0.0 B, peak bytes: 0.0 B | | \- default_leaf: Current used bytes: 1536.0 B, peak bytes: 1664.0 B | \- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B +- NativePlanEvaluator-1380.0: Current used bytes: 2.0 MiB, peak bytes: 8.0 MiB | \- single: Current used bytes: 2.0 MiB, peak bytes: 8.0 MiB | +- root: Current used bytes: 120.0 KiB, peak bytes: 2.0 MiB | | +- task.Gluten_Stage_26_TID_28426_VTID_1380: Current used bytes: 120.0 KiB, peak bytes: 2.0 MiB | | | +- node.1: Current used bytes: 96.0 KiB, peak bytes: 1024.0 KiB | | | | \- op.1.0.0.Unnest: Current used bytes: 96.0 KiB, peak bytes: 96.0 KiB | | | +- node.2: Current used bytes: 24.0 KiB, peak bytes: 1024.0 KiB | | | | \- op.2.0.0.FilterProject: Current used bytes: 24.0 KiB, peak bytes: 24.0 KiB | | | \- node.0: Current used bytes: 0.0 B, peak bytes: 0.0 B | | | \- op.0.0.0.ValueStream: Current used bytes: 0.0 B, peak bytes: 0.0 B | | \- default_leaf: Current used bytes: 0.0 B, peak bytes: 0.0 B | \- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B +- IndicatorVectorBase#init.1304: Current used bytes: 0.0 B, peak bytes: 0.0 B | \- single: Current used bytes: 0.0 B, peak bytes: 0.0 B | +- root: Current used bytes: 0.0 B, peak bytes: 0.0 B | | \- default_leaf: Current used bytes: 0.0 B, peak bytes: 0.0 B | 
\- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B +- VeloxWriter.276.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 52.8 MiB +- NativePlanEvaluator-1381.0.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 4.8 MiB +- IteratorMetrics.1204.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 0.0 B +- ColumnarToRow.413.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 19.2 MiB +- RowToColumnar.276.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 585.6 MiB +- IteratorMetrics.1204: Current used bytes: 0.0 B, peak bytes: 0.0 B | \- single: Current used bytes: 0.0 B, peak bytes: 0.0 B | +- root: Current used bytes: 0.0 B, peak bytes: 0.0 B | | \- default_leaf: Current used bytes: 0.0 B, peak bytes: 0.0 B | \- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B +- IndicatorVectorBase#init.1304.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 0.0 B \- NativePlanEvaluator-1380.0.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 2.4 MiB at org.apache.gluten.memory.memtarget.ThrowOnOomMemoryTarget.borrow(ThrowOnOomMemoryTarget.java:105) at org.apache.gluten.memory.arrow.alloc.ManagedAllocationListener.onPreAllocation(ManagedAllocationListener.java:61) at org.apache.gluten.shaded.org.apache.arrow.memory.BaseAllocator.buffer(BaseAllocator.java:300) at org.apache.gluten.shaded.org.apache.arrow.memory.RootAllocator.buffer(RootAllocator.java:29) at org.apache.gluten.shaded.org.apache.arrow.memory.BaseAllocator.buffer(BaseAllocator.java:280) at org.apache.gluten.shaded.org.apache.arrow.memory.RootAllocator.buffer(RootAllocator.java:29) at org.apache.gluten.execution.RowToVeloxColumnarExec$$anon$1.next(RowToVeloxColumnarExec.scala:200) at org.apache.gluten.execution.RowToVeloxColumnarExec$$anon$1.next(RowToVeloxColumnarExec.scala:138) at org.apache.gluten.iterator.IteratorsV1$InvocationFlowProtection.next(IteratorsV1.scala:178) at org.apache.gluten.iterator.IteratorsV1$IteratorCompleter.next(IteratorsV1.scala:79) 
at org.apache.gluten.iterator.IteratorsV1$PayloadCloser.next(IteratorsV1.scala:41) at scala.collection.convert.Wrappers$IteratorWrapper.next(Wrappers.scala:33) at org.apache.gluten.vectorized.ColumnarBatchInIterator.next(ColumnarBatchInIterator.java:39) at org.apache.gluten.vectorized.ColumnarBatchOutIterator.nativeHasNext(Native Method) at org.apache.gluten.vectorized.ColumnarBatchOutIterator.hasNext0(ColumnarBatchOutIterator.java:57) at org.apache.gluten.iterator.ClosableIterator.hasNext(ClosableIterator.java:39) at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:45) at org.apache.gluten.iterator.IteratorsV1$InvocationFlowProtection.hasNext(IteratorsV1.scala:159) at org.apache.gluten.iterator.IteratorsV1$IteratorCompleter.hasNext(IteratorsV1.scala:71) at org.apache.gluten.iterator.IteratorsV1$PayloadCloser.hasNext(IteratorsV1.scala:37) at org.apache.gluten.iterator.IteratorsV1$LifeTimeAccumulator.hasNext(IteratorsV1.scala:100) at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460) at org.apache.spark.sql.execution.datasources.FileFormatDataWriter.writeWithIterator(FileFormatDataWriter.scala:95) at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$executeTask$1(FileFormatWriter.scala:429) at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1539) at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:438) at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$write$22(FileFormatWriter.scala:346) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.run(Task.scala:136) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1505) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551) at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:829) Retriable: False Function: operator() File: /home/abc/incubator-gluten/ep/build-velox/build/velox_ep/velox/exec/Driver.cpp Line: 601 Stack trace: 0 _ZN8facebook5velox7process10StackTraceC1Ei 1 _ZN8facebook5velox14VeloxExceptionC1EPKcmS3_St17basic_string_viewIcSt11char_traitsIcEES7_S7_S7_bNS1_4TypeES7_ 2 _ZN8facebook5velox6detail14veloxCheckFailINS0_17VeloxRuntimeErrorERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEvRKNS1_18VeloxCheckFailArgsET0_ 3 _ZZN8facebook5velox4exec6Driver11runInternalERSt10shared_ptrIS2_ERS3_INS1_13BlockingStateEERS3_INS0_9RowVectorEEENKUlvE3_clEv.cold 4 _ZN8facebook5velox4exec6Driver11runInternalERSt10shared_ptrIS2_ERS3_INS1_13BlockingStateEERS3_INS0_9RowVectorEE 5 _ZN8facebook5velox4exec6Driver4nextEPN5folly10SemiFutureINS3_4UnitEEE 6 _ZN8facebook5velox4exec4Task4nextEPN5folly10SemiFutureINS3_4UnitEEE 7 _ZN6gluten24WholeStageResultIterator4nextEv 8 Java_org_apache_gluten_vectorized_ColumnarBatchOutIterator_nativeHasNext 9 0x00007f36695f9a7b at org.apache.gluten.iterator.ClosableIterator.hasNext(ClosableIterator.java:41) at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:45) at org.apache.gluten.iterator.IteratorsV1$InvocationFlowProtection.hasNext(IteratorsV1.scala:159) at org.apache.gluten.iterator.IteratorsV1$IteratorCompleter.hasNext(IteratorsV1.scala:71) at org.apache.gluten.iterator.IteratorsV1$PayloadCloser.hasNext(IteratorsV1.scala:37) at org.apache.gluten.iterator.IteratorsV1$LifeTimeAccumulator.hasNext(IteratorsV1.scala:100) at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460) at org.apache.spark.sql.execution.datasources.FileFormatDataWriter.writeWithIterator(FileFormatDataWriter.scala:95) at 
org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$executeTask$1(FileFormatWriter.scala:429) at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1539) at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:438) ... 9 more Caused by: org.apache.gluten.exception.GlutenException: Exception: VeloxRuntimeError Error Source: RUNTIME Error Code: INVALID_STATE Reason: Operator::getOutput failed for [operator: ValueStream, plan node ID: 0]: Error during calling Java code from native code: org.apache.gluten.memory.memtarget.ThrowOnOomMemoryTarget$OutOfMemoryException: Not enough spark off-heap execution memory. Acquired: 3.9 GiB, granted: 2.7 GiB. Try tweaking config option spark.memory.offHeap.size to get larger space to run this application (if spark.gluten.memory.dynamic.offHeap.sizing.enabled is not enabled). Current config settings: spark.gluten.memory.offHeap.size.in.bytes=79.0 GiB spark.gluten.memory.task.offHeap.size.in.bytes=4.9 GiB spark.gluten.memory.conservative.task.offHeap.size.in.bytes=2.5 GiB spark.memory.offHeap.enabled=true spark.gluten.memory.dynamic.offHeap.sizing.enabled=false Memory consumer stats: Task.28426: Current used bytes: 2.3 GiB, peak bytes: N/A \- Gluten.Tree.1304: Current used bytes: 2.3 GiB, peak bytes: 4.9 GiB \- root.1304: Current used bytes: 2.3 GiB, peak bytes: 4.9 GiB +- ArrowContextInstance.276: Current used bytes: 2000.0 MiB, peak bytes: 4.6 GiB +- RowToColumnar.276: Current used bytes: 152.0 MiB, peak bytes: 1952.0 MiB | \- single: Current used bytes: 152.0 MiB, peak bytes: 1952.0 MiB | +- root: Current used bytes: 151.6 MiB, peak bytes: 1952.0 MiB | | \- default_leaf: Current used bytes: 151.6 MiB, peak bytes: 1950.3 MiB | \- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B +- VeloxWriter.276: Current used bytes: 88.0 MiB, peak bytes: 176.0 MiB | \- single: Current used bytes: 88.0 MiB, peak bytes: 176.0 MiB | +- root: 
Current used bytes: 82.2 MiB, peak bytes: 176.0 MiB | | +- datasource.276: Current used bytes: 82.2 MiB, peak bytes: 176.0 MiB | | | \- .general: Current used bytes: 82.2 MiB, peak bytes: 175.8 MiB | | \- default_leaf: Current used bytes: 0.0 B, peak bytes: 0.0 B | \- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B +- ColumnarToRow.413: Current used bytes: 64.0 MiB, peak bytes: 64.0 MiB | \- single: Current used bytes: 64.0 MiB, peak bytes: 64.0 MiB | +- root: Current used bytes: 64.0 MiB, peak bytes: 64.0 MiB | | \- default_leaf: Current used bytes: 64.0 MiB, peak bytes: 64.0 MiB | \- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B +- NativePlanEvaluator-1381.0: Current used bytes: 4.0 MiB, peak bytes: 16.0 MiB | \- single: Current used bytes: 4.0 MiB, peak bytes: 16.0 MiB | +- root: Current used bytes: 722.1 KiB, peak bytes: 16.0 MiB | | +- task.Gluten_Stage_26_TID_28426_VTID_1381: Current used bytes: 720.6 KiB, peak bytes: 15.0 MiB | | | +- node.1: Current used bytes: 369.0 KiB, peak bytes: 2.0 MiB | | | | \- op.1.0.0.FilterProject: Current used bytes: 369.0 KiB, peak bytes: 1523.0 KiB | | | +- node.3: Current used bytes: 229.1 KiB, peak bytes: 12.0 MiB | | | | \- op.3.0.0.FilterProject: Current used bytes: 229.1 KiB, peak bytes: 11.2 MiB | | | +- node.2: Current used bytes: 122.5 KiB, peak bytes: 1024.0 KiB | | | | \- op.2.0.0.FilterProject: Current used bytes: 122.5 KiB, peak bytes: 380.0 KiB | | | \- node.0: Current used bytes: 0.0 B, peak bytes: 0.0 B | | | \- op.0.0.0.ValueStream: Current used bytes: 0.0 B, peak bytes: 0.0 B | | \- default_leaf: Current used bytes: 1536.0 B, peak bytes: 1664.0 B | \- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B +- NativePlanEvaluator-1380.0: Current used bytes: 2.0 MiB, peak bytes: 8.0 MiB | \- single: Current used bytes: 2.0 MiB, peak bytes: 8.0 MiB | +- root: Current used bytes: 120.0 KiB, peak bytes: 2.0 MiB | | +- 
task.Gluten_Stage_26_TID_28426_VTID_1380: Current used bytes: 120.0 KiB, peak bytes: 2.0 MiB | | | +- node.1: Current used bytes: 96.0 KiB, peak bytes: 1024.0 KiB | | | | \- op.1.0.0.Unnest: Current used bytes: 96.0 KiB, peak bytes: 96.0 KiB | | | +- node.2: Current used bytes: 24.0 KiB, peak bytes: 1024.0 KiB | | | | \- op.2.0.0.FilterProject: Current used bytes: 24.0 KiB, peak bytes: 24.0 KiB | | | \- node.0: Current used bytes: 0.0 B, peak bytes: 0.0 B | | | \- op.0.0.0.ValueStream: Current used bytes: 0.0 B, peak bytes: 0.0 B | | \- default_leaf: Current used bytes: 0.0 B, peak bytes: 0.0 B | \- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B +- IndicatorVectorBase#init.1304: Current used bytes: 0.0 B, peak bytes: 0.0 B | \- single: Current used bytes: 0.0 B, peak bytes: 0.0 B | +- root: Current used bytes: 0.0 B, peak bytes: 0.0 B | | \- default_leaf: Current used bytes: 0.0 B, peak bytes: 0.0 B | \- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B +- VeloxWriter.276.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 52.8 MiB +- NativePlanEvaluator-1381.0.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 4.8 MiB +- IteratorMetrics.1204.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 0.0 B +- ColumnarToRow.413.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 19.2 MiB +- RowToColumnar.276.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 585.6 MiB +- IteratorMetrics.1204: Current used bytes: 0.0 B, peak bytes: 0.0 B | \- single: Current used bytes: 0.0 B, peak bytes: 0.0 B | +- root: Current used bytes: 0.0 B, peak bytes: 0.0 B | | \- default_leaf: Current used bytes: 0.0 B, peak bytes: 0.0 B | \- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B +- IndicatorVectorBase#init.1304.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 0.0 B \- NativePlanEvaluator-1380.0.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 2.4 MiB at 
org.apache.gluten.memory.memtarget.ThrowOnOomMemoryTarget.borrow(ThrowOnOomMemoryTarget.java:105) at org.apache.gluten.memory.arrow.alloc.ManagedAllocationListener.onPreAllocation(ManagedAllocationListener.java:61) at org.apache.gluten.shaded.org.apache.arrow.memory.BaseAllocator.buffer(BaseAllocator.java:300) at org.apache.gluten.shaded.org.apache.arrow.memory.RootAllocator.buffer(RootAllocator.java:29) at org.apache.gluten.shaded.org.apache.arrow.memory.BaseAllocator.buffer(BaseAllocator.java:280) at org.apache.gluten.shaded.org.apache.arrow.memory.RootAllocator.buffer(RootAllocator.java:29) at org.apache.gluten.execution.RowToVeloxColumnarExec$$anon$1.next(RowToVeloxColumnarExec.scala:200) at org.apache.gluten.execution.RowToVeloxColumnarExec$$anon$1.next(RowToVeloxColumnarExec.scala:138) at org.apache.gluten.iterator.IteratorsV1$InvocationFlowProtection.next(IteratorsV1.scala:178) at org.apache.gluten.iterator.IteratorsV1$IteratorCompleter.next(IteratorsV1.scala:79) at org.apache.gluten.iterator.IteratorsV1$PayloadCloser.next(IteratorsV1.scala:41) at scala.collection.convert.Wrappers$IteratorWrapper.next(Wrappers.scala:33) at org.apache.gluten.vectorized.ColumnarBatchInIterator.next(ColumnarBatchInIterator.java:39) at org.apache.gluten.vectorized.ColumnarBatchOutIterator.nativeHasNext(Native Method) at org.apache.gluten.vectorized.ColumnarBatchOutIterator.hasNext0(ColumnarBatchOutIterator.java:57) at org.apache.gluten.iterator.ClosableIterator.hasNext(ClosableIterator.java:39) at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:45) at org.apache.gluten.iterator.IteratorsV1$InvocationFlowProtection.hasNext(IteratorsV1.scala:159) at org.apache.gluten.iterator.IteratorsV1$IteratorCompleter.hasNext(IteratorsV1.scala:71) at org.apache.gluten.iterator.IteratorsV1$PayloadCloser.hasNext(IteratorsV1.scala:37) at org.apache.gluten.iterator.IteratorsV1$LifeTimeAccumulator.hasNext(IteratorsV1.scala:100) at 
scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460) at org.apache.spark.sql.execution.datasources.FileFormatDataWriter.writeWithIterator(FileFormatDataWriter.scala:95) at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$executeTask$1(FileFormatWriter.scala:429) at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1539) at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:438) at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$write$22(FileFormatWriter.scala:346) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90) at org.apache.spark.scheduler.Task.run(Task.scala:136) at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548) at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1505) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:829) Retriable: False Function: operator() File: /home/abc/incubator-gluten/ep/build-velox/build/velox_ep/velox/exec/Driver.cpp Line: 601 Stack trace: 0 _ZN8facebook5velox7process10StackTraceC1Ei 1 _ZN8facebook5velox14VeloxExceptionC1EPKcmS3_St17basic_string_viewIcSt11char_traitsIcEES7_S7_S7_bNS1_4TypeES7_ 2 _ZN8facebook5velox6detail14veloxCheckFailINS0_17VeloxRuntimeErrorERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEEvRKNS1_18VeloxCheckFailArgsET0_ 3 _ZZN8facebook5velox4exec6Driver11runInternalERSt10shared_ptrIS2_ERS3_INS1_13BlockingStateEERS3_INS0_9RowVectorEEENKUlvE3_clEv.cold 4 _ZN8facebook5velox4exec6Driver11runInternalERSt10shared_ptrIS2_ERS3_INS1_13BlockingStateEERS3_INS0_9RowVectorEE 5 _ZN8facebook5velox4exec6Driver4nextEPN5folly10SemiFutureINS3_4UnitEEE 6 
_ZN8facebook5velox4exec4Task4nextEPN5folly10SemiFutureINS3_4UnitEEE 7 _ZN6gluten24WholeStageResultIterator4nextEv 8 Java_org_apache_gluten_vectorized_ColumnarBatchOutIterator_nativeHasNext 9 0x00007f36695f9a7b at org.apache.gluten.vectorized.ColumnarBatchOutIterator.nativeHasNext(Native Method) at org.apache.gluten.vectorized.ColumnarBatchOutIterator.hasNext0(ColumnarBatchOutIterator.java:57) at org.apache.gluten.iterator.ClosableIterator.hasNext(ClosableIterator.java:39) ... 19 more
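
For reference, the per-task sizes printed in the task-log error seem to follow directly from the executor settings listed at the top. The sketch below reproduces them under one assumption that is inferred from the logged numbers and not checked against the Gluten source: the off-heap pool is split evenly across task slots (one per core), and the conservative size is half of the per-task size. The function name `per_task_offheap_gib` is made up for this illustration.

```python
from typing import Tuple

# Sketch: reproduce the per-task off-heap figures printed in the OOM message.
# Assumption (inferred from the logged values, not taken from the Gluten source):
#   per-task size     = total off-heap / task slots (one slot per core)
#   conservative size = per-task size / 2

MIB_PER_GIB = 1024


def per_task_offheap_gib(offheap_mib: int, task_slots: int) -> Tuple[float, float, float]:
    """Return (total, per-task, conservative per-task) sizes in GiB."""
    total_gib = offheap_mib / MIB_PER_GIB
    task_gib = total_gib / task_slots
    conservative_gib = task_gib / 2
    return total_gib, task_gib, conservative_gib


# offHeap: [amount: 80896] and cores: [amount: 16] from the executor configs.
total, task, conservative = per_task_offheap_gib(80896, 16)
print(f"total={total:.1f} GiB task={task:.1f} GiB conservative={conservative:.1f} GiB")
# prints: total=79.0 GiB task=4.9 GiB conservative=2.5 GiB
```

The same ratios hold for the executor-log error, which reports 75.0 GiB, 4.7 GiB and 2.3 GiB. In that dump Task.28642 is already at 4.7 GiB used, which equals the per-task figure, and most of it is held by ArrowContextInstance and RowToColumnar.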
