kecookier opened a new issue, #8268:
URL: https://github.com/apache/incubator-gluten/issues/8268

   ### Backend
   
   VL (Velox)
   
   ### Bug description
   
   Backtrace of the exception:
   ```
   24/12/18 17:07:41 INFO Executor task launch worker for task 78584 ColumnarShuffleWriter: Gluten shuffle writer: Spilled 4250790592 / 8388608 bytes of data
   24/12/18 17:07:41 INFO Executor task launch worker for task 78584 RetryOnOomMemoryTarget: Retrying spill require:1436968550 got:1431306240
   24/12/18 17:07:41 INFO Executor task launch worker for task 78584 ColumnarShuffleWriter: Gluten shuffle writer: Trying to spill 9223372036854775807 bytes of data
   24/12/18 17:07:41 ERROR Executor task launch worker for task 78584 ManagedReservationListener: Error unreserving memory from target
   java.lang.IllegalStateException
        at org.apache.gluten.shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:133)
        at org.apache.gluten.memory.memtarget.OverAcquire.repay(OverAcquire.java:77)
        at org.apache.gluten.memory.memtarget.ThrowOnOomMemoryTarget.repay(ThrowOnOomMemoryTarget.java:124)
        at org.apache.gluten.memory.listener.ManagedReservationListener.unreserve(ManagedReservationListener.java:63)
        at org.apache.gluten.vectorized.ShuffleWriterJniWrapper.nativeEvict(Native Method)
        at org.apache.spark.shuffle.ColumnarShuffleWriter$$anon$1.spill(ColumnarShuffleWriter.scala:170)
        at org.apache.gluten.memory.memtarget.Spillers$AppendableSpillerList.spill(Spillers.java:86)
        at org.apache.gluten.memory.memtarget.Spillers$WithMinSpillSize.spill(Spillers.java:66)
        at org.apache.gluten.memory.memtarget.TreeMemoryTargets.spillTree(TreeMemoryTargets.java:80)
        at org.apache.gluten.memory.memtarget.TreeMemoryTargets.spillTree(TreeMemoryTargets.java:55)
        at org.apache.gluten.memory.memtarget.TreeMemoryTargets.spillTree(TreeMemoryTargets.java:73)
        at org.apache.gluten.memory.memtarget.TreeMemoryTargets.spillTree(TreeMemoryTargets.java:55)
        at org.apache.gluten.memory.memtarget.TreeMemoryTargets.spillTree(TreeMemoryTargets.java:73)
        at org.apache.gluten.memory.memtarget.TreeMemoryTargets.spillTree(TreeMemoryTargets.java:55)
        at org.apache.gluten.memory.memtarget.RetryOnOomMemoryTarget.retryingSpill(RetryOnOomMemoryTarget.java:60)
        at org.apache.gluten.memory.memtarget.RetryOnOomMemoryTarget.borrow(RetryOnOomMemoryTarget.java:40)
        at org.apache.gluten.memory.memtarget.OverAcquire.borrow(OverAcquire.java:63)
        at org.apache.gluten.memory.memtarget.ThrowOnOomMemoryTarget.borrow(ThrowOnOomMemoryTarget.java:40)
        at org.apache.gluten.memory.listener.ManagedReservationListener.reserve(ManagedReservationListener.java:49)
        at org.apache.gluten.vectorized.ShuffleWriterJniWrapper.write(Native Method)
        at org.apache.spark.shuffle.ColumnarShuffleWriter.internalWrite(ColumnarShuffleWriter.scala:177)
        at org.apache.spark.shuffle.ColumnarShuffleWriter.write(ColumnarShuffleWriter.scala:232)
        at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
        at org.apache.spark.scheduler.Task.run(Task.scala:134)
        at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:479)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1448)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:482)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
   ```
   
   After https://github.com/apache/incubator-gluten/pull/8132, `overTarget.borrow(overSize)` may trigger a retrying spill. When it does, the spill path calls `OverAcquire.repay()`, which asserts `Preconditions.checkState(overTarget.usedBytes() == 0);`. At that point, however, `0 < overTarget.usedBytes() < overSize`, so the check fails with the `IllegalStateException` in the backtrace. We can remove this precondition from `repay()` and keep it only in `borrow()`.
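
The re-entrancy can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions, not the Gluten source: the class `OverAcquireSketch`, its `strictRepay` flag, the `overUsed` counter (standing in for `overTarget.usedBytes()`), and the helper `failsDuringSpill` are all hypothetical names, and the spill is simulated by calling `repay()` directly in the middle of `borrow()`.

```java
/** Hypothetical, simplified model of the over-acquire accounting; not the Gluten code. */
final class OverAcquireSketch {
    // Stands in for overTarget.usedBytes() (assumption for illustration).
    private long overUsed = 0;
    // true: repay() keeps the "over target must be empty" check; false: check removed.
    private final boolean strictRepay;

    OverAcquireSketch(boolean strictRepay) {
        this.strictRepay = strictRepay;
    }

    /** Models borrow(): the over-acquire is granted in two steps with a spill in between. */
    void borrow(long overSize) {
        checkState(overUsed == 0, "borrow");   // precondition kept in borrow()
        long firstGrant = overSize / 2;
        overUsed += firstGrant;                // now 0 < overUsed < overSize
        repay(1);                              // simulated spill re-entering repay()
        overUsed += overSize - firstGrant;     // rest of the over-acquire is granted
        overUsed = 0;                          // over-acquired bytes are released again
    }

    /** Models repay(): optionally asserts that no over-acquired bytes are held. */
    long repay(long size) {
        if (strictRepay) {
            checkState(overUsed == 0, "repay");
        }
        return size;
    }

    private static void checkState(boolean ok, String where) {
        if (!ok) {
            throw new IllegalStateException("precondition failed in " + where);
        }
    }

    /** Returns true if a borrow with a mid-flight spill throws IllegalStateException. */
    static boolean failsDuringSpill(boolean strictRepay) {
        try {
            new OverAcquireSketch(strictRepay).borrow(100);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("strict repay fails: " + failsDuringSpill(true));
        System.out.println("relaxed repay fails: " + failsDuringSpill(false));
    }
}
```

In this model the strict variant throws as soon as the simulated spill calls `repay()` while the over-acquire is half granted, and the relaxed variant completes, which mirrors the proposal to drop the check from `repay()` only.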
   
   ### Spark version
   
   None
   
   ### Spark configurations
   
   _No response_
   
   ### System information
   
   _No response_
   
   ### Relevant logs
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
