jinchengchenghh commented on issue #8851:
URL: 
https://github.com/apache/incubator-gluten/issues/8851#issuecomment-2921527884

   Q95 passes with a single thread but produces mismatched results with 
multiple threads, which points to a memory issue. spark-rapids uses a semaphore 
to control whether a task may execute on the GPU at a given moment; the config 
`spark.rapids.sql.concurrentGpuTasks` defaults to 2.
   So I assumed GPU memory could only be accessed by one thread at a time, but 
RMM does support concurrent access to GPU memory, so the reason the memory gets 
corrupted is still not clear.
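   One hedged way to test that hypothesis is to restrict the GPU semaphore to a 
single task; if Q95 then passes with multiple threads, the mismatch is tied to 
concurrent GPU access. A sketch (assumes a live PySpark session named `spark`):
   ```python
   # Hypothetical debugging step: allow only one task on the GPU at a time.
   # spark.rapids.sql.concurrentGpuTasks is a runtime-settable config.
   spark.conf.set("spark.rapids.sql.concurrentGpuTasks", "1")
   ```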
   ```python
   # Illustrative pseudocode only: cuDF's public Python API does not expose a
   # per-call `stream` argument, and `stream_pool` here stands in for an RMM
   # CUDA stream pool. The point is that each read runs on its own stream.
   stream1 = stream_pool.get_stream()
   stream2 = stream_pool.get_stream()

   # Process data asynchronously on separate streams so the reads can overlap
   df1 = cudf.read_csv("data1.csv", stream=stream1)
   df2 = cudf.read_csv("data2.csv", stream=stream2)
   ```
   ```
   The RAPIDS Accelerator can further limit the number of tasks that are 
actively sharing the GPU. It does this using a semaphore. When metrics or 
documentation refers to the GPU semaphore it is referring to this. This 
restriction is useful for avoiding GPU out of memory errors while still 
allowing full concurrency for the portions of the job that are not executing on 
the GPU. Care is taken to try and avoid doing I/O or other CPU operations while 
the GPU semaphore is held. But in the case of a join two batches are required 
for processing, and it is not always possible to avoid this case.
   ```
   
https://docs.nvidia.com/spark-rapids/user-guide/23.10/tuning-guide.html#number-of-concurrent-tasks-per-gpu
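   The semaphore pattern the tuning guide describes can be sketched with plain 
CPython threads (no GPU involved; names and counts here are illustrative, 
mirroring the default `spark.rapids.sql.concurrentGpuTasks=2`):
   ```python
   import threading
   import time

   # At most CONCURRENT_GPU_TASKS tasks "hold the GPU" at once; the other
   # tasks block on acquire, while CPU-only stages keep full concurrency.
   CONCURRENT_GPU_TASKS = 2
   gpu_semaphore = threading.Semaphore(CONCURRENT_GPU_TASKS)

   current = 0
   max_concurrent = 0
   counter_lock = threading.Lock()

   def run_task(task_id: int) -> None:
       global current, max_concurrent
       # CPU-side work (scan, shuffle fetch) would happen here, before
       # acquiring the semaphore, as the tuning guide recommends.
       with gpu_semaphore:  # blocks once 2 tasks already hold it
           with counter_lock:
               current += 1
               max_concurrent = max(max_concurrent, current)
           time.sleep(0.01)  # stand-in for GPU kernel time
           with counter_lock:
               current -= 1

   threads = [threading.Thread(target=run_task, args=(i,)) for i in range(8)]
   for t in threads:
       t.start()
   for t in threads:
       t.join()
   print("peak concurrent GPU tasks:", max_concurrent)  # bounded by 2
   ```
   Note this only limits how many tasks run GPU work concurrently; it does not 
by itself serialize access to a shared memory region, which is why RMM's 
concurrent-access support is the relevant question for the corruption.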


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
