romainfrancois commented on pull request #9615:
URL: https://github.com/apache/arrow/pull/9615#issuecomment-828472890


   Marking this as ready to review. I've changed the approach this week so that 
it does not need to resort to locking. 
   
   This introduces the `RTasks` class, which factors out the handling of tasks 
that can run in parallel and tasks that cannot (because they might touch 
central R resources, e.g. protecting an R object). It has `void Append(bool 
parallel, Task&& task)` to add a task. Depending on `parallel`, the task is 
either added to the parallel task group, and potentially started immediately, 
or delayed until all the tasks have been added. 
   
   Then it has `Finish()`, which 1) runs the tasks that have been delayed and 
then 2) waits for the parallel tasks to finish. 
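
   To make the `Append()` / `Finish()` contract concrete, here is a minimal sketch. It is not the code from this PR: `RTasksSketch` and `DemoRTasksSketch` are hypothetical names, `std::async` stands in for the Arrow task group, and a plain `bool` stands in for a status result.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <future>
#include <utility>
#include <vector>

// Hypothetical stand-in for RTasks. Assumptions: std::async replaces the
// parallel task group, and bool (true == success) replaces a status type.
class RTasksSketch {
 public:
  using Task = std::function<bool()>;

  // Parallel-safe tasks may start right away on another thread. Tasks that
  // might touch R are only queued, and run later from Finish().
  void Append(bool parallel, Task&& task) {
    if (parallel) {
      parallel_tasks_.push_back(
          std::async(std::launch::async, std::move(task)));
    } else {
      delayed_tasks_.push_back(std::move(task));
    }
  }

  // 1) Run the delayed tasks on the calling thread, in insertion order.
  // 2) Wait for the parallel tasks. Returns false if any task failed.
  bool Finish() {
    bool ok = true;
    for (auto& task : delayed_tasks_) ok = task() && ok;
    for (auto& fut : parallel_tasks_) ok = fut.get() && ok;
    delayed_tasks_.clear();
    parallel_tasks_.clear();
    return ok;
  }

 private:
  std::vector<std::future<bool>> parallel_tasks_;
  std::vector<Task> delayed_tasks_;
};

// Usage: one parallel task and two delayed tasks.
bool DemoRTasksSketch() {
  std::atomic<int> parallel_done{0};
  std::vector<int> order;
  RTasksSketch tasks;
  tasks.Append(true, [&] { ++parallel_done; return true; });
  tasks.Append(false, [&] { order.push_back(1); return true; });
  tasks.Append(false, [&] { order.push_back(2); return true; });
  assert(order.empty());  // nothing delayed has run before Finish()
  bool ok = tasks.Finish();
  return ok && parallel_done == 1 && order.size() == 2;
}
```

   Because the delayed tasks only ever run from `Finish()` on the calling thread, they never race with each other for R resources, which is why this shape needs no locking.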
   
   With this, the `RConverter` class gained `virtual void DelayedExtend(SEXP 
values, int64_t size, RTasks& tasks)`. The idea is that an implementation might 
first do some setup work that has to happen on the main thread because it uses 
central R resources, but then the bulk of the work is either run in parallel if 
possible or delayed. 
   
   The `RStructConverter` implementation is a good example: it has to do some 
work upfront but can still benefit from parallel ingestion of its columns. 
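
   The "setup first, bulk later" shape of `DelayedExtend()` could look roughly like the sketch below. Everything in it is an assumption made for illustration: `TasksSketch`, `DoubleColumnSketch` and `StructConverterSketch` are hypothetical names, plain `std::vector<double>` columns stand in for `SEXP` values, and `std::async` stands in for the parallel task group.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <future>
#include <utility>
#include <vector>

// Minimal hypothetical stand-in for RTasks (std::async instead of Arrow).
struct TasksSketch {
  using Task = std::function<void()>;
  void Append(bool parallel, Task&& task) {
    if (parallel) {
      running.push_back(std::async(std::launch::async, std::move(task)));
    } else {
      delayed.push_back(std::move(task));
    }
  }
  void Finish() {
    for (auto& task : delayed) task();  // calling thread only
    for (auto& fut : running) fut.get();
    delayed.clear();
    running.clear();
  }
  std::vector<std::future<void>> running;
  std::vector<Task> delayed;
};

using Column = std::vector<double>;  // stand-in for an R vector (SEXP)

// Leaf converter: cheap setup now, bulk copy as a parallel task.
struct DoubleColumnSketch {
  std::vector<double> out;
  void DelayedExtend(const Column& values, int64_t size, TasksSketch& tasks) {
    assert(static_cast<int64_t>(values.size()) >= size);
    out.reserve(out.size() + static_cast<std::size_t>(size));  // setup
    // `values` is captured by reference, so the caller must keep it alive
    // until tasks.Finish() has returned.
    tasks.Append(true, [this, &values, size] {
      for (int64_t i = 0; i < size; ++i) out.push_back(values[i]);
    });
  }
};

// Struct-like converter: upfront bookkeeping, then one task per column.
struct StructConverterSketch {
  std::vector<DoubleColumnSketch> children;
  int64_t length = 0;
  void DelayedExtend(const std::vector<Column>& columns, int64_t size,
                     TasksSketch& tasks) {
    // Upfront work on the calling thread. In a real converter this is the
    // part that would touch R; here it is only bookkeeping.
    children.resize(columns.size());
    length += size;
    // Each child writes only to its own buffer, so the tasks are independent.
    for (std::size_t i = 0; i < columns.size(); ++i) {
      children[i].DelayedExtend(columns[i], size, tasks);
    }
  }
};
```

   A caller would invoke `DelayedExtend()` on the struct converter, then call `Finish()` on the tasks object before reading any child buffer.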
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

