Weston Pace created ARROW-13590:
-----------------------------------

             Summary: Ensure dataset writing applies back pressure
                 Key: ARROW-13590
                 URL: https://issues.apache.org/jira/browse/ARROW-13590
             Project: Apache Arrow
          Issue Type: Improvement
          Components: C++
            Reporter: Weston Pace
             Fix For: 6.0.0


Dataset writing via an exec plan (ARROW-13542) does not currently apply back 
pressure, so it will use far more RAM than it should when writing a large 
dataset.  The write node should apply back pressure.  However, the preferred 
back pressure method (via scheduling) will need to wait for ARROW-13576.
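
To illustrate the general idea only, here is a minimal sketch of back pressure 
using a bounded queue between a producer and a writer.  It is not Arrow code 
and it is not the scheduling-based method preferred above: it simply blocks the 
producer when too many batches are pending.  The names run_pipeline and 
max_queued are hypothetical and exist only for this example.

```python
import queue
import threading


def run_pipeline(num_batches, max_queued):
    """Push batches through a bounded queue; return (batches written, peak depth)."""
    # A bounded queue: put() blocks once max_queued items are pending.
    # That blocking is the back pressure in this sketch.
    pending = queue.Queue(maxsize=max_queued)
    written = []
    peak = 0

    def writer():
        # Stand-in for the slow sink (e.g. writing files to disk).
        while True:
            batch = pending.get()
            if batch is None:  # sentinel: no more batches
                break
            written.append(batch)

    thread = threading.Thread(target=writer)
    thread.start()

    for i in range(num_batches):
        pending.put(i)  # blocks while the writer is max_queued batches behind
        peak = max(peak, pending.qsize())

    pending.put(None)
    thread.join()
    return len(written), peak
```

Because the producer can never get more than max_queued batches ahead of the 
writer, memory held in the queue stays bounded regardless of dataset size.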

Once those two issues (ARROW-13542 and ARROW-13576) are finished, this can be 
studied in more detail.  It may also be worth experimenting with the Linux 
vm.dirty_ratio setting.  In theory we should apply our own back pressure and 
not need to rely on the kernel's dirty page limit.  In practice, that may be 
more work than we want to tackle right now, and we may just let the kernel do 
its thing.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
