Weston Pace created ARROW-14175:
-----------------------------------

             Summary: [C++][Dataset] Add more fine-grained error for existing 
data to dataset writer
                 Key: ARROW-14175
                 URL: https://issues.apache.org/jira/browse/ARROW-14175
             Project: Apache Arrow
          Issue Type: Improvement
          Components: C++
            Reporter: Weston Pace


ARROW-13650 is adding different behaviors for handling existing data in the 
dataset writer.  One of these is a very coarse "error" behavior which will 
return an error if the destination directory has any files in it at all.

During the discussion of the PR, however, we decided it would be helpful to 
have a finer-grained error behavior that returns an error only if it 
encounters a file that would be overwritten.  This would allow someone to 
safely do a write that should only append new data.

This is a bit tricky, though, because the files to be written are not known 
ahead of time, so the error may be encountered after we have already started 
writing data.  The data already written would then need to be rolled back 
somehow.
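
One possible shape for such a rollback, sketched with std::filesystem (C++17) 
rather than the Arrow filesystem API: record each file as it is created and, 
if a later file turns out to exist already, delete only the recorded files.  
TrackedWriter and WriteAllOrNothing are hypothetical names, and this is an 
illustration of the idea under those assumptions, not a proposed 
implementation.

```cpp
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <string>
#include <system_error>
#include <utility>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper, not part of Arrow: remembers every file it creates
// so that a failed write can be undone.
class TrackedWriter {
 public:
  explicit TrackedWriter(fs::path dir) : dir_(std::move(dir)) {}

  // Creates `name` under the destination and writes `contents` to it.
  // Throws if the file already exists, leaving that file untouched.
  void WriteNewFile(const std::string& name, const std::string& contents) {
    fs::path target = dir_ / name;
    if (fs::exists(target)) {
      throw std::runtime_error("would overwrite " + target.string());
    }
    std::ofstream out(target, std::ios::binary);
    if (!out) {
      throw std::runtime_error("cannot open " + target.string());
    }
    created_.push_back(target);  // record before writing any data
    out << contents;
  }

  // Removes every file created so far.  Pre-existing files are never
  // recorded, so they are never deleted.
  void Rollback() {
    for (const auto& path : created_) {
      std::error_code ec;
      fs::remove(path, ec);  // best effort; errors are ignored here
    }
    created_.clear();
  }

 private:
  fs::path dir_;
  std::vector<fs::path> created_;
};

// Writes all files, or none: on the first collision, everything written by
// this call is rolled back and false is returned.
bool WriteAllOrNothing(
    const fs::path& dir,
    const std::vector<std::pair<std::string, std::string>>& files) {
  TrackedWriter writer(dir);
  try {
    for (const auto& [name, contents] : files) {
      writer.WriteNewFile(name, contents);
    }
  } catch (const std::runtime_error&) {
    writer.Rollback();
    return false;
  }
  return true;
}
```

The sketch ignores two things a real implementation would have to handle: the 
gap between the existence check and the file creation is a race if another 
writer is active, and a rollback on a remote or object-store filesystem may 
itself fail partway through.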



--
This message was sent by Atlassian Jira
(v8.3.4#803005)