[jira] [Commented] (ARROW-6300) [C++] Add io::OutputStream::Abort()
[ https://issues.apache.org/jira/browse/ARROW-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16922562#comment-16922562 ] Wes McKinney commented on ARROW-6300: - I think that continuing to call {{Close}} on non-closed files in the dtor makes sense. This logic could also be moved up into the binding layer, too > [C++] Add io::OutputStream::Abort() > --- > > Key: ARROW-6300 > URL: https://issues.apache.org/jira/browse/ARROW-6300 > Project: Apache Arrow > Issue Type: Wish > Components: C++ >Reporter: Antoine Pitrou >Priority: Major > > This method would abort the current output stream without trying to flush or > commit any pending internal data. This makes sense mostly for buffered > streams. For other streams it could simply synonymous to Close(). -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (ARROW-6300) [C++] Add io::OutputStream::Abort()
[ https://issues.apache.org/jira/browse/ARROW-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16922465#comment-16922465 ] Antoine Pitrou commented on ARROW-6300: --- Hmm, so if we call Abort() from the destructor, this will be a behaviour change, as we currently call Close() (flushing any pending data). Perhaps some code paths in third-party code rely on that. Also, this means that on the Python side, files created using e.g. {{pa.output_stream()}} wouldn't flush properly when destroyed. This would break expectations of Python users. So perhaps we should keep calling Close() from the destructor. What do you think? > [C++] Add io::OutputStream::Abort() > --- > > Key: ARROW-6300 > URL: https://issues.apache.org/jira/browse/ARROW-6300 > Project: Apache Arrow > Issue Type: Wish > Components: C++ >Reporter: Antoine Pitrou >Priority: Major > > This method would abort the current output stream without trying to flush or > commit any pending internal data. This makes sense mostly for buffered > streams. For other streams it could simply synonymous to Close(). -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (ARROW-6300) [C++] Add io::OutputStream::Abort()
[ https://issues.apache.org/jira/browse/ARROW-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16913120#comment-16913120 ] Antoine Pitrou commented on ARROW-6300: --- > generally, I thought it was frowned upon to work that could fail in the > destructor anyways Probably, but I think it's unavoidable to handle cases where the user didn't clean up explicitly. > Are we writing temporary files and then moving them (i.e. letting the OS > cleanup the temporary file)? No, we aren't doing that in any of the IO classes currently. > [C++] Add io::OutputStream::Abort() > --- > > Key: ARROW-6300 > URL: https://issues.apache.org/jira/browse/ARROW-6300 > Project: Apache Arrow > Issue Type: Wish > Components: C++ >Reporter: Antoine Pitrou >Priority: Major > > This method would abort the current output stream without trying to flush or > commit any pending internal data. This makes sense mostly for buffered > streams. For other streams it could simply synonymous to Close(). -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (ARROW-6300) [C++] Add io::OutputStream::Abort()
[ https://issues.apache.org/jira/browse/ARROW-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16912950#comment-16912950 ] Micah Kornfield commented on ARROW-6300: It sounds like it makes sense, I think not just hiding it in the destructor make sense (generally, I thought it was frowned upon to work that could fail in the destructor anyways). >> 2. For regulard file systems, would it make sense to delete the underlying >> file (they both seem to be about clearing underlying resources?) >I don't think so. The aim is to clear any temporary runtime resources >associated to the stream. We don't claim that Abort() will rollback any >persistent changes. Are we writing temporary files and then moving them (i.e. letting the OS cleanup the temporary file)? > [C++] Add io::OutputStream::Abort() > --- > > Key: ARROW-6300 > URL: https://issues.apache.org/jira/browse/ARROW-6300 > Project: Apache Arrow > Issue Type: Wish > Components: C++ >Reporter: Antoine Pitrou >Priority: Major > > This method would abort the current output stream without trying to flush or > commit any pending internal data. This makes sense mostly for buffered > streams. For other streams it could simply synonymous to Close(). -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (ARROW-6300) [C++] Add io::OutputStream::Abort()
[ https://issues.apache.org/jira/browse/ARROW-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16912119#comment-16912119 ] Antoine Pitrou commented on ARROW-6300: --- > 1. What is the use-case for such a method? Are we expected to abort midway > through? In some cases yes. Suppose you're writing a file as part of a larger pipeline. But another part of the pipeline fails, and you want to signal the stream that it can abort its operations. > 2. For regulard file systems, would it make sense to delete the underlying > file (they both seem to be about clearing underlying resources?) I don't think so. The aim is to clear any temporary runtime resources associated to the stream. We don't claim that Abort() will rollback any persistent changes. Perhaps this is not necessary and we can just rely on the destructor, since C++ destruction is deterministic. > [C++] Add io::OutputStream::Abort() > --- > > Key: ARROW-6300 > URL: https://issues.apache.org/jira/browse/ARROW-6300 > Project: Apache Arrow > Issue Type: Wish > Components: C++ >Reporter: Antoine Pitrou >Priority: Major > > This method would abort the current output stream without trying to flush or > commit any pending internal data. This makes sense mostly for buffered > streams. For other streams it could simply synonymous to Close(). -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (ARROW-6300) [C++] Add io::OutputStream::Abort()
[ https://issues.apache.org/jira/browse/ARROW-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16911928#comment-16911928 ] Micah Kornfield commented on ARROW-6300: Two questions: 1. What is the use-case for such a method? Are we expected to abort midway through? 2. For regulard file systems, would it make sense to delete the underlying file (they both seem to be about clearing underlying resources?) > [C++] Add io::OutputStream::Abort() > --- > > Key: ARROW-6300 > URL: https://issues.apache.org/jira/browse/ARROW-6300 > Project: Apache Arrow > Issue Type: Wish > Components: C++ >Reporter: Antoine Pitrou >Priority: Major > > This method would abort the current output stream without trying to flush or > commit any pending internal data. This makes sense mostly for buffered > streams. For other streams it could simply synonymous to Close(). -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (ARROW-6300) [C++] Add io::OutputStream::Abort()
[ https://issues.apache.org/jira/browse/ARROW-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16911427#comment-16911427 ] Wes McKinney commented on ARROW-6300: - Seems reasonable to have the default implementation be a no-op > [C++] Add io::OutputStream::Abort() > --- > > Key: ARROW-6300 > URL: https://issues.apache.org/jira/browse/ARROW-6300 > Project: Apache Arrow > Issue Type: Wish > Components: C++ >Reporter: Antoine Pitrou >Priority: Major > > This method would abort the current output stream without trying to flush or > commit any pending internal data. This makes sense mostly for buffered > streams. For other streams it could simply synonymous to Close(). -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (ARROW-6300) [C++] Add io::OutputStream::Abort()
[ https://issues.apache.org/jira/browse/ARROW-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16911406#comment-16911406 ] Antoine Pitrou commented on ARROW-6300: --- [~emkornfi...@gmail.com] opinions? > [C++] Add io::OutputStream::Abort() > --- > > Key: ARROW-6300 > URL: https://issues.apache.org/jira/browse/ARROW-6300 > Project: Apache Arrow > Issue Type: Wish > Components: C++ >Reporter: Antoine Pitrou >Priority: Major > > This method would abort the current output stream without trying to flush or > commit any pending internal data. This makes sense mostly for buffered > streams. For other streams it could simply synonymous to Close(). -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Commented] (ARROW-6300) [C++] Add io::OutputStream::Abort()
[ https://issues.apache.org/jira/browse/ARROW-6300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16911402#comment-16911402 ] Antoine Pitrou commented on ARROW-6300: --- Context: S3 multipart uploads must be either completed or aborted: https://docs.aws.amazon.com/AmazonS3/latest/dev/mpuoverview.html Aborting can be done implicitly in the destructor but it may be nice to have an explicit method call for that (especially to catch any errors). > [C++] Add io::OutputStream::Abort() > --- > > Key: ARROW-6300 > URL: https://issues.apache.org/jira/browse/ARROW-6300 > Project: Apache Arrow > Issue Type: Wish > Components: C++ >Reporter: Antoine Pitrou >Priority: Major > > This method would abort the current output stream without trying to flush or > commit any pending internal data. This makes sense mostly for buffered > streams. For other streams it could simply synonymous to Close(). -- This message was sent by Atlassian Jira (v8.3.2#803003)