nurpax opened a new issue, #40800:
URL: https://github.com/apache/arrow/issues/40800
### Describe the usage question you have. Please include as many useful
details as possible.
Suppose I have some code like:
```
import pyarrow as pa
import pyarrow.parquet as pq

parquet_path = "dbg/index.parquet"
with pq.ParquetWriter(parquet_path, schema, compression='snappy',
                      filesystem=s3_fs) as w:
    for batch in batches:
        b = pa.RecordBatch.from_pydict(batch, schema=schema)
        w.write_batch(b)
```
What happens if an exception is thrown inside the `batches` loop, or the
program is SIGKILLed, or a write fails partway through? Will the resulting
file on the target filesystem be partially written?
I'm asking because on a local filesystem I'd expect this to leave a corrupt
file. In that case I write to a temporary file and rename it to the final
path only after the write succeeds.
But now I'm planning to write the Parquet file to S3. Will the data be
flushed to S3 atomically, i.e., can I trust that a successful write creates
a new object and a failed write creates nothing at all?
### Component(s)
Python