kou commented on code in PR #13677:
URL: https://github.com/apache/arrow/pull/13677#discussion_r957026581
##########
python/pyarrow/tests/test_dataset.py:
##########
@@ -4192,27 +4192,27 @@ def test_write_table_multiple_fragments(tempdir):
# Table with multiple batches written as single Fragment by default
base_dir = tempdir / 'single'
ds.write_dataset(table, base_dir, format="feather")
- assert set(base_dir.rglob("*")) == set([base_dir / "part-0.feather"])
+ assert set(base_dir.rglob("*")) == set([base_dir / "part-0.arrow"])
assert ds.dataset(base_dir, format="ipc").to_table().equals(table)
# Same for single-element list of Table
base_dir = tempdir / 'single-list'
ds.write_dataset([table], base_dir, format="feather")
- assert set(base_dir.rglob("*")) == set([base_dir / "part-0.feather"])
+ assert set(base_dir.rglob("*")) == set([base_dir / "part-0.arrow"])
assert ds.dataset(base_dir, format="ipc").to_table().equals(table)
# Provide list of batches to write multiple fragments
base_dir = tempdir / 'multiple'
ds.write_dataset(table.to_batches(), base_dir, format="feather")
assert set(base_dir.rglob("*")) == set(
- [base_dir / "part-0.feather"])
+ [base_dir / "part-0.arrow"])
Review Comment:
> For example, we use the `pyarrow.feather` module to handle IPC files with pyarrow. (not `pyarrow.ipc` or `pyarrow.arrow`.)
No. Normally, users use `pyarrow.ipc.open_file()`/`pyarrow.ipc.new_file()` for it. See also:
* https://arrow.apache.org/docs/python/ipc.html#writing-and-reading-random-access-files
* https://arrow.apache.org/cookbook/py/io.html#saving-arrow-arrays-to-disk
> For now, it may make sense here to leave the `.feather` extension for the `"feather"` case, and warn in the future?
@westonpace What do you think about this?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]