lidavidm commented on code in PR #14414:
URL: https://github.com/apache/arrow/pull/14414#discussion_r996968927
##########
python/pyarrow/tests/test_dataset.py:
##########
@@ -3241,6 +3241,40 @@ def test_feather_format(tempdir, dataset_reader):
dataset_reader.to_table(ds.dataset(basedir, format="feather"))
[email protected]
+@pytest.mark.parametrize("compression", [
+ "lz4",
+ "zstd",
+ "brotli" # not supported
+])
+def test_feather_format_compressed(tempdir, compression, dataset_reader):
+ table = pa.table({'a': pa.array([1, 2, 3], type="int8"),
+ 'b': pa.array([.1, .2, .3], type="float64")})
+
+ basedir = tempdir / "feather_dataset"
+ basedir.mkdir()
+ file_format = ds.IpcFileFormat()
+ if compression == "brotli":
+ with pytest.raises(ValueError, match="Compression type"):
+ write_options = file_format.make_write_options(compression=compression)
+ with pytest.raises(ValueError, match="Compression type"):
+ codec = pa.Codec(compression)
+ write_options = file_format.make_write_options(compression=codec)
+ return
+
+ write_options = file_format.make_write_options(compression=compression)
+ ds.write_dataset(
+ table,
+ str(basedir / "data.arrow"),
+ format=file_format,
+ file_options=write_options
+ )
Review Comment:
It would be nice if we could confirm that the compression made it through,
but I don't see any way to actually get this metadata within PyArrow,
unfortunately.
##########
python/pyarrow/tests/test_dataset.py:
##########
@@ -3241,6 +3241,40 @@ def test_feather_format(tempdir, dataset_reader):
dataset_reader.to_table(ds.dataset(basedir, format="feather"))
[email protected]
+@pytest.mark.parametrize("compression", [
+ "lz4",
+ "zstd",
+ "brotli" # not supported
+])
+def test_feather_format_compressed(tempdir, compression, dataset_reader):
Review Comment:
I think additional marks are needed so that the test does not run when a
particular codec is not enabled: at least `pytest.mark.lz4` and
`pytest.mark.zstd` (I don't see one for Brotli). Alternatively, use
`Codec.is_available` with `pytest.skip`.