raulcd commented on code in PR #48008:
URL: https://github.com/apache/arrow/pull/48008#discussion_r2522543213


##########
python/pyarrow/tests/parquet/test_basic.py:
##########
@@ -993,3 +993,13 @@ def test_checksum_write_to_dataset(tempdir):
     # checksum verification enabled raises an exception
     with pytest.raises(OSError, match="CRC checksum verification"):
         _ = pq.read_table(corrupted_file_path, page_checksum_verification=True)
+
+
[email protected](
+    "source", ["/tmp/", ["/tmp/file1.parquet", "/tmp/file2.parquet"]])
+def test_read_table_raises_value_error_when_ds_is_unavailable(monkeypatch, source):
+    # GH-47728
+    monkeypatch.setitem(sys.modules, "pyarrow.dataset", None)
+
+    with pytest.raises(ValueError, match="the 'source' argument"):
+        pq.read_table(source=source)

Review Comment:
   @AlenkaF @rok I was wondering: should we also validate this behavior with an S3 (MinIO) file test?
   
   We don't seem to have tests covering `read_table` with S3 buckets for Parquet files; there is some coverage for `pq.read_table` when the dataset module is available.
   
   Maybe we could create a follow-up issue to increase coverage on cloud filesystems?
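   
   For reference, a minimal sketch of what a MinIO-backed round trip could look like, assuming a locally running MinIO server; the endpoint, credentials, and bucket name below are placeholders, not the actual fixture wiring in the pyarrow test suite:
   
   ```python
   import pyarrow as pa
   import pyarrow.parquet as pq
   from pyarrow import fs
   
   # Placeholder MinIO endpoint and credentials (assumptions, not real fixtures).
   s3 = fs.S3FileSystem(
       access_key="minioadmin",
       secret_key="minioadmin",
       endpoint_override="localhost:9000",
       scheme="http",
   )
   
   # Create the bucket, write a small table, and read it back through S3.
   s3.create_dir("test-bucket")
   table = pa.table({"col": [1, 2, 3]})
   pq.write_table(table, "test-bucket/data.parquet", filesystem=s3)
   
   result = pq.read_table("test-bucket/data.parquet", filesystem=s3)
   assert result.equals(table)
   ```
   
   In an actual test we would presumably reuse whatever MinIO fixtures the suite already has for the filesystem tests rather than hard-coding credentials like above.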


