milesgranger commented on code in PR #13821:
URL: https://github.com/apache/arrow/pull/13821#discussion_r943083763


##########
python/pyarrow/tests/parquet/test_parquet_file.py:
##########
@@ -277,3 +278,77 @@ def test_pre_buffer(pre_buffer):
     buf.seek(0)
     pf = pq.ParquetFile(buf, pre_buffer=pre_buffer)
     assert pf.read().num_rows == N
+
+
+def test_parquet_file_explicitly_closed(tmpdir):
+    """
+    Unopened files should be closed explicitly after use,
+    and previously opened files should be left open.
+    Applies to read_table, ParquetDataset, and ParquetFile
+    """
+    # create test parquet file
+    df = pd.DataFrame([{'col1': 0, 'col2': 0}, {'col1': 1, 'col2': 1}])
+    fn = str(tmpdir.join('file.parquet'))
+    df.to_parquet(fn)
+
+    pytest.importorskip('fsspec')

Review Comment:
   Nice! Good to be aware of this. After looking at it more closely, though, I 
think the test with `fs` was redundant with the unopened-file version of the 
same test. We patch `ParquetFile` to ensure `close` was called on it. 
`NativeFile` itself cannot be patched:
   
   > TypeError: can't set attributes of built-in/extension type 
'pyarrow.lib.NativeFile'
   
   but I think ensuring `ParquetFile` is closed suffices, especially since 
later on we check that it reflects the underlying source's closed status.
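
   For illustration, here is a rough sketch of the two behaviours described 
above, using only the standard library. `Reader` and `read_and_close` are 
hypothetical stand-ins (not pyarrow APIs) for a Python-level class such as 
`ParquetFile` and the code under test, and the built-in `list` type stands in 
for an extension type such as `NativeFile`; the exact error message may differ 
between Python versions.

```python
import io
from unittest import mock


class Reader:
    """Hypothetical stand-in for a Python-level class such as ParquetFile."""

    def __init__(self, source):
        self.source = source

    def close(self):
        self.source.close()


def read_and_close(source):
    """Hypothetical code under test: wraps a source, reads it, then closes."""
    reader = Reader(source)
    try:
        return reader.source.read()
    finally:
        reader.close()


# Patching a method on a Python-level class works, so the test can assert
# that close() was invoked exactly once.
with mock.patch.object(Reader, "close") as mock_close:
    data = read_and_close(io.BytesIO(b"data"))
    mock_close.assert_called_once()
    close_called = mock_close.called

# Built-in and extension types reject attribute assignment, which is the
# same class of failure as the TypeError quoted above.
try:
    list.clear = lambda self: None
    builtin_patch_failed = False
except TypeError:
    builtin_patch_failed = True
```

   Because the mock replaces `close` only for the duration of the `with` 
block, the class is restored afterwards, so other tests are unaffected.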



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.