Aldrin created ARROW-2683:
-----------------------------

             Summary: Resource Warning (Unclosed File) when using pyarrow.parquet.read_table()
                 Key: ARROW-2683
                 URL: https://issues.apache.org/jira/browse/ARROW-2683
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 0.9.0
            Reporter: Aldrin

pyarrow version from python repl:
{noformat}
>>> import pyarrow
>>> pyarrow.__version__
'0.9.0.post1'
{noformat}

python interpreter information:
{noformat}
Python 3.6.5 (default, Mar 30 2018, 06:42:10)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.39.2)] on darwin
{noformat}

arbitrary, potentially relevant system information:
{noformat}
OS                       : macOS High Sierra (10.13.4)
homebrew package         : python: stable 3.6.5 (bottled), devel 3.7.0b4, HEAD
pip version              : pip 10.0.1
pipenv version           : pipenv, version 2018.05.18
pyarrow version (via pip): pyarrow 0.9.0.post1
cython version (via pip) : Cython 0.28.2
{noformat}

Issue Description:

I see a ResourceWarning, which doesn't seem to be an error, but seems important enough (a.k.a. annoying enough) that I thought it would be worth asking about. [~xhochy] was nice enough to respond in #general in the arrow slack.

The main problem is as follows:
# with this code in a python unittest:
{noformat}
def test_arrow_from_parquet(self):
    table = parquet.read_table(<path as str>)
{noformat}
I see this warning:
{noformat}
ResourceWarning: unclosed file <_io.BufferedReader name=<path_to_file>
{noformat}
# I tried adding the following, per Uwe's request:
{noformat}
warnings.simplefilter("error")
{noformat}
# I then see this information:
{noformat}
test_arrow_from_parquet (tests.datalayer_test.TestFileReader) ... Exception ignored in: <_io.FileIO name=<path_to_file> mode='rb' closefd=True>
ResourceWarning: unclosed file <_io.BufferedReader name=<path_to_file>>
{noformat}
# Uwe's thoughts:
{noformat}
That could be a valid error. We don't seem to close the file we open in `ParquetFile.__init__`
{noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
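
For readers who want to see the mechanism behind the warning without installing pyarrow, here is a minimal sketch using only the standard library. It is not pyarrow code: the helper names (read_without_closing, read_with_context_manager, count_resource_warnings) are hypothetical and exist only for this illustration. It assumes CPython, where a file object that is never closed emits a ResourceWarning when it is finalized, which matches the pattern Uwe describes (a file opened but not closed).

```python
import gc
import os
import tempfile
import warnings


def read_without_closing(path):
    # Hypothetical stand-in for code that opens a file and never closes it.
    f = open(path, "rb")
    return f.read()  # f goes out of scope still open


def read_with_context_manager(path):
    # Same read, but the file is closed deterministically on exit.
    with open(path, "rb") as f:
        return f.read()


def count_resource_warnings(func, path):
    # Record warnings instead of printing them; "always" disables the
    # default filter that hides ResourceWarning outside of test runners.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always", ResourceWarning)
        func(path)
        gc.collect()  # nudge finalization on non-refcounting interpreters
    return sum(1 for w in caught if issubclass(w.category, ResourceWarning))


if __name__ == "__main__":
    fd, tmp_path = tempfile.mkstemp()
    os.write(fd, b"some bytes")
    os.close(fd)
    try:
        print("unclosed:", count_resource_warnings(read_without_closing, tmp_path))
        print("context manager:", count_resource_warnings(read_with_context_manager, tmp_path))
    finally:
        os.remove(tmp_path)
```

With warnings.simplefilter("error") instead of "always", the warning is raised inside the file's finalizer, where exceptions cannot propagate; that is probably why the unittest output above shows "Exception ignored in: ..." rather than a failing test. Until the file handle is closed inside pyarrow, opening the file in a with block and passing the file object to read_table may avoid the warning, but this thread does not confirm that.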