[
https://issues.apache.org/jira/browse/ARROW-2029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16339468#comment-16339468
]
ASF GitHub Bot commented on ARROW-2029:
---------------------------------------
wesm closed pull request #1502: ARROW-2029: [Python] NativeFile.tell errors
after close
URL: https://github.com/apache/arrow/pull/1502
This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:
diff --git a/python/pyarrow/io.pxi b/python/pyarrow/io.pxi
index 5449872ff..bb363bacc 100644
--- a/python/pyarrow/io.pxi
+++ b/python/pyarrow/io.pxi
@@ -91,20 +91,20 @@ cdef class NativeFile:
         self._assert_writeable()
         file[0] = <shared_ptr[OutputStream]> self.wr_file
+    def _assert_open(self):
+        if not self.is_open:
+            raise ValueError("I/O operation on closed file")
+
     def _assert_readable(self):
+        self._assert_open()
         if not self.is_readable:
             raise IOError("only valid on readonly files")
-        if not self.is_open:
-            raise IOError("file not open")
-
     def _assert_writeable(self):
+        self._assert_open()
         if not self.is_writeable:
             raise IOError("only valid on writeable files")
-        if not self.is_open:
-            raise IOError("file not open")
-
     def size(self):
         """
         Return file size
@@ -120,6 +120,7 @@ cdef class NativeFile:
         Return current stream position
         """
         cdef int64_t position
+        self._assert_open()
         with nogil:
             if self.is_readable:
                 check_status(self.rd_file.get().Tell(&position))
diff --git a/python/pyarrow/tests/test_io.py b/python/pyarrow/tests/test_io.py
index e60dd35de..3f7aa2e1c 100644
--- a/python/pyarrow/tests/test_io.py
+++ b/python/pyarrow/tests/test_io.py
@@ -257,7 +257,7 @@ def test_inmemory_write_after_closed():
     f.write(b'ok')
     f.get_result()
-    with pytest.raises(IOError):
+    with pytest.raises(ValueError):
         f.write(b'not ok')
@@ -503,3 +503,27 @@ def test_native_file_modes(tmpdir):
     with pa.memory_map(path, 'r+b') as f:
         assert f.mode == 'rb+'
+
+
+def test_native_file_raises_ValueError_after_close(tmpdir):
+    path = os.path.join(str(tmpdir), guid())
+    with open(path, 'wb') as f:
+        f.write(b'foooo')
+
+    with pa.OSFile(path, mode='rb') as os_file:
+        pass
+
+    with pa.memory_map(path, mode='rb') as mmap_file:
+        pass
+
+    files = [os_file,
+             mmap_file]
+
+    methods = [('tell', ()),
+               ('seek', (0,)),
+               ('size', ())]
+
+    for f in files:
+        for method, args in methods:
+            with pytest.raises(ValueError):
+                getattr(f, method)(*args)
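
The guard ordering introduced by the diff above can be sketched in plain Python. This is only an illustration, not pyarrow code: `GuardedFile` is a hypothetical stand-in for `NativeFile` that wraps an `io.BytesIO`, and its constructor arguments are assumptions made for the example. It shows the open check running before the mode check, so any call on a closed file raises `ValueError` regardless of mode.

```python
import io


class GuardedFile:
    """Hypothetical stand-in for NativeFile; wraps an in-memory buffer."""

    def __init__(self, data=b"", readable=True, writeable=False):
        self._buf = io.BytesIO(data)
        self.is_open = True
        self.is_readable = readable
        self.is_writeable = writeable

    def close(self):
        self.is_open = False

    def _assert_open(self):
        # Same exception type and message as built-in file objects.
        if not self.is_open:
            raise ValueError("I/O operation on closed file")

    def _assert_readable(self):
        # The open check runs first, so a closed file never reaches
        # the mode check below.
        self._assert_open()
        if not self.is_readable:
            raise IOError("only valid on readonly files")

    def _assert_writeable(self):
        self._assert_open()
        if not self.is_writeable:
            raise IOError("only valid on writeable files")

    def tell(self):
        self._assert_open()
        return self._buf.tell()

    def read(self, nbytes=-1):
        self._assert_readable()
        return self._buf.read(nbytes)

    def write(self, data):
        self._assert_writeable()
        return self._buf.write(data)
```

With this shape, calling `write` on an open read-only file still raises `IOError`, while the same call after `close()` raises `ValueError`, which is the behaviour change the updated `test_inmemory_write_after_closed` expects.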
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
> [Python] Program crash on `HdfsFile.tell` if file is closed
> -----------------------------------------------------------
>
> Key: ARROW-2029
> URL: https://issues.apache.org/jira/browse/ARROW-2029
> Project: Apache Arrow
> Issue Type: Bug
> Reporter: Jim Crist
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.9.0
>
>
> Of all the `NativeFile` methods, `tell` is the only one that doesn't check
> whether the file is still open before running. This can crash the process
> when using HDFS:
>
> {code:java}
> >>> import pyarrow as pa
> >>> h = pa.hdfs.connect()
> 18/01/24 22:31:35 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 18/01/24 22:31:36 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
> >>> with h.open("/tmp/test.txt", mode='wb') as f:
> ... pass
> ...
> >>> f.tell()
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> # SIGSEGV (0xb) at pc=0x00007f52ccb6733d, pid=14868, tid=0x00007f52de2b9700
> #
> # JRE version: OpenJDK Runtime Environment (8.0_151-b12) (build 1.8.0_151-8u151-b12-0ubuntu0.16.04.2-b12)
> # Java VM: OpenJDK 64-Bit Server VM (25.151-b12 mixed mode linux-amd64 compressed oops)
> # Problematic frame:
> # V [libjvm.so+0x67c33d]
> #
> # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
> #
> # An error report file with more information is saved as:
> # /working/python/hs_err_pid14868.log
> #
> # If you would like to submit a bug report, please visit:
> # http://bugreport.java.com/bugreport/crash.jsp
> #
> Aborted
> {code}
> In Python, most file-like objects raise a `ValueError` when a method is called on a closed file:
> {code:java}
> >>> f = open("test.py", mode='wb')
> >>> f.close()
> >>> f.tell()
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> ValueError: I/O operation on closed file
> >>> import io
> >>> buf = io.BytesIO()
> >>> buf.close()
> >>> buf.tell()
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> ValueError: I/O operation on closed file.
> {code}
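
The standard-library behaviour described in the issue can be checked with a short script. This is a minimal sketch using only `io`, `os`, and `tempfile`; the helper `closed_file_errors` is a hypothetical name introduced for the example, and the list of methods probed is an arbitrary choice.

```python
import io
import os
import tempfile


def closed_file_errors(f, calls):
    """Return {method name: exception type name or None} for each call on f."""
    results = {}
    for name, args in calls:
        try:
            getattr(f, name)(*args)
        except Exception as exc:
            results[name] = type(exc).__name__
        else:
            results[name] = None
    return results


calls = [("tell", ()), ("seek", (0,)), ("read", ())]

# In-memory buffer, closed explicitly.
buf = io.BytesIO(b"foooo")
buf.close()
buf_results = closed_file_errors(buf, calls)
print(buf_results)

# Regular file on disk, closed by leaving the `with` block.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open(path, "rb") as f:
        pass
    file_results = closed_file_errors(f, calls)
    print(file_results)
finally:
    os.remove(path)
```

Both dictionaries should report `ValueError` for every method, which is the convention the fix adopts for `NativeFile`.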
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)