Dave Hirschfeld created ARROW-4823:
--------------------------------------
Summary: read_csv shouldn't close file handles it doesn't own
Key: ARROW-4823
URL: https://issues.apache.org/jira/browse/ARROW-4823
Project: Apache Arrow
Issue Type: Bug
Components: Python
Affects Versions: 0.12.1
Reporter: Dave Hirschfeld
If a file handle is passed to `read_csv`, it is closed automatically:
```python
In [47]: csv = io.BytesIO(b'''issue_date_utc,variable_name,station_name,station_id,value_date_utc,value
    ...: 2019-02-26 22:00:00,TEMPERATURE,ARCHERFIELD,040211,2019-02-27 03:00,29.1
    ...: ''')
In [48]: pa.csv.read_csv(csv, convert_options=opts)
Out[48]:
pyarrow.Table
issue_date_utc: timestamp[ns]
variable_name: string
station_name: string
station_id: int64
value_date_utc: string
value: double
In [49]: csv.seek(0)
Traceback (most recent call last):
  File "<ipython-input-50-0644e6e50712>", line 1, in <module>
    csv.seek(0)
ValueError: I/O operation on closed file.
```
This behaviour contrasts with pandas, which leaves the file handle open.
Since the function didn't create the file handle, I don't think it should
close it.
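To illustrate the ownership rule being asked for, here is a minimal standard-library sketch. It is not pyarrow's implementation: `read_rows` is a hypothetical reader invented for this example, and the sample data is made up. The function closes a handle only when it opened that handle itself (from a path) and leaves a caller-supplied handle open.

```python
import csv
import io
import os
from contextlib import ExitStack


def read_rows(source):
    """Hypothetical reader: accepts a path or an open binary file-like object.

    A handle opened here is closed here; a handle supplied by the caller
    is left open, since the caller owns it.
    """
    with ExitStack() as stack:
        if isinstance(source, (str, os.PathLike)):
            # We open the file, so we own it and register it for closing.
            handle = stack.enter_context(open(source, "rb"))
        else:
            # Caller-owned handle: read from it, but never close it.
            handle = source
        text = handle.read().decode("utf-8")
    return list(csv.reader(io.StringIO(text)))


buf = io.BytesIO(b"station_name,value\nARCHERFIELD,29.1\n")
rows = read_rows(buf)
buf.seek(0)  # still works, because read_rows did not close the buffer
```

With this pattern the caller can rewind and reuse the same buffer after the read, which is what the `csv.seek(0)` call in the report expects.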