Dave Hirschfeld created ARROW-4823:
--------------------------------------

             Summary: read_csv shouldn't close file handles it doesn't own
                 Key: ARROW-4823
                 URL: https://issues.apache.org/jira/browse/ARROW-4823
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 0.12.1
            Reporter: Dave Hirschfeld


If a file handle is passed into `read_csv`, it is automatically closed:

```python

In [47]: csv = io.BytesIO(b'''issue_date_utc,variable_name,station_name,station_id,value_date_utc,value
    ...: 2019-02-26 22:00:00,TEMPERATURE,ARCHERFIELD,040211,2019-02-27 03:00,29.1
    ...: ''')

In [48]: pa.csv.read_csv(csv, convert_options=opts)
Out[48]: 
pyarrow.Table
issue_date_utc: timestamp[ns]
variable_name: string
station_name: string
station_id: int64
value_date_utc: string
value: double

In [49]: csv.seek(0)
Traceback (most recent call last):

  File "<ipython-input-50-0644e6e50712>", line 1, in <module>
    csv.seek(0)

ValueError: I/O operation on closed file.

```

This behaviour is in contrast to pandas, which leaves the file handle open.

Since the function didn't create the file handle, I don't think it should close it.
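
To illustrate the ownership convention I have in mind, here is a minimal sketch using only the standard library. `read_rows` is a hypothetical helper written for this example, not pyarrow's API or implementation: it closes a handle only when it opened that handle itself from a path.

```python
import csv
import io
import os


def read_rows(source):
    """Hypothetical reader: accepts a path or an open binary file-like object."""
    # The function owns the handle only if it was given a path and opened it.
    owns_handle = isinstance(source, (str, os.PathLike))
    handle = open(source, "rb") if owns_handle else source
    try:
        text = handle.read().decode("utf-8")
        return list(csv.reader(io.StringIO(text)))
    finally:
        # Close only what this function opened; a caller-supplied handle stays open.
        if owns_handle:
            handle.close()


buf = io.BytesIO(b"station_name,value\nARCHERFIELD,29.1\n")
rows = read_rows(buf)
buf.seek(0)  # still usable, because read_rows did not close the caller's handle
```

With this convention, the caller decides when a handle it created is closed, so it can rewind and reuse the same buffer after the read.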



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
