[ 
https://issues.apache.org/jira/browse/ARROW-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Hirschfeld updated ARROW-4823:
-----------------------------------
    Description: 
If a file-handle is passed into `read_csv` it is automatically closed:

```python

In [47]: csv = io.BytesIO(b'''issue_date_utc,variable_name,station_name,station_id,value_date_utc,value
    ...: 2019-02-26 22:00:00,TEMPERATURE,ARCHERFIELD,040211,2019-02-27 03:00,29.1
    ...: ''')

In [48]: pa.csv.read_csv(csv, convert_options=opts)
Out[48]: 
pyarrow.Table
issue_date_utc: timestamp[ns]
variable_name: string
station_name: string
station_id: int64
value_date_utc: string
value: double

In [49]: csv.seek(0)
Traceback (most recent call last):

  File "<ipython-input-50-0644e6e50712>", line 1, in <module>
    csv.seek(0)

ValueError: I/O operation on closed file.

```

This behaviour is in contrast to pandas which leaves the file handle open.

Since the function didn't create the file handle I don't think it should close 
it.

  was:
If a file-handle is passed into `read_csv` it is automatically closed:

```python

In [47]: csv = io.BytesIO(b'''issue_date_utc,variable_name,station_name,station_id,value_date_utc,value
    ...: 2019-02-26 22:00:00,TEMPERATURE,ARCHERFIELD,040211,2019-02-27 03:00,29.1
    ...: ''')

In [48]: pa.csv.read_csv(csv, convert_options=opts)
Out[48]: 
pyarrow.Table
issue_date_utc: timestamp[ns]
variable_name: string
station_name: string
station_id: int64
value_date_utc: string
value: double

In [49]: csv.seek(0)
Traceback (most recent call last):

  File "<ipython-input-50-0644e6e50712>", line 1, in <module>
    csv.seek(0)

ValueError: I/O operation on closed file.

```

This behaviour is in contrast to pandas which leaves the file handle open.

Since the function didn't create the file handle I don't think it should close 
it.


> read_csv shouldn't close file handles it doesn't own
> ----------------------------------------------------
>
>                 Key: ARROW-4823
>                 URL: https://issues.apache.org/jira/browse/ARROW-4823
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.12.1
>            Reporter: Dave Hirschfeld
>            Priority: Minor
>
> If a file-handle is passed into `read_csv` it is automatically closed:
> ```python
> In [47]: csv = io.BytesIO(b'''issue_date_utc,variable_name,station_name,station_id,value_date_utc,value
>     ...: 2019-02-26 22:00:00,TEMPERATURE,ARCHERFIELD,040211,2019-02-27 03:00,29.1
>     ...: ''')
>
> In [48]: pa.csv.read_csv(csv, convert_options=opts)
> Out[48]: 
> pyarrow.Table
> issue_date_utc: timestamp[ns]
> variable_name: string
> station_name: string
> station_id: int64
> value_date_utc: string
> value: double
>
> In [49]: csv.seek(0)
> Traceback (most recent call last):
>
>   File "<ipython-input-50-0644e6e50712>", line 1, in <module>
>     csv.seek(0)
>
> ValueError: I/O operation on closed file.
> ```
> This behaviour is in contrast to pandas which leaves the file handle open.
> Since the function didn't create the file handle I don't think it should 
> close it.
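
For illustration, the ownership rule requested above could look roughly like the sketch below. It uses only the standard library; `read_rows` is a hypothetical helper made up for this example and is not pyarrow's implementation or API. The only point is that the reader closes a handle if, and only if, it opened that handle itself.

```python
import csv
import io
import os


def read_rows(source):
    """Read CSV rows from a path or from an already-open binary file object.

    Hypothetical helper for illustration only (not part of pyarrow).
    """
    # We own the handle only when we were given a path and opened it ourselves.
    owns_handle = isinstance(source, (str, bytes, os.PathLike))
    handle = open(source, "rb") if owns_handle else source
    try:
        text = handle.read().decode("utf-8")
    finally:
        # Never close a handle that the caller passed in.
        if owns_handle:
            handle.close()
    return list(csv.reader(io.StringIO(text, newline="")))


buf = io.BytesIO(b"station_name,value\nARCHERFIELD,29.1\n")
rows = read_rows(buf)
buf.seek(0)  # the caller's handle is still open, so this does not raise
```

With this pattern the caller stays responsible for closing `buf`, which matches the pandas behaviour described in the report.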



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
