[ 
https://issues.apache.org/jira/browse/ARROW-16421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17579614#comment-17579614
 ] 

Miles Granger edited comment on ARROW-16421 at 8/15/22 9:46 AM:
----------------------------------------------------------------

[~westonpace] while working on ARROW-13763, I noticed that files on the C++ 
side are not closed in places where one might initially expect. For example, 
in file_ipc.cc, 
[IpcFileFormat::Inspect|https://github.com/apache/arrow/blob/bc1a16cd0eceeffe67893a7e8000d2dd28dcf3f1/cpp/src/arrow/dataset/file_ipc.cc#L134]
 could close the `reader` and then return the schema, but doesn't. Similarly, 
[SerializedFile::Close|https://github.com/apache/arrow/blob/bc1a16cd0eceeffe67893a7e8000d2dd28dcf3f1/cpp/src/parquet/file_reader.cc#L293]
 (ParquetFileReader::Contents) doesn't close the file; it deals with 
decryption keys instead. It would be possible, however, to add a 
{{source_.get()->Close()}} call immediately afterwards to close the 
RandomAccessFile.
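
To illustrate the close-before-return pattern described here, the following is a minimal sketch that uses only the C++ standard library, not Arrow's API. The names ToyReader, InspectAndClose and DemoRemoveAfterInspect are hypothetical, and the first line of a text file stands in for the schema. It does not reflect how IpcFileFormat::Inspect or SerializedFile::Close are implemented.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical stand-in for a reader that owns an open file handle.
class ToyReader {
 public:
  explicit ToyReader(const std::string& path) : stream_(path) {}

  // Reads the first line, standing in for "the schema".
  std::string ReadHeader() {
    std::string header;
    std::getline(stream_, header);
    return header;
  }

  // Explicitly releases the underlying OS file handle.
  void Close() { stream_.close(); }

  bool is_open() const { return stream_.is_open(); }

 private:
  std::ifstream stream_;
};

// Inspect-style helper: copy out what is needed, close, then return.
std::string InspectAndClose(ToyReader& reader) {
  std::string header = reader.ReadHeader();
  reader.Close();  // handle is released before the caller sees the result
  assert(!reader.is_open());
  return header;
}

// Writes a small file, inspects it, then deletes it while the reader
// object is still alive. Returns true if both steps behaved as expected.
bool DemoRemoveAfterInspect(const std::string& path) {
  {
    std::ofstream out(path);
    out << "sepal_length,sepal_width\n5.1,3.5\n";
  }  // the writer is closed at the end of this scope

  ToyReader reader(path);
  const std::string header = InspectAndClose(reader);

  // The reader is still in scope, but its handle is already closed,
  // so no open handle from this process blocks the removal.
  const bool removed = std::remove(path.c_str()) == 0;
  return removed && header == "sepal_length,sepal_width";
}
```

Calling DemoRemoveAfterInspect("toy_inspect_demo.csv") is expected to return true. Whether an explicit close is appropriate in the real readers depends on who else shares the underlying file, which this sketch does not model.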

Is my understanding correct? If so, it seems like it could be related to this 
issue, but I suppose there may be reasons for not doing so?



> [R] Permission error on Windows when deleting file in dataset
> -------------------------------------------------------------
>
>                 Key: ARROW-16421
>                 URL: https://issues.apache.org/jira/browse/ARROW-16421
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: R
>    Affects Versions: 7.0.0
>            Reporter: Will Jones
>            Assignee: Will Jones
>            Priority: Major
>
> On Windows this fails:
> {code:R}
> library(arrow)
> write_dataset(iris, "test_dataset")
> # Original example was with DuckDB, but that's not necessarily the issue
> # con <- open_dataset("test_dataset") |> to_duckdb()
> con <- open_dataset("test_dataset")$NewScan()$Finish()$ToRecordBatchReader()
> file.remove("test_dataset/part-0.parquet")
> #> Warning in file.remove("test_dataset/part-0.parquet"): cannot remove file
> #> 'test_dataset/part-0.parquet', reason 'Permission denied'
> #> [1] FALSE
> {code}
> But on macOS it does not:
> {code:r}
> library(arrow)
> write_dataset(iris, "test_dataset")
> # Original example was with DuckDB, but that's not necessarily the issue
> # con <- open_dataset("test_dataset") |> to_duckdb()
> con <- open_dataset("test_dataset")$NewScan()$Finish()$ToRecordBatchReader()
> file.remove("test_dataset/part-0.parquet")
> #> [1] TRUE
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
