[
https://issues.apache.org/jira/browse/ARROW-16421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17531410#comment-17531410
]
Weston Pace commented on ARROW-16421:
-------------------------------------
The code that ends up closing the file is rather spread out and not at all
obvious. For example, if we are scanning parquet files, then we will call
{{parquet::arrow::FileReader::GetRecordBatchGenerator}} in {{file_parquet.cc}}.
This creates a generator which is an instance of
{{parquet::arrow::RowGroupGenerator}}. When
{{parquet::arrow::RowGroupGenerator}} is destroyed, its members are destroyed.
One of them is a {{shared_ptr<parquet::arrow::<unnamed>::FileReaderImpl>}}.
This in turn has an owned reference to a {{parquet::ParquetFileReader}}, and the
destructor of {{parquet::ParquetFileReader}} closes the file.
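
To make that chain a bit more concrete, here is a minimal sketch of the same
ownership pattern. None of this is actual Arrow code: every {{Fake*}} type and
both helper functions are hypothetical stand-ins I made up for illustration,
and the real classes have many more members. It only shows that the "file"
closes when the last owner of the shared impl goes away, and not before.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for an OS file handle. The shared flag lets an
// outside observer see whether the "file" is still open.
class FakeFile {
 public:
  explicit FakeFile(std::shared_ptr<bool> is_open)
      : is_open_(std::move(is_open)) {
    *is_open_ = true;
  }
  ~FakeFile() { *is_open_ = false; }  // the close happens in the destructor
  FakeFile(const FakeFile&) = delete;
  FakeFile& operator=(const FakeFile&) = delete;

 private:
  std::shared_ptr<bool> is_open_;
};

// Plays the role of parquet::ParquetFileReader (assumption): owns the file.
struct FakeParquetFileReader {
  explicit FakeParquetFileReader(std::shared_ptr<bool> is_open)
      : file(std::move(is_open)) {}
  FakeFile file;
};

// Plays the role of FileReaderImpl (assumption): owns the low-level reader.
struct FakeFileReaderImpl {
  explicit FakeFileReaderImpl(std::shared_ptr<bool> is_open)
      : reader(std::make_unique<FakeParquetFileReader>(std::move(is_open))) {}
  std::unique_ptr<FakeParquetFileReader> reader;
};

// Plays the role of RowGroupGenerator (assumption): keeps the impl alive
// through a shared_ptr member.
struct FakeRowGroupGenerator {
  std::shared_ptr<FakeFileReaderImpl> impl;
};

// True if the file is open while the generator lives and closed once the
// generator, as the only owner, has been destroyed.
bool FileClosesWithGenerator() {
  auto is_open = std::make_shared<bool>(false);
  bool open_while_alive = false;
  {
    FakeRowGroupGenerator gen{std::make_shared<FakeFileReaderImpl>(is_open)};
    open_while_alive = *is_open;
  }  // gen is destroyed here and the chain of destructors runs
  return open_while_alive && !*is_open;
}

// True if a second owner of the impl keeps the file open after the
// generator is gone, and the file closes once that owner lets go.
bool ExtraOwnerKeepsFileOpen() {
  auto is_open = std::make_shared<bool>(false);
  std::shared_ptr<FakeFileReaderImpl> extra;
  {
    FakeRowGroupGenerator gen{std::make_shared<FakeFileReaderImpl>(is_open)};
    extra = gen.impl;  // someone else still holds a reference
  }
  const bool open_after_generator = *is_open;
  extra.reset();  // last owner released, destructors run now
  return open_after_generator && !*is_open;
}
```

Both helpers should return true, e.g. {{assert(FileClosesWithGenerator());}}.
The second one is the case that matters for this issue: as long as anything
still holds the generator (or the impl behind it), the file handle stays open.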
> [R] Permission error on Windows when deleting file in dataset
> -------------------------------------------------------------
>
> Key: ARROW-16421
> URL: https://issues.apache.org/jira/browse/ARROW-16421
> Project: Apache Arrow
> Issue Type: Improvement
> Components: R
> Affects Versions: 7.0.0
> Reporter: Will Jones
> Assignee: Will Jones
> Priority: Major
>
> On Windows this fails:
> {code:R}
> library(arrow)
> write_dataset(iris, "test_dataset")
> # Original example was with DuckDB, but that's not necessarily the issue
> # con <- open_dataset("test_dataset") |> to_duckdb()
> con <- open_dataset("test_dataset")$NewScan()$Finish()$ToRecordBatchReader()
> file.remove("test_dataset/part-0.parquet")
> #> Warning in file.remove("test_dataset/part-0.parquet"): cannot remove file
> #> 'test_dataset/part-0.parquet', reason 'Permission denied'
> #> [1] FALSE
> {code}
> But on macOS it does not:
> {code:r}
> library(arrow)
> write_dataset(iris, "test_dataset")
> # Original example was with DuckDB, but that's not necessarily the issue
> # con <- open_dataset("test_dataset") |> to_duckdb()
> con <- open_dataset("test_dataset")$NewScan()$Finish()$ToRecordBatchReader()
> file.remove("test_dataset/part-0.parquet")
> #> [1] TRUE
> {code}
--
This message was sent by Atlassian Jira
(v8.20.7#820007)