[ https://issues.apache.org/jira/browse/ARROW-16421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17531409#comment-17531409 ]
Will Jones commented on ARROW-16421:
------------------------------------
{quote}Windows is notoriously stubborn about deleting files that have any kind
of open handle so that makes sense. Another possibility that is often
frustrating when deleting recently created files is that the file is picked up
by a search indexer or an antivirus scanner of some sort. I suspect that is why
MinIO/S3 tests sporadically fail on Windows as well.
{quote}
I tested the above with a 20-second sleep between garbage collection and the
deletion and got exactly the same result, so I don't think an indexer or
antivirus scanner is the cause.
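For reference, a check of this kind might look roughly like the sketch below. It reuses the reproduction from the issue description; dropping the reader with rm() followed by gc() is an assumption about how the garbage collection step was done, and the variable names are illustrative only.
{code:R}
library(arrow)

# Write a small dataset and open a reader on it, as in the issue description.
write_dataset(iris, "test_dataset")
reader <- open_dataset("test_dataset")$NewScan()$Finish()$ToRecordBatchReader()

# Assumption: drop the only reference to the reader and force a collection,
# so that any file handle it owns has a chance to be released.
rm(reader)
invisible(gc())

# Wait long enough for an indexer or antivirus scanner to finish with the file.
Sys.sleep(20)

# file.remove() returns TRUE on success and FALSE (with a warning) on failure.
removed <- file.remove("test_dataset/part-0.parquet")
print(removed)
{code}
If a background scanner were holding the file, the delay should give it time to let go, so a failure that persists after the sleep points at a handle held inside the R process.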
{quote}The scanner does need to close its files. It takes care of this itself
as it finishes scanning.
{quote}
[~westonpace] Could you point me to the code where that happens? I'm having
trouble finding it.
> [R] Permission error on Windows when deleting file in dataset
> -------------------------------------------------------------
>
> Key: ARROW-16421
> URL: https://issues.apache.org/jira/browse/ARROW-16421
> Project: Apache Arrow
> Issue Type: Improvement
> Components: R
> Affects Versions: 7.0.0
> Reporter: Will Jones
> Assignee: Will Jones
> Priority: Major
>
> On Windows this fails:
> {code:R}
> library(arrow)
> write_dataset(iris, "test_dataset")
> # Original example was with DuckDB, but that's not necessarily the issue
> # con <- open_dataset("test_dataset") |> to_duckdb()
> con <- open_dataset("test_dataset")$NewScan()$Finish()$ToRecordBatchReader()
> file.remove("test_dataset/part-0.parquet")
> #> Warning in file.remove("test_dataset/part-0.parquet"): cannot remove file
> #> 'test_dataset/part-0.parquet', reason 'Permission denied'
> #> [1] FALSE
> {code}
> But on macOS it does not fail:
> {code:r}
> library(arrow)
> write_dataset(iris, "test_dataset")
> # Original example was with DuckDB, but that's not necessarily the issue
> # con <- open_dataset("test_dataset") |> to_duckdb()
> con <- open_dataset("test_dataset")$NewScan()$Finish()$ToRecordBatchReader()
> file.remove("test_dataset/part-0.parquet")
> #> [1] TRUE
> {code}