PMassicotte opened a new issue, #38382:
URL: https://github.com/apache/arrow/issues/38382
### Describe the bug, including details regarding any error messages, version, and platform.
``` r
# I have the following code:
library(tidyverse)
library(arrow)
#>
#> Attaching package: 'arrow'
#> The following object is masked from 'package:lubridate':
#>
#>     duration
#> The following object is masked from 'package:utils':
#>
#>     timestamp
bb <- s3_bucket(
  bucket = "cdoc",
  endpoint_override = "s3.valeria.science",
  anonymous = TRUE
)
open_dataset(bb) |>
  to_duckdb() |>
  summarise(mean_doc = mean(doc, na.rm = TRUE), .by = ecosystem) |>
  collect()
#> # A tibble: 5 × 2
#>   ecosystem mean_doc
#>   <chr>        <dbl>
#> 1 lake        1323.
#> 2 coastal      253.
#> 3 river        527.
#> 4 ocean         60.2
#> 5 estuary      235.
```
When I quit R I get this message:
``` r
Warning messages:
Connection is garbage-collected, use dbDisconnect() to avoid this.
Database is garbage-collected, use dbDisconnect(con, shutdown=TRUE) or duckdb::duckdb_shutdown(drv) to avoid this.
```
``` r
# One way to avoid this is to use an explicit connection (credit:
# https://discord.com/channels/909674491309850675/921100826884341781/1165053445657608222):
library(DBI)
library(duckdb)
drv <- duckdb()
con <- dbConnect(drv)
open_dataset(bb) |>
  to_duckdb(con = con) |>
  summarise(mean_doc = mean(doc, na.rm = TRUE), .by = ecosystem) |>
  collect()
#> # A tibble: 5 × 2
#>   ecosystem mean_doc
#>   <chr>        <dbl>
#> 1 lake        1323.
#> 2 river        527.
#> 3 coastal      253.
#> 4 ocean         60.2
#> 5 estuary      235.
dbDisconnect(con)
duckdb_shutdown(drv)
```
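The same cleanup can be wrapped in a small helper so that it also runs when the query fails. This is only a sketch under a few assumptions: `with_duckdb()` is a hypothetical name (not part of arrow or duckdb), the in-memory table stands in for `open_dataset(bb)` so the example runs offline, and `dbDisconnect(con, shutdown = TRUE)` (the call named in the warning) is assumed to be equivalent to the separate `dbDisconnect()` / `duckdb_shutdown()` calls above.

``` r
library(arrow)
library(dplyr)
library(DBI)
library(duckdb)

# Hypothetical helper: run `query` on an Arrow object through DuckDB and
# always disconnect and shut down the database afterwards.
with_duckdb <- function(data, query) {
  con <- dbConnect(duckdb())
  # on.exit() fires even if `query` errors, so the connection is not left
  # for the garbage collector to clean up.
  on.exit(dbDisconnect(con, shutdown = TRUE), add = TRUE)
  data |>
    to_duckdb(con = con) |>
    query() |>
    collect() # materialise the result before the connection is closed
}

# Small in-memory table standing in for open_dataset(bb)
tbl <- arrow_table(
  ecosystem = c("lake", "lake", "river"),
  doc = c(10, 20, 30)
)

res <- with_duckdb(tbl, function(x) {
  summarise(x, mean_doc = mean(doc, na.rm = TRUE), .by = ecosystem)
})
```

The result has to be collected inside the helper, because the lazy DuckDB table is no longer usable once the connection is closed.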
Is this warning expected, or should the connection be closed automatically?
<sup>Created on 2023-10-21 with [reprex v2.0.2](https://reprex.tidyverse.org)</sup>
### Component(s)
R