romainfrancois commented on pull request #8533:
URL: https://github.com/apache/arrow/pull/8533#issuecomment-718021313
I also had, in a branch that builds on top of #8256, ways to invalidate
objects early, once we know they won't be used anymore. For example, in
this function:
```r
collect.arrow_dplyr_query <- function(x, as_data_frame = TRUE, ...) {
  x <- ensure_group_vars(x)
  # Pull only the selected rows and cols into R
  if (query_on_dataset(x)) {
    # See dataset.R for Dataset and Scanner(Builder) classes
    tab <- Scanner$create(x)$ToTable()
  } else {
    # This is a Table/RecordBatch. See record-batch.R for the [ method
    tab <- x$.data[x$filtered_rows, x$selected_columns, keep_na = FALSE]
  }
  if (as_data_frame) {
    df <- as.data.frame(tab)
    # `tab` is not needed past this point, so release it right away
    tab$invalidate()
    restore_dplyr_features(df, x)
  } else {
    restore_dplyr_features(tab, x)
  }
}
```
Inside the `if (as_data_frame)` branch, as soon as `tab` is converted to a
`data.frame` we no longer need or use `tab`, so calling `$invalidate()` on
it runs the destructor of the shared pointer held by the external pointer that
lives in `tab`, rather than waiting for R's garbage collector to finalize it.
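
To make the idea concrete without depending on the branch, here is a rough base-R sketch of the pattern. It is only a toy model: `new_handle()`, `ptr`, and `is_valid()` are hypothetical names, and an environment holding an R vector stands in for the external pointer and its shared pointer. It is not how the arrow package implements `$invalidate()`.

```r
# Toy stand-in for an Arrow object: an environment that owns a payload.
# All names here are hypothetical and exist only for illustration.
new_handle <- function(payload) {
  self <- new.env()
  self$ptr <- payload
  self$invalidate <- function() {
    # Drop the handle's reference to the payload now, instead of
    # waiting for the handle itself to be garbage collected.
    self$ptr <- NULL
    invisible(self)
  }
  self$is_valid <- function() !is.null(self$ptr)
  self
}

tab <- new_handle(runif(1000))
df <- data.frame(x = tab$ptr)  # convert while the handle is still valid
tab$invalidate()               # the handle is not needed past this point

stopifnot(!tab$is_valid(), nrow(df) == 1000)
```

After `invalidate()`, any further use of `tab` would have to check `is_valid()` first, which is why this is only safe at points where we know the object is never touched again.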
Is this still worth having? If so, should I push it to #8256?
cc @nealrichardson