nealrichardson commented on a change in pull request #41:
URL: https://github.com/apache/arrow-cookbook/pull/41#discussion_r692182956
##########
File path: r/content/reading_and_writing_data.Rmd
##########
@@ -60,11 +60,11 @@ Given a Parquet file, it can be read back in by using
`arrow::read_parquet()`.
```{r, read_parquet}
parquet_tbl <- read_parquet("my_table.parquet")
-head(parquet_tbl)
+parquet_tbl
```
```{r, test_read_parquet, opts.label = "test"}
test_that("read_parquet works as expected", {
- expect_equivalent(dplyr::collect(parquet_tbl), tibble::tibble(group = c("A",
"B", "C"), score = c(99, 97, 99)))
+ expect_identical(as.data.frame(parquet_tbl), tibble::tibble(group = c("A",
"B", "C"), score = c(99, 97, 99)))
Review comment:
Ah, I know what it is: `parquet_tbl` is already a `tbl_df`, so
`dplyr::collect()` literally does nothing, and `as.data.frame.tbl_df` strips
the tibble class attribute. So it's not that we should call `as.data.frame()`
instead of `collect()`; we should call neither (and consider renaming the
variable if `_tbl` is ambiguous).
```suggestion
expect_identical(parquet_tbl, tibble::tibble(group = c("A", "B", "C"),
score = c(99, 97, 99)))
```
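To illustrate the reasoning above, here is a minimal sketch. It assumes the `tibble` and `dplyr` packages are installed, and it uses a hand-built tibble named `tbl` (a hypothetical stand-in, not the real object returned by `read_parquet()`):

```r
# Hypothetical stand-in for parquet_tbl, built by hand instead of read from disk
tbl <- tibble::tibble(group = c("A", "B", "C"), score = c(99, 97, 99))

# collect() on an in-memory tibble hands back the same object
collect_is_noop <- identical(dplyr::collect(tbl), tbl)

# as.data.frame() on a tibble leaves a plain data.frame behind
plain <- as.data.frame(tbl)
plain_class <- class(plain)

# The plain data.frame is no longer identical to the tibble it came from
still_identical <- identical(plain, tbl)
```

If this holds for the real `parquet_tbl` as well, comparing it directly against the expected tibble with `expect_identical()` is the strictest check, since nothing is coerced on either side.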