[GitHub] [arrow] nealrichardson commented on a change in pull request #7524: ARROW-8899 [R] Add R metadata like pandas metadata for round-trip fidelity

2020-06-23 Thread GitBox


nealrichardson commented on a change in pull request #7524:
URL: https://github.com/apache/arrow/pull/7524#discussion_r444311774



##
File path: r/R/table.R
##
@@ -202,7 +210,27 @@ Table$create <- function(..., schema = NULL) {
 
 #' @export
 as.data.frame.Table <- function(x, row.names = NULL, optional = FALSE, ...) {
-  Table__to_dataframe(x, use_threads = option_use_threads())
+  df <- Table__to_dataframe(x, use_threads = option_use_threads())
+
+  if (!is.null(r_metadata <- x$metadata$r)) {
+    r_metadata <- .arrow_unserialize(r_metadata)
+
+    df_metadata <- r_metadata[[1L]]

Review comment:
   IMO using a named list would be clearer than relying on position in an 
unnamed list
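
A rough sketch of the difference, using base R only. The payload below is made up for illustration: the field names `attributes` and `columns` are assumptions, not the names used in the PR, and base `serialize()`/`unserialize()` stand in for the package's internal helpers.

```r
# Hypothetical metadata payload; field names are illustrative assumptions.
positional <- list(
  list(class = "tbl_df"),
  list(a = list(units = "cm"))
)
named <- list(
  attributes = list(class = "tbl_df"),
  columns = list(a = list(units = "cm"))
)

# Positional access: correct only as long as the element order never changes.
df_metadata_by_position <- positional[[1L]]

# Named access: states the intent and keeps working if elements are reordered.
df_metadata_by_name <- named[["attributes"]]

# Names survive a serialize/unserialize round trip, so the reader can rely on them.
restored <- unserialize(serialize(named, NULL))
restored_df_metadata <- restored[["attributes"]]
```

With the named form, adding a third element later (or reordering the existing ones) would not silently change what `df_metadata` points to.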

##
File path: r/src/table.cpp
##
@@ -172,11 +172,20 @@ std::shared_ptr<arrow::Table> Table__from_dots(SEXP lst, SEXP schema_sxp) {
   std::shared_ptr<arrow::Schema> schema;
 
   if (Rf_isNull(schema_sxp)) {
-    // infer the schema from the ...
+    // infer the schema from the `...`

Review comment:
   I wonder if this whole block of code should be factored out so it can be 
used whether you're creating a Table or a RecordBatch. It's the same either 
way: return a Schema corresponding to the data given.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org





nealrichardson commented on a change in pull request #7524:
URL: https://github.com/apache/arrow/pull/7524#discussion_r444306795



##
File path: r/tests/testthat/test-Table.R
##
@@ -334,5 +334,5 @@ test_that("Table metadata", {
 
 test_that("Table handles null type (ARROW-7064)", {
   tab <- Table$create(a = 1:10, n = vctrs::unspecified(10))
-  expect_equal(tab$schema,  schema(a = int32(), n = null()))
+  expect_true(tab$schema$Equals(schema(a = int32(), n = null()), FALSE))

Review comment:
   I think you can use `expect_equivalent()` instead of `expect_equal()` and 
it will skip the metadata comparison; cf. 
https://github.com/apache/arrow/blob/master/r/R/arrow-package.R#L111 
   
   (Of course, we'll have to fix this for the new version of `testthat`, which 
doesn't use `all.equal()` under the hood.)
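
To illustrate the mechanism in base R, here is a minimal sketch, assuming `expect_equivalent()` ends up calling `all.equal()` with `check.attributes = FALSE` (as in the `testthat` version this comment refers to). The vectors and the `note` attribute are made-up stand-ins for a schema and its metadata, not Arrow objects.

```r
# Two values with the same data; only `x` carries extra metadata as an attribute.
x <- structure(1:3, note = "written by R")
y <- 1:3

# Default comparison also compares attributes, so the extra metadata is a mismatch.
strict <- isTRUE(all.equal(x, y))

# Ignoring attributes compares the underlying data only.
lenient <- isTRUE(all.equal(x, y, check.attributes = FALSE))
```

Here `strict` is `FALSE` and `lenient` is `TRUE`, which is the behaviour the test needs once the schema carries R metadata.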




