[ 
https://issues.apache.org/jira/browse/ARROW-13661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17401138#comment-17401138
 ] 

Charlie Gao commented on ARROW-13661:
-------------------------------------

Strictly speaking, R documents that the order of attributes does not matter.
But how else can you ensure that what you read back is exactly what you wrote?

The practical significance is that I hash the object before writing it, store
the hash, and then hash the restored object to verify integrity. For now I
have to handle the attribute order in my own code, but I see no good reason
for the attributes to be out of order in the first place. Thanks.
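
For illustration only, a minimal sketch of the kind of workaround described
above, using base R. The helper name normalize_attr_order is hypothetical (it
is not part of arrow), and the reordered copy of iris merely simulates the
restored object rather than going through read_feather.

```r
# Hypothetical helper (not part of arrow or base R): rebuild the attribute
# list in sorted-name order so attribute order cannot affect a comparison.
normalize_attr_order <- function(x) {
  a <- attributes(x)
  if (is.null(a)) return(x)
  # Radix sorting is locale-independent, so the order is stable across systems.
  attributes(x) <- a[sort(names(a), method = "radix")]
  x
}

# Simulate the restored object: same attributes as iris, different order.
restored <- iris
attributes(restored) <- attributes(iris)[c("names", "row.names", "class")]

# Order-sensitive comparison before and after normalizing both sides.
strict_before <- identical(iris, restored, attrib.as.set = FALSE)
strict_after <- identical(normalize_attr_order(iris),
                          normalize_attr_order(restored),
                          attrib.as.set = FALSE)
```

With something like this in place, the hash would be computed on the
normalized object both before writing and after reading.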

> [R] Objects Written to Feather Not Restored Exactly
> ---------------------------------------------------
>
>                 Key: ARROW-13661
>                 URL: https://issues.apache.org/jira/browse/ARROW-13661
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: R
>    Affects Versions: 5.0.0
>         Environment: R4.1.1, Ubuntu 20.04
>            Reporter: Charlie Gao
>            Priority: Major
>              Labels: arrow, feather
>
> Rather simple - write the standard 'iris' dataset to feather, then read it 
> back.
> At first glance everything looks the same, but passing 'attrib.as.set = 
> FALSE' to identical() returns FALSE.
> Comparing with waldo shows that the order of attributes is different on the 
> restored object. "class" should be the second attribute, after "names" but 
> before "row.names".
> This should be a simple fix to the 'read_feather' function to set the correct 
> order of attributes.
> ---
> iris <- iris
> arrow::write_feather(iris, file <- tempfile())
> iris2 <- arrow::read_feather(file)
> unlink(file)
> identical(iris, iris2)
> #> [1] TRUE
> identical(iris, iris2, attrib.as.set = FALSE)
> #> [1] FALSE
> waldo::compare(attributes(iris), attributes(iris2))
> #> `names(old)`: "names" "class" "row.names" 
> #> `names(new)`: "names" "row.names" "class"



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
