[GitHub] [arrow] jonkeane commented on pull request #8650: ARROW-10570: [R] Use Converter API to convert SEXP to Array/ChunkedArray

GitBox Fri, 26 Feb 2021 15:03:57 -0800


jonkeane commented on pull request #8650:
URL: https://github.com/apache/arrow/pull/8650#issuecomment-786940734



   I've been helping benchmark these changes against the 3.0 release, and 
everything I'm seeing is in line: no performance regressions. I've used a 
handful of our real-world datasets along with some synthetic datasets made up 
of individual types, and they are all in line with the 3.0 release 
performance-wise, which is great. I've been adding the benchmarks to arrowbench; 
they are currently [in a PR there](https://github.com/ursa-labs/arrowbench/pull/9) in 
case you're curious about them.
   
   One thing I did notice is that simple features (sf) columns fail to convert 
on this branch, which does not happen in the 3.0 release.
   
   Here's a test that exercises the issue (I dug a bit to see if I could find 
the bug but haven't yet). The structure of the list column is meant to be 
minimal but is based on the failure I saw with a real sf tibble (see below).
   
   ```
   test_that("sf-like list columns", {
     df <- tibble::tibble(col = list(structure(list(1), class = c("one"))))
     expect_array_roundtrip(df)
   })
   ``` 
   
   The error and traceback are:
   
   ```
   <error/vctrs_error_scalar_type>
   Input must be a vector, not a `one` object.
   Backtrace:
       █
    1. ├─Table$create(df)
    2. │ └─arrow:::Table__from_dots(dots, schema)
    3. └─vctrs:::stop_scalar_type(...)
    4.   └─vctrs:::stop_vctrs(msg, "vctrs_error_scalar_type", actual = x)
   ```
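   
   A minimal sketch of where this class of error can come from, assuming it 
originates in vctrs' rule that an S3 object built on a list is treated as a 
scalar unless its class explicitly includes `"list"`. This is only an 
illustration of that rule with vctrs alone; it is not a diagnosis of the 
conversion code on this branch, and the variable names are hypothetical.
   
   ```r
   library(vctrs)

   # Same shape as the element inside the list column in the test above:
   # a list carrying an S3 class that does not include "list".
   bare <- structure(list(1), class = c("one"))

   # Hypothetical variant that also inherits explicitly from "list".
   with_list <- structure(list(1), class = c("one", "list"))

   # vec_is() reports whether vctrs considers the object a vector.
   is_vec_bare <- vec_is(bare)
   is_vec_list <- vec_is(with_list)

   # Asking for the size of the bare S3 list signals a classed error;
   # capture it rather than letting it abort the script.
   err <- tryCatch(vec_size(bare), error = function(e) e)
   err_is_scalar_type <- inherits(err, "vctrs_error_scalar_type")
   ```
   
   If `is_vec_bare` is `FALSE` while `is_vec_list` is `TRUE`, any code path 
that runs a vctrs vector check on each list element would reject sf-style 
elements in the same way as the traceback above.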
   
   A more naturalistic example is the following, which works in 3.0 but not 
on this branch:
   
   ```
   library(arrow)

   # Read the North Carolina shapefile that ships with sf, then convert it
   df_simple <- sf::read_sf(system.file("shape/nc.shp", package = "sf"))
   tab_simple <- Table$create(df_simple)
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
