Hi Aldrin,

Please try this:

sample_schema <- schema(!!!schema_fields)

The schema() function now uses rlang functions to evaluate its arguments, so
a list of fields stored in a variable needs to be unquoted and spliced in
with !!! rather than passed directly.

Ian

On Tue, Aug 17, 2021 at 5:22 PM Aldrin <[email protected]> wrote:

> Hello!
>
> I am pretty confused by the schema factory function in R, because I think
> what I'm doing should work, but it doesn't seem to. I have inlined the code
> below, but if there's an alternate way to setting the data types of a
> schema in R, then I would welcome recommendations for those as well.
>
> Anyways, the brief overview is that I want to create tables from matrices
> that will have anywhere from hundreds of columns to thousands, and
> specifying the schema inline is not going to be useful. I figure I should
> be able to create a named list and then pass it to the schema factory
> function, but I always get an error when trying to do so ("Error:
> !is.null(nms <- names(.list)) is not TRUE").
>
> I could update to arrow 5.0.0, but I assume that my problem shouldn't be a
> problem in arrow 4.0.1.
>
> Thanks for any help!
>
> Working code:
>
> Create an example data frame:
> sample_df <- data.frame(
>      SRR12=c(0)
>     ,SRR20=c(0)
>     ,SRR24=c(4)
>     ,SRR27=c(223)
>     ,row.names=c('ENSG3')
> )
>
> sample_df
>
>>       SRR12 SRR20 SRR24 SRR27
>> ENSG3     0     0     4   223
>
>
> Create an arrow table, specify the schema inline:
> sample_table <- Table$create(
>      sample_df
>     ,schema=schema(
>          SRR12=uint16()
>         ,SRR20=uint16()
>         ,SRR24=uint16()
>         ,SRR27=uint16()
>      )
> )
>
> sample_table
>
>> Table
>> 1 rows x 4 columns
>> $SRR12 <uint16>
>> $SRR20 <uint16>
>> $SRR24 <uint16>
>> $SRR27 <uint16>
>>
>
> Create a schema from a list, because we want > 1000 columns sometimes:
> schema_fields <- list(SRR12=uint16(), SRR20=uint16(), SRR24=uint16(),
> SRR27=uint16())
> sample_schema <- schema(schema_fields)
>
>> Error: !is.null(nms <- names(.list)) is not TRUE
>>
>
> schema_fields
>
>> $SRR12
>> UInt16
>> uint16
>>
>> $SRR20
>> UInt16
>> uint16
>>
>> $SRR24
>> UInt16
>> uint16
>>
>> $SRR27
>> UInt16
>> uint16
>
>
>
> Package information (system is macbook M1):
>
> brew info apache-arrow
>
> apache-arrow: stable 5.0.0 (bottled), HEAD
> Columnar in-memory analytics layer designed to accelerate big data
> https://arrow.apache.org/
> /opt/homebrew/Cellar/apache-arrow/4.0.1_2 (534 files, 92.9MB) *
>   Poured from bottle on 2021-07-07 at 16:10:51
> From:
> https://github.com/Homebrew/homebrew-core/blob/HEAD/Formula/apache-arrow.rb
> License: Apache-2.0
> ==> Dependencies
> Build: boost ✔, cmake ✘, llvm ✘
> Required: brotli ✔, glog ✔, grpc ✘, lz4 ✔, numpy ✘, [email protected] ✔,
> protobuf ✔, [email protected] ✔, rapidjson ✔, re2 ✘, snappy ✔, thrift ✔,
> utf8proc ✔, zstd ✔
> ==> Options
> --HEAD
>   Install HEAD version
> ==> Analytics
> install: 1,715 (30 days), 5,687 (90 days), 18,191 (365 days)
> install-on-request: 994 (30 days), 3,232 (90 days), 10,314 (365 days)
> build-error: 0 (30 days)
>
>
>
> arrow::arrow_info()
>
> Arrow package version: 4.0.1
>
> Capabilities:
>
> dataset    TRUE
> parquet    TRUE
> s3        FALSE
> utf8proc   TRUE
> re2        TRUE
> snappy     TRUE
> gzip       TRUE
> brotli     TRUE
> zstd       TRUE
> lz4        TRUE
> lz4_frame  TRUE
> lzo       FALSE
> bz2        TRUE
> jemalloc   TRUE
> mimalloc  FALSE
>
> Memory:
>
> Allocator jemalloc
> Current   256 bytes
> Max       2.31 Kb
>
> Runtime:
>
> SIMD Level          none
> Detected SIMD Level none
>
>
>
> Aldrin Montana
> Computer Science PhD Student
> UC Santa Cruz
>
