[jira] [Created] (ARROW-9112) [R] Update autobrew script location
Neal Richardson created ARROW-9112: -- Summary: [R] Update autobrew script location Key: ARROW-9112 URL: https://issues.apache.org/jira/browse/ARROW-9112 Project: Apache Arrow Issue Type: Task Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 Jeroen is moving it to a different location. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-9083) [R] collect int64 as R integer type if not out of bounds
Neal Richardson created ARROW-9083: -- Summary: [R] collect int64 as R integer type if not out of bounds Key: ARROW-9083 URL: https://issues.apache.org/jira/browse/ARROW-9083 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Neal Richardson {{bit64::integer64}} can be awkward to work with in R (one example: https://github.com/apache/arrow/issues/7385). Often in Arrow we get {{int64}} types from [compute methods|https://github.com/apache/arrow/pull/7308] or other translation methods that auto-promote to the largest integer type, but they would fit fine in a 32-bit integer, which is R's native type. When calling {{Array__as_vector}} on an int64, we could first call the minmax function on the array, and if the extrema are within the range of a 32-bit int, return a regular R integer vector. This would add a little bit of ambiguity as to what R type you'll get from an Arrow type, but I wonder if the benefits are worth it since you can't do much with an integer64 in R. (We could also make this optional, similar to ARROW-7657, so you could specify a "strict" mode if you are in a use case where roundtrip fidelity is more important than R usability.) -- This message was sent by Atlassian Jira (v8.3.4#803005)
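A rough sketch of the bounds check proposed above, written in plain Python rather than against the Arrow API. The names `minmax`, `fits_r_integer`, and the `strict` flag are hypothetical stand-ins for illustration. The one fact relied on is that R's integer type is a 32-bit signed int whose lowest value is reserved for NA, so the usable range is slightly narrower than a raw int32.

```python
from typing import Optional, Sequence, Tuple

# R stores NA_integer_ as the minimum 32-bit value, so that value is not usable.
R_INT_MIN = -(2**31) + 1
R_INT_MAX = 2**31 - 1


def minmax(values: Sequence[Optional[int]]) -> Optional[Tuple[int, int]]:
    """Return (min, max) over the non-null values, or None if every value is null."""
    present = [v for v in values if v is not None]
    if not present:
        return None
    return min(present), max(present)


def fits_r_integer(values: Sequence[Optional[int]], strict: bool = False) -> bool:
    """Decide whether an int64 column could be returned as a native R integer vector.

    strict=True is a hypothetical opt-out that always keeps the 64-bit type,
    for callers who value round-trip fidelity over convenience.
    """
    if strict:
        return False
    extrema = minmax(values)
    if extrema is None:
        return True  # only nulls: nothing can overflow
    lo, hi = extrema
    return R_INT_MIN <= lo and hi <= R_INT_MAX
```

The check costs one extra pass over the data, and the resulting R type then depends on the values rather than only on the Arrow type, which is the ambiguity the description mentions.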
[jira] [Created] (ARROW-9070) [C++] StructScalar needs field accessor methods
Neal Richardson created ARROW-9070: -- Summary: [C++] StructScalar needs field accessor methods Key: ARROW-9070 URL: https://issues.apache.org/jira/browse/ARROW-9070 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Neal Richardson Fix For: 1.0.0 The minmax compute function returns a struct with fields "min" and "max". So to write an R binding for the {{min()}} method on arrow objects, I call "minmax" and then take the "min" field from the result. However, at least from my reading of scalar.h compared with array_nested.h, there are no field/GetFieldByName/etc. methods for StructScalar, so I can't get it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
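To illustrate the accessor being asked for, here is a minimal Python sketch. `StructScalarSketch` and its `field()` method are hypothetical and are not the Arrow C++ API; they only show the by-name lookup that a `min()` binding would need on top of a "minmax" result.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class StructScalarSketch:
    """Hypothetical stand-in for a struct scalar: parallel field names and values."""

    names: List[str]
    values: List[Any]

    def field(self, name: str) -> Any:
        """Look up a child value by field name."""
        try:
            return self.values[self.names.index(name)]
        except ValueError:
            raise KeyError(f"no field named {name!r}") from None


def minmax(xs: List[int]) -> StructScalarSketch:
    """Mimic a minmax aggregation that returns a struct with "min" and "max"."""
    return StructScalarSketch(["min", "max"], [min(xs), max(xs)])


def arrow_min(xs: List[int]) -> int:
    """What a min() binding wants to do: call minmax, then take the "min" field."""
    return minmax(xs).field("min")
```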
[jira] [Created] (ARROW-9069) [C++] MakeArrayFromScalar can't handle struct
Neal Richardson created ARROW-9069: -- Summary: [C++] MakeArrayFromScalar can't handle struct Key: ARROW-9069 URL: https://issues.apache.org/jira/browse/ARROW-9069 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Neal Richardson Fix For: 1.0.0 The R bindings translate data to/from Scalars by using the Array methods already implemented: to go from R object to a Scalar, it creates a length-1 Array and then slices out the 0th element with GetScalar(); to go from Scalar to R object, it calls MakeArrayFromScalar and then the as.vector method on that Array (in R, there is no scalar type anyway, only length-1 vectors). This generally works fine but if I get a Struct scalar (as the minmax compute function returns), I can't do anything with it because MakeArrayFromScalar doesn't work with structs. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-9056) [C++] Aggregation methods for Scalars?
Neal Richardson created ARROW-9056: -- Summary: [C++] Aggregation methods for Scalars? Key: ARROW-9056 URL: https://issues.apache.org/jira/browse/ARROW-9056 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Neal Richardson Fix For: 1.0.0 See discussion on https://github.com/apache/arrow/pull/7308. Many/most would no-op (sum, mean, min, max), but maybe they should exist and not error? Maybe they're not needed, but I could see how you might invoke a function on the result of a previous aggregation or something. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-9055) [C++] Add sum/mean kernels for Boolean type
Neal Richardson created ARROW-9055: -- Summary: [C++] Add sum/mean kernels for Boolean type Key: ARROW-9055 URL: https://issues.apache.org/jira/browse/ARROW-9055 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Neal Richardson Fix For: 1.0.0 See https://github.com/apache/arrow/pull/7308 (ARROW-6978) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-9054) [C++] Add ScalarAggregateOptions
Neal Richardson created ARROW-9054: -- Summary: [C++] Add ScalarAggregateOptions Key: ARROW-9054 URL: https://issues.apache.org/jira/browse/ARROW-9054 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Neal Richardson Fix For: 1.0.0 See discussion on https://github.com/apache/arrow/pull/7308. MinMax has an option for null behavior, but Sum and Mean do not. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-9046) [C++][R] Put more things in type_fwds
Neal Richardson created ARROW-9046: -- Summary: [C++][R] Put more things in type_fwds Key: ARROW-9046 URL: https://issues.apache.org/jira/browse/ARROW-9046 Project: Apache Arrow Issue Type: Improvement Components: C++, R Reporter: Neal Richardson Assignee: Ben Kietzman Fix For: 1.0.0 Hopefully to reduce compile time. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8984) [R] Revise install guides now that Windows conda package exists
Neal Richardson created ARROW-8984: -- Summary: [R] Revise install guides now that Windows conda package exists Key: ARROW-8984 URL: https://issues.apache.org/jira/browse/ARROW-8984 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8976) [C++] compute::CallFunction can't Filter/Take with ChunkedArray
Neal Richardson created ARROW-8976: -- Summary: [C++] compute::CallFunction can't Filter/Take with ChunkedArray Key: ARROW-8976 URL: https://issues.apache.org/jira/browse/ARROW-8976 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Neal Richardson Fix For: 1.0.0 Followup to ARROW-8938 {{Invalid: Kernel does not support chunked array arguments}} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8899) [R] Add R metadata like pandas metadata for round-trip fidelity
Neal Richardson created ARROW-8899: -- Summary: [R] Add R metadata like pandas metadata for round-trip fidelity Key: ARROW-8899 URL: https://issues.apache.org/jira/browse/ARROW-8899 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Neal Richardson Fix For: 1.0.0

Arrow Schema and Field objects have custom_metadata fields to store arbitrary strings in a key-value store. Pandas stores JSON in a "pandas" key and uses that to improve the fidelity of round-tripping data to Arrow/Parquet/Feather and back. https://pandas.pydata.org/docs/dev/development/developer.html#storing-pandas-dataframe-objects-in-apache-parquet-format describes this a bit.

You can see this pandas metadata in the sample Parquet file:

{code:r}
tab <- read_parquet(system.file("v0.7.1.parquet", package="arrow"), as_data_frame = FALSE)
tab
# Table
# 10 rows x 11 columns
# $carat
# $cut
# $color
# $clarity
# $depth
# $table
# $price
# $x
# $y
# $z
# $__index_level_0__
tab$metadata
# $pandas
# [1] "{\"index_columns\": [\"__index_level_0__\"], \"column_indexes\": [{\"name\": null, \"pandas_type\": \"string\", \"numpy_type\": \"object\", \"metadata\": null}], \"columns\": [{\"name\": \"carat\", \"pandas_type\": \"float64\", \"numpy_type\": \"float64\", \"metadata\": null}, {\"name\": \"cut\", \"pandas_type\": \"unicode\", \"numpy_type\": \"object\", \"metadata\": null}, {\"name\": \"color\", \"pandas_type\": \"unicode\", \"numpy_type\": \"object\", \"metadata\": null}, {\"name\": \"clarity\", \"pandas_type\": \"unicode\", \"numpy_type\": \"object\", \"metadata\": null}, {\"name\": \"depth\", \"pandas_type\": \"float64\", \"numpy_type\": \"float64\", \"metadata\": null}, {\"name\": \"table\", \"pandas_type\": \"float64\", \"numpy_type\": \"float64\", \"metadata\": null}, {\"name\": \"price\", \"pandas_type\": \"int64\", \"numpy_type\": \"int64\", \"metadata\": null}, {\"name\": \"x\", \"pandas_type\": \"float64\", \"numpy_type\": \"float64\", \"metadata\": null}, {\"name\": \"y\", \"pandas_type\": \"float64\", \"numpy_type\": \"float64\", \"metadata\": null}, {\"name\": \"z\", \"pandas_type\": \"float64\", \"numpy_type\": \"float64\", \"metadata\": null}, {\"name\": \"__index_level_0__\", \"pandas_type\": \"int64\", \"numpy_type\": \"int64\", \"metadata\": null}], \"pandas_version\": \"0.20.1\"}"
{code}

We should do something similar in R: store the "attributes" for each column in a data.frame when we convert to Arrow, and restore those attributes when we read from Arrow. Since ARROW-8703, you could naively do this all in R, something like:

{code:r}
tab$metadata$r <- lapply(df, attributes)
{code}

on the conversion to Arrow, and in as.data.frame(), do

{code:r}
if (!is.null(tab$metadata$r)) {
  df[] <- mapply(function(col, meta) {
    attributes(col) <- meta
    col  # return the column; the assignment alone evaluates to meta
  }, col = df, meta = tab$metadata$r, SIMPLIFY = FALSE)
}
{code}

However, it's trickier than this because:

* {{tab$metadata$r}} needs to be serialized to string and deserialized on the way back. Pandas uses JSON but arrow doesn't currently have a JSON R dependency. The C++ build does include rapidjson, maybe we could tap into that? Alternatively, we could {{dput()}} to dump the R attributes, which might have higher fidelity in addition to zero dependencies, but there are tradeoffs.
* We'll need to do the same for all places where Tables and RecordBatches are created/converted
* We'll need to make sure that nested types (structs) get the same coverage
* This metadata is only attached to Schemas, meaning that Arrays/ChunkedArrays don't have a place to store extra metadata. So we probably want to attach to the R6 (Chunked)Array objects a metadata/attributes field so that if we convert an R vector to array, or if we extract an array out of a record batch, we don't lose the attributes.

Doing this should resolve ARROW-4390 and make ARROW-8867 trivial as well.

Finally, a note about this custom metadata vs. extension types. Extension types can be defined by [adding metadata to a Field|https://arrow.apache.org/docs/format/Columnar.html#extension-types] (in a Schema). I think this is out of scope here because we're only concerned with R roundtrip fidelity. If there were a type that (for example) R and Pandas both had that Arrow did not, we could define an extension type so that we could share that across the implementations. But unless/until there is value in establishing that extension type standard, let's not worry about it. (In other words, in R we should ignore pandas metadata; if there's anything that pandas wants to share with R, it will define it somewhere else.)

-- This message was sent by Atlassian Jira (v8.3.4#803005)
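The serialization point in the first bullet above can be sketched in Python. This is an illustration under assumptions, not the proposed R implementation: the helper names and the shape of the attributes dictionary are hypothetical, and JSON is used only because the description mentions it as what pandas does. The one constraint taken from the text is that the metadata store holds strings, so the per-column attributes have to be flattened into a single string under an "r" key.

```python
import json
from typing import Any, Dict


def attributes_to_metadata(columns: Dict[str, Dict[str, Any]]) -> Dict[str, str]:
    """Serialize per-column attributes into string-to-string metadata.

    `columns` maps a column name to its attributes (hypothetical shape).
    """
    return {"r": json.dumps(columns, sort_keys=True)}


def restore_attributes(metadata: Dict[str, str]) -> Dict[str, Dict[str, Any]]:
    """Inverse of attributes_to_metadata.

    Returns an empty dict when there is no "r" key, for example when the
    file was written by pandas and carries only a "pandas" key.
    """
    raw = metadata.get("r")
    if raw is None:
        return {}
    return json.loads(raw)
```

JSON only round-trips values it can represent (strings, numbers, lists, nested objects), which is one of the tradeoffs against a `dput()`-style dump noted in the list above.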
[jira] [Created] (ARROW-8885) [R] Don't include everything everywhere
Neal Richardson created ARROW-8885: -- Summary: [R] Don't include everything everywhere Key: ARROW-8885 URL: https://issues.apache.org/jira/browse/ARROW-8885 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 I noticed that we were jamming all of our arrow #includes in one header file in the R bindings and then including that everywhere. Seemed like that was wasteful and probably causing compilation to be slower. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8864) [R] Add methods to Table/RecordBatch for consistency with data.frame
Neal Richardson created ARROW-8864: -- Summary: [R] Add methods to Table/RecordBatch for consistency with data.frame Key: ARROW-8864 URL: https://issues.apache.org/jira/browse/ARROW-8864 Project: Apache Arrow Issue Type: New Feature Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 Some methods identified in the Feather package test suite -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8857) [CI] MinGW builds break on system upgrade
Neal Richardson created ARROW-8857: -- Summary: [CI] MinGW builds break on system upgrade Key: ARROW-8857 URL: https://issues.apache.org/jira/browse/ARROW-8857 Project: Apache Arrow Issue Type: Bug Components: Continuous Integration, R, Ruby Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 See e.g. https://github.com/apache/arrow/pull/7218/checks?check_run_id=687127263#step:7:69 Started failing sometime today. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8852) [R] Post-0.17.1 adjustments
Neal Richardson created ARROW-8852: -- Summary: [R] Post-0.17.1 adjustments Key: ARROW-8852 URL: https://issues.apache.org/jira/browse/ARROW-8852 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8826) [Crossbow] remote URL should always have .git
Neal Richardson created ARROW-8826: -- Summary: [Crossbow] remote URL should always have .git Key: ARROW-8826 URL: https://issues.apache.org/jira/browse/ARROW-8826 Project: Apache Arrow Issue Type: Improvement Components: Continuous Integration, Developer Tools Reporter: Neal Richardson Assignee: Neal Richardson In ARROW-7803, I edited the crossbow templates for the homebrew jobs to substitute in the correct fork of arrow and append the current git SHA so that the code under test corresponds to the requested git commit. Unfortunately, this caused the nightly builds to fail. Comparing a successful on-demand run (https://github.com/ursa-labs/crossbow/blob/actions-266-travis-homebrew-r-autobrew/.travis.yml) with a nightly run (https://github.com/ursa-labs/crossbow/blob/nightly-2020-05-16-0-travis-homebrew-cpp/.travis.yml), it appears that the default "remote" URL that crossbow uses when not on a fork/PR does not contain the ".git" suffix. And I suspect that Homebrew requires that in order to identify the source as a git repo in order to clone it correctly. -- This message was sent by Atlassian Jira (v8.3.4#803005)
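One possible shape for the fix, sketched in Python: normalize the remote URL before it is substituted into the templates. The function name is hypothetical and this is not crossbow's actual code; it only shows the suffix handling suggested by the comparison above.

```python
def ensure_git_suffix(remote_url: str) -> str:
    """Return remote_url with exactly one trailing ".git" (hypothetical helper)."""
    url = remote_url.rstrip("/")  # tolerate a trailing slash on the default remote
    if url.endswith(".git"):
        return url
    return url + ".git"
```

Because the function is idempotent, it can be applied to both the fork/PR remote and the default remote without checking which case is in play.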
[jira] [Created] (ARROW-8804) [R][CI] Followup to Rtools40 upgrade
Neal Richardson created ARROW-8804: -- Summary: [R][CI] Followup to Rtools40 upgrade Key: ARROW-8804 URL: https://issues.apache.org/jira/browse/ARROW-8804 Project: Apache Arrow Issue Type: Improvement Components: Continuous Integration, R Reporter: Neal Richardson Assignee: Neal Richardson -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8768) [R][CI] Fix nightly as-cran spurious failure
Neal Richardson created ARROW-8768: -- Summary: [R][CI] Fix nightly as-cran spurious failure Key: ARROW-8768 URL: https://issues.apache.org/jira/browse/ARROW-8768 Project: Apache Arrow Issue Type: Bug Components: Continuous Integration, R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 An extra check we added to ensure that the package doesn't write anything to the user's home directory started failing on one of the 5 as-cran checks. It appears that a new feature of texlive2020, which is apparently invoked on checking that the pdf manual can be built, adds some caching junk to the home dir. It is unlikely that this is a real failure, probably just an artifact of the test environment. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8758) [R] Updates for compatibility with dplyr 1.0
Neal Richardson created ARROW-8758: -- Summary: [R] Updates for compatibility with dplyr 1.0 Key: ARROW-8758 URL: https://issues.apache.org/jira/browse/ARROW-8758 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0, 0.17.1 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8718) [R] Add str() methods to objects
Neal Richardson created ARROW-8718: -- Summary: [R] Add str() methods to objects Key: ARROW-8718 URL: https://issues.apache.org/jira/browse/ARROW-8718 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 Apparently this will make the RStudio IDE show useful things in the environment panel. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8717) [CI][Packaging] Add build dependency on boost to homebrew
Neal Richardson created ARROW-8717: -- Summary: [CI][Packaging] Add build dependency on boost to homebrew Key: ARROW-8717 URL: https://issues.apache.org/jira/browse/ARROW-8717 Project: Apache Arrow Issue Type: Improvement Components: Continuous Integration, Packaging Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 cf. https://github.com/Homebrew/homebrew-core/pull/54287 and revise the Travis jobs to uninstall boost and thrift before checking the formula -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8699) [R] Fix automatic r_to_py conversion
Neal Richardson created ARROW-8699: -- Summary: [R] Fix automatic r_to_py conversion Key: ARROW-8699 URL: https://issues.apache.org/jira/browse/ARROW-8699 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 See https://github.com/rstudio/reticulate/issues/748 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8624) [Packaging] Linux system packages aren't building with ARROW_DATASET=ON
Neal Richardson created ARROW-8624: -- Summary: [Packaging] Linux system packages aren't building with ARROW_DATASET=ON Key: ARROW-8624 URL: https://issues.apache.org/jira/browse/ARROW-8624 Project: Apache Arrow Issue Type: Improvement Components: Packaging Affects Versions: 0.17.0 Reporter: Neal Richardson I've seen a few reports like https://github.com/apache/arrow/issues/7055, where the user reports that they've installed the arrow system packages, we can see that they exist, but {{pkg-config}} reports that it doesn't have them. I think this is because {{-larrow_dataset}} isn't found. As the output on that issue shows, while arrow core headers and libraries are there, arrow_dataset is not. Searching through the packaging scripts (such as https://github.com/apache/arrow/blob/master/dev/tasks/linux-packages/apache-arrow/yum/arrow.spec.in), while there is some metadata about a dataset package, I see that ARROW_DATASET=ON is not set anywhere, so I don't think we're building it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8607) [R][CI] Unbreak builds following R 4.0 release
Neal Richardson created ARROW-8607: -- Summary: [R][CI] Unbreak builds following R 4.0 release Key: ARROW-8607 URL: https://issues.apache.org/jira/browse/ARROW-8607 Project: Apache Arrow Issue Type: Improvement Components: Continuous Integration, R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 Just a tourniquet to get master passing again while I work on ARROW-8604. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8606) [CI] Don't trigger all builds on a change to any file in ci/
Neal Richardson created ARROW-8606: -- Summary: [CI] Don't trigger all builds on a change to any file in ci/ Key: ARROW-8606 URL: https://issues.apache.org/jira/browse/ARROW-8606 Project: Apache Arrow Issue Type: Improvement Components: Continuous Integration Reporter: Neal Richardson Assignee: Neal Richardson -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8575) [Developer] Add issue_comment workflow to rebase a PR
Neal Richardson created ARROW-8575: -- Summary: [Developer] Add issue_comment workflow to rebase a PR Key: ARROW-8575 URL: https://issues.apache.org/jira/browse/ARROW-8575 Project: Apache Arrow Issue Type: Improvement Components: Developer Tools Reporter: Neal Richardson Assignee: Neal Richardson -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8569) [CI] Upgrade xcode version for testing homebrew formulae
Neal Richardson created ARROW-8569: -- Summary: [CI] Upgrade xcode version for testing homebrew formulae Key: ARROW-8569 URL: https://issues.apache.org/jira/browse/ARROW-8569 Project: Apache Arrow Issue Type: Improvement Components: Continuous Integration, Packaging Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 So that fewer bottles need to be built from source. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8550) [CI] Don't run cron GHA jobs on forks
Neal Richardson created ARROW-8550: -- Summary: [CI] Don't run cron GHA jobs on forks Key: ARROW-8550 URL: https://issues.apache.org/jira/browse/ARROW-8550 Project: Apache Arrow Issue Type: Improvement Components: Continuous Integration Reporter: Neal Richardson Assignee: Neal Richardson It's wasteful, and I'm tired of seeing them clogging up my Actions tab and notifications. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8549) [R] Assorted post-0.17 release cleanups
Neal Richardson created ARROW-8549: -- Summary: [R] Assorted post-0.17 release cleanups Key: ARROW-8549 URL: https://issues.apache.org/jira/browse/ARROW-8549 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8548) [Website] 0.17 release post
Neal Richardson created ARROW-8548: -- Summary: [Website] 0.17 release post Key: ARROW-8548 URL: https://issues.apache.org/jira/browse/ARROW-8548 Project: Apache Arrow Issue Type: Improvement Components: Website Reporter: Neal Richardson Assignee: Neal Richardson -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8538) [Packaging] Remove boost from homebrew formula
Neal Richardson created ARROW-8538: -- Summary: [Packaging] Remove boost from homebrew formula Key: ARROW-8538 URL: https://issues.apache.org/jira/browse/ARROW-8538 Project: Apache Arrow Issue Type: Improvement Components: C++, Packaging Reporter: Neal Richardson Assignee: Neal Richardson -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8489) [Developer] Autotune more things
Neal Richardson created ARROW-8489: -- Summary: [Developer] Autotune more things Key: ARROW-8489 URL: https://issues.apache.org/jira/browse/ARROW-8489 Project: Apache Arrow Issue Type: Improvement Components: Developer Tools, Python Reporter: Neal Richardson ARROW-7801 added the "autotune" comment bot to fix linting errors and rebuild some generated files. cmake-format was left off because of Python problems (see description on https://github.com/apache/arrow/pull/6932). And there are probably other things we want to add (autopep8 for Python, and similar for other languages?) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8475) [CI][Crossbow] Rehabilitate (or delete) hiveserver2 nightly job
Neal Richardson created ARROW-8475: -- Summary: [CI][Crossbow] Rehabilitate (or delete) hiveserver2 nightly job Key: ARROW-8475 URL: https://issues.apache.org/jira/browse/ARROW-8475 Project: Apache Arrow Issue Type: Improvement Components: Continuous Integration Reporter: Neal Richardson Disabled in ARROW-8474 cc [~wesm] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8474) [CI][Crossbow] Skip some nightlies we don't need to run
Neal Richardson created ARROW-8474: -- Summary: [CI][Crossbow] Skip some nightlies we don't need to run Key: ARROW-8474 URL: https://issues.apache.org/jira/browse/ARROW-8474 Project: Apache Arrow Issue Type: Improvement Components: Continuous Integration Reporter: Neal Richardson Assignee: Neal Richardson -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8449) [R] Use CMAKE_UNITY_BUILD everywhere
Neal Richardson created ARROW-8449: -- Summary: [R] Use CMAKE_UNITY_BUILD everywhere Key: ARROW-8449 URL: https://issues.apache.org/jira/browse/ARROW-8449 Project: Apache Arrow Issue Type: Improvement Components: Packaging, R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 0.17.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8433) [R] Add feather alias for ipc format in dataset API
Neal Richardson created ARROW-8433: -- Summary: [R] Add feather alias for ipc format in dataset API Key: ARROW-8433 URL: https://issues.apache.org/jira/browse/ARROW-8433 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 0.17.0 cf. ARROW-8416 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8390) [R] Expose schema unification features
Neal Richardson created ARROW-8390: -- Summary: [R] Expose schema unification features Key: ARROW-8390 URL: https://issues.apache.org/jira/browse/ARROW-8390 Project: Apache Arrow Issue Type: New Feature Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 0.17.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8379) [R] Investigate/fix thread safety issues (esp. Windows)
Neal Richardson created ARROW-8379: -- Summary: [R] Investigate/fix thread safety issues (esp. Windows) Key: ARROW-8379 URL: https://issues.apache.org/jira/browse/ARROW-8379 Project: Apache Arrow Issue Type: New Feature Components: R Reporter: Neal Richardson There have been a number of issues where the R bindings' multithreading has been implicated in unstable behavior (ARROW-7844 for example). In ARROW-8375 I disabled {{use_threads}} in the Windows tests, and it appeared that the mysterious Windows segfaults stopped. We should fix whatever the underlying issues are. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8377) [CI][C++][R] Build and run C++ tests on Rtools build
Neal Richardson created ARROW-8377: -- Summary: [CI][C++][R] Build and run C++ tests on Rtools build Key: ARROW-8377 URL: https://issues.apache.org/jira/browse/ARROW-8377 Project: Apache Arrow Issue Type: New Feature Components: C++, Continuous Integration, R Reporter: Neal Richardson Maybe this will better identify our unexplained segfaults -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8376) [R] Add experimental interface to ScanTask/RecordBatch iterators
Neal Richardson created ARROW-8376: -- Summary: [R] Add experimental interface to ScanTask/RecordBatch iterators Key: ARROW-8376 URL: https://issues.apache.org/jira/browse/ARROW-8376 Project: Apache Arrow Issue Type: New Feature Components: R Reporter: Neal Richardson Assignee: Neal Richardson -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8375) [CI][R] Make Windows tests more verbose in case of segfault
Neal Richardson created ARROW-8375: -- Summary: [CI][R] Make Windows tests more verbose in case of segfault Key: ARROW-8375 URL: https://issues.apache.org/jira/browse/ARROW-8375 Project: Apache Arrow Issue Type: Improvement Components: Continuous Integration, R Reporter: Neal Richardson Assignee: Neal Richardson -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8369) [CI] Fix crossbow R group
Neal Richardson created ARROW-8369: -- Summary: [CI] Fix crossbow R group Key: ARROW-8369 URL: https://issues.apache.org/jira/browse/ARROW-8369 Project: Apache Arrow Issue Type: Bug Components: Continuous Integration Reporter: Neal Richardson Assignee: Neal Richardson This was broken in ARROW-8356 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8353) [C++] is_nullable maybe not initialized in parquet writer
Neal Richardson created ARROW-8353: -- Summary: [C++] is_nullable maybe not initialized in parquet writer Key: ARROW-8353 URL: https://issues.apache.org/jira/browse/ARROW-8353 Project: Apache Arrow Issue Type: Bug Components: C++ Reporter: Neal Richardson

From the Rtools build:

{code}
[ 84%] Building CXX object src/parquet/CMakeFiles/parquet_static.dir/column_reader.cc.obj
In file included from D:/a/arrow/arrow/cpp/src/arrow/io/concurrency.h:23:0,
                 from D:/a/arrow/arrow/cpp/src/arrow/io/memory.h:25,
                 from D:/a/arrow/arrow/cpp/src/parquet/platform.h:25,
                 from D:/a/arrow/arrow/cpp/src/parquet/arrow/writer.h:23,
                 from D:/a/arrow/arrow/cpp/src/parquet/arrow/writer.cc:18:
D:/a/arrow/arrow/cpp/src/arrow/result.h: In member function 'virtual arrow::Status parquet::arrow::FileWriterImpl::WriteColumnChunk(const std::shared_ptr&, int64_t, int64_t)':
D:/a/arrow/arrow/cpp/src/arrow/result.h:428:28: warning: 'is_nullable' may be used uninitialized in this function [-Wmaybe-uninitialized]
   auto result_name = (rexpr); \
                            ^
D:/a/arrow/arrow/cpp/src/parquet/arrow/writer.cc:430:10: note: 'is_nullable' was declared here
     bool is_nullable;
          ^
{code}

I'd give it a default value, but IDK that it's that simple.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8352) [R] Add install_pyarrow()
Neal Richardson created ARROW-8352: -- Summary: [R] Add install_pyarrow() Key: ARROW-8352 URL: https://issues.apache.org/jira/browse/ARROW-8352 Project: Apache Arrow Issue Type: New Feature Reporter: Neal Richardson Assignee: Neal Richardson To facilitate installing for use with reticulate, including handling how to use the nightly packages. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8351) [R][CI] Store the Rtools-built Arrow C++ library as a build artifact
Neal Richardson created ARROW-8351: -- Summary: [R][CI] Store the Rtools-built Arrow C++ library as a build artifact Key: ARROW-8351 URL: https://issues.apache.org/jira/browse/ARROW-8351 Project: Apache Arrow Issue Type: New Feature Reporter: Neal Richardson Assignee: Neal Richardson To help with debugging unexplained segfaults. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8346) [CI][Ruby] GLib/Ruby macOS build fails on zlib
Neal Richardson created ARROW-8346: -- Summary: [CI][Ruby] GLib/Ruby macOS build fails on zlib Key: ARROW-8346 URL: https://issues.apache.org/jira/browse/ARROW-8346 Project: Apache Arrow Issue Type: Bug Components: Continuous Integration, GLib Reporter: Neal Richardson Fix For: 0.17.0

See https://github.com/apache/arrow/runs/564610412 for example.

{code}
Using 'PKG_CONFIG_PATH' from environment with value: '/usr/local/lib/pkgconfig'
Run-time dependency gobject-2.0 found: YES 2.64.1
Run-time dependency gio-2.0 found: NO (tried framework and cmake)
c_glib/arrow-glib/meson.build:210:0: ERROR: Could not generate cargs for gio-2.0:
Package zlib was not found in the pkg-config search path.
Perhaps you should add the directory containing `zlib.pc'
to the PKG_CONFIG_PATH environment variable
Package 'zlib', required by 'gio-2.0', not found
A full log can be found at /Users/runner/runners/2.168.0/work/arrow/arrow/build/c_glib/meson-logs/meson-log.txt
{code}

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8337) [Release] Verify release candidate wheels without using conda
Neal Richardson created ARROW-8337: -- Summary: [Release] Verify release candidate wheels without using conda Key: ARROW-8337 URL: https://issues.apache.org/jira/browse/ARROW-8337 Project: Apache Arrow Issue Type: Improvement Components: Developer Tools Reporter: Neal Richardson See final comments on ARROW-2880 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-8335) [Release] Add crossbow jobs to run release verification
Neal Richardson created ARROW-8335: -- Summary: [Release] Add crossbow jobs to run release verification Key: ARROW-8335 URL: https://issues.apache.org/jira/browse/ARROW-8335 Project: Apache Arrow Issue Type: Improvement Components: Developer Tools Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 0.17.0
Workflow: edit the version number and rc number in the template in {{dev/release/github.verify.yml}}, make a PR, and do
* {{@github-actions crossbow submit -g verify-rc}} to run everything
* {{@github-actions crossbow submit -g verify-rc-wheel|source|binary}} to run those groups
* Other groups at {{verify-rc-wheel|source-macos|ubuntu|windows}}, {{verify-rc-source-cpp|csharp|java|etc.}}
* Individual workflows at e.g. {{verify-rc-wheel-windows}}, {{verify-rc-source-macos-csharp}}. We could break out the wheel verification by Python version (maybe we should), but that requires changes to the verification scripts themselves.
Running the main {{verify-rc}} group will put a ton of workflow SVG badges on the PR so we can see at a glance what is passing and failing. If things fail when running everything, we can push fixes for the verification script to the branch and retry just the jobs that failed.
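The {{a|b}} shorthand in the group names above reads as a set of alternatives at one position of a dash-separated name. A minimal Python sketch of that reading follows. {{expand_groups}} is a hypothetical helper written for illustration, not part of crossbow, and whether every expanded name actually exists as a crossbow group is an assumption.

```python
from itertools import product


def expand_groups(pattern):
    """Expand 'verify-rc-wheel|source' into one name per alternative."""
    # Split the name on dashes, then split each piece into its alternatives.
    segments = [segment.split("|") for segment in pattern.split("-")]
    # Recombine every combination of alternatives into a full group name.
    return ["-".join(parts) for parts in product(*segments)]


for name in expand_groups("verify-rc-wheel|source-macos|ubuntu|windows"):
    print(name)
```

Each printed name could then be passed to {{crossbow submit -g}} to retry a single group.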
[jira] [Created] (ARROW-8325) [R][CI] Stop including boost in R windows bundle
Neal Richardson created ARROW-8325: -- Summary: [R][CI] Stop including boost in R windows bundle Key: ARROW-8325 URL: https://issues.apache.org/jira/browse/ARROW-8325 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Neal Richardson Assignee: Neal Richardson
[jira] [Created] (ARROW-8324) [R] Add read/write_ipc_file separate from _feather
Neal Richardson created ARROW-8324: -- Summary: [R] Add read/write_ipc_file separate from _feather Key: ARROW-8324 URL: https://issues.apache.org/jira/browse/ARROW-8324 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Neal Richardson See [https://github.com/apache/arrow/pull/6771#issuecomment-608133760] {quote}Let's add read/write_ipc_file also? I'm wary of the "version" option in "write_feather" and the Feather version inference capability in "read_feather". It's potentially confusing and we may choose to add options to write_ipc_file/read_ipc_file that are more developer centric, having to do with particulars in the IPC format, that are not relevant or appropriate for the Feather APIs. IMHO it's best for "Feather format" to remain an abstracted higher-level concept with its use of the "IPC file format" as an implementation detail, and segregated from the other things. {quote}
[jira] [Created] (ARROW-8309) [CI] C++/Java/Rust workflows should trigger on changes to Flight.proto
Neal Richardson created ARROW-8309: -- Summary: [CI] C++/Java/Rust workflows should trigger on changes to Flight.proto Key: ARROW-8309 URL: https://issues.apache.org/jira/browse/ARROW-8309 Project: Apache Arrow Issue Type: Improvement Components: Continuous Integration Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 0.17.0 The Flight DoExchange format change caused Rust build failures (ARROW-8308). We would have caught these in the format change patch, but the Rust builds weren't triggered on changes to {{format/Flight.proto}}.
[jira] [Created] (ARROW-8301) [C++][Python][R] Handle ChunkedArray and Table in C data interface
Neal Richardson created ARROW-8301: -- Summary: [C++][Python][R] Handle ChunkedArray and Table in C data interface Key: ARROW-8301 URL: https://issues.apache.org/jira/browse/ARROW-8301 Project: Apache Arrow Issue Type: Improvement Components: C, C++, Python, R Reporter: Neal Richardson Assignee: Antoine Pitrou Currently the C data interface does Array and RecordBatch, but we're also going to need ChunkedArray and Table.
[jira] [Created] (ARROW-8300) [R] Documentation and changelog updates for 0.17
Neal Richardson created ARROW-8300: -- Summary: [R] Documentation and changelog updates for 0.17 Key: ARROW-8300 URL: https://issues.apache.org/jira/browse/ARROW-8300 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 0.17.0
[jira] [Created] (ARROW-8266) [C++] Add backup mirrors for external project source downloads
Neal Richardson created ARROW-8266: -- Summary: [C++] Add backup mirrors for external project source downloads Key: ARROW-8266 URL: https://issues.apache.org/jira/browse/ARROW-8266 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 0.17.0 As we've seen a number of times, most recently with boost, our builds sometimes fail because of a failure to download bundled dependencies. To reduce this risk, we can add alternate URLs to the cmake externalprojects, so that it will attempt to download from the second location if the first fails (https://cmake.org/cmake/help/latest/module/ExternalProject.html). This feature is available in cmake >=3.7.
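The fallback behaviour described above (try the first location, then the second) can be sketched in Python. This only illustrates the try-in-order logic and is not the CMake change itself; {{download_with_fallback}}, {{fetch_url}}, the injected {{fetch}} callable, and the example hosts in the usage note are all hypothetical.

```python
import urllib.request


def fetch_url(url):
    """Default fetcher: download the whole response body over HTTP(S)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def download_with_fallback(urls, fetch=fetch_url):
    """Return the content from the first location that can be fetched."""
    failures = []
    for url in urls:
        try:
            return fetch(url)
        except OSError as exc:  # urllib's URLError is an OSError subclass
            # Remember why this location failed, then try the next one.
            failures.append(f"{url}: {exc}")
    raise OSError("all download locations failed: " + "; ".join(failures))
```

A caller would pass the primary location first and the backup second, for example {{download_with_fallback(["https://primary.example/dep.tar.gz", "https://mirror.example/dep.tar.gz"])}}, and gets a single error listing every failed location only when all of them fail.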
[jira] [Created] (ARROW-8222) [C++] Use bcp to make a slim boost for bundled build
Neal Richardson created ARROW-8222: -- Summary: [C++] Use bcp to make a slim boost for bundled build Key: ARROW-8222 URL: https://issues.apache.org/jira/browse/ARROW-8222 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Neal Richardson
We don't use much of Boost (just system, filesystem, and regex), but when we do a bundled build, we still download and extract all of Boost. The tarball itself is 113 MB, and expanded it is over 700 MB. This can be slow, and it requires a lot of free disk space that we don't really need.
[bcp|https://www.boost.org/doc/libs/1_72_0/tools/bcp/doc/html/index.html] is a Boost tool that lets you extract a subset of Boost, resolving any of its necessary dependencies across Boost. The savings for us could be huge:
{code}
mkdir test
./bcp system.hpp filesystem.hpp regex.hpp test
tar -czf test.tar.gz test/
{code}
The resulting tarball is 885K (kilobytes!). {{bcp}} also lets you re-namespace, so this would (IIUC) solve ARROW-4286 as well.
We would need a place to host this tarball, and we would have to update it whenever we (1) bump the Boost version or (2) add a new Boost library dependency. This patch would of course include a script that would generate the tarball. Given the small size, we could also consider just vendoring it.
[jira] [Created] (ARROW-8206) [R] Minor fix for backwards compatibility on Linux installation
Neal Richardson created ARROW-8206: -- Summary: [R] Minor fix for backwards compatibility on Linux installation Key: ARROW-8206 URL: https://issues.apache.org/jira/browse/ARROW-8206 Project: Apache Arrow Issue Type: Bug Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 0.17.0 In 0.16, the recommendation was to set {{LIBARROW_DOWNLOAD=true}} to install with dependencies, and this would include getting a binary. But the recent refactor to Linux installation didn't carry this setting forward correctly.
[jira] [Created] (ARROW-8188) [R] Adapt to latest checks in R-devel
Neal Richardson created ARROW-8188: -- Summary: [R] Adapt to latest checks in R-devel Key: ARROW-8188 URL: https://issues.apache.org/jira/browse/ARROW-8188 Project: Apache Arrow Issue Type: Bug Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 0.17.0
See https://github.com/ursa-labs/crossbow/runs/526813242 for example.
1. checkbashisms is now complaining about a few things
2. Latest R-devel actually runs the donttest examples with --as-cran, and one fails.
[jira] [Created] (ARROW-8187) [R] Make test assertions robust to i18n
Neal Richardson created ARROW-8187: -- Summary: [R] Make test assertions robust to i18n Key: ARROW-8187 URL: https://issues.apache.org/jira/browse/ARROW-8187 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Antoine Pitrou Assignee: Neal Richardson Fix For: 0.17.0
{code}
── 1. Failure: codec_is_available (@test-compressed.R#22) ─
`codec_is_available("sdfasdf")` threw an error with unexpected message.
Expected match: "'arg' should be one of"
Actual message: "'arg' doit être un de “UNCOMPRESSED”, “SNAPPY”, “GZIP”, “BROTLI”, “ZSTD”, “LZ4”, “LZO”, “BZ2”"
Backtrace:
  1. testthat::expect_error(codec_is_available("sdfasdf"), "'arg' should be one of") testthat/test-compressed.R:22:2
  6. arrow::codec_is_available("sdfasdf")
  8. arrow:::compression_from_name(type)
  9. purrr::map_int(...)
 10. arrow:::.f(.x[[i]], ...)
 11. base::match.arg(toupper(.x), names(CompressionType))

── 2. Failure: time type unit validation (@test-data-type.R#298) ──
`time32("years")` threw an error with unexpected message.
Expected match: "'arg' should be one of"
Actual message: "'arg' doit être un de “ms”, “s”"
Backtrace:
  1. testthat::expect_error(time32("years"), "'arg' should be one of") testthat/test-data-type.R:298:2
  6. arrow::time32("years")
  7. base::match.arg(unit)

── 3. Failure: time type unit validation (@test-data-type.R#305) ──
`time64("years")` threw an error with unexpected message.
Expected match: "'arg' should be one of"
Actual message: "'arg' doit être un de “ns”, “us”"
Backtrace:
  1. testthat::expect_error(time64("years"), "'arg' should be one of") testthat/test-data-type.R:305:2
  6. arrow::time64("years")
  7. base::match.arg(unit)

── 4. Failure: decimal type and validation (@test-data-type.R#387)
`decimal()` threw an error with unexpected message.
Expected match: "argument \"precision\" is missing, with no default"
Actual message: "l'argument \"precision\" est manquant, avec aucune valeur par défaut"
Backtrace:
  1. testthat::expect_error(decimal(), "argument \"precision\" is missing, with no default") testthat/test-data-type.R:387:2
  6. arrow::decimal()

── 5. Failure: decimal type and validation (@test-data-type.R#389)
`decimal(4)` threw an error with unexpected message.
Expected match: "argument \"scale\" is missing, with no default"
Actual message: "l'argument \"scale\" est manquant, avec aucune valeur par défaut"
Backtrace:
  1. testthat::expect_error(decimal(4), "argument \"scale\" is missing, with no default") testthat/test-data-type.R:389:2
  6. arrow::decimal(4)
{code}
[jira] [Created] (ARROW-8139) [C++] FileSystem enum causes attributes warning
Neal Richardson created ARROW-8139: -- Summary: [C++] FileSystem enum causes attributes warning Key: ARROW-8139 URL: https://issues.apache.org/jira/browse/ARROW-8139 Project: Apache Arrow Issue Type: Bug Components: C++ Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 0.17.0
See e.g. https://github.com/apache/arrow/runs/512427577?check_suite_focus=true#step:7:996
{code}
In file included from /arrow/r/check/arrow.Rcheck/00_pkg_src/arrow/libarrow/arrow-0.16.0.9000/include/arrow/dataset/discovery.h:31:0,
                 from /arrow/r/check/arrow.Rcheck/00_pkg_src/arrow/libarrow/arrow-0.16.0.9000/include/arrow/dataset/api.h:21,
                 from ./arrow_types.h:203,
                 from array_to_vector.cpp:18:
/arrow/r/check/arrow.Rcheck/00_pkg_src/arrow/libarrow/arrow-0.16.0.9000/include/arrow/filesystem/filesystem.h:65:1: warning: type attributes ignored after type is already defined [-Wattributes]
{code}
This isn't new but I've been staring at the R Linux builds a lot and wanted to clean this up.
[jira] [Created] (ARROW-8103) [R] Make default Linux build more minimal
Neal Richardson created ARROW-8103: -- Summary: [R] Make default Linux build more minimal Key: ARROW-8103 URL: https://issues.apache.org/jira/browse/ARROW-8103 Project: Apache Arrow Issue Type: New Feature Components: Packaging, R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 0.17.0 So that we can build on CRAN as quickly as possible, and thus make the default experience for users installing the package better--no environment variable required to get something functional.
[jira] [Created] (ARROW-8095) [CI][Crossbow] Nightly turbodbc job fails
Neal Richardson created ARROW-8095: -- Summary: [CI][Crossbow] Nightly turbodbc job fails Key: ARROW-8095 URL: https://issues.apache.org/jira/browse/ARROW-8095 Project: Apache Arrow Issue Type: Bug Components: C++, Continuous Integration Reporter: Neal Richardson Fix For: 0.17.0 Turbodbc fails to compile (both "master" and "latest" versions with this error): {code} FAILED: cpp/turbodbc_arrow/Library/CMakeFiles/turbodbc_arrow_support.dir/src/arrow_result_set.cpp.o /opt/conda/envs/arrow/bin/x86_64-conda_cos6-linux-gnu-c++ -Dturbodbc_arrow_support_EXPORTS -I/turbodbc/cpp/turbodbc_arrow/Library -I/turbodbc/cpp/turbodbc_arrow/../cpp_odbc/Library -I/turbodbc/cpp/turbodbc_arrow/../turbodbc/Library -I/turbodbc/pybind11/include -isystem /opt/conda/envs/arrow/include -isystem /opt/conda/envs/arrow/include/python3.7m -isystem /opt/conda/envs/arrow/lib/python3.7/site-packages/numpy/core/include -fvisibility-inlines-hidden -Wall -Wextra -g -O0 -pedantic -fPIC -fvisibility=hidden -std=c++11 -std=c++14 -MD -MT cpp/turbodbc_arrow/Library/CMakeFiles/turbodbc_arrow_support.dir/src/arrow_result_set.cpp.o -MF cpp/turbodbc_arrow/Library/CMakeFiles/turbodbc_arrow_support.dir/src/arrow_result_set.cpp.o.d -o cpp/turbodbc_arrow/Library/CMakeFiles/turbodbc_arrow_support.dir/src/arrow_result_set.cpp.o -c /turbodbc/cpp/turbodbc_arrow/Library/src/arrow_result_set.cpp /turbodbc/cpp/turbodbc_arrow/Library/src/arrow_result_set.cpp: In member function 'arrow::Status turbodbc_arrow::{anonymous}::StringDictionaryBuilderProxy::AppendProxy(const char*, int32_t)': /turbodbc/cpp/turbodbc_arrow/Library/src/arrow_result_set.cpp:67:36: error: no matching function for call to 'turbodbc_arrow::{anonymous}::StringDictionaryBuilderProxy::Append(const char*&, int32_t&)' return Append(value, length); ^ In file included from /opt/conda/envs/arrow/include/arrow/builder.h:26:0, from /opt/conda/envs/arrow/include/arrow/api.h:26, from /turbodbc/cpp/turbodbc_arrow/Library/src/arrow_result_set.cpp:6: 
/opt/conda/envs/arrow/include/arrow/array/builder_dict.h:143:10: note: candidate: arrow::Status arrow::internal::DictionaryBuilderBase::Append(const Scalar&) [with BuilderType = arrow::AdaptiveIntBuilder; T = arrow::StringType; arrow::internal::DictionaryBuilderBase::Scalar = nonstd::sv_lite::basic_string_view] Status Append(const Scalar& value) { ^~ /opt/conda/envs/arrow/include/arrow/array/builder_dict.h:143:10: note: candidate expects 1 argument, 2 provided /opt/conda/envs/arrow/include/arrow/array/builder_dict.h:156:43: note: candidate: template arrow::enable_if_fixed_size_binary arrow::internal::DictionaryBuilderBase::Append(const uint8_t*) [with T1 = T1; BuilderType = arrow::AdaptiveIntBuilder; T = arrow::StringType] enable_if_fixed_size_binary Append(const uint8_t* value) { ^~ /opt/conda/envs/arrow/include/arrow/array/builder_dict.h:156:43: note: template argument deduction/substitution failed: /turbodbc/cpp/turbodbc_arrow/Library/src/arrow_result_set.cpp:67:36: note: candidate expects 1 argument, 2 provided return Append(value, length); ^ In file included from /opt/conda/envs/arrow/include/arrow/builder.h:26:0, from /opt/conda/envs/arrow/include/arrow/api.h:26, from /turbodbc/cpp/turbodbc_arrow/Library/src/arrow_result_set.cpp:6: /opt/conda/envs/arrow/include/arrow/array/builder_dict.h:162:43: note: candidate: template arrow::enable_if_fixed_size_binary arrow::internal::DictionaryBuilderBase::Append(const char*) [with T1 = T1; BuilderType = arrow::AdaptiveIntBuilder; T = arrow::StringType] enable_if_fixed_size_binary Append(const char* value) { ^~ /opt/conda/envs/arrow/include/arrow/array/builder_dict.h:162:43: note: template argument deduction/substitution failed: /turbodbc/cpp/turbodbc_arrow/Library/src/arrow_result_set.cpp:67:36: note: candidate expects 1 argument, 2 provided return Append(value, length); ^ In file included from /opt/conda/envs/arrow/include/arrow/builder.h:26:0, from /opt/conda/envs/arrow/include/arrow/api.h:26, from 
/turbodbc/cpp/turbodbc_arrow/Library/src/arrow_result_set.cpp:6: /opt/conda/envs/arrow/include/arrow/array/builder_dict.h:168:37: note: candidate: template arrow::enable_if_binary_like arrow::internal::DictionaryBuilderBase::Append(const uint8_t*, int32_t) [with T1 = T1; BuilderType = arrow::AdaptiveIntBuilder; T = arrow::StringType] enable_if_binary_like Append(const uint8_t* value, int32_t length) { ^~
[jira] [Created] (ARROW-8094) [CI][Crossbow] Nightly valgrind test fails
Neal Richardson created ARROW-8094: -- Summary: [CI][Crossbow] Nightly valgrind test fails Key: ARROW-8094 URL: https://issues.apache.org/jira/browse/ARROW-8094 Project: Apache Arrow Issue Type: Bug Components: C++, Continuous Integration Reporter: Neal Richardson Fix For: 0.17.0 See https://circleci.com/gh/ursa-labs/crossbow/9162
[jira] [Created] (ARROW-8093) [CI][Crossbow] Pandas integration test fails
Neal Richardson created ARROW-8093: -- Summary: [CI][Crossbow] Pandas integration test fails Key: ARROW-8093 URL: https://issues.apache.org/jira/browse/ARROW-8093 Project: Apache Arrow Issue Type: Bug Components: Continuous Integration, Python Reporter: Neal Richardson Assignee: Joris Van den Bossche Fix For: 0.17.0
{code}
=== FAILURES ===
___ test_conversion_extensiontype_to_extensionarray

monkeypatch = <_pytest.monkeypatch.MonkeyPatch object at 0x7f029f03f2a0>

    def test_conversion_extensiontype_to_extensionarray(monkeypatch):
        # converting extension type to linked pandas ExtensionDtype/Array
        import pandas.core.internals as _int

        storage = pa.array([1, 2, 3, 4], pa.int64())
        arr = pa.ExtensionArray.from_storage(MyCustomIntegerType(), storage)
        table = pa.table({'a': arr})

        if LooseVersion(pd.__version__) < "0.26.0.dev":
            # ensure pandas Int64Dtype has the protocol method (for older pandas)
            monkeypatch.setattr(
                pd.Int64Dtype, '__from_arrow__', _Int64Dtype__from_arrow__,
                raising=False)

        # extension type points to Int64Dtype, which knows how to create a
        # pandas ExtensionArray
>       result = table.to_pandas()

opt/conda/envs/arrow/lib/python3.7/site-packages/pyarrow/tests/test_pandas.py:3633:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pyarrow/array.pxi:566: in pyarrow.lib._PandasConvertible.to_pandas
    ???
pyarrow/table.pxi:1425: in pyarrow.lib.Table._to_pandas
    ???
opt/conda/envs/arrow/lib/python3.7/site-packages/pyarrow/pandas_compat.py:764: in table_to_blockmanager
    blocks = _table_to_blocks(options, table, categories, ext_columns_dtypes)
opt/conda/envs/arrow/lib/python3.7/site-packages/pyarrow/pandas_compat.py:1102: in _table_to_blocks
    for item in result]
opt/conda/envs/arrow/lib/python3.7/site-packages/pyarrow/pandas_compat.py:1102: in <listcomp>
    for item in result]
opt/conda/envs/arrow/lib/python3.7/site-packages/pyarrow/pandas_compat.py:723: in _reconstruct_block
    pd_ext_arr = pandas_dtype.__from_arrow__(arr)
opt/conda/envs/arrow/lib/python3.7/site-packages/pandas/core/arrays/integer.py:108: in __from_arrow__
    array = array.cast(pyarrow_type)
pyarrow/table.pxi:240: in pyarrow.lib.ChunkedArray.cast
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   ???
E   pyarrow.lib.ArrowNotImplementedError: No cast implemented from extension to int64
{code}
https://circleci.com/gh/ursa-labs/crossbow/9156
[jira] [Created] (ARROW-8092) [CI][Crossbow] OSX wheels fail on bundled bzip2
Neal Richardson created ARROW-8092: -- Summary: [CI][Crossbow] OSX wheels fail on bundled bzip2 Key: ARROW-8092 URL: https://issues.apache.org/jira/browse/ARROW-8092 Project: Apache Arrow Issue Type: Bug Components: Continuous Integration, Packaging, Python Reporter: Neal Richardson Fix For: 0.17.0 See e.g.
[jira] [Created] (ARROW-8091) [CI][Crossbow] Fix nightly homebrew and R failures
Neal Richardson created ARROW-8091: -- Summary: [CI][Crossbow] Fix nightly homebrew and R failures Key: ARROW-8091 URL: https://issues.apache.org/jira/browse/ARROW-8091 Project: Apache Arrow Issue Type: Bug Components: Continuous Integration Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 0.17.0 R: [https://dev.azure.com/ursa-labs/crossbow/_build/results?buildId=8156=logs=0da5d1d9-276d-5173-c4c4-9d4d4ed14fdb=d9b15392-e4ce-5e4c-0c8c-b69645229181=127] Homebrew: [https://travis-ci.org/github/ursa-labs/crossbow/builds/661245549#L3392]
[jira] [Created] (ARROW-8025) [C++] Implement cast to Binary and FixedSizeBinary
Neal Richardson created ARROW-8025: -- Summary: [C++] Implement cast to Binary and FixedSizeBinary Key: ARROW-8025 URL: https://issues.apache.org/jira/browse/ARROW-8025 Project: Apache Arrow Issue Type: Improvement Components: C++, C++ - Compute Reporter: Neal Richardson It appears you can cast from Binary to String but not the other way.
[jira] [Created] (ARROW-8024) [R] Bindings for BinaryType and FixedBinaryType
Neal Richardson created ARROW-8024: -- Summary: [R] Bindings for BinaryType and FixedBinaryType Key: ARROW-8024 URL: https://issues.apache.org/jira/browse/ARROW-8024 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 Prerequisite for ARROW-6235 (converting BinaryArray data to R).
[jira] [Created] (ARROW-8002) [C++][Dataset] Dataset writing should let you (re)partition the data
Neal Richardson created ARROW-8002: -- Summary: [C++][Dataset] Dataset writing should let you (re)partition the data Key: ARROW-8002 URL: https://issues.apache.org/jira/browse/ARROW-8002 Project: Apache Arrow Issue Type: Improvement Components: C++ - Dataset, Python, R Reporter: Neal Richardson Assignee: Ben Kietzman Fix For: 1.0.0
[jira] [Created] (ARROW-8001) [C++][Dataset] R and Python bindings for dataset writing
Neal Richardson created ARROW-8001: -- Summary: [C++][Dataset] R and Python bindings for dataset writing Key: ARROW-8001 URL: https://issues.apache.org/jira/browse/ARROW-8001 Project: Apache Arrow Issue Type: Improvement Components: C++ - Dataset, Python, R Reporter: Neal Richardson Assignee: Ben Kietzman Fix For: 1.0.0
[jira] [Created] (ARROW-7988) [R] Fix on.exit calls in reticulate bindings
Neal Richardson created ARROW-7988: -- Summary: [R] Fix on.exit calls in reticulate bindings Key: ARROW-7988 URL: https://issues.apache.org/jira/browse/ARROW-7988 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0
[jira] [Created] (ARROW-7987) [CI][R] Fix for verbose nightly builds
Neal Richardson created ARROW-7987: -- Summary: [CI][R] Fix for verbose nightly builds Key: ARROW-7987 URL: https://issues.apache.org/jira/browse/ARROW-7987 Project: Apache Arrow Issue Type: Improvement Components: Continuous Integration, R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 Followup to ARROW-7983
[jira] [Created] (ARROW-7984) [R] Check for valid inputs in more places
Neal Richardson created ARROW-7984: -- Summary: [R] Check for valid inputs in more places Key: ARROW-7984 URL: https://issues.apache.org/jira/browse/ARROW-7984 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 In trying to reproduce bug reports, I typically hit code paths I don't usually use, and I often give some input that I expect should work and that instead causes a segfault. That's no good.
[jira] [Created] (ARROW-7983) [CI][R] Nightly builds should be more verbose when they fail
Neal Richardson created ARROW-7983: -- Summary: [CI][R] Nightly builds should be more verbose when they fail Key: ARROW-7983 URL: https://issues.apache.org/jira/browse/ARROW-7983 Project: Apache Arrow Issue Type: Improvement Components: Continuous Integration, R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0
[jira] [Created] (ARROW-7967) [CI][Crossbow] Move autobrew job back to old macOS
Neal Richardson created ARROW-7967: -- Summary: [CI][Crossbow] Move autobrew job back to old macOS Key: ARROW-7967 URL: https://issues.apache.org/jira/browse/ARROW-7967 Project: Apache Arrow Issue Type: Bug Components: Continuous Integration, R Reporter: Neal Richardson Assignee: Neal Richardson Followup to ARROW-7923. After hopefully fixing the underlying issue somewhere in Travis, revert the changes in that issue so that we're still testing on old macOS.
[jira] [Created] (ARROW-7962) [R][Dataset] Followup to "Consolidate Source and Dataset classes"
Neal Richardson created ARROW-7962: -- Summary: [R][Dataset] Followup to "Consolidate Source and Dataset classes" Key: ARROW-7962 URL: https://issues.apache.org/jira/browse/ARROW-7962 Project: Apache Arrow Issue Type: Bug Components: C++ - Dataset, R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 This was pushed to ARROW-7886 but it got dropped in a force push.
[jira] [Created] (ARROW-7923) [CI][Crossbow] macOS autobrew fails on homebrew-versions
Neal Richardson created ARROW-7923: -- Summary: [CI][Crossbow] macOS autobrew fails on homebrew-versions Key: ARROW-7923 URL: https://issues.apache.org/jira/browse/ARROW-7923 Project: Apache Arrow Issue Type: Bug Components: Continuous Integration, Packaging, R Reporter: Neal Richardson Fix For: 1.0.0 See e.g. https://travis-ci.org/ursa-labs/crossbow/builds/653768049#L97. According to https://github.com/Homebrew/brew/issues/5734, there needs to be {{brew untap homebrew-versions}} before {{brew update}}, except this is happening in the Travis workflow in the setup stage, so we can't. Will need to change the travis-build config or base image upstream, or look for a different workaround.
[jira] [Created] (ARROW-7922) [CI][Crossbow] Nightly conda osx builds fail (brew bundle)
Neal Richardson created ARROW-7922: -- Summary: [CI][Crossbow] Nightly conda osx builds fail (brew bundle) Key: ARROW-7922 URL: https://issues.apache.org/jira/browse/ARROW-7922 Project: Apache Arrow Issue Type: Bug Components: Continuous Integration, Packaging Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 See e.g. https://travis-ci.org/ursa-labs/crossbow/builds/653768373#L129. Apparently a new Homebrew release changed some dependency of {{brew bundle}} so we need to be sure to {{brew update}} first: https://travis-ci.community/t/macos-build-fails-because-of-homebrew-bundle-unknown-command/7296/6.
[jira] [Created] (ARROW-7920) [R] Fill in some missing input validation
Neal Richardson created ARROW-7920: -- Summary: [R] Fill in some missing input validation Key: ARROW-7920 URL: https://issues.apache.org/jira/browse/ARROW-7920 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 I hit some segfaults trying to reproduce an issue because of missing input validation.
[jira] [Created] (ARROW-7919) [R] install_arrow() should conda install if appropriate
Neal Richardson created ARROW-7919: -- Summary: [R] install_arrow() should conda install if appropriate Key: ARROW-7919 URL: https://issues.apache.org/jira/browse/ARROW-7919 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 Like, check {{if (grepl("conda", R.Version()$platform))}} and if so then {{system("conda install ...")}}. Error if nightly == TRUE because we don't host conda nightlies yet. This would help with issues like https://github.com/apache/arrow/issues/6448
[jira] [Created] (ARROW-7918) [R] Improve instructions for conda users in installation vignette
Neal Richardson created ARROW-7918: -- Summary: [R] Improve instructions for conda users in installation vignette Key: ARROW-7918 URL: https://issues.apache.org/jira/browse/ARROW-7918 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Neal Richardson Fix For: 1.0.0
[jira] [Created] (ARROW-7913) [C++][Python][R] C++ implementation of C data protocol
Neal Richardson created ARROW-7913: -- Summary: [C++][Python][R] C++ implementation of C data protocol Key: ARROW-7913 URL: https://issues.apache.org/jira/browse/ARROW-7913 Project: Apache Arrow Issue Type: Improvement Components: C++, Python, R Affects Versions: 1.0.0 Reporter: Neal Richardson Assignee: Antoine Pitrou See ARROW-7912
[jira] [Created] (ARROW-7912) [Format] C data interface
Neal Richardson created ARROW-7912: -- Summary: [Format] C data interface Key: ARROW-7912 URL: https://issues.apache.org/jira/browse/ARROW-7912 Project: Apache Arrow Issue Type: Improvement Components: Format Affects Versions: 1.0.0 Reporter: Neal Richardson Assignee: Antoine Pitrou
Apache Arrow is designed to be a universal in-memory format for the representation of tabular ("columnar") data. However, some projects may face a difficult choice between either depending on a fast-evolving project such as the Arrow C++ library, or having to reimplement adapters for data interchange, which may require significant, redundant development effort.

The Arrow C data interface defines a very small, stable set of C definitions that can be easily *copied* in any project's source code and used for columnar data interchange in the Arrow format. For non-C/C++ languages and runtimes, it should be almost as easy to translate the C definitions into the corresponding C FFI declarations.

Applications and libraries can therefore work with Arrow memory without necessarily using Arrow libraries or reinventing the wheel. Developers can choose between tight integration with the Arrow *software project* (benefitting from the growing array of facilities exposed by e.g. the C++ or Java implementations of Apache Arrow, but with the cost of a dependency) or minimal integration with the Arrow *format* only.
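To make the "translate the C definitions into FFI declarations" step concrete, here is a minimal sketch using Python's {{ctypes}}. The field lists follow the {{ArrowSchema}} and {{ArrowArray}} structs as later published in the Arrow C data interface specification; the message above does not list them itself, so treat the layout as carried over from that specification rather than from this issue.

```python
import ctypes


class ArrowSchema(ctypes.Structure):
    """Describes a type: format string, name, metadata, child schemas."""


class ArrowArray(ctypes.Structure):
    """Describes the data: lengths, buffers, child arrays."""


# Fields are assigned after the class statements because both structs
# contain pointers to their own type.
ArrowSchema._fields_ = [
    ("format", ctypes.c_char_p),
    ("name", ctypes.c_char_p),
    # Binary-encoded key/value pairs, so kept as a raw pointer, not a C string.
    ("metadata", ctypes.c_void_p),
    ("flags", ctypes.c_int64),
    ("n_children", ctypes.c_int64),
    ("children", ctypes.POINTER(ctypes.POINTER(ArrowSchema))),
    ("dictionary", ctypes.POINTER(ArrowSchema)),
    ("release", ctypes.CFUNCTYPE(None, ctypes.POINTER(ArrowSchema))),
    ("private_data", ctypes.c_void_p),
]

ArrowArray._fields_ = [
    ("length", ctypes.c_int64),
    ("null_count", ctypes.c_int64),
    ("offset", ctypes.c_int64),
    ("n_buffers", ctypes.c_int64),
    ("n_children", ctypes.c_int64),
    ("buffers", ctypes.POINTER(ctypes.c_void_p)),
    ("children", ctypes.POINTER(ctypes.POINTER(ArrowArray))),
    ("dictionary", ctypes.POINTER(ArrowArray)),
    ("release", ctypes.CFUNCTYPE(None, ctypes.POINTER(ArrowArray))),
    ("private_data", ctypes.c_void_p),
]
```

With only these two declarations, a library can describe or receive Arrow data across a C boundary without linking against the Arrow C++ library, which is the "minimal integration with the Arrow format only" option described above.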
[jira] [Created] (ARROW-7902) [Integration] Unskip nested dictionary integration tests
Neal Richardson created ARROW-7902: -- Summary: [Integration] Unskip nested dictionary integration tests Key: ARROW-7902 URL: https://issues.apache.org/jira/browse/ARROW-7902 Project: Apache Arrow Issue Type: Improvement Components: Integration Reporter: Neal Richardson Fix For: 1.0.0
[jira] [Created] (ARROW-7901) [Integration][Go] Add null type (and integration test)
Neal Richardson created ARROW-7901: -- Summary: [Integration][Go] Add null type (and integration test) Key: ARROW-7901 URL: https://issues.apache.org/jira/browse/ARROW-7901 Project: Apache Arrow Issue Type: Improvement Components: Go, Integration Reporter: Neal Richardson Fix For: 1.0.0
[jira] [Created] (ARROW-7900) [Integration][JavaScript] Add null type integration test
Neal Richardson created ARROW-7900: -- Summary: [Integration][JavaScript] Add null type integration test Key: ARROW-7900 URL: https://issues.apache.org/jira/browse/ARROW-7900 Project: Apache Arrow Issue Type: Improvement Components: Integration, JavaScript Reporter: Neal Richardson Fix For: 1.0.0
[jira] [Created] (ARROW-7899) [Integration][Java] null type integration test
Neal Richardson created ARROW-7899: -- Summary: [Integration][Java] null type integration test Key: ARROW-7899 URL: https://issues.apache.org/jira/browse/ARROW-7899 Project: Apache Arrow Issue Type: Bug Components: Integration, Java Reporter: Neal Richardson Fix For: 1.0.0
[jira] [Created] (ARROW-7895) [Python] Remove more python 2.7 cruft
Neal Richardson created ARROW-7895: -- Summary: [Python] Remove more python 2.7 cruft Key: ARROW-7895 URL: https://issues.apache.org/jira/browse/ARROW-7895 Project: Apache Arrow Issue Type: Improvement Components: Python Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0
[jira] [Created] (ARROW-7891) [C++] RecordBatch->Equals should also have a check_metadata argument
Neal Richardson created ARROW-7891: -- Summary: [C++] RecordBatch->Equals should also have a check_metadata argument Key: ARROW-7891 URL: https://issues.apache.org/jira/browse/ARROW-7891 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Neal Richardson Fix For: 1.0.0 Followup to ARROW-7720 and ARROW-7786. Table and Schema both have it, so it stands to reason that RecordBatch should too.
[jira] [Created] (ARROW-7881) [C++] Fix pedantic warnings
Neal Richardson created ARROW-7881: -- Summary: [C++] Fix pedantic warnings Key: ARROW-7881 URL: https://issues.apache.org/jira/browse/ARROW-7881 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 Saw this while working on ARROW-7880:
{code}
In file included from /arrow/r/libarrow/arrow-0.16.0.9000/include/arrow/compute/kernel.h:27,
                 from /arrow/r/libarrow/arrow-0.16.0.9000/include/arrow/compute/api.h:22,
                 from ./arrow_types.h:199,
                 from chunkedarray.cpp:18:
/arrow/r/libarrow/arrow-0.16.0.9000/include/arrow/scalar.h:399:2: warning: extra ‘;’ [-Wpedantic]
}; // namespace internal
 ^
In file included from /arrow/r/libarrow/arrow-0.16.0.9000/include/arrow/compute/api.h:31,
                 from ./arrow_types.h:199,
                 from chunkedarray.cpp:18:
/arrow/r/libarrow/arrow-0.16.0.9000/include/arrow/compute/kernels/mean.h:66:2: warning: extra ‘;’ [-Wpedantic]
}; // namespace arrow
 ^
In file included from /arrow/r/libarrow/arrow-0.16.0.9000/include/arrow/dataset/file_base.h:29,
                 from /arrow/r/libarrow/arrow-0.16.0.9000/include/arrow/dataset/api.h:22,
                 from ./arrow_types.h:201,
                 from chunkedarray.cpp:18:
/arrow/r/libarrow/arrow-0.16.0.9000/include/arrow/dataset/scanner.h:40:2: warning: extra ‘;’ [-Wpedantic]
};
 ^
In file included from /arrow/r/libarrow/arrow-0.16.0.9000/include/parquet/encryption.h:28,
                 from /arrow/r/libarrow/arrow-0.16.0.9000/include/parquet/properties.h:29,
                 from /arrow/r/libarrow/arrow-0.16.0.9000/include/parquet/metadata.h:29,
                 from /arrow/r/libarrow/arrow-0.16.0.9000/include/parquet/file_reader.h:26,
                 from /arrow/r/libarrow/arrow-0.16.0.9000/include/parquet/arrow/reader.h:25,
                 from ./arrow_types.h:217,
                 from chunkedarray.cpp:18:
/arrow/r/libarrow/arrow-0.16.0.9000/include/parquet/schema.h:319:36: warning: extra ‘;’ [-Wpedantic]
PRIMITIVE_FACTORY(Boolean, BOOLEAN);
                                   ^
/arrow/r/libarrow/arrow-0.16.0.9000/include/parquet/schema.h:320:32: warning: extra ‘;’ [-Wpedantic]
PRIMITIVE_FACTORY(Int32, INT32);
                               ^
/arrow/r/libarrow/arrow-0.16.0.9000/include/parquet/schema.h:321:32: warning: extra ‘;’ [-Wpedantic]
PRIMITIVE_FACTORY(Int64, INT64);
                               ^
/arrow/r/libarrow/arrow-0.16.0.9000/include/parquet/schema.h:322:32: warning: extra ‘;’ [-Wpedantic]
PRIMITIVE_FACTORY(Int96, INT96);
                               ^
/arrow/r/libarrow/arrow-0.16.0.9000/include/parquet/schema.h:323:32: warning: extra ‘;’ [-Wpedantic]
PRIMITIVE_FACTORY(Float, FLOAT);
                               ^
/arrow/r/libarrow/arrow-0.16.0.9000/include/parquet/schema.h:324:34: warning: extra ‘;’ [-Wpedantic]
PRIMITIVE_FACTORY(Double, DOUBLE);
                                 ^
/arrow/r/libarrow/arrow-0.16.0.9000/include/parquet/schema.h:325:41: warning: extra ‘;’ [-Wpedantic]
PRIMITIVE_FACTORY(ByteArray, BYTE_ARRAY);
{code}
[jira] [Created] (ARROW-7880) [CI][R] R sanitizer job is not really working
Neal Richardson created ARROW-7880: -- Summary: [CI][R] R sanitizer job is not really working Key: ARROW-7880 URL: https://issues.apache.org/jira/browse/ARROW-7880 Project: Apache Arrow Issue Type: Improvement Components: Continuous Integration, R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 It's not failing, but it's not doing anything useful either. It builds the C++ library and then installs the R package, but the installation doesn't find the C++ library that was just built, so the rest of the build doesn't error but doesn't actually work either; it just burns electricity.
[jira] [Created] (ARROW-7870) [CI][Packaging] Host nightly wheels on Apache bintray
Neal Richardson created ARROW-7870: -- Summary: [CI][Packaging] Host nightly wheels on Apache bintray Key: ARROW-7870 URL: https://issues.apache.org/jira/browse/ARROW-7870 Project: Apache Arrow Issue Type: Improvement Components: Packaging, Python Reporter: Neal Richardson Fix For: 1.0.0 See https://lists.apache.org/thread.html/r86c46849d8fe77de12821834b12330f0f77c3e7d7d4e6302c9f634d3%40%3Cdev.arrow.apache.org%3E Investigate whether bintray is a good alternative, and if we use it, add a note to our website about nightly builds.
[jira] [Created] (ARROW-7865) [R] Test builds on latest Linux versions
Neal Richardson created ARROW-7865: -- Summary: [R] Test builds on latest Linux versions Key: ARROW-7865 URL: https://issues.apache.org/jira/browse/ARROW-7865 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 See https://github.com/apache/arrow/issues/6435. CRAN might use old/stable versions but not everyone is so nostalgic.
[jira] [Created] (ARROW-7864) [R] Make sure bundled installation works even if there are system packages
Neal Richardson created ARROW-7864: -- Summary: [R] Make sure bundled installation works even if there are system packages Key: ARROW-7864 URL: https://issues.apache.org/jira/browse/ARROW-7864 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Neal Richardson Fix For: 1.0.0 Among the issues:
* In https://github.com/apache/arrow/issues/6435: 0.15 system packages didn't have libarrow_dataset, so if they're installed and you try to install 0.16, pkg-config probably reports that the packages aren't available and it tries to build from source. That's fine except that in the linking step, apparently the system packages are being picked up instead of the static libs we just built, so installation fails (presumably until you either upgrade the system packages or delete them). In general, if we've decided to build/download static libs to match the R package, we should make sure those are the ones that get picked up.
* Whenever pkg-config does find packages, check their version and make sure it matches the R package version, and if not, don't use them because they almost certainly won't work.
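The version check in the second bullet could look roughly like the sketch below. Everything in it is hypothetical: the function names are invented for illustration, and the rule of comparing only the first three numeric components (so that an R package patch release such as 0.16.0.2 still matches a 0.16.0 C++ library) is an assumption about what "matches" should mean, not something the issue specifies. The system version string would come from something like `pkg-config --modversion arrow`.

```python
def core_version(version: str) -> tuple:
    """Return the first three numeric components, e.g. '0.16.0.2' -> (0, 16, 0)."""
    # Assumes purely numeric, dot-separated components (hypothetical simplification).
    parts = version.strip().split(".")
    return tuple(int(p) for p in parts[:3])


def system_libs_usable(pkgconfig_version: str, r_package_version: str) -> bool:
    """True only when the system libarrow matches the R package's major.minor.patch."""
    return core_version(pkgconfig_version) == core_version(r_package_version)


# 0.15 system packages must not be used when installing the 0.16 R package.
print(system_libs_usable("0.15.1", "0.16.0"))
# An R-only patch release still matches the corresponding C++ release.
print(system_libs_usable("0.16.0", "0.16.0.2"))
```

When the check fails, the installer would fall back to the bundled static libraries rather than linking against the system ones.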
[jira] [Created] (ARROW-7862) [R] Linux installation should run quieter by default
Neal Richardson created ARROW-7862: -- Summary: [R] Linux installation should run quieter by default Key: ARROW-7862 URL: https://issues.apache.org/jira/browse/ARROW-7862 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0, 0.16.1 No need to blow up the console by default. Also this solves an {{R CMD check}} warning that surfaced on CRAN.
[jira] [Created] (ARROW-7860) [C++] Support cast to/from halffloat
Neal Richardson created ARROW-7860: -- Summary: [C++] Support cast to/from halffloat Key: ARROW-7860 URL: https://issues.apache.org/jira/browse/ARROW-7860 Project: Apache Arrow Issue Type: Improvement Components: C++, C++ - Compute Reporter: Neal Richardson Fix For: 1.0.0 In trying to do ARROW-7753 I realized I couldn't make a halffloat. I tried creating a float64 (as R does naturally) and casting to float16, but the cast is not implemented. Looking at compute/kernels/cast.cc and the associated source in compute/kernels/generated/codegen.py, I see {{FLOATING_TYPES = ['Float', 'Double']}}. Maybe halffloat just needs to be added there? Aside: searching through the code, it seems that limiting float types to float32 and float64 is the norm.
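For reference, this is what a float64 to float16 cast does to individual values, sketched with Python's standard `struct` module, whose `e` format code is IEEE 754 half precision. It is only an illustration of the narrowing. It is not Arrow's cast kernel, and the helper name is made up.

```python
import struct


def to_half_and_back(value: float) -> float:
    """Round-trip a Python float (float64) through IEEE 754 binary16."""
    # 'e' is the struct format code for a 2-byte half-precision float.
    packed = struct.pack("<e", value)
    return struct.unpack("<e", packed)[0]


print(to_half_and_back(1.5))  # exactly representable in 16 bits
print(to_half_and_back(0.1))  # rounded to the nearest half-precision value
```

Values that are not exactly representable lose precision, so a halffloat cast kernel would need the same rounding and overflow handling as the existing narrowing float casts.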
[jira] [Created] (ARROW-7859) [R] Minor patches for CRAN submission 0.16.0.2
Neal Richardson created ARROW-7859: -- Summary: [R] Minor patches for CRAN submission 0.16.0.2 Key: ARROW-7859 URL: https://issues.apache.org/jira/browse/ARROW-7859 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0
[jira] [Created] (ARROW-7853) [CI][Packaging] Add nightly test that pip-installs nightly wheels
Neal Richardson created ARROW-7853: -- Summary: [CI][Packaging] Add nightly test that pip-installs nightly wheels Key: ARROW-7853 URL: https://issues.apache.org/jira/browse/ARROW-7853 Project: Apache Arrow Issue Type: New Feature Components: Continuous Integration, Packaging, Python Reporter: Neal Richardson Assignee: Krisztian Szucs Fix For: 1.0.0 This would catch issues with wheels that we only encountered during release verification.
[jira] [Created] (ARROW-7844) [R] Parquet list column test is flaky
Neal Richardson created ARROW-7844: -- Summary: [R] Parquet list column test is flaky Key: ARROW-7844 URL: https://issues.apache.org/jira/browse/ARROW-7844 Project: Apache Arrow Issue Type: Bug Components: R Reporter: Neal Richardson Assignee: Francois Saint-Jacques See [https://travis-ci.org/ursa-labs/arrow-r-nightly/jobs/649649349#L373-L375] for an example on public CI. I was seeing this locally this week but figured I'd screwed up my env somehow.
{code}
── 1. Failure: Lists are preserved when writing/reading from Parquet (@test-parq
`object` not equivalent to `expected`.
Component "num": Component 1: target is numeric, current is character
{code}
It's not always the same column in the data.frame that is affected. Also strange that it's only one column. You'd think that if it were transposing the order somehow, you'd get two that were swapped. The test itself is straightforward (https://github.com/apache/arrow/blob/master/r/tests/testthat/test-parquet.R#L124-L137) so this is somewhat troubling.
[jira] [Created] (ARROW-7833) [R] Make install_arrow() actually install arrow
Neal Richardson created ARROW-7833: -- Summary: [R] Make install_arrow() actually install arrow Key: ARROW-7833 URL: https://issues.apache.org/jira/browse/ARROW-7833 Project: Apache Arrow Issue Type: New Feature Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0
[jira] [Created] (ARROW-7832) [R] Patches to 0.16.0 release
Neal Richardson created ARROW-7832: -- Summary: [R] Patches to 0.16.0 release Key: ARROW-7832 URL: https://issues.apache.org/jira/browse/ARROW-7832 Project: Apache Arrow Issue Type: Bug Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 CRAN did not like 0.16.0 as originally submitted. This contains the patches in the 0.16.0.1 resubmission.