[jira] [Created] (ARROW-9400) [Python] Do not depend on conda-forge static libraries in Windows wheel builds
Wes McKinney created ARROW-9400: --- Summary: [Python] Do not depend on conda-forge static libraries in Windows wheel builds Key: ARROW-9400 URL: https://issues.apache.org/jira/browse/ARROW-9400 Project: Apache Arrow Issue Type: Improvement Components: Python Reporter: Wes McKinney Based on https://github.com/conda-forge/cfep/blob/e9bb3f58eca79107baede71cb9b05311705a10f2/cfep-18.md it appears that static libraries may not be included in the future in many packages that we use for building the Windows Python wheels. We should change the build to use BUNDLED builds so we don't have this issue -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ARROW-9399) [C++] Add forward compatibility checks for unrecognized future MetadataVersion
Wes McKinney created ARROW-9399: --- Summary: [C++] Add forward compatibility checks for unrecognized future MetadataVersion Key: ARROW-9399 URL: https://issues.apache.org/jira/browse/ARROW-9399 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Wes McKinney Fix For: 1.0.0 In theory we should have no need of these checks, but they provide a safeguard should it become necessary to increment the MetadataVersion some years in the future.
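A minimal sketch of the kind of check described above, written in Python for illustration only (the actual change would live in the C++ IPC reader). The enum values assume that V1 through V5 are numbered 0 through 4 in declaration order, and `check_metadata_version` is a hypothetical helper name, not an Arrow API.

```python
from enum import IntEnum


class MetadataVersion(IntEnum):
    # Assumption: values follow declaration order V1..V5.
    V1 = 0
    V2 = 1
    V3 = 2
    V4 = 3
    V5 = 4


def check_metadata_version(raw: int) -> MetadataVersion:
    """Return the known version, or fail loudly on an unrecognized one."""
    try:
        return MetadataVersion(raw)
    except ValueError:
        # A value beyond the known range most likely comes from a newer writer.
        raise ValueError(
            f"Unrecognized MetadataVersion {raw}; "
            "the data may have been written by a newer library version"
        ) from None
```

The point of failing early is that a reader never tries to interpret a layout it does not know about.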
[jira] [Created] (ARROW-9398) [C++] Register the SIMD sum variants under function instance instead of a SIMD function
Frank Du created ARROW-9398: --- Summary: [C++] Register the SIMD sum variants under function instance instead of a SIMD function Key: ARROW-9398 URL: https://issues.apache.org/jira/browse/ARROW-9398 Project: Apache Arrow Issue Type: Improvement Reporter: Frank Du Assignee: Frank Du Per the review comments of [https://github.com/apache/arrow/pull/7607] "Instead, we should add all the kernel variants to the same {{Function}} instance and then the {{Dispatch*}} methods should select the kernel with the maximum SIMD level as set on the Kernel object. That was the idea of the {{simd_level}} parameter" We should register the SIMD sum kernels on the same function instance and let dispatch choose among them.
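The dispatch rule quoted in the review comment can be sketched roughly as follows. This is a Python illustration, not the C++ implementation: `SimdLevel`, `Kernel`, `Function` and `dispatch` are hypothetical stand-ins, and the set of SIMD levels is an assumption chosen for the example.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Callable, List, Sequence


class SimdLevel(IntEnum):
    # Hypothetical ordering: a higher value means a more capable instruction set.
    NONE = 0
    SSE4_2 = 1
    AVX2 = 2
    AVX512 = 3


@dataclass
class Kernel:
    simd_level: SimdLevel
    run: Callable[[Sequence[float]], float]


class Function:
    """Holds every kernel variant of one compute function."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.kernels: List[Kernel] = []

    def add_kernel(self, kernel: Kernel) -> None:
        self.kernels.append(kernel)

    def dispatch(self, cpu_level: SimdLevel) -> Kernel:
        # Keep only the kernels the CPU can execute, then take the highest level.
        usable = [k for k in self.kernels if k.simd_level <= cpu_level]
        if not usable:
            raise LookupError(f"no kernel of {self.name!r} is usable on this CPU")
        return max(usable, key=lambda k: k.simd_level)
```

With all variants on one `Function`, callers look up a single name and the SIMD choice stays an internal detail of dispatch.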
[jira] [Created] (ARROW-9397) [R] Pass CC/CXX to cmake when building libarrow in Linux build
Neal Richardson created ARROW-9397: -- Summary: [R] Pass CC/CXX to cmake when building libarrow in Linux build Key: ARROW-9397 URL: https://issues.apache.org/jira/browse/ARROW-9397 Project: Apache Arrow Issue Type: Bug Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 1.0.0 TL;DR: one of CRAN's test machines uses a bespoke clang build that uses libc++ instead of libstdc++: https://www.stats.ox.ac.uk/pub/bdr/Rconfig/r-devel-linux-x86_64-fedora-clang. R may have various make configuration settings that it uses when compiling the R bindings in `r/src`, and we need to use those same settings when we shell out to cmake to build Arrow C++. Otherwise the package fails to load due to undefined symbols: https://www.r-project.org/nosvn/R.check/r-devel-linux-x86_64-fedora-clang/arrow-00install.html
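One way to picture the fix: take the compiler settings R reports and forward them as the `CC` and `CXX` environment variables of the cmake process. The sketch below is in Python purely for illustration (the real build is driven from the R package's own scripts); `cmake_environment` and the `r_config` mapping are hypothetical names, and how the values are obtained from R is left out.

```python
import os
from typing import Dict, Mapping, Optional


def cmake_environment(
    r_config: Mapping[str, str],
    base_env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return a copy of the environment with R's CC/CXX settings applied."""
    env = dict(os.environ if base_env is None else base_env)
    for key in ("CC", "CXX"):
        value = r_config.get(key, "").strip()
        if value:
            # The value may carry flags, e.g. "clang++ -stdlib=libc++".
            env[key] = value
    return env
```

The resulting dictionary would be passed as the `env` argument of `subprocess.run` when invoking cmake, so that Arrow C++ and the R bindings are built against the same standard library.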
[jira] [Created] (ARROW-9396) [Python] Expose CpuInfo for informational / debugging purposes
Wes McKinney created ARROW-9396: --- Summary: [Python] Expose CpuInfo for informational / debugging purposes Key: ARROW-9396 URL: https://issues.apache.org/jira/browse/ARROW-9396 Project: Apache Arrow Issue Type: Improvement Components: Python Reporter: Wes McKinney This would help to see what CpuInfo says about the current processor
[jira] [Created] (ARROW-9395) [Python] Provide configurable MetadataVersion in IPC API and environment variable to set default to V4 when needed
Wes McKinney created ARROW-9395: --- Summary: [Python] Provide configurable MetadataVersion in IPC API and environment variable to set default to V4 when needed Key: ARROW-9395 URL: https://issues.apache.org/jira/browse/ARROW-9395 Project: Apache Arrow Issue Type: Improvement Components: Python Reporter: Wes McKinney Fix For: 1.0.0 This is a follow-up to ARROW-9265 and must be implemented in order to release 1.0.0
[jira] [Created] (ARROW-9394) [Python] Support pickling of Scalars
Ben Kietzman created ARROW-9394: --- Summary: [Python] Support pickling of Scalars Key: ARROW-9394 URL: https://issues.apache.org/jira/browse/ARROW-9394 Project: Apache Arrow Issue Type: Improvement Components: Python Reporter: Ben Kietzman Fix For: 2.0.0 Scalars don't currently support pickling. Could this be implemented with {{Scalar, (self.type, self.as_py())}}?
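The tuple in the question has the shape that `__reduce__` returns (a callable plus its arguments), so the idea can be sketched as below. This assumes that reading of the proposal, and it uses a hypothetical pure-Python `Scalar` stand-in rather than the real pyarrow class, whose constructor may not accept these arguments.

```python
import pickle


class Scalar:
    """Hypothetical stand-in for a scalar holding a type and a value."""

    def __init__(self, type_, value):
        self.type = type_
        self._value = value

    def as_py(self):
        return self._value

    def __reduce__(self):
        # pickle rebuilds the object by calling Scalar(self.type, self.as_py()).
        return Scalar, (self.type, self.as_py())

    def __eq__(self, other):
        return (
            isinstance(other, Scalar)
            and self.type == other.type
            and self.as_py() == other.as_py()
        )


if __name__ == "__main__":
    original = Scalar("int64", 42)
    restored = pickle.loads(pickle.dumps(original))
    print(restored == original)
```

Round-tripping through `as_py()` only works where the Python value plus the type is enough to rebuild the scalar, which is something the real implementation would need to verify per type.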
[jira] [Created] (ARROW-9393) [Doc] update supported types documentation for Java
Ryan Murray created ARROW-9393: -- Summary: [Doc] update supported types documentation for Java Key: ARROW-9393 URL: https://issues.apache.org/jira/browse/ARROW-9393 Project: Apache Arrow Issue Type: Improvement Components: Documentation Reporter: Ryan Murray Assignee: Ryan Murray Fix For: 1.0.0
[jira] [Created] (ARROW-9392) [C++] Document more of the compute layer
Antoine Pitrou created ARROW-9392: - Summary: [C++] Document more of the compute layer Key: ARROW-9392 URL: https://issues.apache.org/jira/browse/ARROW-9392 Project: Apache Arrow Issue Type: Improvement Components: C++, Documentation Reporter: Antoine Pitrou Fix For: 1.0.0 Ideally, we should add:
* a description and examples of how to call compute functions
* an API reference for concrete C++ functions such as {{Cast}}, {{NthToIndices}}, etc.
[jira] [Created] (ARROW-9390) [C++] Review compute function names
Antoine Pitrou created ARROW-9390: - Summary: [C++] Review compute function names Key: ARROW-9390 URL: https://issues.apache.org/jira/browse/ARROW-9390 Project: Apache Arrow Issue Type: Wish Reporter: Antoine Pitrou Fix For: 1.0.0 We should probably make compute function naming more consistent while it's not too late. Examples:
* "isin", "minmax" but "less_equal", "or_kleene", "binary_contains_exact"
* "binary_contains_exact" only works on string types
[jira] [Created] (ARROW-9389) [C++] Can't call isin/match through CallFunction
Neal Richardson created ARROW-9389: -- Summary: [C++] Can't call isin/match through CallFunction Key: ARROW-9389 URL: https://issues.apache.org/jira/browse/ARROW-9389 Project: Apache Arrow Issue Type: Bug Components: C++ Reporter: Neal Richardson Fix For: 2.0.0 From R:
{code:r}
library(arrow)
a <- Array$create(1:4)
b <- Array$create(c(2L, 4L, 3L))
arrow:::call_function("isin", a, b)
{code}
says that "isin" takes only 1 argument, not 2, which doesn't make sense. In C++ scalar_set_lookup.cc, I see {{auto isin = std::make_shared<ScalarFunction>("isin", Arity::Unary());}}, which I guess is the source of that validation, but the function in api_scalar.cc has the signature {{Result<Datum> IsIn(const Datum& values, const Datum& value_set, ExecContext* ctx)}}. If I actually call "isin" with one argument, i.e. {{arrow:::call_function("isin", a)}}, it segfaults. If I change the definition to Arity::Binary(), it accepts 2 arguments, but it errors with {{NotImplemented: Function isin has no kernel matching input types (array[int32], array[int32])}}
[jira] [Created] (ARROW-9388) [C++] Division kernels
Neal Richardson created ARROW-9388: -- Summary: [C++] Division kernels Key: ARROW-9388 URL: https://issues.apache.org/jira/browse/ARROW-9388 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Neal Richardson Fix For: 2.0.0 We now have add, subtract, multiply, but no division
[jira] [Created] (ARROW-9387) [R] Use new C++ table select method
Neal Richardson created ARROW-9387: -- Summary: [R] Use new C++ table select method Key: ARROW-9387 URL: https://issues.apache.org/jira/browse/ARROW-9387 Project: Apache Arrow Issue Type: Improvement Components: R Reporter: Neal Richardson Assignee: Neal Richardson Fix For: 2.0.0 ARROW-8314 adds it so we can use it instead of the one we wrote in the R package.
[jira] [Created] (ARROW-9386) [Rust] RecordBatch.schema() should not return &Arc<Schema>
Andy Grove created ARROW-9386: - Summary: [Rust] RecordBatch.schema() should not return &Arc<Schema> Key: ARROW-9386 URL: https://issues.apache.org/jira/browse/ARROW-9386 Project: Apache Arrow Issue Type: Improvement Components: Rust Reporter: Andy Grove Fix For: 1.0.0 RecordBatch.schema() should not return &Arc<Schema>. It should either return &Schema or Arc<Schema>. Given that the schema could be a large nested structure and other code may want to keep references to it, Arc<Schema> is probably best. Returning a reference to an Arc just doesn't make much sense, IMHO, and I think I was responsible for introducing this early on.
[jira] [Created] (ARROW-9385) [Python] [CI] jpype integration failure
Antoine Pitrou created ARROW-9385: - Summary: [Python] [CI] jpype integration failure Key: ARROW-9385 URL: https://issues.apache.org/jira/browse/ARROW-9385 Project: Apache Arrow Issue Type: Bug Components: Java, Python Reporter: Antoine Pitrou Fix For: 1.0.0 Following the Netty changes in Java, the Python jpype integration tests are failing: https://github.com/ursa-labs/crossbow/runs/852764453
[GitHub] [arrow-testing] pitrou merged pull request #37: ARROW-9384: Add fuzz regression file
pitrou merged pull request #37: URL: https://github.com/apache/arrow-testing/pull/37 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [arrow-testing] pitrou opened a new pull request #37: ARROW-9384: Add fuzz regression file
pitrou opened a new pull request #37: URL: https://github.com/apache/arrow-testing/pull/37
[jira] [Created] (ARROW-9384) [C++] Out-of-memory on invalid IPC input (OSS-Fuzz)
Antoine Pitrou created ARROW-9384: - Summary: [C++] Out-of-memory on invalid IPC input (OSS-Fuzz) Key: ARROW-9384 URL: https://issues.apache.org/jira/browse/ARROW-9384 Project: Apache Arrow Issue Type: Bug Components: C++ Reporter: Antoine Pitrou Assignee: Antoine Pitrou Fix For: 1.0.0
[jira] [Created] (ARROW-9383) [Python] Support fsspec filesystems in Dataset API through fs handler
Joris Van den Bossche created ARROW-9383: Summary: [Python] Support fsspec filesystems in Dataset API through fs handler Key: ARROW-9383 URL: https://issues.apache.org/jira/browse/ARROW-9383 Project: Apache Arrow Issue Type: Improvement Components: Python Reporter: Joris Van den Bossche Assignee: Joris Van den Bossche Fix For: 1.0.0