[jira] [Created] (ARROW-9400) [Python] Do not depend on conda-forge static libraries in Windows wheel builds

2020-07-09 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-9400:
---

 Summary: [Python] Do not depend on conda-forge static libraries in 
Windows wheel builds
 Key: ARROW-9400
 URL: https://issues.apache.org/jira/browse/ARROW-9400
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Python
Reporter: Wes McKinney


Based on 
https://github.com/conda-forge/cfep/blob/e9bb3f58eca79107baede71cb9b05311705a10f2/cfep-18.md
 it appears that static libraries may not be included in the future in many 
packages that we use for building the Windows Python wheels. We should change 
the build to use BUNDLED builds so that we don't have this issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-9399) [C++] Add forward compatibility checks for unrecognized future MetadataVersion

2020-07-09 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-9399:
---

 Summary: [C++] Add forward compatibility checks for unrecognized 
future MetadataVersion
 Key: ARROW-9399
 URL: https://issues.apache.org/jira/browse/ARROW-9399
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Reporter: Wes McKinney
 Fix For: 1.0.0


In theory we should have no need of these checks, but they provide a safeguard 
should it become necessary, some years in the future, to increment the 
MetadataVersion. 
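
A minimal sketch of what such a forward compatibility check could look like, 
written in Python for brevity rather than in the C++ this issue targets. The 
enum below is an illustrative stand-in for the versions a reader knows about, 
and the name check_metadata_version and the error type are hypothetical, not 
part of any Arrow API.

```python
from enum import IntEnum


class MetadataVersion(IntEnum):
    # Illustrative stand-in for the versions this reader knows about.
    V1 = 0
    V2 = 1
    V3 = 2
    V4 = 3
    V5 = 4


def check_metadata_version(raw_version: int) -> MetadataVersion:
    """Return the known version, or fail loudly on an unrecognized one."""
    try:
        return MetadataVersion(raw_version)
    except ValueError:
        newest = max(MetadataVersion)
        raise ValueError(
            f"Unrecognized MetadataVersion {raw_version}; "
            f"newest supported is {newest.name}. "
            "The data may have been written by a newer library."
        ) from None
```

The point of the sketch is only that an unknown value is rejected with a 
clear error instead of being silently interpreted as a known version.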





[jira] [Created] (ARROW-9398) [C++] Register the SIMD sum variants under function instance instead of a SIMD function

2020-07-09 Thread Frank Du (Jira)
Frank Du created ARROW-9398:
---

 Summary: [C++] Register the SIMD sum variants under function 
instance instead of a SIMD function
 Key: ARROW-9398
 URL: https://issues.apache.org/jira/browse/ARROW-9398
 Project: Apache Arrow
  Issue Type: Improvement
Reporter: Frank Du
Assignee: Frank Du


Per the review comments of [https://github.com/apache/arrow/pull/7607]

"Instead, we should add all the kernel variants to the same {{Function}} 
instance and then the {{Dispatch*}} methods should select the kernel with the 
maximum SIMD level as set on the Kernel object. That was the idea of the 
{{simd_level}} parameter"

 

We should register the SIMD sum kernels on the same function instance.
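
The dispatch idea in the quoted review comment can be sketched roughly as 
follows. This is a Python toy model, not Arrow's C++ classes: SimdLevel, 
Kernel, Function and dispatch_best are hypothetical names, and the built-in 
sum stands in for the real kernel variants.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Callable, List, Sequence


class SimdLevel(IntEnum):
    # Hypothetical ordering: a higher value means a wider instruction set.
    NONE = 0
    SSE4_2 = 1
    AVX2 = 2
    AVX512 = 3


@dataclass
class Kernel:
    simd_level: SimdLevel
    run: Callable[[Sequence[float]], float]


@dataclass
class Function:
    name: str
    kernels: List[Kernel] = field(default_factory=list)

    def add_kernel(self, kernel: Kernel) -> None:
        self.kernels.append(kernel)

    def dispatch_best(self, cpu_level: SimdLevel) -> Kernel:
        """Pick the kernel with the highest SIMD level the CPU supports."""
        usable = [k for k in self.kernels if k.simd_level <= cpu_level]
        if not usable:
            raise LookupError(f"no kernel of {self.name!r} runs on {cpu_level.name}")
        return max(usable, key=lambda k: k.simd_level)


# All variants live on one function instance; sum() is a stand-in for each.
sum_function = Function("sum")
sum_function.add_kernel(Kernel(SimdLevel.NONE, sum))
sum_function.add_kernel(Kernel(SimdLevel.AVX2, sum))
sum_function.add_kernel(Kernel(SimdLevel.AVX512, sum))
```

With this layout a caller asks for "sum" once, and the choice of variant 
depends only on the level recorded on each kernel.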





[jira] [Created] (ARROW-9397) [R] Pass CC/CXX to cmake when building libarrow in Linux build

2020-07-09 Thread Neal Richardson (Jira)
Neal Richardson created ARROW-9397:
--

 Summary: [R] Pass CC/CXX to cmake when building libarrow in Linux 
build
 Key: ARROW-9397
 URL: https://issues.apache.org/jira/browse/ARROW-9397
 Project: Apache Arrow
  Issue Type: Bug
  Components: R
Reporter: Neal Richardson
Assignee: Neal Richardson
 Fix For: 1.0.0


TL;DR: one of CRAN's test machines uses a bespoke clang build that uses libc++ 
instead of libstdc++: 
https://www.stats.ox.ac.uk/pub/bdr/Rconfig/r-devel-linux-x86_64-fedora-clang. R 
may have various make configuration variables set that it uses when compiling 
the R bindings in `r/src`, and we need to use those settings when we shell out 
to cmake to build Arrow C++. Otherwise the package fails to load due to 
undefined symbols: 
https://www.r-project.org/nosvn/R.check/r-devel-linux-x86_64-fedora-clang/arrow-00install.html
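
As a rough illustration of forwarding compiler settings to cmake, the sketch 
below turns CC and CXX values into CMake cache definitions. It is written in 
Python purely for illustration and is not the R package's build script. The 
helper name cmake_compiler_args is hypothetical, and it assumes the values 
have already been exported as environment-style strings, possibly with 
trailing flags after the compiler name.

```python
import shlex
from typing import List, Mapping


def cmake_compiler_args(env: Mapping[str, str]) -> List[str]:
    """Translate CC/CXX settings into -D definitions for a cmake command line."""
    args: List[str] = []
    for var, lang in (("CC", "C"), ("CXX", "CXX")):
        tokens = shlex.split(env.get(var, ""))
        if not tokens:
            continue  # Variable unset or empty: let cmake pick its default.
        # Assumption: the first token is the compiler, the rest are flags.
        args.append(f"-DCMAKE_{lang}_COMPILER={tokens[0]}")
        if tokens[1:]:
            args.append(f"-DCMAKE_{lang}_FLAGS={shlex.join(tokens[1:])}")
    return args
```

For example, a CXX value of "clang++ -stdlib=libc++" (an illustrative value, 
not taken from the CRAN configuration) would yield a CMAKE_CXX_COMPILER and a 
CMAKE_CXX_FLAGS definition.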





[jira] [Created] (ARROW-9396) [Python] Expose CpuInfo for informational / debugging purposes

2020-07-09 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-9396:
---

 Summary: [Python] Expose CpuInfo for informational / debugging 
purposes
 Key: ARROW-9396
 URL: https://issues.apache.org/jira/browse/ARROW-9396
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Python
Reporter: Wes McKinney


This would make it easy to see what CpuInfo reports about the current processor.





[jira] [Created] (ARROW-9395) [Python] Provide configurable MetadataVersion in IPC API and environment variable to set default to V4 when needed

2020-07-09 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-9395:
---

 Summary: [Python] Provide configurable MetadataVersion in IPC API 
and environment variable to set default to V4 when needed
 Key: ARROW-9395
 URL: https://issues.apache.org/jira/browse/ARROW-9395
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Python
Reporter: Wes McKinney
 Fix For: 1.0.0


This is a follow-up to ARROW-9265 and must be implemented in order to release 
1.0.0.





[jira] [Created] (ARROW-9394) [Python] Support pickling of Scalars

2020-07-09 Thread Ben Kietzman (Jira)
Ben Kietzman created ARROW-9394:
---

 Summary: [Python] Support pickling of Scalars
 Key: ARROW-9394
 URL: https://issues.apache.org/jira/browse/ARROW-9394
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Python
Reporter: Ben Kietzman
 Fix For: 2.0.0


Scalars don't currently support pickling.

Could this be implemented with {{Scalar, (self.type, self.as_py())}}?
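
One way to read the suggestion is as the return value of a __reduce__ method. 
The sketch below shows that reading on a toy class. ToyScalar and roundtrip 
are hypothetical names and this is not pyarrow's Scalar; whether the real 
constructor accepts a (type, value) pair in this form is an assumption.

```python
import pickle


class ToyScalar:
    """Hypothetical stand-in for a scalar: a type name plus a Python value."""

    def __init__(self, type_, value):
        self.type = type_
        self._value = value

    def as_py(self):
        return self._value

    def __reduce__(self):
        # On unpickling, rebuild the object by calling ToyScalar(type, value).
        return ToyScalar, (self.type, self.as_py())

    def __eq__(self, other):
        return (
            isinstance(other, ToyScalar)
            and self.type == other.type
            and self._value == other._value
        )


def roundtrip(scalar):
    """Serialize and deserialize a scalar through pickle."""
    return pickle.loads(pickle.dumps(scalar))
```

The approach depends on as_py() returning a value that is itself picklable 
and that the constructor can convert back without loss.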





[jira] [Created] (ARROW-9393) [Doc] update supported types documentation for Java

2020-07-09 Thread Ryan Murray (Jira)
Ryan Murray created ARROW-9393:
--

 Summary: [Doc] update supported types documentation for Java
 Key: ARROW-9393
 URL: https://issues.apache.org/jira/browse/ARROW-9393
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Documentation
Reporter: Ryan Murray
Assignee: Ryan Murray
 Fix For: 1.0.0








[jira] [Created] (ARROW-9392) [C++] Document more of the compute layer

2020-07-09 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-9392:
-

 Summary: [C++] Document more of the compute layer
 Key: ARROW-9392
 URL: https://issues.apache.org/jira/browse/ARROW-9392
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++, Documentation
Reporter: Antoine Pitrou
 Fix For: 1.0.0


Ideally, we should add:
* a description and examples of how to call compute functions
* an API reference for concrete C++ functions such as {{Cast}}, 
{{NthToIndices}}, etc.






[jira] [Created] (ARROW-9390) [C++] Review compute function names

2020-07-09 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-9390:
-

 Summary: [C++] Review compute function names
 Key: ARROW-9390
 URL: https://issues.apache.org/jira/browse/ARROW-9390
 Project: Apache Arrow
  Issue Type: Wish
Reporter: Antoine Pitrou
 Fix For: 1.0.0


We should probably make compute function naming more consistent while it's not 
too late.

Examples:
* "isin", "minmax" but "less_equal", "or_kleene", "binary_contains_exact"
* "binary_contains_exact" only works on string types






[jira] [Created] (ARROW-9389) [C++] Can't call isin/match through CallFunction

2020-07-09 Thread Neal Richardson (Jira)
Neal Richardson created ARROW-9389:
--

 Summary: [C++] Can't call isin/match through CallFunction
 Key: ARROW-9389
 URL: https://issues.apache.org/jira/browse/ARROW-9389
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Reporter: Neal Richardson
 Fix For: 2.0.0


From R:

{code:r}
library(arrow)
a <- Array$create(1:4)
b <- Array$create(c(2L, 4L, 3L))
arrow:::call_function("isin", a, b)
{code}

says that "isin" takes only 1 argument, not 2, which doesn't make sense. In C++ 
scalar_set_lookup.cc, I see {{auto isin = 
std::make_shared<ScalarFunction>("isin", Arity::Unary());}}, which is the 
source of that validation I guess, but the kernel in api_scalar.cc has 
signature {{Result<Datum> IsIn(const Datum& values, const Datum& value_set, 
ExecContext* ctx)}}.

If I actually call "isin" with one argument, i.e. 
{{arrow:::call_function("isin", a)}}, it segfaults.

After changing the definition to Arity::Binary(), it accepts 2 arguments, but 
it errors with {{NotImplemented: Function isin has no kernel matching input 
types (array[int32], array[int32])}}.
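
The mismatch between the declared arity and the callable's real signature can 
be reproduced in a toy model. The Python sketch below is not Arrow's function 
registry; Arity, RegisteredFunction, register, call_function and is_in are 
hypothetical names chosen to mirror the terms used above.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Arity:
    num_args: int

    @staticmethod
    def unary() -> "Arity":
        return Arity(1)

    @staticmethod
    def binary() -> "Arity":
        return Arity(2)


@dataclass
class RegisteredFunction:
    name: str
    arity: Arity
    impl: Callable[..., object]


REGISTRY: Dict[str, RegisteredFunction] = {}


def register(function: RegisteredFunction) -> None:
    REGISTRY[function.name] = function


def call_function(name: str, *args):
    """Look up a function by name and validate the argument count first."""
    function = REGISTRY[name]
    if len(args) != function.arity.num_args:
        raise TypeError(
            f"{name!r} takes {function.arity.num_args} argument(s), got {len(args)}"
        )
    return function.impl(*args)


def is_in(values, value_set):
    return [v in value_set for v in values]


# Declared unary although the implementation needs two inputs,
# mirroring the mismatch described in this report.
register(RegisteredFunction("isin", Arity.unary(), is_in))
```

In this model a two-argument call is rejected by the arity check, while a 
one-argument call passes the check and then fails inside the implementation, 
which is the same shape of failure as reported here.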





[jira] [Created] (ARROW-9388) [C++] Division kernels

2020-07-09 Thread Neal Richardson (Jira)
Neal Richardson created ARROW-9388:
--

 Summary: [C++] Division kernels
 Key: ARROW-9388
 URL: https://issues.apache.org/jira/browse/ARROW-9388
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Reporter: Neal Richardson
 Fix For: 2.0.0


We now have add, subtract, and multiply kernels, but no division.





[jira] [Created] (ARROW-9387) [R] Use new C++ table select method

2020-07-09 Thread Neal Richardson (Jira)
Neal Richardson created ARROW-9387:
--

 Summary: [R] Use new C++ table select method
 Key: ARROW-9387
 URL: https://issues.apache.org/jira/browse/ARROW-9387
 Project: Apache Arrow
  Issue Type: Improvement
  Components: R
Reporter: Neal Richardson
Assignee: Neal Richardson
 Fix For: 2.0.0


ARROW-8314 adds it so we can use it instead of the one we wrote in the R 
package.





[jira] [Created] (ARROW-9386) [Rust] RecordBatch.schema() should not return &Arc<Schema>

2020-07-09 Thread Andy Grove (Jira)
Andy Grove created ARROW-9386:
-

 Summary: [Rust] RecordBatch.schema() should not return &Arc<Schema>
 Key: ARROW-9386
 URL: https://issues.apache.org/jira/browse/ARROW-9386
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust
Reporter: Andy Grove
 Fix For: 1.0.0


RecordBatch.schema() should not return &Arc<Schema>. It should either return 
&Schema or Arc<Schema>.

Given that the schema could be a large nested structure and other code may want 
to keep references to it, Arc is probably best.

Returning a reference to an Arc just doesn't make much sense, IMHO, and I think 
I was responsible for introducing this early on.





[jira] [Created] (ARROW-9385) [Python] [CI] jpype integration failure

2020-07-09 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-9385:
-

 Summary: [Python] [CI] jpype integration failure
 Key: ARROW-9385
 URL: https://issues.apache.org/jira/browse/ARROW-9385
 Project: Apache Arrow
  Issue Type: Bug
  Components: Java, Python
Reporter: Antoine Pitrou
 Fix For: 1.0.0


Following the Netty changes in Java, the Python jpype integration tests are 
failing:

https://github.com/ursa-labs/crossbow/runs/852764453





[GitHub] [arrow-testing] pitrou merged pull request #37: ARROW-9384: Add fuzz regression file

2020-07-09 Thread GitBox


pitrou merged pull request #37:
URL: https://github.com/apache/arrow-testing/pull/37


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [arrow-testing] pitrou opened a new pull request #37: ARROW-9384: Add fuzz regression file

2020-07-09 Thread GitBox


pitrou opened a new pull request #37:
URL: https://github.com/apache/arrow-testing/pull/37


   







[jira] [Created] (ARROW-9384) [C++] Out-of-memory on invalid IPC input (OSS-Fuzz)

2020-07-09 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-9384:
-

 Summary: [C++] Out-of-memory on invalid IPC input (OSS-Fuzz)
 Key: ARROW-9384
 URL: https://issues.apache.org/jira/browse/ARROW-9384
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Reporter: Antoine Pitrou
Assignee: Antoine Pitrou
 Fix For: 1.0.0








[jira] [Created] (ARROW-9383) [Python] Support fsspec filesystems in Dataset API through fs handler

2020-07-09 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-9383:


 Summary: [Python] Support fsspec filesystems in Dataset API 
through fs handler
 Key: ARROW-9383
 URL: https://issues.apache.org/jira/browse/ARROW-9383
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Python
Reporter: Joris Van den Bossche
Assignee: Joris Van den Bossche
 Fix For: 1.0.0





