[jira] [Created] (ARROW-10608) [Python] Decimal256 Support finish off full support for conversion to/from decimal types

2020-11-15 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-10608:
---

 Summary: [Python] Decimal256 Support finish off full support for 
conversion to/from decimal types
 Key: ARROW-10608
 URL: https://issues.apache.org/jira/browse/ARROW-10608
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++, Python
Reporter: Micah Kornfield






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (ARROW-10607) [C++][Parquet] Support Reading/Writing Decimal256 type in Parquet

2020-11-15 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-10607:
---

 Summary: [C++][Parquet] Support Reading/Writing Decimal256 type in 
Parquet
 Key: ARROW-10607
 URL: https://issues.apache.org/jira/browse/ARROW-10607
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Reporter: Micah Kornfield








[jira] [Created] (ARROW-10606) [C++][Compute] Support casts to and from Decimal256 type.

2020-11-15 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-10606:
---

 Summary: [C++][Compute] Support casts to and from Decimal256 type.
 Key: ARROW-10606
 URL: https://issues.apache.org/jira/browse/ARROW-10606
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Reporter: Micah Kornfield








[jira] [Created] (ARROW-10605) [C++][Gandiva] Support Decimal256 type in gandiva computation.

2020-11-15 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-10605:
---

 Summary: [C++][Gandiva] Support Decimal256 type in gandiva 
computation.
 Key: ARROW-10605
 URL: https://issues.apache.org/jira/browse/ARROW-10605
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++ - Gandiva
Reporter: Micah Kornfield


There might be a lot of work here, so sub-jiras might be added once scope is 
determined.





[jira] [Created] (ARROW-10604) [Ruby] Support Decimal256 type

2020-11-15 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-10604:
---

 Summary: [Ruby] Support Decimal256 type
 Key: ARROW-10604
 URL: https://issues.apache.org/jira/browse/ARROW-10604
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Ruby
Reporter: Micah Kornfield


The C++ implementation now supports it. We need to ensure the Ruby/GObject 
bindings do as well.





[jira] [Created] (ARROW-10603) [Javascript] Support Decimal type with 256 bits of precision

2020-11-15 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-10603:
---

 Summary: [Javascript] Support Decimal type with 256 bits of 
precision
 Key: ARROW-10603
 URL: https://issues.apache.org/jira/browse/ARROW-10603
 Project: Apache Arrow
  Issue Type: Improvement
  Components: JavaScript
Reporter: Micah Kornfield


The specification now supports it, and there are basic implementations in 
C++ and Java.





[jira] [Created] (ARROW-10602) [Rust] Implement support for Decimal with 256 bits of precision.

2020-11-15 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-10602:
---

 Summary: [Rust] Implement support for Decimal with 256 bits of 
precision.
 Key: ARROW-10602
 URL: https://issues.apache.org/jira/browse/ARROW-10602
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust
Reporter: Micah Kornfield


The specification now supports it, and there are basic implementations in 
C++ and Java.





[jira] [Created] (ARROW-10601) [C++] CSV Reader should support Decimal256 type

2020-11-15 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-10601:
---

 Summary: [C++] CSV Reader should support Decimal256 type
 Key: ARROW-10601
 URL: https://issues.apache.org/jira/browse/ARROW-10601
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Reporter: Micah Kornfield








[jira] [Created] (ARROW-10600) [Go] Support Decimal256 type

2020-11-15 Thread Micah Kornfield (Jira)
Micah Kornfield created ARROW-10600:
---

 Summary: [Go] Support Decimal256 type
 Key: ARROW-10600
 URL: https://issues.apache.org/jira/browse/ARROW-10600
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Go
Reporter: Micah Kornfield


A decimal type with 256-bit precision is now allowed in the spec, with basic 
implementations in Java and C++.
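
For reference, a minimal sketch of what the wider type allows, assuming a decimal is stored as a fixed-width two's complement integer (the "unscaled" value) plus a scale, and assuming little-endian byte order. A 128-bit value can hold any 38-digit integer; a 256-bit value can hold any 76-digit integer. The helper names `max_precision` and `encode_unscaled` are hypothetical and are not part of any Arrow API.

```python
from decimal import Decimal


def max_precision(bit_width: int) -> int:
    """Largest digit count p such that every p-digit integer fits in a
    signed two's complement integer of the given bit width."""
    largest = 2 ** (bit_width - 1) - 1
    # The largest value has one digit more than is always representable.
    return len(str(largest)) - 1


def encode_unscaled(value: Decimal, scale: int, bit_width: int = 256) -> bytes:
    """Encode `value` as its unscaled integer (value * 10**scale) in
    little-endian two's complement. Hypothetical helper for illustration."""
    if not value.is_finite():
        raise ValueError("NaN and infinity cannot be encoded")
    sign, digits, exponent = value.as_tuple()
    shift = exponent + scale
    if shift < 0:
        raise ValueError("value has more fractional digits than the scale allows")
    magnitude = int("".join(map(str, digits))) * 10 ** shift
    unscaled = -magnitude if sign else magnitude
    # to_bytes raises OverflowError if the integer does not fit the width.
    return unscaled.to_bytes(bit_width // 8, "little", signed=True)
```

With these definitions, `max_precision(128)` gives 38 and `max_precision(256)` gives 76, which is why language bindings that only handle 16-byte decimals need new code paths for the 32-byte layout.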





[jira] [Created] (ARROW-10599) Prebuilt distributions (aka. pyarrow and libarrow-dev) should use the same ABI (with or without the DUAL abi)

2020-11-15 Thread Tao He (Jira)
Tao He created ARROW-10599:
--

 Summary: Prebuilt distributions (aka. pyarrow and libarrow-dev) 
should use the same ABI (with or without the DUAL abi)
 Key: ARROW-10599
 URL: https://issues.apache.org/jira/browse/ARROW-10599
 Project: Apache Arrow
  Issue Type: New Feature
  Components: C++, Python
Affects Versions: 2.0.0, 1.0.1, 0.17.0
Reporter: Tao He


I have observed that the Python release (pyarrow) and the C++ release 
(libarrow-dev for Ubuntu) are built with different GCC ABIs.

The former, pyarrow, is built within the manylinux1 environment using gcc-4.8, 
whereas the latter's ABI has a `[cxx11]` tag. That prevents users from 
developing Python C extensions that depend on libarrow-dev. For example, 
suppose we have developed a library A in C++ that uses Arrow's `arrow::Buffer` 
from libarrow-dev, and wrapped it with something like `pybind11` into a Python 
module `liba`. After `liba` is built on a stock Ubuntu system (where 
libarrow-dev can be installed with apt-get), a user who imports both `liba` and 
`pyarrow` in the same Python script will find that it does not work correctly 
because of the ABI conflict (especially where strings are involved).

I can see two options to make it work:

1. Build Arrow's Python package with static linking, so that pyarrow does not 
contain so many shared libraries (libarrow.so, libarrow_python.so, etc.).
2. Distribute `libarrow-dev` built with `-D_GLIBCXX_USE_CXX11_ABI=0`.

I'm also wondering whether there are any technical issues that prevent 
distributing the packages for different languages with the same ABI.
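
One way to check which ABI a given shared library was built with is to look for the markers that libstdc++ leaves in mangled symbol names under the new ABI: the `std::__cxx11` inline namespace (mangled as `7__cxx11`) and the `[abi:cxx11]` tag (mangled as `B5cxx11`). The sketch below only classifies symbol names that have already been extracted, for example with `nm -D`. The function name `uses_cxx11_abi` and the sample symbols are illustrative assumptions and are not taken from the Arrow libraries.

```python
from typing import Iterable

# Substrings that appear in mangled names only under the new libstdc++ ABI.
CXX11_MARKERS = ("7__cxx11", "B5cxx11")


def uses_cxx11_abi(mangled_symbols: Iterable[str]) -> bool:
    """Return True if any symbol carries a cxx11 ABI marker.

    A False result is not proof of the old ABI: a library that exposes no
    std::string or std::list in its interface has no such markers either.
    """
    return any(
        marker in symbol
        for symbol in mangled_symbols
        for marker in CXX11_MARKERS
    )


# Hypothetical symbols for a function A::foo(std::string) under each ABI.
new_abi_symbols = ["_ZN1A3fooENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE"]
old_abi_symbols = ["_ZN1A3fooESs"]
```

The two sample names show why mixing the builds fails at load or call time: the same C++ declaration mangles to different symbols, so an extension compiled against one ABI cannot resolve string-taking functions from a library compiled with the other.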





[jira] [Created] (ARROW-10598) [C++] Improve performance of GenerateBitsUnrolled

2020-11-15 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-10598:


 Summary: [C++] Improve performance of GenerateBitsUnrolled 
 Key: ARROW-10598
 URL: https://issues.apache.org/jira/browse/ARROW-10598
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Reporter: Wes McKinney
Assignee: Wes McKinney
 Fix For: 3.0.0


`internal::GenerateBitsUnrolled` doesn't vectorize very well; there are some 
improvements we can make to get better code generation.





[jira] [Created] (ARROW-10597) [Rust] [DataFusion] Enable all clippy lints

2020-11-15 Thread Andrew Lamb (Jira)
Andrew Lamb created ARROW-10597:
---

 Summary: [Rust] [DataFusion] Enable all clippy lints
 Key: ARROW-10597
 URL: https://issues.apache.org/jira/browse/ARROW-10597
 Project: Apache Arrow
  Issue Type: Improvement
Reporter: Andrew Lamb


The idea is to get the code to the point where all clippy lints can be enabled.

Here is the list of clippy lints that were disabled:

https://github.com/apache/arrow/pull/8666/files#diff-678f16a0c102ed15656bd00d98c47755920c8c4ede296e3bf5c09a0d4f38a42cR19

The goal of this ticket is to enable them (it may be better to break the work 
into multiple PRs/subtasks).





[jira] [Created] (ARROW-10596) [Rust] Improve take benchmark

2020-11-15 Thread Jorge Leitão (Jira)
Jorge Leitão created ARROW-10596:


 Summary: [Rust] Improve take benchmark
 Key: ARROW-10596
 URL: https://issues.apache.org/jira/browse/ARROW-10596
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust
Reporter: Jorge Leitão
Assignee: Jorge Leitão


The take benchmark has three issues:

1. It always takes the same element (indices is a constant vector), which 
makes it very easy to speculatively predict what the element is going to be.
2. It does not compare arrays with nulls against arrays without nulls.
3. All elements of the array are equal, which is again easy to speculatively 
predict.
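
The three points above can be sketched as an input generator, written in Python for brevity rather than in the Rust of the actual benchmark. It draws random indices, random values, and an optional fraction of nulls. The names `make_take_inputs` and `take`, the seed, and the value range are assumptions made for illustration only.

```python
import random
from typing import List, Optional, Tuple


def make_take_inputs(
    size: int, null_fraction: float = 0.0, seed: int = 42
) -> Tuple[List[Optional[int]], List[int]]:
    """Build (values, indices) that avoid the three issues listed above."""
    rng = random.Random(seed)  # fixed seed keeps benchmark runs comparable
    # Issues 2 and 3: values vary, and a configurable share of them is null.
    values: List[Optional[int]] = [
        None if rng.random() < null_fraction else rng.randrange(1_000_000)
        for _ in range(size)
    ]
    # Issue 1: indices are random instead of a constant vector.
    indices = [rng.randrange(size) for _ in range(size)]
    return values, indices


def take(values: List[Optional[int]], indices: List[int]) -> List[Optional[int]]:
    """Reference take: gather values at the given positions."""
    return [values[i] for i in indices]
```

Running the same `take` over `make_take_inputs(n, 0.0)` and `make_take_inputs(n, 0.5)` gives the with-nulls versus without-nulls comparison that the second point asks for.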





[jira] [Created] (ARROW-10595) Simplify inner loop of min/max kernels for non-null case

2020-11-15 Thread Daniël Heres (Jira)
Daniël Heres created ARROW-10595:


 Summary: Simplify inner loop of min/max kernels for non-null case
 Key: ARROW-10595
 URL: https://issues.apache.org/jira/browse/ARROW-10595
 Project: Apache Arrow
  Issue Type: Improvement
Reporter: Daniël Heres


The inner loop of min/max kernels can be slightly simplified by removing a 
conditional.
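
The ticket does not say which conditional is meant. As a rough illustration of the general pattern only (a sketch in Python, not the kernel's actual code), a min/max loop for an array known to have no nulls can skip the per-element validity check entirely. The function name `min_max` and the list-based validity mask are assumptions.

```python
from typing import List, Optional, Sequence, Tuple


def min_max(
    values: Sequence[int], validity: Optional[List[bool]] = None
) -> Optional[Tuple[int, int]]:
    """Return (min, max) over the valid values, or None if there are none."""
    if validity is None:
        # Non-null case: no validity check inside the loop.
        if not values:
            return None
        lo = hi = values[0]
        for v in values[1:]:
            if v < lo:
                lo = v
            elif v > hi:
                hi = v
        return lo, hi

    # Nullable case: each element needs an extra branch on its validity bit.
    lo = hi = None
    for v, valid in zip(values, validity):
        if not valid:
            continue
        if lo is None:
            lo = hi = v
        elif v < lo:
            lo = v
        elif v > hi:
            hi = v
    return None if lo is None else (lo, hi)
```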





[jira] [Created] (ARROW-10594) [Rust] Make take kernel not take values of childs when taking a null

2020-11-15 Thread Jorge Leitão (Jira)
Jorge Leitão created ARROW-10594:


 Summary: [Rust] Make take kernel not take values of childs when 
taking a null
 Key: ARROW-10594
 URL: https://issues.apache.org/jira/browse/ARROW-10594
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust
Reporter: Jorge Leitão
Assignee: Jorge Leitão


Currently, take takes all values from the children, irrespective of whether 
we took a null or not.
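
A minimal sketch of the intended behaviour, in Python rather than Rust and with a simplified representation assumed for illustration: a struct array is modelled as a validity list plus a dict of child columns, and `take_struct` (a hypothetical name) reads a child value only when the selected parent slot is valid.

```python
from typing import Any, Dict, List, Tuple


def take_struct(
    validity: List[bool],
    children: Dict[str, List[Any]],
    indices: List[int],
) -> Tuple[List[bool], Dict[str, List[Any]]]:
    """Take from a struct array without reading children of null slots."""
    out_validity = [validity[i] for i in indices]
    out_children = {
        name: [column[i] if validity[i] else None for i in indices]
        for name, column in children.items()
    }
    return out_validity, out_children
```

Whatever sits in a child column behind a null parent slot is never copied into the result; a `None` placeholder is emitted instead.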





[jira] [Created] (ARROW-10593) [Rust] Fix take for lists

2020-11-15 Thread Jorge Leitão (Jira)
Jorge Leitão created ARROW-10593:


 Summary: [Rust] Fix take for lists
 Key: ARROW-10593
 URL: https://issues.apache.org/jira/browse/ARROW-10593
 Project: Apache Arrow
  Issue Type: Bug
  Components: Rust
Reporter: Jorge Leitão
Assignee: Jorge Leitão


Currently, take of lists has a bug when the list contains nulls (in the 
bitmap, not in the values).





[jira] [Created] (ARROW-10592) [Rust] Fix error in taking from structArrays with nulls

2020-11-15 Thread Jorge Leitão (Jira)
Jorge Leitão created ARROW-10592:


 Summary: [Rust] Fix error in taking from structArrays with nulls
 Key: ARROW-10592
 URL: https://issues.apache.org/jira/browse/ARROW-10592
 Project: Apache Arrow
  Issue Type: Bug
Reporter: Jorge Leitão
Assignee: Jorge Leitão


Take currently does not take nulls into account.





[jira] [Created] (ARROW-10591) [Rust] Add support to filter structArrays

2020-11-15 Thread Jorge Leitão (Jira)
Jorge Leitão created ARROW-10591:


 Summary: [Rust] Add support to filter structArrays
 Key: ARROW-10591
 URL: https://issues.apache.org/jira/browse/ARROW-10591
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust
Reporter: Jorge Leitão
Assignee: Jorge Leitão








[jira] [Created] (ARROW-10590) [Rust] Remove Date32(Millisecond) from test

2020-11-15 Thread Jorge Leitão (Jira)
Jorge Leitão created ARROW-10590:


 Summary: [Rust] Remove Date32(Millisecond) from test
 Key: ARROW-10590
 URL: https://issues.apache.org/jira/browse/ARROW-10590
 Project: Apache Arrow
  Issue Type: Improvement
Reporter: Jorge Leitão
Assignee: Jorge Leitão


This is not supported by the Arrow specification, and `make_array` does not 
support it.


