[jira] [Created] (ARROW-3742) Fix gandiva cython bindings

2018-11-09 Thread Siyuan Zhuang (JIRA)
Siyuan Zhuang created ARROW-3742:


 Summary: Fix gandiva cython bindings
 Key: ARROW-3742
 URL: https://issues.apache.org/jira/browse/ARROW-3742
 Project: Apache Arrow
  Issue Type: Bug
Reporter: Siyuan Zhuang


After updating the gandiva cpp part (ARROW-3587), the cython bindings 
(ARROW-3602) are not consistent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (ARROW-3742) Fix pyarrow.types & gandiva cython bindings

2018-11-09 Thread Siyuan Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siyuan Zhuang updated ARROW-3742:
-
Component/s: Python
 Gandiva

> Fix pyarrow.types & gandiva cython bindings
> ---
>
> Key: ARROW-3742
> URL: https://issues.apache.org/jira/browse/ARROW-3742
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Gandiva, Python
>Reporter: Siyuan Zhuang
>Assignee: Siyuan Zhuang
>Priority: Major
>
> 1. 'types.py' didn't export `_as_type`, causing failures in certain 
> cython/python combinations. I am surprised to see that the CI didn't fail.
> 2. After updating the gandiva cpp part (ARROW-3587), the cython bindings 
> (ARROW-3602) are not consistent.





[jira] [Updated] (ARROW-3742) Fix pyarrow.types & gandiva cython bindings

2018-11-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-3742:
--
Labels: pull-request-available  (was: )

> Fix pyarrow.types & gandiva cython bindings
> ---
>
> Key: ARROW-3742
> URL: https://issues.apache.org/jira/browse/ARROW-3742
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Gandiva, Python
>Reporter: Siyuan Zhuang
>Assignee: Siyuan Zhuang
>Priority: Major
>  Labels: pull-request-available
>
> 1. 'types.py' didn't export `_as_type`, causing failures in certain 
> cython/python combinations. I am surprised to see that the CI didn't fail.
> 2. After updating the gandiva cpp part (ARROW-3587), the cython bindings 
> (ARROW-3602) are not consistent.





[jira] [Created] (ARROW-3746) [Gandiva] [Python] Make it possible to list all functions registered with Gandiva

2018-11-09 Thread Philipp Moritz (JIRA)
Philipp Moritz created ARROW-3746:
-

 Summary: [Gandiva] [Python] Make it possible to list all functions 
registered with Gandiva
 Key: ARROW-3746
 URL: https://issues.apache.org/jira/browse/ARROW-3746
 Project: Apache Arrow
  Issue Type: Improvement
Reporter: Philipp Moritz


This will also be useful for documentation purposes (right now it is not very 
easy to get a list of all the functions that are registered).





[jira] [Created] (ARROW-3745) [C++] CMake passes static libraries multiple times to linker

2018-11-09 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-3745:
---

 Summary: [C++] CMake passes static libraries multiple times to 
linker
 Key: ARROW-3745
 URL: https://issues.apache.org/jira/browse/ARROW-3745
 Project: Apache Arrow
  Issue Type: Bug
  Components: C#, C++
Affects Versions: 0.11.1
Reporter: Wes McKinney


With {{make array-test}} I see

{code}
[ 97%] Building CXX object src/arrow/CMakeFiles/array-test.dir/array-test.cc.o
cd /home/wesm/code/arrow/cpp/build-test/src/arrow && /usr/bin/ccache 
/usr/bin/clang++-6.0  -DARROW_JEMALLOC 
-DARROW_JEMALLOC_INCLUDE_DIR=/home/wesm/code/arrow/cpp/build-test/jemalloc_ep-prefix/src/jemalloc_ep/dist//include
 -DARROW_WITH_BROTLI -DARROW_WITH_LZ4 -DARROW_WITH_SNAPPY -DARROW_WITH_ZLIB 
-DARROW_WITH_ZSTD -isystem /home/wesm/cpp-toolchain/include -isystem 
/home/wesm/code/arrow/cpp/build-test/double-conversion_ep/src/double-conversion_ep/include
 -isystem /home/wesm/code/arrow/cpp/build-test/jemalloc_ep-prefix/src -isystem 
/home/wesm/code/arrow/cpp/thirdparty/hadoop/include 
-I/home/wesm/code/arrow/cpp/build-test/src -I/home/wesm/code/arrow/cpp/src  
-std=c++11  -Qunused-arguments -ggdb -O0  -Wall -Wno-unknown-warning-option 
-msse3 -maltivec  -g   -std=gnu++11 -o 
CMakeFiles/array-test.dir/array-test.cc.o -c 
/home/wesm/code/arrow/cpp/src/arrow/array-test.cc
[100%] Linking CXX executable ../../debug/array-test
cd /home/wesm/code/arrow/cpp/build-test/src/arrow && 
/home/wesm/cpp-toolchain/bin/cmake -E cmake_link_script 
CMakeFiles/array-test.dir/link.txt --verbose=1
/usr/bin/ccache /usr/bin/clang++-6.0   -std=c++11  -Qunused-arguments -ggdb -O0 
 -Wall -Wno-unknown-warning-option -msse3 -maltivec  -g  -rdynamic 
CMakeFiles/array-test.dir/array-test.cc.o  -o ../../debug/array-test 
-Wl,-rpath,/home/wesm/cpp-toolchain/lib ../../debug/libarrow.a 
/home/wesm/cpp-toolchain/lib/libgtest_main.a 
/home/wesm/cpp-toolchain/lib/libgtest.a -ldl 
/home/wesm/cpp-toolchain/lib/libglog.a /home/wesm/cpp-toolchain/lib/libglog.a 
/home/wesm/cpp-toolchain/lib/libzstd.a /home/wesm/cpp-toolchain/lib/libzstd.a 
/home/wesm/cpp-toolchain/lib/libz.so /home/wesm/cpp-toolchain/lib/libz.so 
/home/wesm/cpp-toolchain/lib/libsnappy.a 
/home/wesm/cpp-toolchain/lib/libsnappy.a /home/wesm/cpp-toolchain/lib/liblz4.a 
/home/wesm/cpp-toolchain/lib/liblz4.a 
/home/wesm/cpp-toolchain/lib/libbrotlidec-static.a 
/home/wesm/cpp-toolchain/lib/libbrotlidec-static.a 
/home/wesm/cpp-toolchain/lib/libbrotlienc-static.a 
/home/wesm/cpp-toolchain/lib/libbrotlienc-static.a 
/home/wesm/cpp-toolchain/lib/libbrotlicommon-static.a 
/home/wesm/cpp-toolchain/lib/libbrotlicommon-static.a 
../../double-conversion_ep/src/double-conversion_ep/lib/libdouble-conversion.a 
../../double-conversion_ep/src/double-conversion_ep/lib/libdouble-conversion.a 
/home/wesm/cpp-toolchain/lib/libboost_system.so 
/home/wesm/cpp-toolchain/lib/libboost_filesystem.so 
/home/wesm/cpp-toolchain/lib/libboost_regex.so 
../../jemalloc_ep-prefix/src/jemalloc_ep/dist//lib/libjemalloc_pic.a -lpthread 
-lrt /usr/lib/x86_64-linux-gnu/libpthread.so 
/home/wesm/cpp-toolchain/lib/libgtest_main.a 
/home/wesm/cpp-toolchain/lib/libgtest.a 
{code}

Note how some of the static libraries are passed multiple times
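
One way to check a link line like the one above for repeats is to tokenize it and count the library arguments. The following is a minimal Python sketch, not part of the Arrow build; the helper name `duplicated_libraries` and the shortened sample command are made up for illustration.

```python
import shlex
from collections import Counter


def duplicated_libraries(link_command):
    """Return {library: count} for library arguments that appear more than once.

    An argument counts as a library if it ends in .a or .so or starts with -l.
    """
    tokens = shlex.split(link_command)
    libs = [t for t in tokens if t.endswith((".a", ".so")) or t.startswith("-l")]
    counts = Counter(libs)
    return {lib: n for lib, n in counts.items() if n > 1}


# Shortened, hypothetical link line modeled on the output above.
sample = (
    "clang++ array-test.cc.o -o array-test libarrow.a "
    "libglog.a libglog.a libzstd.a libzstd.a libz.so libz.so "
    "-lpthread -lrt libgtest.a"
)

for lib, n in sorted(duplicated_libraries(sample).items()):
    print(f"{lib}: passed {n} times")
```

Running the same function on the contents of a real `link.txt` would list every library that CMake emitted more than once.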





[jira] [Created] (ARROW-3750) [R] Pass various wrapped Arrow objects created in Python into R with zero copy via reticulate

2018-11-09 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-3750:
---

 Summary: [R] Pass various wrapped Arrow objects created in Python 
into R with zero copy via reticulate
 Key: ARROW-3750
 URL: https://issues.apache.org/jira/browse/ARROW-3750
 Project: Apache Arrow
  Issue Type: New Feature
  Components: R
Reporter: Wes McKinney


A user may wish to use some functionality available only in pyarrow using 
reticulate; it would be useful to be able to construct an R wrapper object to 
the C++ object inside the corresponding Python type, e.g. {{pyarrow.Table}}. 

This probably will require some new functions to return the memory address of 
the shared_ptr/unique_ptr inside the Cython types so that a function on the R 
side can copy the smart pointer and create the corresponding R wrapper type.

cc [~pitrou]





[jira] [Created] (ARROW-3747) [C++] Flip order of data members in arrow::Decimal128

2018-11-09 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-3747:
---

 Summary: [C++] Flip order of data members in arrow::Decimal128
 Key: ARROW-3747
 URL: https://issues.apache.org/jira/browse/ARROW-3747
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Reporter: Wes McKinney
 Fix For: 0.12.0


As discussed in https://github.com/apache/arrow/pull/2845, this will enable a 
data buffer to be correctly interpreted as {{Decimal128*}}, so memcpy and 
other operations will work.
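
To illustrate why member order matters, here is a small Python sketch (an illustration, not Arrow code). It assumes a 128-bit two's-complement value stored as two 64-bit words on a little-endian machine, and that the intended layout puts the low word first; the issue text itself does not spell out the order, and the function names are hypothetical.

```python
import struct

MASK64 = (1 << 64) - 1


def pack_low_first(value):
    """Pack a signed 128-bit integer as (low uint64, high int64), little-endian."""
    low = value & MASK64
    high = value >> 64  # arithmetic shift keeps the sign in the high word
    return struct.pack("<Qq", low, high)


def pack_high_first(value):
    """Same two words, but with the high word stored first."""
    low = value & MASK64
    high = value >> 64
    return struct.pack("<qQ", high, low)


value = -12345678901234567890123
flat = value.to_bytes(16, "little", signed=True)

# Low word first matches the flat 16-byte little-endian encoding,
# so a raw buffer of such values can be read in place.
print(pack_low_first(value) == flat)
# High word first does not match, so the buffer cannot be reinterpreted directly.
print(pack_high_first(value) == flat)
```

Under these assumptions, only the low-word-first layout lets a raw 16-byte buffer be treated as the in-memory object without rearranging bytes.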





[jira] [Assigned] (ARROW-3745) [C++] CMake passes static libraries multiple times to linker

2018-11-09 Thread Wes McKinney (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned ARROW-3745:
---

Assignee: Wes McKinney

> [C++] CMake passes static libraries multiple times to linker
> 
>
> Key: ARROW-3745
> URL: https://issues.apache.org/jira/browse/ARROW-3745
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C#, C++
>Affects Versions: 0.11.1
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>Priority: Major
> Fix For: 0.12.0
>
>
> With {{make array-test}} I see
> {code}
> [ 97%] Building CXX object src/arrow/CMakeFiles/array-test.dir/array-test.cc.o
> cd /home/wesm/code/arrow/cpp/build-test/src/arrow && /usr/bin/ccache 
> /usr/bin/clang++-6.0  -DARROW_JEMALLOC 
> -DARROW_JEMALLOC_INCLUDE_DIR=/home/wesm/code/arrow/cpp/build-test/jemalloc_ep-prefix/src/jemalloc_ep/dist//include
>  -DARROW_WITH_BROTLI -DARROW_WITH_LZ4 -DARROW_WITH_SNAPPY -DARROW_WITH_ZLIB 
> -DARROW_WITH_ZSTD -isystem /home/wesm/cpp-toolchain/include -isystem 
> /home/wesm/code/arrow/cpp/build-test/double-conversion_ep/src/double-conversion_ep/include
>  -isystem /home/wesm/code/arrow/cpp/build-test/jemalloc_ep-prefix/src 
> -isystem /home/wesm/code/arrow/cpp/thirdparty/hadoop/include 
> -I/home/wesm/code/arrow/cpp/build-test/src -I/home/wesm/code/arrow/cpp/src  
> -std=c++11  -Qunused-arguments -ggdb -O0  -Wall -Wno-unknown-warning-option 
> -msse3 -maltivec  -g   -std=gnu++11 -o 
> CMakeFiles/array-test.dir/array-test.cc.o -c 
> /home/wesm/code/arrow/cpp/src/arrow/array-test.cc
> [100%] Linking CXX executable ../../debug/array-test
> cd /home/wesm/code/arrow/cpp/build-test/src/arrow && 
> /home/wesm/cpp-toolchain/bin/cmake -E cmake_link_script 
> CMakeFiles/array-test.dir/link.txt --verbose=1
> /usr/bin/ccache /usr/bin/clang++-6.0   -std=c++11  -Qunused-arguments -ggdb 
> -O0  -Wall -Wno-unknown-warning-option -msse3 -maltivec  -g  -rdynamic 
> CMakeFiles/array-test.dir/array-test.cc.o  -o ../../debug/array-test 
> -Wl,-rpath,/home/wesm/cpp-toolchain/lib ../../debug/libarrow.a 
> /home/wesm/cpp-toolchain/lib/libgtest_main.a 
> /home/wesm/cpp-toolchain/lib/libgtest.a -ldl 
> /home/wesm/cpp-toolchain/lib/libglog.a /home/wesm/cpp-toolchain/lib/libglog.a 
> /home/wesm/cpp-toolchain/lib/libzstd.a /home/wesm/cpp-toolchain/lib/libzstd.a 
> /home/wesm/cpp-toolchain/lib/libz.so /home/wesm/cpp-toolchain/lib/libz.so 
> /home/wesm/cpp-toolchain/lib/libsnappy.a 
> /home/wesm/cpp-toolchain/lib/libsnappy.a 
> /home/wesm/cpp-toolchain/lib/liblz4.a /home/wesm/cpp-toolchain/lib/liblz4.a 
> /home/wesm/cpp-toolchain/lib/libbrotlidec-static.a 
> /home/wesm/cpp-toolchain/lib/libbrotlidec-static.a 
> /home/wesm/cpp-toolchain/lib/libbrotlienc-static.a 
> /home/wesm/cpp-toolchain/lib/libbrotlienc-static.a 
> /home/wesm/cpp-toolchain/lib/libbrotlicommon-static.a 
> /home/wesm/cpp-toolchain/lib/libbrotlicommon-static.a 
> ../../double-conversion_ep/src/double-conversion_ep/lib/libdouble-conversion.a
>  
> ../../double-conversion_ep/src/double-conversion_ep/lib/libdouble-conversion.a
>  /home/wesm/cpp-toolchain/lib/libboost_system.so 
> /home/wesm/cpp-toolchain/lib/libboost_filesystem.so 
> /home/wesm/cpp-toolchain/lib/libboost_regex.so 
> ../../jemalloc_ep-prefix/src/jemalloc_ep/dist//lib/libjemalloc_pic.a 
> -lpthread -lrt /usr/lib/x86_64-linux-gnu/libpthread.so 
> /home/wesm/cpp-toolchain/lib/libgtest_main.a 
> /home/wesm/cpp-toolchain/lib/libgtest.a 
> {code}
> Note how some of the static libraries are passed multiple times





[jira] [Updated] (ARROW-3745) [C++] CMake passes static libraries multiple times to linker

2018-11-09 Thread Wes McKinney (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-3745:

Fix Version/s: 0.12.0

> [C++] CMake passes static libraries multiple times to linker
> 
>
> Key: ARROW-3745
> URL: https://issues.apache.org/jira/browse/ARROW-3745
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C#, C++
>Affects Versions: 0.11.1
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>Priority: Major
> Fix For: 0.12.0
>
>
> With {{make array-test}} I see
> {code}
> [ 97%] Building CXX object src/arrow/CMakeFiles/array-test.dir/array-test.cc.o
> cd /home/wesm/code/arrow/cpp/build-test/src/arrow && /usr/bin/ccache 
> /usr/bin/clang++-6.0  -DARROW_JEMALLOC 
> -DARROW_JEMALLOC_INCLUDE_DIR=/home/wesm/code/arrow/cpp/build-test/jemalloc_ep-prefix/src/jemalloc_ep/dist//include
>  -DARROW_WITH_BROTLI -DARROW_WITH_LZ4 -DARROW_WITH_SNAPPY -DARROW_WITH_ZLIB 
> -DARROW_WITH_ZSTD -isystem /home/wesm/cpp-toolchain/include -isystem 
> /home/wesm/code/arrow/cpp/build-test/double-conversion_ep/src/double-conversion_ep/include
>  -isystem /home/wesm/code/arrow/cpp/build-test/jemalloc_ep-prefix/src 
> -isystem /home/wesm/code/arrow/cpp/thirdparty/hadoop/include 
> -I/home/wesm/code/arrow/cpp/build-test/src -I/home/wesm/code/arrow/cpp/src  
> -std=c++11  -Qunused-arguments -ggdb -O0  -Wall -Wno-unknown-warning-option 
> -msse3 -maltivec  -g   -std=gnu++11 -o 
> CMakeFiles/array-test.dir/array-test.cc.o -c 
> /home/wesm/code/arrow/cpp/src/arrow/array-test.cc
> [100%] Linking CXX executable ../../debug/array-test
> cd /home/wesm/code/arrow/cpp/build-test/src/arrow && 
> /home/wesm/cpp-toolchain/bin/cmake -E cmake_link_script 
> CMakeFiles/array-test.dir/link.txt --verbose=1
> /usr/bin/ccache /usr/bin/clang++-6.0   -std=c++11  -Qunused-arguments -ggdb 
> -O0  -Wall -Wno-unknown-warning-option -msse3 -maltivec  -g  -rdynamic 
> CMakeFiles/array-test.dir/array-test.cc.o  -o ../../debug/array-test 
> -Wl,-rpath,/home/wesm/cpp-toolchain/lib ../../debug/libarrow.a 
> /home/wesm/cpp-toolchain/lib/libgtest_main.a 
> /home/wesm/cpp-toolchain/lib/libgtest.a -ldl 
> /home/wesm/cpp-toolchain/lib/libglog.a /home/wesm/cpp-toolchain/lib/libglog.a 
> /home/wesm/cpp-toolchain/lib/libzstd.a /home/wesm/cpp-toolchain/lib/libzstd.a 
> /home/wesm/cpp-toolchain/lib/libz.so /home/wesm/cpp-toolchain/lib/libz.so 
> /home/wesm/cpp-toolchain/lib/libsnappy.a 
> /home/wesm/cpp-toolchain/lib/libsnappy.a 
> /home/wesm/cpp-toolchain/lib/liblz4.a /home/wesm/cpp-toolchain/lib/liblz4.a 
> /home/wesm/cpp-toolchain/lib/libbrotlidec-static.a 
> /home/wesm/cpp-toolchain/lib/libbrotlidec-static.a 
> /home/wesm/cpp-toolchain/lib/libbrotlienc-static.a 
> /home/wesm/cpp-toolchain/lib/libbrotlienc-static.a 
> /home/wesm/cpp-toolchain/lib/libbrotlicommon-static.a 
> /home/wesm/cpp-toolchain/lib/libbrotlicommon-static.a 
> ../../double-conversion_ep/src/double-conversion_ep/lib/libdouble-conversion.a
>  
> ../../double-conversion_ep/src/double-conversion_ep/lib/libdouble-conversion.a
>  /home/wesm/cpp-toolchain/lib/libboost_system.so 
> /home/wesm/cpp-toolchain/lib/libboost_filesystem.so 
> /home/wesm/cpp-toolchain/lib/libboost_regex.so 
> ../../jemalloc_ep-prefix/src/jemalloc_ep/dist//lib/libjemalloc_pic.a 
> -lpthread -lrt /usr/lib/x86_64-linux-gnu/libpthread.so 
> /home/wesm/cpp-toolchain/lib/libgtest_main.a 
> /home/wesm/cpp-toolchain/lib/libgtest.a 
> {code}
> Note how some of the static libraries are passed multiple times





[jira] [Resolved] (ARROW-3742) Fix pyarrow.types & gandiva cython bindings

2018-11-09 Thread Philipp Moritz (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philipp Moritz resolved ARROW-3742.
---
   Resolution: Fixed
Fix Version/s: 0.12.0

Issue resolved by pull request 2931
[https://github.com/apache/arrow/pull/2931]

> Fix pyarrow.types & gandiva cython bindings
> ---
>
> Key: ARROW-3742
> URL: https://issues.apache.org/jira/browse/ARROW-3742
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Gandiva, Python
>Reporter: Siyuan Zhuang
>Assignee: Siyuan Zhuang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> 1. 'types.py' didn't export `_as_type`, causing failures in certain 
> cython/python combinations. I am surprised to see that the CI didn't fail.
> 2. After updating the gandiva cpp part (ARROW-3587), the cython bindings 
> (ARROW-3602) are not consistent.





[jira] [Resolved] (ARROW-2673) [Python] Add documentation + docstring for ARROW-2661

2018-11-09 Thread Wes McKinney (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-2673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-2673.
-
Resolution: Fixed

Issue resolved by pull request 2893
[https://github.com/apache/arrow/pull/2893]

> [Python] Add documentation + docstring for ARROW-2661
> -
>
> Key: ARROW-2673
> URL: https://issues.apache.org/jira/browse/ARROW-2673
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Wes McKinney
>Assignee: Matt Topol
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>






[jira] [Created] (ARROW-3744) [Ruby] Use garrow_table_to_string() in Arrow::Table#to_s

2018-11-09 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-3744:
---

 Summary: [Ruby] Use garrow_table_to_string() in Arrow::Table#to_s
 Key: ARROW-3744
 URL: https://issues.apache.org/jira/browse/ARROW-3744
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Ruby
Reporter: Kouhei Sutou
Assignee: Kouhei Sutou








[jira] [Updated] (ARROW-3744) [Ruby] Use garrow_table_to_string() in Arrow::Table#to_s

2018-11-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-3744:
--
Labels: pull-request-available  (was: )

> [Ruby] Use garrow_table_to_string() in Arrow::Table#to_s
> 
>
> Key: ARROW-3744
> URL: https://issues.apache.org/jira/browse/ARROW-3744
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Ruby
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Minor
>  Labels: pull-request-available
>






[jira] [Updated] (ARROW-3749) [GLib] Typos in documentation and test case name

2018-11-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-3749:
--
Labels: pull-request-available  (was: )

> [GLib] Typos in documentation and test case name
> 
>
> Key: ARROW-3749
> URL: https://issues.apache.org/jira/browse/ARROW-3749
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: GLib
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Trivial
>  Labels: pull-request-available
>






[jira] [Created] (ARROW-3748) [GLib] Add GArrowCSVReader

2018-11-09 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-3748:
---

 Summary: [GLib] Add GArrowCSVReader
 Key: ARROW-3748
 URL: https://issues.apache.org/jira/browse/ARROW-3748
 Project: Apache Arrow
  Issue Type: Improvement
  Components: GLib
Reporter: Kouhei Sutou
Assignee: Kouhei Sutou








[jira] [Updated] (ARROW-3748) [GLib] Add GArrowCSVReader

2018-11-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-3748:
--
Labels: pull-request-available  (was: )

> [GLib] Add GArrowCSVReader
> --
>
> Key: ARROW-3748
> URL: https://issues.apache.org/jira/browse/ARROW-3748
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: GLib
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Minor
>  Labels: pull-request-available
>






[jira] [Resolved] (ARROW-3716) [R] Missing cases for ChunkedArray conversion

2018-11-09 Thread Wes McKinney (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-3716.
-
   Resolution: Fixed
Fix Version/s: 0.12.0

Issue resolved by pull request 2928
[https://github.com/apache/arrow/pull/2928]

> [R] Missing cases for ChunkedArray conversion
> -
>
> Key: ARROW-3716
> URL: https://issues.apache.org/jira/browse/ARROW-3716
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: R
>Reporter: Romain François
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> {code}
>  library(arrow)
>  tab <- table(iris)
>  tab$schema()
>  #> arrow::Schema 
>  #> Sepal.Length: double
>  #> Sepal.Width: double
>  #> Petal.Length: double
>  #> Petal.Width: double
>  #> Species: dictionary
> as_tibble(tab)
>  #> Error in Table__to_dataframe(x): cannot handle Array of type 26
>  # simpler reprex:
>  a <- chunked_array(factor(c("a", "b")))
>  a$as_vector()
>  #> Error in ChunkedArray__as_vector(self): cannot handle Array of type 26
> {code}
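
The type that fails here is presumably the dictionary type shown for Species in the schema, which is what an R factor maps to. As a rough illustration of what a conversion for that case has to do, the following Python sketch (plain lists, no Arrow API; all names are hypothetical) decodes dictionary-encoded chunks into one vector of labels.

```python
def decode_dictionary_chunks(chunks):
    """Decode dictionary-encoded chunks into a flat list of values.

    Each chunk is an (indices, dictionary) pair; a None index stands for a
    missing value and is passed through unchanged.
    """
    values = []
    for indices, dictionary in chunks:
        for i in indices:
            values.append(None if i is None else dictionary[i])
    return values


# Two chunks sharing the same dictionary, similar to factor(c("a", "b")).
levels = ["a", "b"]
chunks = [([0, 1], levels), ([1, None, 0], levels)]
print(decode_dictionary_chunks(chunks))
```

A real conversion to an R factor would keep the integer codes and attach the dictionary as levels rather than materializing the labels, but the per-chunk lookup is the same idea.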





[jira] [Assigned] (ARROW-3716) [R] Missing cases for ChunkedArray conversion

2018-11-09 Thread Wes McKinney (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned ARROW-3716:
---

Assignee: Romain François

> [R] Missing cases for ChunkedArray conversion
> -
>
> Key: ARROW-3716
> URL: https://issues.apache.org/jira/browse/ARROW-3716
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: R
>Reporter: Romain François
>Assignee: Romain François
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> {code}
>  library(arrow)
>  tab <- table(iris)
>  tab$schema()
>  #> arrow::Schema 
>  #> Sepal.Length: double
>  #> Sepal.Width: double
>  #> Petal.Length: double
>  #> Petal.Width: double
>  #> Species: dictionary
> as_tibble(tab)
>  #> Error in Table__to_dataframe(x): cannot handle Array of type 26
>  # simpler reprex:
>  a <- chunked_array(factor(c("a", "b")))
>  a$as_vector()
>  #> Error in ChunkedArray__as_vector(self): cannot handle Array of type 26
> {code}





[jira] [Updated] (ARROW-3742) Fix pyarrow.types & gandiva cython bindings

2018-11-09 Thread Siyuan Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siyuan Zhuang updated ARROW-3742:
-
Description: 
1. 'types.py' didn't export `_as_type`, causing failures in certain 
cython/python combinations. I am surprised to see that the CI didn't fail.
2. After updating the gandiva cpp part (ARROW-3587), the cython bindings 
(ARROW-3602) are not consistent.

  was:After updating the gandiva cpp part (ARROW-3587), the cython bindings 
(ARROW-3602) are not consistent.

Summary: Fix pyarrow.types & gandiva cython bindings  (was: Fix gandiva 
cython bindings)

> Fix pyarrow.types & gandiva cython bindings
> ---
>
> Key: ARROW-3742
> URL: https://issues.apache.org/jira/browse/ARROW-3742
> Project: Apache Arrow
>  Issue Type: Bug
>Reporter: Siyuan Zhuang
>Assignee: Siyuan Zhuang
>Priority: Major
>
> 1. 'types.py' didn't export `_as_type`, causing failures in certain 
> cython/python combinations. I am surprised to see that the CI didn't fail.
> 2. After updating the gandiva cpp part (ARROW-3587), the cython bindings 
> (ARROW-3602) are not consistent.





[jira] [Comment Edited] (ARROW-3476) [Java] mvn test in memory fails on a big-endian platform

2018-11-09 Thread Kazuaki Ishizaki (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16682228#comment-16682228
 ] 

Kazuaki Ishizaki edited comment on ARROW-3476 at 11/10/18 6:13 AM:
---

[~wesmckinn] Now, kou, pitrou, xhochy, kszucs, cpcloud, and I (@kiszk) can 
access https://ibmz-ci.osuosl.org/.


was (Author: kiszk):
[~wesmckinn] Now, kou, pitrou, xhochy, kszucs, cpcloud, and me (@kiszk) can 
access https://ibmz-ci.osuosl.org/.

> [Java] mvn test in memory fails on a big-endian platform
> 
>
> Key: ARROW-3476
> URL: https://issues.apache.org/jira/browse/ARROW-3476
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Java
>Reporter: Kazuaki Ishizaki
>Priority: Major
>
> Apache Arrow is becoming a common way to exchange data among important 
> emerging analytics frameworks such as Pandas, NumPy, and Spark.
> [IBM Z|https://en.wikipedia.org/wiki/IBM_Z] is one of the platforms used to 
> process critical transactions such as banking or credit card transactions. 
> Users of IBM Z want to extract insights from these transactions using the 
> emerging analytics systems on IBM Z Linux. These analytics pipelines could 
> also be fast and effective on IBM Z Linux by using Apache Arrow in memory.
> From a technical perspective, since IBM Z Linux uses a big-endian data 
> format, it is currently not possible to use Apache Arrow in this pipeline. If 
> Apache Arrow supported big-endian platforms, its use cases would expand.
> When I ran the Apache Arrow test cases on a big-endian platform (ppc64be), 
> {{mvn test}} in the memory module failed due to an assertion.
> In the {{TestEndianess.testLittleEndian}} test, the assertion fires during 
> the allocation of a {{RootAllocator}} instance.
> {code}
> $ uname -a
> Linux ppc64be.novalocal 4.5.7-300.fc24.ppc64 #1 SMP Fri Jun 10 20:29:32 UTC 
> 2016 ppc64 ppc64 ppc64 GNU/Linux
> $ arch  
> ppc64
> $ cd java/memory
> $ mvn test
> [INFO] Scanning for projects...
> [INFO]
>  
> [INFO] 
> 
> [INFO] Building Arrow Memory 0.12.0-SNAPSHOT
> [INFO] 
> 
> [INFO] 
> ...
> [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.082 
> s - in org.apache.arrow.memory.TestAccountant
> [INFO] Running org.apache.arrow.memory.TestLowCostIdentityHashMap
> [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.012 
> s - in org.apache.arrow.memory.TestLowCostIdentityHashMap
> [INFO] Running org.apache.arrow.memory.TestBaseAllocator
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.746 
> s <<< FAILURE! - in org.apache.arrow.memory.TestEndianess
> [ERROR] testLittleEndian(org.apache.arrow.memory.TestEndianess)  Time 
> elapsed: 0.313 s  <<< ERROR!
> java.lang.ExceptionInInitializerError
>   at 
> org.apache.arrow.memory.TestEndianess.testLittleEndian(TestEndianess.java:31)
> Caused by: java.lang.IllegalStateException: Arrow only runs on LittleEndian 
> systems.
>   at 
> org.apache.arrow.memory.TestEndianess.testLittleEndian(TestEndianess.java:31)
> [ERROR] Tests run: 22, Failures: 0, Errors: 21, Skipped: 1, Time elapsed: 
> 0.055 s <<< FAILURE! - in org.apache.arrow.memory.TestBaseAllocator
> ...
> {code}
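
For readers unfamiliar with the distinction, this short Python sketch (an illustration only, unrelated to the Java test itself) shows how the same four bytes decode to different integers under little-endian and big-endian interpretation, and how to query the host byte order.

```python
import struct
import sys

raw = bytes([0x01, 0x00, 0x00, 0x00])

little = struct.unpack("<i", raw)[0]  # least significant byte first
big = struct.unpack(">i", raw)[0]     # most significant byte first

print("host byte order:", sys.byteorder)
print("little-endian read:", little)
print("big-endian read:", big)
```

Data written by a little-endian producer and read unchanged on a big-endian host would be misinterpreted in exactly this way, which is why a library may refuse to run on such a host rather than return wrong values.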





[jira] [Updated] (ARROW-3743) [Ruby] Add support for saving/loading Feather

2018-11-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-3743:
--
Labels: pull-request-available  (was: )

> [Ruby] Add support for saving/loading Feather
> -
>
> Key: ARROW-3743
> URL: https://issues.apache.org/jira/browse/ARROW-3743
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Ruby
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Minor
>  Labels: pull-request-available
>






[jira] [Created] (ARROW-3743) [Ruby] Add support for saving/loading Feather

2018-11-09 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-3743:
---

 Summary: [Ruby] Add support for saving/loading Feather
 Key: ARROW-3743
 URL: https://issues.apache.org/jira/browse/ARROW-3743
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Ruby
Reporter: Kouhei Sutou
Assignee: Kouhei Sutou








[jira] [Created] (ARROW-3749) [GLib] Typos in documentation and test case name

2018-11-09 Thread Kouhei Sutou (JIRA)
Kouhei Sutou created ARROW-3749:
---

 Summary: [GLib] Typos in documentation and test case name
 Key: ARROW-3749
 URL: https://issues.apache.org/jira/browse/ARROW-3749
 Project: Apache Arrow
  Issue Type: Improvement
  Components: GLib
Reporter: Kouhei Sutou
Assignee: Kouhei Sutou








[jira] [Updated] (ARROW-3746) [Gandiva] [Python] Make it possible to list all functions registered with Gandiva

2018-11-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-3746:
--
Labels: pull-request-available  (was: )

> [Gandiva] [Python] Make it possible to list all functions registered with 
> Gandiva
> -
>
> Key: ARROW-3746
> URL: https://issues.apache.org/jira/browse/ARROW-3746
> Project: Apache Arrow
>  Issue Type: Improvement
>Reporter: Philipp Moritz
>Priority: Major
>  Labels: pull-request-available
>
> This will also be useful for documentation purposes (right now it is not very 
> easy to get a list of all the functions that are registered).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-3745) [C++] CMake passes static libraries multiple times to linker

2018-11-09 Thread Wes McKinney (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16682208#comment-16682208
 ] 

Wes McKinney commented on ARROW-3745:
-

I found the problem. Libraries are linking to themselves here

https://github.com/apache/arrow/blob/master/cpp/cmake_modules/BuildUtils.cmake#L70

I will remove this in https://github.com/apache/arrow/pull/2735 and see 
whether that breaks anything.

> [C++] CMake passes static libraries multiple times to linker
> 
>
> Key: ARROW-3745
> URL: https://issues.apache.org/jira/browse/ARROW-3745
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C#, C++
>Affects Versions: 0.11.1
>Reporter: Wes McKinney
>Priority: Major
>
> With {{make array-test}} I see
> {code}
> [ 97%] Building CXX object src/arrow/CMakeFiles/array-test.dir/array-test.cc.o
> cd /home/wesm/code/arrow/cpp/build-test/src/arrow && /usr/bin/ccache 
> /usr/bin/clang++-6.0  -DARROW_JEMALLOC 
> -DARROW_JEMALLOC_INCLUDE_DIR=/home/wesm/code/arrow/cpp/build-test/jemalloc_ep-prefix/src/jemalloc_ep/dist//include
>  -DARROW_WITH_BROTLI -DARROW_WITH_LZ4 -DARROW_WITH_SNAPPY -DARROW_WITH_ZLIB 
> -DARROW_WITH_ZSTD -isystem /home/wesm/cpp-toolchain/include -isystem 
> /home/wesm/code/arrow/cpp/build-test/double-conversion_ep/src/double-conversion_ep/include
>  -isystem /home/wesm/code/arrow/cpp/build-test/jemalloc_ep-prefix/src 
> -isystem /home/wesm/code/arrow/cpp/thirdparty/hadoop/include 
> -I/home/wesm/code/arrow/cpp/build-test/src -I/home/wesm/code/arrow/cpp/src  
> -std=c++11  -Qunused-arguments -ggdb -O0  -Wall -Wno-unknown-warning-option 
> -msse3 -maltivec  -g   -std=gnu++11 -o 
> CMakeFiles/array-test.dir/array-test.cc.o -c 
> /home/wesm/code/arrow/cpp/src/arrow/array-test.cc
> [100%] Linking CXX executable ../../debug/array-test
> cd /home/wesm/code/arrow/cpp/build-test/src/arrow && 
> /home/wesm/cpp-toolchain/bin/cmake -E cmake_link_script 
> CMakeFiles/array-test.dir/link.txt --verbose=1
> /usr/bin/ccache /usr/bin/clang++-6.0   -std=c++11  -Qunused-arguments -ggdb 
> -O0  -Wall -Wno-unknown-warning-option -msse3 -maltivec  -g  -rdynamic 
> CMakeFiles/array-test.dir/array-test.cc.o  -o ../../debug/array-test 
> -Wl,-rpath,/home/wesm/cpp-toolchain/lib ../../debug/libarrow.a 
> /home/wesm/cpp-toolchain/lib/libgtest_main.a 
> /home/wesm/cpp-toolchain/lib/libgtest.a -ldl 
> /home/wesm/cpp-toolchain/lib/libglog.a /home/wesm/cpp-toolchain/lib/libglog.a 
> /home/wesm/cpp-toolchain/lib/libzstd.a /home/wesm/cpp-toolchain/lib/libzstd.a 
> /home/wesm/cpp-toolchain/lib/libz.so /home/wesm/cpp-toolchain/lib/libz.so 
> /home/wesm/cpp-toolchain/lib/libsnappy.a 
> /home/wesm/cpp-toolchain/lib/libsnappy.a 
> /home/wesm/cpp-toolchain/lib/liblz4.a /home/wesm/cpp-toolchain/lib/liblz4.a 
> /home/wesm/cpp-toolchain/lib/libbrotlidec-static.a 
> /home/wesm/cpp-toolchain/lib/libbrotlidec-static.a 
> /home/wesm/cpp-toolchain/lib/libbrotlienc-static.a 
> /home/wesm/cpp-toolchain/lib/libbrotlienc-static.a 
> /home/wesm/cpp-toolchain/lib/libbrotlicommon-static.a 
> /home/wesm/cpp-toolchain/lib/libbrotlicommon-static.a 
> ../../double-conversion_ep/src/double-conversion_ep/lib/libdouble-conversion.a
>  
> ../../double-conversion_ep/src/double-conversion_ep/lib/libdouble-conversion.a
>  /home/wesm/cpp-toolchain/lib/libboost_system.so 
> /home/wesm/cpp-toolchain/lib/libboost_filesystem.so 
> /home/wesm/cpp-toolchain/lib/libboost_regex.so 
> ../../jemalloc_ep-prefix/src/jemalloc_ep/dist//lib/libjemalloc_pic.a 
> -lpthread -lrt /usr/lib/x86_64-linux-gnu/libpthread.so 
> /home/wesm/cpp-toolchain/lib/libgtest_main.a 
> /home/wesm/cpp-toolchain/lib/libgtest.a 
> {code}
> Note how some of the static libraries are passed multiple times
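
A quick way to confirm the observation on any link line is to count repeated library arguments. The following is only a sketch: the helper name repeated_link_inputs and the shortened command string are illustrative assumptions, not part of the Arrow build system.

```python
from collections import Counter


def repeated_link_inputs(link_command):
    """Return library arguments that occur more than once, with their counts."""
    tokens = link_command.split()
    # Treat static/shared library paths and -l flags as library inputs.
    libs = [t for t in tokens if t.endswith((".a", ".so")) or t.startswith("-l")]
    counts = Counter(libs)
    return {lib: n for lib, n in counts.items() if n > 1}


# Abbreviated, hypothetical excerpt modelled on the link line quoted above.
command = (
    "clang++ array-test.cc.o -o array-test libarrow.a "
    "libglog.a libglog.a libzstd.a libzstd.a libz.so libz.so -lpthread -lrt"
)

for lib, n in sorted(repeated_link_inputs(command).items()):
    print(lib, n)
```

Repeating a static library on a link line is sometimes deliberate (for example to resolve circular dependencies), so a non-empty result is a hint to inspect the CMake logic rather than proof of a bug.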





[jira] [Commented] (ARROW-3476) [Java] mvn test in memory fails on a big-endian platform

2018-11-09 Thread Kazuaki Ishizaki (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16682228#comment-16682228
 ] 

Kazuaki Ishizaki commented on ARROW-3476:
-

[~wesmckinn] kou, pitrou, xhochy, kszucs, cploud, and I (@kiszk) can now 
access https://ibmz-ci.osuosl.org/.

> [Java] mvn test in memory fails on a big-endian platform
> 
>
> Key: ARROW-3476
> URL: https://issues.apache.org/jira/browse/ARROW-3476
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Java
>Reporter: Kazuaki Ishizaki
>Priority: Major
>
> Apache Arrow is becoming a common way to exchange data among important 
> emerging analytics frameworks such as Pandas, NumPy, and Spark.
> [IBM Z|https://en.wikipedia.org/wiki/IBM_Z] is one of the platforms used to 
> process critical transactions such as bank or credit card transactions. Users 
> of IBM Z want to extract insights from these transactions using the emerging 
> analytics systems on IBM Z Linux. These analytics pipelines could also be fast 
> and effective on IBM Z Linux by using Apache Arrow in memory.
> From a technical perspective, since IBM Z Linux uses a big-endian data 
> format, it is not currently possible to use Apache Arrow in this pipeline. If 
> Apache Arrow supported big-endian, the use case would be expanded.
> When I ran the Apache Arrow test cases on a big-endian platform (ppc64be), 
> {{mvn test}} in memory failed due to an assertion.
> In the {{TestEndianess.testLittleEndian}} test, the assertion fires during 
> allocation of a {{RootAllocator}}.
> {code}
> $ uname -a
> Linux ppc64be.novalocal 4.5.7-300.fc24.ppc64 #1 SMP Fri Jun 10 20:29:32 UTC 
> 2016 ppc64 ppc64 ppc64 GNU/Linux
> $ arch  
> ppc64
> $ cd java/memory
> $ mvn test
> [INFO] Scanning for projects...
> [INFO]
>  
> [INFO] 
> 
> [INFO] Building Arrow Memory 0.12.0-SNAPSHOT
> [INFO] 
> 
> [INFO] 
> ...
> [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.082 
> s - in org.apache.arrow.memory.TestAccountant
> [INFO] Running org.apache.arrow.memory.TestLowCostIdentityHashMap
> [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.012 
> s - in org.apache.arrow.memory.TestLowCostIdentityHashMap
> [INFO] Running org.apache.arrow.memory.TestBaseAllocator
> [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.746 
> s <<< FAILURE! - in org.apache.arrow.memory.TestEndianess
> [ERROR] testLittleEndian(org.apache.arrow.memory.TestEndianess)  Time 
> elapsed: 0.313 s  <<< ERROR!
> java.lang.ExceptionInInitializerError
>   at 
> org.apache.arrow.memory.TestEndianess.testLittleEndian(TestEndianess.java:31)
> Caused by: java.lang.IllegalStateException: Arrow only runs on LittleEndian 
> systems.
>   at 
> org.apache.arrow.memory.TestEndianess.testLittleEndian(TestEndianess.java:31)
> [ERROR] Tests run: 22, Failures: 0, Errors: 21, Skipped: 1, Time elapsed: 
> 0.055 s <<< FAILURE! - in org.apache.arrow.memory.TestBaseAllocator
> ...
> {code}
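
For readers unfamiliar with the failure mode, the sketch below shows in Python how the same integer is laid out under the two byte orders, plus a guard in the spirit of the "Arrow only runs on LittleEndian systems" check. The function require_little_endian is a hypothetical illustration and is not Arrow's Java code.

```python
import struct
import sys

# Byte order of the machine running this script: "little" or "big".
print(sys.byteorder)

# The same 32-bit integer serialized with explicit byte orders.
little = struct.pack("<i", 1)
big = struct.pack(">i", 1)
print(little.hex(), big.hex())


def require_little_endian(byteorder=sys.byteorder):
    """Raise if the given byte order is not little-endian (illustrative guard)."""
    if byteorder != "little":
        raise RuntimeError("little-endian platform required, got " + byteorder)
```

Data written in one byte order and read in the other is silently misinterpreted, which is presumably why a hard failure at allocator start-up was preferred over continuing.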





[jira] [Assigned] (ARROW-3613) [Go] Resize does not correctly update the length

2018-11-09 Thread Wes McKinney (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned ARROW-3613:
---

Assignee: Jonathan A Sternberg

> [Go] Resize does not correctly update the length
> 
>
> Key: ARROW-3613
> URL: https://issues.apache.org/jira/browse/ARROW-3613
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Go
>Reporter: Jonathan A Sternberg
>Assignee: Jonathan A Sternberg
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> If you have the following code:
> {code:java}
> package main
> import (
>     "fmt"
>     "github.com/apache/arrow/go/arrow/array"
>     "github.com/apache/arrow/go/arrow/memory"
> )
> func main() {
>     builder := array.NewFloat64Builder(memory.DefaultAllocator)
>     fmt.Println(builder.Len(), builder.Cap())
>     builder.Reserve(44)
>     fmt.Println(builder.Len(), builder.Cap())
>     builder.Resize(5)
>     fmt.Println(builder.Len(), builder.Cap())
>     builder.Reserve(44)
>     for i := 0; i < 44; i++ {
>         builder.Append(0)
>     }
>     fmt.Println(builder.Len(), builder.Cap())
>     builder.Resize(5)
>     fmt.Println(builder.Len(), builder.Cap())
> }
> {code}
> It gives the following output:
> {code:java}
> 0 0
> 0 64
> 0 32
> 44 64
> 44 32
> {code}
> For whatever reason, the length is not recorded as 5. I understand why the 
> capacity might not be 5, but it does seem like the length should be set to 5 
> if the array is resized to a length smaller than its current capacity.
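
The behaviour the reporter expects can be modelled with a toy builder that tracks only a length and a capacity. This is a sketch of one possible reading of the expectation, written in Python for brevity; ToyBuilder is hypothetical, it does not round capacities the way the real Go builder does, and it is not the fix that was merged.

```python
class ToyBuilder:
    """Toy model of a builder's length/capacity bookkeeping (not Arrow code)."""

    def __init__(self):
        self.length = 0
        self.capacity = 0

    def reserve(self, n):
        # Make room for n more values beyond the current length.
        self.capacity = max(self.capacity, self.length + n)

    def append(self, value):
        # Only the count is tracked; the value itself is ignored in this model.
        if self.length == self.capacity:
            self.reserve(1)
        self.length += 1

    def resize(self, n):
        # Shrinking below the current length discards data, so the
        # length has to follow the new capacity.
        self.capacity = n
        self.length = min(self.length, n)


b = ToyBuilder()
b.reserve(44)
for _ in range(44):
    b.append(0.0)
b.resize(5)
print(b.length, b.capacity)
```

The invariant worth testing in any real implementation is that the length never exceeds the capacity after a resize.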





[jira] [Assigned] (ARROW-3407) [C++] Add UTF8 conversion modes in CSV reader conversion options

2018-11-09 Thread Wes McKinney (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned ARROW-3407:
---

Assignee: Antoine Pitrou

> [C++] Add UTF8 conversion modes in CSV reader conversion options
> 
>
> Key: ARROW-3407
> URL: https://issues.apache.org/jira/browse/ARROW-3407
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++
>Reporter: Wes McKinney
>Assignee: Antoine Pitrou
>Priority: Major
>  Labels: csv, pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> There should be a few options:
> * Assume UTF8, but do not verify ("no seatbelts mode", for users who are 
> reasonably sure their data is valid UTF8 and want maximum performance)
> * Full UTF8 verification
> * Maybe ASCII-only verification (because ASCII verification is very fast)
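
To make the three proposed modes concrete, here is a small Python sketch of what each one checks. The function name check_bytes and the mode strings "none", "utf8", and "ascii" are assumptions made for illustration; they are not the option names used by the Arrow CSV reader.

```python
def check_bytes(data, mode):
    """Validate raw column bytes under one of three hypothetical modes."""
    if mode == "none":
        # "No seatbelts": trust the caller and skip verification entirely.
        return True
    if mode == "ascii":
        # Every byte must be below 0x80; cheap, but rejects valid non-ASCII UTF8.
        return all(byte < 0x80 for byte in data)
    if mode == "utf8":
        # Full verification: the bytes must decode as UTF8.
        try:
            data.decode("utf-8")
        except UnicodeDecodeError:
            return False
        return True
    raise ValueError("unknown mode: " + mode)


sample = "café".encode("utf-8")
for mode in ("none", "ascii", "utf8"):
    print(mode, check_bytes(sample, mode))
```

The trade-off is the usual one: the less a mode verifies, the faster it runs and the more it relies on the input being well formed.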





[jira] [Resolved] (ARROW-3407) [C++] Add UTF8 conversion modes in CSV reader conversion options

2018-11-09 Thread Wes McKinney (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-3407.
-
Resolution: Fixed

Issue resolved by pull request 2924
[https://github.com/apache/arrow/pull/2924]

> [C++] Add UTF8 conversion modes in CSV reader conversion options
> 
>
> Key: ARROW-3407
> URL: https://issues.apache.org/jira/browse/ARROW-3407
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C++
>Reporter: Wes McKinney
>Priority: Major
>  Labels: csv, pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> There should be a few options:
> * Assume UTF8, but do not verify ("no seatbelts mode", for users who are 
> reasonably sure their data is valid UTF8 and want maximum performance)
> * Full UTF8 verification
> * Maybe ASCII-only verification (because ASCII verification is very fast)





[jira] [Assigned] (ARROW-3742) Fix gandiva cython bindings

2018-11-09 Thread Siyuan Zhuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siyuan Zhuang reassigned ARROW-3742:


Assignee: Siyuan Zhuang

> Fix gandiva cython bindings
> ---
>
> Key: ARROW-3742
> URL: https://issues.apache.org/jira/browse/ARROW-3742
> Project: Apache Arrow
>  Issue Type: Bug
>Reporter: Siyuan Zhuang
>Assignee: Siyuan Zhuang
>Priority: Major
>
> After updating the gandiva cpp part (ARROW-3587), the cython bindings 
> (ARROW-3602) are not consistent.





[jira] [Created] (ARROW-3738) [C++] Add CSV conversion option to parse ISO8601-like timestamp strings

2018-11-09 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-3738:
---

 Summary: [C++] Add CSV conversion option to parse ISO8601-like 
timestamp strings
 Key: ARROW-3738
 URL: https://issues.apache.org/jira/browse/ARROW-3738
 Project: Apache Arrow
  Issue Type: New Feature
  Components: C++
Reporter: Wes McKinney


See similar functionality in other libraries. I believe pandas has a fast path 
for iso8601
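
As a point of reference for "ISO8601-like" strings, the Python standard library (3.7 and later) parses the common variants directly. The sketch below uses only datetime.fromisoformat; the wrapper name parse_iso_like is hypothetical and says nothing about how the Arrow option would be spelled.

```python
from datetime import datetime


def parse_iso_like(text):
    """Parse an ISO8601-like timestamp string with the standard library."""
    return datetime.fromisoformat(text)


for text in ("2018-11-09", "2018-11-09 13:45:00", "2018-11-09T13:45:00.250"):
    print(text, "->", parse_iso_like(text))
```

A fast path is possible because the fields sit at fixed offsets, so no format string has to be interpreted for each value.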





[jira] [Created] (ARROW-3739) [C++] Add option to convert a particular column to timestamps or dates using a passed strptime-compatible string

2018-11-09 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-3739:
---

 Summary: [C++] Add option to convert a particular column to 
timestamps or dates using a passed strptime-compatible string
 Key: ARROW-3739
 URL: https://issues.apache.org/jira/browse/ARROW-3739
 Project: Apache Arrow
  Issue Type: New Feature
  Components: C++
Reporter: Wes McKinney


Probably will need something like

{code}
...
types={'date_col': csv.convert_date('%Y%m%d')}
{code}
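
The idea of a per-column converter built from a strptime-compatible string can be sketched in plain Python. The convert_date helper below mirrors the name used in the snippet above, but it is a hypothetical stand-in built on datetime.strptime, not an existing pyarrow function.

```python
from datetime import datetime


def convert_date(fmt):
    """Return a converter that parses strings with the given strptime format."""
    def convert(text):
        return datetime.strptime(text, fmt).date()
    return convert


# Hypothetical per-column mapping, analogous to types={...} above.
converters = {"date_col": convert_date("%Y%m%d")}

to_date = converters["date_col"]
print(to_date("20181109"))
```

Binding the format once and returning a closure keeps the per-value call free of any option lookup, which is roughly what a column-level conversion option would do.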





[jira] [Resolved] (ARROW-3613) [Go] Resize does not correctly update the length

2018-11-09 Thread Wes McKinney (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-3613.
-
   Resolution: Fixed
Fix Version/s: 0.12.0

Issue resolved by pull request 2927
[https://github.com/apache/arrow/pull/2927]

> [Go] Resize does not correctly update the length
> 
>
> Key: ARROW-3613
> URL: https://issues.apache.org/jira/browse/ARROW-3613
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Go
>Reporter: Jonathan A Sternberg
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> If you have the following code:
> {code:java}
> package main
> import (
>     "fmt"
>     "github.com/apache/arrow/go/arrow/array"
>     "github.com/apache/arrow/go/arrow/memory"
> )
> func main() {
>     builder := array.NewFloat64Builder(memory.DefaultAllocator)
>     fmt.Println(builder.Len(), builder.Cap())
>     builder.Reserve(44)
>     fmt.Println(builder.Len(), builder.Cap())
>     builder.Resize(5)
>     fmt.Println(builder.Len(), builder.Cap())
>     builder.Reserve(44)
>     for i := 0; i < 44; i++ {
>         builder.Append(0)
>     }
>     fmt.Println(builder.Len(), builder.Cap())
>     builder.Resize(5)
>     fmt.Println(builder.Len(), builder.Cap())
> }
> {code}
> It gives the following output:
> {code:java}
> 0 0
> 0 64
> 0 32
> 44 64
> 44 32
> {code}
> For whatever reason, the length is not recorded as 5. I understand why the 
> capacity might not be 5, but it does seem like the length should be set to 5 
> if the array is resized to a length smaller than its current capacity.





[jira] [Created] (ARROW-3740) [C++] Calling ArrayBuilder::Resize with length smaller than current appended length results in invalid state

2018-11-09 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-3740:
---

 Summary: [C++] Calling ArrayBuilder::Resize with length smaller 
than current appended length results in invalid state
 Key: ARROW-3740
 URL: https://issues.apache.org/jira/browse/ARROW-3740
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Affects Versions: 0.11.1
Reporter: Wes McKinney
 Fix For: 0.12.0


This was brought up by the Go patch ARROW-3613. If you append some data to a 
builder, then call {{Resize}} with something smaller than what's reported by 
{{length()}}, the capacity will be updated, but the length will not. So I think 
further appends would probably segfault. Either way, we should add some tests 
for this case of "shrinking" a builder (which destroys data, but it's permitted 
by the API).





[jira] [Created] (ARROW-3741) [R] Add support for arrow::compute::Cast to convert Arrow arrays from one type to another

2018-11-09 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-3741:
---

 Summary: [R] Add support for arrow::compute::Cast to convert Arrow 
arrays from one type to another
 Key: ARROW-3741
 URL: https://issues.apache.org/jira/browse/ARROW-3741
 Project: Apache Arrow
  Issue Type: New Feature
  Components: R
Reporter: Wes McKinney


See {{pyarrow.Array.cast}} and {{pyarrow.Table.cast}}





[jira] [Commented] (ARROW-3613) [Go] Resize does not correctly update the length

2018-11-09 Thread Alexandre Crayssac (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16681230#comment-16681230
 ] 

Alexandre Crayssac commented on ARROW-3613:
---

OK, still investigating, since it looks like the bug has other 
ramifications.

> [Go] Resize does not correctly update the length
> 
>
> Key: ARROW-3613
> URL: https://issues.apache.org/jira/browse/ARROW-3613
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Go
>Reporter: Jonathan A Sternberg
>Priority: Major
>
> If you have the following code:
> {code:java}
> package main
> import (
>     "fmt"
>     "github.com/apache/arrow/go/arrow/array"
>     "github.com/apache/arrow/go/arrow/memory"
> )
> func main() {
>     builder := array.NewFloat64Builder(memory.DefaultAllocator)
>     fmt.Println(builder.Len(), builder.Cap())
>     builder.Reserve(44)
>     fmt.Println(builder.Len(), builder.Cap())
>     builder.Resize(5)
>     fmt.Println(builder.Len(), builder.Cap())
>     builder.Reserve(44)
>     for i := 0; i < 44; i++ {
>         builder.Append(0)
>     }
>     fmt.Println(builder.Len(), builder.Cap())
>     builder.Resize(5)
>     fmt.Println(builder.Len(), builder.Cap())
> }
> {code}
> It gives the following output:
> {code:java}
> 0 0
> 0 64
> 0 32
> 44 64
> 44 32
> {code}
> For whatever reason, the length is not recorded as 5. I understand why the 
> capacity might not be 5, but it does seem like the length should be set to 5 
> if the array is resized to a length smaller than its current capacity.





[jira] [Created] (ARROW-3734) [C++] Linking static zstd library fails on Arch x86-64

2018-11-09 Thread Dimitri Vorona (JIRA)
Dimitri Vorona created ARROW-3734:
-

 Summary: [C++] Linking static zstd library fails on Arch x86-64
 Key: ARROW-3734
 URL: https://issues.apache.org/jira/browse/ARROW-3734
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Affects Versions: 0.12.0
Reporter: Dimitri Vorona


zlib install the static library into the ${CMAKE_INSTALL_LIBDIR} which is lib64 
on 64-bit systems. We should also look at this path when we're linking.
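
The lookup being asked for amounts to trying more than one library directory under the install prefix. The Python sketch below illustrates that search order; find_static_library and the temporary directory layout are assumptions made for the example and do not reproduce Arrow's CMake find modules.

```python
import os
import tempfile


def find_static_library(prefix, filename, libdirs=("lib", "lib64")):
    """Return the first existing <prefix>/<libdir>/<filename>, or None."""
    for libdir in libdirs:
        candidate = os.path.join(prefix, libdir, filename)
        if os.path.isfile(candidate):
            return candidate
    return None


# Simulate an install prefix where the library landed in lib64 only.
with tempfile.TemporaryDirectory() as prefix:
    os.makedirs(os.path.join(prefix, "lib64"))
    open(os.path.join(prefix, "lib64", "libzstd.a"), "w").close()
    found = find_static_library(prefix, "libzstd.a")
    relative = os.path.relpath(found, prefix)
    print(relative)
```

Searching lib first and lib64 second keeps existing setups working while also covering distributions that install 64-bit libraries into lib64.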





[jira] [Updated] (ARROW-3734) [C++] Linking static zstd library fails on Arch x86-64

2018-11-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-3734:
--
Labels: pull-request-available  (was: )

> [C++] Linking static zstd library fails on Arch x86-64
> --
>
> Key: ARROW-3734
> URL: https://issues.apache.org/jira/browse/ARROW-3734
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.12.0
>Reporter: Dimitri Vorona
>Priority: Major
>  Labels: pull-request-available
>
> zlib install the static library into the ${CMAKE_INSTALL_LIBDIR} which is 
> lib64 on 64-bit systems. We should also look at this path when we're linking.





[jira] [Updated] (ARROW-3734) [C++] Linking static zstd library fails on Arch x86-64

2018-11-09 Thread Dimitri Vorona (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dimitri Vorona updated ARROW-3734:
--
Description: zlib installs the static library into the 
${CMAKE_INSTALL_LIBDIR} which is lib64 on 64-bit systems. We should also look 
at this path when we're linking.  (was: zlib install the static library into 
the ${CMAKE_INSTALL_LIBDIR} which is lib64 on 64-bit systems. We should also 
look at this path when we're linking.)

> [C++] Linking static zstd library fails on Arch x86-64
> --
>
> Key: ARROW-3734
> URL: https://issues.apache.org/jira/browse/ARROW-3734
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.12.0
>Reporter: Dimitri Vorona
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> zlib installs the static library into the ${CMAKE_INSTALL_LIBDIR} which is 
> lib64 on 64-bit systems. We should also look at this path when we're linking.





[jira] [Resolved] (ARROW-3733) [GLib] Add to_string() to GArrowTable and GArrowColumn

2018-11-09 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-3733.

   Resolution: Fixed
Fix Version/s: 0.12.0

Issue resolved by pull request 2925
[https://github.com/apache/arrow/pull/2925]

> [GLib] Add to_string() to GArrowTable and GArrowColumn
> --
>
> Key: ARROW-3733
> URL: https://issues.apache.org/jira/browse/ARROW-3733
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: GLib
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Assigned] (ARROW-3734) [C++] Linking static zstd library fails on Arch x86-64

2018-11-09 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn reassigned ARROW-3734:
--

Assignee: Dimitri Vorona

> [C++] Linking static zstd library fails on Arch x86-64
> --
>
> Key: ARROW-3734
> URL: https://issues.apache.org/jira/browse/ARROW-3734
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.12.0
>Reporter: Dimitri Vorona
>Assignee: Dimitri Vorona
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> zlib installs the static library into the ${CMAKE_INSTALL_LIBDIR} which is 
> lib64 on 64-bit systems. We should also look at this path when we're linking.





[jira] [Resolved] (ARROW-3734) [C++] Linking static zstd library fails on Arch x86-64

2018-11-09 Thread Uwe L. Korn (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uwe L. Korn resolved ARROW-3734.

   Resolution: Fixed
Fix Version/s: 0.12.0

Issue resolved by pull request 2926
[https://github.com/apache/arrow/pull/2926]

> [C++] Linking static zstd library fails on Arch x86-64
> --
>
> Key: ARROW-3734
> URL: https://issues.apache.org/jira/browse/ARROW-3734
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.12.0
>Reporter: Dimitri Vorona
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> zlib installs the static library into the ${CMAKE_INSTALL_LIBDIR} which is 
> lib64 on 64-bit systems. We should also look at this path when we're linking.





[jira] [Commented] (ARROW-3613) [Go] Resize does not correctly update the length

2018-11-09 Thread Alexandre Crayssac (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16681405#comment-16681405
 ] 

Alexandre Crayssac commented on ARROW-3613:
---

Just submitted a PR: [https://github.com/apache/arrow/pull/2927]

It still needs review, though.

> [Go] Resize does not correctly update the length
> 
>
> Key: ARROW-3613
> URL: https://issues.apache.org/jira/browse/ARROW-3613
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Go
>Reporter: Jonathan A Sternberg
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If you have the following code:
> {code:java}
> package main
> import (
>     "fmt"
>     "github.com/apache/arrow/go/arrow/array"
>     "github.com/apache/arrow/go/arrow/memory"
> )
> func main() {
>     builder := array.NewFloat64Builder(memory.DefaultAllocator)
>     fmt.Println(builder.Len(), builder.Cap())
>     builder.Reserve(44)
>     fmt.Println(builder.Len(), builder.Cap())
>     builder.Resize(5)
>     fmt.Println(builder.Len(), builder.Cap())
>     builder.Reserve(44)
>     for i := 0; i < 44; i++ {
>         builder.Append(0)
>     }
>     fmt.Println(builder.Len(), builder.Cap())
>     builder.Resize(5)
>     fmt.Println(builder.Len(), builder.Cap())
> }
> {code}
> It gives the following output:
> {code:java}
> 0 0
> 0 64
> 0 32
> 44 64
> 44 32
> {code}
> For whatever reason, the length is not recorded as 5. I understand why the 
> capacity might not be 5, but it does seem like the length should be set to 5 
> if the array is resized to a length smaller than its current capacity.





[jira] [Updated] (ARROW-3613) [Go] Resize does not correctly update the length

2018-11-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-3613:
--
Labels: pull-request-available  (was: )

> [Go] Resize does not correctly update the length
> 
>
> Key: ARROW-3613
> URL: https://issues.apache.org/jira/browse/ARROW-3613
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Go
>Reporter: Jonathan A Sternberg
>Priority: Major
>  Labels: pull-request-available
>
> If you have the following code:
> {code:java}
> package main
> import (
>     "fmt"
>     "github.com/apache/arrow/go/arrow/array"
>     "github.com/apache/arrow/go/arrow/memory"
> )
> func main() {
>     builder := array.NewFloat64Builder(memory.DefaultAllocator)
>     fmt.Println(builder.Len(), builder.Cap())
>     builder.Reserve(44)
>     fmt.Println(builder.Len(), builder.Cap())
>     builder.Resize(5)
>     fmt.Println(builder.Len(), builder.Cap())
>     builder.Reserve(44)
>     for i := 0; i < 44; i++ {
>         builder.Append(0)
>     }
>     fmt.Println(builder.Len(), builder.Cap())
>     builder.Resize(5)
>     fmt.Println(builder.Len(), builder.Cap())
> }
> {code}
> It gives the following output:
> {code:java}
> 0 0
> 0 64
> 0 32
> 44 64
> 44 32
> {code}
> For whatever reason, the length is not recorded as 5. I understand why the 
> capacity might not be 5, but it does seem like the length should be set to 5 
> if the array is resized to a length smaller than its current capacity.





[jira] [Resolved] (ARROW-3698) [C++] Segmentation fault when using a large table in Gandiva

2018-11-09 Thread Wes McKinney (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-3698.
-
   Resolution: Fixed
Fix Version/s: 0.12.0

Issue resolved by pull request 2902
[https://github.com/apache/arrow/pull/2902]

> [C++] Segmentation fault when using a large table in Gandiva
> 
>
> Key: ARROW-3698
> URL: https://issues.apache.org/jira/browse/ARROW-3698
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, Gandiva
>Reporter: Siyuan Zhuang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> {code}
> >>> import pyarrow as pa
> Registry has 519 pre-compiled functions
> >>> import pandas as pd
> >>> import numpy as np
> >>> import pyarrow.gandiva as gandiva
> >>> import timeit
> >>>
> >>> from matplotlib import pyplot as plt
> >>> for scale in range(25, 26):
> ...     frame_data = 1.0 * np.random.randint(0, 100, size=(2**scale, 2))
> ...     df = pd.DataFrame(frame_data).add_prefix("col")
> ...     table = pa.Table.from_pandas(df)
> ...
> >>>
> >>> def float64_add(table):
> ...     builder = gandiva.TreeExprBuilder()
> ...     node_a = builder.make_field(table.schema.field_by_name("col0"))
> ...     node_b = builder.make_field(table.schema.field_by_name("col1"))
> ...     sum = builder.make_function(b"add", [node_a, node_b], pa.float64())
> ...     field_result = pa.field("c", pa.float64())
> ...     expr = builder.make_expression(sum, field_result)
> ...     projector = gandiva.make_projector(table.schema, [expr],
> ...                                        pa.default_memory_pool())
> ...     return projector
> ...
> >>> projector = float64_add(table)
> >>> projector.evaluate(table.to_batches()[0])
> [1] 36393 segmentation fault python
> {code}
> It is because there is an integer overflow in Gandiva:
> [https://github.com/apache/arrow/blob/1a6545aa51f5f41f0233ee0a11ef87d21127c5ed/cpp/src/gandiva/projector.cc#L141]
> It should be `int64_t` instead of `int`.
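
The arithmetic behind such an overflow can be checked with ctypes. The issue text does not show the expression at the linked line, so the product of row count and bit width used below is an assumption chosen only because it crosses the 32-bit limit at exactly the table size in the reproduction.

```python
import ctypes

num_rows = 2 ** 25       # row count from the reproduction above
bits_per_value = 64      # a float64 value is 64 bits wide

# Assumed quantity: total number of bits for one output column.
exact = num_rows * bits_per_value

# ctypes integer types wrap silently, like a C int would in practice.
as_int32 = ctypes.c_int32(exact).value
as_int64 = ctypes.c_int64(exact).value

print(exact, as_int32, as_int64)
```

A negative or truncated size handed to an allocator or a loop bound is a typical route from this kind of wraparound to a segmentation fault, which is consistent with the proposed change to `int64_t`.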





[jira] [Assigned] (ARROW-3698) [C++] Segmentation fault when using a large table in Gandiva

2018-11-09 Thread Wes McKinney (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned ARROW-3698:
---

Assignee: Siyuan Zhuang

> [C++] Segmentation fault when using a large table in Gandiva
> 
>
> Key: ARROW-3698
> URL: https://issues.apache.org/jira/browse/ARROW-3698
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, Gandiva
>Reporter: Siyuan Zhuang
>Assignee: Siyuan Zhuang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> {code}
> >>> import pyarrow as pa
> Registry has 519 pre-compiled functions
> >>> import pandas as pd
> >>> import numpy as np
> >>> import pyarrow.gandiva as gandiva
> >>> import timeit
> >>>
> >>> from matplotlib import pyplot as plt
> >>> for scale in range(25, 26):
> ...     frame_data = 1.0 * np.random.randint(0, 100, size=(2**scale, 2))
> ...     df = pd.DataFrame(frame_data).add_prefix("col")
> ...     table = pa.Table.from_pandas(df)
> ...
> >>>
> >>> def float64_add(table):
> ...     builder = gandiva.TreeExprBuilder()
> ...     node_a = builder.make_field(table.schema.field_by_name("col0"))
> ...     node_b = builder.make_field(table.schema.field_by_name("col1"))
> ...     sum = builder.make_function(b"add", [node_a, node_b], pa.float64())
> ...     field_result = pa.field("c", pa.float64())
> ...     expr = builder.make_expression(sum, field_result)
> ...     projector = gandiva.make_projector(table.schema, [expr],
> ...                                        pa.default_memory_pool())
> ...     return projector
> ...
> >>> projector = float64_add(table)
> >>> projector.evaluate(table.to_batches()[0])
> [1] 36393 segmentation fault python
> {code}
> It is because there is an integer overflow in Gandiva:
> [https://github.com/apache/arrow/blob/1a6545aa51f5f41f0233ee0a11ef87d21127c5ed/cpp/src/gandiva/projector.cc#L141]
> It should be `int64_t` instead of `int`.





[jira] [Created] (ARROW-3735) [Python] Proper error handling in _ensure_type

2018-11-09 Thread Krisztian Szucs (JIRA)
Krisztian Szucs created ARROW-3735:
--

 Summary: [Python] Proper error handling in _ensure_type
 Key: ARROW-3735
 URL: https://issues.apache.org/jira/browse/ARROW-3735
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Python
Reporter: Krisztian Szucs
Assignee: Krisztian Szucs


We have multiple _ensure_type-like functions; the one defined in array.pxi 
passes None through, which causes a segfault in the following example:

{code}
pa.array([1, 2, 3]).cast(None)
{code}
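
The desired behaviour is a Python exception instead of a crash. The sketch below shows the shape of such a guard; ensure_type, its allow_none flag, and the use of plain strings as stand-ins for data types are all hypothetical and do not reproduce the pyarrow helper.

```python
def ensure_type(value, allow_none=False):
    """Return value if it names a type; reject None unless explicitly allowed."""
    if value is None:
        if allow_none:
            return None
        # Fail loudly here rather than passing None down to native code.
        raise TypeError("a data type is required, got None")
    if not isinstance(value, str):
        raise TypeError("expected a type name, got " + type(value).__name__)
    return value


print(ensure_type("int64"))
try:
    ensure_type(None)
except TypeError as exc:
    print("rejected:", exc)
```

Making the None case opt-in means callers that legitimately accept a missing type must say so, while cast-like entry points get an error by default.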





[jira] [Commented] (ARROW-3701) [Gandiva] Add support for decimal operations

2018-11-09 Thread Pindikura Ravindra (JIRA)


[ 
https://issues.apache.org/jira/browse/ARROW-3701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16681704#comment-16681704
 ] 

Pindikura Ravindra commented on ARROW-3701:
---

@wesm - I tried the same command on my Windows 10 Home desktop using llc from 
llvm-4.1. It worked fine.

> [Gandiva] Add support for decimal operations
> 
>
> Key: ARROW-3701
> URL: https://issues.apache.org/jira/browse/ARROW-3701
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Gandiva
>Reporter: Pindikura Ravindra
>Assignee: Pindikura Ravindra
>Priority: Major
>
> To begin with, will add support for 128-bit decimals. There are two parts :
>  # llvm_generator needs to understand decimal types (value, precision, scale)
>  # code decimal operations : add/subtract/multiply/divide/mod/..
>  ** This will be c++ code that can be pre-compiled to emit IR code





[jira] [Created] (ARROW-3736) [CI/Docker] Ninja test in docker-compose run cpp hangs

2018-11-09 Thread Krisztian Szucs (JIRA)
Krisztian Szucs created ARROW-3736:
--

 Summary: [CI/Docker] Ninja test in docker-compose run cpp hangs
 Key: ARROW-3736
 URL: https://issues.apache.org/jira/browse/ARROW-3736
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Continuous Integration
Reporter: Krisztian Szucs
Assignee: Krisztian Szucs








[jira] [Created] (ARROW-3737) [CI/Docker/Python] Support running integration tests on multiple python versions

2018-11-09 Thread Krisztian Szucs (JIRA)
Krisztian Szucs created ARROW-3737:
--

 Summary: [CI/Docker/Python] Support running integration tests on 
multiple python versions
 Key: ARROW-3737
 URL: https://issues.apache.org/jira/browse/ARROW-3737
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Continuous Integration, Python
Reporter: Krisztian Szucs
Assignee: Krisztian Szucs


Currently the python-3.6 image is pinned in integration/hdfs/Dockerfile and 
integration/pandas-master/Dockerfile. It should be possible to pass a build-time 
argument instead, similar to how the arrow:python-${PYTHON_VERSION} image works.





[jira] [Resolved] (ARROW-3611) Give error more quickly when pyarrow serialization context is used incorrectly.

2018-11-09 Thread Wes McKinney (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-3611.
-
   Resolution: Fixed
Fix Version/s: 0.12.0

Issue resolved by pull request 2833
[https://github.com/apache/arrow/pull/2833]

> Give error more quickly when pyarrow serialization context is used 
> incorrectly.
> ---
>
> Key: ARROW-3611
> URL: https://issues.apache.org/jira/browse/ARROW-3611
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Robert Nishihara
>Assignee: Robert Nishihara
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> When {{type_id}} is not a string or can't be cast to a string, 
> {{register_type}} will succeed, but {{_deserialize_callback}} can fail.
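
The "fail early" idea described above can be sketched in Python as follows. This is a toy stand-in, not pyarrow's SerializationContext: the class `ToySerializationContext`, its registry layout, and the signature of `deserialize_callback` are all hypothetical and exist only to show validation moving from deserialization time to registration time.

```python
class ToySerializationContext:
    """Hypothetical stand-in for a serialization context (not pyarrow's)."""

    def __init__(self):
        self._types_by_id = {}

    def register_type(self, type_, type_id):
        # Validate here, so a bad type_id is reported at registration time
        # instead of surfacing later inside the deserialization callback.
        if not isinstance(type_id, str):
            raise TypeError(
                "type_id must be a string, got " + type(type_id).__name__
            )
        self._types_by_id[type_id] = type_

    def deserialize_callback(self, type_id, fields):
        # Rebuild an instance from its registered type and a field dict.
        cls = self._types_by_id[type_id]
        obj = cls.__new__(cls)
        obj.__dict__.update(fields)
        return obj


if __name__ == "__main__":
    class Point:
        pass

    context = ToySerializationContext()
    context.register_type(Point, "Point")
    point = context.deserialize_callback("Point", {"x": 1, "y": 2})
    print(point.x, point.y)

    try:
        context.register_type(Point, 42)
    except TypeError as exc:
        print("rejected at registration:", exc)
```

With this arrangement, the caller who passes the wrong `type_id` sees the error at the call site that caused it, rather than in a later and less obviously related deserialization step.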





[jira] [Resolved] (ARROW-3721) [Gandiva] [Python] Support all Gandiva literals

2018-11-09 Thread Krisztian Szucs (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs resolved ARROW-3721.

   Resolution: Fixed
Fix Version/s: 0.12.0

Issue resolved by pull request 2920
[https://github.com/apache/arrow/pull/2920]

> [Gandiva] [Python] Support all Gandiva literals
> ---
>
> Key: ARROW-3721
> URL: https://issues.apache.org/jira/browse/ARROW-3721
> Project: Apache Arrow
>  Issue Type: Improvement
>Reporter: Philipp Moritz
>Assignee: Philipp Moritz
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Support all the literals from 
> [https://github.com/apache/arrow/blob/5b116ab175292fe70ed3c8727bcc6868b9695f4a/cpp/src/gandiva/tree_expr_builder.h#L35]
>  in the Cython bindings.





[jira] [Updated] (ARROW-3716) [R] Missing cases for ChunkedArray conversion

2018-11-09 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/ARROW-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-3716:
--
Labels: pull-request-available  (was: )

> [R] Missing cases for ChunkedArray conversion
> -
>
> Key: ARROW-3716
> URL: https://issues.apache.org/jira/browse/ARROW-3716
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: R
>Reporter: Romain François
>Priority: Major
>  Labels: pull-request-available
>
> {code}
>  library(arrow)
>  tab <- table(iris)
>  tab$schema()
>  #> arrow::Schema 
>  #> Sepal.Length: double
>  #> Sepal.Width: double
>  #> Petal.Length: double
>  #> Petal.Width: double
>  #> Species: dictionary
>  as_tibble(tab)
>  #> Error in Table__to_dataframe(x): cannot handle Array of type 26
>  # simpler reprex:
>  a <- chunked_array(factor(c("a", "b")))
>  a$as_vector()
>  #> Error in ChunkedArray__as_vector(self): cannot handle Array of type 26
> {code}


