[jira] [Created] (ARROW-3742) Fix gandiva cython bindings
Siyuan Zhuang created ARROW-3742: Summary: Fix gandiva cython bindings Key: ARROW-3742 URL: https://issues.apache.org/jira/browse/ARROW-3742 Project: Apache Arrow Issue Type: Bug Reporter: Siyuan Zhuang After updating the gandiva cpp part (ARROW-3587), the cython bindings (ARROW-3602) are not consistent. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-3742) Fix pyarrow.types & gandiva cython bindings
[ https://issues.apache.org/jira/browse/ARROW-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siyuan Zhuang updated ARROW-3742: - Component/s: Python Gandiva > Fix pyarrow.types & gandiva cython bindings > --- > > Key: ARROW-3742 > URL: https://issues.apache.org/jira/browse/ARROW-3742 > Project: Apache Arrow > Issue Type: Bug > Components: Gandiva, Python >Reporter: Siyuan Zhuang >Assignee: Siyuan Zhuang >Priority: Major > > 1. 'types.py' didn't export `_as_type`, causing failures in certain > cython/python combinations. I am surprised to see that the CI didn't fail. > 2. After updating the gandiva cpp part (ARROW-3587), the cython bindings > (ARROW-3602) are not consistent. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-3742) Fix pyarrow.types & gandiva cython bindings
[ https://issues.apache.org/jira/browse/ARROW-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-3742: -- Labels: pull-request-available (was: ) > Fix pyarrow.types & gandiva cython bindings > --- > > Key: ARROW-3742 > URL: https://issues.apache.org/jira/browse/ARROW-3742 > Project: Apache Arrow > Issue Type: Bug > Components: Gandiva, Python >Reporter: Siyuan Zhuang >Assignee: Siyuan Zhuang >Priority: Major > Labels: pull-request-available > > 1. 'types.py' didn't export `_as_type`, causing failures in certain > cython/python combinations. I am surprised to see that the CI didn't fail. > 2. After updating the gandiva cpp part (ARROW-3587), the cython bindings > (ARROW-3602) are not consistent. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-3746) [Gandiva] [Python] Make it possible to list all functions registered with Gandiva
Philipp Moritz created ARROW-3746: - Summary: [Gandiva] [Python] Make it possible to list all functions registered with Gandiva Key: ARROW-3746 URL: https://issues.apache.org/jira/browse/ARROW-3746 Project: Apache Arrow Issue Type: Improvement Reporter: Philipp Moritz This will also be useful for documentation purposes (right now it is not very easy to get a list of all the functions that are registered).
[jira] [Created] (ARROW-3745) [C++] CMake passes static libraries multiple times to linker
Wes McKinney created ARROW-3745: --- Summary: [C++] CMake passes static libraries multiple times to linker Key: ARROW-3745 URL: https://issues.apache.org/jira/browse/ARROW-3745 Project: Apache Arrow Issue Type: Bug Components: C#, C++ Affects Versions: 0.11.1 Reporter: Wes McKinney With {{make array-test}} I see {code} [ 97%] Building CXX object src/arrow/CMakeFiles/array-test.dir/array-test.cc.o cd /home/wesm/code/arrow/cpp/build-test/src/arrow && /usr/bin/ccache /usr/bin/clang++-6.0 -DARROW_JEMALLOC -DARROW_JEMALLOC_INCLUDE_DIR=/home/wesm/code/arrow/cpp/build-test/jemalloc_ep-prefix/src/jemalloc_ep/dist//include -DARROW_WITH_BROTLI -DARROW_WITH_LZ4 -DARROW_WITH_SNAPPY -DARROW_WITH_ZLIB -DARROW_WITH_ZSTD -isystem /home/wesm/cpp-toolchain/include -isystem /home/wesm/code/arrow/cpp/build-test/double-conversion_ep/src/double-conversion_ep/include -isystem /home/wesm/code/arrow/cpp/build-test/jemalloc_ep-prefix/src -isystem /home/wesm/code/arrow/cpp/thirdparty/hadoop/include -I/home/wesm/code/arrow/cpp/build-test/src -I/home/wesm/code/arrow/cpp/src -std=c++11 -Qunused-arguments -ggdb -O0 -Wall -Wno-unknown-warning-option -msse3 -maltivec -g -std=gnu++11 -o CMakeFiles/array-test.dir/array-test.cc.o -c /home/wesm/code/arrow/cpp/src/arrow/array-test.cc [100%] Linking CXX executable ../../debug/array-test cd /home/wesm/code/arrow/cpp/build-test/src/arrow && /home/wesm/cpp-toolchain/bin/cmake -E cmake_link_script CMakeFiles/array-test.dir/link.txt --verbose=1 /usr/bin/ccache /usr/bin/clang++-6.0 -std=c++11 -Qunused-arguments -ggdb -O0 -Wall -Wno-unknown-warning-option -msse3 -maltivec -g -rdynamic CMakeFiles/array-test.dir/array-test.cc.o -o ../../debug/array-test -Wl,-rpath,/home/wesm/cpp-toolchain/lib ../../debug/libarrow.a /home/wesm/cpp-toolchain/lib/libgtest_main.a /home/wesm/cpp-toolchain/lib/libgtest.a -ldl /home/wesm/cpp-toolchain/lib/libglog.a /home/wesm/cpp-toolchain/lib/libglog.a /home/wesm/cpp-toolchain/lib/libzstd.a /home/wesm/cpp-toolchain/lib/libzstd.a 
/home/wesm/cpp-toolchain/lib/libz.so /home/wesm/cpp-toolchain/lib/libz.so /home/wesm/cpp-toolchain/lib/libsnappy.a /home/wesm/cpp-toolchain/lib/libsnappy.a /home/wesm/cpp-toolchain/lib/liblz4.a /home/wesm/cpp-toolchain/lib/liblz4.a /home/wesm/cpp-toolchain/lib/libbrotlidec-static.a /home/wesm/cpp-toolchain/lib/libbrotlidec-static.a /home/wesm/cpp-toolchain/lib/libbrotlienc-static.a /home/wesm/cpp-toolchain/lib/libbrotlienc-static.a /home/wesm/cpp-toolchain/lib/libbrotlicommon-static.a /home/wesm/cpp-toolchain/lib/libbrotlicommon-static.a ../../double-conversion_ep/src/double-conversion_ep/lib/libdouble-conversion.a ../../double-conversion_ep/src/double-conversion_ep/lib/libdouble-conversion.a /home/wesm/cpp-toolchain/lib/libboost_system.so /home/wesm/cpp-toolchain/lib/libboost_filesystem.so /home/wesm/cpp-toolchain/lib/libboost_regex.so ../../jemalloc_ep-prefix/src/jemalloc_ep/dist//lib/libjemalloc_pic.a -lpthread -lrt /usr/lib/x86_64-linux-gnu/libpthread.so /home/wesm/cpp-toolchain/lib/libgtest_main.a /home/wesm/cpp-toolchain/lib/libgtest.a {code} Note how some of the static libraries are passed multiple times -- This message was sent by Atlassian JIRA (v7.6.3#76005)
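A quick way to confirm the duplication described above is to count how often each library appears on the link line. The following is only a sketch in Python: the function name `duplicated_libraries` and the shortened `example` command are made up for illustration and are not part of the Arrow build system.

```python
import shlex
from collections import Counter


def duplicated_libraries(link_command: str) -> dict:
    """Return libraries (.a/.so paths and -l flags) that occur more than once."""
    tokens = shlex.split(link_command)
    # Keep only tokens that look like libraries; object files and -o are skipped.
    libs = [t for t in tokens if t.endswith((".a", ".so")) or t.startswith("-l")]
    counts = Counter(libs)
    return {lib: n for lib, n in counts.items() if n > 1}


# Hypothetical, shortened link line modeled on the one quoted in the issue.
example = (
    "clang++ array-test.cc.o -o array-test "
    "libarrow.a libgtest.a -ldl "
    "libglog.a libglog.a libzstd.a libzstd.a "
    "libz.so libz.so -lpthread libgtest.a"
)

if __name__ == "__main__":
    for lib, n in duplicated_libraries(example).items():
        print(f"{lib}: passed {n} times")
```

Pasting the real contents of `link.txt` into `example` would list every repeated entry. Note that a repeated static library is not always a bug, since linkers resolve archives in order and circular dependencies sometimes require listing one twice.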
[jira] [Created] (ARROW-3750) [R] Pass various wrapped Arrow objects created in Python into R with zero copy via reticulate
Wes McKinney created ARROW-3750: --- Summary: [R] Pass various wrapped Arrow objects created in Python into R with zero copy via reticulate Key: ARROW-3750 URL: https://issues.apache.org/jira/browse/ARROW-3750 Project: Apache Arrow Issue Type: New Feature Components: R Reporter: Wes McKinney A user may wish to use some functionality available only in pyarrow using reticulate; it would be useful to be able to construct an R wrapper object to the C++ object inside the corresponding Python type, e.g. {{pyarrow.Table}}. This probably will require some new functions to return the memory address of the shared_ptr/unique_ptr inside the Cython types so that a function on the R side can copy the smart pointer and create the corresponding R wrapper type cc [~pitrou] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-3747) [C++] Flip order of data members in arrow::Decimal128
Wes McKinney created ARROW-3747: --- Summary: [C++] Flip order of data members in arrow::Decimal128 Key: ARROW-3747 URL: https://issues.apache.org/jira/browse/ARROW-3747 Project: Apache Arrow Issue Type: Improvement Components: C++ Reporter: Wes McKinney Fix For: 0.12.0 As discussed in https://github.com/apache/arrow/pull/2845, this will enable a data buffer to be correctly interpreted as {{Decimal128**}}, so memcpy and other operations will work -- This message was sent by Atlassian JIRA (v7.6.3#76005)
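Why the member order matters can be illustrated with plain byte packing. The sketch below is in Python rather than C++ and makes two assumptions that the issue text does not state: that the target is a little-endian machine, and that the desired order is low 64 bits first, then high 64 bits. The names `split_128`, `low`, and `high` are hypothetical and do not come from the Arrow sources.

```python
import struct


def split_128(value: int):
    """Split a signed 128-bit integer into (low unsigned 64 bits, high signed 64 bits)."""
    low = value & 0xFFFFFFFFFFFFFFFF
    high = value >> 64  # Python's shift is arithmetic, so the sign is preserved
    return low, high


value = -12345678901234567890123
low, high = split_128(value)

# 16-byte little-endian two's-complement encoding, as a raw data buffer would hold it.
buffer_bytes = value.to_bytes(16, byteorder="little", signed=True)

# A struct that stores the low word first has the same bytes as the buffer...
low_first = struct.pack("<Qq", low, high)
# ...while a struct that stores the high word first does not.
high_first = struct.pack("<qQ", high, low)

if __name__ == "__main__":
    print("low word first matches buffer: ", low_first == buffer_bytes)
    print("high word first matches buffer:", high_first == buffer_bytes)
```

Under these assumptions, only the low-word-first layout lets a 16-byte buffer be reinterpreted in place, which is what makes memcpy-style operations safe.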
[jira] [Assigned] (ARROW-3745) [C++] CMake passes static libraries multiple times to linker
[ https://issues.apache.org/jira/browse/ARROW-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney reassigned ARROW-3745: --- Assignee: Wes McKinney > [C++] CMake passes static libraries multiple times to linker > > > Key: ARROW-3745 > URL: https://issues.apache.org/jira/browse/ARROW-3745 > Project: Apache Arrow > Issue Type: Bug > Components: C#, C++ >Affects Versions: 0.11.1 >Reporter: Wes McKinney >Assignee: Wes McKinney >Priority: Major > Fix For: 0.12.0 > > > With {{make array-test}} I see > {code} > [ 97%] Building CXX object src/arrow/CMakeFiles/array-test.dir/array-test.cc.o > cd /home/wesm/code/arrow/cpp/build-test/src/arrow && /usr/bin/ccache > /usr/bin/clang++-6.0 -DARROW_JEMALLOC > -DARROW_JEMALLOC_INCLUDE_DIR=/home/wesm/code/arrow/cpp/build-test/jemalloc_ep-prefix/src/jemalloc_ep/dist//include > -DARROW_WITH_BROTLI -DARROW_WITH_LZ4 -DARROW_WITH_SNAPPY -DARROW_WITH_ZLIB > -DARROW_WITH_ZSTD -isystem /home/wesm/cpp-toolchain/include -isystem > /home/wesm/code/arrow/cpp/build-test/double-conversion_ep/src/double-conversion_ep/include > -isystem /home/wesm/code/arrow/cpp/build-test/jemalloc_ep-prefix/src > -isystem /home/wesm/code/arrow/cpp/thirdparty/hadoop/include > -I/home/wesm/code/arrow/cpp/build-test/src -I/home/wesm/code/arrow/cpp/src > -std=c++11 -Qunused-arguments -ggdb -O0 -Wall -Wno-unknown-warning-option > -msse3 -maltivec -g -std=gnu++11 -o > CMakeFiles/array-test.dir/array-test.cc.o -c > /home/wesm/code/arrow/cpp/src/arrow/array-test.cc > [100%] Linking CXX executable ../../debug/array-test > cd /home/wesm/code/arrow/cpp/build-test/src/arrow && > /home/wesm/cpp-toolchain/bin/cmake -E cmake_link_script > CMakeFiles/array-test.dir/link.txt --verbose=1 > /usr/bin/ccache /usr/bin/clang++-6.0 -std=c++11 -Qunused-arguments -ggdb > -O0 -Wall -Wno-unknown-warning-option -msse3 -maltivec -g -rdynamic > CMakeFiles/array-test.dir/array-test.cc.o -o ../../debug/array-test > -Wl,-rpath,/home/wesm/cpp-toolchain/lib 
../../debug/libarrow.a > /home/wesm/cpp-toolchain/lib/libgtest_main.a > /home/wesm/cpp-toolchain/lib/libgtest.a -ldl > /home/wesm/cpp-toolchain/lib/libglog.a /home/wesm/cpp-toolchain/lib/libglog.a > /home/wesm/cpp-toolchain/lib/libzstd.a /home/wesm/cpp-toolchain/lib/libzstd.a > /home/wesm/cpp-toolchain/lib/libz.so /home/wesm/cpp-toolchain/lib/libz.so > /home/wesm/cpp-toolchain/lib/libsnappy.a > /home/wesm/cpp-toolchain/lib/libsnappy.a > /home/wesm/cpp-toolchain/lib/liblz4.a /home/wesm/cpp-toolchain/lib/liblz4.a > /home/wesm/cpp-toolchain/lib/libbrotlidec-static.a > /home/wesm/cpp-toolchain/lib/libbrotlidec-static.a > /home/wesm/cpp-toolchain/lib/libbrotlienc-static.a > /home/wesm/cpp-toolchain/lib/libbrotlienc-static.a > /home/wesm/cpp-toolchain/lib/libbrotlicommon-static.a > /home/wesm/cpp-toolchain/lib/libbrotlicommon-static.a > ../../double-conversion_ep/src/double-conversion_ep/lib/libdouble-conversion.a > > ../../double-conversion_ep/src/double-conversion_ep/lib/libdouble-conversion.a > /home/wesm/cpp-toolchain/lib/libboost_system.so > /home/wesm/cpp-toolchain/lib/libboost_filesystem.so > /home/wesm/cpp-toolchain/lib/libboost_regex.so > ../../jemalloc_ep-prefix/src/jemalloc_ep/dist//lib/libjemalloc_pic.a > -lpthread -lrt /usr/lib/x86_64-linux-gnu/libpthread.so > /home/wesm/cpp-toolchain/lib/libgtest_main.a > /home/wesm/cpp-toolchain/lib/libgtest.a > {code} > Note how some of the static libraries are passed multiple times -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-3745) [C++] CMake passes static libraries multiple times to linker
[ https://issues.apache.org/jira/browse/ARROW-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-3745: Fix Version/s: 0.12.0 > [C++] CMake passes static libraries multiple times to linker > > > Key: ARROW-3745 > URL: https://issues.apache.org/jira/browse/ARROW-3745 > Project: Apache Arrow > Issue Type: Bug > Components: C#, C++ >Affects Versions: 0.11.1 >Reporter: Wes McKinney >Assignee: Wes McKinney >Priority: Major > Fix For: 0.12.0 > > > With {{make array-test}} I see > {code} > [ 97%] Building CXX object src/arrow/CMakeFiles/array-test.dir/array-test.cc.o > cd /home/wesm/code/arrow/cpp/build-test/src/arrow && /usr/bin/ccache > /usr/bin/clang++-6.0 -DARROW_JEMALLOC > -DARROW_JEMALLOC_INCLUDE_DIR=/home/wesm/code/arrow/cpp/build-test/jemalloc_ep-prefix/src/jemalloc_ep/dist//include > -DARROW_WITH_BROTLI -DARROW_WITH_LZ4 -DARROW_WITH_SNAPPY -DARROW_WITH_ZLIB > -DARROW_WITH_ZSTD -isystem /home/wesm/cpp-toolchain/include -isystem > /home/wesm/code/arrow/cpp/build-test/double-conversion_ep/src/double-conversion_ep/include > -isystem /home/wesm/code/arrow/cpp/build-test/jemalloc_ep-prefix/src > -isystem /home/wesm/code/arrow/cpp/thirdparty/hadoop/include > -I/home/wesm/code/arrow/cpp/build-test/src -I/home/wesm/code/arrow/cpp/src > -std=c++11 -Qunused-arguments -ggdb -O0 -Wall -Wno-unknown-warning-option > -msse3 -maltivec -g -std=gnu++11 -o > CMakeFiles/array-test.dir/array-test.cc.o -c > /home/wesm/code/arrow/cpp/src/arrow/array-test.cc > [100%] Linking CXX executable ../../debug/array-test > cd /home/wesm/code/arrow/cpp/build-test/src/arrow && > /home/wesm/cpp-toolchain/bin/cmake -E cmake_link_script > CMakeFiles/array-test.dir/link.txt --verbose=1 > /usr/bin/ccache /usr/bin/clang++-6.0 -std=c++11 -Qunused-arguments -ggdb > -O0 -Wall -Wno-unknown-warning-option -msse3 -maltivec -g -rdynamic > CMakeFiles/array-test.dir/array-test.cc.o -o ../../debug/array-test > -Wl,-rpath,/home/wesm/cpp-toolchain/lib 
../../debug/libarrow.a > /home/wesm/cpp-toolchain/lib/libgtest_main.a > /home/wesm/cpp-toolchain/lib/libgtest.a -ldl > /home/wesm/cpp-toolchain/lib/libglog.a /home/wesm/cpp-toolchain/lib/libglog.a > /home/wesm/cpp-toolchain/lib/libzstd.a /home/wesm/cpp-toolchain/lib/libzstd.a > /home/wesm/cpp-toolchain/lib/libz.so /home/wesm/cpp-toolchain/lib/libz.so > /home/wesm/cpp-toolchain/lib/libsnappy.a > /home/wesm/cpp-toolchain/lib/libsnappy.a > /home/wesm/cpp-toolchain/lib/liblz4.a /home/wesm/cpp-toolchain/lib/liblz4.a > /home/wesm/cpp-toolchain/lib/libbrotlidec-static.a > /home/wesm/cpp-toolchain/lib/libbrotlidec-static.a > /home/wesm/cpp-toolchain/lib/libbrotlienc-static.a > /home/wesm/cpp-toolchain/lib/libbrotlienc-static.a > /home/wesm/cpp-toolchain/lib/libbrotlicommon-static.a > /home/wesm/cpp-toolchain/lib/libbrotlicommon-static.a > ../../double-conversion_ep/src/double-conversion_ep/lib/libdouble-conversion.a > > ../../double-conversion_ep/src/double-conversion_ep/lib/libdouble-conversion.a > /home/wesm/cpp-toolchain/lib/libboost_system.so > /home/wesm/cpp-toolchain/lib/libboost_filesystem.so > /home/wesm/cpp-toolchain/lib/libboost_regex.so > ../../jemalloc_ep-prefix/src/jemalloc_ep/dist//lib/libjemalloc_pic.a > -lpthread -lrt /usr/lib/x86_64-linux-gnu/libpthread.so > /home/wesm/cpp-toolchain/lib/libgtest_main.a > /home/wesm/cpp-toolchain/lib/libgtest.a > {code} > Note how some of the static libraries are passed multiple times -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ARROW-3742) Fix pyarrow.types & gandiva cython bindings
[ https://issues.apache.org/jira/browse/ARROW-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philipp Moritz resolved ARROW-3742. --- Resolution: Fixed Fix Version/s: 0.12.0 Issue resolved by pull request 2931 [https://github.com/apache/arrow/pull/2931] > Fix pyarrow.types & gandiva cython bindings > --- > > Key: ARROW-3742 > URL: https://issues.apache.org/jira/browse/ARROW-3742 > Project: Apache Arrow > Issue Type: Bug > Components: Gandiva, Python >Reporter: Siyuan Zhuang >Assignee: Siyuan Zhuang >Priority: Major > Labels: pull-request-available > Fix For: 0.12.0 > > Time Spent: 2h 20m > Remaining Estimate: 0h > > 1. 'types.py' didn't export `_as_type`, causing failures in certain > cython/python combinations. I am surprised to see that the CI didn't fail. > 2. After updating the gandiva cpp part (ARROW-3587), the cython bindings > (ARROW-3602) are not consistent. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ARROW-2673) [Python] Add documentation + docstring for ARROW-2661
[ https://issues.apache.org/jira/browse/ARROW-2673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-2673. - Resolution: Fixed Issue resolved by pull request 2893 [https://github.com/apache/arrow/pull/2893] > [Python] Add documentation + docstring for ARROW-2661 > - > > Key: ARROW-2673 > URL: https://issues.apache.org/jira/browse/ARROW-2673 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Reporter: Wes McKinney >Assignee: Matt Topol >Priority: Major > Labels: pull-request-available > Fix For: 0.12.0 > > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-3744) [Ruby] Use garrow_table_to_string() in Arrow::Table#to_s
Kouhei Sutou created ARROW-3744: --- Summary: [Ruby] Use garrow_table_to_string() in Arrow::Table#to_s Key: ARROW-3744 URL: https://issues.apache.org/jira/browse/ARROW-3744 Project: Apache Arrow Issue Type: Improvement Components: Ruby Reporter: Kouhei Sutou Assignee: Kouhei Sutou
[jira] [Updated] (ARROW-3744) [Ruby] Use garrow_table_to_string() in Arrow::Table#to_s
[ https://issues.apache.org/jira/browse/ARROW-3744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-3744: -- Labels: pull-request-available (was: ) > [Ruby] Use garrow_table_to_string() in Arrow::Table#to_s > > > Key: ARROW-3744 > URL: https://issues.apache.org/jira/browse/ARROW-3744 > Project: Apache Arrow > Issue Type: Improvement > Components: Ruby >Reporter: Kouhei Sutou >Assignee: Kouhei Sutou >Priority: Minor > Labels: pull-request-available >
[jira] [Updated] (ARROW-3749) [GLib] Typos in documentation and test case name
[ https://issues.apache.org/jira/browse/ARROW-3749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-3749: -- Labels: pull-request-available (was: ) > [GLib] Typos in documentation and test case name > > > Key: ARROW-3749 > URL: https://issues.apache.org/jira/browse/ARROW-3749 > Project: Apache Arrow > Issue Type: Improvement > Components: GLib >Reporter: Kouhei Sutou >Assignee: Kouhei Sutou >Priority: Trivial > Labels: pull-request-available >
[jira] [Created] (ARROW-3748) [GLib] Add GArrowCSVReader
Kouhei Sutou created ARROW-3748: --- Summary: [GLib] Add GArrowCSVReader Key: ARROW-3748 URL: https://issues.apache.org/jira/browse/ARROW-3748 Project: Apache Arrow Issue Type: Improvement Components: GLib Reporter: Kouhei Sutou Assignee: Kouhei Sutou
[jira] [Updated] (ARROW-3748) [GLib] Add GArrowCSVReader
[ https://issues.apache.org/jira/browse/ARROW-3748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-3748: -- Labels: pull-request-available (was: ) > [GLib] Add GArrowCSVReader > -- > > Key: ARROW-3748 > URL: https://issues.apache.org/jira/browse/ARROW-3748 > Project: Apache Arrow > Issue Type: Improvement > Components: GLib >Reporter: Kouhei Sutou >Assignee: Kouhei Sutou >Priority: Minor > Labels: pull-request-available >
[jira] [Resolved] (ARROW-3716) [R] Missing cases for ChunkedArray conversion
[ https://issues.apache.org/jira/browse/ARROW-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-3716. - Resolution: Fixed Fix Version/s: 0.12.0 Issue resolved by pull request 2928 [https://github.com/apache/arrow/pull/2928] > [R] Missing cases for ChunkedArray conversion > - > > Key: ARROW-3716 > URL: https://issues.apache.org/jira/browse/ARROW-3716 > Project: Apache Arrow > Issue Type: Bug > Components: R >Reporter: Romain François >Priority: Major > Labels: pull-request-available > Fix For: 0.12.0 > > Time Spent: 50m > Remaining Estimate: 0h > > {code} > library(arrow) > tab <- table(iris) > tab$schema() > #> arrow::Schema > #> Sepal.Length: double > #> Sepal.Width: double > #> Petal.Length: double > #> Petal.Width: double > #> Species: dictionary > as_tibble(tab) > #> Error in Table__to_dataframe(x): cannot handle Array of type 26 > # simpler reprex: > a <- chunked_array(factor(c("a", "b"))) > a$as_vector() > #> Error in ChunkedArray__as_vector(self): cannot handle Array of type 26 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (ARROW-3716) [R] Missing cases for ChunkedArray conversion
[ https://issues.apache.org/jira/browse/ARROW-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney reassigned ARROW-3716: --- Assignee: Romain François > [R] Missing cases for ChunkedArray conversion > - > > Key: ARROW-3716 > URL: https://issues.apache.org/jira/browse/ARROW-3716 > Project: Apache Arrow > Issue Type: Bug > Components: R >Reporter: Romain François >Assignee: Romain François >Priority: Major > Labels: pull-request-available > Fix For: 0.12.0 > > Time Spent: 50m > Remaining Estimate: 0h > > {code} > library(arrow) > tab <- table(iris) > tab$schema() > #> arrow::Schema > #> Sepal.Length: double > #> Sepal.Width: double > #> Petal.Length: double > #> Petal.Width: double > #> Species: dictionary > as_tibble(tab) > #> Error in Table__to_dataframe(x): cannot handle Array of type 26 > # simpler reprex: > a <- chunked_array(factor(c("a", "b"))) > a$as_vector() > #> Error in ChunkedArray__as_vector(self): cannot handle Array of type 26 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-3742) Fix pyarrow.types & gandiva cython bindings
[ https://issues.apache.org/jira/browse/ARROW-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siyuan Zhuang updated ARROW-3742: - Description: 1. 'types.py' didn't export `_as_type`, causing failures in certain cython/python combinations. I am surprised to see that the CI didn't fail. 2. After updating the gandiva cpp part (ARROW-3587), the cython bindings (ARROW-3602) are not consistent. was:After updating the gandiva cpp part (ARROW-3587), the cython bindings (ARROW-3602) are not consistent. Summary: Fix pyarrow.types & gandiva cython bindings (was: Fix gandiva cython bindings) > Fix pyarrow.types & gandiva cython bindings > --- > > Key: ARROW-3742 > URL: https://issues.apache.org/jira/browse/ARROW-3742 > Project: Apache Arrow > Issue Type: Bug >Reporter: Siyuan Zhuang >Assignee: Siyuan Zhuang >Priority: Major > > 1. 'types.py' didn't export `_as_type`, causing failures in certain > cython/python combinations. I am surprised to see that the CI didn't fail. > 2. After updating the gandiva cpp part (ARROW-3587), the cython bindings > (ARROW-3602) are not consistent. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (ARROW-3476) [Java] mvn test in memory fails on a big-endian platform
[ https://issues.apache.org/jira/browse/ARROW-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16682228#comment-16682228 ] Kazuaki Ishizaki edited comment on ARROW-3476 at 11/10/18 6:13 AM: --- [~wesmckinn] Now, kou, pitrou, xhochy, kszucs, cploud, and I (@kiszk) can access https://ibmz-ci.osuosl.org/. was (Author: kiszk): [~wesmckinn] Now, kou, pitrou, xhochy, kszucs, cploud, and me (@kiszk) can access https://ibmz-ci.osuosl.org/. > [Java] mvn test in memory fails on a big-endian platform > > > Key: ARROW-3476 > URL: https://issues.apache.org/jira/browse/ARROW-3476 > Project: Apache Arrow > Issue Type: Bug > Components: Java >Reporter: Kazuaki Ishizaki >Priority: Major > > Apache Arrow is becoming a common way to exchange data among important > emerging analytics frameworks such as Pandas, NumPy, and Spark. > [IBM Z|https://en.wikipedia.org/wiki/IBM_Z] is one of the platforms used to process > critical transactions, such as banking or credit card transactions. Users of IBM Z want to > extract insights from these transactions using the emerging analytics systems > on IBM Z Linux. These analytics pipelines can also be fast and effective on > IBM Z Linux by using Apache Arrow in memory. > From a technical perspective, since IBM Z Linux uses a big-endian data > format, it is not possible to use Apache Arrow in this pipeline. If Apache > Arrow supported big-endian platforms, its use cases would expand. > When I ran the Apache Arrow test cases on a big-endian platform (ppc64be), > {{mvn test}} in {{java/memory}} failed due to an assertion. > In the {{TestEndianess.testLittleEndian}} test, the assertion fires during > the allocation of a {{RootAllocator}}. > {code} > $ uname -a > Linux ppc64be.novalocal 4.5.7-300.fc24.ppc64 #1 SMP Fri Jun 10 20:29:32 UTC > 2016 ppc64 ppc64 ppc64 GNU/Linux > $ arch > ppc64 > $ cd java/memory > $ mvn test > [INFO] Scanning for projects... > [INFO] > > [INFO] > > [INFO] Building Arrow Memory 0.12.0-SNAPSHOT > [INFO] > > [INFO] > ... 
> [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.082 > s - in org.apache.arrow.memory.TestAccountant > [INFO] Running org.apache.arrow.memory.TestLowCostIdentityHashMap > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.012 > s - in org.apache.arrow.memory.TestLowCostIdentityHashMap > [INFO] Running org.apache.arrow.memory.TestBaseAllocator > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.746 > s <<< FAILURE! - in org.apache.arrow.memory.TestEndianess > [ERROR] testLittleEndian(org.apache.arrow.memory.TestEndianess) Time > elapsed: 0.313 s <<< ERROR! > java.lang.ExceptionInInitializerError > at > org.apache.arrow.memory.TestEndianess.testLittleEndian(TestEndianess.java:31) > Caused by: java.lang.IllegalStateException: Arrow only runs on LittleEndian > systems. > at > org.apache.arrow.memory.TestEndianess.testLittleEndian(TestEndianess.java:31) > [ERROR] Tests run: 22, Failures: 0, Errors: 21, Skipped: 1, Time elapsed: > 0.055 s <<< FAILURE! - in org.apache.arrow.memory.TestBaseAllocator > ... > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
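The failure in the log above comes from a guard that rejects big-endian hosts at initialization time. The following Python sketch shows the general shape of such a guard; it is an analogy, not Arrow's Java code. The function name `require_little_endian` is hypothetical, and only the error message text is taken from the log.

```python
import sys


def require_little_endian(byteorder: str = sys.byteorder) -> None:
    """Raise if the given byte order (default: this host's) is not little-endian."""
    if byteorder != "little":
        # Message copied from the IllegalStateException shown in the log.
        raise RuntimeError("Arrow only runs on LittleEndian systems.")


if __name__ == "__main__":
    # On x86-64 this passes silently; on a big-endian host such as ppc64be it raises.
    require_little_endian()
    print(f"host byte order is {sys.byteorder}; check passed")
```

Because a check like this runs before any allocator is created, every test that touches the allocator fails on a big-endian host, which would be consistent with the 21 errors reported for `TestBaseAllocator`.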
[jira] [Updated] (ARROW-3743) [Ruby] Add support for saving/loading Feather
[ https://issues.apache.org/jira/browse/ARROW-3743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-3743: -- Labels: pull-request-available (was: ) > [Ruby] Add support for saving/loading Feather > - > > Key: ARROW-3743 > URL: https://issues.apache.org/jira/browse/ARROW-3743 > Project: Apache Arrow > Issue Type: Improvement > Components: Ruby >Reporter: Kouhei Sutou >Assignee: Kouhei Sutou >Priority: Minor > Labels: pull-request-available >
[jira] [Created] (ARROW-3743) [Ruby] Add support for saving/loading Feather
Kouhei Sutou created ARROW-3743: --- Summary: [Ruby] Add support for saving/loading Feather Key: ARROW-3743 URL: https://issues.apache.org/jira/browse/ARROW-3743 Project: Apache Arrow Issue Type: Improvement Components: Ruby Reporter: Kouhei Sutou Assignee: Kouhei Sutou
[jira] [Created] (ARROW-3749) [GLib] Typos in documentation and test case name
Kouhei Sutou created ARROW-3749: --- Summary: [GLib] Typos in documentation and test case name Key: ARROW-3749 URL: https://issues.apache.org/jira/browse/ARROW-3749 Project: Apache Arrow Issue Type: Improvement Components: GLib Reporter: Kouhei Sutou Assignee: Kouhei Sutou
[jira] [Updated] (ARROW-3746) [Gandiva] [Python] Make it possible to list all functions registered with Gandiva
[ https://issues.apache.org/jira/browse/ARROW-3746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-3746: -- Labels: pull-request-available (was: ) > [Gandiva] [Python] Make it possible to list all functions registered with > Gandiva > - > > Key: ARROW-3746 > URL: https://issues.apache.org/jira/browse/ARROW-3746 > Project: Apache Arrow > Issue Type: Improvement >Reporter: Philipp Moritz >Priority: Major > Labels: pull-request-available > > This will also be useful for documentation purposes (right now it is not very > easy to get a list of all the functions that are registered). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-3745) [C++] CMake passes static libraries multiple times to linker
[ https://issues.apache.org/jira/browse/ARROW-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16682208#comment-16682208 ] Wes McKinney commented on ARROW-3745: - I found the problem. Libraries are linking to themselves here https://github.com/apache/arrow/blob/master/cpp/cmake_modules/BuildUtils.cmake#L70 I will remove in https://github.com/apache/arrow/pull/2735 and see if that doesn't break anything > [C++] CMake passes static libraries multiple times to linker > > > Key: ARROW-3745 > URL: https://issues.apache.org/jira/browse/ARROW-3745 > Project: Apache Arrow > Issue Type: Bug > Components: C#, C++ >Affects Versions: 0.11.1 >Reporter: Wes McKinney >Priority: Major > > With {{make array-test}} I see > {code} > [ 97%] Building CXX object src/arrow/CMakeFiles/array-test.dir/array-test.cc.o > cd /home/wesm/code/arrow/cpp/build-test/src/arrow && /usr/bin/ccache > /usr/bin/clang++-6.0 -DARROW_JEMALLOC > -DARROW_JEMALLOC_INCLUDE_DIR=/home/wesm/code/arrow/cpp/build-test/jemalloc_ep-prefix/src/jemalloc_ep/dist//include > -DARROW_WITH_BROTLI -DARROW_WITH_LZ4 -DARROW_WITH_SNAPPY -DARROW_WITH_ZLIB > -DARROW_WITH_ZSTD -isystem /home/wesm/cpp-toolchain/include -isystem > /home/wesm/code/arrow/cpp/build-test/double-conversion_ep/src/double-conversion_ep/include > -isystem /home/wesm/code/arrow/cpp/build-test/jemalloc_ep-prefix/src > -isystem /home/wesm/code/arrow/cpp/thirdparty/hadoop/include > -I/home/wesm/code/arrow/cpp/build-test/src -I/home/wesm/code/arrow/cpp/src > -std=c++11 -Qunused-arguments -ggdb -O0 -Wall -Wno-unknown-warning-option > -msse3 -maltivec -g -std=gnu++11 -o > CMakeFiles/array-test.dir/array-test.cc.o -c > /home/wesm/code/arrow/cpp/src/arrow/array-test.cc > [100%] Linking CXX executable ../../debug/array-test > cd /home/wesm/code/arrow/cpp/build-test/src/arrow && > /home/wesm/cpp-toolchain/bin/cmake -E cmake_link_script > CMakeFiles/array-test.dir/link.txt --verbose=1 > /usr/bin/ccache /usr/bin/clang++-6.0 -std=c++11 
-Qunused-arguments -ggdb > -O0 -Wall -Wno-unknown-warning-option -msse3 -maltivec -g -rdynamic > CMakeFiles/array-test.dir/array-test.cc.o -o ../../debug/array-test > -Wl,-rpath,/home/wesm/cpp-toolchain/lib ../../debug/libarrow.a > /home/wesm/cpp-toolchain/lib/libgtest_main.a > /home/wesm/cpp-toolchain/lib/libgtest.a -ldl > /home/wesm/cpp-toolchain/lib/libglog.a /home/wesm/cpp-toolchain/lib/libglog.a > /home/wesm/cpp-toolchain/lib/libzstd.a /home/wesm/cpp-toolchain/lib/libzstd.a > /home/wesm/cpp-toolchain/lib/libz.so /home/wesm/cpp-toolchain/lib/libz.so > /home/wesm/cpp-toolchain/lib/libsnappy.a > /home/wesm/cpp-toolchain/lib/libsnappy.a > /home/wesm/cpp-toolchain/lib/liblz4.a /home/wesm/cpp-toolchain/lib/liblz4.a > /home/wesm/cpp-toolchain/lib/libbrotlidec-static.a > /home/wesm/cpp-toolchain/lib/libbrotlidec-static.a > /home/wesm/cpp-toolchain/lib/libbrotlienc-static.a > /home/wesm/cpp-toolchain/lib/libbrotlienc-static.a > /home/wesm/cpp-toolchain/lib/libbrotlicommon-static.a > /home/wesm/cpp-toolchain/lib/libbrotlicommon-static.a > ../../double-conversion_ep/src/double-conversion_ep/lib/libdouble-conversion.a > > ../../double-conversion_ep/src/double-conversion_ep/lib/libdouble-conversion.a > /home/wesm/cpp-toolchain/lib/libboost_system.so > /home/wesm/cpp-toolchain/lib/libboost_filesystem.so > /home/wesm/cpp-toolchain/lib/libboost_regex.so > ../../jemalloc_ep-prefix/src/jemalloc_ep/dist//lib/libjemalloc_pic.a > -lpthread -lrt /usr/lib/x86_64-linux-gnu/libpthread.so > /home/wesm/cpp-toolchain/lib/libgtest_main.a > /home/wesm/cpp-toolchain/lib/libgtest.a > {code} > Note how some of the static libraries are passed multiple times -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-3476) [Java] mvn test in memory fails on a big-endian platform
[ https://issues.apache.org/jira/browse/ARROW-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16682228#comment-16682228 ] Kazuaki Ishizaki commented on ARROW-3476: - [~wesmckinn] Now, kou, pitrou, xhochy, kszucs, cploud, and me (@kiszk) can access https://ibmz-ci.osuosl.org/. > [Java] mvn test in memory fails on a big-endian platform > > > Key: ARROW-3476 > URL: https://issues.apache.org/jira/browse/ARROW-3476 > Project: Apache Arrow > Issue Type: Bug > Components: Java >Reporter: Kazuaki Ishizaki >Priority: Major > > Apache Arrow is becoming commonplace to exchange data among important > emerging analytics frameworks such as Pandas, Numpy, and Spark. > [IBM Z|https://en.wikipedia.org/wiki/IBM_Z] is one of platforms to process > critical transactions such as bank or credit card. Users of IBM Z want to > extract insights from these transactions using the emerging analytics systems > on IBM Z Linux. These analytics pipelines can be also fast and effective on > IBM Z Linux by using Apache Arrow on memory. > From the technical perspective, since IBM Z Linux uses big-endian data > format, it is not possible to use Apache Arrow in this pipeline. If Apache > Arrow could support big-endian, the use case would be expanded. > When I ran test case of Apache arrow on a big-endian platform (ppc64be), > {{mvn test}} in memory causes a failure due to an assertion. > In {{TestEndianess.testLittleEndian}} test suite, the assertion occurs during > an allocation of a {{RootAllocator}} class. > {code} > $ uname -a > Linux ppc64be.novalocal 4.5.7-300.fc24.ppc64 #1 SMP Fri Jun 10 20:29:32 UTC > 2016 ppc64 ppc64 ppc64 GNU/Linux > $ arch > ppc64 > $ cd java/memory > $ mvn test > [INFO] Scanning for projects... > [INFO] > > [INFO] > > [INFO] Building Arrow Memory 0.12.0-SNAPSHOT > [INFO] > > [INFO] > ... 
> [INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.082 > s - in org.apache.arrow.memory.TestAccountant > [INFO] Running org.apache.arrow.memory.TestLowCostIdentityHashMap > [INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.012 > s - in org.apache.arrow.memory.TestLowCostIdentityHashMap > [INFO] Running org.apache.arrow.memory.TestBaseAllocator > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.746 > s <<< FAILURE! - in org.apache.arrow.memory.TestEndianess > [ERROR] testLittleEndian(org.apache.arrow.memory.TestEndianess) Time > elapsed: 0.313 s <<< ERROR! > java.lang.ExceptionInInitializerError > at > org.apache.arrow.memory.TestEndianess.testLittleEndian(TestEndianess.java:31) > Caused by: java.lang.IllegalStateException: Arrow only runs on LittleEndian > systems. > at > org.apache.arrow.memory.TestEndianess.testLittleEndian(TestEndianess.java:31) > [ERROR] Tests run: 22, Failures: 0, Errors: 21, Skipped: 1, Time elapsed: > 0.055 s <<< FAILURE! - in org.apache.arrow.memory.TestBaseAllocator > ... > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
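For readers unfamiliar with the byte-order issue behind the assertion above, here is a minimal Python sketch. It is not Arrow code: `require_little_endian` and `encode_int32` are hypothetical names, and the guard only mirrors the idea of the "Arrow only runs on LittleEndian systems" check shown in the log.

```python
import sys


def require_little_endian(byteorder=sys.byteorder):
    """Raise if the given byte order is not little-endian (hypothetical guard)."""
    if byteorder != "little":
        raise RuntimeError("this sketch assumes a little-endian platform")


def encode_int32(value, byteorder):
    """Encode a signed 32-bit integer using the given byte order."""
    return value.to_bytes(4, byteorder, signed=True)


if __name__ == "__main__":
    # The same integer has a different in-memory layout on each kind of platform.
    print(encode_int32(1, "little"))
    print(encode_int32(1, "big"))
```

The two encodings of the same value differ, which is why buffers written on one kind of platform cannot be read on the other without byte swapping.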
[jira] [Assigned] (ARROW-3613) [Go] Resize does not correctly update the length
[ https://issues.apache.org/jira/browse/ARROW-3613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney reassigned ARROW-3613: --- Assignee: Jonathan A Sternberg > [Go] Resize does not correctly update the length > > > Key: ARROW-3613 > URL: https://issues.apache.org/jira/browse/ARROW-3613 > Project: Apache Arrow > Issue Type: Bug > Components: Go >Reporter: Jonathan A Sternberg >Assignee: Jonathan A Sternberg >Priority: Major > Labels: pull-request-available > Fix For: 0.12.0 > > Time Spent: 1h > Remaining Estimate: 0h > > If you have the following code: > {code:java} > package main > import ( > "fmt" > "github.com/apache/arrow/go/arrow/array" > "github.com/apache/arrow/go/arrow/memory" > ) > func main() { > builder := array.NewFloat64Builder(memory.DefaultAllocator) > fmt.Println(builder.Len(), builder.Cap()) > builder.Reserve(44) > fmt.Println(builder.Len(), builder.Cap()) > builder.Resize(5) > fmt.Println(builder.Len(), builder.Cap()) > builder.Reserve(44) > for i := 0; i < 44; i++ { > builder.Append(0) > } > fmt.Println(builder.Len(), builder.Cap()) > builder.Resize(5) > fmt.Println(builder.Len(), builder.Cap()) > } > {code} > It gives the following output: > {code:java} > 0 0 > 0 64 > 0 32 > 44 64 > 44 32 > {code} > For whatever reason, the length is not recorded as 5. I understand why the > capacity might not be 5, but it does seem like the length should be set to 5 > if the array is resized to a length smaller than its current capacity. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (ARROW-3407) [C++] Add UTF8 conversion modes in CSV reader conversion options
[ https://issues.apache.org/jira/browse/ARROW-3407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney reassigned ARROW-3407: --- Assignee: Antoine Pitrou > [C++] Add UTF8 conversion modes in CSV reader conversion options > > > Key: ARROW-3407 > URL: https://issues.apache.org/jira/browse/ARROW-3407 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Wes McKinney >Assignee: Antoine Pitrou >Priority: Major > Labels: csv, pull-request-available > Fix For: 0.12.0 > > Time Spent: 40m > Remaining Estimate: 0h > > There should be a few options: > * Assume UTF8, but do not verify ("no seatbelts mode", for users that have > reasonable security about UTF8 and want the maximum performance) > * Full UTF8 verification > * Maybe ASCII-only verification (because ASCII verification is very fast) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
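The three options listed above can be sketched in plain Python as follows. This is only an illustration of the semantics, not the C++ CSV reader API: `Utf8Mode` and `check_cell` are hypothetical names, and the standard library decoder stands in for whatever validation routine the reader would use.

```python
from enum import Enum


class Utf8Mode(Enum):
    ASSUME = "assume"      # trust the input, perform no checks
    VERIFY = "verify"      # full UTF-8 validation
    ASCII_ONLY = "ascii"   # accept only 7-bit ASCII


def check_cell(data: bytes, mode: Utf8Mode) -> bool:
    """Return True if `data` is acceptable under `mode`."""
    if mode is Utf8Mode.ASSUME:
        return True
    if mode is Utf8Mode.ASCII_ONLY:
        return data.isascii()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

Note that the "assume" mode accepts invalid bytes by design; that is the trade-off the issue describes as "no seatbelts".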
[jira] [Resolved] (ARROW-3407) [C++] Add UTF8 conversion modes in CSV reader conversion options
[ https://issues.apache.org/jira/browse/ARROW-3407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-3407. - Resolution: Fixed Issue resolved by pull request 2924 [https://github.com/apache/arrow/pull/2924] > [C++] Add UTF8 conversion modes in CSV reader conversion options > > > Key: ARROW-3407 > URL: https://issues.apache.org/jira/browse/ARROW-3407 > Project: Apache Arrow > Issue Type: New Feature > Components: C++ >Reporter: Wes McKinney >Priority: Major > Labels: csv, pull-request-available > Fix For: 0.12.0 > > Time Spent: 40m > Remaining Estimate: 0h > > There should be a few options: > * Assume UTF8, but do not verify ("no seatbelts mode", for users that have > reasonable security about UTF8 and want the maximum performance) > * Full UTF8 verification > * Maybe ASCII-only verification (because ASCII verification is very fast) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (ARROW-3742) Fix gandiva cython bindings
[ https://issues.apache.org/jira/browse/ARROW-3742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siyuan Zhuang reassigned ARROW-3742: Assignee: Siyuan Zhuang > Fix gandiva cython bindings > --- > > Key: ARROW-3742 > URL: https://issues.apache.org/jira/browse/ARROW-3742 > Project: Apache Arrow > Issue Type: Bug >Reporter: Siyuan Zhuang >Assignee: Siyuan Zhuang >Priority: Major > > After updating the gandiva cpp part (ARROW-3587), the cython bindings > (ARROW-3602) are not consistent. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-3738) [C++] Add CSV conversion option to parse ISO8601-like timestamp strings
Wes McKinney created ARROW-3738: --- Summary: [C++] Add CSV conversion option to parse ISO8601-like timestamp strings Key: ARROW-3738 URL: https://issues.apache.org/jira/browse/ARROW-3738 Project: Apache Arrow Issue Type: New Feature Components: C++ Reporter: Wes McKinney See similar functionality in other libraries. I believe pandas has a fast path for iso8601 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
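As a rough illustration of what parsing ISO8601-like timestamp strings means for a CSV cell, here is a standard-library Python sketch. `parse_iso8601_like` is a hypothetical helper, not part of Arrow or pandas, and `datetime.fromisoformat` accepts only a subset of ISO 8601 that varies by Python version.

```python
from datetime import datetime


def parse_iso8601_like(text):
    """Return a datetime for ISO 8601-like text, or None if it does not parse."""
    try:
        return datetime.fromisoformat(text.strip())
    except ValueError:
        return None
```

Returning None rather than raising lets a caller fall back to treating the column as strings when any cell fails to parse.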
[jira] [Created] (ARROW-3739) [C++] Add option to convert a particular column to timestamps or dates using a passed strptime-compatible string
Wes McKinney created ARROW-3739: --- Summary: [C++] Add option to convert a particular column to timestamps or dates using a passed strptime-compatible string Key: ARROW-3739 URL: https://issues.apache.org/jira/browse/ARROW-3739 Project: Apache Arrow Issue Type: New Feature Components: C++ Reporter: Wes McKinney Probably will need something like {code} ... types={'date_col': csv.convert_date('%Y%m%d')} {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
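The proposed `types={'date_col': csv.convert_date('%Y%m%d')}` spelling could behave roughly like the sketch below. This is an assumption about the intended semantics, not an existing pyarrow API: `convert_date` and `convert_columns` are hypothetical stand-ins built on `datetime.strptime`.

```python
from datetime import date, datetime


def convert_date(fmt):
    """Return a converter that parses strings with the strptime format `fmt`."""
    def convert(value):
        return datetime.strptime(value, fmt).date()
    return convert


def convert_columns(rows, types):
    """Apply per-column converters from `types` to a list of dict rows."""
    return [
        {name: types[name](value) if name in types else value
         for name, value in row.items()}
        for row in rows
    ]
```

For example, `convert_columns([{"date_col": "20181109"}], {"date_col": convert_date("%Y%m%d")})` yields a row whose `date_col` is a `date` object; columns without a converter are left untouched.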
[jira] [Resolved] (ARROW-3613) [Go] Resize does not correctly update the length
[ https://issues.apache.org/jira/browse/ARROW-3613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-3613. - Resolution: Fixed Fix Version/s: 0.12.0 Issue resolved by pull request 2927 [https://github.com/apache/arrow/pull/2927] > [Go] Resize does not correctly update the length > > > Key: ARROW-3613 > URL: https://issues.apache.org/jira/browse/ARROW-3613 > Project: Apache Arrow > Issue Type: Bug > Components: Go >Reporter: Jonathan A Sternberg >Priority: Major > Labels: pull-request-available > Fix For: 0.12.0 > > Time Spent: 50m > Remaining Estimate: 0h > > If you have the following code: > {code:java} > package main > import ( > "fmt" > "github.com/apache/arrow/go/arrow/array" > "github.com/apache/arrow/go/arrow/memory" > ) > func main() { > builder := array.NewFloat64Builder(memory.DefaultAllocator) > fmt.Println(builder.Len(), builder.Cap()) > builder.Reserve(44) > fmt.Println(builder.Len(), builder.Cap()) > builder.Resize(5) > fmt.Println(builder.Len(), builder.Cap()) > builder.Reserve(44) > for i := 0; i < 44; i++ { > builder.Append(0) > } > fmt.Println(builder.Len(), builder.Cap()) > builder.Resize(5) > fmt.Println(builder.Len(), builder.Cap()) > } > {code} > It gives the following output: > {code:java} > 0 0 > 0 64 > 0 32 > 44 64 > 44 32 > {code} > For whatever reason, the length is not recorded as 5. I understand why the > capacity might not be 5, but it does seem like the length should be set to 5 > if the array is resized to a length smaller than its current capacity. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-3740) [C++] Calling ArrayBuilder::Resize with length smaller than current appended length results in invalid state
Wes McKinney created ARROW-3740: --- Summary: [C++] Calling ArrayBuilder::Resize with length smaller than current appended length results in invalid state Key: ARROW-3740 URL: https://issues.apache.org/jira/browse/ARROW-3740 Project: Apache Arrow Issue Type: Bug Components: C++ Affects Versions: 0.11.1 Reporter: Wes McKinney Fix For: 0.12.0 This was brought up by the Go patch ARROW-3613. If you append some data to a builder, then call {{Resize}} with something smaller than what's reported by {{length()}}, the capacity will be updated, but the length will not. So I think further appends would probably segfault. Either way we should add some tests for this case of "shrinking" a builder (which destroys data, but it's permitted by the API). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
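To make the invariant concrete, here is a toy Python model of a builder whose resize keeps the length within the capacity. It is not the Arrow C++ or Go implementation: `ToyBuilder` is hypothetical, and truncating on shrink is only one possible resolution (rejecting the call would be another).

```python
class ToyBuilder:
    """Toy stand-in for an array builder; not the Arrow implementation."""

    def __init__(self):
        self.values = []     # appended elements
        self.capacity = 0    # number of reserved slots

    def __len__(self):
        return len(self.values)

    def append(self, value):
        # Grow by doubling when the reserved slots are exhausted.
        if len(self.values) == self.capacity:
            self.capacity = max(1, self.capacity * 2)
        self.values.append(value)

    def resize(self, new_capacity):
        if new_capacity < 0:
            raise ValueError("capacity must be non-negative")
        # Shrinking below the current length discards trailing elements so
        # that len(self) <= self.capacity always holds afterwards.
        if new_capacity < len(self.values):
            del self.values[new_capacity:]
        self.capacity = new_capacity
```

A test for the "shrinking" case would append 44 values, resize to 5, and then check that both the length and the capacity report 5 and that a further append still succeeds.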
[jira] [Created] (ARROW-3741) [R] Add support for arrow::compute::Cast to convert Arrow arrays from one type to another
Wes McKinney created ARROW-3741: --- Summary: [R] Add support for arrow::compute::Cast to convert Arrow arrays from one type to another Key: ARROW-3741 URL: https://issues.apache.org/jira/browse/ARROW-3741 Project: Apache Arrow Issue Type: New Feature Components: R Reporter: Wes McKinney See {{pyarrow.Array.cast}} and {{pyarrow.Table.cast}} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-3613) [Go] Resize does not correctly update the length
[ https://issues.apache.org/jira/browse/ARROW-3613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16681230#comment-16681230 ] Alexandre Crayssac commented on ARROW-3613: --- OK, I am still investigating, since it looks like the bug has other ramifications. > [Go] Resize does not correctly update the length > > > Key: ARROW-3613 > URL: https://issues.apache.org/jira/browse/ARROW-3613 > Project: Apache Arrow > Issue Type: Bug > Components: Go >Reporter: Jonathan A Sternberg >Priority: Major > > If you have the following code: > {code:java} > package main > import ( > "fmt" > "github.com/apache/arrow/go/arrow/array" > "github.com/apache/arrow/go/arrow/memory" > ) > func main() { > builder := array.NewFloat64Builder(memory.DefaultAllocator) > fmt.Println(builder.Len(), builder.Cap()) > builder.Reserve(44) > fmt.Println(builder.Len(), builder.Cap()) > builder.Resize(5) > fmt.Println(builder.Len(), builder.Cap()) > builder.Reserve(44) > for i := 0; i < 44; i++ { > builder.Append(0) > } > fmt.Println(builder.Len(), builder.Cap()) > builder.Resize(5) > fmt.Println(builder.Len(), builder.Cap()) > } > {code} > It gives the following output: > {code:java} > 0 0 > 0 64 > 0 32 > 44 64 > 44 32 > {code} > For whatever reason, the length is not recorded as 5. I understand why the > capacity might not be 5, but it does seem like the length should be set to 5 > if the array is resized to a length smaller than its current capacity. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-3734) [C++] Linking static zstd library fails on Arch x86-64
Dimitri Vorona created ARROW-3734: - Summary: [C++] Linking static zstd library fails on Arch x86-64 Key: ARROW-3734 URL: https://issues.apache.org/jira/browse/ARROW-3734 Project: Apache Arrow Issue Type: Bug Components: C++ Affects Versions: 0.12.0 Reporter: Dimitri Vorona zlib install the static library into the ${CMAKE_INSTALL_LIBDIR} which is lib64 on 64-bit systems. We should also look at this path when we're linking. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-3734) [C++] Linking static zstd library fails on Arch x86-64
[ https://issues.apache.org/jira/browse/ARROW-3734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-3734: -- Labels: pull-request-available (was: ) > [C++] Linking static zstd library fails on Arch x86-64 > -- > > Key: ARROW-3734 > URL: https://issues.apache.org/jira/browse/ARROW-3734 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Affects Versions: 0.12.0 >Reporter: Dimitri Vorona >Priority: Major > Labels: pull-request-available > > zlib install the static library into the ${CMAKE_INSTALL_LIBDIR} which is > lib64 on 64-bit systems. We should also look at this path when we're linking. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-3734) [C++] Linking static zstd library fails on Arch x86-64
[ https://issues.apache.org/jira/browse/ARROW-3734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dimitri Vorona updated ARROW-3734: -- Description: zlib installs the static library into the ${CMAKE_INSTALL_LIBDIR} which is lib64 on 64-bit systems. We should also look at this path when we're linking. (was: zlib install the static library into the ${CMAKE_INSTALL_LIBDIR} which is lib64 on 64-bit systems. We should also look at this path when we're linking.) > [C++] Linking static zstd library fails on Arch x86-64 > -- > > Key: ARROW-3734 > URL: https://issues.apache.org/jira/browse/ARROW-3734 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Affects Versions: 0.12.0 >Reporter: Dimitri Vorona >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > zlib installs the static library into the ${CMAKE_INSTALL_LIBDIR} which is > lib64 on 64-bit systems. We should also look at this path when we're linking. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ARROW-3733) [GLib] Add to_string() to GArrowTable and GArrowColumn
[ https://issues.apache.org/jira/browse/ARROW-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved ARROW-3733. Resolution: Fixed Fix Version/s: 0.12.0 Issue resolved by pull request 2925 [https://github.com/apache/arrow/pull/2925] > [GLib] Add to_string() to GArrowTable and GArrowColumn > -- > > Key: ARROW-3733 > URL: https://issues.apache.org/jira/browse/ARROW-3733 > Project: Apache Arrow > Issue Type: Improvement > Components: GLib >Reporter: Kouhei Sutou >Assignee: Kouhei Sutou >Priority: Minor > Labels: pull-request-available > Fix For: 0.12.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (ARROW-3734) [C++] Linking static zstd library fails on Arch x86-64
[ https://issues.apache.org/jira/browse/ARROW-3734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn reassigned ARROW-3734: -- Assignee: Dimitri Vorona > [C++] Linking static zstd library fails on Arch x86-64 > -- > > Key: ARROW-3734 > URL: https://issues.apache.org/jira/browse/ARROW-3734 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Affects Versions: 0.12.0 >Reporter: Dimitri Vorona >Assignee: Dimitri Vorona >Priority: Major > Labels: pull-request-available > Fix For: 0.12.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > zlib installs the static library into the ${CMAKE_INSTALL_LIBDIR} which is > lib64 on 64-bit systems. We should also look at this path when we're linking. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ARROW-3734) [C++] Linking static zstd library fails on Arch x86-64
[ https://issues.apache.org/jira/browse/ARROW-3734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe L. Korn resolved ARROW-3734. Resolution: Fixed Fix Version/s: 0.12.0 Issue resolved by pull request 2926 [https://github.com/apache/arrow/pull/2926] > [C++] Linking static zstd library fails on Arch x86-64 > -- > > Key: ARROW-3734 > URL: https://issues.apache.org/jira/browse/ARROW-3734 > Project: Apache Arrow > Issue Type: Bug > Components: C++ >Affects Versions: 0.12.0 >Reporter: Dimitri Vorona >Priority: Major > Labels: pull-request-available > Fix For: 0.12.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > zlib installs the static library into the ${CMAKE_INSTALL_LIBDIR} which is > lib64 on 64-bit systems. We should also look at this path when we're linking. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ARROW-3613) [Go] Resize does not correctly update the length
[ https://issues.apache.org/jira/browse/ARROW-3613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16681405#comment-16681405 ] Alexandre Crayssac commented on ARROW-3613: --- Just submitted a PR : [https://github.com/apache/arrow/pull/2927] Need review though. > [Go] Resize does not correctly update the length > > > Key: ARROW-3613 > URL: https://issues.apache.org/jira/browse/ARROW-3613 > Project: Apache Arrow > Issue Type: Bug > Components: Go >Reporter: Jonathan A Sternberg >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > If you have the following code: > {code:java} > package main > import ( > "fmt" > "github.com/apache/arrow/go/arrow/array" > "github.com/apache/arrow/go/arrow/memory" > ) > func main() { > builder := array.NewFloat64Builder(memory.DefaultAllocator) > fmt.Println(builder.Len(), builder.Cap()) > builder.Reserve(44) > fmt.Println(builder.Len(), builder.Cap()) > builder.Resize(5) > fmt.Println(builder.Len(), builder.Cap()) > builder.Reserve(44) > for i := 0; i < 44; i++ { > builder.Append(0) > } > fmt.Println(builder.Len(), builder.Cap()) > builder.Resize(5) > fmt.Println(builder.Len(), builder.Cap()) > } > {code} > It gives the following output: > {code:java} > 0 0 > 0 64 > 0 32 > 44 64 > 44 32 > {code} > For whatever reason, the length is not recorded as 5. I understand why the > capacity might not be 5, but it does seem like the length should be set to 5 > if the array is resized to a length smaller than its current capacity. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-3613) [Go] Resize does not correctly update the length
[ https://issues.apache.org/jira/browse/ARROW-3613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-3613: -- Labels: pull-request-available (was: ) > [Go] Resize does not correctly update the length > > > Key: ARROW-3613 > URL: https://issues.apache.org/jira/browse/ARROW-3613 > Project: Apache Arrow > Issue Type: Bug > Components: Go >Reporter: Jonathan A Sternberg >Priority: Major > Labels: pull-request-available > > If you have the following code: > {code:java} > package main > import ( > "fmt" > "github.com/apache/arrow/go/arrow/array" > "github.com/apache/arrow/go/arrow/memory" > ) > func main() { > builder := array.NewFloat64Builder(memory.DefaultAllocator) > fmt.Println(builder.Len(), builder.Cap()) > builder.Reserve(44) > fmt.Println(builder.Len(), builder.Cap()) > builder.Resize(5) > fmt.Println(builder.Len(), builder.Cap()) > builder.Reserve(44) > for i := 0; i < 44; i++ { > builder.Append(0) > } > fmt.Println(builder.Len(), builder.Cap()) > builder.Resize(5) > fmt.Println(builder.Len(), builder.Cap()) > } > {code} > It gives the following output: > {code:java} > 0 0 > 0 64 > 0 32 > 44 64 > 44 32 > {code} > For whatever reason, the length is not recorded as 5. I understand why the > capacity might not be 5, but it does seem like the length should be set to 5 > if the array is resized to a length smaller than its current capacity. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ARROW-3698) [C++] Segmentation fault when using a large table in Gandiva
[ https://issues.apache.org/jira/browse/ARROW-3698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-3698. - Resolution: Fixed Fix Version/s: 0.12.0 Issue resolved by pull request 2902 [https://github.com/apache/arrow/pull/2902] > [C++] Segmentation fault when using a large table in Gandiva > > > Key: ARROW-3698 > URL: https://issues.apache.org/jira/browse/ARROW-3698 > Project: Apache Arrow > Issue Type: Bug > Components: C++, Gandiva >Reporter: Siyuan Zhuang >Priority: Major > Labels: pull-request-available > Fix For: 0.12.0 > > Time Spent: 4.5h > Remaining Estimate: 0h > > {code} > >>> import pyarrow as pa > Registry has 519 pre-compiled functions > >>> import pandas as pd > >>> import numpy as np > >>> import pyarrow.gandiva as gandiva > >>> import timeit > >>> > >>> from matplotlib import pyplot as plt > >>> for scale in range(25, 26): > ... frame_data = 1.0 * np.random.randint(0, 100, size=(2**scale, 2)) > ... df = pd.DataFrame(frame_data).add_prefix("col") > ... table = pa.Table.from_pandas(df) > ... > >>> > >>> def float64_add(table): > ... builder = gandiva.TreeExprBuilder() > ... node_a = builder.make_field(table.schema.field_by_name("col0")) > ... node_b = builder.make_field(table.schema.field_by_name("col1")) > ... sum = builder.make_function(b"add", [node_a, node_b], pa.float64()) > ... field_result = pa.field("c", pa.float64()) > ... expr = builder.make_expression(sum, field_result) > ... projector = gandiva.make_projector(table.schema, [expr], > pa.default_memory_pool()) > ... return projector > ... > >>> projector = float64_add(table) > >>> projector.evaluate(table.to_batches()[0]) > [1] 36393 segmentation fault python{code} > It is because there is an integer overflow in Gandiva: > [https://github.com/apache/arrow/blob/1a6545aa51f5f41f0233ee0a11ef87d21127c5ed/cpp/src/gandiva/projector.cc#L141] > It should be `int64_t` instead of `int`. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
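The difference between `int` and `int64_t` mentioned above can be illustrated with a small Python sketch using `ctypes`. The issue does not say which quantity overflowed, so the value below is purely illustrative; `as_int32` and `as_int64` are hypothetical helpers, and the sketch assumes a 32-bit C `int`.

```python
import ctypes


def as_int32(n):
    """Store n in a 32-bit signed integer; ctypes wraps without checking."""
    return ctypes.c_int32(n).value


def as_int64(n):
    """Store n in a 64-bit signed integer."""
    return ctypes.c_int64(n).value


if __name__ == "__main__":
    big = 2**31  # one past the largest 32-bit signed value
    print(as_int32(big))  # wraps to a negative number
    print(as_int64(big))  # preserved
```

A size or offset that wraps to a negative number in this way can easily lead to an out-of-bounds access, which is consistent with the segmentation fault reported.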
[jira] [Assigned] (ARROW-3698) [C++] Segmentation fault when using a large table in Gandiva
[ https://issues.apache.org/jira/browse/ARROW-3698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney reassigned ARROW-3698: --- Assignee: Siyuan Zhuang > [C++] Segmentation fault when using a large table in Gandiva > > > Key: ARROW-3698 > URL: https://issues.apache.org/jira/browse/ARROW-3698 > Project: Apache Arrow > Issue Type: Bug > Components: C++, Gandiva >Reporter: Siyuan Zhuang >Assignee: Siyuan Zhuang >Priority: Major > Labels: pull-request-available > Fix For: 0.12.0 > > Time Spent: 4.5h > Remaining Estimate: 0h > > {code} > >>> import pyarrow as pa > Registry has 519 pre-compiled functions > >>> import pandas as pd > >>> import numpy as np > >>> import pyarrow.gandiva as gandiva > >>> import timeit > >>> > >>> from matplotlib import pyplot as plt > >>> for scale in range(25, 26): > ... frame_data = 1.0 * np.random.randint(0, 100, size=(2**scale, 2)) > ... df = pd.DataFrame(frame_data).add_prefix("col") > ... table = pa.Table.from_pandas(df) > ... > >>> > >>> def float64_add(table): > ... builder = gandiva.TreeExprBuilder() > ... node_a = builder.make_field(table.schema.field_by_name("col0")) > ... node_b = builder.make_field(table.schema.field_by_name("col1")) > ... sum = builder.make_function(b"add", [node_a, node_b], pa.float64()) > ... field_result = pa.field("c", pa.float64()) > ... expr = builder.make_expression(sum, field_result) > ... projector = gandiva.make_projector(table.schema, [expr], > pa.default_memory_pool()) > ... return projector > ... > >>> projector = float64_add(table) > >>> projector.evaluate(table.to_batches()[0]) > [1] 36393 segmentation fault python{code} > It is because there is an integer overflow in Gandiva: > [https://github.com/apache/arrow/blob/1a6545aa51f5f41f0233ee0a11ef87d21127c5ed/cpp/src/gandiva/projector.cc#L141] > It should be `int64_t` instead of `int`. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-3735) [Python] Proper error handling in _ensure_type
Krisztian Szucs created ARROW-3735: -- Summary: [Python] Proper error handling in _ensure_type Key: ARROW-3735 URL: https://issues.apache.org/jira/browse/ARROW-3735 Project: Apache Arrow Issue Type: Improvement Components: Python Reporter: Krisztian Szucs Assignee: Krisztian Szucs We have multiple _ensure_type-like functions; the one defined in array.pxi passes None through, which causes a segfault in the following example: {code} pa.array([1, 2, 3]).cast(None) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
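One way to picture the requested error handling is a validator that raises a Python exception instead of passing None on to lower-level code. The sketch below is hypothetical and does not reproduce the pyarrow function: `ensure_type`, `allow_none`, and `KNOWN_TYPES` are invented for illustration.

```python
KNOWN_TYPES = {"int64", "float64", "string"}  # hypothetical type names


def ensure_type(ty, allow_none=False):
    """Validate a type argument, raising instead of passing None through."""
    if ty is None:
        if allow_none:
            return None
        raise TypeError("a data type is required, got None")
    if isinstance(ty, str) and ty in KNOWN_TYPES:
        return ty
    raise TypeError(f"cannot interpret {ty!r} as a data type")
```

With a check of this kind in front of the cast, a call such as the one in the example would fail with a TypeError rather than crash the interpreter.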
[jira] [Commented] (ARROW-3701) [Gandiva] Add support for decimal operations
[ https://issues.apache.org/jira/browse/ARROW-3701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16681704#comment-16681704 ] Pindikura Ravindra commented on ARROW-3701: --- @wesm - I tried the same cmd on my windows 10 home desktop using llc from llvm-4.1. It worked fine. > [Gandiva] Add support for decimal operations > > > Key: ARROW-3701 > URL: https://issues.apache.org/jira/browse/ARROW-3701 > Project: Apache Arrow > Issue Type: Task > Components: Gandiva >Reporter: Pindikura Ravindra >Assignee: Pindikura Ravindra >Priority: Major > > To begin with, will add support for 128-bit decimals. There are two parts : > # llvm_generator needs to understand decimal types (value, precision, scale) > # code decimal operations : add/subtract/multiply/divide/mod/.. > ** This will be c++ code that can be pre-compiled to emit IR code -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-3736) [CI/Docker] Ninja test in docker-compose run cpp hangs
Krisztian Szucs created ARROW-3736: -- Summary: [CI/Docker] Ninja test in docker-compose run cpp hangs Key: ARROW-3736 URL: https://issues.apache.org/jira/browse/ARROW-3736 Project: Apache Arrow Issue Type: Improvement Components: Continuous Integration Reporter: Krisztian Szucs Assignee: Krisztian Szucs -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ARROW-3737) [CI/Docker/Python] Support running integration tests on multiple python versions
Krisztian Szucs created ARROW-3737: -- Summary: [CI/Docker/Python] Support running integration tests on multiple python versions Key: ARROW-3737 URL: https://issues.apache.org/jira/browse/ARROW-3737 Project: Apache Arrow Issue Type: Improvement Components: Continuous Integration, Python Reporter: Krisztian Szucs Assignee: Krisztian Szucs Currently the python-3.6 image is pinned in integration/hdfs/Dockerfile and integration/pandas-master/Dockerfile. It's possible to pass a build-time argument, similar to how the arrow:python-${PYTHON_VERSION} image works. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ARROW-3611) Give error more quickly when pyarrow serialization context is used incorrectly.
[ https://issues.apache.org/jira/browse/ARROW-3611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-3611. - Resolution: Fixed Fix Version/s: 0.12.0 Issue resolved by pull request 2833 [https://github.com/apache/arrow/pull/2833] > Give error more quickly when pyarrow serialization context is used > incorrectly. > --- > > Key: ARROW-3611 > URL: https://issues.apache.org/jira/browse/ARROW-3611 > Project: Apache Arrow > Issue Type: Improvement > Components: Python >Reporter: Robert Nishihara >Assignee: Robert Nishihara >Priority: Minor > Labels: pull-request-available > Fix For: 0.12.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > When {{type_id}} is not a string or can't be cast to a string, > {{register_type}} will succeed, but {{_deserialize_callback}} can fail. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
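The "fail at registration time rather than at deserialization time" idea can be sketched as below. This is a toy registry, not the pyarrow serialization context: `ToyContext` and `lookup` are hypothetical, and requiring a `str` instance is an assumption about what "used incorrectly" should mean.

```python
class ToyContext:
    """Toy registry that validates type_id eagerly; not the pyarrow API."""

    def __init__(self):
        self._by_id = {}

    def register_type(self, type_, type_id):
        # Reject a bad type_id immediately instead of letting a later
        # lookup fail far away from the mistake.
        if not isinstance(type_id, str):
            raise TypeError(
                f"type_id must be a string, got {type(type_id).__name__}"
            )
        self._by_id[type_id] = type_

    def lookup(self, type_id):
        return self._by_id[type_id]
```

The error then points at the faulty `register_type` call itself, which is usually much easier to debug than a failure inside a deserialization callback.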
[jira] [Resolved] (ARROW-3721) [Gandiva] [Python] Support all Gandiva literals
[ https://issues.apache.org/jira/browse/ARROW-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs resolved ARROW-3721. Resolution: Fixed Fix Version/s: 0.12.0 Issue resolved by pull request 2920 [https://github.com/apache/arrow/pull/2920] > [Gandiva] [Python] Support all Gandiva literals > --- > > Key: ARROW-3721 > URL: https://issues.apache.org/jira/browse/ARROW-3721 > Project: Apache Arrow > Issue Type: Improvement >Reporter: Philipp Moritz >Assignee: Philipp Moritz >Priority: Major > Labels: pull-request-available > Fix For: 0.12.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > Support all the literals from > [https://github.com/apache/arrow/blob/5b116ab175292fe70ed3c8727bcc6868b9695f4a/cpp/src/gandiva/tree_expr_builder.h#L35] > in the Cython bindings. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ARROW-3716) [R] Missing cases for ChunkedArray conversion
[ https://issues.apache.org/jira/browse/ARROW-3716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-3716: -- Labels: pull-request-available (was: ) > [R] Missing cases for ChunkedArray conversion > - > > Key: ARROW-3716 > URL: https://issues.apache.org/jira/browse/ARROW-3716 > Project: Apache Arrow > Issue Type: Bug > Components: R >Reporter: Romain François >Priority: Major > Labels: pull-request-available > > {code} > library(arrow) > tab <- table(iris) > tab$schema() > #> arrow::Schema > #> Sepal.Length: double > #> Sepal.Width: double > #> Petal.Length: double > #> Petal.Width: double > #> Species: dictionary > as_tibble(tab) > #> Error in Table__to_dataframe(x): cannot handle Array of type 26 > # simpler reprex: > a <- chunked_array(factor(c("a", "b"))) > a$as_vector() > #> Error in ChunkedArray__as_vector(self): cannot handle Array of type 26 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)