date:20181119

[jira] [Resolved] (ARROW-3501) [Gandiva] Enable building with gcc 4.8.x on Ubuntu Trusty, similar distros

2018-11-19 Thread Kouhei Sutou (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kouhei Sutou resolved ARROW-3501.
-
Resolution: Fixed

> [Gandiva] Enable building with gcc 4.8.x on Ubuntu Trusty, similar distros
> --
>
> Key: ARROW-3501
> URL: https://issues.apache.org/jira/browse/ARROW-3501
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Gandiva
>Reporter: Pindikura Ravindra
>Assignee: Wes McKinney
>Priority: Major
> Fix For: 0.12.0
>
>
> Gandiva has a dependency on gcc 4.9 - causes a link error with gcc 4.8. 
> Investigate and remove this dependency if possible. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Resolved] (ARROW-3436) [C++] Boost version required by Gandiva is too new for Ubuntu 14.04

2018-11-19 Thread Kouhei Sutou (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kouhei Sutou resolved ARROW-3436.
-
Resolution: Fixed

https://github.com/apache/arrow/pull/2998 resolved this too.

> [C++] Boost version required by Gandiva is too new for Ubuntu 14.04
> ---
>
> Key: ARROW-3436
> URL: https://issues.apache.org/jira/browse/ARROW-3436
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>Priority: Major
> Fix For: 0.12.0
>
>
> I encountered this bug when testing a non-conda-toolchain build on Ubuntu 
> Trusty
> {code}
> [ 56%] Building CXX object src/gandiva/CMakeFiles/lru_cache_test.dir/lru_cache_test.cc.o
> /home/wesm/code/arrow/cpp/src/gandiva/lru_cache_test.cc: In member function ‘virtual void gandiva::TestLruCache_TestLruBehavior_Test::TestBody()’:
> /home/wesm/code/arrow/cpp/src/gandiva/lru_cache_test.cc:62:188: error: ‘class boost::optional >’ has no member named ‘value’
>    ASSERT_EQ(cache_.get(TestCacheKey(1)).value(), "hello");
> /home/wesm/code/arrow/cpp/src/gandiva/lru_cache_test.cc:62:203: error: template argument 1 is invalid
>    ASSERT_EQ(cache_.get(TestCacheKey(1)).value(), "hello");
> /home/wesm/code/arrow/cpp/src/gandiva/lru_cache_test.cc:62:294: error: ‘class boost::optional >’ has no member named ‘value’
>    ASSERT_EQ(cache_.get(TestCacheKey(1)).value(), "hello");
> make[2]: *** [src/gandiva/CMakeFiles/lru_cache_test.dir/lru_cache_test.cc.o] Error 1
> make[1]: *** [src/gandiva/CMakeFiles/lru_cache_test.dir/all] Error 2
> make: *** [all] Error 2
> {code}
> Abseil has a {{std::optional}} backport, so we could switch from using 
> {{boost::optional}} if/when we start using Abseil 
> https://github.com/abseil/abseil-cpp/blob/master/absl/types/optional.h




[jira] [Resolved] (ARROW-3437) [Gandiva][C++] Configure static linking of libgcc, libstdc++ with LDFLAGS

2018-11-19 Thread Kouhei Sutou (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kouhei Sutou resolved ARROW-3437.
-
Resolution: Fixed

Issue resolved by pull request 2998
[https://github.com/apache/arrow/pull/2998]

> [Gandiva][C++] Configure static linking of libgcc, libstdc++ with LDFLAGS 
> --
>
> Key: ARROW-3437
> URL: https://issues.apache.org/jira/browse/ARROW-3437
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, Gandiva
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This is to create dependency-free binaries for deployment on Linux. Currently 
> this is hard coded but some deployments (e.g. conda) may wish to use the 
> libstdc++ that is available




[jira] [Resolved] (ARROW-3773) [C++] Remove duplicated AssertArraysEqual code in parquet/arrow/arrow-reader-writer-test.cc

2018-11-19 Thread Kouhei Sutou (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kouhei Sutou resolved ARROW-3773.
-
Resolution: Fixed

Issue resolved by pull request 2999
[https://github.com/apache/arrow/pull/2999]

> [C++] Remove duplicated AssertArraysEqual code in 
> parquet/arrow/arrow-reader-writer-test.cc
> ---
>
> Key: ARROW-3773
> URL: https://issues.apache.org/jira/browse/ARROW-3773
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Phillip Cloud
>Assignee: Wes McKinney
>Priority: Major
>  Labels: parquet, pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>





[jira] [Assigned] (ARROW-3837) [C++] gflags link errors on Windows

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned ARROW-3837:
---

Assignee: Wes McKinney

> [C++] gflags link errors on Windows
> ---
>
> Key: ARROW-3837
> URL: https://issues.apache.org/jira/browse/ARROW-3837
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> These errors have been occurring in the last few days
> https://ci.appveyor.com/project/ApacheSoftwareFoundation/arrow/builds/20402981/job/cygaqwbjulgaxcn8




[jira] [Resolved] (ARROW-3837) [C++] gflags link errors on Windows

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-3837.
-
Resolution: Fixed

Issue resolved by pull request 3000
[https://github.com/apache/arrow/pull/3000]

> [C++] gflags link errors on Windows
> ---
>
> Key: ARROW-3837
> URL: https://issues.apache.org/jira/browse/ARROW-3837
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Wes McKinney
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> These errors have been occurring in the last few days
> https://ci.appveyor.com/project/ApacheSoftwareFoundation/arrow/builds/20402981/job/cygaqwbjulgaxcn8




[jira] [Updated] (ARROW-3837) [C++] gflags link errors on Windows

2018-11-19 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-3837:
--
Labels: pull-request-available  (was: )

> [C++] gflags link errors on Windows
> ---
>
> Key: ARROW-3837
> URL: https://issues.apache.org/jira/browse/ARROW-3837
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Wes McKinney
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>
> These errors have been occurring in the last few days
> https://ci.appveyor.com/project/ApacheSoftwareFoundation/arrow/builds/20402981/job/cygaqwbjulgaxcn8




[jira] [Updated] (ARROW-3773) [C++] Remove duplicated AssertArraysEqual code in parquet/arrow/arrow-reader-writer-test.cc

2018-11-19 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-3773:
--
Labels: parquet pull-request-available  (was: parquet)

> [C++] Remove duplicated AssertArraysEqual code in 
> parquet/arrow/arrow-reader-writer-test.cc
> ---
>
> Key: ARROW-3773
> URL: https://issues.apache.org/jira/browse/ARROW-3773
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Phillip Cloud
>Assignee: Wes McKinney
>Priority: Major
>  Labels: parquet, pull-request-available
> Fix For: 0.12.0
>
>





[jira] [Assigned] (ARROW-3773) [C++] Remove duplicated AssertArraysEqual code in parquet/arrow/arrow-reader-writer-test.cc

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned ARROW-3773:
---

Assignee: Wes McKinney

> [C++] Remove duplicated AssertArraysEqual code in 
> parquet/arrow/arrow-reader-writer-test.cc
> ---
>
> Key: ARROW-3773
> URL: https://issues.apache.org/jira/browse/ARROW-3773
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Phillip Cloud
>Assignee: Wes McKinney
>Priority: Major
>  Labels: parquet
> Fix For: 0.12.0
>
>





[jira] [Commented] (ARROW-3501) [Gandiva] Enable building with gcc 4.8.x on Ubuntu Trusty, similar distros

2018-11-19 Thread Wes McKinney (JIRA)



[ 
https://issues.apache.org/jira/browse/ARROW-3501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692360#comment-16692360
 ] 

Wes McKinney commented on ARROW-3501:
-

This is working in https://github.com/apache/arrow/pull/2998

> [Gandiva] Enable building with gcc 4.8.x on Ubuntu Trusty, similar distros
> --
>
> Key: ARROW-3501
> URL: https://issues.apache.org/jira/browse/ARROW-3501
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Gandiva
>Reporter: Pindikura Ravindra
>Assignee: Wes McKinney
>Priority: Major
> Fix For: 0.12.0
>
>
> Gandiva has a dependency on gcc 4.9 - causes a link error with gcc 4.8. 
> Investigate and remove this dependency if possible. 




[jira] [Updated] (ARROW-3437) [Gandiva][C++] Configure static linking of libgcc, libstdc++ with LDFLAGS

2018-11-19 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-3437:
--
Labels: pull-request-available  (was: )

> [Gandiva][C++] Configure static linking of libgcc, libstdc++ with LDFLAGS 
> --
>
> Key: ARROW-3437
> URL: https://issues.apache.org/jira/browse/ARROW-3437
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, Gandiva
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>
> This is to create dependency-free binaries for deployment on Linux. Currently 
> this is hard coded but some deployments (e.g. conda) may wish to use the 
> libstdc++ that is available




[jira] [Assigned] (ARROW-3501) [Gandiva] Enable building with gcc 4.8.x on Ubuntu Trusty, similar distros

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned ARROW-3501:
---

Assignee: Wes McKinney

> [Gandiva] Enable building with gcc 4.8.x on Ubuntu Trusty, similar distros
> --
>
> Key: ARROW-3501
> URL: https://issues.apache.org/jira/browse/ARROW-3501
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Gandiva
>Reporter: Pindikura Ravindra
>Assignee: Wes McKinney
>Priority: Major
> Fix For: 0.12.0
>
>
> Gandiva has a dependency on gcc 4.9 - causes a link error with gcc 4.8. 
> Investigate and remove this dependency if possible. 




[jira] [Updated] (ARROW-3501) [Gandiva] Enable building with gcc 4.8.x on Ubuntu Trusty, similar distros

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-3501:

Fix Version/s: 0.12.0

> [Gandiva] Enable building with gcc 4.8.x on Ubuntu Trusty, similar distros
> --
>
> Key: ARROW-3501
> URL: https://issues.apache.org/jira/browse/ARROW-3501
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Gandiva
>Reporter: Pindikura Ravindra
>Assignee: Wes McKinney
>Priority: Major
> Fix For: 0.12.0
>
>
> Gandiva has a dependency on gcc 4.9 - causes a link error with gcc 4.8. 
> Investigate and remove this dependency if possible. 




[jira] [Assigned] (ARROW-3436) [C++] Boost version required by Gandiva is too new for Ubuntu 14.04

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned ARROW-3436:
---

Assignee: Wes McKinney

> [C++] Boost version required by Gandiva is too new for Ubuntu 14.04
> ---
>
> Key: ARROW-3436
> URL: https://issues.apache.org/jira/browse/ARROW-3436
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>Priority: Major
> Fix For: 0.12.0
>
>
> I encountered this bug when testing a non-conda-toolchain build on Ubuntu 
> Trusty
> {code}
> [ 56%] Building CXX object src/gandiva/CMakeFiles/lru_cache_test.dir/lru_cache_test.cc.o
> /home/wesm/code/arrow/cpp/src/gandiva/lru_cache_test.cc: In member function ‘virtual void gandiva::TestLruCache_TestLruBehavior_Test::TestBody()’:
> /home/wesm/code/arrow/cpp/src/gandiva/lru_cache_test.cc:62:188: error: ‘class boost::optional >’ has no member named ‘value’
>    ASSERT_EQ(cache_.get(TestCacheKey(1)).value(), "hello");
> /home/wesm/code/arrow/cpp/src/gandiva/lru_cache_test.cc:62:203: error: template argument 1 is invalid
>    ASSERT_EQ(cache_.get(TestCacheKey(1)).value(), "hello");
> /home/wesm/code/arrow/cpp/src/gandiva/lru_cache_test.cc:62:294: error: ‘class boost::optional >’ has no member named ‘value’
>    ASSERT_EQ(cache_.get(TestCacheKey(1)).value(), "hello");
> make[2]: *** [src/gandiva/CMakeFiles/lru_cache_test.dir/lru_cache_test.cc.o] Error 1
> make[1]: *** [src/gandiva/CMakeFiles/lru_cache_test.dir/all] Error 2
> make: *** [all] Error 2
> {code}
> Abseil has a {{std::optional}} backport, so we could switch from using 
> {{boost::optional}} if/when we start using Abseil 
> https://github.com/abseil/abseil-cpp/blob/master/absl/types/optional.h




[jira] [Assigned] (ARROW-3437) [Gandiva][C++] Configure static linking of libgcc, libstdc++ with LDFLAGS

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned ARROW-3437:
---

Assignee: Wes McKinney

> [Gandiva][C++] Configure static linking of libgcc, libstdc++ with LDFLAGS 
> --
>
> Key: ARROW-3437
> URL: https://issues.apache.org/jira/browse/ARROW-3437
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++, Gandiva
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>Priority: Blocker
> Fix For: 0.12.0
>
>
> This is to create dependency-free binaries for deployment on Linux. Currently 
> this is hard coded but some deployments (e.g. conda) may wish to use the 
> libstdc++ that is available




[jira] [Updated] (ARROW-3434) [Packaging] Add Apache ORC C++ library to conda-forge

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-3434:

Fix Version/s: (was: 0.12.0)
   0.13.0

> [Packaging] Add Apache ORC C++ library to conda-forge
> -
>
> Key: ARROW-3434
> URL: https://issues.apache.org/jira/browse/ARROW-3434
> Project: Apache Arrow
>  Issue Type: Task
>  Components: C++
>Reporter: Wes McKinney
>Priority: Major
>  Labels: toolchain
> Fix For: 0.13.0
>
>
> In the vein of "toolchain all the things", it would be useful to be able to 
> obtain the ORC static libraries from a conda package rather than building 
> from source every time




[jira] [Commented] (ARROW-3441) [Gandiva][C++] Produce fewer test executables

2018-11-19 Thread Wes McKinney (JIRA)



[ 
https://issues.apache.org/jira/browse/ARROW-3441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692340#comment-16692340
 ] 

Wes McKinney commented on ARROW-3441:
-

We can revisit this after ARROW-3254. Moved to 0.13

> [Gandiva][C++] Produce fewer test executables
> -
>
> Key: ARROW-3441
> URL: https://issues.apache.org/jira/browse/ARROW-3441
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++, Gandiva
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 0.13.0
>
>
> In ARROW-3254, I am adding the functionality to create test executables from 
> multiple files that use googletest. So we can continue to have relatively 
> small unit test files, but combine unit tests into groups of 
> semantically-related functionality. 




[jira] [Updated] (ARROW-3424) [Python] Improved workflow for loading an arbitrary collection of Parquet files

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-3424:

Fix Version/s: (was: 0.12.0)
   0.13.0

> [Python] Improved workflow for loading an arbitrary collection of Parquet 
> files
> ---
>
> Key: ARROW-3424
> URL: https://issues.apache.org/jira/browse/ARROW-3424
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Wes McKinney
>Priority: Major
>  Labels: parquet
> Fix For: 0.13.0
>
>
> See SO question for use case: 
> https://stackoverflow.com/questions/52613682/load-multiple-parquet-files-into-dataframe-for-analysis




[jira] [Updated] (ARROW-3441) [Gandiva][C++] Produce fewer test executables

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-3441:

Fix Version/s: (was: 0.12.0)
   0.13.0

> [Gandiva][C++] Produce fewer test executables
> -
>
> Key: ARROW-3441
> URL: https://issues.apache.org/jira/browse/ARROW-3441
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++, Gandiva
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 0.13.0
>
>
> In ARROW-3254, I am adding the functionality to create test executables from 
> multiple files that use googletest. So we can continue to have relatively 
> small unit test files, but combine unit tests into groups of 
> semantically-related functionality. 




[jira] [Updated] (ARROW-3778) [C++] Don't put implementations in test-util.h

2018-11-19 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-3778:
--
Labels: pull-request-available  (was: )

> [C++] Don't put implementations in test-util.h
> --
>
> Key: ARROW-3778
> URL: https://issues.apache.org/jira/browse/ARROW-3778
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Affects Versions: 0.11.1
>Reporter: Antoine Pitrou
>Assignee: Wes McKinney
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>
> {{test-util.h}} is included in most (all?) test files, and it is quite slow to 
> compile because it includes many other files and recompiles helper functions 
> all the time. Instead we should have only declarations in {{test-util.h}} and 
> put implementations in a separate {{.cc}} file.




[jira] [Assigned] (ARROW-3778) [C++] Don't put implementations in test-util.h

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned ARROW-3778:
---

Assignee: Wes McKinney

> [C++] Don't put implementations in test-util.h
> --
>
> Key: ARROW-3778
> URL: https://issues.apache.org/jira/browse/ARROW-3778
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Affects Versions: 0.11.1
>Reporter: Antoine Pitrou
>Assignee: Wes McKinney
>Priority: Major
> Fix For: 0.12.0
>
>
> {{test-util.h}} is included in most (all?) test files, and it is quite slow to 
> compile because it includes many other files and recompiles helper functions 
> all the time. Instead we should have only declarations in {{test-util.h}} and 
> put implementations in a separate {{.cc}} file.




[jira] [Commented] (ARROW-3778) [C++] Don't put implementations in test-util.h

2018-11-19 Thread Wes McKinney (JIRA)



[ 
https://issues.apache.org/jira/browse/ARROW-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692311#comment-16692311
 ] 

Wes McKinney commented on ARROW-3778:
-

I'll take care of this, since I already partially did it in my other patch

> [C++] Don't put implementations in test-util.h
> --
>
> Key: ARROW-3778
> URL: https://issues.apache.org/jira/browse/ARROW-3778
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Affects Versions: 0.11.1
>Reporter: Antoine Pitrou
>Assignee: Wes McKinney
>Priority: Major
> Fix For: 0.12.0
>
>
> {{test-util.h}} is included in most (all?) test files, and it is quite slow to 
> compile because it includes many other files and recompiles helper functions 
> all the time. Instead we should have only declarations in {{test-util.h}} and 
> put implementations in a separate {{.cc}} file.




[jira] [Assigned] (ARROW-3785) [C++] Use double-conversion conda package in CI toolchain

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned ARROW-3785:
---

Assignee: Wes McKinney

> [C++] Use double-conversion conda package in CI toolchain
> -
>
> Key: ARROW-3785
> URL: https://issues.apache.org/jira/browse/ARROW-3785
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>Priority: Major
> Fix For: 0.12.0
>
>
> This is being built from the EP currently




[jira] [Commented] (ARROW-3785) [C++] Use double-conversion conda package in CI toolchain

2018-11-19 Thread Wes McKinney (JIRA)



[ 
https://issues.apache.org/jira/browse/ARROW-3785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692287#comment-16692287
 ] 

Wes McKinney commented on ARROW-3785:
-

This doesn't work with ARROW_BUILD_TOOLCHAIN. Looking

> [C++] Use double-conversion conda package in CI toolchain
> -
>
> Key: ARROW-3785
> URL: https://issues.apache.org/jira/browse/ARROW-3785
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>Priority: Major
> Fix For: 0.12.0
>
>
> This is being built from the EP currently




[jira] [Resolved] (ARROW-3836) [C++] Add PREFIX option to ADD_ARROW_BENCHMARK

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-3836.
-
Resolution: Fixed

Issue resolved by pull request 2993
[https://github.com/apache/arrow/pull/2993]

> [C++] Add PREFIX option to ADD_ARROW_BENCHMARK
> --
>
> Key: ARROW-3836
> URL: https://issues.apache.org/jira/browse/ARROW-3836
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> See option in ADD_ARROW_TEST




[jira] [Updated] (ARROW-3593) [R] CI builds failing due to GitHub API rate limits

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-3593:

Fix Version/s: (was: 0.12.0)
   0.13.0

> [R] CI builds failing due to GitHub API rate limits
> ---
>
> Key: ARROW-3593
> URL: https://issues.apache.org/jira/browse/ARROW-3593
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: R
>Reporter: Wes McKinney
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.13.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Could be due to other GitHub issues of late. [~romainfrancois] 
> [~javierluraschi] could you have a look?
> https://travis-ci.org/apache/arrow/jobs/445003873#L2325




[jira] [Updated] (ARROW-3631) [C#] Add Appveyor build for C#

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-3631:

Fix Version/s: (was: 0.12.0)
   0.13.0

> [C#] Add Appveyor build for C#
> --
>
> Key: ARROW-3631
> URL: https://issues.apache.org/jira/browse/ARROW-3631
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: C#
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 0.13.0
>
>
> Test C# library on Windows




[jira] [Updated] (ARROW-3824) [R] Document developer workflow for building project, running unit tests in r/README.md

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-3824:

Fix Version/s: (was: 0.12.0)
   0.13.0

> [R] Document developer workflow for building project, running unit tests in 
> r/README.md
> ---
>
> Key: ARROW-3824
> URL: https://issues.apache.org/jira/browse/ARROW-3824
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: R
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 0.13.0
>
>
> Not being a regular R developer, it's not clear to me how to build and run 
> the test suite if I wanted to contribute to the project



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (ARROW-2879) [Python] Arrow plasma can only use a small part of specified shared memory

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-2879:

Fix Version/s: (was: 0.12.0)
   0.13.0

> [Python] Arrow plasma can only use a small part of specified shared memory
> --
>
> Key: ARROW-2879
> URL: https://issues.apache.org/jira/browse/ARROW-2879
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: chineking
>Priority: Major
> Fix For: 0.13.0
>
>
> Hi, thanks for the great work on Arrow; it helps us a lot.
> However, we encountered a problem while using Plasma.
> The sample code:
> {code:python}
> import numpy as np
> import pyarrow as pa
> import pyarrow.plasma as plasma
> client = plasma.connect("/tmp/plasma", "", 0)
> puts = []
> nbytes = 0
> while True:
>     a = np.ones((1000, 1000))
>     try:
>         oid = client.put(a)
>         puts.append(client.get(oid))
>         nbytes += a.nbytes
>     except pa.lib.PlasmaStoreFull:
>         print('use nbytes', nbytes)
>         break
> {code}
> We start a plasma store with 1G memory, but the nbytes output above is only 
> 49600, which cannot even reach half of the memory we specified.
> I cannot figure out why plasma can only use such a small part of shared 
> memory. Could anybody help me? Thanks a lot.
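
For scale, here is a back-of-the-envelope sketch of how many arrays from the sample above would fit if the whole store were usable. It assumes float64 elements (the NumPy default for `np.ones`) and reads "1G" as 10**9 bytes; the constant names and that reading of "1G" are assumptions for illustration, not taken from the report.

```python
# Rough capacity estimate for the reproduction above (illustrative only).
ELEMENT_BYTES = 8            # float64, the default dtype of np.ones
ARRAY_SHAPE = (1000, 1000)   # shape used in the sample code
STORE_BYTES = 10**9          # assumed meaning of "1G"; 2**30 is another reading

# Bytes added to the running total by each successful put().
array_bytes = ARRAY_SHAPE[0] * ARRAY_SHAPE[1] * ELEMENT_BYTES
# Upper bound on how many such arrays fit, ignoring any store overhead.
max_arrays = STORE_BYTES // array_bytes

print(array_bytes)   # 8000000
print(max_arrays)    # 125
```

Any per-object overhead in the store would presumably lower the real ceiling, so this is an upper bound rather than a prediction. It also shows that the running total in the loop grows in steps of 8,000,000 bytes.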



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (ARROW-3295) [Packaging] Package gRPC libraries in conda-forge for use in builds, packaging

2018-11-19 Thread Wes McKinney (JIRA)



[ 
https://issues.apache.org/jira/browse/ARROW-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692193#comment-16692193
 ] 

Wes McKinney commented on ARROW-3295:
-

Can anyone help with this?

cc [~xhochy] [~kszucs] . I think all that is remaining is to set up 
grpc-feedstock (or maybe grpc-cpp-feedstock)?

Support for building this for Python, which has the same dependency stack, was 
added recently. The build scripts would obviously be a bit different, though.

https://github.com/conda-forge/grpcio-feedstock

> [Packaging] Package gRPC libraries in conda-forge for use in builds, packaging
> --
>
> Key: ARROW-3295
> URL: https://issues.apache.org/jira/browse/ARROW-3295
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++, Packaging
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 0.12.0
>
>
> This includes Linux, macOS, and Windows packages, along with gRPC's 
> dependencies (some of which, like BoringSSL, are not in conda-forge yet). 
> This may require patching gRPC's build system (or copying files manually) 
> since it wants to install all its dependencies when you {{make install}}




[jira] [Updated] (ARROW-2913) [Python] Exported buffers don't expose type information

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-2913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-2913:

Fix Version/s: (was: 0.12.0)
   0.13.0

> [Python] Exported buffers don't expose type information
> ---
>
> Key: ARROW-2913
> URL: https://issues.apache.org/jira/browse/ARROW-2913
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++, Python
>Affects Versions: 0.10.0
>Reporter: Antoine Pitrou
>Priority: Major
> Fix For: 0.13.0
>
>
> Using the {{buffers()}} method on array gives you a list of buffers backing 
> the array, but those buffers lose typing information:
> {code:python}
> >>> a = pa.array(range(10))
> >>> a.type
> DataType(int64)
> >>> buffers = a.buffers()
> >>> [(memoryview(buf).format, memoryview(buf).shape) for buf in buffers]
> [('b', (2,)), ('b', (80,))]
> {code}
> Conversely, Numpy exposes type information in the Python buffer protocol:
> {code:python}
> >>> a = pa.array(range(10))
> >>> memoryview(a.to_numpy()).format
> 'l'
> >>> memoryview(a.to_numpy()).shape
> (10,)
> {code}
> Exposing type information on buffers could be important for third-party 
> systems, such as Dask/distributed, for type-based data compression when 
> serializing.
> Since our C++ buffers are not typed, it's not obvious how to solve this. 
> Should we return tensors instead?
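
The difference between untyped and typed exports can be reproduced with the standard library alone. The sketch below uses `array.array` as a stand-in for a typed container (an assumption for illustration; it says nothing about how pyarrow or NumPy are implemented) and shows that `memoryview.cast` can reattach a format to raw bytes when the consumer already knows the element type.

```python
from array import array

# A typed container exports its element format through the buffer protocol.
typed = array('q', range(10))          # 'q' = signed 64-bit integers
typed_view = memoryview(typed)
print(typed_view.format, typed_view.shape)   # q (10,)

# The same data as raw bytes keeps the contents but loses the type.
raw = typed.tobytes()
raw_view = memoryview(raw)
print(raw_view.format, raw_view.shape)       # B (80,)

# A consumer that knows the element type out of band can cast the bytes back.
recast = raw_view.cast('q')
print(recast.format, recast.shape)           # q (10,)
print(recast.tolist() == list(range(10)))    # True
```

The cast only works because the element type is supplied by the caller, which is exactly the information a third-party serializer would not have when handed an untyped buffer.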




[jira] [Updated] (ARROW-2959) Dockerize verify-release-candidate.{sh,bat}

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-2959:

Fix Version/s: (was: 0.12.0)

> Dockerize verify-release-candidate.{sh,bat}
> ---
>
> Key: ARROW-2959
> URL: https://issues.apache.org/jira/browse/ARROW-2959
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++, Packaging
>Affects Versions: 0.9.0
>Reporter: Phillip Cloud
>Priority: Major
>
> There are a number of issues with the Linux version of this script that would 
> disappear if the commands were all run in a Docker container.
> Anyone with Docker installed should then be able to verify the release candidate.
> We could probably do the same for Windows as well.




[jira] [Commented] (ARROW-2910) [Packaging] Build from official apache archive

2018-11-19 Thread Wes McKinney (JIRA)



[ 
https://issues.apache.org/jira/browse/ARROW-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692187#comment-16692187
 ] 

Wes McKinney commented on ARROW-2910:
-

Seems we may not be ready for this in 0.12. Moving to 0.13.

> [Packaging] Build from official apache archive
> --
>
> Key: ARROW-2910
> URL: https://issues.apache.org/jira/browse/ARROW-2910
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Packaging
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.13.0
>
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>





[jira] [Updated] (ARROW-2887) [Plasma] Methods in plasma/store.h returning PlasmaError should return Status instead

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-2887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-2887:

Fix Version/s: (was: 0.12.0)
   0.13.0

> [Plasma] Methods in plasma/store.h returning PlasmaError should return Status 
> instead
> -
>
> Key: ARROW-2887
> URL: https://issues.apache.org/jira/browse/ARROW-2887
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Plasma (C++)
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 0.13.0
>
>
> These functions are not able to return other kinds of errors (e.g. 
> CUDA-related errors) as a result of this. I encountered this while working on 
> ARROW-2883




[jira] [Commented] (ARROW-2879) [Python] Arrow plasma can only use a small part of specified shared memory

2018-11-19 Thread Wes McKinney (JIRA)



[ 
https://issues.apache.org/jira/browse/ARROW-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692184#comment-16692184
 ] 

Wes McKinney commented on ARROW-2879:
-

Any update on this?

> [Python] Arrow plasma can only use a small part of specified shared memory
> --
>
> Key: ARROW-2879
> URL: https://issues.apache.org/jira/browse/ARROW-2879
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: chineking
>Priority: Major
> Fix For: 0.13.0
>
>
> Hi, thanks for the great work on Arrow; it helps us a lot.
> However, we encountered a problem while using Plasma.
> The sample code:
> {code:python}
> import numpy as np
> import pyarrow as pa
> import pyarrow.plasma as plasma
> client = plasma.connect("/tmp/plasma", "", 0)
> puts = []
> nbytes = 0
> while True:
>     a = np.ones((1000, 1000))
>     try:
>         oid = client.put(a)
>         puts.append(client.get(oid))
>         nbytes += a.nbytes
>     except pa.lib.PlasmaStoreFull:
>         print('use nbytes', nbytes)
>         break
> {code}
> We started a Plasma store with 1G of memory, but the nbytes value printed above is only 
> 49600, which is not even half of the memory we specified.
> I cannot figure out why Plasma can only use such a small part of the shared 
> memory. Could anybody help me? Thanks a lot.
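The expected order of magnitude can be checked with plain arithmetic. The sketch below assumes that `np.ones((1000, 1000))` holds 8-byte floats (NumPy's default dtype) and that "1G" means 10**9 bytes; the helper names are hypothetical. Because the loop adds `a.nbytes` once per successful put, the printed total should be a multiple of 8,000,000, and at most 125 such arrays fit before any store overhead is counted.

```python
import struct

# Size in bytes of one IEEE double (standard size, independent of platform).
ITEMSIZE = struct.calcsize("<d")


def array_nbytes(shape, itemsize=ITEMSIZE):
    """Bytes of raw data in a dense array with the given shape."""
    total = itemsize
    for dim in shape:
        total *= dim
    return total


def max_objects(store_bytes, object_bytes):
    """Upper bound on how many objects of object_bytes fit in store_bytes."""
    return store_bytes // object_bytes


per_array = array_nbytes((1000, 1000))
store_bytes = 10**9  # assumption: "1G" is read as 10**9 bytes

print(per_array)                                 # 8000000
print(max_objects(store_bytes, per_array))       # 125
print(max_objects(store_bytes // 2, per_array))  # 62
```

This bounds what the store could hold; it does not explain where the remaining capacity goes.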




[jira] [Updated] (ARROW-2910) [Packaging] Build from official apache archive

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-2910:

Fix Version/s: (was: 0.12.0)
   0.13.0

> [Packaging] Build from official apache archive
> --
>
> Key: ARROW-2910
> URL: https://issues.apache.org/jira/browse/ARROW-2910
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Packaging
>Reporter: Krisztian Szucs
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.13.0
>
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>





[jira] [Resolved] (ARROW-2831) [Plasma] MemoryError in teardown

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-2831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-2831.
-
Resolution: Cannot Reproduce

Haven't seen this error in a while; it seems to have been a transient issue in 
Travis CI.

> [Plasma] MemoryError in teardown
> 
>
> Key: ARROW-2831
> URL: https://issues.apache.org/jira/browse/ARROW-2831
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Plasma (C++)
>Reporter: Uwe L. Korn
>Priority: Major
> Fix For: 0.12.0
>
>
> There seems to be some flakiness in Plasma tests, e.g. see: 
> https://api.travis-ci.org/v3/job/402544643/log.txt
> {code}
>  ERRORS 
> 
> _ ERROR at teardown of TestPlasmaClient.test_subscribe 
> _
> self = 
> test_method =  >
>     def teardown_method(self, test_method):
>         try:
>             # Check that the Plasma store is still alive.
>             assert self.p.poll() is None
>             # Ensure Valgrind and/or coverage have a clean exit
>             self.p.send_signal(signal.SIGTERM)
>             if sys.version_info >= (3, 3):
>                 self.p.wait(timeout=5)
>             else:
>                 self.p.wait()
> >           assert self.p.returncode == 0
> E   assert 1 == 0
> E+  where 1 =  0x7f9a3dcd5850>.returncode
> E+where  = .p
> pyarrow/tests/test_plasma.py:132: AssertionError
>  Captured stderr setup 
> -
> ==20909== Memcheck, a memory error detector
> ==20909== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
> ==20909== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
> ==20909== Command: 
> /home/travis/build/apache/arrow/python/pyarrow/plasma_store -s 
> /tmp/test_plasma-Dzj8IQ/plasma.sock -m 1
> ==20909== 
> Allowing the Plasma store to use up to 0.1GB of memory.
> Connection to IPC socket failed for pathname 
> /tmp/test_plasma-Dzj8IQ/plasma.sock, retrying 50 more times
> Starting object store with directory /dev/shm and huge page support disabled
> Connection to IPC socket failed for pathname 
> /tmp/test_plasma-Dzj8IQ/plasma.sock, retrying 49 more times
> --- Captured stderr teardown 
> ---
> ==20909== Invalid free() / delete / delete[] / realloc()
> ==20909==at 0x4C2C83C: operator delete[](void*) (in 
> /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==20909==by 0x4A870A: std::default_delete []>::operator()(unsigned char*) const (unique_ptr.h:99)
> ==20909==by 0x4A0D6C: std::unique_ptr std::default_delete >::~unique_ptr() (unique_ptr.h:377)
> ==20909==by 0x4C3AE1: void std::_Destroy [], std::default_delete > >(std::unique_ptr [], std::default_delete >*) (stl_construct.h:93)
> ==20909==by 0x4C2DD4: void 
> std::_Destroy_aux::__destroy std::default_delete >*>(std::unique_ptr std::default_delete >*, std::unique_ptr std::default_delete >*) (stl_construct.h:103)
> ==20909==by 0x4C1999: void std::_Destroy [], std::default_delete >*>(std::unique_ptr [], std::default_delete >*, std::unique_ptr [], std::default_delete >*) (stl_construct.h:126)
> ==20909==by 0x4BF460: void std::_Destroy [], std::default_delete >*, std::unique_ptr [], std::default_delete > >(std::unique_ptr [], std::default_delete >*, std::unique_ptr [], std::default_delete >*, 
> std::allocator char []> > >&) (stl_construct.h:151)
> ==20909==by 0x4B9D41: std::deque std::default_delete >, 
> std::allocator char []> > > 
> >::_M_destroy_data_aux(std::_Deque_iterator std::default_delete >, std::unique_ptr std::default_delete >&, std::unique_ptr std::default_delete >*>, 
> std::_Deque_iterator std::default_delete >, std::unique_ptr std::default_delete >&, std::unique_ptr std::default_delete >*>) (deque.tcc:806)
> ==20909==by 0x4B17DA: std::deque std::default_delete >, 
> std::allocator char []> > > >::_M_destroy_data(std::_Deque_iterator char [], std::default_delete >, std::unique_ptr char [], std::default_delete >&, std::unique_ptr char [], std::default_delete >*>, 
> std::_Deque_iterator std::default_delete >, std::unique_ptr std::default_delete >&, std::unique_ptr std::default_delete >*>, 
> std::allocator char []> > > const&) (stl_deque.h:1853)
> ==20909==by 0x4C06F8: std::deque std::default_delete >, 
> std::allocator char []> > > >::~deque() (stl_deque.h:918)
> ==20909==by 0x4BBC83: plasma::NotificationQueue::~NotificationQueue() 
> (store.h:38)
> ==20909==by 0x4BBCC5: std::pair plasma::NotificationQueue>::~pair() (stl_pair.h:96)
> ==20909==  Address

[jira] [Commented] (ARROW-2818) [Python] Better error message when passing SparseDataFrame into Table.from_pandas

2018-11-19 Thread Wes McKinney (JIRA)



[ 
https://issues.apache.org/jira/browse/ARROW-2818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692181#comment-16692181
 ] 

Wes McKinney commented on ARROW-2818:
-

[~kszucs] can you have a look at this? Might be easiest just to reject any 
subclass of {{pandas.DataFrame}}

> [Python] Better error message when passing SparseDataFrame into 
> Table.from_pandas
> -
>
> Key: ARROW-2818
> URL: https://issues.apache.org/jira/browse/ARROW-2818
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 0.12.0
>
>
> This can be a rough edge for users. Note that pandas sparse support is being 
> considered for deprecation.
> Original issue: https://github.com/apache/arrow/issues/1894
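The suggestion in the comment above, rejecting any subclass of {{pandas.DataFrame}}, amounts to an exact type check rather than an `isinstance` check. The sketch below is only an illustration under stated assumptions: `DataFrame` and `SparseDataFrame` are hypothetical stand-in classes (not pandas), and `check_plain_dataframe` is a hypothetical helper, not part of pyarrow.

```python
class DataFrame:
    """Hypothetical stand-in for pandas.DataFrame."""


class SparseDataFrame(DataFrame):
    """Hypothetical stand-in for a DataFrame subclass."""


def check_plain_dataframe(obj):
    """Raise a descriptive TypeError unless obj is exactly a DataFrame."""
    # isinstance() would accept subclasses, so compare the exact type.
    if type(obj) is not DataFrame:
        raise TypeError(
            "Expected a plain DataFrame, got {}; convert it to a regular "
            "DataFrame first".format(type(obj).__name__)
        )
    return obj


check_plain_dataframe(DataFrame())  # accepted

try:
    check_plain_dataframe(SparseDataFrame())
except TypeError as exc:
    message = str(exc)
    print(message)
```

The error names the offending type, which addresses the "better error message" part of the issue title.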




[jira] [Updated] (ARROW-2796) [C++] Simplify symbols.map file, use when building libarrow_python

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-2796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-2796:

Fix Version/s: (was: 0.12.0)
   0.13.0

> [C++] Simplify symbols.map file, use when building libarrow_python
> --
>
> Key: ARROW-2796
> URL: https://issues.apache.org/jira/browse/ARROW-2796
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 0.13.0
>
>
> I did a little work on this in https://github.com/apache/arrow/pull/2096. 
> While that patch was not merged, the changes related to symbol visibility 
> ought to be plucked into a new patch




[jira] [Updated] (ARROW-2720) [C++] Clean up cmake CXX_STANDARD and PIC flag setting

2018-11-19 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-2720:
--
Labels: pull-request-available  (was: )

> [C++] Clean up cmake CXX_STANDARD and PIC flag setting
> --
>
> Key: ARROW-2720
> URL: https://issues.apache.org/jira/browse/ARROW-2720
> Project: Apache Arrow
>  Issue Type: Task
>  Components: C++
>Reporter: Phillip Cloud
>Assignee: Wes McKinney
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>
> We're using {{-std=c++11}} in a few non-external project places as well as 
> setting {{-fPIC}}. CMake provides the {{CMAKE_CXX_STANDARD}} flag (which we 
> are also using) and the {{CMAKE_POSITION_INDEPENDENT_CODE}} flag for setting 
> these options in a cross platform way (where it matters).
> We should use these flags instead of using platform conditional checks to set 
> their values explicitly.




[jira] [Updated] (ARROW-2523) [Rust] Implement CAST operations for arrays

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-2523:

Fix Version/s: (was: 0.12.0)
   0.13.0

> [Rust] Implement CAST operations for arrays
> ---
>
> Key: ARROW-2523
> URL: https://issues.apache.org/jira/browse/ARROW-2523
> Project: Apache Arrow
>  Issue Type: New Feature
>Reporter: Andy Grove
>Assignee: Andy Grove
>Priority: Minor
> Fix For: 0.13.0
>
>
> I have implemented CAST operations in DataFusion but I would like to 
> re-implement this now directly in Arrow. I will create a PR after the Rust 
> refactor is complete.




[jira] [Updated] (ARROW-2620) [Rust] Integrate memory pool abstraction with rest of codebase

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-2620:

Fix Version/s: (was: 0.12.0)
   0.13.0

> [Rust] Integrate memory pool abstraction with rest of codebase
> --
>
> Key: ARROW-2620
> URL: https://issues.apache.org/jira/browse/ARROW-2620
> Project: Apache Arrow
>  Issue Type: Improvement
>Reporter: Andy Grove
>Priority: Major
> Fix For: 0.13.0
>
>
> A memory pool abstraction was contributed but is not actually used by the 
> rest of the code base.
> We should either remove it or integrate it.
> If we integrate it, it should be done in a similar way to the C++ API.




[jira] [Commented] (ARROW-2487) [C++] Provide a variant of AppendValues that takes bytemaps for the nullability

2018-11-19 Thread Wes McKinney (JIRA)



[ 
https://issues.apache.org/jira/browse/ARROW-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692162#comment-16692162
 ] 

Wes McKinney commented on ARROW-2487:
-

Is this still an issue? We have AppendValues methods that accept a byte vector 
argument for the validity markers.

> [C++] Provide a variant of AppendValues that takes bytemaps for the 
> nullability
> ---
>
> Key: ARROW-2487
> URL: https://issues.apache.org/jira/browse/ARROW-2487
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Uwe L. Korn
>Priority: Major
>  Labels: beginner
> Fix For: 0.13.0
>
>
> Instead of only accepting bitmaps, we should provide users with a variant of 
> {{AppendValues}} that can work on bytemaps.




[jira] [Assigned] (ARROW-2720) [C++] Clean up cmake CXX_STANDARD and PIC flag setting

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned ARROW-2720:
---

Assignee: Wes McKinney  (was: Phillip Cloud)

> [C++] Clean up cmake CXX_STANDARD and PIC flag setting
> --
>
> Key: ARROW-2720
> URL: https://issues.apache.org/jira/browse/ARROW-2720
> Project: Apache Arrow
>  Issue Type: Task
>  Components: C++
>Reporter: Phillip Cloud
>Assignee: Wes McKinney
>Priority: Major
> Fix For: 0.12.0
>
>
> We're using {{-std=c++11}} in a few non-external project places as well as 
> setting {{-fPIC}}. CMake provides the {{CMAKE_CXX_STANDARD}} flag (which we 
> are also using) and the {{CMAKE_POSITION_INDEPENDENT_CODE}} flag for setting 
> these options in a cross platform way (where it matters).
> We should use these flags instead of using platform conditional checks to set 
> their values explicitly.




[jira] [Updated] (ARROW-2487) [C++] Provide a variant of AppendValues that takes bytemaps for the nullability

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-2487:

Fix Version/s: (was: 0.12.0)
   0.13.0

> [C++] Provide a variant of AppendValues that takes bytemaps for the 
> nullability
> ---
>
> Key: ARROW-2487
> URL: https://issues.apache.org/jira/browse/ARROW-2487
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Uwe L. Korn
>Priority: Major
>  Labels: beginner
> Fix For: 0.13.0
>
>
> Instead of only accepting bitmaps, we should provide users with a variant of 
> {{AppendValues}} that can work on bytemaps.




[jira] [Updated] (ARROW-2600) [Python] Add additional LocalFileSystem filesystem methods

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-2600:

Fix Version/s: (was: 0.12.0)
   0.13.0

> [Python] Add additional LocalFileSystem filesystem methods
> --
>
> Key: ARROW-2600
> URL: https://issues.apache.org/jira/browse/ARROW-2600
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Alex Hagerman
>Priority: Minor
>  Labels: filesystem, pull-request-available
> Fix For: 0.13.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Related to https://issues.apache.org/jira/browse/ARROW-1319 I noticed the 
> methods Martin listed are also not part of the LocalFileSystem class.




[jira] [Updated] (ARROW-2560) [Rust] The Rust README should include Rust-specific information on contributing

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-2560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-2560:

Fix Version/s: (was: 0.12.0)
   0.13.0

> [Rust] The Rust README should include Rust-specific information on 
> contributing
> ---
>
> Key: ARROW-2560
> URL: https://issues.apache.org/jira/browse/ARROW-2560
> Project: Apache Arrow
>  Issue Type: Task
>  Components: Rust
>Reporter: Andy Grove
>Priority: Trivial
>  Labels: beginner
> Fix For: 0.13.0
>
>
> Every new contributor has their first build fail because they didn't know to 
> use cargo fmt.
> We should explain this in the Rust README along with any other pertinent 
> information specific to Rust contributions.




[jira] [Updated] (ARROW-2618) [Rust] Bitmap constructor should accept a flag for default state (0 or 1)

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-2618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-2618:

Fix Version/s: (was: 0.12.0)
   0.13.0

> [Rust] Bitmap constructor should accept a flag for default state (0 or 1)
> ---
>
> Key: ARROW-2618
> URL: https://issues.apache.org/jira/browse/ARROW-2618
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Rust
>Reporter: Andy Grove
>Assignee: Andy Grove
>Priority: Trivial
> Fix For: 0.13.0
>
>





[jira] [Commented] (ARROW-2374) [Rust] Add support for array of List

2018-11-19 Thread Wes McKinney (JIRA)



[ 
https://issues.apache.org/jira/browse/ARROW-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692160#comment-16692160
 ] 

Wes McKinney commented on ARROW-2374:
-

[~andygrove] can this be closed?

> [Rust] Add support for array of List
> ---
>
> Key: ARROW-2374
> URL: https://issues.apache.org/jira/browse/ARROW-2374
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Rust
>Reporter: Andy Grove
>Assignee: Andy Grove
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Add support for List in Array types. Look at Utf8 which wraps List to 
> see how this works.




[jira] [Updated] (ARROW-2460) [Rust] Schema and DataType::Struct should use Vec>

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-2460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-2460:

Component/s: Rust

> [Rust] Schema and DataType::Struct should use Vec>
> 
>
> Key: ARROW-2460
> URL: https://issues.apache.org/jira/browse/ARROW-2460
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Rust
>Reporter: Andy Grove
>Priority: Minor
> Fix For: 0.13.0
>
>
> Currently we use Vec instead of Vec> which is resulting in 
> having to clone fields in some use cases, which could be expensive for 
> structs.




[jira] [Updated] (ARROW-2460) [Rust] Schema and DataType::Struct should use Vec>

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-2460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-2460:

Fix Version/s: (was: 0.12.0)
   0.13.0

> [Rust] Schema and DataType::Struct should use Vec>
> 
>
> Key: ARROW-2460
> URL: https://issues.apache.org/jira/browse/ARROW-2460
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Rust
>Reporter: Andy Grove
>Priority: Minor
> Fix For: 0.13.0
>
>
> Currently we use Vec instead of Vec> which is resulting in 
> having to clone fields in some use cases, which could be expensive for 
> structs.




[jira] [Updated] (ARROW-2366) [Python] Support reading Parquet files having a permutation of column order

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-2366:

Fix Version/s: (was: 0.12.0)
   0.13.0

> [Python] Support reading Parquet files having a permutation of column order
> ---
>
> Key: ARROW-2366
> URL: https://issues.apache.org/jira/browse/ARROW-2366
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Wes McKinney
>Assignee: Uwe L. Korn
>Priority: Major
>  Labels: parquet
> Fix For: 0.13.0
>
>
> See discussion in https://github.com/dask/fastparquet/issues/320




[jira] [Updated] (ARROW-2399) [Rust] Builder should not provide a set() method

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-2399:

Fix Version/s: (was: 0.12.0)
   0.13.0

> [Rust] Builder should not provide a set() method
> ---
>
> Key: ARROW-2399
> URL: https://issues.apache.org/jira/browse/ARROW-2399
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Rust
>Reporter: Andy Grove
>Priority: Major
> Fix For: 0.13.0
>
>
> Arrays should be immutable, but we have a `set` method on Buffer that 
> should not be there.
> This is only used from the Bitmap struct. Perhaps Bitmap should maintain its 
> own memory instead and not use Buffer?




[jira] [Updated] (ARROW-1989) [Python] Better UX on timestamp conversion to Pandas

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-1989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-1989:

Fix Version/s: (was: 0.12.0)
   0.13.0

> [Python] Better UX on timestamp conversion to Pandas
> 
>
> Key: ARROW-1989
> URL: https://issues.apache.org/jira/browse/ARROW-1989
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Uwe L. Korn
>Priority: Major
> Fix For: 0.13.0
>
>
> When converting timestamp columns to Pandas, users often have dates outside 
> the range that Pandas can represent with its nanosecond resolution. Currently 
> they simply see an Arrow exception and think that the problem is caused by 
> Arrow. We should try to change the error 
> from
> {code}
> ArrowInvalid: Casting from timestamp[ns] to timestamp[us] would lose data: XX
> {code}
> to something along the lines of 
> {code}
> ArrowInvalid: Casting from timestamp[ns] to timestamp[us] would lose data: 
> XX. This conversion is needed as Pandas only supports nanosecond 
> timestamps. Your data is likely out of the range that can be represented with 
> nanosecond resolution.
> {code}




[jira] [Created] (ARROW-3840) [C++] Run fuzzer tests with docker-compose

2018-11-19 Thread Wes McKinney (JIRA)

Wes McKinney created ARROW-3840:
---

 Summary: [C++] Run fuzzer tests with docker-compose
 Key: ARROW-3840
 URL: https://issues.apache.org/jira/browse/ARROW-3840
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Reporter: Wes McKinney
 Fix For: 0.13.0


These are not being run regularly right now




[jira] [Updated] (ARROW-2365) [Plasma] Return status codes instead of crashing

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-2365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-2365:

Fix Version/s: (was: 0.12.0)
   0.13.0

> [Plasma] Return status codes instead of crashing
> 
>
> Key: ARROW-2365
> URL: https://issues.apache.org/jira/browse/ARROW-2365
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Plasma (C++)
>Reporter: Antoine Pitrou
>Priority: Major
> Fix For: 0.13.0
>
>
> When certain {{PlasmaClient}} methods are called with bad arguments, 
> PlasmaClient crashes instead of returning an error Status. For example, try 
> calling {{Seal()}} with a non-existent object id.
> This is hostile towards users of high-level languages such as Python.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (ARROW-2038) [Python] Follow-up bug fixes for s3fs Parquet support

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-2038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-2038:

Fix Version/s: (was: 0.12.0)
   0.13.0

> [Python] Follow-up bug fixes for s3fs Parquet support
> -
>
> Key: ARROW-2038
> URL: https://issues.apache.org/jira/browse/ARROW-2038
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Wes McKinney
>Priority: Major
>  Labels: aws, parquet
> Fix For: 0.13.0
>
>
> see discussion in 
> https://github.com/apache/arrow/pull/916#issuecomment-360558248



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (ARROW-2237) [Python] [Plasma] Huge pages test failure

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-2237:

Fix Version/s: (was: 0.12.0)
   0.13.0

> [Python] [Plasma] Huge pages test failure
> -
>
> Key: ARROW-2237
> URL: https://issues.apache.org/jira/browse/ARROW-2237
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Antoine Pitrou
>Priority: Major
> Fix For: 0.13.0
>
>
> This is a new failure here (Ubuntu 16.04, x86-64):
> {code}
> _ test_use_huge_pages 
> _
> Traceback (most recent call last):
>   File "/home/antoine/arrow/python/pyarrow/tests/test_plasma.py", line 779, 
> in test_use_huge_pages
> create_object(plasma_client, 1)
>   File "/home/antoine/arrow/python/pyarrow/tests/test_plasma.py", line 80, in 
> create_object
> seal=seal)
>   File "/home/antoine/arrow/python/pyarrow/tests/test_plasma.py", line 69, in 
> create_object_with_id
> memory_buffer = client.create(object_id, data_size, metadata)
>   File "plasma.pyx", line 302, in pyarrow.plasma.PlasmaClient.create
>   File "error.pxi", line 79, in pyarrow.lib.check_status
> pyarrow.lib.ArrowIOError: /home/antoine/arrow/cpp/src/plasma/client.cc:192 
> code: PlasmaReceive(store_conn_, MessageType_PlasmaCreateReply, )
> /home/antoine/arrow/cpp/src/plasma/protocol.cc:46 code: ReadMessage(sock, 
> , buffer)
> Encountered unexpected EOF
>  Captured stderr call 
> -
> Allowing the Plasma store to use up to 0.1GB of memory.
> Starting object store with directory /mnt/hugepages and huge page support 
> enabled
> create_buffer failed to open file /mnt/hugepages/plasmapSNc0X
> {code}




[jira] [Updated] (ARROW-1796) [Python] RowGroup filtering on file level

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-1796:

Fix Version/s: (was: 0.12.0)
   0.13.0

> [Python] RowGroup filtering on file level
> -
>
> Key: ARROW-1796
> URL: https://issues.apache.org/jira/browse/ARROW-1796
> Project: Apache Arrow
>  Issue Type: Improvement
>Reporter: Uwe L. Korn
>Assignee: Uwe L. Korn
>Priority: Major
>  Labels: parquet, pull-request-available
> Fix For: 0.13.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> We can build upon the API defined in {{fastparquet}} for defining RowGroup 
> filters: 
> https://github.com/dask/fastparquet/blob/master/fastparquet/api.py#L296-L300 
> and translate them into the C++ enums we will define in 
> https://issues.apache.org/jira/browse/PARQUET-1158 . This should enable us to 
> provide the user with a simple predicate pushdown API that we can extend in 
> the background from RowGroup to Page level later on.
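The idea of RowGroup-level predicate pushdown can be sketched without any Parquet library. The code below is an assumption-laden illustration, not the fastparquet or pyarrow API: filters are assumed to be `(column, op, value)` tuples that must all hold, and each row group is represented by a hypothetical dict mapping column names to `(min, max)` statistics.

```python
def may_match(stats, column, op, value):
    """Return False only if min/max statistics prove that no row can match."""
    if column not in stats:
        return True  # no statistics: the row group must be read
    lo, hi = stats[column]
    if op == "==":
        return lo <= value <= hi
    if op == ">":
        return hi > value
    if op == ">=":
        return hi >= value
    if op == "<":
        return lo < value
    if op == "<=":
        return lo <= value
    return True  # unknown operator: keep the row group to stay correct


def select_row_groups(row_groups, filters):
    """Indices of row groups that cannot be skipped for the given filters."""
    return [
        i
        for i, stats in enumerate(row_groups)
        if all(may_match(stats, col, op, val) for col, op, val in filters)
    ]


row_groups = [{"x": (0, 9)}, {"x": (10, 19)}, {"x": (20, 29)}]
print(select_row_groups(row_groups, [("x", ">", 12)]))                  # [1, 2]
print(select_row_groups(row_groups, [("x", ">=", 10), ("x", "<", 20)]))  # [1]
```

Statistics can only rule row groups out; rows inside a selected group still need to be filtered, which is where a later Page-level extension would help.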




[jira] [Updated] (ARROW-1896) [C++] Do not allocate memory for primitive outputs in CastKernel::Call implementation

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-1896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-1896:

Fix Version/s: (was: 0.12.0)
   0.13.0

> [C++] Do not allocate memory for primitive outputs in CastKernel::Call 
> implementation
> -
>
> Key: ARROW-1896
> URL: https://issues.apache.org/jira/browse/ARROW-1896
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Wes McKinney
>Priority: Major
> Fix For: 0.13.0
>
>
> This is some refactoring / tidying. Unless an output of cast has a 
> non-determinate size (e.g. is Binary or something else), the 
> {{CastKernel::Call}} implementation should assume that it is writing into 
> pre-allocated memory. The corresponding memory allocation can be lifted into 
> the {{arrow::compute::Cast}} API




[jira] [Updated] (ARROW-1425) [Python] Document semantic differences between Spark timestamps and Arrow timestamps

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-1425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-1425:

Fix Version/s: (was: 0.12.0)
   0.13.0

> [Python] Document semantic differences between Spark timestamps and Arrow 
> timestamps
> 
>
> Key: ARROW-1425
> URL: https://issues.apache.org/jira/browse/ARROW-1425
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Wes McKinney
>Assignee: Li Jin
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.13.0
>
>
> The way that Spark treats non-timezone-aware timestamps as session local can 
> be problematic when using pyarrow which may view the data coming from 
> toPandas() as time zone naive (but with fields as though it were UTC, not 
> session local). We should document carefully how to properly handle the data 
> coming from Spark to avoid problems.
> cc [~bryanc] [~holdenkarau]
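The ambiguity described above can be shown with the standard library: the same naive wall-clock fields denote different instants depending on whether they are read as session-local or as UTC. This is a minimal sketch, not Spark or pyarrow code; the fixed `-08:00` offset is a hypothetical session time zone chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical session time zone (fixed offset, for illustration only).
SESSION_TZ = timezone(timedelta(hours=-8))


def as_session_local(naive):
    """Interpret naive wall-clock fields in the session time zone."""
    return naive.replace(tzinfo=SESSION_TZ)


def as_utc(naive):
    """Interpret the same naive wall-clock fields as UTC."""
    return naive.replace(tzinfo=timezone.utc)


naive = datetime(2018, 11, 19, 12, 0, 0)
skew = as_session_local(naive) - as_utc(naive)
print(skew)  # 8:00:00
```

A consumer that silently assumes UTC would therefore be off by the session offset, which is why the handling needs to be documented.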




[jira] [Updated] (ARROW-976) [Python] Provide API for defining and reading Parquet datasets with more ad hoc partition schemes

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-976:
---
Fix Version/s: (was: 0.12.0)
   0.13.0

> [Python] Provide API for defining and reading Parquet datasets with more ad 
> hoc partition schemes
> -
>
> Key: ARROW-976
> URL: https://issues.apache.org/jira/browse/ARROW-976
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Python
>Reporter: Wes McKinney
>Priority: Major
>  Labels: parquet
> Fix For: 0.13.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (ARROW-1266) [Plasma] Move heap allocations to arrow memory pool

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-1266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-1266:

Fix Version/s: (was: 0.12.0)
   0.13.0

> [Plasma] Move heap allocations to arrow memory pool
> ---
>
> Key: ARROW-1266
> URL: https://issues.apache.org/jira/browse/ARROW-1266
> Project: Apache Arrow
>  Issue Type: Bug
>Reporter: Philipp Moritz
>Priority: Major
> Fix For: 0.13.0
>
>
> At the moment we are allocating memory with std::vector and even new in some 
> places; this should be cleaned up.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (ARROW-3836) [C++] Add PREFIX option to ADD_ARROW_BENCHMARK

2018-11-19 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-3836:
--
Labels: pull-request-available  (was: )

> [C++] Add PREFIX option to ADD_ARROW_BENCHMARK
> --
>
> Key: ARROW-3836
> URL: https://issues.apache.org/jira/browse/ARROW-3836
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>
> See option in ADD_ARROW_TEST



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Assigned] (ARROW-3836) [C++] Add PREFIX option to ADD_ARROW_BENCHMARK

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned ARROW-3836:
---

Assignee: Wes McKinney

> [C++] Add PREFIX option to ADD_ARROW_BENCHMARK
> --
>
> Key: ARROW-3836
> URL: https://issues.apache.org/jira/browse/ARROW-3836
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: C++
>Reporter: Wes McKinney
>Assignee: Wes McKinney
>Priority: Major
> Fix For: 0.12.0
>
>
> See option in ADD_ARROW_TEST



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (ARROW-3831) [C++] arrow::util::Codec::Decompress() doesn't return decompressed data size

2018-11-19 Thread Wes McKinney (JIRA)



[ 
https://issues.apache.org/jira/browse/ARROW-3831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692089#comment-16692089
 ] 

Wes McKinney commented on ARROW-3831:
-

What do you think about adding a second {{Decompress}} virtual method that will 
return the output length?

https://github.com/apache/arrow/blob/master/cpp/src/arrow/util/compression.h#L106

The default implementation could be NotImplemented for codecs that do not 
support this
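
As a rough sketch of that shape (in Python with zlib, not Arrow's C++ classes): the class and method names below are hypothetical, and the base-class default raises for codecs that cannot report a length.

```python
import zlib


class Codec:
    """Hypothetical base class; not Arrow's actual interface."""

    def decompress_with_length(self, data: bytes, out: bytearray) -> int:
        # Default for codecs that cannot report the output length.
        raise NotImplementedError("codec does not report decompressed size")


class ZlibCodec(Codec):
    def decompress_with_length(self, data: bytes, out: bytearray) -> int:
        raw = zlib.decompress(data)
        if len(raw) > len(out):
            raise ValueError("output buffer too small")
        out[: len(raw)] = raw  # same-length slice assignment keeps len(out)
        return len(raw)  # number of valid bytes at the start of `out`


payload = zlib.compress(b"arrow" * 4)
buf = bytearray(64)  # caller-provided buffer, larger than needed
n = ZlibCodec().decompress_with_length(payload, buf)
valid = bytes(buf[:n])  # only the first n bytes are meaningful
```

With the returned length, the caller knows which part of the output buffer holds real data.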

> [C++] arrow::util::Codec::Decompress() doesn't return decompressed data size
> 
>
> Key: ARROW-3831
> URL: https://issues.apache.org/jira/browse/ARROW-3831
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.11.1
>Reporter: Kouhei Sutou
>Priority: Major
>
> We can't know decompressed data size when we only have compressed data. The 
> current {{arrow::util::Codec::Decompress()}} doesn't return decompressed data 
> size. So we can't know which data in {{output_buffer}} can be used.
> FYI: {{arrow::util::Codec::Compress()}} returns compressed data size.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (ARROW-3799) [Gandiva] Improve `make_in_expression`

2018-11-19 Thread Pindikura Ravindra (JIRA)



[ 
https://issues.apache.org/jira/browse/ARROW-3799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16691956#comment-16691956
 ] 

Pindikura Ravindra commented on ARROW-3799:
---

Can I change the summary to: in-expression needs to add support for date/time 
datatypes?

 

> [Gandiva] Improve `make_in_expression`
> --
>
> Key: ARROW-3799
> URL: https://issues.apache.org/jira/browse/ARROW-3799
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Gandiva
>Reporter: Siyuan Zhuang
>Priority: Major
>
> The `make_in_expression` in gandiva was not implemented correctly. Although 
> [ARROW-3751|https://issues.apache.org/jira/projects/ARROW/issues/ARROW-3751] 
> has fixed part of it, further improvement is still necessary. See 
> `test_in_expr_todo` in 
> [python/pyarrow/tests/test_gandiva.py|https://github.com/apache/arrow/pull/2936/files#diff-9ab0e0dc1f329321ff4555b043ee0f41]
>  for details.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (ARROW-3838) [Rust] Implement CSV Writer

2018-11-19 Thread Andy Grove (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Grove updated ARROW-3838:
--
Summary: [Rust] Implement CSV Writer  (was: Implement CSV Writer)

> [Rust] Implement CSV Writer
> ---
>
> Key: ARROW-3838
> URL: https://issues.apache.org/jira/browse/ARROW-3838
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Rust
>Affects Versions: 0.11.1
>Reporter: Andy Grove
>Assignee: Andy Grove
>Priority: Minor
>
> A CSV reader is being implemented in ARROW-3726 and this ticket is to add the 
> corresponding writer.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Created] (ARROW-3839) [Rust] Add ability to infer schema in CSV reader

2018-11-19 Thread Andy Grove (JIRA)

Andy Grove created ARROW-3839:
-

 Summary: [Rust] Add ability to infer schema in CSV reader
 Key: ARROW-3839
 URL: https://issues.apache.org/jira/browse/ARROW-3839
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust
Affects Versions: 0.11.1
Reporter: Andy Grove


A CSV reader is being added in ARROW-3726 and it currently requires an explicit 
schema to be provided.

It would be nice to have an option where the schema can be inferred 
automatically.

The user should be able to specify some defaults, such as date/time formats.
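
One possible shape for such an option, sketched in Python rather than Rust: `infer_schema`, `infer_type`, the `datetime_formats` parameter and the type labels are all hypothetical, chosen only to show how a user-supplied default could feed the inference.

```python
import csv
import io
from datetime import datetime


def infer_type(values, datetime_formats=("%Y-%m-%d",)):
    """Return the first type label whose parser accepts every sampled value."""

    def all_parse(parse):
        try:
            for v in values:
                parse(v)
            return True
        except ValueError:
            return False

    if all_parse(int):
        return "int64"
    if all_parse(float):
        return "float64"
    for fmt in datetime_formats:
        if all_parse(lambda v, fmt=fmt: datetime.strptime(v, fmt)):
            return "timestamp"
    return "utf8"  # fallback when nothing narrower fits


def infer_schema(text, sample_rows=100, datetime_formats=("%Y-%m-%d",)):
    """Infer (name, type) pairs from the header and a sample of rows."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    sample = [row for _, row in zip(range(sample_rows), reader)]
    columns = zip(*sample)  # transpose sampled rows into columns
    return [(name, infer_type(col, datetime_formats))
            for name, col in zip(header, columns)]
```

A caller whose files use another date layout would pass, for example, `datetime_formats=("%d/%m/%Y",)` instead of relying on the default.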



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Assigned] (ARROW-1993) [Python] Add function for determining implied Arrow schema from pandas.DataFrame

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-1993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney reassigned ARROW-1993:
---

Assignee: Krisztian Szucs  (was: Uwe L. Korn)

> [Python] Add function for determining implied Arrow schema from 
> pandas.DataFrame
> 
>
> Key: ARROW-1993
> URL: https://issues.apache.org/jira/browse/ARROW-1993
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Wes McKinney
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: beginner, pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently the only option is to use {{Table/Array.from_pandas}} which does 
> significant unnecessary work and allocates memory. If only the schema is of 
> interest, then we could do less work and not allocate memory.
> We should provide the user a function {{pyarrow.Schema.from_pandas}} which 
> takes a DataFrame as an input and returns the respective Arrow schema. The 
> functionality for determining the schema is already available in the Python 
> code; at the moment it is just very tightly bound to the conversion 
> infrastructure.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Created] (ARROW-3838) Implement CSV Writer

2018-11-19 Thread Andy Grove (JIRA)

Andy Grove created ARROW-3838:
-

 Summary: Implement CSV Writer
 Key: ARROW-3838
 URL: https://issues.apache.org/jira/browse/ARROW-3838
 Project: Apache Arrow
  Issue Type: Improvement
  Components: Rust
Affects Versions: 0.11.1
Reporter: Andy Grove
Assignee: Andy Grove


A CSV reader is being implemented in ARROW-3726 and this ticket is to add the 
corresponding writer.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (ARROW-1993) [Python] Add function for determining implied Arrow schema from pandas.DataFrame

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-1993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney updated ARROW-1993:

Fix Version/s: (was: 0.13.0)
   0.12.0

> [Python] Add function for determining implied Arrow schema from 
> pandas.DataFrame
> 
>
> Key: ARROW-1993
> URL: https://issues.apache.org/jira/browse/ARROW-1993
> Project: Apache Arrow
>  Issue Type: Improvement
>  Components: Python
>Reporter: Wes McKinney
>Assignee: Uwe L. Korn
>Priority: Major
>  Labels: beginner, pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently the only option is to use {{Table/Array.from_pandas}} which does 
> significant unnecessary work and allocates memory. If only the schema is of 
> interest, then we could do less work and not allocate memory.
> We should provide the user a function {{pyarrow.Schema.from_pandas}} which 
> takes a DataFrame as an input and returns the respective Arrow schema. The 
> functionality for determining the schema is already available in the Python 
> code; at the moment it is just very tightly bound to the conversion 
> infrastructure.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Resolved] (ARROW-3835) [C++] arrow::io::CompressedOutputStream::raw() implementation is missing

2018-11-19 Thread Wes McKinney (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wes McKinney resolved ARROW-3835.
-
   Resolution: Fixed
Fix Version/s: 0.12.0

Issue resolved by pull request 2990
[https://github.com/apache/arrow/pull/2990]

> [C++] arrow::io::CompressedOutputStream::raw() implementation is missing
> ---
>
> Key: ARROW-3835
> URL: https://issues.apache.org/jira/browse/ARROW-3835
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: C++
>Affects Versions: 0.11.1
>Reporter: Kouhei Sutou
>Assignee: Kouhei Sutou
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Created] (ARROW-3837) [C++] gflags link errors on Windows

2018-11-19 Thread Wes McKinney (JIRA)

Wes McKinney created ARROW-3837:
---

 Summary: [C++] gflags link errors on Windows
 Key: ARROW-3837
 URL: https://issues.apache.org/jira/browse/ARROW-3837
 Project: Apache Arrow
  Issue Type: Bug
  Components: C++
Reporter: Wes McKinney
 Fix For: 0.12.0


These errors have been occurring in the last few days

https://ci.appveyor.com/project/ApacheSoftwareFoundation/arrow/builds/20402981/job/cygaqwbjulgaxcn8



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Created] (ARROW-3836) [C++] Add PREFIX option to ADD_ARROW_BENCHMARK

2018-11-19 Thread Wes McKinney (JIRA)

Wes McKinney created ARROW-3836:
---

 Summary: [C++] Add PREFIX option to ADD_ARROW_BENCHMARK
 Key: ARROW-3836
 URL: https://issues.apache.org/jira/browse/ARROW-3836
 Project: Apache Arrow
  Issue Type: Improvement
  Components: C++
Reporter: Wes McKinney
 Fix For: 0.12.0


See option in ADD_ARROW_TEST



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (ARROW-3726) [Rust] CSV Reader & Writer

2018-11-19 Thread Andy Grove (JIRA)



[ 
https://issues.apache.org/jira/browse/ARROW-3726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16691901#comment-16691901
 ] 

Andy Grove commented on ARROW-3726:
---

I have not implemented schema inference yet. I think we should create a 
separate ticket for this feature where we can figure out the requirements, like 
the timestamp issue you mentioned.

> [Rust] CSV Reader & Writer
> --
>
> Key: ARROW-3726
> URL: https://issues.apache.org/jira/browse/ARROW-3726
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Rust
>Reporter: nevi_me
>Assignee: Andy Grove
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As an Arrow Rust user, I would like to be able to read and write CSV files, 
> so that I can quickly ingest data into an Arrow format for further use, and 
> save outputs in CSV.
> As there aren't yet many options for working with tabular/df structures in 
> Rust (other than Andy's DataFusion), I'm struggling to motivate for this 
> feature. However, I think building a csv parser into Rust would reduce effort 
> for future libs (incl DataFusion).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (ARROW-3726) [Rust] CSV Reader & Writer

2018-11-19 Thread nevi_me (JIRA)



[ 
https://issues.apache.org/jira/browse/ARROW-3726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16691880#comment-16691880
 ] 

nevi_me commented on ARROW-3726:


Thanks [~andygrove], does the reader support inferring data schema? I got 
schema inference working through sampling a csv and using the regex crate to 
match fields by some hierarchy. 

It worked relatively well on primitive types, but I struggled with timestamps 
because I was trying to read from csv to parquet. For timestamps we could allow 
the user to specify the format (-mm-dd-... etc), or default to an ISO 
format. What do you think?
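
The regex-hierarchy approach mentioned above might look roughly like this. The sketch is Python rather than Rust, and the patterns, their order and the type labels are illustrative assumptions, not the code referred to in the comment; timestamps default to an ISO-style layout.

```python
import re

# Ordered from most to least specific: the first pattern that matches
# every sampled field wins. Patterns and labels are illustrative only.
HIERARCHY = [
    ("boolean", re.compile(r"^(true|false)$", re.IGNORECASE)),
    ("int64", re.compile(r"^-?\d+$")),
    ("float64", re.compile(r"^-?\d+(\.\d+)?$")),
    ("timestamp", re.compile(r"^\d{4}-\d{2}-\d{2}([T ]\d{2}:\d{2}:\d{2})?$")),
]


def infer_field_type(samples):
    """Return the first type in the hierarchy that fits all sampled values."""
    if not samples:
        return "utf8"  # nothing to infer from
    for name, pattern in HIERARCHY:
        if all(pattern.match(s) for s in samples):
            return name
    return "utf8"  # fallback: keep the column as strings
```

Because the float pattern also accepts plain integers, a column mixing `1` and `2.5` falls through the integer rule and is classified as a float.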

> [Rust] CSV Reader & Writer
> --
>
> Key: ARROW-3726
> URL: https://issues.apache.org/jira/browse/ARROW-3726
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Rust
>Reporter: nevi_me
>Assignee: Andy Grove
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> As an Arrow Rust user, I would like to be able to read and write CSV files, 
> so that I can quickly ingest data into an Arrow format for further use, and 
> save outputs in CSV.
> As there aren't yet many options for working with tabular/df structures in 
> Rust (other than Andy's DataFusion), I'm struggling to motivate for this 
> feature. However, I think building a csv parser into Rust would reduce effort 
> for future libs (incl DataFusion).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (ARROW-3726) [Rust] CSV Reader & Writer

2018-11-19 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-3726:
--
Labels: pull-request-available  (was: )

> [Rust] CSV Reader & Writer
> --
>
> Key: ARROW-3726
> URL: https://issues.apache.org/jira/browse/ARROW-3726
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Rust
>Reporter: nevi_me
>Assignee: Andy Grove
>Priority: Major
>  Labels: pull-request-available
>
> As an Arrow Rust user, I would like to be able to read and write CSV files, 
> so that I can quickly ingest data into an Arrow format for further use, and 
> save outputs in CSV.
> As there aren't yet many options for working with tabular/df structures in 
> Rust (other than Andy's DataFusion), I'm struggling to motivate for this 
> feature. However, I think building a csv parser into Rust would reduce effort 
> for future libs (incl DataFusion).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Assigned] (ARROW-3726) [Rust] CSV Reader & Writer

2018-11-19 Thread Andy Grove (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Grove reassigned ARROW-3726:
-

Assignee: Andy Grove  (was: Chao Sun)

> [Rust] CSV Reader & Writer
> --
>
> Key: ARROW-3726
> URL: https://issues.apache.org/jira/browse/ARROW-3726
> Project: Apache Arrow
>  Issue Type: New Feature
>  Components: Rust
>Reporter: nevi_me
>Assignee: Andy Grove
>Priority: Major
>
> As an Arrow Rust user, I would like to be able to read and write CSV files, 
> so that I can quickly ingest data into an Arrow format for further use, and 
> save outputs in CSV.
> As there aren't yet many options for working with tabular/df structures in 
> Rust (other than Andy's DataFusion), I'm struggling to motivate for this 
> feature. However, I think building a csv parser into Rust would reduce effort 
> for future libs (incl DataFusion).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Assigned] (ARROW-3766) [Python] pa.Table.from_pandas doesn't use schema ordering

2018-11-19 Thread Krisztian Szucs (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Krisztian Szucs reassigned ARROW-3766:
--

Assignee: Krisztian Szucs

> [Python] pa.Table.from_pandas doesn't use schema ordering
> -
>
> Key: ARROW-3766
> URL: https://issues.apache.org/jira/browse/ARROW-3766
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Reporter: Christian Thiel
>Assignee: Krisztian Szucs
>Priority: Major
>  Labels: parquet, pull-request-available
> Fix For: 0.12.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Pyarrow is sensitive to the order of the columns when loading partitioned 
> files.
> With the function {{pa.Table.from_pandas(dataframe, schema=my_schema)}} we 
> can apply a schema to a dataframe. I noticed that the returned {{pa.Table}} 
> object does use the ordering of pandas columns rather than the schema 
> columns. Furthermore it is possible to have columns in the schema but not in 
> the DataFrame (and hence in the resulting pa.Table).
> This behaviour requires a lot of fiddling with the pandas Frame in the first 
> place if we like to write compatible partitioned files. Hence I argue that 
> for {{pa.Table.from_pandas}}, and any other comparable function, the schema 
> should be the principal source for the Table structure and not the columns 
> and the ordering in the pandas DataFrame. If I specify a schema I simply 
> expect that the resulting Table actually has this schema.
> Here is a little example. If you remove the reordering of df2 everything 
> works fine:
> {code:python}
> import pyarrow as pa
> import pyarrow.parquet as pq
> import pandas as pd
> import os
> import numpy as np
> import shutil
> PATH_PYARROW_MANUAL = '/tmp/pyarrow_manual.pa/'
> if os.path.exists(PATH_PYARROW_MANUAL):
>     shutil.rmtree(PATH_PYARROW_MANUAL)
> os.mkdir(PATH_PYARROW_MANUAL)
> arrays = np.array([np.array([0, 1, 2]), np.array([3, 4]), np.nan, np.nan])
> strings = np.array([np.nan, np.nan, 'a', 'b'])
> df = pd.DataFrame([0, 0, 1, 1], columns=['partition_column'])
> df.index.name='DPRD_ID'
> df['arrays'] = pd.Series(arrays)
> df['strings'] = pd.Series(strings)
> my_schema = pa.schema([('DPRD_ID', pa.int64()),
>                        ('partition_column', pa.int32()),
>                        ('arrays', pa.list_(pa.int32())),
>                        ('strings', pa.string()),
>                        ('new_column', pa.string())])
> df1 = df[df.partition_column==0]
> df2 = df[df.partition_column==1][['strings', 'partition_column', 'arrays']]
> table1 = pa.Table.from_pandas(df1, schema=my_schema)
> table2 = pa.Table.from_pandas(df2, schema=my_schema)
> pq.write_table(table1, os.path.join(PATH_PYARROW_MANUAL, '1.pa'))
> pq.write_table(table2, os.path.join(PATH_PYARROW_MANUAL, '2.pa'))
> pd.read_parquet(PATH_PYARROW_MANUAL)
> {code}
> If 
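
The behaviour the reporter asks for, with the schema as the principal source of column order and membership, can be sketched without pyarrow. Plain dicts of lists stand in for the DataFrame and the Table here, and `conform_to_schema` is a hypothetical helper, not a pyarrow function.

```python
def conform_to_schema(columns, schema_names):
    """Order columns by the schema; fill schema columns missing from the data.

    `columns` maps column name -> list of values. Columns present in the
    data but absent from the schema are dropped in this sketch.
    """
    n_rows = len(next(iter(columns.values()), []))
    return {name: columns.get(name, [None] * n_rows) for name in schema_names}


# Column order as in the reordered df2 from the report above.
df2_like = {
    "strings": ["a", "b"],
    "partition_column": [1, 1],
    "arrays": [None, None],
}
schema_names = ["partition_column", "arrays", "strings", "new_column"]
table_like = conform_to_schema(df2_like, schema_names)
```

After the call, the keys of `table_like` follow `schema_names`, and `new_column` is present as an all-null column.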



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Updated] (ARROW-3609) [Gandiva] Move benchmark tests out of unit test

2018-11-19 Thread ASF GitHub Bot (JIRA)



 [ 
https://issues.apache.org/jira/browse/ARROW-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ARROW-3609:
--
Labels: pull-request-available  (was: )

> [Gandiva] Move benchmark tests out of unit test
> ---
>
> Key: ARROW-3609
> URL: https://issues.apache.org/jira/browse/ARROW-3609
> Project: Apache Arrow
>  Issue Type: Task
>  Components: C++, Gandiva
>Reporter: Praveen Kumar Desabandu
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.12.0
>
>
> Currently the benchmarks are run as integration tests. We should move them out as 
> gbenchmark tests.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
