[jira] [Updated] (ARROW-8906) [Rust] Support reading multiple CSV files for schema inference

2020-05-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-8906: -- Labels: pull-request-available (was: ) > [Rust] Support reading multiple CSV files for schema

[jira] [Created] (ARROW-8906) [Rust] Support reading multiple CSV files for schema inference

2020-05-22 Thread QP Hou (Jira)
QP Hou created ARROW-8906: - Summary: [Rust] Support reading multiple CSV files for schema inference Key: ARROW-8906 URL: https://issues.apache.org/jira/browse/ARROW-8906 Project: Apache Arrow Issue

[jira] [Commented] (ARROW-8901) [C++] Reduce number of take kernels

2020-05-22 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114518#comment-17114518 ] Wes McKinney commented on ARROW-8901: - We probably need at least int8 through int64 (so we can use

[jira] [Updated] (ARROW-8905) [C++] Collapse Take APIs from 8 to 1 or 2

2020-05-22 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-8905: Description: There are currently 8 {{arrow::compute::Take}} functions with different function

[jira] [Created] (ARROW-8905) [C++] Collapse Take APIs from 8 to 1 or 2

2020-05-22 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-8905: --- Summary: [C++] Collapse Take APIs from 8 to 1 or 2 Key: ARROW-8905 URL: https://issues.apache.org/jira/browse/ARROW-8905 Project: Apache Arrow Issue Type:

[jira] [Commented] (ARROW-6775) [C++] [Python] Proposal for several Array utility functions

2020-05-22 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114510#comment-17114510 ] Wes McKinney commented on ARROW-6775: - I think these can all be implemented as kernels with the new

[jira] [Commented] (ARROW-3520) [C++] Implement List Flatten kernel

2020-05-22 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-3520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114509#comment-17114509 ] Wes McKinney commented on ARROW-3520: - This would be fine as a {{VectorFunction}} > [C++] Implement

[jira] [Created] (ARROW-8904) [Python] Fix usages of deprecated C++ APIs related to child/field

2020-05-22 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-8904: --- Summary: [Python] Fix usages of deprecated C++ APIs related to child/field Key: ARROW-8904 URL: https://issues.apache.org/jira/browse/ARROW-8904 Project: Apache Arrow

[jira] [Created] (ARROW-8903) [C++] Implement optimized "unsafe take" for use with selection vectors for kernel execution

2020-05-22 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-8903: --- Summary: [C++] Implement optimized "unsafe take" for use with selection vectors for kernel execution Key: ARROW-8903 URL: https://issues.apache.org/jira/browse/ARROW-8903

[jira] [Resolved] (ARROW-8815) [Dev][Release] Binary upload script should retry on unexpected bintray request error

2020-05-22 Thread Kouhei Sutou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kouhei Sutou resolved ARROW-8815. - Resolution: Fixed Issue resolved by pull request 7192

[jira] [Created] (ARROW-8902) [rust][datafusion] optimize count(*) queries on parquet sources

2020-05-22 Thread Alex Gaynor (Jira)
Alex Gaynor created ARROW-8902: -- Summary: [rust][datafusion] optimize count(*) queries on parquet sources Key: ARROW-8902 URL: https://issues.apache.org/jira/browse/ARROW-8902 Project: Apache Arrow

[jira] [Created] (ARROW-8901) [C++] Reduce number of take kernels

2020-05-22 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-8901: --- Summary: [C++] Reduce number of take kernels Key: ARROW-8901 URL: https://issues.apache.org/jira/browse/ARROW-8901 Project: Apache Arrow Issue Type:

[jira] [Created] (ARROW-8900) Respect HTTP(S)_PROXY for S3 Filesystems and/or expose proxy options as parameters

2020-05-22 Thread Daniel Nugent (Jira)
Daniel Nugent created ARROW-8900: Summary: Respect HTTP(S)_PROXY for S3 Filesystems and/or expose proxy options as parameters Key: ARROW-8900 URL: https://issues.apache.org/jira/browse/ARROW-8900

[jira] [Commented] (ARROW-4390) [R] Serialize "labeled" metadata in Feather files, IPC messages

2020-05-22 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-4390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114428#comment-17114428 ] Neal Richardson commented on ARROW-4390: After exploring more, I don't think this requires an

[jira] [Created] (ARROW-8899) [R] Add R metadata like pandas metadata for round-trip fidelity

2020-05-22 Thread Neal Richardson (Jira)
Neal Richardson created ARROW-8899: -- Summary: [R] Add R metadata like pandas metadata for round-trip fidelity Key: ARROW-8899 URL: https://issues.apache.org/jira/browse/ARROW-8899 Project: Apache

[jira] [Created] (ARROW-8898) [C++] Determine desirable maximum length for ExecBatch in pipelined and parallel execution of kernels

2020-05-22 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-8898: --- Summary: [C++] Determine desirable maximum length for ExecBatch in pipelined and parallel execution of kernels Key: ARROW-8898 URL: https://issues.apache.org/jira/browse/ARROW-8898

[jira] [Created] (ARROW-8897) [C++] Determine strategy for propagating failures in initializing built-in function registry in arrow/compute

2020-05-22 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-8897: --- Summary: [C++] Determine strategy for propagating failures in initializing built-in function registry in arrow/compute Key: ARROW-8897 URL:

[jira] [Created] (ARROW-8896) [C++] Reimplement dictionary unpacking in Cast kernels using Take

2020-05-22 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-8896: --- Summary: [C++] Reimplement dictionary unpacking in Cast kernels using Take Key: ARROW-8896 URL: https://issues.apache.org/jira/browse/ARROW-8896 Project: Apache Arrow

[jira] [Updated] (ARROW-8895) [C++] Add C++ unit tests for filter and take functions on temporal type inputs, including timestamps

2020-05-22 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-8895: Summary: [C++] Add C++ unit tests for filter and take functions on temporal type inputs, including

[jira] [Created] (ARROW-8895) [C++] Add C++ unit tests for filter function on temporal type inputs, including timestamps

2020-05-22 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-8895: --- Summary: [C++] Add C++ unit tests for filter function on temporal type inputs, including timestamps Key: ARROW-8895 URL: https://issues.apache.org/jira/browse/ARROW-8895

[jira] [Updated] (ARROW-8894) [C++] C++ array kernels framework and execution buildout (umbrella issue)

2020-05-22 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-8894: Description: In the wake of ARROW-8792, this issue is to serve as an umbrella issue for follow up

[jira] [Created] (ARROW-8894) [C++] C++ array kernels framework and execution buildout (umbrella issue)

2020-05-22 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-8894: --- Summary: [C++] C++ array kernels framework and execution buildout (umbrella issue) Key: ARROW-8894 URL: https://issues.apache.org/jira/browse/ARROW-8894 Project:

[jira] [Resolved] (ARROW-8890) [R] Fix C++ lint issue

2020-05-22 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques resolved ARROW-8890. --- Resolution: Fixed Issue resolved by pull request 7251

[jira] [Closed] (ARROW-8893) [R] Fix cpplint issues introduced by ARROW-8885

2020-05-22 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney closed ARROW-8893. --- Fix Version/s: (was: 1.0.0) Resolution: Duplicate dup of ARROW-8890 > [R] Fix cpplint

[jira] [Resolved] (ARROW-8455) [Rust] [Parquet] Arrow column read on partially compatible files

2020-05-22 Thread Chao Sun (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun resolved ARROW-8455. - Fix Version/s: 1.0.0 Resolution: Fixed Issue resolved by pull request 6935

[jira] [Created] (ARROW-8893) [R] Fix cpplint issues introduced by ARROW-8885

2020-05-22 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-8893: --- Summary: [R] Fix cpplint issues introduced by ARROW-8885 Key: ARROW-8893 URL: https://issues.apache.org/jira/browse/ARROW-8893 Project: Apache Arrow Issue

[jira] [Created] (ARROW-8892) [C++][CI] CI builds for MSVC do not build benchmarks

2020-05-22 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-8892: --- Summary: [C++][CI] CI builds for MSVC do not build benchmarks Key: ARROW-8892 URL: https://issues.apache.org/jira/browse/ARROW-8892 Project: Apache Arrow

[jira] [Commented] (ARROW-555) [C++] String algorithm library for StringArray/BinaryArray

2020-05-22 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114217#comment-17114217 ] Wes McKinney commented on ARROW-555: Yes, that's the idea. I can try to implement {{str.split}} which

[jira] [Commented] (ARROW-8878) [R] how to install when behind a firewall?

2020-05-22 Thread Olaf (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114186#comment-17114186 ] Olaf commented on ARROW-8878: - Hi [~npr], thanks for replying back. Please see below:   *

[jira] [Commented] (ARROW-555) [C++] String algorithm library for StringArray/BinaryArray

2020-05-22 Thread Maarten Breddels (Jira)
[ https://issues.apache.org/jira/browse/ARROW-555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17114105#comment-17114105 ] Maarten Breddels commented on ARROW-555: Sounds good. I think it would help me a lot to see

[jira] [Created] (ARROW-8891) [C++] Split non-cast compute kernels into a separate shared library

2020-05-22 Thread Wes McKinney (Jira)
Wes McKinney created ARROW-8891: --- Summary: [C++] Split non-cast compute kernels into a separate shared library Key: ARROW-8891 URL: https://issues.apache.org/jira/browse/ARROW-8891 Project: Apache

[jira] [Assigned] (ARROW-8510) [C++] arrow/dataset/file_base.cc fails to compile with internal compiler error with "Visual Studio 15 2017 Win64" generator

2020-05-22 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-8510: - Assignee: Francois Saint-Jacques > [C++] arrow/dataset/file_base.cc

[jira] [Updated] (ARROW-8890) [R] Fix C++ lint issue

2020-05-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-8890: -- Labels: pull-request-available (was: ) > [R] Fix C++ lint issue > --- >

[jira] [Assigned] (ARROW-8889) [Python] Python 3.7 SIGSEGV when comparing RecordBatch to None

2020-05-22 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-8889: - Assignee: David Li > [Python] Python 3.7 SIGSEGV when comparing

[jira] [Resolved] (ARROW-8889) [Python] Python 3.7 SIGSEGV when comparing RecordBatch to None

2020-05-22 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques resolved ARROW-8889. --- Fix Version/s: 1.0.0 Resolution: Fixed Issue resolved by pull request

[jira] [Created] (ARROW-8890) [R] Fix C++ lint issue

2020-05-22 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-8890: - Summary: [R] Fix C++ lint issue Key: ARROW-8890 URL: https://issues.apache.org/jira/browse/ARROW-8890 Project: Apache Arrow Issue Type:

[jira] [Updated] (ARROW-8889) [Python] Python 3.7 SIGSEGV when comparing RecordBatch to None

2020-05-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-8889: -- Labels: pull-request-available (was: ) > [Python] Python 3.7 SIGSEGV when comparing

[jira] [Commented] (ARROW-8889) [Python] Python 3.7 SIGSEGV when comparing RecordBatch to None

2020-05-22 Thread David Li (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17113981#comment-17113981 ] David Li commented on ARROW-8889: - I tried with a wheel for 0.15.1 and it happens as well. (It doesn't

[jira] [Updated] (ARROW-8889) [Python] Python 3.7 SIGSEGV when comparing RecordBatch to None

2020-05-22 Thread David Li (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Li updated ARROW-8889: Affects Version/s: 0.15.1 > [Python] Python 3.7 SIGSEGV when comparing RecordBatch to None >

[jira] [Commented] (ARROW-8889) [Python] Python 3.7 SIGSEGV when comparing RecordBatch to None

2020-05-22 Thread David Li (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17113977#comment-17113977 ] David Li commented on ARROW-8889: - I have a core dump but it's too large. Let me upload it somewhere

[jira] [Created] (ARROW-8889) [Python] Python 3.7 SIGSEGV when comparing RecordBatch to None

2020-05-22 Thread David Li (Jira)
David Li created ARROW-8889: --- Summary: [Python] Python 3.7 SIGSEGV when comparing RecordBatch to None Key: ARROW-8889 URL: https://issues.apache.org/jira/browse/ARROW-8889 Project: Apache Arrow

[jira] [Resolved] (ARROW-8885) [R] Don't include everything everywhere

2020-05-22 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques resolved ARROW-8885. --- Resolution: Fixed Issue resolved by pull request 7245

[jira] [Resolved] (ARROW-8696) [Java] Convert tests to integration tests

2020-05-22 Thread Ryan Murray (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Murray resolved ARROW-8696. Resolution: Fixed > [Java] Convert tests to integration tests >

[jira] [Commented] (ARROW-8696) [Java] Convert tests to integration tests

2020-05-22 Thread Ryan Murray (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17113870#comment-17113870 ] Ryan Murray commented on ARROW-8696: Closed in https://github.com/apache/arrow/pull/7100 via 93ba086

[jira] [Updated] (ARROW-8888) [Python] Heuristic in dataframe_to_arrays that decides to multithread convert cause slow conversions

2020-05-22 Thread Kevin Glasson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Glasson updated ARROW-8888: - Description: When calling pa.Table.from_pandas() the code path that uses the ThreadPoolExecutor

[jira] [Updated] (ARROW-8888) Heuristic in dataframe_to_arrays that decides to multithread convert cause slow conversions

2020-05-22 Thread Kevin Glasson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Glasson updated ARROW-8888: - Description: When calling pa.Table.from_pandas() the code path that uses the ThreadPoolExecutor

[jira] [Created] (ARROW-8888) Heuristic in dataframe_to_arrays that decides to multithread convert cause slow conversions

2020-05-22 Thread Kevin Glasson (Jira)
Kevin Glasson created ARROW-8888: Summary: Heuristic in dataframe_to_arrays that decides to multithread convert cause slow conversions Key: ARROW-8888 URL: https://issues.apache.org/jira/browse/ARROW-8888

[jira] [Updated] (ARROW-8402) [Java] Support ValidateFull methods in Java

2020-05-22 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-8402: -- Labels: pull-request-available (was: ) > [Java] Support ValidateFull methods in Java >

[jira] [Resolved] (ARROW-8887) [Java] Buffer size for complex vectors increases rapidly in case of clear/write loop

2020-05-22 Thread Pindikura Ravindra (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pindikura Ravindra resolved ARROW-8887. --- Fix Version/s: 1.0.0 Resolution: Fixed Issue resolved by pull request 7247