[jira] [Updated] (ARROW-10331) [Rust] [DataFusion] Re-organize errors

2020-10-16 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-10331: Labels: pull-request-available (was: ) > [Rust] [DataFusion] Re-organize errors

[jira] [Created] (ARROW-10331) [Rust] [DataFusion] Re-organize errors

2020-10-16 Thread Jira
Jorge Leitão created ARROW-10331: Summary: [Rust] [DataFusion] Re-organize errors Key: ARROW-10331 URL: https://issues.apache.org/jira/browse/ARROW-10331 Project: Apache Arrow Issue Type:

[jira] [Commented] (ARROW-10330) [Rust][Datafusion] Implement nullif() function for DataFusion

2020-10-16 Thread Jira
[ https://issues.apache.org/jira/browse/ARROW-10330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17215738#comment-17215738 ] Jorge Leitão commented on ARROW-10330: Good idea. (y) I moved this to 3.0.0 to not block the

[jira] [Updated] (ARROW-10330) [Rust][Datafusion] Implement nullif() function for DataFusion

2020-10-16 Thread Jira
[ https://issues.apache.org/jira/browse/ARROW-10330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge Leitão updated ARROW-10330: Fix Version/s: (was: 2.0.0) 3.0.0 > [Rust][Datafusion] Implement

[jira] [Closed] (ARROW-10327) [Rust] [DataFusion] Iterator of futures

2020-10-16 Thread Jira
[ https://issues.apache.org/jira/browse/ARROW-10327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge Leitão closed ARROW-10327. Resolution: Won't Fix As discussed in #8473 and #8480, this is better handled via buffering, to

[jira] [Created] (ARROW-10330) [Rust][Datafusion] Implement nullif() function for DataFusion

2020-10-16 Thread Evan Chan (Jira)
Evan Chan created ARROW-10330: Summary: [Rust][Datafusion] Implement nullif() function for DataFusion Key: ARROW-10330 URL: https://issues.apache.org/jira/browse/ARROW-10330 Project: Apache Arrow

[jira] [Updated] (ARROW-10320) [Rust] Convert RecordBatchIterator to a Stream

2020-10-16 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-10320: Summary: [Rust] Convert RecordBatchIterator to a Stream (was: Convert RecordBatchIterator to a

[jira] [Assigned] (ARROW-5409) [C++] Improvement for IsIn Kernel when right array is small

2020-10-16 Thread David Sherrier (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Sherrier reassigned ARROW-5409: Assignee: David Sherrier > [C++] Improvement for IsIn Kernel when right array is small

[jira] [Commented] (ARROW-5409) [C++] Improvement for IsIn Kernel when right array is small

2020-10-16 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17215627#comment-17215627 ] Wes McKinney commented on ARROW-5409: Please go ahead. We'll need some benchmarks to get written so

[jira] [Updated] (ARROW-10329) [Rust][Datafusion] Datafusion queries involving a column name that begins with a number produces unexpected results

2020-10-16 Thread Morgan Cassels (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Morgan Cassels updated ARROW-10329: Summary: [Rust][Datafusion] Datafusion queries involving a column name that begins with a

[jira] [Updated] (ARROW-10329) Datafusion queries involving a column name that begins with a number produces unexpected results

2020-10-16 Thread Morgan Cassels (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Morgan Cassels updated ARROW-10329: Description: This bug can be worked around by wrapping column names in quotes. Example:

[jira] [Created] (ARROW-10329) Datafusion queries involving a column name that begins with a number produces unexpected results

2020-10-16 Thread Morgan Cassels (Jira)
Morgan Cassels created ARROW-10329: Summary: Datafusion queries involving a column name that begins with a number produces unexpected results Key: ARROW-10329 URL:

[jira] [Updated] (ARROW-10321) [C++] Building AVX512 code when we should not

2020-10-16 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-10321: Fix Version/s: (was: 2.0.0) 3.0.0 > [C++] Building AVX512 code

[jira] [Resolved] (ARROW-10321) [C++] Building AVX512 code when we should not

2020-10-16 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson resolved ARROW-10321. - Fix Version/s: 2.0.0 Resolution: Fixed Issue resolved by pull request 8478

[jira] [Commented] (ARROW-10308) [Python] read_csv from python is slow on some work loads

2020-10-16 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17215519#comment-17215519 ] Antoine Pitrou commented on ARROW-10308: > Antoine, do you think this is a good idea? Do you

[jira] [Commented] (ARROW-10308) [Python] read_csv from python is slow on some work loads

2020-10-16 Thread Dror Speiser (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17215483#comment-17215483 ] Dror Speiser commented on ARROW-10308: Yeah, Azure doesn't tell me how many physical cores are at

[jira] [Updated] (ARROW-10313) [C++] Improve UTF8 validation speed and CSV string conversion

2020-10-16 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-10313: Fix Version/s: (was: 2.0.0) 3.0.0 > [C++] Improve UTF8 validation

[jira] [Closed] (ARROW-10324) function read_parquet(*,as_data_frame=TRUE) fails when embedded nuls present.

2020-10-16 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson closed ARROW-10324. Assignee: Neal Richardson Resolution: Duplicate > function

[jira] [Commented] (ARROW-10324) function read_parquet(*,as_data_frame=TRUE) fails when embedded nuls present.

2020-10-16 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17215444#comment-17215444 ] Neal Richardson commented on ARROW-10324: This is the same as ARROW-6582. We're working on a

[jira] [Updated] (ARROW-10328) [C++] Consider using fast-double-parser

2020-10-16 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou updated ARROW-10328: Description: We use Google's double-conversion library for parsing strings to doubles. We

[jira] [Created] (ARROW-10328) [C++] Consider using fast-double-parser

2020-10-16 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-10328: Summary: [C++] Consider using fast-double-parser Key: ARROW-10328 URL: https://issues.apache.org/jira/browse/ARROW-10328 Project: Apache Arrow Issue

[jira] [Updated] (ARROW-10327) [Rust] [DataFusion] Iterator of futures

2020-10-16 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-10327: Labels: pull-request-available (was: ) > [Rust] [DataFusion] Iterator of futures

[jira] [Commented] (ARROW-10308) [Python] read_csv from python is slow on some work loads

2020-10-16 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17215407#comment-17215407 ] Antoine Pitrou commented on ARROW-10308: For the record, on a 12-core 24-thread CPU, I get

[jira] [Created] (ARROW-10327) [Rust] [DataFusion] Iterator of futures

2020-10-16 Thread Jira
Jorge Leitão created ARROW-10327: Summary: [Rust] [DataFusion] Iterator of futures Key: ARROW-10327 URL: https://issues.apache.org/jira/browse/ARROW-10327 Project: Apache Arrow Issue Type:

[jira] [Commented] (ARROW-10197) [Gandiva][python] Execute expression on filtered data

2020-10-16 Thread Kirill Lykov (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17215400#comment-17215400 ] Kirill Lykov commented on ARROW-10197: To simplify navigation, PR is there

[jira] [Commented] (ARROW-10308) [Python] read_csv from python is slow on some work loads

2020-10-16 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17215401#comment-17215401 ] Antoine Pitrou commented on ARROW-10308: "vcpu" doesn't mean anything precise unfortunately.

[jira] [Commented] (ARROW-10308) [Python] read_csv from python is slow on some work loads

2020-10-16 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17215396#comment-17215396 ] Wes McKinney commented on ARROW-10308: I do think we should be doing better here than we are so it

[jira] [Commented] (ARROW-10308) [Python] read_csv from python is slow on some work loads

2020-10-16 Thread Dror Speiser (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17215393#comment-17215393 ] Dror Speiser commented on ARROW-10308: Thanks for the suggestions :) I am indeed getting the files

[jira] [Updated] (ARROW-10326) [Rust] Add missing method docs for Arrays

2020-10-16 Thread Mahmut Bulut (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahmut Bulut updated ARROW-10326: Description: Whenever a PR comes we don't inspect documentation thus some of the methods are

[jira] [Updated] (ARROW-10326) [Rust] Add missing method docs for Arrays

2020-10-16 Thread Mahmut Bulut (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mahmut Bulut updated ARROW-10326: Description: Currently, whenever a PR comes we don't inspect documentation thus some of the

[jira] [Created] (ARROW-10326) [Rust] Add missing method docs for Arrays

2020-10-16 Thread Mahmut Bulut (Jira)
Mahmut Bulut created ARROW-10326: Summary: [Rust] Add missing method docs for Arrays Key: ARROW-10326 URL: https://issues.apache.org/jira/browse/ARROW-10326 Project: Apache Arrow Issue Type:

[jira] [Commented] (ARROW-9707) [Rust] [DataFusion] Re-implement threading model

2020-10-16 Thread Andrew Lamb (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17215335#comment-17215335 ] Andrew Lamb commented on ARROW-9707: FWIW now that DataFusion uses `async` --

[jira] [Assigned] (ARROW-10311) [Release] Update crossbow verification process

2020-10-16 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs reassigned ARROW-10311: Assignee: Krisztian Szucs > [Release] Update crossbow verification process

[jira] [Resolved] (ARROW-10311) [Release] Update crossbow verification process

2020-10-16 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs resolved ARROW-10311. Resolution: Fixed Issue resolved by pull request 8464

[jira] [Created] (ARROW-10325) [C++][Compute] Separate aggregate kernel registration

2020-10-16 Thread Yibo Cai (Jira)
Yibo Cai created ARROW-10325: Summary: [C++][Compute] Separate aggregate kernel registration Key: ARROW-10325 URL: https://issues.apache.org/jira/browse/ARROW-10325 Project: Apache Arrow Issue

[jira] [Resolved] (ARROW-9898) [C++][Gandiva] Error handling in castINT fails in some enviroments

2020-10-16 Thread Praveen Kumar (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Praveen Kumar resolved ARROW-9898. Fix Version/s: 2.0.0 Resolution: Fixed Issue resolved by pull request 8096

[jira] [Updated] (ARROW-9898) [C++][Gandiva] Error handling in castINT fails in some enviroments

2020-10-16 Thread Praveen Kumar (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Praveen Kumar updated ARROW-9898: Component/s: C++ - Gandiva > [C++][Gandiva] Error handling in castINT fails in some enviroments

[jira] [Resolved] (ARROW-10313) [C++] Improve UTF8 validation speed and CSV string conversion

2020-10-16 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou resolved ARROW-10313. Fix Version/s: (was: 3.0.0) 2.0.0 Resolution: Fixed Issue

[jira] [Updated] (ARROW-10321) [C++] Building AVX512 code when we should not

2020-10-16 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-10321: Labels: pull-request-available (was: ) > [C++] Building AVX512 code when we should not

[jira] [Updated] (ARROW-10324) function read_parquet(*,as_data_frame=TRUE) fails when embedded nuls present.

2020-10-16 Thread Akash Shah (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10324?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash Shah updated ARROW-10324: Docs Text: > sessionInfo() R version 3.4.4 (2018-03-15) Platform: x86_64-pc-linux-gnu (64-bit)

[jira] [Assigned] (ARROW-10321) [C++] Building AVX512 code when we should not

2020-10-16 Thread Frank Du (Jira)
[ https://issues.apache.org/jira/browse/ARROW-10321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frank Du reassigned ARROW-10321: Assignee: Frank Du > [C++] Building AVX512 code when we should not

[jira] [Created] (ARROW-10324) function read_parquet(*,as_data_frame=TRUE) fails when embedded nuls present.

2020-10-16 Thread Akash Shah (Jira)
Akash Shah created ARROW-10324: Summary: function read_parquet(*,as_data_frame=TRUE) fails when embedded nuls present. Key: ARROW-10324 URL: https://issues.apache.org/jira/browse/ARROW-10324 Project: