[jira] [Updated] (ARROW-6155) [Java] Extract a super interface for vectors whose elements reside in continuous memory segments

2019-08-06 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6155: -- Labels: pull-request-available (was: ) > [Java] Extract a super interface for vectors whose

[jira] [Created] (ARROW-6155) [Java] Extract a super interface for vectors whose elements reside in continuous memory segments

2019-08-06 Thread Liya Fan (JIRA)
Liya Fan created ARROW-6155: --- Summary: [Java] Extract a super interface for vectors whose elements reside in continuous memory segments Key: ARROW-6155 URL: https://issues.apache.org/jira/browse/ARROW-6155

[jira] [Updated] (ARROW-6154) [Rust] Too many open files (os error 24)

2019-08-06 Thread Micah Kornfield (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Micah Kornfield updated ARROW-6154: --- Summary: [Rust] Too many open files (os error 24) (was: Too many open files (os error 24))

[jira] [Resolved] (ARROW-5772) [GLib][Plasma][CUDA] Plasma::Client#refer_object test is failed

2019-08-06 Thread Yosuke Shiro (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yosuke Shiro resolved ARROW-5772. - Resolution: Fixed Fix Version/s: (was: 1.0.0) 0.15.0 Issue

[jira] [Created] (ARROW-6154) Too many open files (os error 24)

2019-08-06 Thread Yesh (JIRA)
Yesh created ARROW-6154: --- Summary: Too many open files (os error 24) Key: ARROW-6154 URL: https://issues.apache.org/jira/browse/ARROW-6154 Project: Apache Arrow Issue Type: Bug Components:

[jira] [Updated] (ARROW-6142) [R] Install instructions on linux could be clearer

2019-08-06 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6142: -- Labels: documentation pull-request-available (was: documentation) > [R] Install instructions

[jira] [Created] (ARROW-6153) [R] Address parquet deprecation warning

2019-08-06 Thread Neal Richardson (JIRA)
Neal Richardson created ARROW-6153: -- Summary: [R] Address parquet deprecation warning Key: ARROW-6153 URL: https://issues.apache.org/jira/browse/ARROW-6153 Project: Apache Arrow Issue Type:

[jira] [Commented] (ARROW-3246) [Python][Parquet] direct reading/writing of pandas categoricals in parquet

2019-08-06 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901508#comment-16901508 ] Wes McKinney commented on ARROW-3246: - I created ARROW-6152 to cover the initial feature-preserving

[jira] [Created] (ARROW-6152) [C++][Parquet] Write arrow::Array directly into parquet::TypedColumnWriter

2019-08-06 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-6152: --- Summary: [C++][Parquet] Write arrow::Array directly into parquet::TypedColumnWriter Key: ARROW-6152 URL: https://issues.apache.org/jira/browse/ARROW-6152 Project:

[jira] [Commented] (ARROW-3246) [Python][Parquet] direct reading/writing of pandas categoricals in parquet

2019-08-06 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901506#comment-16901506 ] Wes McKinney commented on ARROW-3246: - I've been looking at what's required to write

[jira] [Commented] (ARROW-6131) [C++] Optimize the Arrow UTF-8-string-validation

2019-08-06 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901471#comment-16901471 ] Wes McKinney commented on ARROW-6131: - In principle this seems OK to me. We can discuss further in a

[jira] [Commented] (ARROW-6151) [R] See if possible to generate r/inst/NOTICE.txt rather than duplicate information

2019-08-06 Thread Neal Richardson (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901434#comment-16901434 ] Neal Richardson commented on ARROW-6151: Me too. This was discussed here: 

[jira] [Created] (ARROW-6151) [R] See if possible to generate r/inst/NOTICE.txt rather than duplicate information

2019-08-06 Thread Wes McKinney (JIRA)
Wes McKinney created ARROW-6151: --- Summary: [R] See if possible to generate r/inst/NOTICE.txt rather than duplicate information Key: ARROW-6151 URL: https://issues.apache.org/jira/browse/ARROW-6151

[jira] [Commented] (ARROW-6150) [Python] Intermittent HDFS error

2019-08-06 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901399#comment-16901399 ] Wes McKinney commented on ARROW-6150: - The RPC port depends on your Hadoop configuration. You can

[jira] [Comment Edited] (ARROW-6150) [Python] Intermittent HDFS error

2019-08-06 Thread Saurabh Bajaj (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901397#comment-16901397 ] Saurabh Bajaj edited comment on ARROW-6150 at 8/6/19 7:12 PM: -- I tried

[jira] [Commented] (ARROW-6150) [Python] Intermittent HDFS error

2019-08-06 Thread Saurabh Bajaj (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901397#comment-16901397 ] Saurabh Bajaj commented on ARROW-6150: -- I tried setting `port=8020` in `pa.hdfs.connect()`, but same

[jira] [Updated] (ARROW-6150) [Python] Intermittent HDFS error

2019-08-06 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-6150: Summary: [Python] Intermittent HDFS error (was: Intermittent Pyarrow HDFS IO error) > [Python]

[jira] [Commented] (ARROW-6150) Intermittent Pyarrow HDFS IO error

2019-08-06 Thread Saurabh Bajaj (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901394#comment-16901394 ] Saurabh Bajaj commented on ARROW-6150: -- [~wesmckinn] Thanks for your response!  I found 

[jira] [Commented] (ARROW-6150) Intermittent Pyarrow HDFS IO error

2019-08-06 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901386#comment-16901386 ] Wes McKinney commented on ARROW-6150: - Our use of libhdfs is pretty straightforward, so the issue

[jira] [Resolved] (ARROW-6084) [Python] Support LargeList

2019-08-06 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-6084. - Resolution: Fixed Fix Version/s: 0.15.0 Issue resolved by pull request 4979

[jira] [Created] (ARROW-6150) Intermittent Pyarrow HDFS IO error

2019-08-06 Thread Saurabh Bajaj (JIRA)
Saurabh Bajaj created ARROW-6150: Summary: Intermittent Pyarrow HDFS IO error Key: ARROW-6150 URL: https://issues.apache.org/jira/browse/ARROW-6150 Project: Apache Arrow Issue Type: Bug

[jira] [Closed] (ARROW-5922) [Python] Unable to connect to HDFS from a worker/data node on a Kerberized cluster using pyarrow's hdfs API

2019-08-06 Thread Saurabh Bajaj (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Saurabh Bajaj closed ARROW-5922. Resolution: Works for Me > [Python] Unable to connect to HDFS from a worker/data node on a

[jira] [Resolved] (ARROW-6088) [Rust] [DataFusion] Implement parallel execution for projection

2019-08-06 Thread Andy Grove (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Grove resolved ARROW-6088. --- Resolution: Fixed Fix Version/s: 0.15.0 Issue resolved by pull request 4988

[jira] [Updated] (ARROW-6149) [Parquet] Decimal comparisons used for min/max statistics are not correct

2019-08-06 Thread Philip Felton (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Felton updated ARROW-6149: - Affects Version/s: 0.14.1 > [Parquet] Decimal comparisons used for min/max statistics are not

[jira] [Updated] (ARROW-5977) [C++] [Python] Method for read_csv to limit which columns are read?

2019-08-06 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-5977: -- Labels: csv pull-request-available (was: csv) > [C++] [Python] Method for read_csv to limit

[jira] [Commented] (ARROW-6055) [C++] Refactor arrow/io/hdfs.h to use common FileSystem API

2019-08-06 Thread Benjamin Kietzman (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901210#comment-16901210 ] Benjamin Kietzman commented on ARROW-6055: -- In addition to removing io::FileSystem and

[jira] [Assigned] (ARROW-5977) [C++] [Python] Method for read_csv to limit which columns are read?

2019-08-06 Thread Antoine Pitrou (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou reassigned ARROW-5977: - Assignee: Antoine Pitrou > [C++] [Python] Method for read_csv to limit which columns

[jira] [Updated] (ARROW-6149) [Parquet] Decimal comparisons used for min/max statistics are not correct

2019-08-06 Thread Philip Felton (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Felton updated ARROW-6149: - Description: The [Parquet Format

[jira] [Created] (ARROW-6149) [Parquet] Decimal comparisons used for min/max statistics are not correct

2019-08-06 Thread Philip Felton (JIRA)
Philip Felton created ARROW-6149: Summary: [Parquet] Decimal comparisons used for min/max statistics are not correct Key: ARROW-6149 URL: https://issues.apache.org/jira/browse/ARROW-6149 Project:

[jira] [Assigned] (ARROW-6055) [C++] Refactor arrow/io/hdfs.h to use common FileSystem API

2019-08-06 Thread Benjamin Kietzman (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Kietzman reassigned ARROW-6055: Assignee: Benjamin Kietzman > [C++] Refactor arrow/io/hdfs.h to use common

[jira] [Commented] (ARROW-6055) [C++] Refactor arrow/io/hdfs.h to use common FileSystem API

2019-08-06 Thread Benjamin Kietzman (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901168#comment-16901168 ] Benjamin Kietzman commented on ARROW-6055: -- [~wesmckinn] should io::FileSystem be deprecated or

[jira] [Commented] (ARROW-5977) [C++] [Python] Method for read_csv to limit which columns are read?

2019-08-06 Thread Neal Richardson (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901145#comment-16901145 ] Neal Richardson commented on ARROW-5977: Right, that was the other option that I said that all of

[jira] [Commented] (ARROW-5977) [C++] [Python] Method for read_csv to limit which columns are read?

2019-08-06 Thread Francois Saint-Jacques (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901142#comment-16901142 ] Francois Saint-Jacques commented on ARROW-5977: --- Can we re-use the

[jira] [Commented] (ARROW-5977) [C++] [Python] Method for read_csv to limit which columns are read?

2019-08-06 Thread Neal Richardson (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901141#comment-16901141 ] Neal Richardson commented on ARROW-5977: Yeah I agree that just an {{include_columns}} argument

[jira] [Commented] (ARROW-5766) [Python] Unpin jpype1 version

2019-08-06 Thread Antoine Pitrou (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901139#comment-16901139 ] Antoine Pitrou commented on ARROW-5766: --- [~xhochy] > [Python] Unpin jpype1 version >

[jira] [Commented] (ARROW-5977) [C++] [Python] Method for read_csv to limit which columns are read?

2019-08-06 Thread Antoine Pitrou (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901137#comment-16901137 ] Antoine Pitrou commented on ARROW-5977: --- So, to make things clear, a column in {{include_columns}}

[jira] [Commented] (ARROW-6142) [R] Install instructions on linux could be clearer

2019-08-06 Thread Neal Richardson (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901135#comment-16901135 ] Neal Richardson commented on ARROW-6142: I've revised the README along these lines in 

[jira] [Commented] (ARROW-5977) [C++] [Python] Method for read_csv to limit which columns are read?

2019-08-06 Thread Wes McKinney (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901133#comment-16901133 ] Wes McKinney commented on ARROW-5977: - I think just include is okay. It might make sense to

[jira] [Assigned] (ARROW-6142) [R] Install instructions on linux could be clearer

2019-08-06 Thread Neal Richardson (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson reassigned ARROW-6142: -- Assignee: Neal Richardson > [R] Install instructions on linux could be clearer >

[jira] [Updated] (ARROW-6039) [GLib] Add garrow_array_filter()

2019-08-06 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6039: -- Labels: pull-request-available (was: ) > [GLib] Add garrow_array_filter() >

[jira] [Commented] (ARROW-5977) [C++] [Python] Method for read_csv to limit which columns are read?

2019-08-06 Thread Antoine Pitrou (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901126#comment-16901126 ] Antoine Pitrou commented on ARROW-5977: --- Ok, so what kind of ergonomics would you favour? Simply a

[jira] [Commented] (ARROW-5977) [C++] [Python] Method for read_csv to limit which columns are read?

2019-08-06 Thread Neal Richardson (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901124#comment-16901124 ] Neal Richardson commented on ARROW-5977: All of R's main CSV readers support this. One way they

[jira] [Updated] (ARROW-6039) [GLib] Add garrow_array_filter()

2019-08-06 Thread Yosuke Shiro (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yosuke Shiro updated ARROW-6039: Fix Version/s: (was: 1.0.0) 0.15.0 > [GLib] Add garrow_array_filter() >

[jira] [Commented] (ARROW-5977) [C++] [Python] Method for read_csv to limit which columns are read?

2019-08-06 Thread Antoine Pitrou (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901074#comment-16901074 ] Antoine Pitrou commented on ARROW-5977: --- [~npr] Ping. > [C++] [Python] Method for read_csv to

[jira] [Assigned] (ARROW-6148) Missing debian build dependencies

2019-08-06 Thread Marcin Juszkiewicz (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcin Juszkiewicz reassigned ARROW-6148: - Assignee: Marcin Juszkiewicz > Missing debian build dependencies >

[jira] [Updated] (ARROW-6148) Missing debian build dependencies

2019-08-06 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6148: -- Labels: pull-request-available (was: ) > Missing debian build dependencies >

[jira] [Created] (ARROW-6148) Missing debian build dependencies

2019-08-06 Thread Francois Saint-Jacques (JIRA)
Francois Saint-Jacques created ARROW-6148: - Summary: Missing debian build dependencies Key: ARROW-6148 URL: https://issues.apache.org/jira/browse/ARROW-6148 Project: Apache Arrow

[jira] [Created] (ARROW-6147) [Go] implement a Flight client

2019-08-06 Thread Sebastien Binet (JIRA)
Sebastien Binet created ARROW-6147: -- Summary: [Go] implement a Flight client Key: ARROW-6147 URL: https://issues.apache.org/jira/browse/ARROW-6147 Project: Apache Arrow Issue Type: New

[jira] [Commented] (ARROW-6107) [Go] ipc.Writer Option to skip appending data buffers

2019-08-06 Thread Sebastien Binet (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16900964#comment-16900964 ] Sebastien Binet commented on ARROW-6107: ok. (just nit-picking but to really assess the CGo

[jira] [Created] (ARROW-6146) [Go] implement a Plasma client

2019-08-06 Thread Sebastien Binet (JIRA)
Sebastien Binet created ARROW-6146: -- Summary: [Go] implement a Plasma client Key: ARROW-6146 URL: https://issues.apache.org/jira/browse/ARROW-6146 Project: Apache Arrow Issue Type: New

[jira] [Updated] (ARROW-6145) [Java] UnionVector created by MinorType#getNewVector could not keep field type info properly

2019-08-06 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6145: -- Labels: pull-request-available (was: ) > [Java] UnionVector created by MinorType#getNewVector

[jira] [Created] (ARROW-6145) [Java] UnionVector created by MinorType#getNewVector could not keep field type info properly

2019-08-06 Thread Ji Liu (JIRA)
Ji Liu created ARROW-6145: - Summary: [Java] UnionVector created by MinorType#getNewVector could not keep field type info properly Key: ARROW-6145 URL: https://issues.apache.org/jira/browse/ARROW-6145

[jira] [Updated] (ARROW-6144) Implement random function in Gandiva

2019-08-06 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6144: -- Labels: pull-request-available (was: ) > Implement random function in Gandiva >

[jira] [Updated] (ARROW-6038) [Python] pyarrow.Table.from_batches produces corrupted table if any of the batches were empty

2019-08-06 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-6038: -- Labels: pull-request-available windows (was: windows) > [Python] pyarrow.Table.from_batches

[jira] [Created] (ARROW-6144) Implement random function in Gandiva

2019-08-06 Thread Prudhvi Porandla (JIRA)
Prudhvi Porandla created ARROW-6144: --- Summary: Implement random function in Gandiva Key: ARROW-6144 URL: https://issues.apache.org/jira/browse/ARROW-6144 Project: Apache Arrow Issue Type:

[jira] [Updated] (ARROW-1562) [C++] Numeric kernel implementations for add (+)

2019-08-06 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-1562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-1562: -- Labels: Analytics pull-request-available (was: Analytics) > [C++] Numeric kernel

[jira] [Commented] (ARROW-5953) Thrift download ERRORS with apache-arrow-0.14.0

2019-08-06 Thread Marcin Juszkiewicz (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16900831#comment-16900831 ] Marcin Juszkiewicz commented on ARROW-5953: --- On Debian 'buster' I have similar failure with

[jira] [Closed] (ARROW-6129) Row_groups duplicate Rows

2019-08-06 Thread albertoramon (JIRA)
[ https://issues.apache.org/jira/browse/ARROW-6129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] albertoramon closed ARROW-6129. --- Resolution: Not A Problem This is the expected behavior > Row_groups duplicate Rows >