[jira] [Comment Edited] (ARROW-8283) [Python][Dataset] Non-existent files are silently dropped in pa.dataset.FileSystemDataset

2020-06-11 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133064#comment-17133064 ] Joris Van den Bossche edited comment on ARROW-8283 at 6/11/20, 8:49 AM:

[jira] [Commented] (ARROW-8283) [Python][Dataset] Non-existent files are silently dropped in pa.dataset.FileSystemDataset

2020-06-11 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133064#comment-17133064 ] Joris Van den Bossche commented on ARROW-8283: -- Ah, indeed. We maybe could still accept a

[jira] [Assigned] (ARROW-8283) [Python][Dataset] Non-existent files are silently dropped in pa.dataset.FileSystemDataset

2020-06-11 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche reassigned ARROW-8283: Assignee: Joris Van den Bossche > [Python][Dataset] Non-existent files

[jira] [Updated] (ARROW-9102) [Packaging] Upload built manylinux docker images

2020-06-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-9102: -- Labels: pull-request-available (was: ) > [Packaging] Upload built manylinux docker images >

[jira] [Resolved] (ARROW-9098) RecordBatch::ToStructArray cannot handle record batches with 0 column

2020-06-11 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou resolved ARROW-9098. --- Fix Version/s: 1.0.0 Resolution: Fixed Issue resolved by pull request 7401

[jira] [Created] (ARROW-9100) Add ascii_lower kernel

2020-06-11 Thread Maarten Breddels (Jira)
Maarten Breddels created ARROW-9100: --- Summary: Add ascii_lower kernel Key: ARROW-9100 URL: https://issues.apache.org/jira/browse/ARROW-9100 Project: Apache Arrow Issue Type: Task

[jira] [Created] (ARROW-9101) [Doc][C++][Python] Document encoding expected by CSV and JSON readers

2020-06-11 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-9101: - Summary: [Doc][C++][Python] Document encoding expected by CSV and JSON readers Key: ARROW-9101 URL: https://issues.apache.org/jira/browse/ARROW-9101 Project:

[jira] [Updated] (ARROW-9076) [Rust] Async CSV reader

2020-06-11 Thread Sergey Todyshev (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Todyshev updated ARROW-9076: --- Description: It would be nice to have it in arrow crate as well. It is extremely useful in

[jira] [Updated] (ARROW-7676) [Packaging][Python] Ensure that the static libraries are not built in the wheel scripts

2020-06-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-7676: -- Labels: pull-request-available (was: ) > [Packaging][Python] Ensure that the static libraries

[jira] [Assigned] (ARROW-9098) RecordBatch::ToStructArray cannot handle record batches with 0 column

2020-06-11 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou reassigned ARROW-9098: - Assignee: Zhuo Peng > RecordBatch::ToStructArray cannot handle record batches with 0

[jira] [Updated] (ARROW-9100) [C++] Add ascii_lower kernel

2020-06-11 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-9100: - Summary: [C++] Add ascii_lower kernel (was: Add ascii_lower kernel) > [C++] Add

[jira] [Updated] (ARROW-9100) Add ascii_lower kernel

2020-06-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-9100: -- Labels: pull-request-available (was: ) > Add ascii_lower kernel > -- > >

[jira] [Created] (ARROW-9103) [Python] Clarify behaviour of CSV reader for non-UTF8 text data

2020-06-11 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-9103: Summary: [Python] Clarify behaviour of CSV reader for non-UTF8 text data Key: ARROW-9103 URL: https://issues.apache.org/jira/browse/ARROW-9103

[jira] [Created] (ARROW-9102) [Packaging] Upload built manylinux docker images

2020-06-11 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-9102: -- Summary: [Packaging] Upload built manylinux docker images Key: ARROW-9102 URL: https://issues.apache.org/jira/browse/ARROW-9102 Project: Apache Arrow

[jira] [Closed] (ARROW-9103) [Python] Clarify behaviour of CSV reader for non-UTF8 text data

2020-06-11 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche closed ARROW-9103. Resolution: Duplicate Antoine was faster .. > [Python] Clarify behaviour of CSV

[jira] [Resolved] (ARROW-8860) [C++] IPC/Feather decompression broken for nested arrays

2020-06-11 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou resolved ARROW-8860. --- Resolution: Fixed Issue resolved by pull request 7233

[jira] [Assigned] (ARROW-8860) [C++] IPC/Feather decompression broken for nested arrays

2020-06-11 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou reassigned ARROW-8860: - Assignee: Joris Van den Bossche > [C++] IPC/Feather decompression broken for nested

[jira] [Updated] (ARROW-5377) [C++] Make IpcPayload public and add GetPayloadSize

2020-06-11 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou updated ARROW-5377: -- Summary: [C++] Make IpcPayload public and add GetPayloadSize (was: [C++] Develop interface

[jira] [Resolved] (ARROW-5377) [C++] Develop interface for writing a RecordBatch IPC stream into pre-allocated space (e.g. memory map) that avoids unnecessary serialization

2020-06-11 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou resolved ARROW-5377. --- Fix Version/s: 1.0.0 Resolution: Fixed Issue resolved by pull request 7387

[jira] [Assigned] (ARROW-5377) [C++] Make IpcPayload public and add GetPayloadSize

2020-06-11 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou reassigned ARROW-5377: - Assignee: David Li > [C++] Make IpcPayload public and add GetPayloadSize >

[jira] [Commented] (ARROW-9063) [Python][C++] Order of files are not respected using the new pyarrow.dataset

2020-06-11 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133203#comment-17133203 ] Joris Van den Bossche commented on ARROW-9063: -- [~brillliantz] thanks for the report

[jira] [Updated] (ARROW-8588) [Python] `driver` param removed from `hdfs.connect()`

2020-06-11 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-8588: - Fix Version/s: 1.0.0 > [Python] `driver` param removed from `hdfs.connect()` >

[jira] [Updated] (ARROW-9096) [Python] Pandas roundtrip with object-dtype column labels with integer values: data type "integer" not understood

2020-06-11 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-9096: - Summary: [Python] Pandas roundtrip with object-dtype column labels with integer

[jira] [Commented] (ARROW-9096) data type "integer" not understood: pandas roundtrip

2020-06-11 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133155#comment-17133155 ] Joris Van den Bossche commented on ARROW-9096: -- Thanks for the report. A smaller reproducer:

[jira] [Commented] (ARROW-9065) [Python] Support parsing date32 in dataset partition folders

2020-06-11 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133207#comment-17133207 ] Joris Van den Bossche commented on ARROW-9065: -- [~dhirschfeld] thanks for the report cc

[jira] [Updated] (ARROW-9065) [Python] Support parsing date32 in dataset partition folders

2020-06-11 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-9065: - Labels: dataset (was: ) > [Python] Support parsing date32 in dataset partition

[jira] [Updated] (ARROW-8240) [Python] New FS interface (pyarrow.fs) does not seem to work correctly for HDFS (Python 3.6, pyarrow 0.16.0)

2020-06-11 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-8240: - Labels: HDFS filesystem hdfs (was: HDFS filesystem) > [Python] New FS interface

[jira] [Updated] (ARROW-9002) [C++] Unable to load libjvm on ppc64le architecture for hdfs.connect()

2020-06-11 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-9002: - Labels: filesystem hdfs (was: ) > [C++] Unable to load libjvm on ppc64le

[jira] [Created] (ARROW-9104) [C++] Parquet encryption tests should write files to a temporary directory instead of the testing submodule's directory

2020-06-11 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-9104: -- Summary: [C++] Parquet encryption tests should write files to a temporary directory instead of the testing submodule's directory Key: ARROW-9104 URL:

[jira] [Updated] (ARROW-9065) [Python] Support parsing date32 in dataset partition folders

2020-06-11 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-9065: - Description: I have some data which is partitioned by year/month/date. It would

[jira] [Resolved] (ARROW-7676) [Packaging][Python] Ensure that the static libraries are not built in the wheel scripts

2020-06-11 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-7676. - Resolution: Fixed Issue resolved by pull request 7405

[jira] [Created] (ARROW-9106) [C++] Add C++ foundation to ease file transcoding

2020-06-11 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-9106: - Summary: [C++] Add C++ foundation to ease file transcoding Key: ARROW-9106 URL: https://issues.apache.org/jira/browse/ARROW-9106 Project: Apache Arrow

[jira] [Comment Edited] (ARROW-9105) [C++] ParquetFileFragment scanning doesn't handle filter on partition field

2020-06-11 Thread Ben Kietzman (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133329#comment-17133329 ] Ben Kietzman edited comment on ARROW-9105 at 6/11/20, 4:02 PM: --- - Infer a

[jira] [Updated] (ARROW-9105) [C++] ParquetFileFragment::SplitByRowGroup doesn't handle filter on partition field

2020-06-11 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-9105: - Description: When splitting a fragment into row group fragments, filtering on

[jira] [Updated] (ARROW-9056) [C++] Aggregation methods for Scalars?

2020-06-11 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-9056: --- Fix Version/s: (was: 1.0.0) > [C++] Aggregation methods for Scalars? >

[jira] [Updated] (ARROW-9107) [C++][Dataset] Time-based types support

2020-06-11 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-9107: -- Priority: Blocker (was: Major) > [C++][Dataset] Time-based types support >

[jira] [Created] (ARROW-9108) [C++][Dataset] Add Parquet Statistics conversion for timestamp columns

2020-06-11 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-9108: - Summary: [C++][Dataset] Add Parquet Statistics conversion for timestamp columns Key: ARROW-9108 URL: https://issues.apache.org/jira/browse/ARROW-9108

[jira] [Updated] (ARROW-9107) [C++][Dataset] Time-based types support

2020-06-11 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-9107: -- Fix Version/s: 1.0.0 > [C++][Dataset] Time-based types support >

[jira] [Created] (ARROW-9105) [C++] ParquetFileFragment::SplitByRowGroup doesn't handle filter on partition field

2020-06-11 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-9105: Summary: [C++] ParquetFileFragment::SplitByRowGroup doesn't handle filter on partition field Key: ARROW-9105 URL: https://issues.apache.org/jira/browse/ARROW-9105

[jira] [Updated] (ARROW-9093) [FlightRPC][C++][Python] Allow setting gRPC client options

2020-06-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-9093: -- Labels: pull-request-available (was: ) > [FlightRPC][C++][Python] Allow setting gRPC client

[jira] [Created] (ARROW-9107) [C++][Dataset] Time-based types support

2020-06-11 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-9107: - Summary: [C++][Dataset] Time-based types support Key: ARROW-9107 URL: https://issues.apache.org/jira/browse/ARROW-9107 Project: Apache Arrow

[jira] [Updated] (ARROW-9065) [Python] Support parsing date32 in dataset partition folders

2020-06-11 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-9065: -- Parent: ARROW-9107 Issue Type: Sub-task (was: Improvement) > [Python]

[jira] [Commented] (ARROW-8283) [Python][Dataset] Non-existent files are silently dropped in pa.dataset.FileSystemDataset

2020-06-11 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133307#comment-17133307 ] Francois Saint-Jacques commented on ARROW-8283: --- Correct, we should not touch

[jira] [Commented] (ARROW-9105) [C++] ParquetFileFragment scanning doesn't handle filter on partition field

2020-06-11 Thread Ben Kietzman (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133329#comment-17133329 ] Ben Kietzman commented on ARROW-9105: - - Infer a schema for fields referenced in a fragment's

[jira] [Updated] (ARROW-9101) [Doc][C++][Python] Document encoding expected by CSV and JSON readers

2020-06-11 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-9101: -- Labels: pull-request-available (was: ) > [Doc][C++][Python] Document encoding expected by CSV

[jira] [Assigned] (ARROW-9065) [Python] Support parsing date32 in dataset partition folders

2020-06-11 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-9065: - Assignee: Francois Saint-Jacques > [Python] Support parsing date32 in

[jira] [Assigned] (ARROW-8283) [Python][Dataset] Non-existent files are silently dropped in pa.dataset.FileSystemDataset

2020-06-11 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-8283: - Assignee: Joris Van den Bossche (was: Francois Saint-Jacques) >

[jira] [Assigned] (ARROW-8283) [Python][Dataset] Non-existent files are silently dropped in pa.dataset.FileSystemDataset

2020-06-11 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-8283: - Assignee: Francois Saint-Jacques (was: Joris Van den Bossche) >

[jira] [Commented] (ARROW-9105) [C++] ParquetFileFragment scanning doesn't handle filter on partition field

2020-06-11 Thread Ben Kietzman (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133320#comment-17133320 ] Ben Kietzman commented on ARROW-9105: - The fragment's physical schema is used to insert implicit

[jira] [Resolved] (ARROW-6602) [Doc] Add feature / implementation matrix

2020-06-11 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-6602. - Resolution: Fixed Issue resolved by pull request 7350

[jira] [Updated] (ARROW-2801) [Python][C++][Dataset] Implement split_row_groups for ParquetDataset

2020-06-11 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-2801: --- Fix Version/s: 1.0.0 > [Python][C++][Dataset] Implement split_row_groups for ParquetDataset

[jira] [Updated] (ARROW-2801) [Python][C++][Dataset] Implement split_row_groups for ParquetDataset

2020-06-11 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-2801: --- Summary: [Python][C++][Dataset] Implement split_row_groups for ParquetDataset (was:

[jira] [Assigned] (ARROW-2801) [Python][C++][Dataset] Implement split_row_groups for ParquetDataset

2020-06-11 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson reassigned ARROW-2801: -- Assignee: Joris Van den Bossche > [Python][C++][Dataset] Implement split_row_groups

[jira] [Commented] (ARROW-9065) [Python] Support parsing date32 in dataset partition folders

2020-06-11 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133353#comment-17133353 ] Francois Saint-Jacques commented on ARROW-9065: --- There's a general void of time based types

[jira] [Commented] (ARROW-9105) [C++] ParquetFileFragment::SplitByRowGroup doesn't handle filter on partition field

2020-06-11 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133291#comment-17133291 ] Joris Van den Bossche commented on ARROW-9105: -- And it's not only {{SplitByRowGroup}}, but

[jira] [Updated] (ARROW-9105) [C++] ParquetFileFragment scanning doesn't handle filter on partition field

2020-06-11 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-9105: - Summary: [C++] ParquetFileFragment scanning doesn't handle filter on partition

[jira] [Commented] (ARROW-9063) [Python][C++] Order of files are not respected using the new pyarrow.dataset

2020-06-11 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133317#comment-17133317 ] Neal Richardson commented on ARROW-9063: If you want to confirm that it is fixed, you can try

[jira] [Closed] (ARROW-1796) [Python] RowGroup filtering on file level

2020-06-11 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson closed ARROW-1796. -- Fix Version/s: 1.0.0 Assignee: Joris Van den Bossche (was: Uwe Korn)

[jira] [Resolved] (ARROW-9102) [Packaging] Upload built manylinux docker images

2020-06-11 Thread Krisztian Szucs (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Krisztian Szucs resolved ARROW-9102. Resolution: Fixed Issue resolved by pull request 7404

[jira] [Resolved] (ARROW-5760) [C++] Optimize Take implementation

2020-06-11 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou resolved ARROW-5760. --- Resolution: Fixed Issue resolved by pull request 7382

[jira] [Updated] (ARROW-4429) [Doc] Add git rebase tips to the 'Contributing' page in the developer docs

2020-06-11 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-4429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-4429: --- Fix Version/s: 1.0.0 > [Doc] Add git rebase tips to the 'Contributing' page in the developer

[jira] [Updated] (ARROW-7607) [C++] Add to cpp/examples minimal examples of using Arrow as a dependency of another CMake project

2020-06-11 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou updated ARROW-7607: -- Fix Version/s: 1.0.0 > [C++] Add to cpp/examples minimal examples of using Arrow as a

[jira] [Created] (ARROW-9110) [C++] Fix CPU cache size detection on macOS

2020-06-11 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-9110: -- Summary: [C++] Fix CPU cache size detection on macOS Key: ARROW-9110 URL: https://issues.apache.org/jira/browse/ARROW-9110 Project: Apache Arrow Issue

[jira] [Assigned] (ARROW-7798) [R] Refactor R <-> Array conversion

2020-06-11 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-7798: - Assignee: (was: Francois Saint-Jacques) > [R] Refactor R <-> Array

[jira] [Assigned] (ARROW-9001) [R] Box outputs as correct type in call_function

2020-06-11 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson reassigned ARROW-9001: -- Fix Version/s: (was: 1.0.0) Assignee: Romain Francois > [R] Box outputs

[jira] [Updated] (ARROW-8718) [R] Add str() methods to objects

2020-06-11 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-8718: --- Description: Apparently this will make the RStudio IDE show useful things in the environment

[jira] [Assigned] (ARROW-8718) [R] Add str() methods to objects

2020-06-11 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson reassigned ARROW-8718: -- Assignee: Romain Francois > [R] Add str() methods to objects >

[jira] [Commented] (ARROW-8942) [R] support read gzip csv files

2020-06-11 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133663#comment-17133663 ] Neal Richardson commented on ARROW-8942: I found the relevant python code, will do something like

[jira] [Commented] (ARROW-2801) [Python][C++][Dataset] Implement split_row_groups for ParquetDataset

2020-06-11 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133662#comment-17133662 ] Neal Richardson commented on ARROW-2801: [~jorisvandenbossche] can you close this if/when you're

[jira] [Assigned] (ARROW-8942) [R] support read gzip csv files

2020-06-11 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson reassigned ARROW-8942: -- Assignee: Neal Richardson > [R] support read gzip csv files >

[jira] [Assigned] (ARROW-9054) [C++] Add ScalarAggregateOptions

2020-06-11 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson reassigned ARROW-9054: -- Assignee: Krisztian Szucs > [C++] Add ScalarAggregateOptions >

[jira] [Assigned] (ARROW-9055) [C++] Add sum/mean kernels for Boolean type

2020-06-11 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson reassigned ARROW-9055: -- Assignee: Krisztian Szucs > [C++] Add sum/mean kernels for Boolean type >

[jira] [Assigned] (ARROW-4429) [Doc] Add git rebase tips to the 'Contributing' page in the developer docs

2020-06-11 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-4429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson reassigned ARROW-4429: -- Assignee: Neal Richardson > [Doc] Add git rebase tips to the 'Contributing' page in

[jira] [Commented] (ARROW-8587) [C++] Compilation error when linking arrow-flight-perf-server

2020-06-11 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133603#comment-17133603 ] Antoine Pitrou commented on ARROW-8587: --- [~cxma] Do you still encounter this issue on git master?

[jira] [Assigned] (ARROW-9108) [C++][Dataset] Add Parquet Statistics conversion for timestamp columns

2020-06-11 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-9108: - Assignee: Francois Saint-Jacques > [C++][Dataset] Add Parquet

[jira] [Assigned] (ARROW-9107) [C++][Dataset] Time-based types support

2020-06-11 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-9107: - Assignee: Francois Saint-Jacques > [C++][Dataset] Time-based types

[jira] [Updated] (ARROW-9108) [C++][Dataset] Add Parquet Statistics conversion for timestamp columns

2020-06-11 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-9108: -- Component/s: C++ > [C++][Dataset] Add Parquet Statistics conversion for

[jira] [Updated] (ARROW-9108) [C++][Dataset] Add Parquet Statistics conversion for timestamp columns

2020-06-11 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-9108: -- Fix Version/s: 1.0.0 > [C++][Dataset] Add Parquet Statistics conversion for

[jira] [Updated] (ARROW-9108) [C++][Dataset] Add Parquet Statistics conversion for timestamp columns

2020-06-11 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-9108: -- Priority: Blocker (was: Major) > [C++][Dataset] Add Parquet Statistics

[jira] [Updated] (ARROW-9108) [C++][Dataset] Add Parquet Statistics conversion for timestamp columns

2020-06-11 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-9108: -- Labels: dataset (was: ) > [C++][Dataset] Add Parquet Statistics conversion

[jira] [Updated] (ARROW-8826) [Crossbow] remote URL should always have .git

2020-06-11 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-8826: --- Fix Version/s: 1.0.0 Issue Type: Bug (was: Improvement) > [Crossbow] remote URL

[jira] [Updated] (ARROW-6981) [R] Implement HDFS file-system interface in R

2020-06-11 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-6981: --- Fix Version/s: (was: 1.0.0) > [R] Implement HDFS file-system interface in R >

[jira] [Resolved] (ARROW-8487) [FlightRPC][C++] Make it possible to target a specific payload size

2020-06-11 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou resolved ARROW-8487. --- Fix Version/s: 1.0.0 Resolution: Fixed Issue resolved by pull request 7398

[jira] [Updated] (ARROW-7012) [C++] Clarify ChunkedArray chunking strategy and policy

2020-06-11 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-7012: --- Priority: Minor (was: Major) > [C++] Clarify ChunkedArray chunking strategy and policy >

[jira] [Commented] (ARROW-6437) [R] Add AWS SDK to system dependencies for macOS (homebrew, autobrew)

2020-06-11 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17133580#comment-17133580 ] Neal Richardson commented on ARROW-6437: This is a bigger challenge and isn't going to go in the

[jira] [Updated] (ARROW-6437) [R] Add AWS SDK to system dependencies for macOS (homebrew, autobrew)

2020-06-11 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson updated ARROW-6437: --- Fix Version/s: (was: 1.0.0) > [R] Add AWS SDK to system dependencies for macOS

[jira] [Assigned] (ARROW-6437) [R] Add AWS SDK to system dependencies for macOS (homebrew, autobrew)

2020-06-11 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson reassigned ARROW-6437: -- Assignee: (was: Neal Richardson) > [R] Add AWS SDK to system dependencies for

[jira] [Resolved] (ARROW-4427) [Doc] Move Confluence Wiki pages to the Sphinx docs

2020-06-11 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-4427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson resolved ARROW-4427. Resolution: Fixed > [Doc] Move Confluence Wiki pages to the Sphinx docs >

[jira] [Commented] (ARROW-4427) [Doc] Move Confluence Wiki pages to the Sphinx docs

2020-06-11 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-4427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17133578#comment-17133578 ] Neal Richardson commented on ARROW-4427: I'm going to mark this as resolved and we can make

[jira] [Assigned] (ARROW-9094) [Python] Bump versions of compiled dependencies in manylinux wheels

2020-06-11 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou reassigned ARROW-9094: - Assignee: Antoine Pitrou > [Python] Bump versions of compiled dependencies in manylinux

[jira] [Created] (ARROW-9109) [Python][Packaging] Enable S3 support in manylinux wheels

2020-06-11 Thread Antoine Pitrou (Jira)
Antoine Pitrou created ARROW-9109: - Summary: [Python][Packaging] Enable S3 support in manylinux wheels Key: ARROW-9109 URL: https://issues.apache.org/jira/browse/ARROW-9109 Project: Apache Arrow

[jira] [Resolved] (ARROW-9093) [FlightRPC][C++][Python] Allow setting gRPC client options

2020-06-11 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou resolved ARROW-9093. --- Fix Version/s: 1.0.0 Resolution: Fixed Issue resolved by pull request 7406

[jira] [Updated] (ARROW-9065) [C++] Support parsing date32 in dataset partition folders

2020-06-11 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-9065: -- Summary: [C++] Support parsing date32 in dataset partition folders (was:

[jira] [Assigned] (ARROW-5761) [R] Improve autosplice cpp code

2020-06-11 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques reassigned ARROW-5761: - Assignee: (was: Francois Saint-Jacques) > [R] Improve autosplice

[jira] [Assigned] (ARROW-7288) [R] read_parquet() freezes on Windows with Japanese locale

2020-06-11 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson reassigned ARROW-7288: -- Assignee: Romain Francois > [R] read_parquet() freezes on Windows with Japanese

[jira] [Assigned] (ARROW-7018) [R] Special characters as question mark in parquet files

2020-06-11 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson reassigned ARROW-7018: -- Assignee: Romain Francois > [R] Special characters as question mark in parquet files

[jira] [Assigned] (ARROW-8899) [R] Add R metadata like pandas metadata for round-trip fidelity

2020-06-11 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson reassigned ARROW-8899: -- Assignee: Romain Francois > [R] Add R metadata like pandas metadata for round-trip

[jira] [Assigned] (ARROW-9031) [R] Implement conversion from Type::UINT64 to R vector

2020-06-11 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neal Richardson reassigned ARROW-9031: -- Assignee: Romain Francois > [R] Implement conversion from Type::UINT64 to R vector >

[jira] [Assigned] (ARROW-7607) [C++] Add to cpp/examples minimal examples of using Arrow as a dependency of another CMake project

2020-06-11 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou reassigned ARROW-7607: - Assignee: Antoine Pitrou > [C++] Add to cpp/examples minimal examples of using Arrow as

[jira] [Updated] (ARROW-9106) [C++] Add C++ foundation to ease file transcoding

2020-06-11 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-9106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou updated ARROW-9106: -- Fix Version/s: 1.0.0 > [C++] Add C++ foundation to ease file transcoding >
