[jira] [Commented] (ARROW-6256) [Rust] parquet-format should be released by Apache process

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056555#comment-17056555 ] Wes McKinney commented on ARROW-6256: - There are several steps that have to be undertaken to make

[jira] [Updated] (ARROW-6256) [Rust] parquet-format should be released by Apache process

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-6256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-6256: Fix Version/s: (was: 0.17.0) 1.0.0 > [Rust] parquet-format should be

[jira] [Assigned] (ARROW-8042) [Python] pyarrow.ChunkedArray docstring is incorrect regarding zero-length ChunkedArray having no chunks

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney reassigned ARROW-8042: --- Assignee: Wes McKinney > [Python] pyarrow.ChunkedArray docstring is incorrect regarding

[jira] [Assigned] (ARROW-7907) [Python] Conversion to pandas of empty table with timestamp type aborts

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney reassigned ARROW-7907: --- Assignee: Wes McKinney > [Python] Conversion to pandas of empty table with timestamp type

[jira] [Commented] (ARROW-7907) [Python] Conversion to pandas of empty table with timestamp type aborts

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056573#comment-17056573 ] Wes McKinney commented on ARROW-7907: - This looks like it was fixed in

[jira] [Created] (ARROW-8070) [Python] Casting Segfault

2020-03-10 Thread Daniel Nugent (Jira)
Daniel Nugent created ARROW-8070: Summary: [Python] Casting Segfault Key: ARROW-8070 URL: https://issues.apache.org/jira/browse/ARROW-8070 Project: Apache Arrow Issue Type: Bug

[jira] [Updated] (ARROW-7989) [Developer][C++] IWYU fails on include-cycle in uriparser/Uri.h

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-7989: Fix Version/s: (was: 0.17.0) > [Developer][C++] IWYU fails on include-cycle in uriparser/Uri.h

[jira] [Comment Edited] (ARROW-8053) [JS] Improve performance of filtering

2020-03-10 Thread Leo Meyerovich (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056567#comment-17056567 ] Leo Meyerovich edited comment on ARROW-8053 at 3/11/20, 1:26 AM: - Sorry,

[jira] [Updated] (ARROW-7976) [C++] Add field to IpcOptions to include padding in Buffer metadata accounting

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-7976: Fix Version/s: (was: 0.17.0) > [C++] Add field to IpcOptions to include padding in Buffer

[jira] [Updated] (ARROW-8022) [C++] Provide or Vendor a small_vector implementation

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-8022: Fix Version/s: (was: 0.17.0) > [C++] Provide or Vendor a small_vector implementation >

[jira] [Updated] (ARROW-8026) [Python] Support memoryview in addition to string value types for constructing string and binary type arrays

2020-03-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-8026: -- Labels: pull-request-available (was: ) > [Python] Support memoryview in addition to string

[jira] [Comment Edited] (ARROW-8053) [JS] Improve performance of filtering

2020-03-10 Thread Leo Meyerovich (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056567#comment-17056567 ] Leo Meyerovich edited comment on ARROW-8053 at 3/11/20, 1:25 AM: - Sorry,

[jira] [Updated] (ARROW-7907) [Python] Conversion to pandas of empty table with timestamp type aborts

2020-03-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-7907: -- Labels: pull-request-available (was: ) > [Python] Conversion to pandas of empty table with

[jira] [Commented] (ARROW-8053) [JS] Improve performance of filtering

2020-03-10 Thread Leo Meyerovich (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056567#comment-17056567 ] Leo Meyerovich commented on ARROW-8053: --- Sorry, we never got support for continuing our arrow js

[jira] [Updated] (ARROW-8071) [GLib] Build error with configure

2020-03-10 Thread Kouhei Sutou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kouhei Sutou updated ARROW-8071: Description: This is introduced by ARROW-7444. (was: This is introduced by ARROW-8055.) > [GLib]

[jira] [Updated] (ARROW-8071) [GLib] Build error with configure

2020-03-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-8071: -- Labels: pull-request-available (was: ) > [GLib] Build error with configure >

[jira] [Created] (ARROW-8071) [GLib] Build error with configure

2020-03-10 Thread Kouhei Sutou (Jira)
Kouhei Sutou created ARROW-8071: --- Summary: [GLib] Build error with configure Key: ARROW-8071 URL: https://issues.apache.org/jira/browse/ARROW-8071 Project: Apache Arrow Issue Type: Bug

[jira] [Updated] (ARROW-7904) [C++] Decide about Field/Schema metadata printing parameters and how much to show by default

2020-03-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-7904: -- Labels: pull-request-available (was: ) > [C++] Decide about Field/Schema metadata printing

[jira] [Resolved] (ARROW-8071) [GLib] Build error with configure

2020-03-10 Thread Kouhei Sutou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kouhei Sutou resolved ARROW-8071. - Fix Version/s: 0.17.0 Resolution: Fixed Issue resolved by pull request 6575

[jira] [Updated] (ARROW-8042) [Python] pyarrow.ChunkedArray docstring is incorrect regarding zero-length ChunkedArray having no chunks

2020-03-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-8042: -- Labels: pull-request-available (was: ) > [Python] pyarrow.ChunkedArray docstring is incorrect

[jira] [Commented] (ARROW-8028) [Go] Allow duplicate field names in schemas and nested types

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056563#comment-17056563 ] Wes McKinney commented on ARROW-8028: - cc [~sbinet] > [Go] Allow duplicate field names in schemas

[jira] [Updated] (ARROW-7910) [C++] Provide function to query page size portably

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-7910: Fix Version/s: (was: 0.17.0) 1.0.0 > [C++] Provide function to query page

[jira] [Assigned] (ARROW-8026) [Python] Support memoryview in addition to string value types for constructing string and binary type arrays

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney reassigned ARROW-8026: --- Assignee: Wes McKinney > [Python] Support memoryview in addition to string value types for

[jira] [Commented] (ARROW-8015) [Python] Build 0.16.0 wheel install for Windows + Python 3.5 and publish to PyPI

2020-03-10 Thread Lucas Pickup (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056574#comment-17056574 ] Lucas Pickup commented on ARROW-8015: - I ran a bunch of our 'scenario' notebooks against this wheel.

[jira] [Comment Edited] (ARROW-8053) [JS] Improve performance of filtering

2020-03-10 Thread Leo Meyerovich (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056567#comment-17056567 ] Leo Meyerovich edited comment on ARROW-8053 at 3/11/20, 1:23 AM: - Sorry,

[jira] [Updated] (ARROW-8055) [GLib][Ruby] Add some metadata bindings to GArrowSchema

2020-03-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-8055: -- Labels: pull-request-available (was: ) > [GLib][Ruby] Add some metadata bindings to

[jira] [Created] (ARROW-8055) [GLib][Ruby] Add some metadata bindings to GArrowSchema

2020-03-10 Thread Kouhei Sutou (Jira)
Kouhei Sutou created ARROW-8055: --- Summary: [GLib][Ruby] Add some metadata bindings to GArrowSchema Key: ARROW-8055 URL: https://issues.apache.org/jira/browse/ARROW-8055 Project: Apache Arrow

[jira] [Created] (ARROW-8054) [JS] Improve performance of filtering

2020-03-10 Thread Will Strimling (Jira)
Will Strimling created ARROW-8054: - Summary: [JS] Improve performance of filtering Key: ARROW-8054 URL: https://issues.apache.org/jira/browse/ARROW-8054 Project: Apache Arrow Issue Type: Bug

[jira] [Commented] (ARROW-7830) [C++] Parquet library version doesn't change with releases

2020-03-10 Thread H. Vetinari (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055702#comment-17055702 ] H. Vetinari commented on ARROW-7830: OK, I didn't want to come off as demanding (anything, really),

[jira] [Created] (ARROW-8053) [JS] Improve performance of filtering

2020-03-10 Thread Will Strimling (Jira)
Will Strimling created ARROW-8053: - Summary: [JS] Improve performance of filtering Key: ARROW-8053 URL: https://issues.apache.org/jira/browse/ARROW-8053 Project: Apache Arrow Issue Type: Bug

[jira] [Commented] (ARROW-1231) [C++] Add filesystem / IO implementation for Google Cloud Storage

2020-03-10 Thread Frank Natividad (Jira)
[ https://issues.apache.org/jira/browse/ARROW-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055656#comment-17055656 ] Frank Natividad commented on ARROW-1231: The XML API does exist and compatibility with S3 SDK is

[jira] [Commented] (ARROW-8057) Schema equality not roundtrip safe

2020-03-10 Thread Florian Jetter (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055804#comment-17055804 ] Florian Jetter commented on ARROW-8057: --- Investigating the fields explicitly shows that the

[jira] [Updated] (ARROW-5265) [Python/CI] Add integration test with kartothek

2020-03-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-5265: -- Labels: parquet pull-request-available (was: parquet) > [Python/CI] Add integration test with

[jira] [Created] (ARROW-8058) [C++][Python][Dataset] Provide an option to skip validation in FileSystemDatasetFactoryOptions

2020-03-10 Thread Ben Kietzman (Jira)
Ben Kietzman created ARROW-8058: --- Summary: [C++][Python][Dataset] Provide an option to skip validation in FileSystemDatasetFactoryOptions Key: ARROW-8058 URL: https://issues.apache.org/jira/browse/ARROW-8058

[jira] [Assigned] (ARROW-8058) [C++][Python][Dataset] Provide an option to skip validation in FileSystemDatasetFactoryOptions

2020-03-10 Thread Ben Kietzman (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Kietzman reassigned ARROW-8058: --- Assignee: (was: Ben Kietzman) > [C++][Python][Dataset] Provide an option to skip

[jira] [Updated] (ARROW-7996) [Python] Error serializing empty pandas DataFrame with pyarrow

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-7996: - Labels: serialization (was: ) > [Python] Error serializing empty pandas

[jira] [Commented] (ARROW-8004) [Python] Define API for user-defined conversions of array cell values in pyarrow.array

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055776#comment-17055776 ] Joris Van den Bossche commented on ARROW-8004: -- For a more limited use case than general

[jira] [Updated] (ARROW-8010) [Python] Fixed size list not convertible to Numpy Array / pandas Series

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-8010: - Summary: [Python] Fixed size list not convertible to Numpy Array / pandas Series

[jira] [Updated] (ARROW-8056) [R] Support read and write orc file format

2020-03-10 Thread Dyfan Jones (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dyfan Jones updated ARROW-8056: --- Component/s: R > [R] Support read and write orc file format >

[jira] [Commented] (ARROW-7680) [C++][Dataset] Partition discovery is not working with windows path

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055828#comment-17055828 ] Joris Van den Bossche commented on ARROW-7680: -- Indeed, we are still getting the same error

[jira] [Created] (ARROW-8056) [R] Support read and write orc file format

2020-03-10 Thread Dyfan Jones (Jira)
Dyfan Jones created ARROW-8056: -- Summary: [R] Support read and write orc file format Key: ARROW-8056 URL: https://issues.apache.org/jira/browse/ARROW-8056 Project: Apache Arrow Issue Type: New

[jira] [Commented] (ARROW-8010) [Python] Fixed size list not convertible to Numpy Array / pandas Series

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055833#comment-17055833 ] Joris Van den Bossche commented on ARROW-8010: -- [~balancap] Thanks for the report! I think

[jira] [Updated] (ARROW-8058) [C++][Python][Dataset] Provide an option to toggle validation and schema inference in FileSystemDatasetFactoryOptions

2020-03-10 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-8058: -- Summary: [C++][Python][Dataset] Provide an option to toggle validation and

[jira] [Commented] (ARROW-7996) [Python] Error serializing empty pandas DataFrame with pyarrow

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055760#comment-17055760 ] Joris Van den Bossche commented on ARROW-7996: -- The error comes from deserializing the

[jira] [Updated] (ARROW-7680) [C++][Dataset] Partition discovery is not working with windows path

2020-03-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-7680: -- Labels: pull-request-available (was: ) > [C++][Dataset] Partition discovery is not working

[jira] [Commented] (ARROW-7680) [C++][Dataset] Partition discovery is not working with windows path

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055815#comment-17055815 ] Joris Van den Bossche commented on ARROW-7680: -- Since ARROW-7677 is not yet resolved, I

[jira] [Commented] (ARROW-8057) [C++] Schema equality not roundtrip safe through Parquet

2020-03-10 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055816#comment-17055816 ] Antoine Pitrou commented on ARROW-8057: --- cc [~wesm] > [C++] Schema equality not roundtrip safe

[jira] [Updated] (ARROW-8057) [C++] Schema equality not roundtrip safe through Parquet

2020-03-10 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou updated ARROW-8057: -- Component/s: C++ > [C++] Schema equality not roundtrip safe through Parquet >

[jira] [Updated] (ARROW-8057) [C++] Schema equality not roundtrip safe through Parquet

2020-03-10 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antoine Pitrou updated ARROW-8057: -- Summary: [C++] Schema equality not roundtrip safe through Parquet (was: Schema equality not

[jira] [Comment Edited] (ARROW-7677) [C++] Handle Windows file paths with backslashes in GetTargetStats

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055814#comment-17055814 ] Joris Van den Bossche edited comment on ARROW-7677 at 3/10/20, 10:56 AM:

[jira] [Commented] (ARROW-7677) [C++] Handle Windows file paths with backslashes in GetTargetStats

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055814#comment-17055814 ] Joris Van den Bossche commented on ARROW-7677: -- It came up in a partitioned parquet dataset

[jira] [Commented] (ARROW-1231) [C++] Add filesystem / IO implementation for Google Cloud Storage

2020-03-10 Thread Antoine Pitrou (Jira)
[ https://issues.apache.org/jira/browse/ARROW-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055813#comment-17055813 ] Antoine Pitrou commented on ARROW-1231: --- Thank you for the explanation. I agree we should use the

[jira] [Closed] (ARROW-8010) [Python] Fixed size list not convertible to Numpy Array / pandas Series

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche closed ARROW-8010. Resolution: Duplicate > [Python] Fixed size list not convertible to Numpy Array /

[jira] [Assigned] (ARROW-5265) [Python/CI] Add integration test with kartothek

2020-03-10 Thread Uwe Korn (Jira)
[ https://issues.apache.org/jira/browse/ARROW-5265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Korn reassigned ARROW-5265: --- Assignee: Uwe Korn > [Python/CI] Add integration test with kartothek >

[jira] [Updated] (ARROW-2728) [Python][C++][Dataset] Support partitioned Parquet datasets using glob-style file paths

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-2728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-2728: - Component/s: C++ - Dataset > [Python][C++][Dataset] Support partitioned Parquet

[jira] [Updated] (ARROW-3154) [Python][C++] Document how to write _metadata, _common_metadata files with Parquet datasets

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-3154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-3154: - Component/s: C++ - Dataset > [Python][C++] Document how to write _metadata,

[jira] [Commented] (ARROW-7997) [Python] Schema equals method with inconsistent docs in pyarrow

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055766#comment-17055766 ] Joris Van den Bossche commented on ARROW-7997: -- [~otaviocv] Thanks for the report! That is

[jira] [Updated] (ARROW-7997) [Python] Schema equals method with inconsistent docs in pyarrow

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-7997: - Component/s: Python > [Python] Schema equals method with inconsistent docs in

[jira] [Updated] (ARROW-7997) [Python] Schema equals method with inconsistent docs in pyarrow

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-7997: - Summary: [Python] Schema equals method with inconsistent docs in pyarrow (was:

[jira] [Commented] (ARROW-8052) [Python] requirements-test.txt cannot be used with conda install --file

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055830#comment-17055830 ] Joris Van den Bossche commented on ARROW-8052: -- I don't think this should be expected to

[jira] [Resolved] (ARROW-7956) [Python] Memory leak in pyarrow functions .ipc.serialize_pandas/deserialize_pandas

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney resolved ARROW-7956. - Resolution: Fixed Yes, resolved as part of the patch for ARROW-4120 > [Python] Memory leak in

[jira] [Commented] (ARROW-7830) [C++] Parquet library version doesn't change with releases

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055974#comment-17055974 ] Wes McKinney commented on ARROW-7830: - > So wouldn't it be a reasonable way of looking at things that

[jira] [Updated] (ARROW-7963) [C++][Python][Dataset] Expose listing fragments

2020-03-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-7963: -- Labels: pull-request-available (was: ) > [C++][Python][Dataset] Expose listing fragments >

[jira] [Commented] (ARROW-8056) [R] Support read and write orc file format

2020-03-10 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056030#comment-17056030 ] Neal Richardson commented on ARROW-8056: Codewise, it probably wouldn't be too bad to add R

[jira] [Closed] (ARROW-8052) [Python] requirements-test.txt cannot be used with conda install --file

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney closed ARROW-8052. --- Resolution: Not A Problem Alright, closing then > [Python] requirements-test.txt cannot be used

[jira] [Commented] (ARROW-8057) [C++] Schema equality not roundtrip safe through Parquet

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056005#comment-17056005 ] Wes McKinney commented on ARROW-8057: - I went with changing the default to {{False}}, it feels like

[jira] [Commented] (ARROW-8057) [C++] Schema equality not roundtrip safe through Parquet

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055979#comment-17055979 ] Wes McKinney commented on ARROW-8057: - Yes, this was added in

[jira] [Updated] (ARROW-8057) [C++] Schema equality not roundtrip safe through Parquet

2020-03-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-8057: -- Labels: pull-request-available (was: ) > [C++] Schema equality not roundtrip safe through

[jira] [Commented] (ARROW-7996) Error serializing empty pandas DataFrame with pyarrow

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055756#comment-17055756 ] Joris Van den Bossche commented on ARROW-7996: -- [~jdavidagudelo] Thanks for the report! A

[jira] [Updated] (ARROW-7996) [Python] Error serializing empty pandas DataFrame with pyarrow

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-7996: - Summary: [Python] Error serializing empty pandas DataFrame with pyarrow (was:

[jira] [Commented] (ARROW-7956) [Python] Memory leak in pyarrow functions .ipc.serialize_pandas/deserialize_pandas

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17055783#comment-17055783 ] Joris Van den Bossche commented on ARROW-7956: -- [~wesm] I think this was closed by

[jira] [Created] (ARROW-8057) Schema equality not roundtrip safe

2020-03-10 Thread Florian Jetter (Jira)
Florian Jetter created ARROW-8057: - Summary: Schema equality not roundtrip safe Key: ARROW-8057 URL: https://issues.apache.org/jira/browse/ARROW-8057 Project: Apache Arrow Issue Type: Bug

[GitHub] [arrow-dist] Rajpratik71 opened a new pull request #31: optimization debian package manager tweaks

2020-03-10 Thread GitBox
Rajpratik71 opened a new pull request #31: optimization debian package manager tweaks URL: https://github.com/apache/arrow-dist/pull/31 By default, Ubuntu or Debian-based "apt" or "apt-get" systems install recommended but not suggested packages. By passing

[jira] [Assigned] (ARROW-8057) [C++] Schema equality not roundtrip safe through Parquet

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney reassigned ARROW-8057: --- Assignee: Wes McKinney > [C++] Schema equality not roundtrip safe through Parquet >

[jira] [Updated] (ARROW-8060) [Python] Make dataset Expression objects serializable

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-8060: - Fix Version/s: 0.17.0 > [Python] Make dataset Expression objects serializable >

[jira] [Created] (ARROW-8059) [Python] Make FileSystem objects serializable

2020-03-10 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-8059: Summary: [Python] Make FileSystem objects serializable Key: ARROW-8059 URL: https://issues.apache.org/jira/browse/ARROW-8059 Project: Apache Arrow

[jira] [Updated] (ARROW-8059) [Python] Make FileSystem objects serializable

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Van den Bossche updated ARROW-8059: - Fix Version/s: 0.17.0 > [Python] Make FileSystem objects serializable >

[jira] [Created] (ARROW-8060) [Python] Make dataset Expression objects serializable

2020-03-10 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-8060: Summary: [Python] Make dataset Expression objects serializable Key: ARROW-8060 URL: https://issues.apache.org/jira/browse/ARROW-8060 Project: Apache

[jira] [Commented] (ARROW-7997) [Python] Schema equals method with inconsistent docs in pyarrow

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056260#comment-17056260 ] Wes McKinney commented on ARROW-7997: - Yes, we appear to be changing the default to False so the

[jira] [Updated] (ARROW-8064) [Dev] Implement Comment bot via Github actions

2020-03-10 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated ARROW-8064: -- Labels: pull-request-available (was: ) > [Dev] Implement Comment bot via Github actions >

[jira] [Commented] (ARROW-8056) [R] Support read and write orc file format

2020-03-10 Thread Neal Richardson (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056194#comment-17056194 ] Neal Richardson commented on ARROW-8056: Sounds like a reasonable objective. Since your packages

[jira] [Commented] (ARROW-7997) [Python] Schema equals method with inconsistent docs in pyarrow

2020-03-10 Thread Otávio Vasques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056207#comment-17056207 ] Otávio Vasques commented on ARROW-7997: --- Interested! I will work on that. > [Python] Schema equals

[jira] [Commented] (ARROW-8061) [C++][Dataset] Ability to specify granularity of ParquetFileFragment (support row groups)

2020-03-10 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056211#comment-17056211 ] Francois Saint-Jacques commented on ARROW-8061: --- Yes, this is possible, a ParquetFragment

[jira] [Created] (ARROW-8062) [C++][Dataset] Parquet Dataset factory from a _metadata/_common_metadata file

2020-03-10 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-8062: Summary: [C++][Dataset] Parquet Dataset factory from a _metadata/_common_metadata file Key: ARROW-8062 URL: https://issues.apache.org/jira/browse/ARROW-8062

[jira] [Commented] (ARROW-8059) [Python] Make FileSystem objects serializable

2020-03-10 Thread Ben Kietzman (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056228#comment-17056228 ] Ben Kietzman commented on ARROW-8059: - what will be the result of trying to serialize a local file

[jira] [Commented] (ARROW-8047) [Python][Documentation] Document migration from ParquetDataset to pyarrow.datasets

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056235#comment-17056235 ] Joris Van den Bossche commented on ARROW-8047: -- I also created ARROW-8063 for general user

[jira] [Commented] (ARROW-8061) [C++][Dataset] Ability to specify granularity of ParquetFileFragment (support row groups)

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056248#comment-17056248 ] Joris Van den Bossche commented on ARROW-8061: -- > Note that parallelism of RowGroup is

[jira] [Created] (ARROW-8061) [C++][Dataset] Ability to specify granularity of ParquetFileFragment (support row groups)

2020-03-10 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-8061: Summary: [C++][Dataset] Ability to specify granularity of ParquetFileFragment (support row groups) Key: ARROW-8061 URL:

[jira] [Commented] (ARROW-8061) [C++][Dataset] Ability to specify granularity of ParquetFileFragment (support row groups)

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056201#comment-17056201 ] Joris Van den Bossche commented on ARROW-8061: -- Example usecase for this: for Dask, which

[jira] [Commented] (ARROW-8060) [Python] Make dataset Expression objects serializable

2020-03-10 Thread Ben Kietzman (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056221#comment-17056221 ] Ben Kietzman commented on ARROW-8060: - this should probably wait for ARROW-7878 (and follow-ups)

[jira] [Created] (ARROW-8063) [Python] Add user guide documentation for Datasets API

2020-03-10 Thread Joris Van den Bossche (Jira)
Joris Van den Bossche created ARROW-8063: Summary: [Python] Add user guide documentation for Datasets API Key: ARROW-8063 URL: https://issues.apache.org/jira/browse/ARROW-8063 Project: Apache

[jira] [Commented] (ARROW-7997) [Python] Schema equals method with inconsistent docs in pyarrow

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056241#comment-17056241 ] Joris Van den Bossche commented on ARROW-7997: -- Actually, there is just today work going on

[jira] [Commented] (ARROW-8059) [Python] Make FileSystem objects serializable

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056251#comment-17056251 ] Joris Van den Bossche commented on ARROW-8059: -- Specifically for dask's usecase, it might

[jira] [Created] (ARROW-8064) [Dev] Implement Comment bot via Github actions

2020-03-10 Thread Krisztian Szucs (Jira)
Krisztian Szucs created ARROW-8064: -- Summary: [Dev] Implement Comment bot via Github actions Key: ARROW-8064 URL: https://issues.apache.org/jira/browse/ARROW-8064 Project: Apache Arrow

[jira] [Commented] (ARROW-8039) [C++][Python][Dataset] Assemble a minimal ParquetDataset shim

2020-03-10 Thread Joris Van den Bossche (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056284#comment-17056284 ] Joris Van den Bossche commented on ARROW-8039: -- > We might focus this by saying that the

[jira] [Created] (ARROW-8065) [C++][Dataset] Untangle Dataset, Fragment and ScanOptions

2020-03-10 Thread Francois Saint-Jacques (Jira)
Francois Saint-Jacques created ARROW-8065: - Summary: [C++][Dataset] Untangle Dataset, Fragment and ScanOptions Key: ARROW-8065 URL: https://issues.apache.org/jira/browse/ARROW-8065 Project:

[jira] [Updated] (ARROW-8065) [C++][Dataset] Untangle Dataset, Fragment and ScanOptions

2020-03-10 Thread Francois Saint-Jacques (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francois Saint-Jacques updated ARROW-8065: -- Component/s: C++ - Dataset > [C++][Dataset] Untangle Dataset, Fragment and

[jira] [Commented] (ARROW-8056) [R] Support read and write orc file format

2020-03-10 Thread Dyfan Jones (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056107#comment-17056107 ] Dyfan Jones commented on ARROW-8056: To my knowledge R doesn't have any maintained packages that

[jira] [Commented] (ARROW-8053) [JS] Improve performance of filtering

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17056516#comment-17056516 ] Wes McKinney commented on ARROW-8053: - Looking at

[jira] [Updated] (ARROW-8066) [Python] Specify behavior for converting tz-aware datetime.datetime objects to Arrow format

2020-03-10 Thread Wes McKinney (Jira)
[ https://issues.apache.org/jira/browse/ARROW-8066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wes McKinney updated ARROW-8066: Summary: [Python] Specify behavior for converting tz-aware datetime.datetime objects to Arrow
