[GitHub] [arrow] github-actions[bot] commented on pull request #7628: ARROW-9315: [Java] Fix the failure of testAllocationManagerType

2020-07-02 Thread GitBox
github-actions[bot] commented on pull request #7628: URL: https://github.com/apache/arrow/pull/7628#issuecomment-653349350 https://issues.apache.org/jira/browse/ARROW-9315 This is an automated message from the Apache Git

[GitHub] [arrow] jacques-n commented on pull request #6156: ARROW-7539: [Java] FieldVector getFieldBuffers API should not set reader/writer indices

2020-07-02 Thread GitBox
jacques-n commented on pull request #6156: URL: https://github.com/apache/arrow/pull/6156#issuecomment-653348862 This doesn't just break Dremio tests, it breaks Dremio functionally. A little history lesson: ValueVector.getBuffers() has existed for a much longer time than

[GitHub] [arrow] liyafan82 opened a new pull request #7628: ARROW-9315: [Java] Fix the failure of testAllocationManagerType

2020-07-02 Thread GitBox
liyafan82 opened a new pull request #7628: URL: https://github.com/apache/arrow/pull/7628 The problem was caused by a cyclic dependency of class loading. Please see the discussion in the jira: https://issues.apache.org/jira/browse/ARROW-9315 We solve it by making

[GitHub] [arrow] emkornfield commented on a change in pull request #7604: ARROW-9223: [Python] Propagate timezone information in pandas conversion

2020-07-02 Thread GitBox
emkornfield commented on a change in pull request #7604: URL: https://github.com/apache/arrow/pull/7604#discussion_r449373710 ## File path: cpp/src/arrow/python/datetime.cc ## @@ -262,6 +265,42 @@ int64_t PyDate_to_days(PyDateTime_Date* pydate) {

[GitHub] [arrow] emkornfield commented on a change in pull request #7604: ARROW-9223: [Python] Propagate timezone information in pandas conversion

2020-07-02 Thread GitBox
emkornfield commented on a change in pull request #7604: URL: https://github.com/apache/arrow/pull/7604#discussion_r449373516 ## File path: python/pyarrow/tests/test_pandas.py ## @@ -3321,9 +3321,12 @@ def test_cast_timestamp_unit(): assert result.equals(expected)

[GitHub] [arrow] emkornfield commented on a change in pull request #7604: ARROW-9223: [Python] Propagate timezone information in pandas conversion

2020-07-02 Thread GitBox
emkornfield commented on a change in pull request #7604: URL: https://github.com/apache/arrow/pull/7604#discussion_r449371074 ## File path: cpp/src/arrow/python/arrow_to_pandas.cc ## @@ -642,24 +641,27 @@ inline Status ConvertStruct(const PandasOptions& options, const

[GitHub] [arrow] emkornfield closed pull request #7502: ARROW-9308: [Format] Add Feature enum for forward compatibility.

2020-07-02 Thread GitBox
emkornfield closed pull request #7502: URL: https://github.com/apache/arrow/pull/7502

[GitHub] [arrow] emkornfield commented on pull request #7502: ARROW-9308: [Format] Add Feature enum for forward compatibility.

2020-07-02 Thread GitBox
emkornfield commented on pull request #7502: URL: https://github.com/apache/arrow/pull/7502#issuecomment-653322766 +1 approved on ML by vote.

[GitHub] [arrow] emkornfield commented on pull request #6156: ARROW-7539: [Java] FieldVector getFieldBuffers API should not set reader/writer indices

2020-07-02 Thread GitBox
emkornfield commented on pull request #6156: URL: https://github.com/apache/arrow/pull/6156#issuecomment-653322130 Given how long this PR has been open and approved, I think we should aim to check it in next Tuesday, unless we can come up with a concrete plan by then to help mitigate

[GitHub] [arrow] liyafan82 closed pull request #7543: ARROW-9221: [Java] account for big-endian buffers in ArrowBuf.setBytes

2020-07-02 Thread GitBox
liyafan82 closed pull request #7543: URL: https://github.com/apache/arrow/pull/7543

[GitHub] [arrow] github-actions[bot] commented on pull request #7627: ARROW-9307: [Ruby] Add Arrow::RecordBatchIterator#to_a

2020-07-02 Thread GitBox
github-actions[bot] commented on pull request #7627: URL: https://github.com/apache/arrow/pull/7627#issuecomment-653298250 https://issues.apache.org/jira/browse/ARROW-9307

[GitHub] [arrow] github-actions[bot] commented on pull request #7626: ARROW-9306: [Ruby] Add support for Arrow::RecordBatch.new(raw_table)

2020-07-02 Thread GitBox
github-actions[bot] commented on pull request #7626: URL: https://github.com/apache/arrow/pull/7626#issuecomment-653294335 https://issues.apache.org/jira/browse/ARROW-9306

[GitHub] [arrow] kou opened a new pull request #7627: ARROW-9307: [Ruby] Add Arrow::RecordBatchIterator#to_a

2020-07-02 Thread GitBox
kou opened a new pull request #7627: URL: https://github.com/apache/arrow/pull/7627

[GitHub] [arrow] kou opened a new pull request #7626: ARROW-9306: [Ruby] Add support for Arrow::RecordBatch.new(raw_table)

2020-07-02 Thread GitBox
kou opened a new pull request #7626: URL: https://github.com/apache/arrow/pull/7626

[GitHub] [arrow] BryanCutler commented on a change in pull request #7604: ARROW-9223: [Python] Propagate timezone information in pandas conversion

2020-07-02 Thread GitBox
BryanCutler commented on a change in pull request #7604: URL: https://github.com/apache/arrow/pull/7604#discussion_r449338456 ## File path: python/pyarrow/tests/test_pandas.py ## @@ -3321,9 +3321,12 @@ def test_cast_timestamp_unit(): assert result.equals(expected)

[GitHub] [arrow] kou closed pull request #7615: ARROW-9294: [GLib] Add GArrowFunction and related objects

2020-07-02 Thread GitBox
kou closed pull request #7615: URL: https://github.com/apache/arrow/pull/7615

[GitHub] [arrow] kou commented on pull request #7615: ARROW-9294: [GLib] Add GArrowFunction and related objects

2020-07-02 Thread GitBox
kou commented on pull request #7615: URL: https://github.com/apache/arrow/pull/7615#issuecomment-653269378 +1

[GitHub] [arrow] sunchao closed pull request #7622: ARROW-9280: [Rust] [Parquet] Calculate page and column statistics

2020-07-02 Thread GitBox
sunchao closed pull request #7622: URL: https://github.com/apache/arrow/pull/7622

[GitHub] [arrow] wesm closed pull request #7621: [Packaging][Python] Link brotli statically in the windows wheels

2020-07-02 Thread GitBox
wesm closed pull request #7621: URL: https://github.com/apache/arrow/pull/7621

[GitHub] [arrow] wesm commented on pull request #7621: [Packaging][Python] Link brotli statically in the windows wheels

2020-07-02 Thread GitBox
wesm commented on pull request #7621: URL: https://github.com/apache/arrow/pull/7621#issuecomment-653266761 Closing this. Brotli is not the problem.

[GitHub] [arrow] wesm commented on pull request #7539: ARROW-9156: [C++] Reducing the code size of the tensor module

2020-07-02 Thread GitBox
wesm commented on pull request #7539: URL: https://github.com/apache/arrow/pull/7539#issuecomment-653266493 What do you think about pursuing the performance optimization work as a follow up?

[GitHub] [arrow] wesm closed pull request #7598: ARROW-9278: [C++][Python] Remove validity bitmap from Union types, update IPC read/write and integration tests

2020-07-02 Thread GitBox
wesm closed pull request #7598: URL: https://github.com/apache/arrow/pull/7598

[GitHub] [arrow] wesm commented on pull request #7598: ARROW-9278: [C++][Python] Remove validity bitmap from Union types, update IPC read/write and integration tests

2020-07-02 Thread GitBox
wesm commented on pull request #7598: URL: https://github.com/apache/arrow/pull/7598#issuecomment-653265084 +1. I will open a follow up JIRA about the "AppendEmpty" issue.

[GitHub] [arrow] wesm commented on pull request #7598: ARROW-9278: [C++][Python] Remove validity bitmap from Union types, update IPC read/write and integration tests

2020-07-02 Thread GitBox
wesm commented on pull request #7598: URL: https://github.com/apache/arrow/pull/7598#issuecomment-653253875 The test failure appears transient.

[GitHub] [arrow] sunchao closed pull request #7610: ARROW-9290: [Rust] [Parquet] Add features to allow opting out of dependencies

2020-07-02 Thread GitBox
sunchao closed pull request #7610: URL: https://github.com/apache/arrow/pull/7610

[GitHub] [arrow] wesm closed pull request #7566: ARROW-9258: [FORMAT] Add V5 MetadataVersion to Schema.fbs

2020-07-02 Thread GitBox
wesm closed pull request #7566: URL: https://github.com/apache/arrow/pull/7566

[GitHub] [arrow] wesm commented on pull request #7566: ARROW-9258: [FORMAT] Add V5 MetadataVersion to Schema.fbs

2020-07-02 Thread GitBox
wesm commented on pull request #7566: URL: https://github.com/apache/arrow/pull/7566#issuecomment-653251445 +1, vote result https://lists.apache.org/thread.html/r690331f9d7ba7ff5f23f28300253d1da5cd14a56a205c7626e1c97fc%40%3Cdev.arrow.apache.org%3E

[GitHub] [arrow] wesm closed pull request #7567: ARROW-9259: [Format] Add language indicating that unsigned dictionary indices are supported but that signed integers are preferred

2020-07-02 Thread GitBox
wesm closed pull request #7567: URL: https://github.com/apache/arrow/pull/7567

[GitHub] [arrow] wesm commented on pull request #7567: ARROW-9259: [Format] Add language indicating that unsigned dictionary indices are supported but that signed integers are preferred

2020-07-02 Thread GitBox
wesm commented on pull request #7567: URL: https://github.com/apache/arrow/pull/7567#issuecomment-653250668 +1, vote result https://lists.apache.org/thread.html/rf239738d3ce23878ade6eb87d0e69f5ce685dd335c45a06d9c05c43a%40%3Cdev.arrow.apache.org%3E

[GitHub] [arrow] wesm commented on pull request #7535: ARROW-9222: [Format] Columnar.rst changes for removing validity bitmap from union types

2020-07-02 Thread GitBox
wesm commented on pull request #7535: URL: https://github.com/apache/arrow/pull/7535#issuecomment-653248226 +1, vote result https://lists.apache.org/thread.html/r47c6578fe9c87d821ccd26e6fc28a49caa3849cca2c5e01be3c225dc%40%3Cdev.arrow.apache.org%3E

[GitHub] [arrow] github-actions[bot] commented on pull request #7621: [Packaging][Python] Link brotli statically in the windows wheels

2020-07-02 Thread GitBox
github-actions[bot] commented on pull request #7621: URL: https://github.com/apache/arrow/pull/7621#issuecomment-653248351 Revision: 0722674b2743bcb9fba3e95cbe06e293d2b428c0 Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] wesm closed pull request #7535: ARROW-9222: [Format] Columnar.rst changes for removing validity bitmap from union types

2020-07-02 Thread GitBox
wesm closed pull request #7535: URL: https://github.com/apache/arrow/pull/7535

[GitHub] [arrow] wesm commented on pull request #7621: [Packaging][Python] Link brotli statically in the windows wheels

2020-07-02 Thread GitBox
wesm commented on pull request #7621: URL: https://github.com/apache/arrow/pull/7621#issuecomment-653247966

[GitHub] [arrow] pitrou commented on pull request #7596: ARROW-9163: [C++] Validate UTF8 contents of a StringArray

2020-07-02 Thread GitBox
pitrou commented on pull request #7596: URL: https://github.com/apache/arrow/pull/7596#issuecomment-653242456 Rebased.

[GitHub] [arrow] pitrou commented on pull request #7618: ARROW-7010: [C++] Implement decimal-to-float casts

2020-07-02 Thread GitBox
pitrou commented on pull request #7618: URL: https://github.com/apache/arrow/pull/7618#issuecomment-653237225 Rebased.

[GitHub] [arrow] pitrou removed a comment on pull request #7618: ARROW-7010: [C++] Implement decimal-to-float casts

2020-07-02 Thread GitBox
pitrou removed a comment on pull request #7618: URL: https://github.com/apache/arrow/pull/7618#issuecomment-653062331 Based on PR #7612, please review that one first.

[GitHub] [arrow] github-actions[bot] commented on pull request #7621: [Packaging][Python] Link brotli statically in the windows wheels

2020-07-02 Thread GitBox
github-actions[bot] commented on pull request #7621: URL: https://github.com/apache/arrow/pull/7621#issuecomment-653232832 Revision: 2f432d224959846d8d409fcf642ccec91b645015 Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] wesm closed pull request #7612: ARROW-7011: [C++] Implement casts from float/double to decimal

2020-07-02 Thread GitBox
wesm closed pull request #7612: URL: https://github.com/apache/arrow/pull/7612

[GitHub] [arrow] wesm commented on pull request #7612: ARROW-7011: [C++] Implement casts from float/double to decimal

2020-07-02 Thread GitBox
wesm commented on pull request #7612: URL: https://github.com/apache/arrow/pull/7612#issuecomment-653232883 +1

[GitHub] [arrow] wesm commented on pull request #7621: [Packaging][Python] Link brotli statically in the windows wheels

2020-07-02 Thread GitBox
wesm commented on pull request #7621: URL: https://github.com/apache/arrow/pull/7621#issuecomment-653232236 @github-actions crossbow submit wheel-win-cp37m

[GitHub] [arrow] wesm edited a comment on pull request #7621: [Packaging][Python] Link brotli statically in the windows wheels

2020-07-02 Thread GitBox
wesm edited a comment on pull request #7621: URL: https://github.com/apache/arrow/pull/7621#issuecomment-653230746 OK based on reading https://github.com/google/brotli/pull/655 unfortunately it appears that the conda-forge Brotli static libraries are broken on Windows. I'll open a ticket

[GitHub] [arrow] wesm commented on pull request #7621: [Packaging][Python] Link brotli statically in the windows wheels

2020-07-02 Thread GitBox
wesm commented on pull request #7621: URL: https://github.com/apache/arrow/pull/7621#issuecomment-653230746 OK based on reading https://github.com/google/brotli/pull/655 unfortunately it appears that the conda-forge Brotli static libraries are broken on Windows. I'll open a ticket for

[GitHub] [arrow] jorisvandenbossche commented on pull request #7604: ARROW-9223: [Python] Propagate timezone information in pandas conversion

2020-07-02 Thread GitBox
jorisvandenbossche commented on pull request #7604: URL: https://github.com/apache/arrow/pull/7604#issuecomment-653228729 > the reason for not always using the to_object path, is because I don't want to potentially change functionality of pandas conversion to datetime. Pandas

[GitHub] [arrow] wesm commented on pull request #7625: [CI] Add s390x Travis CI build back to allow_failures until it becomes less flaky

2020-07-02 Thread GitBox
wesm commented on pull request #7625: URL: https://github.com/apache/arrow/pull/7625#issuecomment-653227577 +1

[GitHub] [arrow] wesm merged pull request #7625: [CI] Add s390x Travis CI build back to allow_failures until it becomes less flaky

2020-07-02 Thread GitBox
wesm merged pull request #7625: URL: https://github.com/apache/arrow/pull/7625

[GitHub] [arrow] jorisvandenbossche commented on pull request #7604: ARROW-9223: [Python] Propagate timezone information in pandas conversion

2020-07-02 Thread GitBox
jorisvandenbossche commented on pull request #7604: URL: https://github.com/apache/arrow/pull/7604#issuecomment-653226285 > But if the timestamp is meant for display purposes, I would expect the tz-naive datetime value to reflect the local timezone. Since it displays a "calendar" like

[GitHub] [arrow] wesm commented on pull request #7612: ARROW-7011: [C++] Implement casts from float/double to decimal

2020-07-02 Thread GitBox
wesm commented on pull request #7612: URL: https://github.com/apache/arrow/pull/7612#issuecomment-653223898 Ugh there's that pesky flake again https://github.com/apache/arrow/pull/7612/checks?check_run_id=832227898 @kszucs did we determine whether it's feasible to get backtraces on

[GitHub] [arrow] wesm commented on pull request #7605: ARROW-9283: [Python] Expose build info

2020-07-02 Thread GitBox
wesm commented on pull request #7605: URL: https://github.com/apache/arrow/pull/7605#issuecomment-653223316 I think the scenarios where the C++ library would be upgraded but not the Python library are likely to be infrequent; that said, I think it would be useful to be clear

[GitHub] [arrow] wesm edited a comment on pull request #7605: ARROW-9283: [Python] Expose build info

2020-07-02 Thread GitBox
wesm edited a comment on pull request #7605: URL: https://github.com/apache/arrow/pull/7605#issuecomment-653223316 I think the scenarios where the C++ library would be upgraded but not the Python library are likely to be infrequent; that said, I think it would be useful to be

[GitHub] [arrow] emkornfield commented on pull request #7604: ARROW-9223: [Python] Propagate timezone information in pandas conversion

2020-07-02 Thread GitBox
emkornfield commented on pull request #7604: URL: https://github.com/apache/arrow/pull/7604#issuecomment-653223390 > And for tz-naive we don't need to handle any timezone issue, and can just convert it to a datetime.datetime object, which by default is also tz-naive and can thus be used

[GitHub] [arrow] kou commented on a change in pull request #7620: ARROW-9013: [C++] Validate CMake options

2020-07-02 Thread GitBox
kou commented on a change in pull request #7620: URL: https://github.com/apache/arrow/pull/7620#discussion_r449264750 ## File path: cpp/CMakeLists.txt ## @@ -18,6 +18,37 @@ cmake_minimum_required(VERSION 3.2) message(STATUS "Building using CMake version: ${CMAKE_VERSION}")

[GitHub] [arrow] pitrou commented on pull request #7605: ARROW-9283: [Python] Expose build info

2020-07-02 Thread GitBox
pitrou commented on pull request #7605: URL: https://github.com/apache/arrow/pull/7605#issuecomment-653221217 While it may be supported to do so (upgrade only Arrow C++ DLLs), I'm not sure it's something we want to encourage, and exposing different version numbers is confusing for users.

[GitHub] [arrow] github-actions[bot] commented on pull request #7625: [CI] Add s390x Travis CI build back to allow_failures until it becomes less flaky

2020-07-02 Thread GitBox
github-actions[bot] commented on pull request #7625: URL: https://github.com/apache/arrow/pull/7625#issuecomment-653219469 Thanks for opening a pull request! Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Then

[GitHub] [arrow] jorisvandenbossche commented on pull request #7604: ARROW-9223: [Python] Propagate timezone information in pandas conversion

2020-07-02 Thread GitBox
jorisvandenbossche commented on pull request #7604: URL: https://github.com/apache/arrow/pull/7604#issuecomment-653217519 Moving the discussion at https://github.com/apache/arrow/pull/7604#discussion_r449130523 outside the inline thread here (which makes it easier to find). It's

[GitHub] [arrow] wesm opened a new pull request #7625: [CI] Add s390x Travis CI build back to allow_failures until it becomes less flaky

2020-07-02 Thread GitBox
wesm opened a new pull request #7625: URL: https://github.com/apache/arrow/pull/7625

[GitHub] [arrow] kou commented on pull request #7605: ARROW-9283: [Python] Expose build info

2020-07-02 Thread GitBox
kou commented on pull request #7605: URL: https://github.com/apache/arrow/pull/7605#issuecomment-653216334 Yes. We may release a patch version (1.0.1 for 1.0.0) that should be compatible with base version (1.0.0). If a patch version includes fixes only in C++, source pyarrow

[GitHub] [arrow] pitrou commented on a change in pull request #7612: ARROW-7011: [C++] Implement casts from float/double to decimal

2020-07-02 Thread GitBox
pitrou commented on a change in pull request #7612: URL: https://github.com/apache/arrow/pull/7612#discussion_r449258150 ## File path: cpp/src/arrow/compute/kernels/scalar_cast_numeric.cc ## @@ -467,6 +467,51 @@ struct CastFunctor { } }; +//

[GitHub] [arrow] wesm commented on a change in pull request #7612: ARROW-7011: [C++] Implement casts from float/double to decimal

2020-07-02 Thread GitBox
wesm commented on a change in pull request #7612: URL: https://github.com/apache/arrow/pull/7612#discussion_r449248816 ## File path: cpp/src/arrow/compute/kernels/scalar_cast_numeric.cc ## @@ -467,6 +467,51 @@ struct CastFunctor { } }; +//

[GitHub] [arrow] tazimmerman closed issue #7624: Specifying columns in a dataset drops the index (pandas) metadata.

2020-07-02 Thread GitBox
tazimmerman closed issue #7624: URL: https://github.com/apache/arrow/issues/7624

[GitHub] [arrow] wesm closed pull request #7617: ARROW-9298: [C++] Fix crashes with invalid IPC input

2020-07-02 Thread GitBox
wesm closed pull request #7617: URL: https://github.com/apache/arrow/pull/7617

[GitHub] [arrow] wesm commented on pull request #7607: ARROW-8996: [C++] Add AVX version for aggregate sum/mean with runtime dispatch

2020-07-02 Thread GitBox
wesm commented on pull request #7607: URL: https://github.com/apache/arrow/pull/7607#issuecomment-653202336 I will review when I can -- we are going into a long weekend in the US so I may not get to it until early next week

[GitHub] [arrow] wesm commented on issue #7624: Specifying columns in a dataset drops the index (pandas) metadata.

2020-07-02 Thread GitBox
wesm commented on issue #7624: URL: https://github.com/apache/arrow/issues/7624#issuecomment-653201919 Could you open a JIRA about this? If something seems wrong or contrary to expectations it either is a bug or something that needs to be clarified in the API or documentation

[GitHub] [arrow] github-actions[bot] commented on pull request #7621: [Packaging][Python] Link brotli statically in the windows wheels

2020-07-02 Thread GitBox
github-actions[bot] commented on pull request #7621: URL: https://github.com/apache/arrow/pull/7621#issuecomment-653197684 Revision: 7598a021ca12c84227702176c495635603c088fb Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] tazimmerman opened a new issue #7624: Specifying columns in a dataset drops the index (pandas) metadata.

2020-07-02 Thread GitBox
tazimmerman opened a new issue #7624: URL: https://github.com/apache/arrow/issues/7624 I'm not sure if this is a missing feature, or just undocumented, or perhaps not even something I should expect to work. Let's start with a multi-index dataframe. ``` >>> import pyarrow

[GitHub] [arrow] wesm commented on pull request #7621: [Packaging][Python] Link brotli statically in the windows wheels

2020-07-02 Thread GitBox
wesm commented on pull request #7621: URL: https://github.com/apache/arrow/pull/7621#issuecomment-653197213 I think part of the issue is that brotli is being installed by conda so this iteration will hopefully fix it.

[GitHub] [arrow] wesm commented on pull request #7621: [Packaging][Python] Link brotli statically in the windows wheels

2020-07-02 Thread GitBox
wesm commented on pull request #7621: URL: https://github.com/apache/arrow/pull/7621#issuecomment-653196984 @github-actions crossbow submit wheel-win-cp37m

[GitHub] [arrow] wesm commented on pull request #7621: [Packaging][Python] Link brotli statically in the windows wheels

2020-07-02 Thread GitBox
wesm commented on pull request #7621: URL: https://github.com/apache/arrow/pull/7621#issuecomment-653195997 Take a closer look at the brotli_ep build https://github.com/apache/arrow/blob/master/cpp/cmake_modules/ThirdpartyToolchain.cmake#L877 When you use

[GitHub] [arrow] BryanCutler commented on a change in pull request #7275: ARROW-6110: [Java][Integration] Support LargeList Type and add integration test with C++

2020-07-02 Thread GitBox
BryanCutler commented on a change in pull request #7275: URL: https://github.com/apache/arrow/pull/7275#discussion_r449226788 ## File path: java/vector/src/main/java/org/apache/arrow/vector/complex/LargeListVector.java ## @@ -0,0 +1,1004 @@ +/* + * Licensed to the Apache

[GitHub] [arrow] fsaintjacques closed pull request #7614: ARROW-8977: [R] Table$create with schema crashes with some dictionary index types

2020-07-02 Thread GitBox
fsaintjacques closed pull request #7614: URL: https://github.com/apache/arrow/pull/7614

[GitHub] [arrow] kszucs commented on pull request #7621: [Packaging][Python] Link brotli statically in the windows wheels

2020-07-02 Thread GitBox
kszucs commented on pull request #7621: URL: https://github.com/apache/arrow/pull/7621#issuecomment-653180991 I think if we link brotli dynamically then we need to bundle the dll in pyarrow/cmakelist and setup.py into the wheel, so my original plan was to link it statically.

[GitHub] [arrow] houqp commented on pull request #7501: ARROW-9192: [CI][Rust] Add support for running clippy

2020-07-02 Thread GitBox
houqp commented on pull request #7501: URL: https://github.com/apache/arrow/pull/7501#issuecomment-653175026 Thanks @kszucs, in the meantime, I will try to fix more linting errors and see if we can just run clippy as is without the custom linting script before you start adding it to

[GitHub] [arrow] github-actions[bot] commented on pull request #7621: [Packaging][Python] Link brotli statically in the windows wheels

2020-07-02 Thread GitBox
github-actions[bot] commented on pull request #7621: URL: https://github.com/apache/arrow/pull/7621#issuecomment-653171393 Revision: f434ec6bdd509ce9a8872bad0a98c8ef5af172f5 Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] wesm commented on pull request #7621: [Packaging][Python] Link brotli statically in the windows wheels

2020-07-02 Thread GitBox
wesm commented on pull request #7621: URL: https://github.com/apache/arrow/pull/7621#issuecomment-653170709 @github-actions crossbow submit wheel-win-cp37m

[GitHub] [arrow] wesm commented on pull request #7621: [Packaging][Python] Link brotli statically in the windows wheels

2020-07-02 Thread GitBox
wesm commented on pull request #7621: URL: https://github.com/apache/arrow/pull/7621#issuecomment-653169897 I'm pretty confident you're going to have to restore -DBUILD_SHARED_LIBS=OFF to the brotli_ep build.

[GitHub] [arrow] github-actions[bot] commented on pull request #7623: ARROW-9108: [C++][Dataset] Add supports for missing type in Statistics to Scalar conversion

2020-07-02 Thread GitBox
github-actions[bot] commented on pull request #7623: URL: https://github.com/apache/arrow/pull/7623#issuecomment-653166715 https://issues.apache.org/jira/browse/ARROW-9108

[GitHub] [arrow] fsaintjacques commented on a change in pull request #7623: ARROW-9108: [C++][Dataset] Add supports for missing type in Statistics to Scalar conversion

2020-07-02 Thread GitBox
fsaintjacques commented on a change in pull request #7623: URL: https://github.com/apache/arrow/pull/7623#discussion_r449204595 ## File path: python/pyarrow/tests/test_dataset.py ## @@ -760,6 +760,98 @@ def test_fragments_parquet_row_groups(tempdir): assert len(result) ==

[GitHub] [arrow] sunchao commented on a change in pull request #7610: ARROW-9290: [Rust] [Parquet] Add features to allow opting out of dependencies

2020-07-02 Thread GitBox
sunchao commented on a change in pull request #7610: URL: https://github.com/apache/arrow/pull/7610#discussion_r449204149 ## File path: rust/parquet/Cargo.toml ## @@ -29,20 +29,29 @@ build = "build.rs" edition = "2018" [dependencies] -parquet-format = "~2.6"

[GitHub] [arrow] fsaintjacques opened a new pull request #7623: ARROW-9108: [C++][Dataset] Add supports for missing type in Statistics to Scalar conversion

2020-07-02 Thread GitBox
fsaintjacques opened a new pull request #7623: URL: https://github.com/apache/arrow/pull/7623

[GitHub] [arrow] pitrou commented on a change in pull request #7598: ARROW-9278: [C++][Python][DONOTMERGE] Remove validity bitmap from Union types, update IPC read/write and integration tests

2020-07-02 Thread GitBox
pitrou commented on a change in pull request #7598: URL: https://github.com/apache/arrow/pull/7598#discussion_r449198523 ## File path: cpp/src/arrow/array/builder_nested.h ## @@ -395,8 +395,17 @@ class ARROW_EXPORT StructBuilder : public ArrayBuilder { return

[GitHub] [arrow] wesm commented on a change in pull request #7598: ARROW-9278: [C++][Python][DONOTMERGE] Remove validity bitmap from Union types, update IPC read/write and integration tests

2020-07-02 Thread GitBox
wesm commented on a change in pull request #7598: URL: https://github.com/apache/arrow/pull/7598#discussion_r449194637 ## File path: cpp/src/arrow/ipc/util.h ## @@ -19,6 +19,8 @@ #include +#include "arrow/type.h" Review comment: This was an artifact of an

[GitHub] [arrow] wesm commented on a change in pull request #7598: ARROW-9278: [C++][Python][DONOTMERGE] Remove validity bitmap from Union types, update IPC read/write and integration tests

2020-07-02 Thread GitBox
wesm commented on a change in pull request #7598: URL: https://github.com/apache/arrow/pull/7598#discussion_r449194292 ## File path: cpp/src/arrow/array/builder_nested.h ## @@ -395,8 +395,17 @@ class ARROW_EXPORT StructBuilder : public ArrayBuilder { return Status::OK();

[GitHub] [arrow] wesm commented on a change in pull request #7598: ARROW-9278: [C++][Python][DONOTMERGE] Remove validity bitmap from Union types, update IPC read/write and integration tests

2020-07-02 Thread GitBox
wesm commented on a change in pull request #7598: URL: https://github.com/apache/arrow/pull/7598#discussion_r449192621 ## File path: cpp/src/arrow/array/array_base.h ## @@ -86,16 +86,17 @@ class ARROW_EXPORT Array { std::shared_ptr type() const { return data_->type; }

[GitHub] [arrow] wesm commented on a change in pull request #7598: ARROW-9278: [C++][Python][DONOTMERGE] Remove validity bitmap from Union types, update IPC read/write and integration tests

2020-07-02 Thread GitBox
wesm commented on a change in pull request #7598: URL: https://github.com/apache/arrow/pull/7598#discussion_r449192771 ## File path: cpp/src/arrow/array/array_nested.cc ## @@ -678,6 +680,10 @@ Result> DenseUnionArray::Make( return Status::TypeError("UnionArray type_ids

[GitHub] [arrow] github-actions[bot] commented on pull request #7622: ARROW-9280: [Rust] [Parquet] Calculate page and column statistics

2020-07-02 Thread GitBox
github-actions[bot] commented on pull request #7622: URL: https://github.com/apache/arrow/pull/7622#issuecomment-653141108 https://issues.apache.org/jira/browse/ARROW-9280 This is an automated message from the Apache Git

[GitHub] [arrow] github-actions[bot] commented on pull request #7621: [Packaging][Python] Link brotli statically in the windows wheels

2020-07-02 Thread GitBox
github-actions[bot] commented on pull request #7621: URL: https://github.com/apache/arrow/pull/7621#issuecomment-653140442 Revision: 1e8fecd3616b988c92a03ad58bebe122e32588c6 Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] zeevm opened a new pull request #7622: ARROW-9280: [Rust] [Parquet] Calculate page and column statistics

2020-07-02 Thread GitBox
zeevm opened a new pull request #7622: URL: https://github.com/apache/arrow/pull/7622 Allow writer to provide pre-calculated stats This is an automated message from the Apache Git Service. To respond to the message, please

[GitHub] [arrow] kszucs opened a new pull request #7621: [Packaging][Python] Link brotli statically in the windows wheels

2020-07-02 Thread GitBox
kszucs opened a new pull request #7621: URL: https://github.com/apache/arrow/pull/7621 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [arrow] kszucs commented on pull request #7621: [Packaging][Python] Link brotli statically in the windows wheels

2020-07-02 Thread GitBox
kszucs commented on pull request #7621: URL: https://github.com/apache/arrow/pull/7621#issuecomment-653139804 @github-actions crossbow submit wheel-win-cp37m This is an automated message from the Apache Git Service. To

[GitHub] [arrow] zeevm closed pull request #7586: ARROW-9280: [Rust] [Parquet] Calculate page and column statistics

2020-07-02 Thread GitBox
zeevm closed pull request #7586: URL: https://github.com/apache/arrow/pull/7586 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [arrow] saethlin commented on a change in pull request #7610: ARROW-9290: [Rust] [Parquet] Add features to allow opting out of dependencies

2020-07-02 Thread GitBox
saethlin commented on a change in pull request #7610: URL: https://github.com/apache/arrow/pull/7610#discussion_r449168566 ## File path: rust/parquet/Cargo.toml ## @@ -29,20 +29,29 @@ build = "build.rs" edition = "2018" [dependencies] -parquet-format = "~2.6"

[GitHub] [arrow] wesm commented on a change in pull request #7598: ARROW-9278: [C++][Python][DONOTMERGE] Remove validity bitmap from Union types, update IPC read/write and integration tests

2020-07-02 Thread GitBox
wesm commented on a change in pull request #7598: URL: https://github.com/apache/arrow/pull/7598#discussion_r449156988 ## File path: cpp/src/arrow/array/array_struct_test.cc ## @@ -256,16 +256,6 @@ TEST_F(TestStructBuilder, TestAppendNull) {

[GitHub] [arrow] sunchao commented on pull request #7586: ARROW-9280: [Rust] [Parquet] Calculate page and column statistics

2020-07-02 Thread GitBox
sunchao commented on pull request #7586: URL: https://github.com/apache/arrow/pull/7586#issuecomment-653122116 @zeevm once approved, a committer will help merge this. Seems the PR now is a little messed up, can you clean it up so I can merge it?

[GitHub] [arrow] BryanCutler commented on pull request #6316: ARROW-7717: [CI] Have nightly integration test for Spark's latest release

2020-07-02 Thread GitBox
BryanCutler commented on pull request #6316: URL: https://github.com/apache/arrow/pull/6316#issuecomment-653118024 This is great, thanks @kszucs ! This is an automated message from the Apache Git Service. To respond to the

[GitHub] [arrow] fsaintjacques commented on a change in pull request #7608: ARROW-9288: [C++][Dataset] Fix PartitioningFactory with dictionary encoding for HivePartioning

2020-07-02 Thread GitBox
fsaintjacques commented on a change in pull request #7608: URL: https://github.com/apache/arrow/pull/7608#discussion_r449148761 ## File path: cpp/src/arrow/dataset/partition.cc ## @@ -646,15 +657,26 @@ class HivePartitioningFactory : public PartitioningFactory { }

[GitHub] [arrow] github-actions[bot] commented on pull request #7619: ARROW-9300: [Java] Separate Netty Memory to its own module

2020-07-02 Thread GitBox
github-actions[bot] commented on pull request #7619: URL: https://github.com/apache/arrow/pull/7619#issuecomment-653114882 https://issues.apache.org/jira/browse/ARROW-9300 This is an automated message from the Apache Git

[GitHub] [arrow] rymurr opened a new pull request #7619: ARROW-9300: [Java] Separate Netty Memory to its own module

2020-07-02 Thread GitBox
rymurr opened a new pull request #7619: URL: https://github.com/apache/arrow/pull/7619 This finishes the netty split in arrow-memory and creates 3 new modules * memory-core: core memory implementation * memory-netty: netty allocation manager * memory-unsafe: unsafe allocation

[GitHub] [arrow] pitrou opened a new pull request #7620: ARROW-9013: [C++] Validate CMake options

2020-07-02 Thread GitBox
pitrou opened a new pull request #7620: URL: https://github.com/apache/arrow/pull/7620 Disallow passing invalid values to enum-style CMake options (such as `-DARROW_SIMD_LEVEL=xyzzy`) This is an automated message from the

[GitHub] [arrow] emkornfield commented on a change in pull request #7604: ARROW-9223: [Python] Propagate timezone information in pandas conversion

2020-07-02 Thread GitBox
emkornfield commented on a change in pull request #7604: URL: https://github.com/apache/arrow/pull/7604#discussion_r449130523 ## File path: cpp/src/arrow/python/arrow_to_pandas.cc ## @@ -642,24 +641,27 @@ inline Status ConvertStruct(const PandasOptions& options, const
