[GitHub] [arrow] arw2019 opened a new pull request #8816: ARROW-9027: [Python][Testing] Split parquet tests into multiple files + clean-up

2020-12-01 Thread GitBox
arw2019 opened a new pull request #8816: URL: https://github.com/apache/arrow/pull/8816 Only relocation - none of the tests are touched. cc @jorisvandenbossche This is an automated message from the Apache Git Service.

[GitHub] [arrow] andygrove commented on a change in pull request #8815: ARROW-10753: [Rust] [DataFusion] Fix parsing of negative numbers in DataFusion

2020-12-01 Thread GitBox
andygrove commented on a change in pull request #8815: URL: https://github.com/apache/arrow/pull/8815#discussion_r533821974 ## File path: rust/datafusion/src/lib.rs ## @@ -52,6 +52,7 @@ clippy::useless_format, clippy::zero_prefixed_literal )]

[GitHub] [arrow] kou commented on pull request #8817: ARROW-10786: [Packaging][RPM] Drop support for CentOS 6

2020-12-01 Thread GitBox
kou commented on pull request #8817: URL: https://github.com/apache/arrow/pull/8817#issuecomment-736959349 @github-actions crossbow submit centos-*

[GitHub] [arrow] kou opened a new pull request #8817: ARROW-10786: [Packaging][RPM] Drop support for CentOS 6

2020-12-01 Thread GitBox
kou opened a new pull request #8817: URL: https://github.com/apache/arrow/pull/8817 Because CentOS 6 reached EOL on 2020-11-30.

[GitHub] [arrow] github-actions[bot] commented on pull request #8817: ARROW-10786: [Packaging][RPM] Drop support for CentOS 6

2020-12-01 Thread GitBox
github-actions[bot] commented on pull request #8817: URL: https://github.com/apache/arrow/pull/8817#issuecomment-736960561 Revision: b1c4fdba3923bb1ed2ff98c52d6129fe4dccbdad Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] andygrove commented on a change in pull request #8794: ARROW-10759: [Rust][DataFusion] Implement string to date cast

2020-12-01 Thread GitBox
andygrove commented on a change in pull request #8794: URL: https://github.com/apache/arrow/pull/8794#discussion_r533816728 ## File path: rust/arrow/src/compute/kernels/cast.rs ## @@ -376,6 +378,27 @@ pub fn cast(array: , to_type: ) -> Result { Int64 =>

[GitHub] [arrow] nealrichardson commented on a change in pull request #8757: ARROW-8147: [C++] Add google-cloud-cpp to ThirdpartyToolchain

2020-12-01 Thread GitBox
nealrichardson commented on a change in pull request #8757: URL: https://github.com/apache/arrow/pull/8757#discussion_r533838250 ## File path: cpp/thirdparty/versions.txt ## @@ -32,14 +32,17 @@ ARROW_BOOST_BUILD_VERSION=1.71.0 ARROW_BROTLI_BUILD_VERSION=v1.0.7

[GitHub] [arrow] github-actions[bot] commented on pull request #8817: ARROW-10786: [Packaging][RPM] Drop support for CentOS 6

2020-12-01 Thread GitBox
github-actions[bot] commented on pull request #8817: URL: https://github.com/apache/arrow/pull/8817#issuecomment-736965696 https://issues.apache.org/jira/browse/ARROW-10786

[GitHub] [arrow] andygrove commented on a change in pull request #8815: [datafusion][rust] Arrow-10753: Fix parsing of negative numbers in DataFusion

2020-12-01 Thread GitBox
andygrove commented on a change in pull request #8815: URL: https://github.com/apache/arrow/pull/8815#discussion_r533821182 ## File path: rust/datafusion/tests/sql.rs ## @@ -1496,6 +1496,25 @@ async fn csv_query_sum_cast() { execute( ctx, sql).await; } +#[tokio::test]

[GitHub] [arrow] nealrichardson closed pull request #8813: ARROW-10780: [R] Update known R installation issues for CentOS 7

2020-12-01 Thread GitBox
nealrichardson closed pull request #8813: URL: https://github.com/apache/arrow/pull/8813

[GitHub] [arrow] github-actions[bot] commented on pull request #8815: ARROW-10753: [Rust] [DataFusion] Fix parsing of negative numbers in DataFusion

2020-12-01 Thread GitBox
github-actions[bot] commented on pull request #8815: URL: https://github.com/apache/arrow/pull/8815#issuecomment-736921960 https://issues.apache.org/jira/browse/ARROW-10753

[GitHub] [arrow] github-actions[bot] commented on pull request #8816: ARROW-9027: [Python][Testing] Split parquet tests into multiple files + clean-up

2020-12-01 Thread GitBox
github-actions[bot] commented on pull request #8816: URL: https://github.com/apache/arrow/pull/8816#issuecomment-736921959 https://issues.apache.org/jira/browse/ARROW-9027

[GitHub] [arrow] GregBowyer commented on pull request #8698: ARROW-10636: [Rust][Parquet] Remove rust specialization

2020-12-01 Thread GitBox
GregBowyer commented on pull request #8698: URL: https://github.com/apache/arrow/pull/8698#issuecomment-736938052 I have been working on this w.r.t. performance, and I think I have most parts performing better than the original. I am running off clean benchmarks right now to validate.

[GitHub] [arrow] nealrichardson commented on a change in pull request #8813: ARROW-10780: [R] Update known R installation issues for CentOS 7

2020-12-01 Thread GitBox
nealrichardson commented on a change in pull request #8813: URL: https://github.com/apache/arrow/pull/8813#discussion_r533785013 ## File path: r/vignettes/install.Rmd ## @@ -284,8 +284,15 @@ so that we can attempt to improve the script. ## Known installation issues * On

[GitHub] [arrow] andygrove commented on a change in pull request #8815: [datafusion][rust] Arrow-10753: Fix parsing of negative numbers in DataFusion

2020-12-01 Thread GitBox
andygrove commented on a change in pull request #8815: URL: https://github.com/apache/arrow/pull/8815#discussion_r533820801 ## File path: rust/datafusion/src/sql/planner.rs ## @@ -638,10 +638,13 @@ impl<'a, S: SchemaProvider> SqlToRel<'a, S> {

[GitHub] [arrow] nevi-me commented on pull request #8800: ARROW-10767: [Rust] Speed up sum with nulls (non-simd)

2020-12-01 Thread GitBox
nevi-me commented on pull request #8800: URL: https://github.com/apache/arrow/pull/8800#issuecomment-737022972 > Looks like an unrelated failure in the CI The failure's indeed unrelated, it's something we subsequently fixed, but is still failing because it's not yet rebased in this

[GitHub] [arrow] nealrichardson commented on a change in pull request #8813: ARROW-10780: [R] update known R installation issues for centos7 and a…

2020-12-01 Thread GitBox
nealrichardson commented on a change in pull request #8813: URL: https://github.com/apache/arrow/pull/8813#discussion_r533676787 ## File path: r/vignettes/install.Rmd ## @@ -284,7 +284,11 @@ so that we can attempt to improve the script. ## Known installation issues * On

[GitHub] [arrow] yordan-pavlov commented on a change in pull request #8794: ARROW-10759: [Rust][DataFusion] Implement string to date cast

2020-12-01 Thread GitBox
yordan-pavlov commented on a change in pull request #8794: URL: https://github.com/apache/arrow/pull/8794#discussion_r533682639 ## File path: rust/arrow/src/compute/kernels/cast.rs ## @@ -376,6 +378,27 @@ pub fn cast(array: , to_type: ) -> Result { Int64 =>

[GitHub] [arrow] mattpollock commented on a change in pull request #8813: ARROW-10780: [R] update known R installation issues for centos7 and a…

2020-12-01 Thread GitBox
mattpollock commented on a change in pull request #8813: URL: https://github.com/apache/arrow/pull/8813#discussion_r533674921 ## File path: r/vignettes/install.Rmd ## @@ -284,7 +284,11 @@ so that we can attempt to improve the script. ## Known installation issues * On

[GitHub] [arrow] kou commented on a change in pull request #8813: ARROW-10780: [R] Update known R installation issues for CentOS 7

2020-12-01 Thread GitBox
kou commented on a change in pull request #8813: URL: https://github.com/apache/arrow/pull/8813#discussion_r533697120 ## File path: r/vignettes/install.Rmd ## @@ -284,7 +284,11 @@ so that we can attempt to improve the script. ## Known installation issues * On CentOS, if

[GitHub] [arrow] yordan-pavlov commented on a change in pull request #8794: ARROW-10759: [Rust][DataFusion] Implement string to date cast

2020-12-01 Thread GitBox
yordan-pavlov commented on a change in pull request #8794: URL: https://github.com/apache/arrow/pull/8794#discussion_r533701070 ## File path: rust/arrow/src/compute/kernels/cast.rs ## @@ -2720,6 +2743,42 @@ mod tests { .collect() } +#[test] +fn

[GitHub] [arrow] nealrichardson commented on a change in pull request #8808: ARROW-10774: [R] Set minimum cpp11 version

2020-12-01 Thread GitBox
nealrichardson commented on a change in pull request #8808: URL: https://github.com/apache/arrow/pull/8808#discussion_r533643340 ## File path: r/DESCRIPTION ## @@ -48,7 +48,7 @@ Suggests: rmarkdown, testthat, tibble -LinkingTo: cpp11 +LinkingTo: cpp11 (>= 0.2.0)

[GitHub] [arrow] nealrichardson closed pull request #8808: ARROW-10774: [R] Set minimum cpp11 version

2020-12-01 Thread GitBox
nealrichardson closed pull request #8808: URL: https://github.com/apache/arrow/pull/8808

[GitHub] [arrow] mattpollock opened a new pull request #8813: ARROW-10780: [R] update known R installation issues for centos7 and a…

2020-12-01 Thread GitBox
mattpollock opened a new pull request #8813: URL: https://github.com/apache/arrow/pull/8813 Update known installation issues vignette per discussion in ARROW-10780

[GitHub] [arrow] bkietz commented on a change in pull request #8770: ARROW-10696: [C++] Add SetBitRunReader

2020-12-01 Thread GitBox
bkietz commented on a change in pull request #8770: URL: https://github.com/apache/arrow/pull/8770#discussion_r533640298 ## File path: cpp/src/arrow/util/bit_util_test.cc ## @@ -66,6 +66,15 @@ using internal::InvertBitmap; using ::testing::ElementsAreArray; +namespace

[GitHub] [arrow] kou commented on pull request #8782: ARROW-10746: [C++] Bump gtest version + use GTEST_SKIP in tests

2020-12-01 Thread GitBox
kou commented on pull request #8782: URL: https://github.com/apache/arrow/pull/8782#issuecomment-736770394 I think so. We need a fix like https://github.com/apache/arrow/pull/8782#issuecomment-736217882 for `cpp/src/parquet/column_writer_test.cc`.

[GitHub] [arrow] bkietz commented on a change in pull request #8777: ARROW-10569: [C++] Improve table filtering performance

2020-12-01 Thread GitBox
bkietz commented on a change in pull request #8777: URL: https://github.com/apache/arrow/pull/8777#discussion_r533624625 ## File path: cpp/src/arrow/compute/kernels/vector_selection.cc ## @@ -1838,19 +1838,113 @@ Result> FilterTable(const Table& table, const Datum& filt if

[GitHub] [arrow] bkietz commented on a change in pull request #8761: ARROW-10697: [C++] Add notes about bitmap readers

2020-12-01 Thread GitBox
bkietz commented on a change in pull request #8761: URL: https://github.com/apache/arrow/pull/8761#discussion_r533626348 ## File path: cpp/src/arrow/compute/kernels/vector_sort.cc ## @@ -238,21 +239,9 @@ inline void VisitRawValuesInline(const ArrayType& values,

[GitHub] [arrow] bkietz closed pull request #8613: ARROW-10411: [C++] Fix incorrect child array lengths for Concatenate of FixedSizeList

2020-12-01 Thread GitBox
bkietz closed pull request #8613: URL: https://github.com/apache/arrow/pull/8613

[GitHub] [arrow] nealrichardson commented on a change in pull request #8757: ARROW-8147: [C++] Add google-cloud-cpp to ThirdpartyToolchain

2020-12-01 Thread GitBox
nealrichardson commented on a change in pull request #8757: URL: https://github.com/apache/arrow/pull/8757#discussion_r533651013 ## File path: cpp/thirdparty/versions.txt ## @@ -32,14 +32,17 @@ ARROW_BOOST_BUILD_VERSION=1.71.0 ARROW_BROTLI_BUILD_VERSION=v1.0.7

[GitHub] [arrow] nealrichardson commented on a change in pull request #8813: ARROW-10780: [R] update known R installation issues for centos7 and a…

2020-12-01 Thread GitBox
nealrichardson commented on a change in pull request #8813: URL: https://github.com/apache/arrow/pull/8813#discussion_r533657147 ## File path: r/vignettes/install.Rmd ## @@ -284,7 +284,11 @@ so that we can attempt to improve the script. ## Known installation issues * On

[GitHub] [arrow] bluss edited a comment on pull request #7734: ARROW-8480: [Rust] Use NonNull well aligned pointer as Unique reference

2020-12-01 Thread GitBox
bluss edited a comment on pull request #7734: URL: https://github.com/apache/arrow/pull/7734#issuecomment-736685727 It is claimed https://issues.apache.org/jira/browse/ARROW-8480 is solved by this PR, but this PR does not add checks for allocation failure, and there are still bugs to spot

[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8401: ARROW-10109: [Rust] Add support to the C data interface for primitive types and utf8

2020-12-01 Thread GitBox
jorgecarleitao commented on a change in pull request #8401: URL: https://github.com/apache/arrow/pull/8401#discussion_r533614032 ## File path: rust/arrow-pyarrow-integration-testing/tests/test_sql.py ## @@ -0,0 +1,61 @@ +# -*- coding: utf-8 -*- +# Licensed to the Apache

[GitHub] [arrow] andygrove closed pull request #8807: ARROW-10750: [Rust] [DataFusion] Add SQL support for LEFT and RIGHT join

2020-12-01 Thread GitBox
andygrove closed pull request #8807: URL: https://github.com/apache/arrow/pull/8807

[GitHub] [arrow] nevi-me closed pull request #8804: ARROW-10775: [Rust][DataFusion] Use ahash in join hashmap

2020-12-01 Thread GitBox
nevi-me closed pull request #8804: URL: https://github.com/apache/arrow/pull/8804

[GitHub] [arrow] vikashsingh009 opened a new issue #8812: Typescript Arrowjs Class 'RecordBatch' incorrectly extends base class 'StructVector'

2020-12-01 Thread GitBox
vikashsingh009 opened a new issue #8812: URL: https://github.com/apache/arrow/issues/8812 I have imported arrowjs in typescript like import { Table } from 'apache-arrow'; const arrow = readFileSync('simple.arrow'); const table = Table.from([arrow]); but when i

[GitHub] [arrow] jorgecarleitao commented on a change in pull request #8401: ARROW-10109: [Rust] Add support to the C data interface for primitive types and utf8

2020-12-01 Thread GitBox
jorgecarleitao commented on a change in pull request #8401: URL: https://github.com/apache/arrow/pull/8401#discussion_r533521705 ## File path: rust/arrow-c-integration/src/lib.rs ## @@ -0,0 +1,162 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more

[GitHub] [arrow] Dandandan commented on pull request #8800: ARROW-10767: [Rust] Speed up sum with nulls (non-simd)

2020-12-01 Thread GitBox
Dandandan commented on pull request #8800: URL: https://github.com/apache/arrow/pull/8800#issuecomment-736683582 Looks like an unrelated failure in the CI

[GitHub] [arrow] bluss edited a comment on pull request #7734: ARROW-8480: [Rust] Use NonNull well aligned pointer as Unique reference

2020-12-01 Thread GitBox
bluss edited a comment on pull request #7734: URL: https://github.com/apache/arrow/pull/7734#issuecomment-736685727 It is claimed https://issues.apache.org/jira/browse/ARROW-8480 is solved by this PR, but this PR does not add checks for allocation failure, and there are still bugs due to

[GitHub] [arrow] bluss commented on pull request #7734: ARROW-8480: [Rust] Use NonNull well aligned pointer as Unique reference

2020-12-01 Thread GitBox
bluss commented on pull request #7734: URL: https://github.com/apache/arrow/pull/7734#issuecomment-736685727 It is claimed https://issues.apache.org/jira/browse/ARROW-8480 is solved by this PR, but this PR does not add checks for allocation failure, and there are still bugs due to that,

[GitHub] [arrow] github-actions[bot] commented on pull request #8813: ARROW-10780: [R] Update known R installation issues for CentOS 7

2020-12-01 Thread GitBox
github-actions[bot] commented on pull request #8813: URL: https://github.com/apache/arrow/pull/8813#issuecomment-736820773 https://issues.apache.org/jira/browse/ARROW-10780

[GitHub] [arrow] jorisvandenbossche commented on pull request #8704: ARROW-10644: [Python] Consolidate path/filesystem handling in pyarrow.dataset and pyarrow.fs

2020-12-01 Thread GitBox
jorisvandenbossche commented on pull request #8704: URL: https://github.com/apache/arrow/pull/8704#issuecomment-736823311 I added a test, this should be good now

[GitHub] [arrow] Dandandan opened a new pull request #8814: ARROW-10785: [Rust] Optimize take string

2020-12-01 Thread GitBox
Dandandan opened a new pull request #8814: URL: https://github.com/apache/arrow/pull/8814 Further optimizes take string to benefit from creating a buffer directly. Interestingly, it seems it is still faster to copy a `Vec` than to use a buffer for the values. Presumably

[GitHub] [arrow] josiahyan commented on a change in pull request #8757: ARROW-8147: [C++] Add google-cloud-cpp to ThirdpartyToolchain

2020-12-01 Thread GitBox
josiahyan commented on a change in pull request #8757: URL: https://github.com/apache/arrow/pull/8757#discussion_r533728016 ## File path: cpp/thirdparty/versions.txt ## @@ -32,14 +32,17 @@ ARROW_BOOST_BUILD_VERSION=1.71.0 ARROW_BROTLI_BUILD_VERSION=v1.0.7

[GitHub] [arrow] github-actions[bot] commented on pull request #8814: ARROW-10785: [Rust] Optimize take string

2020-12-01 Thread GitBox
github-actions[bot] commented on pull request #8814: URL: https://github.com/apache/arrow/pull/8814#issuecomment-736850611 https://issues.apache.org/jira/browse/ARROW-10785

[GitHub] [arrow] velvia commented on pull request #8815: [datafusion][rust] Arrow-10753: Fix parsing of negative numbers in DataFusion

2020-12-01 Thread GitBox
velvia commented on pull request #8815: URL: https://github.com/apache/arrow/pull/8815#issuecomment-736864975 /cc @jorgecarleitao who last changed this part of planner.rs

[GitHub] [arrow] velvia opened a new pull request #8815: [datafusion][rust] Arrow-10753: Fix parsing of negative numbers in DataFusion

2020-12-01 Thread GitBox
velvia opened a new pull request #8815: URL: https://github.com/apache/arrow/pull/8815 Currently, DataFusion SQL statements that compare negative numbers result in an exception, as negative numbers are incorrectly parsed. The error thrown is `InternalError("SQL binary operator cannot be

[GitHub] [arrow] github-actions[bot] commented on pull request #8809: ARROW-10778: [Python] Fix RowGroupInfo.statistics for empty row groups

2020-12-01 Thread GitBox
github-actions[bot] commented on pull request #8809: URL: https://github.com/apache/arrow/pull/8809#issuecomment-736372323 https://issues.apache.org/jira/browse/ARROW-10778

[GitHub] [arrow] maartenbreddels commented on pull request #8755: ARROW-10709: [Python] Allow PythonFile.read() to always return a buffer

2020-12-01 Thread GitBox
maartenbreddels commented on pull request #8755: URL: https://github.com/apache/arrow/pull/8755#issuecomment-736369145 I think the failure is unrelated. I'm really excited about this (seemingly trivial) feature, this gives such a massive performance improvement (equal to memory

[GitHub] [arrow] pitrou commented on a change in pull request #8755: ARROW-10709: [Python] Allow PythonFile.read() to always return a buffer

2020-12-01 Thread GitBox
pitrou commented on a change in pull request #8755: URL: https://github.com/apache/arrow/pull/8755#discussion_r533193607 ## File path: cpp/src/arrow/python/io.cc ## @@ -199,25 +219,32 @@ Result PyReadableFile::Read(int64_t nbytes, void* out) { PyObject* bytes_obj =

[GitHub] [arrow] maartenbreddels commented on a change in pull request #8755: ARROW-10709: [Python] Allow PythonFile.read() to always return a buffer

2020-12-01 Thread GitBox
maartenbreddels commented on a change in pull request #8755: URL: https://github.com/apache/arrow/pull/8755#discussion_r533159508 ## File path: python/pyarrow/tests/test_io.py ## @@ -163,6 +163,34 @@ def test_python_file_readinto(): assert len(dst_buf) == length

[GitHub] [arrow] pitrou commented on a change in pull request #8757: ARROW-8147: [C++] Add google-cloud-cpp to ThirdpartyToolchain

2020-12-01 Thread GitBox
pitrou commented on a change in pull request #8757: URL: https://github.com/apache/arrow/pull/8757#discussion_r533241989 ## File path: ci/docker/conda-cpp.dockerfile ## @@ -73,6 +73,7 @@ ENV ARROW_BUILD_TESTS=ON \ ARROW_PARQUET=ON \ ARROW_PLASMA=ON \ ARROW_S3=ON

[GitHub] [arrow] pitrou commented on pull request #8755: ARROW-10709: [Python] Allow PythonFile.read() to always return a buffer

2020-12-01 Thread GitBox
pitrou commented on pull request #8755: URL: https://github.com/apache/arrow/pull/8755#issuecomment-736373876 > I think the failure is unrelated. It is. We're dropping Python 3.5 support soon.

[GitHub] [arrow] pitrou closed pull request #8761: ARROW-10697: [C++] Add notes about bitmap readers

2020-12-01 Thread GitBox
pitrou closed pull request #8761: URL: https://github.com/apache/arrow/pull/8761

[GitHub] [arrow] pitrou commented on pull request #8704: ARROW-10644: [Python] Consolidate path/filesystem handling in pyarrow.dataset and pyarrow.fs

2020-12-01 Thread GitBox
pitrou commented on pull request #8704: URL: https://github.com/apache/arrow/pull/8704#issuecomment-737046252 Thank you @jorisvandenbossche !

[GitHub] [arrow] pitrou commented on a change in pull request #8761: ARROW-10697: [C++] Add notes about bitmap readers

2020-12-01 Thread GitBox
pitrou commented on a change in pull request #8761: URL: https://github.com/apache/arrow/pull/8761#discussion_r533951860 ## File path: cpp/src/arrow/compute/kernels/vector_sort.cc ## @@ -238,21 +239,9 @@ inline void VisitRawValuesInline(const ArrayType& values,

[GitHub] [arrow] pitrou commented on a change in pull request #8770: ARROW-10696: [C++] Add SetBitRunReader

2020-12-01 Thread GitBox
pitrou commented on a change in pull request #8770: URL: https://github.com/apache/arrow/pull/8770#discussion_r533955521 ## File path: cpp/src/arrow/util/bit_run_reader.h ## @@ -166,7 +167,350 @@ class ARROW_EXPORT BitRunReader { using BitRunReader = BitRunReaderLinear;

[GitHub] [arrow] pitrou closed pull request #8704: ARROW-10644: [Python] Consolidate path/filesystem handling in pyarrow.dataset and pyarrow.fs

2020-12-01 Thread GitBox
pitrou closed pull request #8704: URL: https://github.com/apache/arrow/pull/8704

[GitHub] [arrow] pitrou closed pull request #8755: ARROW-10709: [C++][Python] Allow PyReadableFile::Read() to call pyobj.read_buffer()

2020-12-01 Thread GitBox
pitrou closed pull request #8755: URL: https://github.com/apache/arrow/pull/8755

[GitHub] [arrow] Dandandan commented on pull request #8743: ARROW-10693: [Rust] [DataFusion] Add support to left join

2020-12-01 Thread GitBox
Dandandan commented on pull request #8743: URL: https://github.com/apache/arrow/pull/8743#issuecomment-736452020 Awesome that the "main" join support is now in!

[GitHub] [arrow] github-actions[bot] commented on pull request #8810: ARROW-10779: [Java] Fix writeNull method in UnionListWriter

2020-12-01 Thread GitBox
github-actions[bot] commented on pull request #8810: URL: https://github.com/apache/arrow/pull/8810#issuecomment-736534626 https://issues.apache.org/jira/browse/ARROW-10779

[GitHub] [arrow] pitrou commented on pull request #8776: ARROW-5679: [Python][CI] Remove Python 3.5 support

2020-12-01 Thread GitBox
pitrou commented on pull request #8776: URL: https://github.com/apache/arrow/pull/8776#issuecomment-736392159 @emkornfield opined on this on the ML.

[GitHub] [arrow] projjal closed pull request #8810: ARROW-10779: [Java] Fix writeNull method in UnionListWriter

2020-12-01 Thread GitBox
projjal closed pull request #8810: URL: https://github.com/apache/arrow/pull/8810

[GitHub] [arrow] jorisvandenbossche closed pull request #8809: ARROW-10778: [Python] Fix RowGroupInfo.statistics for empty row groups

2020-12-01 Thread GitBox
jorisvandenbossche closed pull request #8809: URL: https://github.com/apache/arrow/pull/8809

[GitHub] [arrow] projjal opened a new pull request #8810: ARROW-10779: [Java] Fix writeNull method in UnionListWriter

2020-12-01 Thread GitBox
projjal opened a new pull request #8810: URL: https://github.com/apache/arrow/pull/8810 UnionListWriter#writeNull currently increments the index in the inner writer and doesn't unset the validity at the particular index. So if the validity at that index was already set (like due to
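
The validity problem described in this snippet can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyListWriter` and all of its methods are hypothetical names invented for illustration, the toy has no inner writer, and none of it is the Arrow Java `UnionListWriter` API or the actual patch. It shows only why a null write should clear the validity flag rather than assume the slot is already unset.

```python
class ToyListWriter:
    """Hypothetical stand-in for a list writer; not the Arrow Java API."""

    def __init__(self, capacity: int) -> None:
        self.validity = [False] * capacity  # one flag per list slot
        self.position = 0                   # next slot to write

    def set_position(self, position: int) -> None:
        self.position = position

    def write_list(self) -> None:
        # Mark the current slot as holding a (non-null) list.
        self.validity[self.position] = True
        self.position += 1

    def write_null_buggy(self) -> None:
        # Only advances; a stale True flag from an earlier write survives.
        self.position += 1

    def write_null(self) -> None:
        # Clear the flag explicitly so a reused slot reads as null.
        self.validity[self.position] = False
        self.position += 1


writer = ToyListWriter(capacity=2)
writer.write_list()      # slot 0 becomes valid
writer.set_position(0)   # rewind and reuse the same slot
writer.write_null()      # slot 0 now reads as null
```

With `write_null_buggy` in place of `write_null`, slot 0 would still be flagged valid after the rewind, which is the kind of stale state the snippet appears to describe.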

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #8809: ARROW-10778: [Python] Fix RowGroupInfo.statistics for empty row groups

2020-12-01 Thread GitBox
jorisvandenbossche commented on a change in pull request #8809: URL: https://github.com/apache/arrow/pull/8809#discussion_r533291547 ## File path: python/pyarrow/_dataset.pyx ## @@ -929,7 +929,8 @@ class RowGroupInfo: def name_stats(i): col =

[GitHub] [arrow] josiahyan commented on a change in pull request #8757: ARROW-8147: [C++] Add google-cloud-cpp to ThirdpartyToolchain

2020-12-01 Thread GitBox
josiahyan commented on a change in pull request #8757: URL: https://github.com/apache/arrow/pull/8757#discussion_r533463385 ## File path: ci/docker/conda-cpp.dockerfile ## @@ -73,6 +73,7 @@ ENV ARROW_BUILD_TESTS=ON \ ARROW_PARQUET=ON \ ARROW_PLASMA=ON \

[GitHub] [arrow] pitrou opened a new pull request #8811: ARROW-6883: [C++][Python] Allow writing dictionary deltas

2020-12-01 Thread GitBox
pitrou opened a new pull request #8811: URL: https://github.com/apache/arrow/pull/8811 * Add an ipc::IpcWriteOptions member to govern emission of dictionary deltas. If the option is enabled, deltas are detected by checking whether the new dictionary starts with the last
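
The delta check mentioned in this snippet can be sketched as a prefix comparison. This is a minimal illustration, assuming the truncated sentence means "the new dictionary starts with the previously emitted dictionary"; `dictionary_delta` is a hypothetical helper operating on plain Python sequences, not part of the Arrow C++ or pyarrow API.

```python
from typing import Optional, Sequence


def dictionary_delta(previous: Sequence, new: Sequence) -> Optional[list]:
    """Return the values appended to `previous`, or None if `new` does not
    start with `previous` (in which case a full replacement is needed)."""
    if len(new) < len(previous):
        return None
    if list(new[:len(previous)]) != list(previous):
        return None
    # Everything past the shared prefix is the delta.
    return list(new[len(previous):])


# The new dictionary extends the old one, so only the tail is a delta.
delta = dictionary_delta(["a", "b"], ["a", "b", "c"])
# The prefix differs, so no delta can be emitted.
replacement_needed = dictionary_delta(["a", "b"], ["a", "x", "c"]) is None
```

A `None` result corresponds to the case where the writer would have to send the whole dictionary again instead of a delta.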

[GitHub] [arrow] pitrou commented on pull request #8776: ARROW-5679: [Python][CI] Remove Python 3.5 support

2020-12-01 Thread GitBox
pitrou commented on pull request #8776: URL: https://github.com/apache/arrow/pull/8776#issuecomment-736622731 @xhochy, @kszucs, could one of you rewrite this?

[GitHub] [arrow] nevi-me commented on pull request #8800: ARROW-10767: [Rust] Speed up sum with nulls (non-simd)

2020-12-01 Thread GitBox
nevi-me commented on pull request #8800: URL: https://github.com/apache/arrow/pull/8800#issuecomment-736605119 I've rerun the integration CI job, we can merge this after we get confirmation

[GitHub] [arrow] github-actions[bot] commented on pull request #8811: ARROW-6883: [C++][Python] Allow writing dictionary deltas

2020-12-01 Thread GitBox
github-actions[bot] commented on pull request #8811: URL: https://github.com/apache/arrow/pull/8811#issuecomment-736628856 https://issues.apache.org/jira/browse/ARROW-6883

[GitHub] [arrow] josiahyan commented on pull request #8757: ARROW-8147: [C++] Add google-cloud-cpp to ThirdpartyToolchain

2020-12-01 Thread GitBox
josiahyan commented on pull request #8757: URL: https://github.com/apache/arrow/pull/8757#issuecomment-736638562 @github-actions rebase

[GitHub] [arrow] andygrove closed pull request #8803: ARROW-10772: [Rust] Speed up take by writing to buffer

2020-12-01 Thread GitBox
andygrove closed pull request #8803: URL: https://github.com/apache/arrow/pull/8803

[GitHub] [arrow] nevi-me closed pull request #8799: ARROW-10765: [Rust] Optimize take string for non-null arrays

2020-12-01 Thread GitBox
nevi-me closed pull request #8799: URL: https://github.com/apache/arrow/pull/8799

[GitHub] [arrow] xhochy closed pull request #8778: WIP: ARROW-10224

2020-12-01 Thread GitBox
xhochy closed pull request #8778: URL: https://github.com/apache/arrow/pull/8778

[GitHub] [arrow] andygrove closed pull request #8798: ARROW-10627: [Rust] Loosen cfg restrictions for wasm32

2020-12-01 Thread GitBox
andygrove closed pull request #8798: URL: https://github.com/apache/arrow/pull/8798