[GitHub] [arrow] kszucs commented on pull request #7767: ARROW-9453: [Rust] Wasm32 compilation support

2020-07-15 Thread GitBox
kszucs commented on pull request #7767: URL: https://github.com/apache/arrow/pull/7767#issuecomment-658946812 @rj-atw you can start here https://github.com/apache/arrow/blob/master/docs/source/developers/docker.rst I assume you'll need to adjust the

[GitHub] [arrow] kszucs removed a comment on pull request #7773: ARROW-9478: [C++] Improve error message for unsupported casts

2020-07-15 Thread GitBox
kszucs removed a comment on pull request #7773: URL: https://github.com/apache/arrow/pull/7773#issuecomment-658941654 @pitrou could you please force push the commit again? This is an automated message from the Apache Git

[GitHub] [arrow] wesm commented on pull request #7770: ARROW-9476: [C++][Dataset] Fix incorrect dictionary association in HivePartitioningFactory

2020-07-15 Thread GitBox
wesm commented on pull request #7770: URL: https://github.com/apache/arrow/pull/7770#issuecomment-658936364 +1

[GitHub] [arrow] bkietz closed pull request #7770: ARROW-9476: [C++][Dataset] Fix incorrect dictionary association in HivePartitioningFactory

2020-07-15 Thread GitBox
bkietz closed pull request #7770: URL: https://github.com/apache/arrow/pull/7770

[GitHub] [arrow] bkietz closed pull request #7545: ARROW-9139: [Python] Switch parquet.read_table to use new datasets API by default

2020-07-15 Thread GitBox
bkietz closed pull request #7545: URL: https://github.com/apache/arrow/pull/7545

[GitHub] [arrow] github-actions[bot] commented on pull request #7777: ARROW-9493: [Python] Enable dictionary encoding in read_table with datasets API

2020-07-15 Thread GitBox
github-actions[bot] commented on pull request #: URL: https://github.com/apache/arrow/pull/#issuecomment-658959419 Revision: 1760ce1f805c3005880496669e02da5f8bb4902c Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] jorgecarleitao commented on pull request #7751: ARROW-9461: [Rust] Fixed error in reading Date32 and Date64.

2020-07-15 Thread GitBox
jorgecarleitao commented on pull request #7751: URL: https://github.com/apache/arrow/pull/7751#issuecomment-659041261 > You mean we need Parquet file with the Date64 type? Yes. That would be the most reliable way to test this. Something equivalent to the Rust's counter-part of:

[GitHub] [arrow] jorgecarleitao commented on a change in pull request #7751: ARROW-9461: [Rust] Fixed error in reading Date32 and Date64.

2020-07-15 Thread GitBox
jorgecarleitao commented on a change in pull request #7751: URL: https://github.com/apache/arrow/pull/7751#discussion_r455229742 ## File path: rust/parquet/src/arrow/array_reader.rs ## @@ -196,11 +196,13 @@ impl ArrayReader for PrimitiveArrayReader {

[GitHub] [arrow] jorisvandenbossche opened a new pull request #7777: ARROW-xxxx: [Python] Enable dictionary encoding in read_table with datasets API

2020-07-15 Thread GitBox
jorisvandenbossche opened a new pull request #: URL: https://github.com/apache/arrow/pull/

[GitHub] [arrow] jorisvandenbossche commented on pull request #7777: ARROW-xxxx: [Python] Enable dictionary encoding in read_table with datasets API

2020-07-15 Thread GitBox
jorisvandenbossche commented on pull request #: URL: https://github.com/apache/arrow/pull/#issuecomment-658921118 (it's only the last commit, i'll rebase once https://github.com/apache/arrow/pull/7545 is merged)

[GitHub] [arrow] wesm commented on pull request #7772: ARROW-9484: [Docs] Update is* functions to be is_* in the compute docs

2020-07-15 Thread GitBox
wesm commented on pull request #7772: URL: https://github.com/apache/arrow/pull/7772#issuecomment-658925659 That seems to be a new flake I haven't seen

[GitHub] [arrow] wesm closed pull request #7772: ARROW-9484: [Docs] Update is* functions to be is_* in the compute docs

2020-07-15 Thread GitBox
wesm closed pull request #7772: URL: https://github.com/apache/arrow/pull/7772

[GitHub] [arrow] kszucs commented on pull request #7162: ARROW-6917: [Developer] Implement Python script to generate git cherry-pick commands needed to create patch build branch for maint releases

2020-07-15 Thread GitBox
kszucs commented on pull request #7162: URL: https://github.com/apache/arrow/pull/7162#issuecomment-658893947 > > @wesm I'm not sure how to add tests this easily, but we can certainly defer that to a follow-up. > > Well, the business logic that's unrelated to accessing remote data

[GitHub] [arrow] nealrichardson commented on pull request #7772: ARROW-9484: [Docs] Update is* functions to be is_* in the compute docs

2020-07-15 Thread GitBox
nealrichardson commented on pull request #7772: URL: https://github.com/apache/arrow/pull/7772#issuecomment-658894854 Is this a known flaky test in the macOS C++ build? https://github.com/apache/arrow/pull/7772/checks?check_run_id=874112924#step:6:1308

[GitHub] [arrow] kszucs commented on pull request #7162: ARROW-6917: [Archery][Release] Add support for JIRA curation, changelog generation and commit cherry-picking for maintenance releases

2020-07-15 Thread GitBox
kszucs commented on pull request #7162: URL: https://github.com/apache/arrow/pull/7162#issuecomment-658910886 Follow-ups: - Testing: https://issues.apache.org/jira/browse/ARROW-9487 - Update the post release website script to use the new changelog generation (will handle it after the

[GitHub] [arrow] jorisvandenbossche commented on pull request #7545: ARROW-9139: [Python] Switch parquet.read_table to use new datasets API by default

2020-07-15 Thread GitBox
jorisvandenbossche commented on pull request #7545: URL: https://github.com/apache/arrow/pull/7545#issuecomment-658921634 -> https://github.com/apache/arrow/pull/ (note that this will only impact direct users of `read_table` with partitioned datasets, which eg does not include

[GitHub] [arrow] martindurant commented on pull request #7545: ARROW-9139: [Python] Switch parquet.read_table to use new datasets API by default

2020-07-15 Thread GitBox
martindurant commented on pull request #7545: URL: https://github.com/apache/arrow/pull/7545#issuecomment-658919780 Thank you!

[GitHub] [arrow] sunchao commented on pull request #7751: ARROW-9461: [Rust] Fixed error in reading Date32 and Date64.

2020-07-15 Thread GitBox
sunchao commented on pull request #7751: URL: https://github.com/apache/arrow/pull/7751#issuecomment-658968604 > Date64 is an arrow specific format, and thus to test it we need to have a file in parquet with that format, since we do not have a writer yet to play around. I don't

[GitHub] [arrow] bkietz closed pull request #7777: ARROW-9493: [Python] Enable dictionary encoding in read_table with datasets API

2020-07-15 Thread GitBox
bkietz closed pull request #: URL: https://github.com/apache/arrow/pull/

[GitHub] [arrow] paddyhoran commented on a change in pull request #7767: ARROW-9453: [Rust] Wasm32 compilation support

2020-07-15 Thread GitBox
paddyhoran commented on a change in pull request #7767: URL: https://github.com/apache/arrow/pull/7767#discussion_r455256346 ## File path: rust/Cargo.toml ## @@ -24,3 +24,7 @@ members = [ "integration-testing", "benchmarks", ] +default-members= [ Review

[GitHub] [arrow] paddyhoran commented on pull request #7767: ARROW-9453: [Rust] Wasm32 compilation support

2020-07-15 Thread GitBox
paddyhoran commented on pull request #7767: URL: https://github.com/apache/arrow/pull/7767#issuecomment-658935926 > @paddyhoran I love to add the need CI. Do we have any documentation around this projects CI (especially how to setup env locally)? If you search for "rust" under the

[GitHub] [arrow] paddyhoran edited a comment on pull request #7767: ARROW-9453: [Rust] Wasm32 compilation support

2020-07-15 Thread GitBox
paddyhoran edited a comment on pull request #7767: URL: https://github.com/apache/arrow/pull/7767#issuecomment-658935926 > @paddyhoran I love to add the need CI. Do we have any documentation around this projects CI (especially how to setup env locally)? If you search for "rust"

[GitHub] [arrow] github-actions[bot] commented on pull request #7777: ARROW-9493: [Python] Enable dictionary encoding in read_table with datasets API

2020-07-15 Thread GitBox
github-actions[bot] commented on pull request #: URL: https://github.com/apache/arrow/pull/#issuecomment-658942953 https://issues.apache.org/jira/browse/ARROW-9493

[GitHub] [arrow] jorisvandenbossche commented on pull request #7545: ARROW-9139: [Python] Switch parquet.read_table to use new datasets API by default

2020-07-15 Thread GitBox
jorisvandenbossche commented on pull request #7545: URL: https://github.com/apache/arrow/pull/7545#issuecomment-658919336 @martindurant I am doing the follow-up PR as we speak

[GitHub] [arrow] nealrichardson closed pull request #7775: ARROW-9485: [R] Better shared library stripping

2020-07-15 Thread GitBox
nealrichardson closed pull request #7775: URL: https://github.com/apache/arrow/pull/7775

[GitHub] [arrow] kszucs closed pull request #7162: ARROW-6917: [Archery][Release] Add support for JIRA curation, changelog generation and commit cherry-picking for maintenance releases

2020-07-15 Thread GitBox
kszucs closed pull request #7162: URL: https://github.com/apache/arrow/pull/7162

[GitHub] [arrow] jorisvandenbossche commented on pull request #7777: ARROW-9493: [Python] Enable dictionary encoding in read_table with datasets API

2020-07-15 Thread GitBox
jorisvandenbossche commented on pull request #: URL: https://github.com/apache/arrow/pull/#issuecomment-658951890 @github-actions crossbow submit test-conda-python-3.7-pandas-master test-conda-python-3.7-kartothek-master test-conda-python-3.7-kartothek-latest

[GitHub] [arrow] wesm closed pull request #7769: ARROW-8521: [Release] Update CHANGELOG.md to include patch releases

2020-07-15 Thread GitBox
wesm closed pull request #7769: URL: https://github.com/apache/arrow/pull/7769

[GitHub] [arrow] kszucs closed pull request #7773: ARROW-9478: [C++] Improve error message for unsupported casts

2020-07-15 Thread GitBox
kszucs closed pull request #7773: URL: https://github.com/apache/arrow/pull/7773

[GitHub] [arrow] github-actions[bot] commented on pull request #7776: ARROW-9486: [C++][Dataset] Support implicit cast of InExpression::set to dict

2020-07-15 Thread GitBox
github-actions[bot] commented on pull request #7776: URL: https://github.com/apache/arrow/pull/7776#issuecomment-658896704 https://issues.apache.org/jira/browse/ARROW-9486

[GitHub] [arrow] jorgecarleitao commented on pull request #7751: ARROW-9461: [Rust] Fixed error in reading Date32 and Date64.

2020-07-15 Thread GitBox
jorgecarleitao commented on pull request #7751: URL: https://github.com/apache/arrow/pull/7751#issuecomment-658906785 > Thanks @jorgecarleitao . Great that the `Date32` is covered. Is it possible to add a test for `Date64` as well or it is also covered? Ideally we want to test the error

[GitHub] [arrow] nealrichardson commented on pull request #7775: ARROW-9485: [R] Better shared library stripping

2020-07-15 Thread GitBox
nealrichardson commented on pull request #7775: URL: https://github.com/apache/arrow/pull/7775#issuecomment-658917762 +1 Autobrew libs are down from 12.2mb to 8.9mb, so this is now working on macOS. And Linux builds are still trimmed.

[GitHub] [arrow] pitrou commented on a change in pull request #7776: ARROW-9486: [C++][Dataset] Support implicit cast of InExpression::set to dict

2020-07-15 Thread GitBox
pitrou commented on a change in pull request #7776: URL: https://github.com/apache/arrow/pull/7776#discussion_r455250939 ## File path: cpp/src/arrow/dataset/filter.cc ## @@ -931,6 +931,22 @@ Result> FieldExpression::Validate(const Schema& schema return null(); } +Result

[GitHub] [arrow] wesm closed pull request #7776: ARROW-9486: [C++][Dataset] Support implicit cast of InExpression::set to dict

2020-07-15 Thread GitBox
wesm closed pull request #7776: URL: https://github.com/apache/arrow/pull/7776

[GitHub] [arrow] kszucs commented on pull request #7773: ARROW-9478: [C++] Improve error message for unsupported casts

2020-07-15 Thread GitBox
kszucs commented on pull request #7773: URL: https://github.com/apache/arrow/pull/7773#issuecomment-658941654 @pitrou could you please force push the commit again?

[GitHub] [arrow] kszucs commented on pull request #7773: ARROW-9478: [C++] Improve error message for unsupported casts

2020-07-15 Thread GitBox
kszucs commented on pull request #7773: URL: https://github.com/apache/arrow/pull/7773#issuecomment-658961802 @pitrou it seems like the R bindings must be updated

[GitHub] [arrow] martindurant commented on pull request #7545: ARROW-9139: [Python] Switch parquet.read_table to use new datasets API by default

2020-07-15 Thread GitBox
martindurant commented on pull request #7545: URL: https://github.com/apache/arrow/pull/7545#issuecomment-658917048 (please please do ensure that dict encoding does happen, at least for str)

[GitHub] [arrow] bkietz commented on a change in pull request #7776: ARROW-9486: [C++][Dataset] Support implicit cast of InExpression::set to dict

2020-07-15 Thread GitBox
bkietz commented on a change in pull request #7776: URL: https://github.com/apache/arrow/pull/7776#discussion_r45526 ## File path: cpp/src/arrow/dataset/filter.cc ## @@ -931,6 +931,22 @@ Result> FieldExpression::Validate(const Schema& schema return null(); } +Result

[GitHub] [arrow] zhztheplayer opened a new pull request #7768: [ARROW-9475] Clean up usages of BaseAllocator, use BufferAllocator in…

2020-07-15 Thread GitBox
zhztheplayer opened a new pull request #7768: URL: https://github.com/apache/arrow/pull/7768 …stead Issue link: https://issues.apache.org/jira/browse/ARROW-9475.

[GitHub] [arrow] zhztheplayer commented on pull request #7768: [ARROW-9475] Clean up usages of BaseAllocator, use BufferAllocator in…

2020-07-15 Thread GitBox
zhztheplayer commented on pull request #7768: URL: https://github.com/apache/arrow/pull/7768#issuecomment-658597394 This patch is also supposed to be a dependency of #7030 ([ARROW-7808](https://issues.apache.org/jira/browse/ARROW-7808)).

[GitHub] [arrow] zhztheplayer commented on pull request #7030: ARROW-7808: [Java][Dataset] Implement Datasets Java API by JNI to C++

2020-07-15 Thread GitBox
zhztheplayer commented on pull request #7030: URL: https://github.com/apache/arrow/pull/7030#issuecomment-658605439 > I was suggesting a third approach: C++ memory pool that simply updates the direct memory info via Bits in java. It should be done chunk-wise to avoid excessive JNI

[GitHub] [arrow] github-actions[bot] commented on pull request #7768: ARROW-9475: [Java] Clean up usages of BaseAllocator, use BufferAllocator in…

2020-07-15 Thread GitBox
github-actions[bot] commented on pull request #7768: URL: https://github.com/apache/arrow/pull/7768#issuecomment-658605133 https://issues.apache.org/jira/browse/ARROW-9475

[GitHub] [arrow] xhochy commented on pull request #7706: ARROW-9409: [CI][Crossbow] Nightly conda-r fails

2020-07-15 Thread GitBox
xhochy commented on pull request #7706: URL: https://github.com/apache/arrow/pull/7706#issuecomment-658588366 I'm using this to setup my development environment. The conda recipe only has the run dependencies, that's useful for building the final package but not for developing things.

[GitHub] [arrow] sunchao commented on pull request #7751: ARROW-9461: [Rust] Fixed error in reading Date32 and Date64.

2020-07-15 Thread GitBox
sunchao commented on pull request #7751: URL: https://github.com/apache/arrow/pull/7751#issuecomment-658580751 Thanks @jorgecarleitao . LGTM but some simple tests will be appreciated.

[GitHub] [arrow] github-actions[bot] commented on pull request #7768: [ARROW-9475] Clean up usages of BaseAllocator, use BufferAllocator in…

2020-07-15 Thread GitBox
github-actions[bot] commented on pull request #7768: URL: https://github.com/apache/arrow/pull/7768#issuecomment-658592176 Thanks for opening a pull request! Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Then

[GitHub] [arrow] zhztheplayer commented on pull request #7030: ARROW-7808: [Java][Dataset] Implement Datasets Java API by JNI to C++

2020-07-15 Thread GitBox
zhztheplayer commented on pull request #7030: URL: https://github.com/apache/arrow/pull/7030#issuecomment-658593410 @jacques-n Addressing a previous comment: > We need to avoid this entirely. If you need some functionality, let's figure out what should be exposed.

[GitHub] [arrow] sunchao commented on a change in pull request #7767: ARROW-9453: [Rust] Wasm32 compilation support

2020-07-15 Thread GitBox
sunchao commented on a change in pull request #7767: URL: https://github.com/apache/arrow/pull/7767#discussion_r454841132 ## File path: rust/Cargo.toml ## @@ -24,3 +24,7 @@ members = [ "integration-testing", "benchmarks", ] +default-members= [ Review

[GitHub] [arrow] liyafan82 commented on pull request #7748: ARROW-9388: [C++] Division kernels

2020-07-15 Thread GitBox
liyafan82 commented on pull request #7748: URL: https://github.com/apache/arrow/pull/7748#issuecomment-658655201 > Thanks for starting this. I'm going to pull this down and make some changes per my comments @wesm Thanks a lot for your effort. Your changes look much more reasonable

[GitHub] [arrow] liyafan82 commented on pull request #7748: ARROW-9388: [C++] Division kernels

2020-07-15 Thread GitBox
liyafan82 commented on pull request #7748: URL: https://github.com/apache/arrow/pull/7748#issuecomment-658658261 > This can't be merged yet. Divide by zero in the unchecked case causes SIGFPE process crash. > > We should probably return null when dividing by zero, this is what
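
The suggestion quoted above (return null rather than crash when the divisor is zero) can be pictured with a minimal sketch. This is plain Python, not Arrow C++ code, and it is not taken from the pull request: the function name `divide_or_null` and the use of `None` to stand in for a null slot are assumptions made purely for illustration. Whether the merged kernel ends up behaving this way is not stated in the excerpt.

```python
from typing import List, Optional, Sequence


def divide_or_null(
    left: Sequence[Optional[int]],
    right: Sequence[Optional[int]],
) -> List[Optional[int]]:
    """Element-wise integer division that never raises on a zero divisor.

    Hypothetical helper (not an Arrow API): None models a null slot.
    A slot is null in the output if either input is null or the divisor is 0.
    """
    if len(left) != len(right):
        raise ValueError("inputs must have the same length")

    out: List[Optional[int]] = []
    for a, b in zip(left, right):
        if a is None or b is None or b == 0:
            # Propagate nulls and map division by zero to null instead of failing.
            out.append(None)
            continue
        # Truncate toward zero, as C++ integer division does
        # (Python's // operator would floor instead).
        q = abs(a) // abs(b)
        out.append(q if (a >= 0) == (b > 0) else -q)
    return out


print(divide_or_null([6, 7, None, -7], [3, 0, 2, 2]))
```

The point of the sketch is only the control flow: the zero check happens per element before dividing, so a single bad divisor produces one null slot rather than terminating the process.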

[GitHub] [arrow] xhochy commented on pull request #7758: ARROW-9469: [Python] Make more objects weakrefable

2020-07-15 Thread GitBox
xhochy commented on pull request #7758: URL: https://github.com/apache/arrow/pull/7758#issuecomment-658676195 Sounds like a good idea, CI is failing though.

[GitHub] [arrow] jorisvandenbossche commented on pull request #7545: ARROW-9139: [Python] Switch parquet.read_table to use new datasets API by default

2020-07-15 Thread GitBox
jorisvandenbossche commented on pull request #7545: URL: https://github.com/apache/arrow/pull/7545#issuecomment-658662358 To clarify: - The current PR right now doesn't use dictionary encoding for any type of partition fields, so also not for strings - For strings I could rather

[GitHub] [arrow] pitrou commented on a change in pull request #7770: ARROW-9476: [C++][Dataset] Fix incorrect dictionary association in HivePartitioningFactory

2020-07-15 Thread GitBox
pitrou commented on a change in pull request #7770: URL: https://github.com/apache/arrow/pull/7770#discussion_r455064306 ## File path: cpp/src/arrow/dataset/filter.cc ## @@ -772,13 +772,10 @@ std::shared_ptr and_(std::shared_ptr lhs, } std::shared_ptr and_(const

[GitHub] [arrow] jorisvandenbossche commented on pull request #7770: ARROW-9476: [C++][Dataset] Fix incorrect dictionary association in HivePartitioningFactory

2020-07-15 Thread GitBox
jorisvandenbossche commented on pull request #7770: URL: https://github.com/apache/arrow/pull/7770#issuecomment-658783140 I can approve the dataset test, but I wrote it though .. ;)

[GitHub] [arrow] bkietz opened a new pull request #7770: ARROW-9476: [C++][Dataset] Fix incorrect dictionary association in HivePartitioningFactory

2020-07-15 Thread GitBox
bkietz opened a new pull request #7770: URL: https://github.com/apache/arrow/pull/7770 In the presence of multiple fields, it was possible (non-deterministically) that the field->dictionary association could be scrambled.

[GitHub] [arrow] github-actions[bot] commented on pull request #7770: ARROW-9476: [C++][Dataset] Fix incorrect dictionary association in HivePartitioningFactory

2020-07-15 Thread GitBox
github-actions[bot] commented on pull request #7770: URL: https://github.com/apache/arrow/pull/7770#issuecomment-658766032 https://issues.apache.org/jira/browse/ARROW-9476

[GitHub] [arrow] liyafan82 commented on a change in pull request #7748: ARROW-9388: [C++] Division kernels

2020-07-15 Thread GitBox
liyafan82 commented on a change in pull request #7748: URL: https://github.com/apache/arrow/pull/7748#discussion_r454956210 ## File path: cpp/src/arrow/compute/api_scalar.h ## @@ -130,6 +130,19 @@ Result Multiply(const Datum& left, const Datum& right,

[GitHub] [arrow] liyafan82 commented on a change in pull request #7748: ARROW-9388: [C++] Division kernels

2020-07-15 Thread GitBox
liyafan82 commented on a change in pull request #7748: URL: https://github.com/apache/arrow/pull/7748#discussion_r454958337 ## File path: cpp/src/arrow/compute/kernels/scalar_arithmetic.cc ## @@ -219,48 +219,77 @@ struct MultiplyChecked { } }; +struct Divide { +

[GitHub] [arrow] liyafan82 commented on a change in pull request #7748: ARROW-9388: [C++] Division kernels

2020-07-15 Thread GitBox
liyafan82 commented on a change in pull request #7748: URL: https://github.com/apache/arrow/pull/7748#discussion_r454957895 ## File path: cpp/src/arrow/compute/kernels/codegen_internal.h ## @@ -675,11 +704,80 @@ struct ScalarBinary { } }; +// An alternative to

[GitHub] [arrow] rymurr commented on a change in pull request #7768: ARROW-9475: [Java] Clean up usages of BaseAllocator, use BufferAllocator in…

2020-07-15 Thread GitBox
rymurr commented on a change in pull request #7768: URL: https://github.com/apache/arrow/pull/7768#discussion_r454964570 ## File path: java/memory/memory-core/src/main/java/org/apache/arrow/memory/BufferAllocator.java ## @@ -126,6 +134,30 @@ BufferAllocator newChildAllocator(

[GitHub] [arrow] jorisvandenbossche commented on pull request #7545: ARROW-9139: [Python] Switch parquet.read_table to use new datasets API by default

2020-07-15 Thread GitBox
jorisvandenbossche commented on pull request #7545: URL: https://github.com/apache/arrow/pull/7545#issuecomment-658714201 When enabling dictionary encoding for string partition fields, there are actually a bunch of failing tests .. Eg this one (based on

[GitHub] [arrow] kszucs commented on pull request #7162: ARROW-6917: [Developer] Implement Python script to generate git cherry-pick commands needed to create patch build branch for maint releases

2020-07-15 Thread GitBox
kszucs commented on pull request #7162: URL: https://github.com/apache/arrow/pull/7162#issuecomment-658717702 Also updated the changelog generation and removed the other two implementation. @wesm I'm not sure how to add tests this `easily`, but we can certainly defer that to a

[GitHub] [arrow] nealrichardson opened a new pull request #7772: ARROW-9484: [Docs] Update is* functions to be is_* in the compute docs

2020-07-15 Thread GitBox
nealrichardson opened a new pull request #7772: URL: https://github.com/apache/arrow/pull/7772 Also snuck in a little r/README.md cleanup

[GitHub] [arrow] nealrichardson commented on pull request #7769: ARROW-8521: [Release] Update CHANGELOG.md including patch releases

2020-07-15 Thread GitBox
nealrichardson commented on pull request #7769: URL: https://github.com/apache/arrow/pull/7769#issuecomment-658839577 I agree that the only consistent distinction we use is bug vs. not bug. And even if we had a meaningful distinction between new feature and improvement, I'm not sure a

[GitHub] [arrow] pitrou commented on a change in pull request #7772: ARROW-9484: [Docs] Update is* functions to be is_* in the compute docs

2020-07-15 Thread GitBox
pitrou commented on a change in pull request #7772: URL: https://github.com/apache/arrow/pull/7772#discussion_r455146862 ## File path: docs/source/cpp/compute.rst ## @@ -263,56 +263,56 @@ given class:

[GitHub] [arrow] pitrou opened a new pull request #7773: ARROW-9478: [C++] Improve error message for unsupported casts

2020-07-15 Thread GitBox
pitrou opened a new pull request #7773: URL: https://github.com/apache/arrow/pull/7773 Mention both input type and target type, as far as possible.

[GitHub] [arrow] nealrichardson commented on a change in pull request #7772: ARROW-9484: [Docs] Update is* functions to be is_* in the compute docs

2020-07-15 Thread GitBox
nealrichardson commented on a change in pull request #7772: URL: https://github.com/apache/arrow/pull/7772#discussion_r455149121 ## File path: docs/source/cpp/compute.rst ## @@ -263,56 +263,56 @@ given class:

[GitHub] [arrow] pitrou commented on a change in pull request #7772: ARROW-9484: [Docs] Update is* functions to be is_* in the compute docs

2020-07-15 Thread GitBox
pitrou commented on a change in pull request #7772: URL: https://github.com/apache/arrow/pull/7772#discussion_r455153059 ## File path: docs/source/cpp/compute.rst ## @@ -263,56 +263,56 @@ given class:

[GitHub] [arrow] github-actions[bot] commented on pull request #7772: ARROW-9484: [Docs] Update is* functions to be is_* in the compute docs

2020-07-15 Thread GitBox
github-actions[bot] commented on pull request #7772: URL: https://github.com/apache/arrow/pull/7772#issuecomment-658844488 https://issues.apache.org/jira/browse/ARROW-9484

[GitHub] [arrow] github-actions[bot] commented on pull request #7773: ARROW-9478: [C++] Improve error message for unsupported casts

2020-07-15 Thread GitBox
github-actions[bot] commented on pull request #7773: URL: https://github.com/apache/arrow/pull/7773#issuecomment-658844487 https://issues.apache.org/jira/browse/ARROW-9478

[GitHub] [arrow] pitrou commented on a change in pull request #7772: ARROW-9484: [Docs] Update is* functions to be is_* in the compute docs

2020-07-15 Thread GitBox
pitrou commented on a change in pull request #7772: URL: https://github.com/apache/arrow/pull/7772#discussion_r455153729 ## File path: docs/source/cpp/compute.rst ## @@ -263,56 +263,56 @@ given class:

[GitHub] [arrow] pitrou commented on a change in pull request #7772: ARROW-9484: [Docs] Update is* functions to be is_* in the compute docs

2020-07-15 Thread GitBox
pitrou commented on a change in pull request #7772: URL: https://github.com/apache/arrow/pull/7772#discussion_r455153953 ## File path: docs/source/cpp/compute.rst ## @@ -263,56 +263,56 @@ given class:

[GitHub] [arrow] nealrichardson commented on a change in pull request #7772: ARROW-9484: [Docs] Update is* functions to be is_* in the compute docs

2020-07-15 Thread GitBox
nealrichardson commented on a change in pull request #7772: URL: https://github.com/apache/arrow/pull/7772#discussion_r455153673 ## File path: docs/source/cpp/compute.rst ## @@ -263,56 +263,56 @@ given class:

[GitHub] [arrow] pitrou commented on a change in pull request #7772: ARROW-9484: [Docs] Update is* functions to be is_* in the compute docs

2020-07-15 Thread GitBox
pitrou commented on a change in pull request #7772: URL: https://github.com/apache/arrow/pull/7772#discussion_r455156052 ## File path: docs/source/cpp/compute.rst ## @@ -263,56 +263,56 @@ given class:

[GitHub] [arrow] o0Ignition0o opened a new pull request #7774: Target thrift master until the apache team releases a new version

2020-07-15 Thread GitBox
o0Ignition0o opened a new pull request #7774: URL: https://github.com/apache/arrow/pull/7774

[GitHub] [arrow] pitrou commented on pull request #7772: ARROW-9484: [Docs] Update is* functions to be is_* in the compute docs

2020-07-15 Thread GitBox
pitrou commented on pull request #7772: URL: https://github.com/apache/arrow/pull/7772#issuecomment-658849775 PR diff can be viewed at https://gist.github.com/pitrou/25dabac3b79ca8c17e5d261ddad203df

[GitHub] [arrow] wesm commented on pull request #7772: ARROW-9484: [Docs] Update is* functions to be is_* in the compute docs

2020-07-15 Thread GitBox
wesm commented on pull request #7772: URL: https://github.com/apache/arrow/pull/7772#issuecomment-658850617 +1 from me

[GitHub] [arrow] wesm commented on a change in pull request #7773: ARROW-9478: [C++] Improve error message for unsupported casts

2020-07-15 Thread GitBox
wesm commented on a change in pull request #7773: URL: https://github.com/apache/arrow/pull/7773#discussion_r455159415 ## File path: cpp/src/arrow/visitor_inline.h ## @@ -94,6 +94,25 @@ inline Status VisitTypeInline(const DataType& type, VISITOR* visitor) { #undef

[GitHub] [arrow] github-actions[bot] commented on pull request #7774: Target thrift master until the apache team releases a new version

2020-07-15 Thread GitBox
github-actions[bot] commented on pull request #7774: URL: https://github.com/apache/arrow/pull/7774#issuecomment-658852715 Thanks for opening a pull request! Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Then

[GitHub] [arrow] jorgecarleitao commented on pull request #7751: ARROW-9461: [Rust] Fixed error in reading Date32 and Date64.

2020-07-15 Thread GitBox
jorgecarleitao commented on pull request #7751: URL: https://github.com/apache/arrow/pull/7751#issuecomment-658854948 Thanks @andygrove and @sunchao for taking the time to look at this. I had to [change an existing

[GitHub] [arrow] pitrou commented on a change in pull request #7772: ARROW-9484: [Docs] Update is* functions to be is_* in the compute docs

2020-07-15 Thread GitBox
pitrou commented on a change in pull request #7772: URL: https://github.com/apache/arrow/pull/7772#discussion_r455166892 ## File path: docs/source/cpp/compute.rst ## @@ -263,56 +263,56 @@ given class:

[GitHub] [arrow] nealrichardson commented on a change in pull request #7772: ARROW-9484: [Docs] Update is* functions to be is_* in the compute docs

2020-07-15 Thread GitBox
nealrichardson commented on a change in pull request #7772: URL: https://github.com/apache/arrow/pull/7772#discussion_r455169961 ## File path: docs/source/cpp/compute.rst ## @@ -263,56 +263,56 @@ given class:

[GitHub] [arrow] o0Ignition0o closed pull request #7774: Target thrift master until the apache team releases a new version

2020-07-15 Thread GitBox
o0Ignition0o closed pull request #7774: URL: https://github.com/apache/arrow/pull/7774 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go

[GitHub] [arrow] o0Ignition0o commented on pull request #7774: Target thrift master until the apache team releases a new version

2020-07-15 Thread GitBox
o0Ignition0o commented on pull request #7774: URL: https://github.com/apache/arrow/pull/7774#issuecomment-658862234 Sorry, I didn't mean to open this one upstream :sweat_smile: This is an automated message from the Apache Git

[GitHub] [arrow] kszucs edited a comment on pull request #7162: ARROW-6917: [Developer] Implement Python script to generate git cherry-pick commands needed to create patch build branch for maint relea

2020-07-15 Thread GitBox
kszucs edited a comment on pull request #7162: URL: https://github.com/apache/arrow/pull/7162#issuecomment-658717702 Also updated the changelog generation and removed the other two implementations. @wesm I'm not sure how to add tests this `easily`, but we can certainly defer that

[GitHub] [arrow] jorisvandenbossche commented on pull request #7545: ARROW-9139: [Python] Switch parquet.read_table to use new datasets API by default

2020-07-15 Thread GitBox
jorisvandenbossche commented on pull request #7545: URL: https://github.com/apache/arrow/pull/7545#issuecomment-658728008 The existing tests are already failing (the above reproducible snippets were based on those), *if* the dictionary encoding gets enabled. But I can write a

[GitHub] [arrow] liyafan82 commented on a change in pull request #7748: ARROW-9388: [C++] Division kernels

2020-07-15 Thread GitBox
liyafan82 commented on a change in pull request #7748: URL: https://github.com/apache/arrow/pull/7748#discussion_r454955654 ## File path: cpp/src/arrow/compute/api_scalar.h ## @@ -130,6 +130,19 @@ Result Multiply(const Datum& left, const Datum& right,

[GitHub] [arrow] liyafan82 commented on a change in pull request #7748: ARROW-9388: [C++] Division kernels

2020-07-15 Thread GitBox
liyafan82 commented on a change in pull request #7748: URL: https://github.com/apache/arrow/pull/7748#discussion_r454959287 ## File path: cpp/src/arrow/compute/kernels/scalar_cast_test.cc ## @@ -169,7 +169,6 @@ class TestCast : public TestBase {

[GitHub] [arrow] kszucs opened a new pull request #7769: [Release] Update CHANGELOG.md including patch releases

2020-07-15 Thread GitBox
kszucs opened a new pull request #7769: URL: https://github.com/apache/arrow/pull/7769 Generated by the following command from #7162:
```bash
archery release --jira-cache /tmp/archery-cache changelog regenerate
```

[GitHub] [arrow] pitrou commented on pull request #7545: ARROW-9139: [Python] Switch parquet.read_table to use new datasets API by default

2020-07-15 Thread GitBox
pitrou commented on pull request #7545: URL: https://github.com/apache/arrow/pull/7545#issuecomment-658721203 @jorisvandenbossche Can you at least add minimal failing tests in the PR? This is an automated message from the

[GitHub] [arrow] github-actions[bot] commented on pull request #7769: [Release] Update CHANGELOG.md including patch releases

2020-07-15 Thread GitBox
github-actions[bot] commented on pull request #7769: URL: https://github.com/apache/arrow/pull/7769#issuecomment-658711764 Thanks for opening a pull request! Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Then

[GitHub] [arrow] kszucs commented on pull request #7765: ARROW-9399: [C++] Add forward compatibility test to detect and raise error for future MetadataVersion

2020-07-15 Thread GitBox
kszucs commented on pull request #7765: URL: https://github.com/apache/arrow/pull/7765#issuecomment-658711897 > @kszucs `ARROW_TEST_DATA` is not being set in some of the Ursabot builders Noted, adding it later today.

[GitHub] [arrow] jorisvandenbossche commented on pull request #7545: ARROW-9139: [Python] Switch parquet.read_table to use new datasets API by default

2020-07-15 Thread GitBox
jorisvandenbossche commented on pull request #7545: URL: https://github.com/apache/arrow/pull/7545#issuecomment-658720185 A bit simplified example: ```python import numpy as np import pyarrow as pa import pyarrow.parquet as pq import pyarrow.dataset as ds foo_keys

[GitHub] [arrow] pitrou commented on pull request #7758: ARROW-9469: [Python] Make more objects weakrefable

2020-07-15 Thread GitBox
pitrou commented on pull request #7758: URL: https://github.com/apache/arrow/pull/7758#issuecomment-658719729 The remaining build failures look unrelated. This is an automated message from the Apache Git Service. To respond

[GitHub] [arrow] github-actions[bot] commented on pull request #7771: Fix Table.from for zero-item serialized tables, Table.empty for schemas containing compound types (List, FixedSizeList, Map)

2020-07-15 Thread GitBox
github-actions[bot] commented on pull request #7771: URL: https://github.com/apache/arrow/pull/7771#issuecomment-658801381 Thanks for opening a pull request! Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Then

[GitHub] [arrow] jorisvandenbossche commented on pull request #7545: ARROW-9139: [Python] Switch parquet.read_table to use new datasets API by default

2020-07-15 Thread GitBox
jorisvandenbossche commented on pull request #7545: URL: https://github.com/apache/arrow/pull/7545#issuecomment-658806394 @github-actions crossbow submit test-conda-python-3.7-pandas-master test-conda-python-3.7-kartothek-master test-conda-python-3.7-kartothek-latest

[GitHub] [arrow] wesm commented on pull request #7769: ARROW-8521: [Release] Update CHANGELOG.md including patch releases

2020-07-15 Thread GitBox
wesm commented on pull request #7769: URL: https://github.com/apache/arrow/pull/7769#issuecomment-658817076 Based on what I can tell people aren't consistent about classifying things as Improvements vs. New Features (and sometimes the distinction is unclear), so I feel like those

[GitHub] [arrow] nealrichardson commented on pull request #7162: ARROW-6917: [Developer] Implement Python script to generate git cherry-pick commands needed to create patch build branch for maint rele

2020-07-15 Thread GitBox
nealrichardson commented on pull request #7162: URL: https://github.com/apache/arrow/pull/7162#issuecomment-658831477 I see on #7769 you called `archery release --jira-cache /tmp/archery-cache changelog regenerate`. Could you please document this somewhere? It's a bit non-obvious.

[GitHub] [arrow] H-Plus-Time opened a new pull request #7771: Fix Table.from for zero-item serialized tables, Table.empty for schemas containing compound types (List, FixedSizeList, Map)

2020-07-15 Thread GitBox
H-Plus-Time opened a new pull request #7771: URL: https://github.com/apache/arrow/pull/7771 Steps for reproduction:
```js
const foo = new arrow.List(new arrow.Field('bar', new arrow.Float64()))
const table = arrow.Table.empty(foo) // ⚡
```
The Data constructor assumes

[GitHub] [arrow] github-actions[bot] commented on pull request #7545: ARROW-9139: [Python] Switch parquet.read_table to use new datasets API by default

2020-07-15 Thread GitBox
github-actions[bot] commented on pull request #7545: URL: https://github.com/apache/arrow/pull/7545#issuecomment-658807841 Revision: 5d25c02ae678657c149fa307010339c43656eff6 Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] wesm commented on pull request #7162: ARROW-6917: [Developer] Implement Python script to generate git cherry-pick commands needed to create patch build branch for maint releases

2020-07-15 Thread GitBox
wesm commented on pull request #7162: URL: https://github.com/apache/arrow/pull/7162#issuecomment-658818850 > @wesm I'm not sure how to add tests this easily, but we can certainly defer that to a follow-up. Well, the business logic that's unrelated to accessing remote data could be
