[GitHub] [arrow] markhildreth commented on pull request #6972: ARROW-8287: [Rust] Add "pretty" util to help with printing tabular output of RecordBatches

2020-04-24 Thread GitBox
markhildreth commented on pull request #6972: URL: https://github.com/apache/arrow/pull/6972#issuecomment-619215502 Created [follow-up JIRA task](https://issues.apache.org/jira/browse/ARROW-8590). This is an automated

[GitHub] [arrow] vertexclique opened a new pull request #7036: ARROW-8591: [Rust] Reverse lookup for a key in DictionaryArray

2020-04-24 Thread GitBox
vertexclique opened a new pull request #7036: URL: https://github.com/apache/arrow/pull/7036 This PR enables reverse lookup for already built dict. This is an automated message from the Apache Git Service. To respond to the
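
As an illustration only (the PR description above is truncated and does not show its API), a reverse lookup over an already built dictionary could be sketched as follows. The name `lookup_key` and the plain slice-of-strings dictionary are assumptions made for this sketch; they are not the arrow crate's `DictionaryArray` interface.

```rust
/// Hypothetical helper, not part of the arrow crate: given the values of an
/// already built dictionary, return the key (index) that maps to `value`.
fn lookup_key(dictionary: &[&str], value: &str) -> Option<usize> {
    // Linear scan over the dictionary values; the first match wins.
    dictionary.iter().position(|entry| *entry == value)
}
```

Calling `lookup_key(&["a", "b", "c"], "b")` yields the key of `"b"`, and a value that is absent from the dictionary yields `None`.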

[GitHub] [arrow] zgramana opened a new pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values

2020-04-24 Thread GitBox
zgramana opened a new pull request #7032: URL: https://github.com/apache/arrow/pull/7032 Takes an alternative approach to completing [ARROW-6603](https://issues.apache.org/jira/browse/ARROW-6603) that is in-line with the current API and with other Arrow implementations. More

[GitHub] [arrow] mayuropensource commented on pull request #7022: ARROW-8562: [C++] IO: Parameterize I/O Coalescing using S3 metrics

2020-04-24 Thread GitBox
mayuropensource commented on pull request #7022: URL: https://github.com/apache/arrow/pull/7022#issuecomment-619184138 @fsaintjacques, I can try to put together a python script using boto to determine the S3 metrics. Will that work for you?

[GitHub] [arrow] nealrichardson commented on issue #7034: R arrow package can't see arrow-cpp installation

2020-04-24 Thread GitBox
nealrichardson commented on issue #7034: URL: https://github.com/apache/arrow/issues/7034#issuecomment-619210756 We don't do any testing on NixOS, so it's not surprising that it doesn't just work. http://arrow.apache.org/docs/r/articles/install.html describes how dependencies are

[GitHub] [arrow] github-actions[bot] commented on pull request #7031: ARROW-8587: [C++] Fix linking Flight benchmarks

2020-04-24 Thread GitBox
github-actions[bot] commented on pull request #7031: URL: https://github.com/apache/arrow/pull/7031#issuecomment-619156566 https://issues.apache.org/jira/browse/ARROW-8587

[GitHub] [arrow] github-actions[bot] commented on pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values

2020-04-24 Thread GitBox
github-actions[bot] commented on pull request #7032: URL: https://github.com/apache/arrow/pull/7032#issuecomment-619164076 https://issues.apache.org/jira/browse/ARROW-6603

[GitHub] [arrow] nevi-me commented on pull request #7024: ARROW-8573: [Rust] Upgrade Rust to 1.44 nightly

2020-04-24 Thread GitBox
nevi-me commented on pull request #7024: URL: https://github.com/apache/arrow/pull/7024#issuecomment-619189091 @paddyhoran we might have to try a different nightly, as sometimes a day's version might have no rustfmt. The change I made in that PR installs a nightly version, I don't know

[GitHub] [arrow] bkietz opened a new pull request #7033: ARROW-7759: [C++][Dataset] Add CsvFileFormat

2020-04-24 Thread GitBox
bkietz opened a new pull request #7033: URL: https://github.com/apache/arrow/pull/7033

[GitHub] [arrow] github-actions[bot] commented on pull request #7035: ARROW-8590: [Rust] Use arrow crate pretty util in DataFusion

2020-04-24 Thread GitBox
github-actions[bot] commented on pull request #7035: URL: https://github.com/apache/arrow/pull/7035#issuecomment-619219686 https://issues.apache.org/jira/browse/ARROW-8590

[GitHub] [arrow] github-actions[bot] commented on pull request #7036: ARROW-8591: [Rust] Reverse lookup for a key in DictionaryArray

2020-04-24 Thread GitBox
github-actions[bot] commented on pull request #7036: URL: https://github.com/apache/arrow/pull/7036#issuecomment-619219685 https://issues.apache.org/jira/browse/ARROW-8591

[GitHub] [arrow] zgramana commented on pull request #6121: ARROW-6603: [C#] - Nullable Array Support

2020-04-24 Thread GitBox
zgramana commented on pull request #6121: URL: https://github.com/apache/arrow/pull/6121#issuecomment-619162558 @eerhardt I've just submitted https://github.com/apache/arrow/pull/7032 for review/discussion

[GitHub] [arrow] durch opened a new pull request #7042: ARROW-8597 [Rust] Lints and readability improvements for arrow crate

2020-04-26 Thread GitBox
durch opened a new pull request #7042: URL: https://github.com/apache/arrow/pull/7042 + Pedantic fixes to `unsafe` + Changes to function arguments to pass in references or values as appropriate + Refactor pointer arithmetic to use `usize` instead of `isize` casting + Ignore
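
The change from `isize` casting to `usize` in pointer arithmetic can be pictured with a small sketch. This is not code from the PR (its diff is not shown above); `read_at` is a hypothetical function that only contrasts the standard library's `add`, which takes a `usize`, with `offset`, which takes an `isize`.

```rust
/// Hypothetical example: read element `i` of a slice through a raw pointer.
fn read_at(values: &[i32], i: usize) -> i32 {
    // Bounds check first, so the unsafe read below is sound.
    assert!(i < values.len(), "index out of bounds");
    // SAFETY: `i` is within the slice, so `as_ptr().add(i)` points at a
    // valid, initialized element. This is equivalent to
    // `*values.as_ptr().offset(i as isize)` without the signed cast.
    unsafe { *values.as_ptr().add(i) }
}
```

Using `add` keeps the index unsigned end to end, which avoids a cast that could silently wrap for very large values.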

[GitHub] [arrow] github-actions[bot] commented on pull request #7042: ARROW-8597 [Rust] Lints and readability improvements for arrow crate

2020-04-26 Thread GitBox
github-actions[bot] commented on pull request #7042: URL: https://github.com/apache/arrow/pull/7042#issuecomment-619599024 https://issues.apache.org/jira/browse/ARROW-8597

[GitHub] [arrow] github-actions[bot] commented on issue #7014: Go: Minor change to make newBuilder public to aid upstream

2020-04-22 Thread GitBox
github-actions[bot] commented on issue #7014: URL: https://github.com/apache/arrow/pull/7014#issuecomment-617881461 Thanks for opening a pull request! Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Then could you

[GitHub] [arrow] kiszk commented on issue #6954: ARROW-8440: [C++] Refine SIMD header files

2020-04-22 Thread GitBox
kiszk commented on issue #6954: URL: https://github.com/apache/arrow/pull/6954#issuecomment-617854855 Is the function `Armv8CrcHashParallel` uses somewhere? Sorry if I overlook it.

[GitHub] [arrow] kiszk edited a comment on issue #6954: ARROW-8440: [C++] Refine SIMD header files

2020-04-22 Thread GitBox
kiszk edited a comment on issue #6954: URL: https://github.com/apache/arrow/pull/6954#issuecomment-617854855 Is the function `Armv8CrcHashParallel` used somewhere? Sorry if I overlook it.

[GitHub] [arrow] kszucs commented on issue #7000: ARROW-8065: [C++][Dataset] Refactor ScanOptions and Fragment relation

2020-04-22 Thread GitBox
kszucs commented on issue #7000: URL: https://github.com/apache/arrow/pull/7000#issuecomment-617868408 @ursabot build

[GitHub] [arrow] mindhash opened a new pull request #7014: Go: Minor change to make newBuilder public to aid upstream

2020-04-22 Thread GitBox
mindhash opened a new pull request #7014: URL: https://github.com/apache/arrow/pull/7014 Hello team, This minor change makes newBuilder() public to reduce verbosity in upstream. To give you example, I am working on a parquet read / write into Arrow Record batch where the parquet

[GitHub] [arrow] pitrou commented on a change in pull request #6954: ARROW-8440: [C++] Refine SIMD header files

2020-04-22 Thread GitBox
pitrou commented on a change in pull request #6954: URL: https://github.com/apache/arrow/pull/6954#discussion_r413092039 ## File path: cpp/src/arrow/util/hash_util.h ## @@ -27,39 +27,27 @@ #include "arrow/util/logging.h" #include "arrow/util/macros.h" -#include

[GitHub] [arrow] davidanthoff commented on a change in pull request #7001: Use lowercase ws2_32 everywhere

2020-04-21 Thread GitBox
davidanthoff commented on a change in pull request #7001: URL: https://github.com/apache/arrow/pull/7001#discussion_r412497174 ## File path: cpp/cmake_modules/FindThrift.cmake ## @@ -100,7 +100,7 @@ if(Thrift_FOUND OR THRIFT_FOUND)

[GitHub] [arrow] kszucs commented on issue #6883: Prepare for the release candidate

2020-04-21 Thread GitBox
kszucs commented on issue #6883: URL: https://github.com/apache/arrow/pull/6883#issuecomment-617350272 The release is out, we can close this PR.

[GitHub] [arrow] paddyhoran opened a new pull request #7004: ARROW-3827: [Rust] Implement UnionArray Updated

2020-04-21 Thread GitBox
paddyhoran opened a new pull request #7004: URL: https://github.com/apache/arrow/pull/7004 Replaces #6209 due to git issues. Implements UnionArray. This PR was getting too big as it was so I will address the following as follow up PR's: ARROW-8546 ARROW-8547 Note

[GitHub] [arrow] kou commented on a change in pull request #7001: Use lowercase ws2_32 everywhere

2020-04-21 Thread GitBox
kou commented on a change in pull request #7001: URL: https://github.com/apache/arrow/pull/7001#discussion_r412492656 ## File path: cpp/cmake_modules/FindThrift.cmake ## @@ -100,7 +100,7 @@ if(Thrift_FOUND OR THRIFT_FOUND)

[GitHub] [arrow] fsaintjacques commented on a change in pull request #7000: ARROW-8065: [C++][Dataset] Refactor ScanOptions and Fragment relation

2020-04-21 Thread GitBox
fsaintjacques commented on a change in pull request #7000: URL: https://github.com/apache/arrow/pull/7000#discussion_r412407398 ## File path: cpp/src/arrow/dataset/dataset.cc ## @@ -72,36 +78,15 @@ Result> Dataset::NewScan() { return NewScan(std::make_shared()); } -bool

[GitHub] [arrow] wesm commented on a change in pull request #6744: PARQUET-1820: [C++] pre-buffer specified columns of row group

2020-04-21 Thread GitBox
wesm commented on a change in pull request #6744: URL: https://github.com/apache/arrow/pull/6744#discussion_r412474320 ## File path: cpp/src/parquet/file_reader.h ## @@ -117,6 +117,15 @@ class PARQUET_EXPORT ParquetFileReader { // Returns the file metadata. Only one

[GitHub] [arrow] github-actions[bot] commented on issue #7005: ARROW-8550: [CI] Don't run cron GHA jobs on forks

2020-04-21 Thread GitBox
github-actions[bot] commented on issue #7005: URL: https://github.com/apache/arrow/pull/7005#issuecomment-617456224 https://issues.apache.org/jira/browse/ARROW-8550

[GitHub] [arrow] bkietz commented on a change in pull request #7000: ARROW-8065: [C++][Dataset] Refactor ScanOptions and Fragment relation

2020-04-21 Thread GitBox
bkietz commented on a change in pull request #7000: URL: https://github.com/apache/arrow/pull/7000#discussion_r412252930 ## File path: cpp/src/arrow/dataset/dataset.h ## @@ -84,13 +82,12 @@ class ARROW_DS_EXPORT Fragment { class ARROW_DS_EXPORT InMemoryFragment : public

[GitHub] [arrow] github-actions[bot] commented on issue #7004: ARROW-3827: [Rust] Implement UnionArray Updated

2020-04-21 Thread GitBox
github-actions[bot] commented on issue #7004: URL: https://github.com/apache/arrow/pull/7004#issuecomment-617398778 https://issues.apache.org/jira/browse/ARROW-3827

[GitHub] [arrow] paddyhoran commented on issue #6306: ARROW-7705: [Rust] Initial sort implementation

2020-04-21 Thread GitBox
paddyhoran commented on issue #6306: URL: https://github.com/apache/arrow/pull/6306#issuecomment-617402123 @nevi-me this needs a rebase now. Once you do that, I'll take a look so we can get this merged.

[GitHub] [arrow] kiszk commented on issue #6981: PARQUET-1845: [C++] Add expected results of Int96 in big-endian

2020-04-21 Thread GitBox
kiszk commented on issue #6981: URL: https://github.com/apache/arrow/pull/6981#issuecomment-617369335 After the 1-day investigation, I knew the implementation looks a little complicated. Regarding encoding, `TypedBufferBuilder` is used in some test cases, but it is not in some test

[GitHub] [arrow] working-estimate opened a new issue #7003: from pyarrow import parquet fails with AttributeError: type object 'pyarrow._parquet.Statistics' has no attribute '__reduce_cython__'

2020-04-21 Thread GitBox
working-estimate opened a new issue #7003: URL: https://github.com/apache/arrow/issues/7003 I have tried versions 0.15.1, 0.16.0, 0.17.0. Same error on all. I've seen in other issues that co-installations of tensorflow and numpy might be causing issues. I have tensorflow==1.14.0 and

[GitHub] [arrow] fsaintjacques commented on a change in pull request #7000: ARROW-8065: [C++][Dataset] Refactor ScanOptions and Fragment relation

2020-04-21 Thread GitBox
fsaintjacques commented on a change in pull request #7000: URL: https://github.com/apache/arrow/pull/7000#discussion_r412452967 ## File path: python/pyarrow/tests/test_dataset.py ## @@ -671,41 +669,29 @@ def test_fragments(tempdir): f = fragments[0] # file's schema

[GitHub] [arrow] github-actions[bot] commented on issue #6995: ARROW-8549: [R] Assorted post-0.17 release cleanups

2020-04-21 Thread GitBox
github-actions[bot] commented on issue #6995: URL: https://github.com/apache/arrow/pull/6995#issuecomment-617445300 Revision: e7dbd9c977b765e618a40e997039be773c9f16bf Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] paddyhoran commented on a change in pull request #6980: ARROW-8516: [Rust] Improve PrimitiveBuilder::append_slice performance

2020-04-21 Thread GitBox
paddyhoran commented on a change in pull request #6980: URL: https://github.com/apache/arrow/pull/6980#discussion_r412473292 ## File path: rust/arrow/src/array/builder.rs ## @@ -236,6 +251,14 @@ impl BufferBuilderTrait for BufferBuilder {

[GitHub] [arrow] wesm commented on a change in pull request #6744: PARQUET-1820: [C++] pre-buffer specified columns of row group

2020-04-21 Thread GitBox
wesm commented on a change in pull request #6744: URL: https://github.com/apache/arrow/pull/6744#discussion_r412465641 ## File path: cpp/src/parquet/properties.h ## @@ -56,10 +60,32 @@ class PARQUET_EXPORT ReaderProperties { bool is_buffered_stream_enabled() const {

[GitHub] [arrow] nealrichardson commented on issue #6995: ARROW-8549: [R] Assorted post-0.17 release cleanups

2020-04-21 Thread GitBox
nealrichardson commented on issue #6995: URL: https://github.com/apache/arrow/pull/6995#issuecomment-617444872 @github-actions crossbow submit -g r

[GitHub] [arrow] davidanthoff commented on issue #7001: Use lowercase ws2_32 everywhere

2020-04-21 Thread GitBox
davidanthoff commented on issue #7001: URL: https://github.com/apache/arrow/pull/7001#issuecomment-617365369 > How does BinaryBuilder compile Windows binaries on Linux? Using MinGW? Yes, it uses MinGW for Windows, but then it also cross-compiles to lots of other platforms. The PR

[GitHub] [arrow] paddyhoran commented on issue #6209: ARROW-3827: [Rust] Implement UnionArray

2020-04-21 Thread GitBox
paddyhoran commented on issue #6209: URL: https://github.com/apache/arrow/pull/6209#issuecomment-617389640 Closing and I'll open a new PR.

[GitHub] [arrow] github-actions[bot] commented on issue #6995: ARROW-8549: [R] Assorted post-0.17 release cleanups

2020-04-21 Thread GitBox
github-actions[bot] commented on issue #6995: URL: https://github.com/apache/arrow/pull/6995#issuecomment-617442386 https://issues.apache.org/jira/browse/ARROW-8549

[GitHub] [arrow] andygrove commented on a change in pull request #6770: ARROW-7842: [Rust] [Parquet] implement array_reader for list type columns

2020-04-22 Thread GitBox
andygrove commented on a change in pull request #6770: URL: https://github.com/apache/arrow/pull/6770#discussion_r413404215 ## File path: rust/datafusion/src/utils.rs ## @@ -74,6 +74,29 @@ macro_rules! make_string { }}; } +macro_rules! make_string_from_list { +

[GitHub] [arrow] andygrove commented on a change in pull request #6770: ARROW-7842: [Rust] [Parquet] implement array_reader for list type columns

2020-04-22 Thread GitBox
andygrove commented on a change in pull request #6770: URL: https://github.com/apache/arrow/pull/6770#discussion_r413404556 ## File path: rust/datafusion/src/utils.rs ## @@ -120,6 +143,7 @@ pub fn array_value_to_string(column: array::ArrayRef, row: usize) -> Result {

[GitHub] [arrow] andygrove commented on a change in pull request #7004: ARROW-3827: [Rust] Implement UnionArray Updated

2020-04-22 Thread GitBox
andygrove commented on a change in pull request #7004: URL: https://github.com/apache/arrow/pull/7004#discussion_r413408444 ## File path: rust/arrow/src/array/mod.rs ## @@ -85,6 +85,7 @@ mod array; mod builder; mod data; mod equal; +mod union; Review comment: I

[GitHub] [arrow] andygrove commented on a change in pull request #7004: ARROW-3827: [Rust] Implement UnionArray Updated

2020-04-22 Thread GitBox
andygrove commented on a change in pull request #7004: URL: https://github.com/apache/arrow/pull/7004#discussion_r413408241 ## File path: rust/arrow/src/array/equal.rs ## @@ -1046,6 +1062,30 @@ impl PartialEq for Value { } } +impl JsonEqual for UnionArray { +fn

[GitHub] [arrow] andygrove commented on issue #6972: ARROW-8287: [Rust] Add "pretty" util to help with printing tabular output of RecordBatches

2020-04-22 Thread GitBox
andygrove commented on issue #6972: URL: https://github.com/apache/arrow/pull/6972#issuecomment-618100341 Thanks @markhildreth for the detailed write-up in the JIRA! I've started looking through this. I'm not sure I understand all the points you made yet, but if there is a way to

[GitHub] [arrow] wesm commented on issue #5947: ARROW-7300: [C++][Gandiva] Implement functions to cast from strings to integers/floats

2020-04-22 Thread GitBox
wesm commented on issue #5947: URL: https://github.com/apache/arrow/pull/5947#issuecomment-618107435 @praveenbingo @projjal would you be able to take a look now?

[GitHub] [arrow] fsaintjacques commented on issue #7011: ARROW-8554 [C++][Benchmark] Fix building error "cannot bind lvalue"

2020-04-22 Thread GitBox
fsaintjacques commented on issue #7011: URL: https://github.com/apache/arrow/pull/7011#issuecomment-61869 I understood that @bkietz added an entry for this, maybe it doesn't have the benchmark enabled?

[GitHub] [arrow] wesm commented on issue #7011: ARROW-8554: [C++][Benchmark] Fix building error "cannot bind lvalue"

2020-04-22 Thread GitBox
wesm commented on issue #7011: URL: https://github.com/apache/arrow/pull/7011#issuecomment-618111873 manylinux1 uses gcc 4.8 but does not build the benchmarks.

[GitHub] [arrow] houqp commented on issue #7009: ARROW-8552: [Rust] support iterate parquet row columns

2020-04-22 Thread GitBox
houqp commented on issue #7009: URL: https://github.com/apache/arrow/pull/7009#issuecomment-618158794 @nevi-me rebased and tests are passing now :)

[GitHub] [arrow] andygrove commented on a change in pull request #6770: ARROW-7842: [Rust] [Parquet] implement array_reader for list type columns

2020-04-22 Thread GitBox
andygrove commented on a change in pull request #6770: URL: https://github.com/apache/arrow/pull/6770#discussion_r413403760 ## File path: rust/datafusion/src/utils.rs ## @@ -74,6 +74,29 @@ macro_rules! make_string { }}; } +macro_rules! make_string_from_list { +

[GitHub] [arrow] andygrove commented on a change in pull request #6770: ARROW-7842: [Rust] [Parquet] implement array_reader for list type columns

2020-04-22 Thread GitBox
andygrove commented on a change in pull request #6770: URL: https://github.com/apache/arrow/pull/6770#discussion_r413403453 ## File path: rust/datafusion/src/logicalplan.rs ## @@ -828,8 +828,8 @@ mod tests { .build()?; let expected = "Projection: #id\ -

[GitHub] [arrow] andygrove commented on issue #4140: ARROW-5123: [Rust] Parquet derive for simple structs

2020-04-22 Thread GitBox
andygrove commented on issue #4140: URL: https://github.com/apache/arrow/pull/4140#issuecomment-618098211 @bryantbiggs I will take a look at the release tag issue this weekend.

[GitHub] [arrow] fsaintjacques commented on issue #7000: ARROW-8065: [C++][Dataset] Refactor ScanOptions and Fragment relation

2020-04-22 Thread GitBox
fsaintjacques commented on issue #7000: URL: https://github.com/apache/arrow/pull/7000#issuecomment-618110604 Addressed most comments and updated followup ticket with what's missing. PTAL and merge quickly so we can unblock the blocked tickets :)

[GitHub] [arrow] xuancong84 opened a new issue #7017: suggestion: why not serialize complex numbers in a Python list/dict/set

2020-04-22 Thread GitBox
xuancong84 opened a new issue #7017: URL: https://github.com/apache/arrow/issues/7017 Dear developers, I realize that complex numbers in Numpy arrays and Pandas dataframe/series can be serialized, but complex numbers in Python structures (e.g., `[1, 2.5, 3+1.j, np.nan]`) cannot be

[GitHub] [arrow] kou commented on issue #6996: ARROW-8538: [Packaging] Remove boost from homebrew formula

2020-04-22 Thread GitBox
kou commented on issue #6996: URL: https://github.com/apache/arrow/pull/6996#issuecomment-618066972 Could you make a JIRA?

[GitHub] [arrow] mcassels commented on issue #6770: ARROW-7842 [Parquet][Rust] implement array_reader for list type columns

2020-04-22 Thread GitBox
mcassels commented on issue #6770: URL: https://github.com/apache/arrow/pull/6770#issuecomment-618082896 @andygrove @nevi-me do you have any thoughts on this?

[GitHub] [arrow] wesm commented on issue #7011: ARROW-8554 [C++][Benchmark] Fix building error "cannot bind lvalue"

2020-04-22 Thread GitBox
wesm commented on issue #7011: URL: https://github.com/apache/arrow/pull/7011#issuecomment-618102656 Another gcc 4.8 issue here. We may need a more comprehensive build that also builds the benchmark executables

[GitHub] [arrow] fsaintjacques commented on a change in pull request #6985: ARROW-8413: [C++][Parquet][WIP] Refactor Generating validity bitmap for values column

2020-04-22 Thread GitBox
fsaintjacques commented on a change in pull request #6985: URL: https://github.com/apache/arrow/pull/6985#discussion_r413428461 ## File path: cpp/src/arrow/util/bit_util.h ## @@ -43,13 +43,18 @@ #if defined(_MSC_VER) #include +#include #pragma intrinsic(_BitScanReverse)

[GitHub] [arrow] andygrove commented on a change in pull request #7004: ARROW-3827: [Rust] Implement UnionArray Updated

2020-04-22 Thread GitBox
andygrove commented on a change in pull request #7004: URL: https://github.com/apache/arrow/pull/7004#discussion_r413407658 ## File path: rust/arrow/src/array/equal.rs ## @@ -692,6 +692,22 @@ impl ArrayEqual for StructArray { } } +impl ArrayEqual for UnionArray { +

[GitHub] [arrow] andygrove commented on a change in pull request #7004: ARROW-3827: [Rust] Implement UnionArray Updated

2020-04-22 Thread GitBox
andygrove commented on a change in pull request #7004: URL: https://github.com/apache/arrow/pull/7004#discussion_r413407264 ## File path: rust/arrow/src/array/equal.rs ## @@ -692,6 +692,22 @@ impl ArrayEqual for StructArray { } } +impl ArrayEqual for UnionArray { +

[GitHub] [arrow] github-actions[bot] commented on issue #7014: ARROW-8563: GO Minor change to make newBuilder public

2020-04-22 Thread GitBox
github-actions[bot] commented on issue #7014: URL: https://github.com/apache/arrow/pull/7014#issuecomment-618187424 https://issues.apache.org/jira/browse/ARROW-8563

[GitHub] [arrow] tustvold commented on a change in pull request #6980: ARROW-8516: [Rust] Improve PrimitiveBuilder::append_slice performance

2020-04-22 Thread GitBox
tustvold commented on a change in pull request #6980: URL: https://github.com/apache/arrow/pull/6980#discussion_r412736972 ## File path: rust/arrow/src/array/builder.rs ## @@ -236,6 +251,14 @@ impl BufferBuilderTrait for BufferBuilder {

[GitHub] [arrow] kou commented on issue #7008: ARROW-8551: [CI][Gandiva] Use LLVM 8 in gandiva linux build

2020-04-22 Thread GitBox
kou commented on issue #7008: URL: https://github.com/apache/arrow/pull/7008#issuecomment-617615843 @github-actions crossbow submit gandiva-jar-xenial

[GitHub] [arrow] jianxind commented on a change in pull request #6954: ARROW-8440: [C++] Refine SIMD header files

2020-04-22 Thread GitBox
jianxind commented on a change in pull request #6954: URL: https://github.com/apache/arrow/pull/6954#discussion_r412709153 ## File path: docs/source/developers/benchmarks.rst ## @@ -59,7 +59,7 @@ Sometimes, it is required to pass custom CMake flags, e.g. .. code-block:: shell

[GitHub] [arrow] github-actions[bot] commented on issue #7008: ARROW-8551: [CI][Gandiva] Use LLVM 8 in gandiva linux build

2020-04-22 Thread GitBox
github-actions[bot] commented on issue #7008: URL: https://github.com/apache/arrow/pull/7008#issuecomment-617616494 Revision: 1e235ddc11ff6ee4620b62e3b5f9a318d117512b Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] markhildreth opened a new pull request #7006: ARROW-8508 [Rust] FixedSizeListArray improper offset for value

2020-04-21 Thread GitBox
markhildreth opened a new pull request #7006: URL: https://github.com/apache/arrow/pull/7006 Potentially Fixes ARROW-8508 Fixed size list arrays sourced with a non-zero offset of their child data was respecting this offset when calculating value offsets in the `value_offset`

[GitHub] [arrow] github-actions[bot] commented on issue #7007: ARROW-8537: [C++] Revert Optimizing BitmapReader

2020-04-21 Thread GitBox
github-actions[bot] commented on issue #7007: URL: https://github.com/apache/arrow/pull/7007#issuecomment-617494295 https://issues.apache.org/jira/browse/ARROW-8537

[GitHub] [arrow] houqp commented on issue #7009: ARROW-8552: [Rust] support iterate parquet row columns

2020-04-22 Thread GitBox
houqp commented on issue #7009: URL: https://github.com/apache/arrow/pull/7009#issuecomment-617600824 looks like the windows CI is failing with error not related to my change: ``` "error: \'rustfmt.exe\' is not installed for the toolchain

[GitHub] [arrow] kou commented on issue #7008: ARROW-8551: [CI][Gandiva] Use LLVM 8 in gandiva linux build

2020-04-22 Thread GitBox
kou commented on issue #7008: URL: https://github.com/apache/arrow/pull/7008#issuecomment-617618030 Can we move the Docker image for building Gandiva on Linux to our `ci/docker/` like https://github.com/apache/arrow/blob/master/python/manylinux201x/Dockerfile-x86_64_base_2014 ?

[GitHub] [arrow] pitrou commented on a change in pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

2020-04-23 Thread GitBox
pitrou commented on a change in pull request #6985: URL: https://github.com/apache/arrow/pull/6985#discussion_r413696312 ## File path: cpp/cmake_modules/SetupCxxFlags.cmake ## @@ -40,12 +40,13 @@ if(ARROW_CPU_FLAG STREQUAL "x86") set(CXX_SUPPORTS_SSE4_2 TRUE) else()

[GitHub] [arrow] nevi-me opened a new pull request #7018: ARROW-8536: [Rust] [Flight] Check in proto file, conditional build if file exists

2020-04-23 Thread GitBox
nevi-me opened a new pull request #7018: URL: https://github.com/apache/arrow/pull/7018 When a user compiles the `flight` crate, a `build.rs` script is invoked. This script recursively looks for the `format/Flight.proto` path. A user might not have that path, as they would not have cloned
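
The "conditional build if file exists" idea from the PR title can be sketched roughly as below. This is an assumption-laden illustration, not the PR's actual `build.rs`: the function `build_directives` is hypothetical, and the only real interface it relies on is Cargo's `cargo:rerun-if-changed=` build-script directive and `std::path::Path::exists`.

```rust
use std::path::Path;

/// Hypothetical helper for a `build.rs`: return the Cargo directives to
/// print, but only when the proto file is actually present on disk.
fn build_directives(proto: &Path) -> Vec<String> {
    if proto.exists() {
        // The proto file is available (e.g. a full repository checkout),
        // so ask Cargo to re-run the build script when it changes.
        vec![format!("cargo:rerun-if-changed={}", proto.display())]
    } else {
        // No proto file (e.g. the crate was fetched on its own): emit
        // nothing and rely on the generated code that was checked in.
        Vec::new()
    }
}
```

A build script would print each returned line to standard output and run code generation only in the first branch, so users without the proto file can still compile the crate.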

[GitHub] [arrow] pitrou commented on a change in pull request #6959: ARROW-5649: [Integration][C++] Create integration test for extension types

2020-04-23 Thread GitBox
pitrou commented on a change in pull request #6959: URL: https://github.com/apache/arrow/pull/6959#discussion_r413762177 ## File path: python/pyarrow/tests/test_extension_type.py ## @@ -445,22 +445,28 @@ def test_parquet(tmpdir, registered_period_type): import base64

[GitHub] [arrow] pitrou commented on issue #6954: ARROW-8440: [C++] Refine SIMD header files

2020-04-23 Thread GitBox
pitrou commented on issue #6954: URL: https://github.com/apache/arrow/pull/6954#issuecomment-618329962 cc @emkornfield

[GitHub] [arrow] pitrou commented on a change in pull request #6744: PARQUET-1820: [C++] pre-buffer specified columns of row group

2020-04-23 Thread GitBox
pitrou commented on a change in pull request #6744: URL: https://github.com/apache/arrow/pull/6744#discussion_r413715323 ## File path: cpp/src/arrow/filesystem/s3fs_benchmark.cc ## @@ -331,10 +358,64 @@ BENCHMARK_DEFINE_F(MinioFixture, ReadCoalesced500Mib)(benchmark::State&

[GitHub] [arrow] lidavidm commented on a change in pull request #6744: PARQUET-1820: [C++] pre-buffer specified columns of row group

2020-04-23 Thread GitBox
lidavidm commented on a change in pull request #6744: URL: https://github.com/apache/arrow/pull/6744#discussion_r413744080 ## File path: cpp/src/parquet/file_reader.cc ## @@ -212,6 +237,21 @@ class SerializedFile : public ParquetFileReader::Contents { file_metadata_ =

[GitHub] [arrow] lidavidm commented on a change in pull request #6744: PARQUET-1820: [C++] pre-buffer specified columns of row group

2020-04-23 Thread GitBox
lidavidm commented on a change in pull request #6744: URL: https://github.com/apache/arrow/pull/6744#discussion_r413757274 ## File path: cpp/src/parquet/file_reader.cc ## @@ -212,6 +237,21 @@ class SerializedFile : public ParquetFileReader::Contents { file_metadata_ =

[GitHub] [arrow] mrkn commented on issue #6667: ARROW-8162: [Format][Python] Add serialization for CSF sparse tensors to Python

2020-04-22 Thread GitBox
mrkn commented on issue #6667: URL: https://github.com/apache/arrow/pull/6667#issuecomment-618185887 @rok Thank you for working on this! I'll merge this.

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #6959: ARROW-5649: [Integration][C++] Create integration test for extension types

2020-04-23 Thread GitBox
jorisvandenbossche commented on a change in pull request #6959: URL: https://github.com/apache/arrow/pull/6959#discussion_r413750283 ## File path: python/pyarrow/tests/test_extension_type.py ## @@ -445,22 +445,28 @@ def test_parquet(tmpdir, registered_period_type): import

[GitHub] [arrow] pitrou commented on a change in pull request #6744: PARQUET-1820: [C++] pre-buffer specified columns of row group

2020-04-23 Thread GitBox
pitrou commented on a change in pull request #6744: URL: https://github.com/apache/arrow/pull/6744#discussion_r413760028 ## File path: cpp/src/parquet/file_reader.cc ## @@ -212,6 +237,21 @@ class SerializedFile : public ParquetFileReader::Contents { file_metadata_ =

[GitHub] [arrow] pitrou commented on a change in pull request #6744: PARQUET-1820: [C++] pre-buffer specified columns of row group

2020-04-23 Thread GitBox
pitrou commented on a change in pull request #6744: URL: https://github.com/apache/arrow/pull/6744#discussion_r413760214 ## File path: cpp/src/parquet/file_reader.cc ## @@ -536,6 +577,14 @@ std::shared_ptr ParquetFileReader::RowGroup(int i) { return

[GitHub] [arrow] github-actions[bot] commented on issue #7018: ARROW-8536: [Rust] [Flight] Check in proto file, conditional build if file exists

2020-04-23 Thread GitBox
github-actions[bot] commented on issue #7018: URL: https://github.com/apache/arrow/pull/7018#issuecomment-618349230 https://issues.apache.org/jira/browse/ARROW-8536

[GitHub] [arrow] lidavidm commented on a change in pull request #6744: PARQUET-1820: [C++] pre-buffer specified columns of row group

2020-04-23 Thread GitBox
lidavidm commented on a change in pull request #6744: URL: https://github.com/apache/arrow/pull/6744#discussion_r413763701 ## File path: cpp/src/parquet/file_reader.cc ## @@ -212,6 +237,21 @@ class SerializedFile : public ParquetFileReader::Contents { file_metadata_ =

[GitHub] [arrow] emkornfield commented on issue #6985: ARROW-8413: [C++][Parquet][WIP] Refactor Generating validity bitmap for values column

2020-04-23 Thread GitBox
emkornfield commented on issue #6985: URL: https://github.com/apache/arrow/pull/6985#issuecomment-618231752 CC @wesm @pitrou I think this is ready for review now.

[GitHub] [arrow] kszucs commented on issue #6998: ARROW-8541: [Release] Don't remove previous source releases automatically

2020-04-23 Thread GitBox
kszucs commented on issue #6998: URL: https://github.com/apache/arrow/pull/6998#issuecomment-618328429 @kou updated

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #7000: ARROW-8065: [C++][Dataset] Refactor ScanOptions and Fragment relation

2020-04-23 Thread GitBox
jorisvandenbossche commented on a change in pull request #7000: URL: https://github.com/apache/arrow/pull/7000#discussion_r413622598 ## File path: cpp/src/arrow/dataset/file_parquet.cc ## @@ -402,23 +401,21 @@ Result ParquetFileFormat::ScanFile( } Result>

[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #6992: ARROW-7950: [Python] Determine + test minimal pandas version + raise error when pandas is too old

2020-04-23 Thread GitBox
jorisvandenbossche commented on a change in pull request #6992: URL: https://github.com/apache/arrow/pull/6992#discussion_r413740270 ## File path: python/pyarrow/pandas-shim.pxi ## @@ -55,6 +55,16 @@ cdef class _PandasAPIShim(object): from distutils.version import

[GitHub] [arrow] wesm commented on issue #7017: suggestion: why not serialize complex numbers in a Python list/dict/set

2020-04-23 Thread GitBox
wesm commented on issue #7017: URL: https://github.com/apache/arrow/issues/7017#issuecomment-618352283 Can you send an email to one of the mailing lists or open a JIRA if you want to propose a development project?

[GitHub] [arrow] pitrou commented on a change in pull request #6744: PARQUET-1820: [C++] pre-buffer specified columns of row group

2020-04-23 Thread GitBox
pitrou commented on a change in pull request #6744: URL: https://github.com/apache/arrow/pull/6744#discussion_r413747770 ## File path: cpp/src/parquet/file_reader.cc ## @@ -212,6 +237,21 @@ class SerializedFile : public ParquetFileReader::Contents { file_metadata_ =

[GitHub] [arrow] lidavidm commented on a change in pull request #6744: PARQUET-1820: [C++] pre-buffer specified columns of row group

2020-04-23 Thread GitBox
lidavidm commented on a change in pull request #6744: URL: https://github.com/apache/arrow/pull/6744#discussion_r413759177 ## File path: cpp/src/parquet/file_reader.cc ## @@ -212,6 +237,21 @@ class SerializedFile : public ParquetFileReader::Contents { file_metadata_ =

[GitHub] [arrow] jorisvandenbossche commented on issue #6992: ARROW-7950: [Python] Determine + test minimal pandas version + raise error when pandas is too old

2020-04-23 Thread GitBox
jorisvandenbossche commented on issue #6992: URL: https://github.com/apache/arrow/pull/6992#issuecomment-618366876 cc @wesm @xhochy @BryanCutler are you fine with 1) a hard required minimal pandas version? (meaning: we don't use the pandas integration if an older version is

[GitHub] [arrow] pitrou commented on issue #7002: ARROW-8543: [C++] Single pass coalescing algorithm + Rebase

2020-04-21 Thread GitBox
pitrou commented on issue #7002: URL: https://github.com/apache/arrow/pull/7002#issuecomment-617339230 The original PR message is slightly misleading: both algorithms have the same complexity (O(N) except for the sorting step which is O(N log N)). However, it's true that the new algorithm

[GitHub] [arrow] fsaintjacques commented on issue #6986: ARROW-8523: [C++] Optimize BitmapReader

2020-04-21 Thread GitBox
fsaintjacques commented on issue #6986: URL: https://github.com/apache/arrow/pull/6986#issuecomment-617368021 See either `archery benchmark diff --help` or the [benchmark](https://arrow.apache.org/docs/developers/benchmarks.html) section of the documentation. Archery can compare the same

[GitHub] [arrow] paddyhoran commented on issue #7004: ARROW-3827: [Rust] Implement UnionArray Updated

2020-04-21 Thread GitBox
paddyhoran commented on issue #7004: URL: https://github.com/apache/arrow/pull/7004#issuecomment-617399915 @kszucs it's failing due to `rustfmt` not being installed before testing the flight crate, any idea why this would be the case? Sorry, I don't know much about GitHub actions yet...

[GitHub] [arrow] nealrichardson opened a new pull request #7005: ARROW-8550: [CI] Don't run cron GHA jobs on forks

2020-04-21 Thread GitBox
nealrichardson opened a new pull request #7005: URL: https://github.com/apache/arrow/pull/7005
