[GitHub] [arrow] emkornfield commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

2020-04-30 Thread GitBox
emkornfield commented on pull request #6985: URL: https://github.com/apache/arrow/pull/6985#issuecomment-621974840 @wesm wanted to make sure this is still on your radar This is an automated message from the Apache Git

[GitHub] [arrow] wesm commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

2020-04-30 Thread GitBox
wesm commented on pull request #6985: URL: https://github.com/apache/arrow/pull/6985#issuecomment-621993208 Yep sorry thanks for the nudge, will look today

[GitHub] [arrow] github-actions[bot] commented on pull request #7074: ARROW-8656: [Python] Switch to VS2017 in the windows wheel builds

2020-04-30 Thread GitBox
github-actions[bot] commented on pull request #7074: URL: https://github.com/apache/arrow/pull/7074#issuecomment-622023582 Revision: 5988be2a5f9b283a8bc1012714fc03ff57b453c4 Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] fsaintjacques opened a new pull request #7075: ARROW-8447: [C++] Ensure row ordering in Scanner::ToTable

2020-04-30 Thread GitBox
fsaintjacques opened a new pull request #7075: URL: https://github.com/apache/arrow/pull/7075 * This fixes the issue where ScanTask would race to push to the accumulating RecordBatchVector. The new version assigns an ordered index to each ScanTask preserving the order in which they were

[GitHub] [arrow] yordan-pavlov commented on pull request #7037: ARROW-6718: [Rust] Remove packed_simd

2020-04-30 Thread GitBox
yordan-pavlov commented on pull request #7037: URL: https://github.com/apache/arrow/pull/7037#issuecomment-621983970 Hi, I thought I would do some profiling yesterday (before packed_simd is removed) and noticed that a lot of time in `simd_compare_op` is spent in this loop here:

[GitHub] [arrow] kszucs opened a new pull request #7074: ARROW-8656: [Python] Switch to VS2017 in the windows wheel builds

2020-04-30 Thread GitBox
kszucs opened a new pull request #7074: URL: https://github.com/apache/arrow/pull/7074

[GitHub] [arrow] kszucs commented on pull request #7074: ARROW-8656: [Python] Switch to VS2017 in the windows wheel builds

2020-04-30 Thread GitBox
kszucs commented on pull request #7074: URL: https://github.com/apache/arrow/pull/7074#issuecomment-621967506 @github-actions crossbow submit wheel-win-*

[GitHub] [arrow] github-actions[bot] commented on pull request #7074: ARROW-8656: [Python] Switch to VS2017 in the windows wheel builds

2020-04-30 Thread GitBox
github-actions[bot] commented on pull request #7074: URL: https://github.com/apache/arrow/pull/7074#issuecomment-621971057 https://issues.apache.org/jira/browse/ARROW-8656

[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox
zgramana commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r418162946 ## File path: csharp/src/Apache.Arrow/Arrays/BinaryArray.cs ## @@ -111,23 +130,32 @@ public TBuilder AppendRange(IEnumerable values) public

[GitHub] [arrow] github-actions[bot] commented on pull request #7075: ARROW-8447: [C++] Ensure row deterministic ordering in Scanner::ToTable

2020-04-30 Thread GitBox
github-actions[bot] commented on pull request #7075: URL: https://github.com/apache/arrow/pull/7075#issuecomment-622034299 https://issues.apache.org/jira/browse/ARROW-8447

[GitHub] [arrow] bkietz commented on a change in pull request #7073: ARROW-8318: [C++][Dataset] Construct FileSystemDataset from fragments

2020-04-30 Thread GitBox
bkietz commented on a change in pull request #7073: URL: https://github.com/apache/arrow/pull/7073#discussion_r418193373 ## File path: cpp/src/arrow/dataset/file_base.cc ## @@ -83,131 +83,67 @@ Result FileFragment::Scan(std::shared_ptr options

[GitHub] [arrow] kszucs commented on pull request #7074: ARROW-8656: [Python] Switch to VS2017 in the windows wheel builds

2020-04-30 Thread GitBox
kszucs commented on pull request #7074: URL: https://github.com/apache/arrow/pull/7074#issuecomment-622045514 @github-actions crossbow submit wheel-win-cp38

[GitHub] [arrow] kszucs commented on pull request #7074: ARROW-8656: [Python] Switch to VS2017 in the windows wheel builds

2020-04-30 Thread GitBox
kszucs commented on pull request #7074: URL: https://github.com/apache/arrow/pull/7074#issuecomment-622022639 @github-actions crossbow submit wheel-win-cp38

[GitHub] [arrow] github-actions[bot] commented on pull request #7074: ARROW-8656: [Python] Switch to VS2017 in the windows wheel builds

2020-04-30 Thread GitBox
github-actions[bot] commented on pull request #7074: URL: https://github.com/apache/arrow/pull/7074#issuecomment-622051056 Revision: 5a0c01cc93b5d4357cab19b27f9397e977a76277 Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] bkietz commented on a change in pull request #7075: ARROW-8447: [C++] Ensure row deterministic ordering in Scanner::ToTable

2020-04-30 Thread GitBox
bkietz commented on a change in pull request #7075: URL: https://github.com/apache/arrow/pull/7075#discussion_r418216405 ## File path: cpp/src/arrow/dataset/scanner.cc ## @@ -165,23 +165,47 @@ std::shared_ptr ScanContext::TaskGroup() const { return TaskGroup::MakeSerial();

[GitHub] [arrow] fsaintjacques edited a comment on pull request #7073: ARROW-8318: [C++][Dataset] Construct FileSystemDataset from fragments

2020-04-30 Thread GitBox
fsaintjacques edited a comment on pull request #7073: URL: https://github.com/apache/arrow/pull/7073#issuecomment-621966208 > Shouldn't we require that? That seems the goal of UnionDataset to combine datasets with different formats Maybe, this is still enforced if you use the

[GitHub] [arrow] yordan-pavlov edited a comment on pull request #7037: ARROW-6718: [Rust] Remove packed_simd

2020-04-30 Thread GitBox
yordan-pavlov edited a comment on pull request #7037: URL: https://github.com/apache/arrow/pull/7037#issuecomment-621983970 Hi, I thought I would do some profiling yesterday (to help make sure packed_simd is not removed prematurely) and noticed that a lot of time in `simd_compare_op`

[GitHub] [arrow] github-actions[bot] commented on pull request #7074: ARROW-8656: [Python] Switch to VS2017 in the windows wheel builds

2020-04-30 Thread GitBox
github-actions[bot] commented on pull request #7074: URL: https://github.com/apache/arrow/pull/7074#issuecomment-621968378 Revision: f64fd002135d7bb90cfb2725d01d3ccc73b809fa Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] fsaintjacques commented on pull request #7073: ARROW-8318: [C++][Dataset] Construct FileSystemDataset from fragments

2020-04-30 Thread GitBox
fsaintjacques commented on pull request #7073: URL: https://github.com/apache/arrow/pull/7073#issuecomment-622092322 Due to R failure (that I didn't catch because my installation was broken and using an old version of arrow), I'll revert the FileSystemDataset::format and make sure

[GitHub] [arrow] github-actions[bot] commented on pull request #7021: ARROW-8628: [Dev] Wrap docker-compose commands with archery

2020-04-30 Thread GitBox
github-actions[bot] commented on pull request #7021: URL: https://github.com/apache/arrow/pull/7021#issuecomment-622062939 Revision: 3cd96ea48a2116322b2fec06207fb1d624e0f969 Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] nealrichardson commented on pull request #7064: ARROW-6945: [Rust] WIP: Add initial skeleton for Rust integration tests

2020-04-30 Thread GitBox
nealrichardson commented on pull request #7064: URL: https://github.com/apache/arrow/pull/7064#issuecomment-622139195 Column I in the first sheet shows which test (generated) files cover which types. So many are covered in a single "test", in that the test JSON it produces includes many

[GitHub] [arrow] nealrichardson commented on pull request #7018: ARROW-8536: [Rust] [Flight] Check in proto file, conditional build if file exists

2020-04-30 Thread GitBox
nealrichardson commented on pull request #7018: URL: https://github.com/apache/arrow/pull/7018#issuecomment-622177076 So `rust/arrow-flight/src/arrow.flight.protocol.rs` is generated from `format/Flight.proto`? There is precedent for adding generated files to `rat_excluded_files.txt`, so

[GitHub] [arrow] kszucs commented on pull request #7074: ARROW-8656: [Python] Switch to VS2017 in the windows wheel builds

2020-04-30 Thread GitBox
kszucs commented on pull request #7074: URL: https://github.com/apache/arrow/pull/7074#issuecomment-622083097 @github-actions crossbow submit wheel-win-cp38

[GitHub] [arrow] andygrove commented on a change in pull request #7035: ARROW-8590: [Rust] Use arrow crate pretty util in DataFusion

2020-04-30 Thread GitBox
andygrove commented on a change in pull request #7035: URL: https://github.com/apache/arrow/pull/7035#discussion_r418346377 ## File path: rust/arrow/src/util/pretty.rs ## @@ -27,18 +27,18 @@ use prettytable::{Cell, Row, Table}; use crate::error::{ArrowError, Result}; ///!

[GitHub] [arrow] tustvold opened a new pull request #7076: ARROW-8659: [Rust] ListBuilder allocate with_capacity

2020-04-30 Thread GitBox
tustvold opened a new pull request #7076: URL: https://github.com/apache/arrow/pull/7076 Both ListBuilder and FixedSizeListBuilder accept a values_builder as a constructor argument and then set the capacity of their internal builders based off the length of this values_builder.

[GitHub] [arrow] andygrove commented on pull request #7018: ARROW-8536: [Rust] [Flight] Check in proto file, conditional build if file exists

2020-04-30 Thread GitBox
andygrove commented on pull request #7018: URL: https://github.com/apache/arrow/pull/7018#issuecomment-622174765 @nealrichardson Maybe you have an opinion on this? This is the issue I mentioned on the sync call.

[GitHub] [arrow] github-actions[bot] commented on pull request #7076: ARROW-8659: [Rust] ListBuilder allocate with_capacity

2020-04-30 Thread GitBox
github-actions[bot] commented on pull request #7076: URL: https://github.com/apache/arrow/pull/7076#issuecomment-622178164 https://issues.apache.org/jira/browse/ARROW-8659

[GitHub] [arrow] github-actions[bot] commented on pull request #7074: ARROW-8656: [Python] Switch to VS2017 in the windows wheel builds

2020-04-30 Thread GitBox
github-actions[bot] commented on pull request #7074: URL: https://github.com/apache/arrow/pull/7074#issuecomment-622102421 Revision: d17f2c212f28bf672a6f46d1dbe017d632707271 Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] wesm commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

2020-04-30 Thread GitBox
wesm commented on pull request #6985: URL: https://github.com/apache/arrow/pull/6985#issuecomment-622155576 I think this is fine to merge once most of the typos in the comments are fixed. A rebase will probably fix the Rust lint error

[GitHub] [arrow] tustvold commented on a change in pull request #7076: ARROW-8659: [Rust] ListBuilder allocate with_capacity

2020-04-30 Thread GitBox
tustvold commented on a change in pull request #7076: URL: https://github.com/apache/arrow/pull/7076#discussion_r418347285 ## File path: rust/parquet/src/arrow/converter.rs ## @@ -128,7 +128,10 @@ pub struct Utf8ArrayConverter {} impl Converter>, StringArray> for

[GitHub] [arrow] wesm commented on pull request #7060: ARROW-8619: [C++] Use distinct enum values for MonthInterval, DayTimeInterval

2020-04-30 Thread GitBox
wesm commented on pull request #7060: URL: https://github.com/apache/arrow/pull/7060#issuecomment-622156273 The ursabot build failures are spurious

[GitHub] [arrow] wesm commented on pull request #6707: ARROW-300: [Format] Proposal for "trivial" IPC body buffer compression using either LZ4 or ZSTD codecs

2020-04-30 Thread GitBox
wesm commented on pull request #6707: URL: https://github.com/apache/arrow/pull/6707#issuecomment-622157711 +1. I updated the C++ generated Flatbuffers files. Will merge this once the builds run as the vote has passed on the mailing list

[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox
zgramana commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r418382595 ## File path: csharp/src/Apache.Arrow/Arrays/PrimitiveArrayBuilder.cs ## @@ -99,55 +105,75 @@ public abstract class PrimitiveArrayBuilder : IArrowArrayBu

[GitHub] [arrow] eerhardt commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox
eerhardt commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r418393347 ## File path: csharp/src/Apache.Arrow/Arrays/ListArray.cs ## @@ -135,6 +152,11 @@ public int GetValueOffset(int index) public int

[GitHub] [arrow] emkornfield commented on pull request #7066: ARROW-8634: [Java] Add Getting Started section to Java README

2020-04-30 Thread GitBox
emkornfield commented on pull request #7066: URL: https://github.com/apache/arrow/pull/7066#issuecomment-622245195 LGTM, thanks, sorry you had to learn the hard way.

[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox
zgramana commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r418419358 ## File path: csharp/src/Apache.Arrow/Arrays/BooleanArray.cs ## @@ -153,17 +184,25 @@ private void CheckIndex(int index) new[] {

[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox
zgramana commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r418383314 ## File path: csharp/src/Apache.Arrow/Arrays/PrimitiveArrayBuilder.cs ## @@ -162,8 +188,8 @@ public TBuilder Swap(int i, int j) public TArray

[GitHub] [arrow] github-actions[bot] commented on pull request #7077: ARROW-8660: [C++][Gandiva] Reduce usage of Boost in Gandiva codebase

2020-04-30 Thread GitBox
github-actions[bot] commented on pull request #7077: URL: https://github.com/apache/arrow/pull/7077#issuecomment-622215642 https://issues.apache.org/jira/browse/ARROW-8660

[GitHub] [arrow] eerhardt commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox
eerhardt commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r418392611 ## File path: csharp/src/Apache.Arrow/Arrays/BooleanArray.cs ## @@ -153,17 +184,25 @@ private void CheckIndex(int index) new[] {

[GitHub] [arrow] eerhardt commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox
eerhardt commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r418393025 ## File path: csharp/src/Apache.Arrow/Arrays/ListArray.cs ## @@ -69,25 +83,28 @@ public ListArray Build(MemoryAllocator allocator = default)

[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox
zgramana commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r418417744 ## File path: csharp/test/Apache.Arrow.Tests/BooleanArrayTests.cs ## @@ -48,13 +48,13 @@ public void AppendsExpectedBit()

[GitHub] [arrow] nevi-me commented on pull request #7018: ARROW-8536: [Rust] [Flight] Check in proto file, conditional build if file exists

2020-04-30 Thread GitBox
nevi-me commented on pull request #7018: URL: https://github.com/apache/arrow/pull/7018#issuecomment-622254639 I'll address this later today

[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox
zgramana commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r418422210 ## File path: csharp/src/Apache.Arrow/Arrays/BooleanArray.cs ## @@ -153,17 +184,25 @@ private void CheckIndex(int index) new[] {

[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox
zgramana commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r418424144 ## File path: csharp/src/Apache.Arrow/Arrays/ListArray.cs ## @@ -135,6 +152,11 @@ public int GetValueOffset(int index) public int

[GitHub] [arrow] wesm commented on pull request #6631: ARROW-8111: [C++][CSV] Support MM/DD/YYYY date format

2020-04-30 Thread GitBox
wesm commented on pull request #6631: URL: https://github.com/apache/arrow/pull/6631#issuecomment-622195001 Thank you. There are some code linting issues and other code style issues (we follow the Google C++ style guide), can you fix the CI builds? I'd like to kick the tires a bit on this

[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox
zgramana commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r418384847 ## File path: csharp/src/Apache.Arrow/Arrays/StringArray.cs ## @@ -71,6 +76,15 @@ public string GetString(int index, Encoding encoding = default)

[GitHub] [arrow] vibhatha opened a new issue #7078: Pyarrow building from source along with CPP Libraries to link to another Cython API

2020-04-30 Thread GitBox
vibhatha opened a new issue #7078: URL: https://github.com/apache/arrow/issues/7078 I am trying to integrate arrow with an application that I am developing. Here I build Arrow from the source (CPP) and use the API to develop some custom functions to do a scientific calculation after data

[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox
zgramana commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r418385072 ## File path: csharp/src/Apache.Arrow/Arrays/StringArray.cs ## @@ -71,6 +76,15 @@ public string GetString(int index, Encoding encoding = default)

[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox
zgramana commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r418388163 ## File path: csharp/src/Apache.Arrow/Arrays/BooleanArray.cs ## @@ -153,17 +182,25 @@ private void CheckIndex(int index) new[] {

[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox
zgramana commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r418424732 ## File path: csharp/src/Apache.Arrow/Arrays/ArrayData.cs ## @@ -53,6 +60,26 @@ public sealed class ArrayData : IDisposable Offset = offset;

[GitHub] [arrow] pravindra commented on a change in pull request #7070: ARROW-8646: [Java] Allow UnionListWriter to write null values

2020-04-30 Thread GitBox
pravindra commented on a change in pull request #7070: URL: https://github.com/apache/arrow/pull/7070#discussion_r418421121 ## File path: java/vector/src/main/codegen/templates/UnionListWriter.java ## @@ -178,6 +178,10 @@ public void write(DecimalHolder holder) {

[GitHub] [arrow] wesm commented on issue #7063: client delete (of objectid) causes an exception and abort

2020-04-30 Thread GitBox
wesm commented on issue #7063: URL: https://github.com/apache/arrow/issues/7063#issuecomment-622214971 You can also send an e-mail to the dev@ mailing list. Closing this issue since we don't do dev or user discussions on GitHub

[GitHub] [arrow] wesm commented on issue #7078: Pyarrow building from source along with CPP Libraries to link to another Cython API

2020-04-30 Thread GitBox
wesm commented on issue #7078: URL: https://github.com/apache/arrow/issues/7078#issuecomment-622214615 Can you please ask on the mailing list? We don't provide user help on GitHub.

[GitHub] [arrow] eerhardt commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox
eerhardt commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r418391298 ## File path: csharp/src/Apache.Arrow/Arrays/PrimitiveArrayBuilder.cs ## @@ -99,55 +105,75 @@ public abstract class PrimitiveArrayBuilder : IArrowArrayBu

[GitHub] [arrow] emkornfield commented on a change in pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

2020-04-30 Thread GitBox

[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox
zgramana commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r418423261 ## File path: csharp/src/Apache.Arrow/Arrays/ListArray.cs ## @@ -69,25 +83,28 @@ public ListArray Build(MemoryAllocator allocator = default)

[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox
zgramana commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r418380212 ## File path: csharp/src/Apache.Arrow/Arrays/PrimitiveArrayBuilder.cs ## @@ -99,55 +105,75 @@ public abstract class PrimitiveArrayBuilder : IArrowArrayBu

[GitHub] [arrow] eerhardt commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox
eerhardt commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r418392155 ## File path: csharp/test/Apache.Arrow.Tests/BooleanArrayTests.cs ## @@ -48,13 +48,13 @@ public void AppendsExpectedBit()

[GitHub] [arrow] eerhardt commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox
eerhardt commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r418394280 ## File path: csharp/src/Apache.Arrow/Arrays/ArrayData.cs ## @@ -53,6 +60,26 @@ public sealed class ArrayData : IDisposable Offset = offset;

[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox
zgramana commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r418423816 ## File path: csharp/src/Apache.Arrow/Arrays/ArrayData.cs ## @@ -22,6 +22,8 @@ namespace Apache.Arrow { public sealed class ArrayData : IDisposable

[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox
zgramana commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r418425841 ## File path: csharp/src/Apache.Arrow/Arrays/ArrayData.cs ## @@ -53,6 +60,26 @@ public sealed class ArrayData : IDisposable Offset = offset;

[GitHub] [arrow] wesm opened a new pull request #7077: ARROW-8660: [C++][Gandiva] Reduce usage of Boost in Gandiva codebase

2020-04-30 Thread GitBox
wesm opened a new pull request #7077: URL: https://github.com/apache/arrow/pull/7077 I noticed this while reading the Gandiva codebase as part of the C++ precompiled kernels revamp project. In general we've tried to reduce our use of Boost -- if we can eliminate Boost altogether from

[GitHub] [arrow] zgramana commented on pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox
zgramana commented on pull request #7032: URL: https://github.com/apache/arrow/pull/7032#issuecomment-622217498 @eerhardt I think I covered everything. I also discovered that I had omitted adding `null` support to `ListArray.Builder` so I added that too (including test coverage) in the

[GitHub] [arrow] eerhardt commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-04-30 Thread GitBox
eerhardt commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r418394031 ## File path: csharp/src/Apache.Arrow/Arrays/ArrayData.cs ## @@ -22,6 +22,8 @@ namespace Apache.Arrow { public sealed class ArrayData : IDisposable

[GitHub] [arrow] jorisvandenbossche commented on pull request #6303: ARROW-8039: [Python] Use dataset API in existing parquet readers and tests

2020-04-30 Thread GitBox
jorisvandenbossche commented on pull request #6303: URL: https://github.com/apache/arrow/pull/6303#issuecomment-621937113 I finally listed the open TODO items from the discussions in this PR / the skipped tests, and opened JIRAs where this was not yet the case: - Deduplicating the

[GitHub] [arrow] wesm commented on pull request #7060: ARROW-8619: [C++] Use distinct enum values for MonthInterval, DayTimeInterval

2020-04-30 Thread GitBox
wesm commented on pull request #7060: URL: https://github.com/apache/arrow/pull/7060#issuecomment-621933237 For some reason I can't get JNI running in my local setup ``` CMake Error at /home/wesm/cpp-toolchain/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:146

[GitHub] [arrow] kszucs commented on a change in pull request #7067: ARROW-8639: [C++][Plasma] Require gflags

2020-04-30 Thread GitBox
kszucs commented on a change in pull request #7067: URL: https://github.com/apache/arrow/pull/7067#discussion_r418109900 ## File path: cpp/cmake_modules/FindgflagsAlt.cmake ## @@ -15,6 +15,8 @@ # specific language governing permissions and limitations # under the License.

[GitHub] [arrow] wesm commented on pull request #7060: ARROW-8619: [C++] Use distinct enum values for MonthInterval, DayTimeInterval

2020-04-30 Thread GitBox
wesm commented on pull request #7060: URL: https://github.com/apache/arrow/pull/7060#issuecomment-621872082 Will fix the JNI issue and will notify the mailing list

[GitHub] [arrow] bkietz commented on a change in pull request #7033: ARROW-7759: [C++][Dataset] Add CsvFileFormat

2020-04-30 Thread GitBox
bkietz commented on a change in pull request #7033: URL: https://github.com/apache/arrow/pull/7033#discussion_r418108005 ## File path: cpp/src/arrow/dataset/file_csv.cc ## @@ -0,0 +1,136 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor

[GitHub] [arrow] kszucs commented on pull request #7021: ARROW-8628: [Dev] Wrap docker-compose commands with archery

2020-04-30 Thread GitBox
kszucs commented on pull request #7021: URL: https://github.com/apache/arrow/pull/7021#issuecomment-621957531 @github-actions crossbow submit test-debian-10-cpp

[GitHub] [arrow] github-actions[bot] commented on pull request #7021: ARROW-8628: [Dev] Wrap docker-compose commands with archery

2020-04-30 Thread GitBox
github-actions[bot] commented on pull request #7021: URL: https://github.com/apache/arrow/pull/7021#issuecomment-621884756 Revision: 5c0b02dd7947e1e61da701169cc5fafb9135a6e5 Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] kszucs commented on pull request #7021: ARROW-8628: [Dev] Wrap docker-compose commands with archery

2020-04-30 Thread GitBox
kszucs commented on pull request #7021: URL: https://github.com/apache/arrow/pull/7021#issuecomment-621883711 @github-actions crossbow submit test-debian-10-cpp test-debian-10-go-1.12 test-conda-python-3.7

[GitHub] [arrow] bkietz commented on pull request #7033: ARROW-7759: [C++][Dataset] Add CsvFileFormat

2020-04-30 Thread GitBox
bkietz commented on pull request #7033: URL: https://github.com/apache/arrow/pull/7033#issuecomment-621814564 I'm happy to implement whatever configuration is agreeable. I'll add a list of the approaches which have been discussed here to the follow-up so we can discuss them there.

[GitHub] [arrow] markhildreth opened a new pull request #7072: ARROW-8648: [Rust] Optimize Rust CI Workflows

2020-04-30 Thread GitBox
markhildreth opened a new pull request #7072: URL: https://github.com/apache/arrow/pull/7072 WIP

[GitHub] [arrow] github-actions[bot] commented on pull request #7072: ARROW-8648: [Rust] Optimize Rust CI Workflows

2020-04-30 Thread GitBox
github-actions[bot] commented on pull request #7072: URL: https://github.com/apache/arrow/pull/7072#issuecomment-621910533 https://issues.apache.org/jira/browse/ARROW-8648

[GitHub] [arrow] bkietz commented on a change in pull request #7033: ARROW-7759: [C++][Dataset] Add CsvFileFormat

2020-04-30 Thread GitBox
bkietz commented on a change in pull request #7033: URL: https://github.com/apache/arrow/pull/7033#discussion_r418109100 ## File path: cpp/src/arrow/dataset/file_csv.cc ## @@ -0,0 +1,136 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor

[GitHub] [arrow] github-actions[bot] commented on pull request #7073: ARROW-8318: [C++][Dataset] Construct FileSystemDataset from fragments

2020-04-30 Thread GitBox
github-actions[bot] commented on pull request #7073: URL: https://github.com/apache/arrow/pull/7073#issuecomment-621937629 https://issues.apache.org/jira/browse/ARROW-8318

[GitHub] [arrow] andygrove commented on pull request #7066: ARROW-8634: [Java] Add Java examples

2020-04-30 Thread GitBox
andygrove commented on pull request #7066: URL: https://github.com/apache/arrow/pull/7066#issuecomment-621873527 Ah, I wish I had found this documentation before I started using Arrow Java! OK, I will just add links to the README instead then. Thanks.

[GitHub] [arrow] liyafan82 commented on pull request #7071: ARROW-7955: [Java] Support large buffer for file/stream IPC

2020-04-30 Thread GitBox
liyafan82 commented on pull request #7071: URL: https://github.com/apache/arrow/pull/7071#issuecomment-621896058 Also add an integration test for VarCharVector, as the size of the offset buffer can be larger than Integer.MAX_VALUE

[GitHub] [arrow] fsaintjacques opened a new pull request #7073: ARROW-8318: [C++][Dataset] Construct FileSystemDataset from fragments

2020-04-30 Thread GitBox
fsaintjacques opened a new pull request #7073: URL: https://github.com/apache/arrow/pull/7073 * Simplified FileSystemDataset to hold a FragmentVector. Each Fragment must be a FileFragment and is checked at `FileSystemDataset::Make`. Fragments are not required to use the same

[GitHub] [arrow] github-actions[bot] commented on pull request #7021: ARROW-8628: [Dev] Wrap docker-compose commands with archery

2020-04-30 Thread GitBox
github-actions[bot] commented on pull request #7021: URL: https://github.com/apache/arrow/pull/7021#issuecomment-621958397 Revision: 0b532b06da2ffab64d34d4420790eee3e4f64ca3 Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] jorisvandenbossche commented on pull request #7073: ARROW-8318: [C++][Dataset] Construct FileSystemDataset from fragments

2020-04-30 Thread GitBox
jorisvandenbossche commented on pull request #7073: URL: https://github.com/apache/arrow/pull/7073#issuecomment-621940060 > Fragments are not required to use the same backing filesystem nor the same format. Shouldn't we require that? That seems the goal of UnionDataset to combine

[GitHub] [arrow] fsaintjacques commented on a change in pull request #7033: ARROW-7759: [C++][Dataset] Add CsvFileFormat

2020-04-30 Thread GitBox
fsaintjacques commented on a change in pull request #7033: URL: https://github.com/apache/arrow/pull/7033#discussion_r417959216 ## File path: cpp/src/arrow/dataset/file_csv.cc ## @@ -0,0 +1,136 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more

[GitHub] [arrow] kszucs commented on pull request #7021: ARROW-8628: [Dev] Wrap docker-compose commands with archery

2020-04-30 Thread GitBox
kszucs commented on pull request #7021: URL: https://github.com/apache/arrow/pull/7021#issuecomment-621946996 @github-actions crossbow submit test-debian-10-cpp

[GitHub] [arrow] github-actions[bot] commented on pull request #7068: ARROW-8592: [C++] Update docs to reflect LLVM 8

2020-04-30 Thread GitBox
github-actions[bot] commented on pull request #7068: URL: https://github.com/apache/arrow/pull/7068#issuecomment-621635478 https://issues.apache.org/jira/browse/ARROW-8592

[GitHub] [arrow] emkornfield opened a new pull request #7068: ARROW-8592: [C++] Update docs to reflect LLVM 8

2020-04-30 Thread GitBox
emkornfield opened a new pull request #7068: URL: https://github.com/apache/arrow/pull/7068 I'm not sure about the Windows build instructions; if someone knows for sure, it would be helpful.

[GitHub] [arrow] zhztheplayer commented on a change in pull request #7030: ARROW-7808: [Java][Dataset] Implement Datasets Java API by JNI to C++

2020-04-30 Thread GitBox
zhztheplayer commented on a change in pull request #7030: URL: https://github.com/apache/arrow/pull/7030#discussion_r415667244 ## File path: cpp/src/jni/dataset/jni_wrapper.cpp ## @@ -0,0 +1,577 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more

[GitHub] [arrow] emkornfield commented on issue #7065: Cython Table API Access gives an error libarrow_python.so.16: undefined symbol: _ZNK5arrow8DataType18ComputeFingerprintEv

2020-04-30 Thread GitBox
emkornfield commented on issue #7065: URL: https://github.com/apache/arrow/issues/7065#issuecomment-621630326 Closing this in favor of JIRA. Thanks @vibhatha

[GitHub] [arrow] emkornfield commented on pull request #7067: ARROW-8639: [C++][Plasma] Require gflags

2020-04-30 Thread GitBox
emkornfield commented on pull request #7067: URL: https://github.com/apache/arrow/pull/7067#issuecomment-621634453 Is centos-6-amd64 failure OK?

[GitHub] [arrow] zhztheplayer commented on a change in pull request #7030: ARROW-7808: [Java][Dataset] Implement Datasets Java API by JNI to C++

2020-04-30 Thread GitBox
zhztheplayer commented on a change in pull request #7030: URL: https://github.com/apache/arrow/pull/7030#discussion_r417789771 ## File path: java/dataset/src/main/java/org/apache/arrow/dataset/source/DatasetFactory.java ## @@ -0,0 +1,34 @@ +/* + * Licensed to the Apache

[GitHub] [arrow] zhztheplayer commented on a change in pull request #7030: ARROW-7808: [Java][Dataset] Implement Datasets Java API by JNI to C++

2020-04-30 Thread GitBox
zhztheplayer commented on a change in pull request #7030: URL: https://github.com/apache/arrow/pull/7030#discussion_r417790061 ## File path: java/pom.xml ## @@ -369,24 +369,24 @@ org.apache.maven.plugins maven-compiler-plugin 3.6.2 -

[GitHub] [arrow] emkornfield commented on pull request #7067: ARROW-8639: [C++][Plasma] Require gflags

2020-04-30 Thread GitBox
emkornfield commented on pull request #7067: URL: https://github.com/apache/arrow/pull/7067#issuecomment-621634239 +1

[GitHub] [arrow] zhztheplayer commented on a change in pull request #7030: ARROW-7808: [Java][Dataset] Implement Datasets Java API by JNI to C++

2020-04-30 Thread GitBox
zhztheplayer commented on a change in pull request #7030: URL: https://github.com/apache/arrow/pull/7030#discussion_r417789282 ## File path: java/dataset/src/main/java/org/apache/arrow/dataset/jni/JniWrapper.java ## @@ -0,0 +1,61 @@ +/* + * Licensed to the Apache Software

[GitHub] [arrow] jianxind commented on pull request #7029: ARROW-8579 [C++] Add AVX512 SIMD for spaced decoding and encoding.

2020-04-30 Thread GitBox
jianxind commented on pull request #7029: URL: https://github.com/apache/arrow/pull/7029#issuecomment-621704246 > I'd gladly see an AVX2 or SSE version indeed, as many CPUs don't have AVX512. @pitrou @emkornfield Yeah, I have an SSE version; would you like me to append it to

[GitHub] [arrow] github-actions[bot] commented on pull request #7069: ARROW-8645: [C++] Missing gflags dependency for plasma

2020-04-30 Thread GitBox
github-actions[bot] commented on pull request #7069: URL: https://github.com/apache/arrow/pull/7069#issuecomment-621710865 Revision: e5402cdb93d59b17d2a83266c06906769a2ed684 Submitted crossbow builds: [ursa-labs/crossbow @
