[GitHub] [arrow] kszucs commented on pull request #7074: ARROW-8656: [Python] Switch to VS2017 in the windows wheel builds

2020-05-01 Thread GitBox
kszucs commented on pull request #7074: URL: https://github.com/apache/arrow/pull/7074#issuecomment-622356800 @github-actions crossbow submit wheel-win-* This is an automated message from the Apache Git Service. To respond

[GitHub] [arrow] github-actions[bot] commented on pull request #7074: ARROW-8656: [Python] Switch to VS2017 in the windows wheel builds

2020-05-01 Thread GitBox
github-actions[bot] commented on pull request #7074: URL: https://github.com/apache/arrow/pull/7074#issuecomment-622357099 Revision: 8852e2f5f32402ca9c85877289c7948db141cca7 Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] wesm commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

2020-05-01 Thread GitBox
wesm commented on pull request #6985: URL: https://github.com/apache/arrow/pull/6985#issuecomment-622373323 Take a look at how this is currently being handled in NumPy * https://numpy.org/neps/nep-0038-SIMD-optimizations.html * https://github.com/numpy/numpy/pull/13516 I

[GitHub] [arrow] eerhardt commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-05-01 Thread GitBox
eerhardt commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r417708201 ## File path: csharp/src/Apache.Arrow/Arrays/BinaryArray.cs ## @@ -73,24 +76,34 @@ public TArray Build(MemoryAllocator allocator = default) {

[GitHub] [arrow] eerhardt commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-05-01 Thread GitBox
eerhardt commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r415075754 ## File path: csharp/src/Apache.Arrow/Arrays/ArrayData.cs ## @@ -84,7 +84,7 @@ public ArrayData Slice(int offset, int length) length =

[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-05-01 Thread GitBox
zgramana commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r416049921 ## File path: csharp/src/Apache.Arrow/Arrays/ArrayData.cs ## @@ -84,7 +84,7 @@ public ArrayData Slice(int offset, int length) length =

[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-05-01 Thread GitBox
zgramana commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r418423816 ## File path: csharp/src/Apache.Arrow/Arrays/ArrayData.cs ## @@ -22,6 +22,8 @@ namespace Apache.Arrow { public sealed class ArrayData : IDisposable

[GitHub] [arrow] zgramana commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-05-01 Thread GitBox
zgramana commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r418385072 ## File path: csharp/src/Apache.Arrow/Arrays/StringArray.cs ## @@ -71,6 +76,15 @@ public string GetString(int index, Encoding encoding = default)

[GitHub] [arrow] eerhardt commented on a change in pull request #7032: ARROW-6603: [C#] Adds ArrayBuilder API to support writing null values + BooleanArray null support

2020-05-01 Thread GitBox
eerhardt commented on a change in pull request #7032: URL: https://github.com/apache/arrow/pull/7032#discussion_r417711102 ## File path: csharp/src/Apache.Arrow/Arrays/StringArray.cs ## @@ -71,6 +76,15 @@ public string GetString(int index, Encoding encoding = default)

[GitHub] [arrow] github-actions[bot] commented on pull request #7074: ARROW-8656: [Python] Switch to VS2017 in the windows wheel builds

2020-05-01 Thread GitBox
github-actions[bot] commented on pull request #7074: URL: https://github.com/apache/arrow/pull/7074#issuecomment-622397441 Revision: b65130bd5eae0e6fe79ace9d529a57f76869f621 Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] kszucs commented on pull request #7074: ARROW-8656: [Python] Switch to VS2017 in the windows wheel builds

2020-05-01 Thread GitBox
kszucs commented on pull request #7074: URL: https://github.com/apache/arrow/pull/7074#issuecomment-622396964 @github-actions crossbow submit wheel-win-*

[GitHub] [arrow] fsaintjacques commented on a change in pull request #7073: ARROW-8318: [C++][Dataset] Construct FileSystemDataset from fragments

2020-05-01 Thread GitBox
fsaintjacques commented on a change in pull request #7073: URL: https://github.com/apache/arrow/pull/7073#discussion_r418554202 ## File path: cpp/src/arrow/dataset/file_base.cc ## @@ -83,131 +83,67 @@ Result FileFragment::Scan(std::shared_ptr options

[GitHub] [arrow] fsaintjacques commented on a change in pull request #7073: ARROW-8318: [C++][Dataset] Construct FileSystemDataset from fragments

2020-05-01 Thread GitBox
fsaintjacques commented on a change in pull request #7073: URL: https://github.com/apache/arrow/pull/7073#discussion_r418556583 ## File path: cpp/src/arrow/dataset/file_base.cc ## @@ -221,42 +157,34 @@ Result> FileSystemDataset::Write( filesystem = std::make_shared();

[GitHub] [arrow] github-actions[bot] commented on pull request #7081: [CI] Cache docker volumes [WIP]

2020-05-01 Thread GitBox
github-actions[bot] commented on pull request #7081: URL: https://github.com/apache/arrow/pull/7081#issuecomment-622378960 Thanks for opening a pull request! Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Then

[GitHub] [arrow] fsaintjacques commented on pull request #7073: ARROW-8318: [C++][Dataset] Construct FileSystemDataset from fragments

2020-05-01 Thread GitBox
fsaintjacques commented on pull request #7073: URL: https://github.com/apache/arrow/pull/7073#issuecomment-622388571 The ParquetFileSystemDataset will hold a `parquet::Metadata` for example.

[GitHub] [arrow] hantusk opened a new issue #7082: pyarrow 0.17 atexit handler causes a segmentation fault

2020-05-01 Thread GitBox
hantusk opened a new issue #7082: URL: https://github.com/apache/arrow/issues/7082 When running an ASGI webapp in python with uvicorn, I am getting the following error when shutting down. Solved by reverting back to pyarrow 0.16.0 ```python Error in atexit._run_exitfuncs:

[GitHub] [arrow] kszucs commented on pull request #7081: [CI] Cache docker volumes [WIP]

2020-05-01 Thread GitBox
kszucs commented on pull request #7081: URL: https://github.com/apache/arrow/pull/7081#issuecomment-622417613 With warmed up cache the build time has been reduced to 6m from 17m which is promising.

[GitHub] [arrow] github-actions[bot] commented on pull request #7080: ARROW-8662: [CI] Consolidate appveyor scripts

2020-05-01 Thread GitBox
github-actions[bot] commented on pull request #7080: URL: https://github.com/apache/arrow/pull/7080#issuecomment-622423422 https://issues.apache.org/jira/browse/ARROW-8662

[GitHub] [arrow] kszucs commented on pull request #7080: [CI] Consolidate appveyor scripts [WIP]

2020-05-01 Thread GitBox
kszucs commented on pull request #7080: URL: https://github.com/apache/arrow/pull/7080#issuecomment-622408594 Checking that the cache properly works on my fork's master branch.

[GitHub] [arrow] kszucs commented on pull request #7073: ARROW-8318: [C++][Dataset] Construct FileSystemDataset from fragments

2020-05-01 Thread GitBox
kszucs commented on pull request #7073: URL: https://github.com/apache/arrow/pull/7073#issuecomment-622410968 I'd like to elaborate a bit more on the generic dataset class, regardless of what kind of wrappers we provide. - Do you plan to unify the filesystem classes into a single one which

[GitHub] [arrow] kszucs commented on pull request #7080: [CI] Consolidate appveyor scripts [WIP]

2020-05-01 Thread GitBox
kszucs commented on pull request #7080: URL: https://github.com/apache/arrow/pull/7080#issuecomment-622418446 Checking that the cache properly works on my fork's master branch.

[GitHub] [arrow] kszucs removed a comment on pull request #7081: [CI] Cache docker volumes [WIP]

2020-05-01 Thread GitBox
kszucs removed a comment on pull request #7081: URL: https://github.com/apache/arrow/pull/7081#issuecomment-622418381 Checking that the cache properly works on my fork's master branch.

[GitHub] [arrow] kszucs commented on pull request #7080: ARROW-8662: [CI] Consolidate appveyor scripts

2020-05-01 Thread GitBox
kszucs commented on pull request #7080: URL: https://github.com/apache/arrow/pull/7080#issuecomment-622419439 First run: https://ci.appveyor.com/project/kszucs/arrow/builds/32582056 Second run: https://ci.appveyor.com/project/kszucs/arrow/builds/32583776

[GitHub] [arrow] crd477 opened a new pull request #7083: Update building.rst

2020-05-01 Thread GitBox
crd477 opened a new pull request #7083: URL: https://github.com/apache/arrow/pull/7083 simple typo: not -> note

[GitHub] [arrow] cyb70289 commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

2020-05-01 Thread GitBox
cyb70289 commented on pull request #6985: URL: https://github.com/apache/arrow/pull/6985#issuecomment-622426187 > Take a look at how this is currently being handled in NumPy > > * https://numpy.org/neps/nep-0038-SIMD-optimizations.html > *

[GitHub] [arrow] github-actions[bot] commented on pull request #7083: ARROW-8663: [Documentation] Small correction to building.rst

2020-05-01 Thread GitBox
github-actions[bot] commented on pull request #7083: URL: https://github.com/apache/arrow/pull/7083#issuecomment-622435457 https://issues.apache.org/jira/browse/ARROW-8663

[GitHub] [arrow] fsaintjacques commented on a change in pull request #7021: ARROW-8628: [Dev] Wrap docker-compose commands with archery

2020-05-01 Thread GitBox
fsaintjacques commented on a change in pull request #7021: URL: https://github.com/apache/arrow/pull/7021#discussion_r418596433 ## File path: .github/workflows/archery.yml ## @@ -51,10 +53,12 @@ jobs: python-version: '3.7' - name: Install

[GitHub] [arrow] wesm commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

2020-05-01 Thread GitBox
wesm commented on pull request #6985: URL: https://github.com/apache/arrow/pull/6985#issuecomment-622409707 The 32-bit R failure seems like it could be real

[GitHub] [arrow] wesm edited a comment on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

2020-05-01 Thread GitBox
wesm edited a comment on pull request #6985: URL: https://github.com/apache/arrow/pull/6985#issuecomment-622409707 The 32-bit R failure seems like it could be real cc @nealrichardson

[GitHub] [arrow] fsaintjacques commented on a change in pull request #7073: ARROW-8318: [C++][Dataset] Construct FileSystemDataset from fragments

2020-05-01 Thread GitBox
fsaintjacques commented on a change in pull request #7073: URL: https://github.com/apache/arrow/pull/7073#discussion_r418564299 ## File path: cpp/src/arrow/dataset/file_base.cc ## @@ -83,131 +83,67 @@ Result FileFragment::Scan(std::shared_ptr options

[GitHub] [arrow] kszucs removed a comment on pull request #7080: [CI] Consolidate appveyor scripts [WIP]

2020-05-01 Thread GitBox
kszucs removed a comment on pull request #7080: URL: https://github.com/apache/arrow/pull/7080#issuecomment-622408594 Checking that the cache properly works on my fork's master branch.

[GitHub] [arrow] kszucs commented on pull request #7081: [CI] Cache docker volumes [WIP]

2020-05-01 Thread GitBox
kszucs commented on pull request #7081: URL: https://github.com/apache/arrow/pull/7081#issuecomment-622418381 Checking that the cache properly works on my fork's master branch.

[GitHub] [arrow] emkornfield commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

2020-05-01 Thread GitBox
emkornfield commented on pull request #6985: URL: https://github.com/apache/arrow/pull/6985#issuecomment-622422074 @nealrichardson i'm not sure what the error is saying? also 32-bit R didn't realize that is still a thing :)

[GitHub] [arrow] kszucs edited a comment on pull request #7081: [CI] Cache docker volumes [WIP]

2020-05-01 Thread GitBox
kszucs edited a comment on pull request #7081: URL: https://github.com/apache/arrow/pull/7081#issuecomment-622417613 With warmed up cache the build time has been reduced to 6m from 17m which is promising. I'll need to do some gymnastics with the cache keys because the cache plugin

[GitHub] [arrow] github-actions[bot] commented on pull request #7083: Update building.rst

2020-05-01 Thread GitBox
github-actions[bot] commented on pull request #7083: URL: https://github.com/apache/arrow/pull/7083#issuecomment-622429357 Thanks for opening a pull request! Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Then

[GitHub] [arrow] nealrichardson commented on pull request #6631: ARROW-8111: [C++][CSV] Support MM/DD/YYYY date format

2020-05-01 Thread GitBox
nealrichardson commented on pull request #6631: URL: https://github.com/apache/arrow/pull/6631#issuecomment-622437524 @github-actions autotune everything

[GitHub] [arrow] wesm commented on pull request #6631: ARROW-8111: [C++][CSV] Support MM/DD/YYYY date format

2020-05-01 Thread GitBox
wesm commented on pull request #6631: URL: https://github.com/apache/arrow/pull/6631#issuecomment-622447876 I can pick up this patch today and take it the last mile so it can be merged.

[GitHub] [arrow] jorisvandenbossche commented on pull request #6303: ARROW-8039: [Python] Use dataset API in existing parquet readers and tests

2020-04-30 Thread GitBox
jorisvandenbossche commented on pull request #6303: URL: https://github.com/apache/arrow/pull/6303#issuecomment-621937113 I finally listed the open TODO items from the discussions in this PR / the skipped tests, and opened JIRAs where this was not yet the case: - Deduplicating the

[GitHub] [arrow] wesm commented on pull request #7060: ARROW-8619: [C++] Use distinct enum values for MonthInterval, DayTimeInterval

2020-04-30 Thread GitBox
wesm commented on pull request #7060: URL: https://github.com/apache/arrow/pull/7060#issuecomment-621933237 For some reason I can't get JNI running in my local setup ``` CMake Error at /home/wesm/cpp-toolchain/share/cmake-3.16/Modules/FindPackageHandleStandardArgs.cmake:146

[GitHub] [arrow] kszucs commented on a change in pull request #7067: ARROW-8639: [C++][Plasma] Require gflags

2020-04-30 Thread GitBox
kszucs commented on a change in pull request #7067: URL: https://github.com/apache/arrow/pull/7067#discussion_r418109900 ## File path: cpp/cmake_modules/FindgflagsAlt.cmake ## @@ -15,6 +15,8 @@ # specific language governing permissions and limitations # under the License.

[GitHub] [arrow] liyafan82 commented on pull request #6425: ARROW-6111: [Java] Support LargeVarChar and LargeBinary types

2020-04-30 Thread GitBox
liyafan82 commented on pull request #6425: URL: https://github.com/apache/arrow/pull/6425#issuecomment-621827317 I have added integration test for the large varchar vector.

[GitHub] [arrow] wesm commented on pull request #7060: ARROW-8619: [C++] Use distinct enum values for MonthInterval, DayTimeInterval

2020-04-30 Thread GitBox
wesm commented on pull request #7060: URL: https://github.com/apache/arrow/pull/7060#issuecomment-621872082 Will fix the JNI issue and will notify the mailing list

[GitHub] [arrow] bkietz commented on a change in pull request #7033: ARROW-7759: [C++][Dataset] Add CsvFileFormat

2020-04-30 Thread GitBox
bkietz commented on a change in pull request #7033: URL: https://github.com/apache/arrow/pull/7033#discussion_r418108005 ## File path: cpp/src/arrow/dataset/file_csv.cc ## @@ -0,0 +1,136 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor

[GitHub] [arrow] kszucs commented on pull request #7021: ARROW-8628: [Dev] Wrap docker-compose commands with archery

2020-04-30 Thread GitBox
kszucs commented on pull request #7021: URL: https://github.com/apache/arrow/pull/7021#issuecomment-621957531 @github-actions crossbow submit test-debian-10-cpp

[GitHub] [arrow] github-actions[bot] commented on pull request #7021: ARROW-8628: [Dev] Wrap docker-compose commands with archery

2020-04-30 Thread GitBox
github-actions[bot] commented on pull request #7021: URL: https://github.com/apache/arrow/pull/7021#issuecomment-621884756 Revision: 5c0b02dd7947e1e61da701169cc5fafb9135a6e5 Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] kszucs commented on pull request #7021: ARROW-8628: [Dev] Wrap docker-compose commands with archery

2020-04-30 Thread GitBox
kszucs commented on pull request #7021: URL: https://github.com/apache/arrow/pull/7021#issuecomment-621883711 @github-actions crossbow submit test-debian-10-cpp test-debian-10-go-1.12 test-conda-python-3.7

[GitHub] [arrow] bkietz commented on pull request #7033: ARROW-7759: [C++][Dataset] Add CsvFileFormat

2020-04-30 Thread GitBox
bkietz commented on pull request #7033: URL: https://github.com/apache/arrow/pull/7033#issuecomment-621814564 I'm happy to implement whatever configuration is agreeable. I'll add a list of the approaches which have been discussed here to the follow-up so we can discuss them there.

[GitHub] [arrow] bkietz commented on a change in pull request #7033: ARROW-7759: [C++][Dataset] Add CsvFileFormat

2020-04-30 Thread GitBox
bkietz commented on a change in pull request #7033: URL: https://github.com/apache/arrow/pull/7033#discussion_r418053225 ## File path: cpp/src/arrow/dataset/file_csv.cc ## @@ -0,0 +1,144 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor

[GitHub] [arrow] markhildreth opened a new pull request #7072: ARROW-8648: [Rust] Optimize Rust CI Workflows

2020-04-30 Thread GitBox
markhildreth opened a new pull request #7072: URL: https://github.com/apache/arrow/pull/7072 WIP

[GitHub] [arrow] github-actions[bot] commented on pull request #7072: ARROW-8648: [Rust] Optimize Rust CI Workflows

2020-04-30 Thread GitBox
github-actions[bot] commented on pull request #7072: URL: https://github.com/apache/arrow/pull/7072#issuecomment-621910533 https://issues.apache.org/jira/browse/ARROW-8648

[GitHub] [arrow] bkietz commented on a change in pull request #7033: ARROW-7759: [C++][Dataset] Add CsvFileFormat

2020-04-30 Thread GitBox
bkietz commented on a change in pull request #7033: URL: https://github.com/apache/arrow/pull/7033#discussion_r418109100 ## File path: cpp/src/arrow/dataset/file_csv.cc ## @@ -0,0 +1,136 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor

[GitHub] [arrow] github-actions[bot] commented on pull request #7073: ARROW-8318: [C++][Dataset] Construct FileSystemDataset from fragments

2020-04-30 Thread GitBox
github-actions[bot] commented on pull request #7073: URL: https://github.com/apache/arrow/pull/7073#issuecomment-621937629 https://issues.apache.org/jira/browse/ARROW-8318

[GitHub] [arrow] andygrove commented on pull request #7066: ARROW-8634: [Java] Add Java examples

2020-04-30 Thread GitBox
andygrove commented on pull request #7066: URL: https://github.com/apache/arrow/pull/7066#issuecomment-621873527 Ah, I wish I had found this documentation before I started using Arrow Java! OK, I will just add links to the README instead then. Thanks.

[GitHub] [arrow] liyafan82 commented on pull request #7071: ARROW-7955: [Java] Support large buffer for file/stream IPC

2020-04-30 Thread GitBox
liyafan82 commented on pull request #7071: URL: https://github.com/apache/arrow/pull/7071#issuecomment-621896058 Also add an integration test for VarCharVector, as it is possible that the size of the offset buffer be larger than Integer.MAX_VALUE

[GitHub] [arrow] fsaintjacques opened a new pull request #7073: ARROW-8318: [C++][Dataset] Construct FileSystemDataset from fragments

2020-04-30 Thread GitBox
fsaintjacques opened a new pull request #7073: URL: https://github.com/apache/arrow/pull/7073 * Simplified FileSystemDataset to hold a FragmentVector. Each Fragment must be a FileFragment and is checked at `FileSystemDataset::Make`. Fragments are not required to use the same

[GitHub] [arrow] github-actions[bot] commented on pull request #7021: ARROW-8628: [Dev] Wrap docker-compose commands with archery

2020-04-30 Thread GitBox
github-actions[bot] commented on pull request #7021: URL: https://github.com/apache/arrow/pull/7021#issuecomment-621958397 Revision: 0b532b06da2ffab64d34d4420790eee3e4f64ca3 Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] bkietz commented on a change in pull request #7033: ARROW-7759: [C++][Dataset] Add CsvFileFormat

2020-04-30 Thread GitBox
bkietz commented on a change in pull request #7033: URL: https://github.com/apache/arrow/pull/7033#discussion_r418111903 ## File path: cpp/src/arrow/dataset/file_csv.cc ## @@ -0,0 +1,136 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more contributor

[GitHub] [arrow] jorisvandenbossche commented on pull request #7073: ARROW-8318: [C++][Dataset] Construct FileSystemDataset from fragments

2020-04-30 Thread GitBox
jorisvandenbossche commented on pull request #7073: URL: https://github.com/apache/arrow/pull/7073#issuecomment-621940060 > Fragments are not required to use the same backing filesystem nor the same format. Shouldn't we require that? That seems the goal of UnionDataset to combine

[GitHub] [arrow] fsaintjacques commented on a change in pull request #7033: ARROW-7759: [C++][Dataset] Add CsvFileFormat

2020-04-30 Thread GitBox
fsaintjacques commented on a change in pull request #7033: URL: https://github.com/apache/arrow/pull/7033#discussion_r417959216 ## File path: cpp/src/arrow/dataset/file_csv.cc ## @@ -0,0 +1,136 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more

[GitHub] [arrow] kszucs commented on pull request #7021: ARROW-8628: [Dev] Wrap docker-compose commands with archery

2020-04-30 Thread GitBox
kszucs commented on pull request #7021: URL: https://github.com/apache/arrow/pull/7021#issuecomment-621946996 @github-actions crossbow submit test-debian-10-cpp

[GitHub] [arrow] github-actions[bot] commented on pull request #7021: ARROW-8628: [Dev] Wrap docker-compose commands with archery

2020-04-30 Thread GitBox
github-actions[bot] commented on pull request #7021: URL: https://github.com/apache/arrow/pull/7021#issuecomment-621947873 Revision: 0d768d6948627f165819cbb4542fca1503a340b1 Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] sunchao commented on pull request #7076: ARROW-8659: [Rust] ListBuilder allocate with_capacity

2020-05-01 Thread GitBox
sunchao commented on pull request #7076: URL: https://github.com/apache/arrow/pull/7076#issuecomment-622616144 Merged. Thanks @tustvold !

[GitHub] [arrow] kou edited a comment on pull request #7085: ARROW-8668: [Packaging][APT][Yum][ARM] Use Travis CI's ARM machine to build packages

2020-05-01 Thread GitBox
kou edited a comment on pull request #7085: URL: https://github.com/apache/arrow/pull/7085#issuecomment-622586181 @kszucs I want to set `DOCKERHUB_USER` and `DOCKERHUB_TOKEN` in https://travis-ci.org/github/ursa-labs/crossbow and https://github.com/ursa-labs/crossbow. Which user should we

[GitHub] [arrow] kszucs commented on pull request #7085: ARROW-8668: [Packaging][APT][Yum][ARM] Use Travis CI's ARM machine to build packages

2020-05-01 Thread GitBox
kszucs commented on pull request #7085: URL: https://github.com/apache/arrow/pull/7085#issuecomment-622672970 @kou yes we use asf provided credentials on github actions to upload the images. We need a user with write access to that repository with a custom dockerhub token. Just granted

[GitHub] [arrow] kou commented on pull request #7085: ARROW-8668: [Packaging][APT][Yum][ARM] Use Travis CI's ARM machine to build packages

2020-05-02 Thread GitBox
kou commented on pull request #7085: URL: https://github.com/apache/arrow/pull/7085#issuecomment-622731551 @github-actions crossbow submit -g linux

[GitHub] [arrow] kou commented on pull request #7085: ARROW-8668: [Packaging][APT][Yum][ARM] Use Travis CI's ARM machine to build packages

2020-05-02 Thread GitBox
kou commented on pull request #7085: URL: https://github.com/apache/arrow/pull/7085#issuecomment-622731373 Thanks! I set mine to https://travis-ci.org/github/ursa-labs/crossbow and https://github.com/ursa-labs/crossbow.

[GitHub] [arrow] xhochy commented on pull request #7074: ARROW-8656: [Python] Switch to VS2017 in the windows wheel builds

2020-05-02 Thread GitBox
xhochy commented on pull request #7074: URL: https://github.com/apache/arrow/pull/7074#issuecomment-622729700 Actually, I think the most practical approach would be to use the conda-forge recipes, patch them to use `vs2015` and upload them to a separate channel. This gives us the build

[GitHub] [arrow] nealrichardson commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

2020-05-01 Thread GitBox
nealrichardson commented on pull request #6985: URL: https://github.com/apache/arrow/pull/6985#issuecomment-622495803 For what it's worth, R on Windows uses mingw, not msvc

[GitHub] [arrow] wesm commented on pull request #7074: ARROW-8656: [Python] Switch to VS2017 in the windows wheel builds

2020-05-01 Thread GitBox
wesm commented on pull request #7074: URL: https://github.com/apache/arrow/pull/7074#issuecomment-622496448 Is this change necessary? I understand why we are using VS2017 in the conda package but why in the wheels?

[GitHub] [arrow] wesm commented on pull request #7077: ARROW-8660: [C++][Gandiva] Reduce usage of Boost in Gandiva codebase

2020-05-01 Thread GitBox
wesm commented on pull request #7077: URL: https://github.com/apache/arrow/pull/7077#issuecomment-622513296 +1

[GitHub] [arrow] mayuropensource commented on pull request #7022: ARROW-8562: [C++] IO: Parameterize I/O Coalescing using S3 metrics

2020-05-01 Thread GitBox
mayuropensource commented on pull request #7022: URL: https://github.com/apache/arrow/pull/7022#issuecomment-622581788 thank you @wesm

[GitHub] [arrow] wesm edited a comment on pull request #6631: ARROW-8111: [C++][CSV] Support MM/DD/YYYY date format

2020-05-01 Thread GitBox
wesm edited a comment on pull request #6631: URL: https://github.com/apache/arrow/pull/6631#issuecomment-622586463 Problems: * There aren't any unit tests in this patch so there is some work to do to get this merged * Code is duplicated from arrow/util/parsing.h I started

[GitHub] [arrow] github-actions[bot] commented on pull request #7085: ARROW-8668: [Packaging][APT][Yum][ARM] Use Travis CI's ARM machine to build packages

2020-05-02 Thread GitBox
github-actions[bot] commented on pull request #7085: URL: https://github.com/apache/arrow/pull/7085#issuecomment-622733242 Revision: c73b2b65c373892f95dab8d65cf6fae06a39fe68 Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] xhochy commented on pull request #7074: ARROW-8656: [Python] Switch to VS2017 in the windows wheel builds

2020-05-02 Thread GitBox
xhochy commented on pull request #7074: URL: https://github.com/apache/arrow/pull/7074#issuecomment-622697125 We would need to switch away from using the compiled conda packages as the basis for the wheels. I'm not sure if there is a different source that could provide except Arrow

[GitHub] [arrow] kou commented on pull request #7085: ARROW-8668: [Packaging][APT][Yum][ARM] Use Travis CI's ARM machine to build packages

2020-05-02 Thread GitBox
kou commented on pull request #7085: URL: https://github.com/apache/arrow/pull/7085#issuecomment-622885428 @github-actions crossbow submit -g linux

[GitHub] [arrow] github-actions[bot] commented on pull request #7085: ARROW-8668: [Packaging][APT][Yum][ARM] Use Travis CI's ARM machine to build packages

2020-05-02 Thread GitBox
github-actions[bot] commented on pull request #7085: URL: https://github.com/apache/arrow/pull/7085#issuecomment-622887235 Revision: c73b2b65c373892f95dab8d65cf6fae06a39fe68 Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] nevi-me commented on pull request #7037: ARROW-6718: [Rust] Remove packed_simd

2020-05-02 Thread GitBox
nevi-me commented on pull request #7037: URL: https://github.com/apache/arrow/pull/7037#issuecomment-622889198 Hi @yordan-pavlov, we want to remove packed_simd due to the uncertainty with it being stabilised soon. We so far found that if we optimise some non-SIMD code, we don't lose a lot

[GitHub] [arrow] xhochy commented on pull request #7074: ARROW-8656: [Python] Switch to VS2017 in the windows wheel builds

2020-05-01 Thread GitBox
xhochy commented on pull request #7074: URL: https://github.com/apache/arrow/pull/7074#issuecomment-622494222 > @xhochy to run the wheels, or build them? To run.

[GitHub] [arrow] tobim commented on pull request #7038: ARROW-8593: [C++][Parquet] Fix build with musl libc

2020-05-01 Thread GitBox
tobim commented on pull request #7038: URL: https://github.com/apache/arrow/pull/7038#issuecomment-622535149 @fsaintjacques @emkornfield sorry for the long silence, I updated the commit as you suggested.

[GitHub] [arrow] pauldix commented on a change in pull request #7064: ARROW-6945: [Rust] WIP: Add initial skeleton for Rust integration tests

2020-05-01 Thread GitBox
pauldix commented on a change in pull request #7064: URL: https://github.com/apache/arrow/pull/7064#discussion_r418732315 ## File path: rust/arrow/Cargo.toml ## @@ -50,6 +50,7 @@ chrono = "0.4" flatbuffers = "0.6" hex = "0.4" arrow-flight = { path = "../arrow-flight",

[GitHub] [arrow] wesm edited a comment on pull request #6631: ARROW-8111: [C++][CSV] Support MM/DD/YYYY date format

2020-05-01 Thread GitBox
wesm edited a comment on pull request #6631: URL: https://github.com/apache/arrow/pull/6631#issuecomment-622586463 Problems: * There aren't any unit tests in this patch so there is some work to do to get this merged * Code is duplicated from arrow/util/parsing.h

[GitHub] [arrow] wesm edited a comment on pull request #7074: ARROW-8656: [Python] Switch to VS2017 in the windows wheel builds

2020-05-01 Thread GitBox
wesm edited a comment on pull request #7074: URL: https://github.com/apache/arrow/pull/7074#issuecomment-622496448 Is this change necessary? I understand why we are using VS2017 in the conda package but why in the wheels? I'm sort of -0.5 on this unless there is a concrete reason why we

[GitHub] [arrow] lidavidm commented on pull request #6744: PARQUET-1820: [C++] pre-buffer specified columns of row group

2020-05-01 Thread GitBox
lidavidm commented on pull request #6744: URL: https://github.com/apache/arrow/pull/6744#issuecomment-622526177 Thank you both for all the feedback!

[GitHub] [arrow] github-actions[bot] commented on pull request #7085: ARROW-8668: [Packaging][APT][Yum][ARM] Use Travis CI's ARM machine to build packages

2020-05-01 Thread GitBox
github-actions[bot] commented on pull request #7085: URL: https://github.com/apache/arrow/pull/7085#issuecomment-622567576 https://issues.apache.org/jira/browse/ARROW-8668

[GitHub] [arrow] kou commented on pull request #7085: ARROW-8668: [Packaging][APT][Yum][ARM] Use Travis CI's ARM machine to build packages

2020-05-01 Thread GitBox
kou commented on pull request #7085: URL: https://github.com/apache/arrow/pull/7085#issuecomment-622586181 @kszucs I want to set `DOCKERHUB_USER` and `DOCKERHUB_TOKEN` in https://travis-ci.org/github/ursa-labs/crossbow . Which user should we use for this? It seems that we use them for

[GitHub] [arrow] kou opened a new pull request #7085: ARROW-8668: [Packaging][APT][Yum][ARM] Use Travis CI's ARM machine to build packages

2020-05-01 Thread GitBox
kou opened a new pull request #7085: URL: https://github.com/apache/arrow/pull/7085 If we use QEMU on GitHub Actions, it takes 6h+. If we use an ARM machine on Travis CI, it takes 30-40m. This change adds Docker image caching to

[GitHub] [arrow] kou commented on pull request #7085: ARROW-8668: [Packaging][APT][Yum][ARM] Use Travis CI's ARM machine to build packages

2020-05-01 Thread GitBox
kou commented on pull request #7085: URL: https://github.com/apache/arrow/pull/7085#issuecomment-622584942 @github-actions crossbow submit -g linux

[GitHub] [arrow] wesm commented on pull request #6631: ARROW-8111: [C++][CSV] Support MM/DD/YYYY date format

2020-05-01 Thread GitBox
wesm commented on pull request #6631: URL: https://github.com/apache/arrow/pull/6631#issuecomment-622586463 There aren't any unit tests in this patch so there is some work to do to get this merged

[GitHub] [arrow] fsaintjacques commented on pull request #7074: ARROW-8656: [Python] Switch to VS2017 in the windows wheel builds

2020-05-01 Thread GitBox
fsaintjacques commented on pull request #7074: URL: https://github.com/apache/arrow/pull/7074#issuecomment-622491245 @xhochy to run the wheels, or build them?

[GitHub] [arrow] kszucs commented on pull request #7021: ARROW-8628: [Dev] Wrap docker-compose commands with archery

2020-05-01 Thread GitBox
kszucs commented on pull request #7021: URL: https://github.com/apache/arrow/pull/7021#issuecomment-622578059 Thanks for the reviews, addressing them.

[GitHub] [arrow] wesm commented on pull request #7060: ARROW-8619: [C++] Use distinct enum values for MonthInterval, DayTimeInterval

2020-05-01 Thread GitBox
wesm commented on pull request #7060: URL: https://github.com/apache/arrow/pull/7060#issuecomment-622499382 +1

[GitHub] [arrow] wesm commented on pull request #6744: PARQUET-1820: [C++] pre-buffer specified columns of row group

2020-05-01 Thread GitBox
wesm commented on pull request #6744: URL: https://github.com/apache/arrow/pull/6744#issuecomment-622529037 thanks @lidavidm! I'm confident we'll be able to devise some solutions to the resource allocation problem

[GitHub] [arrow] wesm commented on issue #7082: pyarrow 0.17 atexit handler causes a segmentation fault

2020-05-01 Thread GitBox
wesm commented on issue #7082: URL: https://github.com/apache/arrow/issues/7082#issuecomment-622500836 Can you please open a JIRA issue and provide more information about your system configuration? We saw this error inside GitHub Actions but I haven't been able to reproduce it locally

[GitHub] [arrow] BryanCutler commented on pull request #6425: ARROW-6111: [Java] Support LargeVarChar and LargeBinary types

2020-05-01 Thread GitBox
BryanCutler commented on pull request #6425: URL: https://github.com/apache/arrow/pull/6425#issuecomment-622555307 Yes, please enable integration tests and let's make sure it passes before merging this.

[GitHub] [arrow] github-actions[bot] commented on pull request #7085: ARROW-8668: [Packaging][APT][Yum][ARM] Use Travis CI's ARM machine to build packages

2020-05-01 Thread GitBox
github-actions[bot] commented on pull request #7085: URL: https://github.com/apache/arrow/pull/7085#issuecomment-622585325 Revision: c73b2b65c373892f95dab8d65cf6fae06a39fe68 Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] wesm commented on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

2020-05-01 Thread GitBox
wesm commented on pull request #6985: URL: https://github.com/apache/arrow/pull/6985#issuecomment-622493938 > I might lean towards macros around FMV for clang/GCC that could enable fallback to a slow version for MSVC FTR, it would seem unfortunate to do the work of SIMD-ifying code

[GitHub] [arrow] wesm edited a comment on pull request #6985: ARROW-8413: [C++][Parquet] Refactor Generating validity bitmap for values column

2020-05-01 Thread GitBox
wesm edited a comment on pull request #6985: URL: https://github.com/apache/arrow/pull/6985#issuecomment-622493938 > I might lean towards macros around FMV for clang/GCC that could enable fallback to a slow version for MSVC FTR, it would seem unfortunate to do the work of

[GitHub] [arrow] fsaintjacques commented on pull request #7038: ARROW-8593: [C++][Parquet] Fix build with musl libc

2020-05-01 Thread GitBox
fsaintjacques commented on pull request #7038: URL: https://github.com/apache/arrow/pull/7038#issuecomment-622503258 @tobim, I do not have the rights to push-force on this branch. You can apply this locally: ``` diff --git a/cpp/src/parquet/file_serialize_test.cc
