[GitHub] [arrow] romainfrancois commented on a change in pull request #7514: ARROW-6235: [R] Implement conversion from arrow::BinaryArray to R character vector

2020-06-26 Thread GitBox
romainfrancois commented on a change in pull request #7514: URL: https://github.com/apache/arrow/pull/7514#discussion_r446483934 ## File path: r/tests/testthat/test-Array.R ## @@ -18,16 +18,16 @@ context("Array") expect_array_roundtrip <- function(x, type) { - a <- Array$c

[GitHub] [arrow] wjones1 edited a comment on pull request #6979: ARROW-7800 [Python] implement iter_batches() method for ParquetFile and ParquetReader

2020-06-26 Thread GitBox
wjones1 edited a comment on pull request #6979: URL: https://github.com/apache/arrow/pull/6979#issuecomment-650472498 Apologies, I've been away for a bit. I thought I had invited @sonthonaxrk as a collaborator on my fork, but perhaps that didn't go through. Addressed the minor feedback.

[GitHub] [arrow] wjones1 edited a comment on pull request #6979: ARROW-7800 [Python] implement iter_batches() method for ParquetFile and ParquetReader

2020-06-26 Thread GitBox
wjones1 edited a comment on pull request #6979: URL: https://github.com/apache/arrow/pull/6979#issuecomment-650474759 RE: @jorisvandenbossche > Same question as in the other PR: does setting the batch size also influence existing methods like `read` or `read_row_group` ? Should we add t

[GitHub] [arrow] wjones1 commented on pull request #6979: ARROW-7800 [Python] implement iter_batches() method for ParquetFile and ParquetReader

2020-06-26 Thread GitBox
wjones1 commented on pull request #6979: URL: https://github.com/apache/arrow/pull/6979#issuecomment-650474759 RE: @jorisvandenbossche > Same question as in the other PR: does setting the batch size also influence existing methods like `read` or `read_row_group` ? Should we add that ke

[GitHub] [arrow] kou closed pull request #7553: ARROW-9234: [GLib][CUDA] Add support for dictionary memo on reading record batch from buffer

2020-06-26 Thread GitBox
kou closed pull request #7553: URL: https://github.com/apache/arrow/pull/7553 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the sp

[GitHub] [arrow] kou commented on pull request #7553: ARROW-9234: [GLib][CUDA] Add support for dictionary memo on reading record batch from buffer

2020-06-26 Thread GitBox
kou commented on pull request #7553: URL: https://github.com/apache/arrow/pull/7553#issuecomment-650473985 +1

[GitHub] [arrow] wjones1 commented on pull request #6979: ARROW-7800 [Python] implement iter_batches() method for ParquetFile and ParquetReader

2020-06-26 Thread GitBox
wjones1 commented on pull request #6979: URL: https://github.com/apache/arrow/pull/6979#issuecomment-650472498 Apologies, I've been away for a bit. I thought I had invited @sonthonaxrk as a collaborator on my fork, but perhaps that didn't go through. Addressed the minor feedback.

[GitHub] [arrow] wjones1 commented on a change in pull request #6979: ARROW-7800 [Python] implement iter_batches() method for ParquetFile and ParquetReader

2020-06-26 Thread GitBox
wjones1 commented on a change in pull request #6979: URL: https://github.com/apache/arrow/pull/6979#discussion_r446469923 ## File path: python/pyarrow/parquet.py ## @@ -310,6 +310,44 @@ def read_row_groups(self, row_groups, columns=None, use_threads=True,

[GitHub] [arrow] github-actions[bot] commented on pull request #7553: ARROW-9234: [GLib][CUDA] Add support for dictionary memo on reading record batch from buffer

2020-06-26 Thread GitBox
github-actions[bot] commented on pull request #7553: URL: https://github.com/apache/arrow/pull/7553#issuecomment-650466293 Revision: 6d42269b47130742abe2719a469410a214b87de6 Submitted crossbow builds: [ursa-labs/crossbow @ actions-361](https://github.com/ursa-labs/crossbow/branches/a

[GitHub] [arrow] kou commented on pull request #7553: ARROW-9234: [GLib][CUDA] Add support for dictionary memo on reading record batch from buffer

2020-06-26 Thread GitBox
kou commented on pull request #7553: URL: https://github.com/apache/arrow/pull/7553#issuecomment-650466094 @github-actions crossbow submit -g linux

[GitHub] [arrow] scampi commented on a change in pull request #6402: ARROW-7831: [Java] do not allocate a new offset buffer if the slice starts at 0 since the relative offset pointer would be unchange

2020-06-26 Thread GitBox
scampi commented on a change in pull request #6402: URL: https://github.com/apache/arrow/pull/6402#discussion_r446429063 ## File path: java/vector/src/main/java/org/apache/arrow/vector/BaseVariableWidthVector.java ## @@ -751,55 +757,57 @@ private void splitAndTransferOffsetBuf

[GitHub] [arrow] github-actions[bot] commented on pull request #7553: ARROW-9234: [GLib][CUDA] Add support for dictionary memo on reading record batch from buffer

2020-06-26 Thread GitBox
github-actions[bot] commented on pull request #7553: URL: https://github.com/apache/arrow/pull/7553#issuecomment-650412393 https://issues.apache.org/jira/browse/ARROW-9234

[GitHub] [arrow] kou opened a new pull request #7553: ARROW-9234: [GLib][CUDA] Add support for dictionary memo on reading record batch from buffer

2020-06-26 Thread GitBox
kou opened a new pull request #7553: URL: https://github.com/apache/arrow/pull/7553 This is a follow up task for https://github.com/apache/arrow/pull/7263 .

[GitHub] [arrow] fsaintjacques closed pull request #7526: ARROW-9146: [C++][Dataset] Lazily store fragment physical schema

2020-06-26 Thread GitBox
fsaintjacques closed pull request #7526: URL: https://github.com/apache/arrow/pull/7526

[GitHub] [arrow] wesm closed pull request #6725: ARROW-8226: [Go] Implement 64 bit offsets binary builder

2020-06-26 Thread GitBox
wesm closed pull request #6725: URL: https://github.com/apache/arrow/pull/6725

[GitHub] [arrow] wesm commented on pull request #6725: ARROW-8226: [Go] Implement 64 bit offsets binary builder

2020-06-26 Thread GitBox
wesm commented on pull request #6725: URL: https://github.com/apache/arrow/pull/6725#issuecomment-650356376 Well, that is definitely a bummer. I hope to see some growth in the Go developer community here in the future. I'll close this PR for now then

[GitHub] [arrow] richardartoul commented on pull request #6725: ARROW-8226: [Go] Implement 64 bit offsets binary builder

2020-06-26 Thread GitBox
richardartoul commented on pull request #6725: URL: https://github.com/apache/arrow/pull/6725#issuecomment-650355331 Just wanted to share that I probably can't contribute any more time to getting this merged, we've moved away from Arrow for our project (we still use the format in some case

[GitHub] [arrow] nealrichardson commented on pull request #7526: ARROW-9146: [C++][Dataset] Lazily store fragment physical schema

2020-06-26 Thread GitBox
nealrichardson commented on pull request #7526: URL: https://github.com/apache/arrow/pull/7526#issuecomment-650289073 I'll merge after Appveyor passes, ignoring Travis

[GitHub] [arrow] kszucs commented on pull request #7478: ARROW-9055: [C++] Add sum/mean/minmax kernels for Boolean type

2020-06-26 Thread GitBox
kszucs commented on pull request #7478: URL: https://github.com/apache/arrow/pull/7478#issuecomment-650267609 @ursabot build

[GitHub] [arrow] wesm merged pull request #7552: [CI] Set allow_failures on Travis CI jobs until they stop being broken

2020-06-26 Thread GitBox
wesm merged pull request #7552: URL: https://github.com/apache/arrow/pull/7552

[GitHub] [arrow] wesm commented on pull request #7552: [CI] Set allow_failures on Travis CI jobs until they stop being broken

2020-06-26 Thread GitBox
wesm commented on pull request #7552: URL: https://github.com/apache/arrow/pull/7552#issuecomment-650260460 +1

[GitHub] [arrow] github-actions[bot] commented on pull request #7552: [CI] Set allow_failures on Travis CI jobs until they stop being broken

2020-06-26 Thread GitBox
github-actions[bot] commented on pull request #7552: URL: https://github.com/apache/arrow/pull/7552#issuecomment-650259300 Thanks for opening a pull request! Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Then could

[GitHub] [arrow] romainfrancois commented on a change in pull request #7514: ARROW-6235: [R] Implement conversion from arrow::BinaryArray to R character vector

2020-06-26 Thread GitBox
romainfrancois commented on a change in pull request #7514: URL: https://github.com/apache/arrow/pull/7514#discussion_r446269445 ## File path: r/tests/testthat/test-Array.R ## @@ -18,16 +18,16 @@ context("Array") expect_array_roundtrip <- function(x, type) { - a <- Array$c

[GitHub] [arrow] romainfrancois commented on a change in pull request #7514: ARROW-6235: [R] Implement conversion from arrow::BinaryArray to R character vector

2020-06-26 Thread GitBox
romainfrancois commented on a change in pull request #7514: URL: https://github.com/apache/arrow/pull/7514#discussion_r446269132 ## File path: r/tests/testthat/test-Array.R ## @@ -18,16 +18,16 @@ context("Array") expect_array_roundtrip <- function(x, type) { - a <- Array$c

[GitHub] [arrow] wesm opened a new pull request #7552: [CI] Set allow_failures on Travis CI jobs until they stop being broken

2020-06-26 Thread GitBox
wesm opened a new pull request #7552: URL: https://github.com/apache/arrow/pull/7552 Travis CI has been flaky for days. This is adding a lot of noise to our workflows so I think we should allow them to fail until they become more consistently happy.

[GitHub] [arrow] kszucs commented on a change in pull request #7478: ARROW-9055: [C++] Add sum/mean/minmax kernels for Boolean type

2020-06-26 Thread GitBox
kszucs commented on a change in pull request #7478: URL: https://github.com/apache/arrow/pull/7478#discussion_r446266184 ## File path: cpp/src/arrow/compute/kernels/aggregate_test.cc ## @@ -399,15 +434,59 @@ class TestNumericMinMaxKernel : public ::testing::Test { }; templa

[GitHub] [arrow] kszucs commented on a change in pull request #7478: ARROW-9055: [C++] Add sum/mean/minmax kernels for Boolean type

2020-06-26 Thread GitBox
kszucs commented on a change in pull request #7478: URL: https://github.com/apache/arrow/pull/7478#discussion_r446264863 ## File path: cpp/src/arrow/compute/kernels/aggregate_basic.cc ## @@ -397,24 +452,26 @@ struct MinMaxImpl : public ScalarAggregator { ArrayType arr(ba

[GitHub] [arrow] nealrichardson commented on pull request #7524: ARROW-8899 [R] Add R metadata like pandas metadata for round-trip fidelity

2020-06-26 Thread GitBox
nealrichardson commented on pull request #7524: URL: https://github.com/apache/arrow/pull/7524#issuecomment-650249655 @romainfrancois regarding tests, I think a fixture something like ```r df <- tibble::tibble( a = structure("one", class = "special_string"), b = 2,

[GitHub] [arrow] wesm commented on a change in pull request #7478: ARROW-9055: [C++] Add sum/mean/minmax kernels for Boolean type

2020-06-26 Thread GitBox
wesm commented on a change in pull request #7478: URL: https://github.com/apache/arrow/pull/7478#discussion_r446262302 ## File path: cpp/src/arrow/compute/kernels/aggregate_test.cc ## @@ -399,15 +434,59 @@ class TestNumericMinMaxKernel : public ::testing::Test { }; template

[GitHub] [arrow] paddyhoran commented on a change in pull request #7500: ARROW-9191: [Rust] Do not panic when milliseconds is less than zero as chrono can handle…

2020-06-26 Thread GitBox
paddyhoran commented on a change in pull request #7500: URL: https://github.com/apache/arrow/pull/7500#discussion_r446258243 ## File path: rust/parquet/src/record/api.rs ## @@ -893,16 +893,6 @@ mod tests { assert_eq!(row, Field::TimestampMillis(123854406)); }

[GitHub] [arrow] nealrichardson commented on pull request #7526: ARROW-9146: [C++][Dataset] Lazily store fragment physical schema

2020-06-26 Thread GitBox
nealrichardson commented on pull request #7526: URL: https://github.com/apache/arrow/pull/7526#issuecomment-650242207 This has a python lint failure: https://github.com/apache/arrow/pull/7526/checks?check_run_id=811470274#step:7:1377

[GitHub] [arrow] nealrichardson closed pull request #7550: ARROW-9219: [R] coerce_timestamps in Parquet write options does not work

2020-06-26 Thread GitBox
nealrichardson closed pull request #7550: URL: https://github.com/apache/arrow/pull/7550

[GitHub] [arrow] nealrichardson closed pull request #7527: ARROW-7018: [R] Non-UTF-8 data in Arrow <--> R conversion

2020-06-26 Thread GitBox
nealrichardson closed pull request #7527: URL: https://github.com/apache/arrow/pull/7527

[GitHub] [arrow] nealrichardson commented on pull request #7527: ARROW-7018: [R] Non-UTF-8 data in Arrow <--> R conversion

2020-06-26 Thread GitBox
nealrichardson commented on pull request #7527: URL: https://github.com/apache/arrow/pull/7527#issuecomment-650237185 Thanks, I think we should just revisit further work once `cpp11` is available, see how much of that is handled.

[GitHub] [arrow] nealrichardson commented on a change in pull request #7527: ARROW-7018: [R] Non-UTF-8 data in Arrow <--> R conversion

2020-06-26 Thread GitBox
nealrichardson commented on a change in pull request #7527: URL: https://github.com/apache/arrow/pull/7527#discussion_r446247176 ## File path: r/R/schema.R ## @@ -83,16 +83,21 @@ Schema <- R6Class("Schema", } ), active = list( -names = function() Schema__field_na

[GitHub] [arrow] nealrichardson commented on a change in pull request #7514: ARROW-6235: [R] Implement conversion from arrow::BinaryArray to R character vector

2020-06-26 Thread GitBox
nealrichardson commented on a change in pull request #7514: URL: https://github.com/apache/arrow/pull/7514#discussion_r446237067 ## File path: r/tests/testthat/test-Array.R ## @@ -18,16 +18,16 @@ context("Array") expect_array_roundtrip <- function(x, type) { - a <- Array$c

[GitHub] [arrow] jacques-n commented on pull request #7030: ARROW-7808: [Java][Dataset] Implement Datasets Java API by JNI to C++

2020-06-26 Thread GitBox
jacques-n commented on pull request #7030: URL: https://github.com/apache/arrow/pull/7030#issuecomment-650227890 > 1. We are package-hacking `org.apache.arrow.memory` in module `arrow-dataset`. >Yes `Ownerships.java` uses some hacks, so I've removed the class in latest commits (

[GitHub] [arrow] jacques-n commented on a change in pull request #7030: ARROW-7808: [Java][Dataset] Implement Datasets Java API by JNI to C++

2020-06-26 Thread GitBox
jacques-n commented on a change in pull request #7030: URL: https://github.com/apache/arrow/pull/7030#discussion_r446237545 ## File path: java/dataset/src/main/java/org/apache/arrow/memory/NativeUnderlingMemory.java ## @@ -0,0 +1,65 @@ +/* + * Licensed to the Apache Software F

[GitHub] [arrow] romainfrancois commented on a change in pull request #7527: ARROW-7018: [R] Non-UTF-8 data in Arrow <--> R conversion

2020-06-26 Thread GitBox
romainfrancois commented on a change in pull request #7527: URL: https://github.com/apache/arrow/pull/7527#discussion_r446220256 ## File path: r/src/array_from_vector.cpp ## @@ -159,6 +159,9 @@ struct VectorToArrayConverter { if (s == NA_STRING) { RETURN_NOT_OK(

[GitHub] [arrow] kszucs commented on a change in pull request #7478: ARROW-9055: [C++] Add sum/mean/minmax kernels for Boolean type

2020-06-26 Thread GitBox
kszucs commented on a change in pull request #7478: URL: https://github.com/apache/arrow/pull/7478#discussion_r446219965 ## File path: cpp/src/arrow/compute/kernels/aggregate_test.cc ## @@ -399,15 +434,59 @@ class TestNumericMinMaxKernel : public ::testing::Test { }; templa

[GitHub] [arrow] romainfrancois commented on pull request #7514: ARROW-6235: [R] Implement conversion from arrow::BinaryArray to R character vector

2020-06-26 Thread GitBox
romainfrancois commented on pull request #7514: URL: https://github.com/apache/arrow/pull/7514#issuecomment-650204663 Added support for LargeList: ``` r library(arrow, warn.conflicts = FALSE) a <- Array$create(list(integer()), type = large_list_of(int32())) a #> Large

[GitHub] [arrow] github-actions[bot] commented on pull request #7551: ARROW-9132: [C++] Support Unique and ValueCounts on dictionary data with non-changing dictionaries, add ChunkedArray::Make validat

2020-06-26 Thread GitBox
github-actions[bot] commented on pull request #7551: URL: https://github.com/apache/arrow/pull/7551#issuecomment-650203303 https://issues.apache.org/jira/browse/ARROW-9132

[GitHub] [arrow] wesm commented on a change in pull request #7551: ARROW-9132: [C++] Support Unique and ValueCounts on dictionary data with non-changing dictionaries, add ChunkedArray::Make validating

2020-06-26 Thread GitBox
wesm commented on a change in pull request #7551: URL: https://github.com/apache/arrow/pull/7551#discussion_r446206568 ## File path: cpp/src/arrow/chunked_array.cc ## @@ -64,6 +64,24 @@ ChunkedArray::ChunkedArray(ArrayVector chunks, std::shared_ptr type) } } +Result> Chu

[GitHub] [arrow] wesm opened a new pull request #7551: ARROW-9132: [C++] Support Unique and ValueCounts on dictionary data with non-changing dictionaries, add ChunkedArray::Make validating constructor

2020-06-26 Thread GitBox
wesm opened a new pull request #7551: URL: https://github.com/apache/arrow/pull/7551 This dispatches to the hash function for the indices while checking that the dictionaries stay the same on each processed chunk

[GitHub] [arrow] kszucs commented on pull request #7519: ARROW-9017: [C++][Python] Refactor scalar bindings

2020-06-26 Thread GitBox
kszucs commented on pull request #7519: URL: https://github.com/apache/arrow/pull/7519#issuecomment-650158531 I'm considering applying [cython.freelist](https://cython.readthedocs.io/en/latest/src/userguide/extension_types.html#fast-instantiation) on the scalar extension classes. @w

[GitHub] [arrow] kszucs commented on a change in pull request #7519: ARROW-9017: [C++][Python] Refactor scalar bindings

2020-06-26 Thread GitBox
kszucs commented on a change in pull request #7519: URL: https://github.com/apache/arrow/pull/7519#discussion_r446160460 ## File path: python/pyarrow/scalar.pxi ## @@ -16,1198 +16,704 @@ # under the License. -_NULL = NA = None - - cdef class Scalar: """ -The base

[GitHub] [arrow] wesm commented on pull request #7514: ARROW-6235: [R] Implement conversion from arrow::BinaryArray to R character vector

2020-06-26 Thread GitBox
wesm commented on pull request #7514: URL: https://github.com/apache/arrow/pull/7514#issuecomment-650151220 > @wesm I don't know what you mean by the `BitBlockCounter` treatment. Ah sorry, welcome back =) We created some facilities to improve the performance of processing validity bi

[GitHub] [arrow] wesm closed pull request #7541: ARROW-9224: [Dev][Archery] clone local source with --shared

2020-06-26 Thread GitBox
wesm closed pull request #7541: URL: https://github.com/apache/arrow/pull/7541

[GitHub] [arrow] wesm commented on pull request #7541: ARROW-9224: [Dev][Archery] clone local source with --shared

2020-06-26 Thread GitBox
wesm commented on pull request #7541: URL: https://github.com/apache/arrow/pull/7541#issuecomment-650150564 thanks!

[GitHub] [arrow] kszucs commented on pull request #7519: ARROW-9017: [C++][Python] Refactor scalar bindings

2020-06-26 Thread GitBox
kszucs commented on pull request #7519: URL: https://github.com/apache/arrow/pull/7519#issuecomment-650149745 > Could we add a `is_valid` attribute to the python scalar as well? Now the only way to check for a null value is to do `.as_py() is None` ? Added.

[GitHub] [arrow] kszucs commented on a change in pull request #7519: ARROW-9017: [C++][Python] Refactor scalar bindings

2020-06-26 Thread GitBox
kszucs commented on a change in pull request #7519: URL: https://github.com/apache/arrow/pull/7519#discussion_r446148546 ## File path: python/pyarrow/scalar.pxi ## @@ -16,1198 +16,704 @@ # under the License. -_NULL = NA = None - - cdef class Scalar: """ -The base

[GitHub] [arrow] mrkn edited a comment on pull request #7477: ARROW-4221: [C++][Python] Add canonical flag in COO sparse index

2020-06-26 Thread GitBox
mrkn edited a comment on pull request #7477: URL: https://github.com/apache/arrow/pull/7477#issuecomment-650148002 > > Without the canonical flag, we need to make a copy and sort the data of non-canonical sparse tensor when serializing it because the current SparseCOOIndex has the constrai

[GitHub] [arrow] kszucs commented on a change in pull request #7519: ARROW-9017: [C++][Python] Refactor scalar bindings

2020-06-26 Thread GitBox
kszucs commented on a change in pull request #7519: URL: https://github.com/apache/arrow/pull/7519#discussion_r446148105 ## File path: python/pyarrow/tests/test_parquet.py ## @@ -2028,7 +2028,7 @@ def test_filters_invalid_pred_op(tempdir, use_legacy_dataset):

[GitHub] [arrow] kszucs commented on a change in pull request #7519: ARROW-9017: [C++][Python] Refactor scalar bindings

2020-06-26 Thread GitBox
kszucs commented on a change in pull request #7519: URL: https://github.com/apache/arrow/pull/7519#discussion_r446147913 ## File path: python/pyarrow/_dataset.pyx ## @@ -216,22 +216,18 @@ cdef class Expression: @staticmethod def _scalar(value): cdef: -

[GitHub] [arrow] mrkn commented on pull request #7477: ARROW-4221: [C++][Python] Add canonical flag in COO sparse index

2020-06-26 Thread GitBox
mrkn commented on pull request #7477: URL: https://github.com/apache/arrow/pull/7477#issuecomment-650148002 > > Without the canonical flag, we need to make a copy and sort the data of non-canonical sparse tensor when serializing it because the current SparseCOOIndex has the constraint that

[GitHub] [arrow] romainfrancois commented on pull request #7514: ARROW-6235: [R] Implement conversion from arrow::BinaryArray to R character vector

2020-06-26 Thread GitBox
romainfrancois commented on pull request #7514: URL: https://github.com/apache/arrow/pull/7514#issuecomment-650122091 Also deals with LargeBinary now, e.g. https://issues.apache.org/jira/browse/ARROW-6543 ``` r library(arrow, warn.conflicts = FALSE) a <- Array$create(list(r

[GitHub] [arrow] rymurr commented on a change in pull request #7030: ARROW-7808: [Java][Dataset] Implement Datasets Java API by JNI to C++

2020-06-26 Thread GitBox
rymurr commented on a change in pull request #7030: URL: https://github.com/apache/arrow/pull/7030#discussion_r446103501 ## File path: java/dataset/src/test/resources/avroschema/user.avsc ## @@ -0,0 +1,26 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or

[GitHub] [arrow] pitrou commented on pull request #7477: ARROW-4221: [C++][Python] Add canonical flag in COO sparse index

2020-06-26 Thread GitBox
pitrou commented on pull request #7477: URL: https://github.com/apache/arrow/pull/7477#issuecomment-650091038 > Without the canonical flag, we need to make a copy and sort the data of non-canonical sparse tensor when serializing it because the current SparseCOOIndex has the constraint that

[GitHub] [arrow] mrkn commented on pull request #7477: ARROW-4221: [C++][Python] Add canonical flag in COO sparse index

2020-06-26 Thread GitBox
mrkn commented on pull request #7477: URL: https://github.com/apache/arrow/pull/7477#issuecomment-650089924 @pitrou Without the canonical flag, we need to make a copy and sort the data of non-canonical sparse tensor when serializing it because the current SparseCOOIndex has the constraint

[GitHub] [arrow] romainfrancois commented on a change in pull request #7514: ARROW-6235: [R] Implement conversion from arrow::BinaryArray to R character vector

2020-06-26 Thread GitBox
romainfrancois commented on a change in pull request #7514: URL: https://github.com/apache/arrow/pull/7514#discussion_r446041588 ## File path: r/src/array_from_vector.cpp ## @@ -1067,12 +1110,22 @@ std::shared_ptr InferArrowTypeFromVector(SEXP x) { if (Rf_inherits(x, "data.

[GitHub] [arrow] rdettai commented on pull request #7547: ARROW-8950: [C++] Avoid HEAD when possible in S3 filesystem

2020-06-26 Thread GitBox
rdettai commented on pull request #7547: URL: https://github.com/apache/arrow/pull/7547#issuecomment-650027175 Great! AWS is going to be surprised to see its worldwide S3 HEAD request rate drop by half overnight!

[GitHub] [arrow] rdettai commented on a change in pull request #7547: ARROW-8950: [C++] Avoid HEAD when possible in S3 filesystem

2020-06-26 Thread GitBox
rdettai commented on a change in pull request #7547: URL: https://github.com/apache/arrow/pull/7547#discussion_r446015106 ## File path: cpp/src/arrow/filesystem/s3fs.cc ## @@ -397,9 +399,17 @@ class ObjectInputFile : public io::RandomAccessFile { ObjectInputFile(Aws::S3::S3C
