[GitHub] [arrow] liyafan82 commented on a change in pull request #8605: ARROW-10508 [Java] Allow FixedSizeListVector to have empty children

2020-11-10 Thread GitBox
liyafan82 commented on a change in pull request #8605: URL: https://github.com/apache/arrow/pull/8605#discussion_r520405424 ## File path: java/vector/src/test/java/org/apache/arrow/vector/TestFixedSizeListVector.java ## @@ -402,6 +402,34 @@ public void testSplitAndTransfer()

[GitHub] [arrow] emkornfield commented on pull request #8589: ARROW-10493: [C++][Parquet] Fix offset lost in MaybeReplaceValidity

2020-11-10 Thread GitBox
emkornfield commented on pull request #8589: URL: https://github.com/apache/arrow/pull/8589#issuecomment-724611663 @chrisavl I'm done pushing. I expanded the test case to try to cover all branches in the new slice method and reverted my previous changes. Let me know if the new code

[GitHub] [arrow] romainfrancois commented on pull request #8256: ARROW-9001: [R] Box outputs as correct type in call_function

2020-11-10 Thread GitBox
romainfrancois commented on pull request #8256: URL: https://github.com/apache/arrow/pull/8256#issuecomment-724553352 Thanks @bkietz for the review, I'll work towards using a trait instead of a template function. As per yesterday's discussion with @nealrichardson, we also want to

[GitHub] [arrow] jhorstmann commented on pull request #8571: ARROW-10461: [Rust] Fix offset bug in remainder bits

2020-11-10 Thread GitBox
jhorstmann commented on pull request #8571: URL: https://github.com/apache/arrow/pull/8571#issuecomment-724609649 @nevi-me @alamb @jorgecarleitao I had a quick chat with @vertexclique and we think we can merge this PR with the bugfix first and then rebase and integrate his refactoring.

[GitHub] [arrow] alamb commented on pull request #8571: ARROW-10461: [Rust] Fix offset bug in remainder bits

2020-11-10 Thread GitBox
alamb commented on pull request #8571: URL: https://github.com/apache/arrow/pull/8571#issuecomment-724641767 I agree @jhorstmann -- I think we should merge this. I don't yet have the permissions set up to merge things, so I need to wait for one of the other committers such as @andygrove,

[GitHub] [arrow] romainfrancois commented on a change in pull request #8256: ARROW-9001: [R] Box outputs as correct type in call_function

2020-11-10 Thread GitBox
romainfrancois commented on a change in pull request #8256: URL: https://github.com/apache/arrow/pull/8256#discussion_r520380853 ## File path: r/src/arrow_cpp11.h ## @@ -300,22 +297,65 @@ bool GetBoolOption(const std::string& name, bool default_); namespace cpp11 {

[GitHub] [arrow] Jibbow commented on pull request #8611: ARROW-4804: [Rust] Parse Date32 and Date64 in CSV reader

2020-11-10 Thread GitBox
Jibbow commented on pull request #8611: URL: https://github.com/apache/arrow/pull/8611#issuecomment-724606596 Thanks for the feedback @vertexclique and @jorgecarleitao! I'll update the PR when #8609 is merged.

[GitHub] [arrow] github-actions[bot] commented on pull request #8622: ARROW-10543: [Developer] Add a note about being patient after gitbox enable

2020-11-10 Thread GitBox
github-actions[bot] commented on pull request #8622: URL: https://github.com/apache/arrow/pull/8622#issuecomment-724703442 https://issues.apache.org/jira/browse/ARROW-10543 This is an automated message from the Apache Git

[GitHub] [arrow] nevi-me commented on pull request #8611: ARROW-4804: [Rust] Parse Date32 and Date64 in CSV reader

2020-11-10 Thread GitBox
nevi-me commented on pull request #8611: URL: https://github.com/apache/arrow/pull/8611#issuecomment-724734449 @alamb I can't respond to your comment. I see you're using `T::Native::from_i32(since(days, from_ymd(1970, 1, 1)).num_days() as i32)`. Shouldn't there be a way for us to get

[GitHub] [arrow] pitrou commented on pull request #8524: ARROW-10345: [C++][Compute] Fix NaN error in sorting and topn kernels

2020-11-10 Thread GitBox
pitrou commented on pull request #8524: URL: https://github.com/apache/arrow/pull/8524#issuecomment-724738890 New PR is #8623.

[GitHub] [arrow] github-actions[bot] commented on pull request #8623: ARROW-10345: [C++][Compute] Fix NaN handling in sorting and topn kernels

2020-11-10 Thread GitBox
github-actions[bot] commented on pull request #8623: URL: https://github.com/apache/arrow/pull/8623#issuecomment-724738968 https://issues.apache.org/jira/browse/ARROW-10345

[GitHub] [arrow] pitrou commented on pull request #8623: ARROW-10345: [C++][Compute] Fix NaN handling in sorting and topn kernels

2020-11-10 Thread GitBox
pitrou commented on pull request #8623: URL: https://github.com/apache/arrow/pull/8623#issuecomment-724738741 Note: original PR is #8524, I had to close it and open a new one because GitHub wouldn't pick up changes.

[GitHub] [arrow] jorisvandenbossche opened a new pull request #8624: ARROW-10532: [Python] Fix metadata in Table.from_pandas conversion with specified schema with different column order

2020-11-10 Thread GitBox
jorisvandenbossche opened a new pull request #8624: URL: https://github.com/apache/arrow/pull/8624

[GitHub] [arrow] alamb commented on pull request #8611: ARROW-4804: [Rust] Parse Date32 and Date64 in CSV reader

2020-11-10 Thread GitBox
alamb commented on pull request #8611: URL: https://github.com/apache/arrow/pull/8611#issuecomment-724690117 > Date32 currently assumes to have DateUnit::Day and Date64 assumes DateUnit::Millisecond respectively. I think this is fine as long as reasonable errors (unsupported XXX)
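
The unit convention quoted above (Date32 counted in days, Date64 in milliseconds, both relative to the UNIX epoch) can be illustrated with a minimal sketch. This is not the Rust code from the pull request: the function names `parse_date32` and `parse_date64` are hypothetical, and the ISO `YYYY-MM-DD` input format is an assumption made for the example.

```python
from datetime import date

# UNIX epoch, the reference point for both Date32 and Date64 values.
EPOCH = date(1970, 1, 1)
MILLIS_PER_DAY = 86_400_000


def parse_date32(text: str) -> int:
    """Hypothetical helper: days since the epoch for an ISO date string."""
    return (date.fromisoformat(text) - EPOCH).days


def parse_date64(text: str) -> int:
    """Hypothetical helper: milliseconds since the epoch (midnight of that day)."""
    return parse_date32(text) * MILLIS_PER_DAY
```

Because both values derive from the same day count, a Date64 produced this way is always a whole multiple of one day in milliseconds.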

[GitHub] [arrow] maartenbreddels opened a new pull request #8621: ARROW-9128: [C++] Implement string space trimming kernels: trim, ltrim, and rtrim

2020-11-10 Thread GitBox
maartenbreddels opened a new pull request #8621: URL: https://github.com/apache/arrow/pull/8621 There is one obvious loose end in this PR, which is where to generate the `std::set` based on the `TrimOptions` (now in the ctor of UTF8TrimBase). I'm not sure what the lifetime guarantees are

[GitHub] [arrow] pitrou commented on pull request #8516: PARQUET-1935: [C++] Fix bug in WriteBatchSpaced

2020-11-10 Thread GitBox
pitrou commented on pull request #8516: URL: https://github.com/apache/arrow/pull/8516#issuecomment-724741673 Rebased.

[GitHub] [arrow] alamb commented on pull request #8611: ARROW-4804: [Rust] Parse Date32 and Date64 in CSV reader

2020-11-10 Thread GitBox
alamb commented on pull request #8611: URL: https://github.com/apache/arrow/pull/8611#issuecomment-724756612 @nevi-me -- good call -- I was simply copy/pasting what was in this PR in terms of `as i32` without looking carefully enough. I updated

[GitHub] [arrow] kszucs closed pull request #8620: ARROW-10539: [Packaging][Python] Use GitHub Actions to build wheels for Windows

2020-11-10 Thread GitBox
kszucs closed pull request #8620: URL: https://github.com/apache/arrow/pull/8620

[GitHub] [arrow] alamb closed pull request #8571: ARROW-10461: [Rust] Fix offset bug in remainder bits

2020-11-10 Thread GitBox
alamb closed pull request #8571: URL: https://github.com/apache/arrow/pull/8571

[GitHub] [arrow] pitrou closed pull request #8524: ARROW-10345: [C++][Compute] Fix NaN error in sorting and topn kernels

2020-11-10 Thread GitBox
pitrou closed pull request #8524: URL: https://github.com/apache/arrow/pull/8524

[GitHub] [arrow] pitrou commented on pull request #8524: ARROW-10345: [C++][Compute] Fix NaN error in sorting and topn kernels

2020-11-10 Thread GitBox
pitrou commented on pull request #8524: URL: https://github.com/apache/arrow/pull/8524#issuecomment-724735164 I closed because GitHub didn't pick up the changes I pushed. I'll open a new PR (sorry).

[GitHub] [arrow] pitrou commented on a change in pull request #8623: ARROW-10345: [C++][Compute] Fix NaN handling in sorting and topn kernels

2020-11-10 Thread GitBox
pitrou commented on a change in pull request #8623: URL: https://github.com/apache/arrow/pull/8623#discussion_r520605189 ## File path: cpp/src/arrow/compute/kernels/vector_sort.cc ## @@ -30,6 +32,58 @@ namespace internal { namespace { +// NOTE: std::partition is usually
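
The review snippet above mentions partitioning in the context of NaN handling during sorting. As a rough sketch of that general idea only, and not of the C++ kernel in the pull request, the following separates NaN entries from the rest before sorting. Placing NaNs after all other values is an assumption made here, and `sort_indices_nan_last` is a hypothetical name.

```python
import math


def sort_indices_nan_last(values):
    """Return indices that sort `values` ascending, with NaNs at the end.

    NaN compares false against everything, so it is split off first
    (the "partition" step) and only the remaining values are sorted.
    """
    non_nan = [i for i, v in enumerate(values) if not math.isnan(v)]
    nan = [i for i, v in enumerate(values) if math.isnan(v)]
    non_nan.sort(key=values.__getitem__)  # stable sort on the NaN-free part
    return non_nan + nan
```

Sorting only the NaN-free part keeps the comparison well defined, which is the reason for separating the two groups up front.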

[GitHub] [arrow] github-actions[bot] commented on pull request #8624: ARROW-10532: [Python] Fix metadata in Table.from_pandas conversion with specified schema with different column order

2020-11-10 Thread GitBox
github-actions[bot] commented on pull request #8624: URL: https://github.com/apache/arrow/pull/8624#issuecomment-724748936 https://issues.apache.org/jira/browse/ARROW-10532

[GitHub] [arrow] pitrou commented on pull request #8524: ARROW-10345: [C++][Compute] Fix NaN error in sorting and topn kernels

2020-11-10 Thread GitBox
pitrou commented on pull request #8524: URL: https://github.com/apache/arrow/pull/8524#issuecomment-724773476 Ah, weird. That may be the explanation...

[GitHub] [arrow] pitrou commented on pull request #8472: ARROW-8113: [C++][WIP] Lighter weight variant<>

2020-11-10 Thread GitBox
pitrou commented on pull request #8472: URL: https://github.com/apache/arrow/pull/8472#issuecomment-724772910 @bkietz Do you plan to update this?

[GitHub] [arrow] praveenbingo closed pull request #8614: ARROW-10518: [C++][Gandiva] Adding NativeFunction::kCanReturnErrors to cast function in gandiva

2020-11-10 Thread GitBox
praveenbingo closed pull request #8614: URL: https://github.com/apache/arrow/pull/8614

[GitHub] [arrow] alamb edited a comment on pull request #8571: ARROW-10461: [Rust] Fix offset bug in remainder bits

2020-11-10 Thread GitBox
alamb edited a comment on pull request #8571: URL: https://github.com/apache/arrow/pull/8571#issuecomment-724641767 I agree @jhorstmann -- I think we should merge this. ~~I don't yet have the permissions set up to merge things, so I need to wait for one of the other committers such as

[GitHub] [arrow] nevi-me commented on a change in pull request #8611: ARROW-4804: [Rust] Parse Date32 and Date64 in CSV reader

2020-11-10 Thread GitBox
nevi-me commented on a change in pull request #8611: URL: https://github.com/apache/arrow/pull/8611#discussion_r520597966 ## File path: rust/arrow/src/csv/reader.rs ## @@ -219,6 +226,35 @@ pub fn infer_schema_from_files( Schema::try_merge() } +/// Parses a string into

[GitHub] [arrow] cyb70289 commented on pull request #8524: ARROW-10345: [C++][Compute] Fix NaN error in sorting and topn kernels

2020-11-10 Thread GitBox
cyb70289 commented on pull request #8524: URL: https://github.com/apache/arrow/pull/8524#issuecomment-724745103 Looks like GitHub does something wrong with "Allow edits and access to secrets by maintainers". It was always checked by default. But now it's unchecked. Checking it, refreshing the

[GitHub] [arrow] praveenbingo commented on a change in pull request #8614: ARROW-10518: [C++][Gandiva] Adding NativeFunction::kCanReturnErrors to cast function in gandiva

2020-11-10 Thread GitBox
praveenbingo commented on a change in pull request #8614: URL: https://github.com/apache/arrow/pull/8614#discussion_r520519310 ## File path: cpp/src/gandiva/tests/projector_test.cc ## @@ -15,13 +15,13 @@ // specific language governing permissions and limitations // under the

[GitHub] [arrow] naman1996 commented on a change in pull request #8614: ARROW-10518: [C++][Gandiva] Adding NativeFunction::kCanReturnErrors to cast function in gandiva

2020-11-10 Thread GitBox
naman1996 commented on a change in pull request #8614: URL: https://github.com/apache/arrow/pull/8614#discussion_r520524040 ## File path: cpp/src/gandiva/tests/projector_test.cc ## @@ -15,13 +15,13 @@ // specific language governing permissions and limitations // under the

[GitHub] [arrow] alamb commented on a change in pull request #8611: ARROW-4804: [Rust] Parse Date32 and Date64 in CSV reader

2020-11-10 Thread GitBox
alamb commented on a change in pull request #8611: URL: https://github.com/apache/arrow/pull/8611#discussion_r520530077 ## File path: rust/arrow/src/csv/reader.rs ## @@ -219,6 +226,35 @@ pub fn infer_schema_from_files( Schema::try_merge() } +/// Parses a string into

[GitHub] [arrow] github-actions[bot] commented on pull request #8621: ARROW-9128: [C++] Implement string space trimming kernels: trim, ltrim, and rtrim

2020-11-10 Thread GitBox
github-actions[bot] commented on pull request #8621: URL: https://github.com/apache/arrow/pull/8621#issuecomment-724695171 https://issues.apache.org/jira/browse/ARROW-9128

[GitHub] [arrow] pitrou commented on pull request #8623: ARROW-10345: [C++][Compute] Fix NaN handling in sorting and topn kernels

2020-11-10 Thread GitBox
pitrou commented on pull request #8623: URL: https://github.com/apache/arrow/pull/8623#issuecomment-724742093 +1, will merge if green

[GitHub] [arrow] alamb commented on a change in pull request #8611: ARROW-4804: [Rust] Parse Date32 and Date64 in CSV reader

2020-11-10 Thread GitBox
alamb commented on a change in pull request #8611: URL: https://github.com/apache/arrow/pull/8611#discussion_r520546828 ## File path: rust/arrow/src/csv/reader.rs ## @@ -219,6 +226,35 @@ pub fn infer_schema_from_files( Schema::try_merge() } +/// Parses a string into

[GitHub] [arrow] alamb opened a new pull request #8622: ARROW-10543: [Developer] Add a note about being patient after gitbox enable

2020-11-10 Thread GitBox
alamb opened a new pull request #8622: URL: https://github.com/apache/arrow/pull/8622 When following the developer instructions, I was hung up for a while on the fact that there seemed to be a time lag between when I completed the gitbox setup and when permissions were actually granted to

[GitHub] [arrow] pitrou opened a new pull request #8623: ARROW-10345: [C++][Compute] Fix NaN handling in sorting and topn kernels

2020-11-10 Thread GitBox
pitrou opened a new pull request #8623: URL: https://github.com/apache/arrow/pull/8623

[GitHub] [arrow] jonkeane commented on a change in pull request #8579: ARROW-10481: [R] Bindings to add, remove, replace Table columns

2020-11-10 Thread GitBox
jonkeane commented on a change in pull request #8579: URL: https://github.com/apache/arrow/pull/8579#discussion_r520602805 ## File path: r/R/table.R ## @@ -254,6 +257,68 @@ names.Table <- function(x) x$ColumnNames() #' @export `[[.Table` <- `[[.RecordBatch` +#' @export

[GitHub] [arrow] pitrou closed pull request #8623: ARROW-10345: [C++][Compute] Fix NaN handling in sorting and topn kernels

2020-11-10 Thread GitBox
pitrou closed pull request #8623: URL: https://github.com/apache/arrow/pull/8623

[GitHub] [arrow] jorgecarleitao commented on pull request #8571: ARROW-10461: [Rust] Fix offset bug in remainder bits

2020-11-10 Thread GitBox
jorgecarleitao commented on pull request #8571: URL: https://github.com/apache/arrow/pull/8571#issuecomment-724694901 @jhorstmann thanks a lot for this fix! @alamb, congrats on your push!

[GitHub] [arrow] romainfrancois commented on pull request #8256: ARROW-9001: [R] Box outputs as correct type in call_function

2020-11-10 Thread GitBox
romainfrancois commented on pull request #8256: URL: https://github.com/apache/arrow/pull/8256#issuecomment-724712441 I've added back an `as_sexp()` for `std::shared_ptr<...>` so that we can return the actual type instead of the catch-all R6, so for example: ```cpp //

[GitHub] [arrow] pitrou closed pull request #8629: ARROW-10353: [C++] Fix handling of compression in Parquet data pages v2

2020-11-10 Thread GitBox
pitrou closed pull request #8629: URL: https://github.com/apache/arrow/pull/8629

[GitHub] [arrow] pitrou commented on pull request #8629: ARROW-10353: [C++] Fix handling of compression in Parquet data pages v2

2020-11-10 Thread GitBox
pitrou commented on pull request #8629: URL: https://github.com/apache/arrow/pull/8629#issuecomment-724899187 Merging now. CI failure looks unrelated.

[GitHub] [arrow] Bei-z commented on a change in pull request #8542: ARROW-10407: [C++] Add BasicDecimal256 Division Support

2020-11-10 Thread GitBox
Bei-z commented on a change in pull request #8542: URL: https://github.com/apache/arrow/pull/8542#discussion_r520841038 ## File path: cpp/src/arrow/util/basic_decimal.cc ## @@ -490,49 +527,60 @@ static void FixDivisionSigns(BasicDecimal128* result, BasicDecimal128* remainder

[GitHub] [arrow] kou commented on pull request #8386: ARROW-10224: [Python] Build, test, and support Python 3.9

2020-11-10 Thread GitBox
kou commented on pull request #8386: URL: https://github.com/apache/arrow/pull/8386#issuecomment-724957206 @terencehonles Could you rebase on master to use #8620?

[GitHub] [arrow] github-actions[bot] commented on pull request #8386: ARROW-10224: [Python] Build, test, and support Python 3.9

2020-11-10 Thread GitBox
github-actions[bot] commented on pull request #8386: URL: https://github.com/apache/arrow/pull/8386#issuecomment-724968054 Revision: 3488e1c6fc6b6d5393daa26549a3ea023a627512 Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] pitrou opened a new pull request #8629: ARROW-10353: [C++] Fix handling of compression in Parquet data pages v2

2020-11-10 Thread GitBox
pitrou opened a new pull request #8629: URL: https://github.com/apache/arrow/pull/8629

[GitHub] [arrow] github-actions[bot] commented on pull request #8632: ARROW-10426: [C++] Allow writing large strings to Parquet

2020-11-10 Thread GitBox
github-actions[bot] commented on pull request #8632: URL: https://github.com/apache/arrow/pull/8632#issuecomment-724902099 https://issues.apache.org/jira/browse/ARROW-10426

[GitHub] [arrow] alamb commented on pull request #8553: ARROW-10366: [Rust][DataFusion] Do not buffer intermediate results in merge or HashAggregate

2020-11-10 Thread GitBox
alamb commented on pull request #8553: URL: https://github.com/apache/arrow/pull/8553#issuecomment-724935395 @jorgecarleitao -- when I ran the TPCH benchmark Q1 locally on my machine, I found it kept all my cores busy and the memory profile was low. Thus the improvements offered by this

[GitHub] [arrow] kou commented on pull request #8386: ARROW-10224: [Python] Build, test, and support Python 3.9

2020-11-10 Thread GitBox
kou commented on pull request #8386: URL: https://github.com/apache/arrow/pull/8386#issuecomment-724963376 @github-actions crossbow submit wheel-win-* Thanks!

[GitHub] [arrow] pitrou opened a new pull request #8632: ARROW-10426: [C++] Allow writing large strings to Parquet

2020-11-10 Thread GitBox
pitrou opened a new pull request #8632: URL: https://github.com/apache/arrow/pull/8632 Large strings are still read back as regular strings.

[GitHub] [arrow] pitrou commented on pull request #8632: ARROW-10426: [C++] Allow writing large strings to Parquet

2020-11-10 Thread GitBox
pitrou commented on pull request #8632: URL: https://github.com/apache/arrow/pull/8632#issuecomment-724897273 Perhaps I should add tests on the C++ side.

[GitHub] [arrow] pitrou commented on a change in pull request #8621: ARROW-9128: [C++] Implement string space trimming kernels: trim, ltrim, and rtrim

2020-11-10 Thread GitBox
pitrou commented on a change in pull request #8621: URL: https://github.com/apache/arrow/pull/8621#discussion_r520803595 ## File path: cpp/src/arrow/compute/kernels/scalar_string.cc ## @@ -1231,6 +1252,302 @@ Result StrptimeResolve(KernelContext* ctx, const std::vector

[GitHub] [arrow] kou commented on pull request #8386: ARROW-10224: [Python] Build, test, and support Python 3.9

2020-11-10 Thread GitBox
kou commented on pull request #8386: URL: https://github.com/apache/arrow/pull/8386#issuecomment-724967022 @github-actions crossbow submit wheel-win-*

[GitHub] [arrow] maartenbreddels commented on a change in pull request #8621: ARROW-9128: [C++] Implement string space trimming kernels: trim, ltrim, and rtrim

2020-11-10 Thread GitBox
maartenbreddels commented on a change in pull request #8621: URL: https://github.com/apache/arrow/pull/8621#discussion_r520802117 ## File path: cpp/src/arrow/compute/kernels/scalar_string.cc ## @@ -1231,6 +1252,302 @@ Result StrptimeResolve(KernelContext* ctx, const

[GitHub] [arrow] nevi-me commented on pull request #8200: ARROW-8883: [Rust] [Integration] Enable more tests

2020-11-10 Thread GitBox
nevi-me commented on pull request #8200: URL: https://github.com/apache/arrow/pull/8200#issuecomment-724906422 @jorgecarleitao finally passed! There are still some tests that don't pass, but I've updated the status documentation with them (cc @andygrove as you had opened a JIRA for this)

[GitHub] [arrow] terencehonles commented on pull request #8386: ARROW-10224: [Python] Build, test, and support Python 3.9

2020-11-10 Thread GitBox
terencehonles commented on pull request #8386: URL: https://github.com/apache/arrow/pull/8386#issuecomment-724962295 done

[GitHub] [arrow] kou commented on a change in pull request #8622: ARROW-10543: [Developer] Add a note about being patient after gitbox enable

2020-11-10 Thread GitBox
kou commented on a change in pull request #8622: URL: https://github.com/apache/arrow/pull/8622#discussion_r520872414 ## File path: dev/README.md ## @@ -27,6 +27,10 @@ you need to have linked your GitHub and ASF accounts on https://gitbox.apache.org/setup/ to be able to push

[GitHub] [arrow] kou commented on pull request #8386: ARROW-10224: [Python] Build, test, and support Python 3.9

2020-11-10 Thread GitBox
kou commented on pull request #8386: URL: https://github.com/apache/arrow/pull/8386#issuecomment-724964363 @github-actions crossbow submit wheel-win-*

[GitHub] [arrow] maartenbreddels commented on a change in pull request #8628: ARROW-9489: [C++] Add fill_null kernel implementation for (array[string], scalar[string])

2020-11-10 Thread GitBox
maartenbreddels commented on a change in pull request #8628: URL: https://github.com/apache/arrow/pull/8628#discussion_r520769300 ## File path: cpp/src/arrow/compute/kernels/scalar_fill_null.cc ## @@ -84,6 +84,52 @@ struct FillNullFunctor::value>> { } }; +template

[GitHub] [arrow] jorisvandenbossche opened a new pull request #8625: ARROW-10511: [Python] Fix to_pandas() conversion in case of metadata mismatch about timezone

2020-11-10 Thread GitBox
jorisvandenbossche opened a new pull request #8625: URL: https://github.com/apache/arrow/pull/8625

[GitHub] [arrow] pitrou closed pull request #8516: PARQUET-1935: [C++] Fix bug in WriteBatchSpaced

2020-11-10 Thread GitBox
pitrou closed pull request #8516: URL: https://github.com/apache/arrow/pull/8516

[GitHub] [arrow] github-actions[bot] commented on pull request #8625: ARROW-10511: [Python] Fix to_pandas() conversion in case of metadata mismatch about timezone

2020-11-10 Thread GitBox
github-actions[bot] commented on pull request #8625: URL: https://github.com/apache/arrow/pull/8625#issuecomment-724788448 https://issues.apache.org/jira/browse/ARROW-10511

[GitHub] [arrow] pitrou commented on pull request #8621: ARROW-9128: [C++] Implement string space trimming kernels: trim, ltrim, and rtrim

2020-11-10 Thread GitBox
pitrou commented on pull request #8621: URL: https://github.com/apache/arrow/pull/8621#issuecomment-724796243 > Maybe a good place to put per-kernel pre-compute results are the *Options objects, but I'm not sure if that makes sense in the current architecture. I don't think the

[GitHub] [arrow] pitrou commented on pull request #8626: ARROW-10545: [C++] Fix crash on invalid Parquet file (OSS-Fuzz)

2020-11-10 Thread GitBox
pitrou commented on pull request #8626: URL: https://github.com/apache/arrow/pull/8626#issuecomment-724804154 Note the ASAN CI tests will probably fail until #8617 is merged.

[GitHub] [arrow] jorisvandenbossche opened a new pull request #8627: ARROW-10546: [Python] Deprecate DaskFileSystem/S3FSWrapper + stop using it internally

2020-11-10 Thread GitBox
jorisvandenbossche opened a new pull request #8627: URL: https://github.com/apache/arrow/pull/8627

[GitHub] [arrow] pitrou commented on a change in pull request #8626: ARROW-10545: [C++] Fix crash on invalid Parquet file (OSS-Fuzz)

2020-11-10 Thread GitBox
pitrou commented on a change in pull request #8626: URL: https://github.com/apache/arrow/pull/8626#discussion_r520685686 ## File path: cpp/src/parquet/level_conversion.cc ## @@ -35,16 +36,17 @@ namespace internal { namespace { using ::arrow::internal::CpuInfo; +using

[GitHub] [arrow] maartenbreddels opened a new pull request #8628: ARROW-9489: [C++] Add fill_null kernel implementation for (array[string], scalar[string])

2020-11-10 Thread GitBox
maartenbreddels opened a new pull request #8628: URL: https://github.com/apache/arrow/pull/8628 The Python test fails for the `large_binary` test, with output: ``` arr = pa.array([b'a', b'bb', None], type=pa.large_binary()) result = arr.fill_null('ccc')

[GitHub] [arrow] nealrichardson commented on a change in pull request #8256: ARROW-9001: [R] Box outputs as correct type in call_function

2020-11-10 Thread GitBox
nealrichardson commented on a change in pull request #8256: URL: https://github.com/apache/arrow/pull/8256#discussion_r520708027 ## File path: r/src/recordbatch.cpp ## @@ -285,7 +284,8 @@ std::shared_ptr RecordBatch__from_arrays(SEXP schema_sxp, SE int64_t num_rows = 0;

[GitHub] [arrow] jorgecarleitao opened a new pull request #8630: Mutable filter

2020-11-10 Thread GitBox
jorgecarleitao opened a new pull request #8630: URL: https://github.com/apache/arrow/pull/8630 ## Motivation The current code-base to `filter` an array is typed, which means that we can't easily generalize it for arbitrary depths. That code is similar to the code in `take`:

[GitHub] [arrow] github-actions[bot] commented on pull request #8630: ARROW-10540 [Rust] Improve filtering

2020-11-10 Thread GitBox
github-actions[bot] commented on pull request #8630: URL: https://github.com/apache/arrow/pull/8630#issuecomment-724862063 https://issues.apache.org/jira/browse/ARROW-10540

[GitHub] [arrow] maartenbreddels commented on a change in pull request #8628: ARROW-9489: [C++] Add fill_null kernel implementation for (array[string], scalar[string])

2020-11-10 Thread GitBox
maartenbreddels commented on a change in pull request #8628: URL: https://github.com/apache/arrow/pull/8628#discussion_r520776198 ## File path: python/pyarrow/tests/test_compute.py ## @@ -860,6 +860,17 @@ def test_fill_null(): expected = pa.array([None, None, None, None])

[GitHub] [arrow] maartenbreddels commented on pull request #8628: ARROW-9489: [C++] Add fill_null kernel implementation for (array[string], scalar[string])

2020-11-10 Thread GitBox
maartenbreddels commented on pull request #8628: URL: https://github.com/apache/arrow/pull/8628#issuecomment-724881905 Looking at the implementation of `VisitArrayDataInline`, which is quite similar to what is being done in `FillNullFunctor::value>>`, I don't see much room for

[GitHub] [arrow] andygrove commented on a change in pull request #8619: ARROW-10531: [Rust][DataFusion]: Add schema and graphviz formatting for LogicalPlans and a PlanVisitor

2020-11-10 Thread GitBox
andygrove commented on a change in pull request #8619: URL: https://github.com/apache/arrow/pull/8619#discussion_r520655856 ## File path: rust/datafusion/src/logical_plan/mod.rs ## @@ -956,117 +956,567 @@ impl LogicalPlan { } } +/// Trait that implements the [Visitor

[GitHub] [arrow] jorisvandenbossche closed pull request #8557: ARROW-10433 [Python] Swopped the conditions for checking for fsspec filesystems

2020-11-10 Thread GitBox
jorisvandenbossche closed pull request #8557: URL: https://github.com/apache/arrow/pull/8557

[GitHub] [arrow] github-actions[bot] commented on pull request #8627: ARROW-10546: [Python] Deprecate DaskFileSystem/S3FSWrapper + stop using it internally

2020-11-10 Thread GitBox
github-actions[bot] commented on pull request #8627: URL: https://github.com/apache/arrow/pull/8627#issuecomment-724813054 https://issues.apache.org/jira/browse/ARROW-10546

[GitHub] [arrow] github-actions[bot] commented on pull request #8626: ARROW-10545: [C++] Fix crash on invalid Parquet file (OSS-Fuzz)

2020-11-10 Thread GitBox
github-actions[bot] commented on pull request #8626: URL: https://github.com/apache/arrow/pull/8626#issuecomment-724813055 https://issues.apache.org/jira/browse/ARROW-10545

[GitHub] [arrow] maartenbreddels commented on a change in pull request #8628: ARROW-9489: [C++] Add fill_null kernel implementation for (array[string], scalar[string])

2020-11-10 Thread GitBox
maartenbreddels commented on a change in pull request #8628: URL: https://github.com/apache/arrow/pull/8628#discussion_r520701716 ## File path: cpp/src/arrow/compute/kernels/scalar_fill_null.cc ## @@ -84,6 +84,52 @@ struct FillNullFunctor::value>> { } }; +template

[GitHub] [arrow] pitrou commented on a change in pull request #8628: ARROW-9489: [C++] Add fill_null kernel implementation for (array[string], scalar[string])

2020-11-10 Thread GitBox
pitrou commented on a change in pull request #8628: URL: https://github.com/apache/arrow/pull/8628#discussion_r520724980 ## File path: cpp/src/arrow/compute/kernels/scalar_fill_null.cc ## @@ -84,6 +84,52 @@ struct FillNullFunctor::value>> { } }; +template +struct

[GitHub] [arrow] emkornfield commented on pull request #8626: ARROW-10545: [C++] Fix crash on invalid Parquet file (OSS-Fuzz)

2020-11-10 Thread GitBox
emkornfield commented on pull request #8626: URL: https://github.com/apache/arrow/pull/8626#issuecomment-724844425 LGTM, feel free to merge, or I'll do it when I'm on a computer that I can do it from.

[GitHub] [arrow] emkornfield commented on a change in pull request #8626: ARROW-10545: [C++] Fix crash on invalid Parquet file (OSS-Fuzz)

2020-11-10 Thread GitBox
emkornfield commented on a change in pull request #8626: URL: https://github.com/apache/arrow/pull/8626#discussion_r520732790 ## File path: cpp/src/parquet/level_conversion.cc ## @@ -35,16 +36,17 @@ namespace internal { namespace { using ::arrow::internal::CpuInfo; +using

[GitHub] [arrow] andygrove commented on pull request #8619: ARROW-10531: [Rust][DataFusion]: Add schema and graphviz formatting for LogicalPlans and a PlanVisitor

2020-11-10 Thread GitBox
andygrove commented on pull request #8619: URL: https://github.com/apache/arrow/pull/8619#issuecomment-724781932 Thanks @alamb. I am also a fan of using GraphViz to render query plans so this gets a :+1: from me. I will try and find time to review fully later today.

[GitHub] [arrow] pitrou commented on pull request #8621: ARROW-9128: [C++] Implement string space trimming kernels: trim, ltrim, and rtrim

2020-11-10 Thread GitBox
pitrou commented on pull request #8621: URL: https://github.com/apache/arrow/pull/8621#issuecomment-724797289 Feel free to open a JIRA about that, by the way :-)

[GitHub] [arrow] pitrou opened a new pull request #8626: ARROW-10545: [C++] Fix crash on invalid Parquet file (OSS-Fuzz)

2020-11-10 Thread GitBox
pitrou opened a new pull request #8626: URL: https://github.com/apache/arrow/pull/8626 Also removed a memory allocation (probably not performance-critical).

[GitHub] [arrow] pitrou commented on a change in pull request #8628: ARROW-9489: [C++] Add fill_null kernel implementation for (array[string], scalar[string])

2020-11-10 Thread GitBox
pitrou commented on a change in pull request #8628: URL: https://github.com/apache/arrow/pull/8628#discussion_r520727173 ## File path: python/pyarrow/tests/test_compute.py ## @@ -860,6 +860,17 @@ def test_fill_null(): expected = pa.array([None, None, None, None])

[GitHub] [arrow] emkornfield commented on a change in pull request #8629: ARROW-10353: [C++] Fix handling of compression in Parquet data pages v2

2020-11-10 Thread GitBox
emkornfield commented on a change in pull request #8629: URL: https://github.com/apache/arrow/pull/8629#discussion_r520734627 ## File path: cpp/src/parquet/column_reader.cc ## @@ -442,8 +442,10 @@ std::shared_ptr SerializedPageReader::NextPage() {

[GitHub] [arrow] pitrou commented on a change in pull request #8621: ARROW-9128: [C++] Implement string space trimming kernels: trim, ltrim, and rtrim

2020-11-10 Thread GitBox
pitrou commented on a change in pull request #8621: URL: https://github.com/apache/arrow/pull/8621#discussion_r520675172 ## File path: cpp/src/arrow/compute/kernels/scalar_string.cc ## @@ -1231,6 +1252,302 @@ Result StrptimeResolve(KernelContext* ctx, const std::vector

[GitHub] [arrow] github-actions[bot] commented on pull request #8628: ARROW-9489: [C++] Add fill_null kernel implementation for (array[string], scalar[string])

2020-11-10 Thread GitBox
github-actions[bot] commented on pull request #8628: URL: https://github.com/apache/arrow/pull/8628#issuecomment-724819642 https://issues.apache.org/jira/browse/ARROW-9489

[GitHub] [arrow] github-actions[bot] commented on pull request #8629: ARROW-10353: [C++] Fix handling of compression in Parquet data pages v2

2020-11-10 Thread GitBox
github-actions[bot] commented on pull request #8629: URL: https://github.com/apache/arrow/pull/8629#issuecomment-724819641 https://issues.apache.org/jira/browse/ARROW-10353

[GitHub] [arrow] pitrou commented on a change in pull request #8628: ARROW-9489: [C++] Add fill_null kernel implementation for (array[string], scalar[string])

2020-11-10 Thread GitBox
pitrou commented on a change in pull request #8628: URL: https://github.com/apache/arrow/pull/8628#discussion_r520725613 ## File path: cpp/src/arrow/compute/kernels/scalar_fill_null.cc ## @@ -84,6 +84,52 @@ struct FillNullFunctor::value>> { } }; +template +struct

[GitHub] [arrow] JayjeetAtGithub closed pull request #8631: Misc fixes 01

2020-11-10 Thread GitBox
JayjeetAtGithub closed pull request #8631: URL: https://github.com/apache/arrow/pull/8631

[GitHub] [arrow] JayjeetAtGithub opened a new pull request #8631: Misc fixes 01

2020-11-10 Thread GitBox
JayjeetAtGithub opened a new pull request #8631: URL: https://github.com/apache/arrow/pull/8631

[GitHub] [arrow] github-actions[bot] commented on pull request #8630: Mutable filter

2020-11-10 Thread GitBox
github-actions[bot] commented on pull request #8630: URL: https://github.com/apache/arrow/pull/8630#issuecomment-724853096 Thanks for opening a pull request! Could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Then

[GitHub] [arrow] nevi-me commented on pull request #8590: ARROW-10042: [Rust] BufferData's equality should depend on its capacity; ArrayData's equality should not depend on its BufferDatas' capacities

2020-11-10 Thread GitBox
nevi-me commented on pull request #8590: URL: https://github.com/apache/arrow/pull/8590#issuecomment-724977488 Hi @carols10cents, with the IPC integration and logical equality PRs merged, this should be ready (the Parquet changes). We can merge it in after you rebase :)

[GitHub] [arrow] yordan-pavlov commented on a change in pull request #8630: ARROW-10540 [Rust] Improve filtering

2020-11-10 Thread GitBox
yordan-pavlov commented on a change in pull request #8630: URL: https://github.com/apache/arrow/pull/8630#discussion_r520890613 ## File path: rust/arrow/benches/filter_kernels.rs ## @@ -14,137 +14,136 @@ // KIND, either express or implied. See the License for the //

[GitHub] [arrow] github-actions[bot] commented on pull request #8386: ARROW-10224: [Python] Build, test, and support Python 3.9

2020-11-10 Thread GitBox
github-actions[bot] commented on pull request #8386: URL: https://github.com/apache/arrow/pull/8386#issuecomment-724990265 Revision: 3488e1c6fc6b6d5393daa26549a3ea023a627512 Submitted crossbow builds: [ursa-labs/crossbow @

[GitHub] [arrow] yordan-pavlov commented on a change in pull request #8630: ARROW-10540 [Rust] Improve filtering

2020-11-10 Thread GitBox
yordan-pavlov commented on a change in pull request #8630: URL: https://github.com/apache/arrow/pull/8630#discussion_r520898403 ## File path: rust/arrow/benches/filter_kernels.rs ## @@ -14,137 +14,136 @@ // KIND, either express or implied. See the License for the //

[GitHub] [arrow] yordan-pavlov commented on a change in pull request #8630: ARROW-10540 [Rust] Improve filtering

2020-11-10 Thread GitBox
yordan-pavlov commented on a change in pull request #8630: URL: https://github.com/apache/arrow/pull/8630#discussion_r520902160 ## File path: rust/arrow/src/array/transform/mod.rs ## @@ -0,0 +1,496 @@ +// Licensed to the Apache Software Foundation (ASF) under one +// or more

[GitHub] [arrow] carols10cents commented on pull request #8590: ARROW-10042: [Rust] Fix tests involving ArrayData/Buffer equality

2020-11-10 Thread GitBox
carols10cents commented on pull request #8590: URL: https://github.com/apache/arrow/pull/8590#issuecomment-725014079 (updated)
