mrkn commented on a change in pull request #7477:
URL: https://github.com/apache/arrow/pull/7477#discussion_r443990095
##
File path: cpp/src/arrow/sparse_tensor_test.cc
##
@@ -49,7 +49,10 @@ static inline void AssertCOOIndex(const
std::shared_ptr& sidx, const int
}
}
cyb70289 edited a comment on pull request #7521:
URL: https://github.com/apache/arrow/pull/7521#issuecomment-647914768
> I'm refactoring to nix util::optional. I'm too tired to finish it tonight
so I'll work on it tomorrow morning. If the perf regression isn't gone I'll
rewrite the sort
praveenbingo closed pull request #7495:
URL: https://github.com/apache/arrow/pull/7495
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go
mrkn commented on a change in pull request #7477:
URL: https://github.com/apache/arrow/pull/7477#discussion_r443982716
##
File path: python/pyarrow/tensor.pxi
##
@@ -339,6 +350,15 @@ shape: {0.shape}""".format(self)
def non_zero_length(self):
return
pitrou commented on pull request #7477:
URL: https://github.com/apache/arrow/pull/7477#issuecomment-648106332
> Can these comments give you an understanding?
No, they don't. They don't explain _why_ the flag is useful. What does it
bring to know that the indices are canonical? The
pitrou commented on pull request #7522:
URL: https://github.com/apache/arrow/pull/7522#issuecomment-648113053
Perhaps @jorisvandenbossche can review this, because I don't know much about
Pandas conversions and internals.
wesm commented on pull request #7521:
URL: https://github.com/apache/arrow/pull/7521#issuecomment-648147535
thanks @pitrou and @cyb70289 -- I will spend a little time on the count-sort
implementation and post a new patch
wesm closed pull request #7522:
URL: https://github.com/apache/arrow/pull/7522
wesm commented on pull request #7522:
URL: https://github.com/apache/arrow/pull/7522#issuecomment-648145200
+1, I'll go ahead and merge this since I confirmed the memory leak is fixed
jorisvandenbossche commented on pull request #7522:
URL: https://github.com/apache/arrow/pull/7522#issuecomment-648146247
Was just testing it, and can also confirm the case from the issue is fixed
jorisvandenbossche commented on pull request #7395:
URL: https://github.com/apache/arrow/pull/7395#issuecomment-648165633
More comments on this? (apart from ensuring the tests pass)
I should probably still add it to the filesystem docs.
wesm commented on a change in pull request #7321:
URL: https://github.com/apache/arrow/pull/7321#discussion_r444249804
##
File path: format/Schema.fbs
##
@@ -134,11 +134,20 @@ table FixedSizeBinary {
table Bool {
}
+/// Exact decimal value represented as an integer value
pitrou closed pull request #7521:
URL: https://github.com/apache/arrow/pull/7521
wesm closed pull request #7516:
URL: https://github.com/apache/arrow/pull/7516
romainfrancois opened a new pull request #7524:
URL: https://github.com/apache/arrow/pull/7524
``` r
library(arrow, warn.conflicts = FALSE)
tab <- Table$create(
a = structure(1:4, foo = "bar"),
b = haven::labelled(1:4, label = "description")
)
tab$metadata$r
#>
github-actions[bot] commented on pull request #7523:
URL: https://github.com/apache/arrow/pull/7523#issuecomment-648087751
https://issues.apache.org/jira/browse/ARROW-8733
jorisvandenbossche edited a comment on pull request #7522:
URL: https://github.com/apache/arrow/pull/7522#issuecomment-648146247
Was just testing it, and can also confirm the case from the issue is fixed,
and the code looks good to me
wesm commented on pull request #7516:
URL: https://github.com/apache/arrow/pull/7516#issuecomment-648162680
+1. The bot changes can't be done here so going to go ahead and merge this
so I can use it more easily without having to switch branches (to use this
branch) before running
pitrou commented on pull request #7521:
URL: https://github.com/apache/arrow/pull/7521#issuecomment-648019411
Let's leave sorting optimizations for another PR. I'll review this one.
jorisvandenbossche opened a new pull request #7523:
URL: https://github.com/apache/arrow/pull/7523
Not a polished PR, just a quick try (in cython, since that's faster for me)
to expose the RowGroupInfo statistics in Python + convert the expression into
min/max information. More as food
jorisvandenbossche commented on a change in pull request #7520:
URL: https://github.com/apache/arrow/pull/7520#discussion_r444295036
##
File path: docs/source/developers/contributing.rst
##
@@ -124,29 +181,72 @@ To contribute a patch:
`ARROW-767: [C++] Filesystem
nealrichardson commented on a change in pull request #7514:
URL: https://github.com/apache/arrow/pull/7514#discussion_r444302116
##
File path: r/src/array_from_vector.cpp
##
@@ -1067,12 +1110,22 @@ std::shared_ptr
InferArrowTypeFromVector(SEXP x) {
if (Rf_inherits(x,
wesm opened a new pull request #7525:
URL: https://github.com/apache/arrow/pull/7525
romainfrancois commented on a change in pull request #7514:
URL: https://github.com/apache/arrow/pull/7514#discussion_r444308097
##
File path: r/src/array_from_vector.cpp
##
@@ -1067,12 +1110,22 @@ std::shared_ptr
InferArrowTypeFromVector(SEXP x) {
if (Rf_inherits(x,
rjzamora edited a comment on pull request #7523:
URL: https://github.com/apache/arrow/pull/7523#issuecomment-648269136
Thanks for working on this @jorisvandenbossche !
This does seem like the functionality needed by Dask. To test my
understanding (and for the sake of discussion), I
wesm edited a comment on pull request #7525:
URL: https://github.com/apache/arrow/pull/7525#issuecomment-648279829
Here's the sort benchmarks prior to the initial visitor_inline.h changes
gcc-8:
```
benchmark baseline
romainfrancois commented on a change in pull request #7514:
URL: https://github.com/apache/arrow/pull/7514#discussion_r444283172
##
File path: r/src/array_from_vector.cpp
##
@@ -1067,12 +1110,22 @@ std::shared_ptr
InferArrowTypeFromVector(SEXP x) {
if (Rf_inherits(x,
romainfrancois commented on a change in pull request #7514:
URL: https://github.com/apache/arrow/pull/7514#discussion_r444281970
##
File path: r/src/array_from_vector.cpp
##
@@ -1067,12 +1110,22 @@ std::shared_ptr
InferArrowTypeFromVector(SEXP x) {
if (Rf_inherits(x,
wesm commented on a change in pull request #7520:
URL: https://github.com/apache/arrow/pull/7520#discussion_r444285972
##
File path: docs/source/developers/contributing.rst
##
@@ -124,29 +181,72 @@ To contribute a patch:
`ARROW-767: [C++] Filesystem abstraction
jorisvandenbossche commented on a change in pull request #7520:
URL: https://github.com/apache/arrow/pull/7520#discussion_r444293158
##
File path: docs/source/developers/contributing.rst
##
@@ -124,29 +181,72 @@ To contribute a patch:
`ARROW-767: [C++] Filesystem
nealrichardson commented on a change in pull request #7524:
URL: https://github.com/apache/arrow/pull/7524#discussion_r444306795
##
File path: r/tests/testthat/test-Table.R
##
@@ -334,5 +334,5 @@ test_that("Table metadata", {
test_that("Table handles null type
wesm commented on pull request #7525:
URL: https://github.com/apache/arrow/pull/7525#issuecomment-648238512
Here are some vector-hash benchmarks comparing this branch with master. The
performance "regressions" are for the 99%-100% null cases, I'll take a quick
look at these in the
fsaintjacques commented on pull request #7517:
URL: https://github.com/apache/arrow/pull/7517#issuecomment-648244980
I can't comment on the production quality of MinIO since I've never used it
in such a scenario. I meant this for reference to other developers who want to
test the S3
nealrichardson commented on a change in pull request #7520:
URL: https://github.com/apache/arrow/pull/7520#discussion_r444333447
##
File path: docs/source/developers/contributing.rst
##
@@ -124,29 +181,72 @@ To contribute a patch:
`ARROW-767: [C++] Filesystem abstraction
rjzamora commented on pull request #7523:
URL: https://github.com/apache/arrow/pull/7523#issuecomment-648269136
Thanks for working on this @jorisvandenbossche !
This does seem like the functionality needed by Dask. To test my
understanding (and for the sake of discussion), I am
wesm commented on a change in pull request #7520:
URL: https://github.com/apache/arrow/pull/7520#discussion_r444288120
##
File path: docs/source/developers/contributing.rst
##
@@ -124,29 +181,72 @@ To contribute a patch:
`ARROW-767: [C++] Filesystem abstraction
wesm commented on pull request #7143:
URL: https://github.com/apache/arrow/pull/7143#issuecomment-648230676
I'm not sure what the MSVC failure is about but I'll debug locally
nealrichardson commented on a change in pull request #7524:
URL: https://github.com/apache/arrow/pull/7524#discussion_r444311774
##
File path: r/R/table.R
##
@@ -202,7 +210,27 @@ Table$create <- function(..., schema = NULL) {
#' @export
as.data.frame.Table <- function(x,
nealrichardson commented on a change in pull request #7520:
URL: https://github.com/apache/arrow/pull/7520#discussion_r444322251
##
File path: docs/source/developers/contributing.rst
##
@@ -76,46 +96,83 @@ visibility. They may add a "Fix version" to indicate that
they're
nealrichardson commented on a change in pull request #7520:
URL: https://github.com/apache/arrow/pull/7520#discussion_r444330998
##
File path: docs/source/developers/contributing.rst
##
@@ -124,29 +181,72 @@ To contribute a patch:
`ARROW-767: [C++] Filesystem abstraction
bkietz closed pull request #7513:
URL: https://github.com/apache/arrow/pull/7513
github-actions[bot] commented on pull request #7524:
URL: https://github.com/apache/arrow/pull/7524#issuecomment-648198565
https://issues.apache.org/jira/browse/ARROW-8899
jorisvandenbossche commented on a change in pull request #7520:
URL: https://github.com/apache/arrow/pull/7520#discussion_r444268553
##
File path: docs/source/developers/contributing.rst
##
@@ -124,29 +181,72 @@ To contribute a patch:
`ARROW-767: [C++] Filesystem
romainfrancois commented on a change in pull request #7524:
URL: https://github.com/apache/arrow/pull/7524#discussion_r444273703
##
File path: r/tests/testthat/test-Table.R
##
@@ -334,5 +334,5 @@ test_that("Table metadata", {
test_that("Table handles null type
lionel- commented on a change in pull request #7514:
URL: https://github.com/apache/arrow/pull/7514#discussion_r444292367
##
File path: r/src/array_from_vector.cpp
##
@@ -1067,12 +1110,22 @@ std::shared_ptr
InferArrowTypeFromVector(SEXP x) {
if (Rf_inherits(x,
nealrichardson commented on a change in pull request #7520:
URL: https://github.com/apache/arrow/pull/7520#discussion_r444320449
##
File path: docs/source/developers/contributing.rst
##
@@ -168,11 +274,15 @@ remote repo still holds the old history, you would need
to do a
alippai commented on pull request #7517:
URL: https://github.com/apache/arrow/pull/7517#issuecomment-648247832
Thanks, now I understand. So the pairing with toxiproxy is for the testing
:))
That's what you wrote, I just misunderstood
wesm commented on pull request #7525:
URL: https://github.com/apache/arrow/pull/7525#issuecomment-648270948
OK I'm done twiddling this, here is the latest comparison of the hash
benchmarks versus master with gcc-8:
```
benchmark baseline
wesm commented on pull request #7525:
URL: https://github.com/apache/arrow/pull/7525#issuecomment-648235922
Here's what I see in the sort benchmarks with this patch compared with
7ed698b94, the patch right before the visitor_inline.h changes
```
github-actions[bot] commented on pull request #7525:
URL: https://github.com/apache/arrow/pull/7525#issuecomment-648240180
https://issues.apache.org/jira/browse/ARROW-9214
nealrichardson commented on a change in pull request #7520:
URL: https://github.com/apache/arrow/pull/7520#discussion_r444318497
##
File path: docs/source/developers/contributing.rst
##
@@ -124,29 +181,72 @@ To contribute a patch:
`ARROW-767: [C++] Filesystem abstraction
bkietz commented on pull request #7493:
URL: https://github.com/apache/arrow/pull/7493#issuecomment-648252136
Hmm, there's a failure building with GCC 4.8
https://github.com/apache/arrow/pull/7493/checks?check_run_id=791725319#step:9:534
The `#ifdef` condition seems to be failing to
wesm commented on pull request #7525:
URL: https://github.com/apache/arrow/pull/7525#issuecomment-648279829
Here's the sort benchmarks prior to the visitor_inline.h changes
gcc-8:
```
benchmark baseline
kiszk commented on pull request #7507:
URL: https://github.com/apache/arrow/pull/7507#issuecomment-648320579
Are there any comments about this approach for preparing test cases between
different endians? cc @pitrou @wesm
If not, I will prepare other tests (but disabled now) with this
paddyhoran commented on a change in pull request #7500:
URL: https://github.com/apache/arrow/pull/7500#discussion_r43777
##
File path: rust/parquet/src/record/api.rs
##
@@ -893,16 +893,6 @@ mod tests {
assert_eq!(row, Field::TimestampMillis(123854406));
}
paddyhoran closed pull request #7466:
URL: https://github.com/apache/arrow/pull/7466
wesm closed pull request #7518:
URL: https://github.com/apache/arrow/pull/7518
kszucs commented on pull request #7516:
URL: https://github.com/apache/arrow/pull/7516#issuecomment-648473447
I’m going to update the bot tomorrow.
wesm closed pull request #7529:
URL: https://github.com/apache/arrow/pull/7529
wesm commented on a change in pull request #7530:
URL: https://github.com/apache/arrow/pull/7530#discussion_r444564799
##
File path: cpp/src/arrow/compute/kernels/scalar_arithmetic_test.cc
##
@@ -39,7 +40,7 @@ namespace arrow {
namespace compute {
template
-class
wesm opened a new pull request #7530:
URL: https://github.com/apache/arrow/pull/7530
I also did a little bit of cleaning, moving some stuff into
`arrow::compute::internal`.
wesm commented on pull request #7530:
URL: https://github.com/apache/arrow/pull/7530#issuecomment-648482567
Example use in Python:
```
In [14]: arr = pa.array(pd.date_range('2000-01-01', periods=20))
github-actions[bot] commented on pull request #7530:
URL: https://github.com/apache/arrow/pull/7530#issuecomment-648484942
https://issues.apache.org/jira/browse/ARROW-8934
bkietz opened a new pull request #7526:
URL: https://github.com/apache/arrow/pull/7526
The physical schema is required to validate predicates used for filtering
row groups based on statistics.
It can also be explicitly provided to ensure that if no row groups satisfy
the predicate
wesm commented on pull request #7525:
URL: https://github.com/apache/arrow/pull/7525#issuecomment-648410615
I looked at the Parquet read/write benchmarks, the differences look like
mostly noise to me
```
benchmark baseline contender
wesm commented on pull request #7525:
URL: https://github.com/apache/arrow/pull/7525#issuecomment-648410864
+1. We can work on performance smithing in follow up PRs
wesm closed pull request #7525:
URL: https://github.com/apache/arrow/pull/7525
maxburke commented on a change in pull request #7500:
URL: https://github.com/apache/arrow/pull/7500#discussion_r70083
##
File path: rust/parquet/src/record/api.rs
##
@@ -893,16 +893,6 @@ mod tests {
assert_eq!(row, Field::TimestampMillis(123854406));
}
jorisvandenbossche commented on a change in pull request #7526:
URL: https://github.com/apache/arrow/pull/7526#discussion_r86902
##
File path: cpp/src/arrow/dataset/file_parquet.cc
##
@@ -357,13 +355,20 @@ static inline Result>
AugmentRowGroups(
return row_groups;
}
github-actions[bot] commented on pull request #7526:
URL: https://github.com/apache/arrow/pull/7526#issuecomment-648401641
https://issues.apache.org/jira/browse/ARROW-9146
fsaintjacques commented on a change in pull request #7526:
URL: https://github.com/apache/arrow/pull/7526#discussion_r92049
##
File path: cpp/src/arrow/dataset/file_parquet.cc
##
@@ -357,13 +355,20 @@ static inline Result>
AugmentRowGroups(
return row_groups;
}
jacques-n commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-648427018
I'm really struggling with these changes. I don't understand why there is a
validity buffer at the union level as well as at the cell level. I'm not sure
what it even means
bkietz commented on a change in pull request #7526:
URL: https://github.com/apache/arrow/pull/7526#discussion_r444509707
##
File path: cpp/src/arrow/dataset/file_parquet.cc
##
@@ -357,13 +355,20 @@ static inline Result>
AugmentRowGroups(
return row_groups;
}
-Result
wesm closed pull request #7528:
URL: https://github.com/apache/arrow/pull/7528
wesm commented on pull request #7528:
URL: https://github.com/apache/arrow/pull/7528#issuecomment-648561165
+1
wesm commented on pull request #7530:
URL: https://github.com/apache/arrow/pull/7530#issuecomment-648561822
+1
wesm closed pull request #7530:
URL: https://github.com/apache/arrow/pull/7530
wesm commented on pull request #7501:
URL: https://github.com/apache/arrow/pull/7501#issuecomment-648562595
Hm I think this lint step should be merged into the main Lint workflow.
@kszucs can you help?
houqp commented on pull request #7501:
URL: https://github.com/apache/arrow/pull/7501#issuecomment-648576776
@kszucs let me know if there is anything I can help with to move it to the main
lint workflow.
jacques-n commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-648428724
Adding to my previous comments: if only at the top level, I'm not sure what
the ramifications of that would be for the Java codebase. I think it would
require a fairly massive
jacques-n edited a comment on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-648427018
I'm really struggling with these changes. I don't understand why there is a
validity buffer at the union level as well as at the cell level. I'm not sure
what it even
jacques-n commented on a change in pull request #6402:
URL: https://github.com/apache/arrow/pull/6402#discussion_r444514257
##
File path:
java/vector/src/main/java/org/apache/arrow/vector/BaseVariableWidthVector.java
##
@@ -751,55 +757,57 @@ private void
wesm edited a comment on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-648435911
> @wesm why would we have validity at both the top level and the inner level?
Well, the way the specification is written
* _All_ nested types including union are
wesm commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-648439435
FTR I'm OK with dropping the top-level validity bitmap from Union,
especially if it helps us move forward
wesm commented on pull request #7143:
URL: https://github.com/apache/arrow/pull/7143#issuecomment-648446373
I'm able to reproduce the error in VS and set breakpoints, I got this far to
see that GetBatchWithDictSpaced has decoded more values than it was asked to
nealrichardson opened a new pull request #7527:
URL: https://github.com/apache/arrow/pull/7527
Sprinkles `Rf_translateCharUTF8` a few places. I tried to add tests for all
of the different scenarios I could think of where we could have non-UTF strings.
Also includes `$` and `[[`
wesm commented on pull request #7143:
URL: https://github.com/apache/arrow/pull/7143#issuecomment-648451899
there seems to be a situation where the bit run has more values than are
needed to fulfill the call to `GetSpaced`
github-actions[bot] commented on pull request #7527:
URL: https://github.com/apache/arrow/pull/7527#issuecomment-648451652
https://issues.apache.org/jira/browse/ARROW-7018
wesm commented on pull request #7143:
URL: https://github.com/apache/arrow/pull/7143#issuecomment-648453423
@emkornfield I'm sort of at a dead end here, hopefully the above gives you
some clues about where there might be a problem
wesm closed pull request #7470:
URL: https://github.com/apache/arrow/pull/7470
github-actions[bot] commented on pull request #7528:
URL: https://github.com/apache/arrow/pull/7528#issuecomment-648472075
https://issues.apache.org/jira/browse/ARROW-8933
wesm commented on pull request #7529:
URL: https://github.com/apache/arrow/pull/7529#issuecomment-648472135
I'll merge this ASAP to minimize the number of broken buidls
github-actions[bot] commented on pull request #7529:
URL: https://github.com/apache/arrow/pull/7529#issuecomment-648472074
https://issues.apache.org/jira/browse/ARROW-8025
wesm opened a new pull request #7529:
URL: https://github.com/apache/arrow/pull/7529
wesm edited a comment on pull request #7529:
URL: https://github.com/apache/arrow/pull/7529#issuecomment-648472135
I'll merge this ASAP to minimize the number of broken builds
wesm commented on pull request #7290:
URL: https://github.com/apache/arrow/pull/7290#issuecomment-648435911
> @wesm why would we have validity at both the top level and the inner level?
Well, the way the specification is written
* _All_ nested types including union are
nealrichardson commented on a change in pull request #7527:
URL: https://github.com/apache/arrow/pull/7527#discussion_r444530279
##
File path: r/src/array_from_vector.cpp
##
@@ -159,6 +159,9 @@ struct VectorToArrayConverter {
if (s == NA_STRING) {