wesm commented on pull request #7442:
URL: https://github.com/apache/arrow/pull/7442#issuecomment-645600060
+1. Thanks all for the comments
This is an automated message from the Apache Git Service.
To respond to the message,
wesm commented on pull request #7442:
URL: https://github.com/apache/arrow/pull/7442#issuecomment-645569875
@ursabot benchmark --benchmark-filter=Filter 04006ff
wesm commented on pull request #7442:
URL: https://github.com/apache/arrow/pull/7442#issuecomment-645556193
These "readability" improvements made performance worse, so I'll revert
them.
wesm commented on pull request #7442:
URL: https://github.com/apache/arrow/pull/7442#issuecomment-645526968
@ursabot benchmark --benchmark-filter=Filter 04006ff
wesm commented on pull request #7442:
URL: https://github.com/apache/arrow/pull/7442#issuecomment-645521577
Something is weird with the commit history; I'm not sure those benchmarks are
right. I'll rebase things again and rerun.
wesm commented on pull request #7442:
URL: https://github.com/apache/arrow/pull/7442#issuecomment-645498297
I think I improved some of the readability problems and addressed the other
comments. I'd like to merge this soon once CI is green.
wesm commented on pull request #7442:
URL: https://github.com/apache/arrow/pull/7442#issuecomment-645497918
@ursabot benchmark --benchmark-filter=Filter c4f425768
wesm commented on pull request #7442:
URL: https://github.com/apache/arrow/pull/7442#issuecomment-645004792
I'll have to deal with the string optimization in a follow-up PR, so I'm
going to leave this for review as is. It would be good to get this merged
sooner rather than later.
wesm commented on pull request #7442:
URL: https://github.com/apache/arrow/pull/7442#issuecomment-644920072
True. I think for binary-based types we need to implement
bulk-block-appends. It's beyond the scope of this PR -- I will take a brief
look to see if there's anything dumb (like messi
wesm commented on pull request #7442:
URL: https://github.com/apache/arrow/pull/7442#issuecomment-644913406
The string perf regressions are mostly for the cases where 99.9% of the
values are selected. I'll take a closer look at this to see what can be done.
The varbinary case is so importa
wesm commented on pull request #7442:
URL: https://github.com/apache/arrow/pull/7442#issuecomment-644892503
@ursabot benchmark --benchmark-filter=Filter 66df3d0
wesm commented on pull request #7442:
URL: https://github.com/apache/arrow/pull/7442#issuecomment-644892130
@buildbot benchmark --help
wesm commented on pull request #7442:
URL: https://github.com/apache/arrow/pull/7442#issuecomment-644881357
I found some issues in the Python benchmarks I posted before. Here's the
updated setup and current numbers
setup (I was including the cost of converting NumPy booleans to Arrow
wesm commented on pull request #7442:
URL: https://github.com/apache/arrow/pull/7442#issuecomment-644870737
I implemented some other optimizations, especially for the case where
neither values nor filter contain nulls. I'm working on updated benchmarks
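As a rough illustration of why the no-nulls case can be special-cased, here is a minimal pure-Python sketch. It is not the code from this PR: the function name `filter_column`, the list-based validity representation, and the choice to drop slots where the filter is null are all assumptions made for the example.

```python
from typing import List, Optional, Tuple


def filter_column(
    data: List[int],
    validity: Optional[List[bool]],
    mask: List[bool],
    mask_validity: Optional[List[bool]],
) -> Tuple[List[int], Optional[List[bool]]]:
    """Hypothetical filter over (data, validity) columns.

    A validity of None means "this column has no nulls".
    Slots where the filter itself is null are dropped (an assumed policy).
    """
    if len(data) != len(mask):
        raise ValueError("data and mask must have the same length")

    # Fast path: no nulls in values or filter, so no validity lookups
    # are needed and the output has no validity list either.
    if validity is None and mask_validity is None:
        return [v for v, keep in zip(data, mask) if keep], None

    # General path: consult the filter's validity before using it and
    # carry the value's validity through to the output.
    out_data: List[int] = []
    out_validity: List[bool] = []
    for i, keep in enumerate(mask):
        if mask_validity is not None and not mask_validity[i]:
            continue  # null filter slot: dropped in this sketch
        if keep:
            out_data.append(data[i])
            out_validity.append(True if validity is None else validity[i])
    return out_data, out_validity
```

For example, `filter_column([1, 2, 3, 4], None, [True, False, True, True], None)` takes the fast path and returns `([1, 3, 4], None)`.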
wesm commented on pull request #7442:
URL: https://github.com/apache/arrow/pull/7442#issuecomment-644742275
The RTools 4.0 build failure is spurious. This is ready for review.
wesm commented on pull request #7442:
URL: https://github.com/apache/arrow/pull/7442#issuecomment-644513681
To give some simple numbers for the perf before and after in Python,
this example has a high selectivity (all but one value selected) and low
selectivity filter (only 1% of value
wesm commented on pull request #7442:
URL: https://github.com/apache/arrow/pull/7442#issuecomment-644509797
Here are benchmark runs on my machine:
* BEFORE: https://gist.github.com/wesm/857a3179e7dbc928d3325b1e7f687086
* AFTER: https://gist.github.com/wesm/ad07cec1613b6327926dfe1d95e7