zhuliquan commented on PR #12270:
URL: https://github.com/apache/datafusion/pull/12270#issuecomment-2543947569
> I wonder if we can see improvements on queries in benchmarks with scalar
> regexes, e.g. clickbench?
Hello @Dandandan, I have written a benchmark for testing scalar regex matching
in PR #13789. I got the diff below (before: without pre-compiled pattern;
after: with pre-compiled pattern):
```text
test email address pattern
        time:   [14.047 ms 14.155 ms 14.264 ms]
        change: [+37.888% +41.826% +45.676%] (p = 0.00 < 0.05)
        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

test ip pattern
        time:   [3.5913 ms 3.6025 ms 3.6139 ms]
        change: [-45.614% -45.332% -45.065%] (p = 0.00 < 0.05)
        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

test phone number pattern
        time:   [12.893 ms 13.067 ms 13.303 ms]
        change: [-51.353% -50.433% -49.409%] (p = 0.00 < 0.05)
        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild

test html tag pattern
        time:   [13.158 ms 13.491 ms 13.865 ms]
        change: [+26.127% +29.636% +33.599%] (p = 0.00 < 0.05)
        Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
  4 (4.00%) high mild
  8 (8.00%) high severe

test url pattern
        time:   [12.467 ms 12.594 ms 12.726 ms]
        change: [+38.072% +42.490% +45.651%] (p = 0.00 < 0.05)
        Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) low mild
  4 (4.00%) high mild

test date pattern
        time:   [12.429 ms 12.523 ms 12.629 ms]
        change: [-37.932% -37.049% -36.163%] (p = 0.00 < 0.05)
        Performance has improved.
```
I'm quite confused that some cases have improved while others have regressed.
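
To make the mixed results easier to compare, here is a minimal sketch that back-computes the "before" time each `change:` line implies. It assumes the middle value on each `time:` line is the "after" point estimate and that the middle `change:` value is a plain relative change against the "before" run. The benchmark tool derives its change estimate from its own resampling, so these are rough figures, not measured baselines. The helper name `implied_baseline_ms` is made up for this illustration and is not part of DataFusion or the benchmark in PR #13789.

```rust
/// Estimate the "before" time implied by an "after" time and a relative change.
/// Hypothetical helper for illustration only.
/// `change_pct` is the middle value printed on the `change:` line, in percent.
fn implied_baseline_ms(after_ms: f64, change_pct: f64) -> f64 {
    // after = before * (1 + change), so before = after / (1 + change)
    after_ms / (1.0 + change_pct / 100.0)
}

fn main() {
    // (pattern name, "after" point estimate in ms, reported change in %),
    // copied from the benchmark output above.
    let cases = [
        ("email address", 14.155, 41.826),
        ("ip", 3.6025, -45.332),
        ("phone number", 13.067, -50.433),
        ("html tag", 13.491, 29.636),
        ("url", 12.594, 42.490),
        ("date", 12.523, -37.049),
    ];

    for (name, after_ms, change_pct) in cases {
        let before_ms = implied_baseline_ms(after_ms, change_pct);
        println!("{name:<14} before ~ {before_ms:7.3} ms, after = {after_ms:7.3} ms");
    }
}
```

Putting the implied "before" and measured "after" times side by side may help show whether the regressed cases started from a noticeably different baseline than the improved ones.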