Bram Moolenaar wrote:
>
> Dominique Pellé <[email protected]> wrote:
>
>> > Hi
>> >
>> > The regexp .\{1,9000} is very slow with the new regexp engine
>> > compared to the old engine. Compare the timing of those 2
>> > commands where the only difference is set re=1 vs set re=2:
>> >
>> > $ time vim -X -u NONE \
>> > -c 'set re=1' \
>> > -c 'call feedkeys("ihello\<esc>/.\\{1,9000}\<cr>:q!\<cr>")'
>> >
>> > real 0m0.004s
>> > user 0m0.000s
>> > sys 0m0.004s
>> >
>> >
>> > $ time vim -X -u NONE \
>> > -c 'set re=2' \
>> > -c 'call feedkeys("ihello\<esc>/.\\{1,9000}\<cr>:q!\<cr>")'
>> >
>> > real 0m6.499s
>> > user 0m6.432s
>> > sys 0m0.020s
>> >
>> > => new regexp engine is 1624x times times slower here.
>> >
>> > Is this a known limitation of the new NFA regexp engine?
>> >
>> > Regards
>> > Dominique
>>
>>
>> Changing the value of n in the regexp .\{1,n}
>> I see that the time it takes to match quadruples
>> when doubling n with the new regexp engine.
>>
>> => This means that complexity is O(n^2):
>>
>> With the old regexp engine, the time is constant.
>>
>> regexp re=1 re=2
>> ------------------------------
>> .\{1,1000} 0.004s 0.064s
>> .\{1,2000} 0.005s 0.305s
>> .\{1,4000} 0.005s 1.230s
>> .\{1,8000} 0.005s 4.799s
>> .\{1,16000} 0.005s 20.902s
>>
>> This was measured using for n=16000 for example using:
>>
>> $ time vim -X -u NONE -c 'set re=2' \
>> -c 'call feedkeys("ihello\<esc>/.\\{1,16000}\<cr>:q!\<cr>")'
>
> Is this specific for matching with "."?
It's also slow with at least:
re=1 re=2
----------------------------------
[a-z]\{1,8000} 0.004s 5.890s
\h\{1,8000} 0.004s 5.824s
I see that not only the complexity to match is O(n^2)
with the new regexp engine for the regexp .\{1,n},
but the complexity is also linear with the length N
of the line being matched. So the complexity seems
to be: O(N*n^2).
Time to match .\{1,8000} with line length of 1, 2, 4, 8, 16, 32:
line to match re=1 re=2
----------------------------------------------------
1 0.004s 0.016s
12 0.004s 1.527s
1234 0.004s 4.391s
12345678 0.004s 10.167s
1234567890123456 0.004s 21.466s
12345678901234567890123456789012 0.004s 44.849s
This was measured using for example:
$ time vim -X -u NONE -c 'set re=2' \
-c 'call feedkeys("i12345678\<esc>/.\\{1,8000}\<cr>:q!\<cr>")'
real 0m10.167s
user 0m10.085s
sys 0m0.024s
The non-greedy version is fast:
$ time vim -X -u NONE -c 'set re=2' \
-c 'call feedkeys("i12345678\<esc>/.\\{-1,8000}\<cr>:q!\<cr>")'
real 0m0.056s
ser 0m0.020s
sys 0m0.028s
> I would expect the old engine to be slower when what's being matched may
> fail.
I cannot find such an example. The old regexp engine
seems always much faster with things like \{1,8000}
Dominique
--
--
You received this message from the "vim_dev" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php
---
You received this message because you are subscribed to the Google Groups
"vim_dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
For more options, visit https://groups.google.com/d/optout.