Alright. If you can't see it, then it must have been something in my 
environment. While working on this I ran fprof to identify potential 
performance problems, and the version check showed up as a substantial 
part of the time spent in the regex code. Is that a valid use of fprof 
in your opinion? Since we're running this in a very tight loop, I also 
wanted to get rid of the Keyword.get calls when running regexes, so I 
swapped out Regex.run for :re.run, and that substantially improved 
performance overall.
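
As a minimal sketch of that kind of swap (illustrative only, not the project's actual code; the module name `SchemeMatch` and its function names are hypothetical), the two variants could look roughly like this, using the regex quoted further down in the thread:

```elixir
defmodule SchemeMatch do
  # Hypothetical module, for illustration only.
  # Same pattern as ~r/^([a-z][a-z0-9\+\-\.]*):/i, written as a plain string.
  @source "^([a-z][a-z0-9\\+\\-\\.]*):"

  # Compiles the pattern once with :re so callers can reuse it in a tight loop.
  def compile do
    {:ok, re_pattern} = :re.compile(@source, [:caseless])
    re_pattern
  end

  # Variant 1: goes through Regex.run/2, including its option handling.
  def with_regex(string) do
    Regex.run(~r/^([a-z][a-z0-9\+\-\.]*):/i, string)
  end

  # Variant 2: calls :re.run/3 directly and maps the result to the
  # shape Regex.run/2 returns (a list of captures, or nil).
  def with_re(string, re_pattern) do
    case :re.run(string, re_pattern, [{:capture, :all, :binary}]) do
      {:match, captures} -> captures
      :nomatch -> nil
    end
  end
end
```

Both variants should return the same captures for the same input; only the path taken to reach `:re.run/3` differs.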

I did not then go on to profile specifically whether removing the version 
check alone improves performance. So all I have to back up the claim that 
the version check is the root cause is fprof.
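
One rough way to look at the version check in isolation, sketched here under assumptions: it times `Regex.run/2`, a direct `:re.run/3` call, and `Regex.version/0` on its own with `:timer.tc/1`. The iteration count and the `time` helper are arbitrary choices for illustration, and code evaluated in a script is not representative of compiled code, so a proper benchmarking tool such as Benchee would still be needed for numbers worth quoting.

```elixir
regex = ~r/^([a-z][a-z0-9\+\-\.]*):/i
{:ok, re_pattern} = :re.compile("^([a-z][a-z0-9\\+\\-\\.]*):", [:caseless])

# Arbitrary iteration count, chosen only to keep the sketch quick to run.
iterations = 100_000

# Runs `fun` `iterations` times and returns the elapsed microseconds.
time = fn fun ->
  {micros, _result} =
    :timer.tc(fn -> Enum.each(1..iterations, fn _ -> fun.() end) end)

  micros
end

results = [
  {"Regex.run/2", time.(fn -> Regex.run(regex, "foo") end)},
  {":re.run/3", time.(fn -> :re.run("foo", re_pattern, [{:capture, :all, :binary}]) end)},
  # The version lookup on its own, without any matching.
  {"Regex.version/0 only", time.(fn -> Regex.version() end)}
]

for {name, micros} <- results do
  IO.puts("#{name}: #{micros / iterations * 1000} ns/call")
end
```

If the third line is small compared to the gap between the first two, the remaining difference would have to come from option processing and result handling rather than from the version check.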

On Friday, March 15, 2024 at 8:22:29 AM UTC+1 José Valim wrote:

> The 5% also take into account the option processing and result handling. 
> The version check itself is a subset of that. I was not able to measure 
> sensible gains after removing it.
>
>> On Fri, Mar 15, 2024 at 12:23 PM 'marcel...@googlemail.com' via 
>> elixir-lang-core <elixir-l...@googlegroups.com> wrote:
>>
>>> The benchmark results I'm getting are indeed not as dramatic as the 
>>> fprof results, but on the other hand also more than the 5% mentioned in the 
>>> PR which introduced the check: 
>>> https://github.com/elixir-lang/elixir/pull/9040
>>>
>>> ```elixir
>>> regex = ~r/^([a-z][a-z0-9\+\-\.]*):/i
>>> re_pattern = regex.re_pattern
>>>
>>> Benchee.run(%{
>>>   "Regex.run/2" => fn -> Regex.run(regex, "foo") end,
>>>   ":re.run/3" => fn -> :re.run("foo", re_pattern, [{:capture, :all, :binary}]) end
>>> })
>>> ```
>>>
>>> ```
>>> Name                  ips        average  deviation         median         99th %
>>> :re.run/3          2.88 M      346.90 ns  ±3623.51%         333 ns         417 ns
>>> Regex.run/2        1.98 M      504.74 ns  ±5851.21%         416 ns         542 ns
>>>
>>> Comparison:
>>> :re.run/3          2.88 M
>>> Regex.run/2        1.98 M - 1.46x slower +157.84 ns
>>> ```
>>> On Friday 15 March 2024 at 07:20:11 UTC+1 jan.k...@gmail.com wrote:
>>>
>>>> The difference was definitely measurable just in pure running time of 
>>>> the code, setting aside fprof. I'll post what I have after work today.
>>>>
>>>> On Thursday, March 14, 2024 at 10:21:25 PM UTC+1 José Valim wrote:
>>>>
>>>>> Do you have benchmarks or only the fprof results? fprof is not a 
>>>>> benchmarking tool: comparing fprof results from different code may be 
>>>>> misleading. Proper benchmarking is preferable. I am benchmarking locally 
>>>>> and I cannot measure any relevant difference even with the whole version 
>>>>> checking removed.
>>>>>
>>>>> On Thu, Mar 14, 2024 at 6:01 PM Jan Krüger <jan.k...@gmail.com> wrote:
>>>>>
>>>>>> Thanks a lot. I'm also happy to share our case and my fprof results, 
>>>>>> if that helps. I am very sure that my Erlang and Elixir versions match 
>>>>>> on the machine where I've tested this. Replacing Regex.run with an 
>>>>>> identical call to :re.run should show the performance improvement I've 
>>>>>> mentioned. The regex we've tested this on is: 
>>>>>>
>>>>>> ~r/^([a-z][a-z0-9\+\-\.]*):/i
>>>>>>
>>>>>> On Thursday, March 14, 2024 at 5:55:47 PM UTC+1 
>>>>>> marcel...@googlemail.com wrote:
>>>>>>
>>>>>>> I'm the maintainer of the RDF.ex library with the RDF.IRI module 
>>>>>>> mentioned in the OP. I can confirm that this fix doesn't affect the 
>>>>>>> problem, since we're actually not using `URI.parse/1` most of the time 
>>>>>>> (we use it only when dealing with relative URIs). Even in this case the 
>>>>>>> `Regex.version/0` call in `Regex.safe_run/3` (
>>>>>>> https://github.com/elixir-lang/elixir/blob/b8fca42e58850b56f65d0fb8a2086f2636141f61/lib/elixir/lib/regex.ex#L533)
>>>>>>> still performs the `:erlang.system_info/1` call. 
>>>>>>>
>>>>>>> On Thursday 14 March 2024 at 17:15:40 UTC+1 jan.k...@gmail.com 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I read the commit, and I don't think it fixes what our actual problem 
>>>>>>>> was. See my comment above. The problem is the actual call to 
>>>>>>>> :re.version, not the recompilation of the regex.
>>>>>>>>
>>>>>>>> On Thursday, March 14, 2024 at 4:37:43 PM UTC+1 José Valim wrote:
>>>>>>>>
>>>>>>>>> I have pushed a fix to main. But also note we provide precompiled 
>>>>>>>>> Elixir versions per OTP version. Using a matching version will always 
>>>>>>>>> give you the best results and that's not only about regexes. :)
>>>>>>>>>
>>>>>>>>> On Thu, Mar 14, 2024 at 2:20 PM Jan Krüger <jan.k...@gmail.com> 
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I've recently had to work on a code base that parses largish RDF 
>>>>>>>>>> XML files. Part of the code base does relatively simple regular 
>>>>>>>>>> expression matches, but since the files are large, it makes quite a 
>>>>>>>>>> lot of Regex.run calls. While profiling I've noticed that there are 
>>>>>>>>>> callouts to :erlang.system_info, which fetches the PCRE version BEAM 
>>>>>>>>>> was compiled against.
>>>>>>>>>>
>>>>>>>>>> An example regular expression from the code base in question 
>>>>>>>>>> matches the scheme part of a URL. I've replaced Regex.run with 
>>>>>>>>>> Erlang's :re.run for testing purposes, and at least for this case, the 
>>>>>>>>>> performance gain is quite dramatic.
>>>>>>>>>>
>>>>>>>>>> Comparing fprof results:
>>>>>>>>>>
>>>>>>>>>> ```
>>>>>>>>>> RDF.IRI.scheme/1                                               1176473   30615.618    2354.355
>>>>>>>>>> ---
>>>>>>>>>> RDF.IRI.scheme/1                                               1176473    3531.955    2353.905
>>>>>>>>>> ```
>>>>>>>>>>
>>>>>>>>>> I found this thread in the Google group, which actually talks 
>>>>>>>>>> about the reasoning for fetching the version and proposes an 
>>>>>>>>>> alternative.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> https://groups.google.com/g/elixir-lang-core/c/CgFdxIONvGg/m/HN9ryeVXAwAJ?pli=1
>>>>>>>>>>
>>>>>>>>>> Especially
>>>>>>>>>>
>>>>>>>>>> ```
>>>>>>>>>> Taking a further look at the code, the issue with recompiling 
>>>>>>>>>> regexes on the fly is that it makes executing the regexes more 
>>>>>>>>>> expensive, as we need to compute the version on every execution. We 
>>>>>>>>>> could store the version in ETS but that would have performance issues. 
>>>>>>>>>> Storing in a persistent_term would be great, but at the moment we 
>>>>>>>>>> support Erlang/OTP 20+. Thoughts?
>>>>>>>>>> ```
>>>>>>>>>>
>>>>>>>>>> Since this has a fairly noticeable impact, at least on all tests 
>>>>>>>>>> I've run, I wanted to start a discussion on whether this could be 
>>>>>>>>>> implemented/improved now.
>>>>>>>>>>
>>>>>>>>>> -- 
>>>>>>>>>> You received this message because you are subscribed to the 
>>>>>>>>>> Google Groups "elixir-lang-core" group.
>>>>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>>>>> send an email to elixir-lang-co...@googlegroups.com.
>>>>>>>>>> To view this discussion on the web visit 
>>>>>>>>>> https://groups.google.com/d/msgid/elixir-lang-core/44d498c7-82a4-46d2-89be-7919400e0297n%40googlegroups.com
>>>>>>>>>>  
>>>>>>>>>> <https://groups.google.com/d/msgid/elixir-lang-core/44d498c7-82a4-46d2-89be-7919400e0297n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>>>> .
>>>>>>>>>>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elixir-lang-core+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elixir-lang-core/ed6c6a7f-74f8-4a49-8c65-42b1ddd8a400n%40googlegroups.com.
