Re: [pcre-dev] Powerpc optimisation

Zoltán Herczeg Fri, 05 Jun 2015 06:50:32 -0700

Hi Frederic,

thank you for measuring PCRE on PPC. The results are quite interesting.


It seems to me that the slower patterns are those that require heavy 
backtracking, i.e. those where fast-forward (skipping) algorithms cannot be 
used, or where they match too frequently. /[a-zA-Z]+ing/ is a good example. 
Backtracking engines (PCRE, Oniguruma) suffer much more on PPC than those that 
read the input once (TRE, RE2). I suspect branch prediction is better on x86, 
but only a statistical profiler can prove that. OProfile is available 
everywhere, and can profile JIT code. That part is developed by IBM :)

http://oprofile.sourceforge.net/doc/devel/index.html

It needs some extra coding though. If you are interested in working on that, I 
can help.

Btw the Tom.{10,25}river|river.{10,25}Tom pattern is twice as fast on PPC with 
JIT if I understand the numbers correctly.
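
To make the fast-forward remark above concrete, here is a minimal Python 
sketch (not PCRE code). It only counts how many positions in a sample string 
would have to be tried as a potential match start: for [a-zA-Z]+ing every 
letter qualifies, while for a pattern starting with the literal "Tom" only 
occurrences of "Tom" do. The sample text and the helper names 
(count_class_starts, count_literal_starts) are made up for illustration, and 
the real PCRE-JIT start optimisations are more elaborate than this.

```python
import re
import string

# Hypothetical sample input, chosen only for illustration.
SAMPLE = "Tom was sitting by the river, watching Huck fishing and singing."

LETTERS = frozenset(string.ascii_letters)  # the class [a-zA-Z]


def count_class_starts(text, allowed):
    """Count positions whose character is in `allowed`.

    For a pattern that begins with a character class such as [a-zA-Z]+,
    each of these positions is a candidate match start.
    """
    return sum(1 for ch in text if ch in allowed)


def count_literal_starts(text, literal):
    """Count (possibly overlapping) occurrences of a literal prefix.

    For a pattern that begins with a literal such as "Tom", only these
    positions need to be handed to the full matcher.
    """
    count = 0
    pos = text.find(literal)
    while pos != -1:
        count += 1
        pos = text.find(literal, pos + 1)
    return count


if __name__ == "__main__":
    print("candidate starts for [a-zA-Z]+ing:",
          count_class_starts(SAMPLE, LETTERS))
    print("actual matches of [a-zA-Z]+ing:",
          len(re.findall(r"[a-zA-Z]+ing", SAMPLE)))
    print("candidate starts for a 'Tom' prefix:",
          count_literal_starts(SAMPLE, "Tom"))
```

On ordinary English text the first count is far larger than the other two, 
which is the case described above where skipping does not help and a match 
has to be attempted at nearly every letter.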

Regards,
Zoltan

Frederic Bonnard <[email protected]> wrote:
>Thanks Zoltan for the quick reply.
>- Ok I think I got it for SSE2.
>- For SIMD instructions, I fear I don't currently have the knowledge for that,
>  but would be willing to learn/help.
>- A good start would be that 3rd point, about current code and performance
>  status on PPC vs x86.
>  I reused http://sljit.sourceforge.net/regex_perf.html, I hope it is relevant.
>  pcre directory has been updated to use latest 8.37 instead of 8.32.
>  My VMs were :
>  * x86-64 4x2.3GHz 4G memory on a x86-64 host
>  * ppc64el 4x3GHz 4G memory on a P8 host
>  * ppc64 4x3GHz 4G memory on a P8 host
>  All were installed with Ubuntu 14.04 LTS.
>  Note on Ubuntu for ppc64, default is to have binary in 32b running on a 64b
>  kernel, thus the binary 'runtest' is 32b. Maybe I'd need to try with 64b
>  binary.
>  Here is attached the results for those 3 environments. The goal is not to
>  find who's the best but rather find any odd behaviour. Also let's focus on
>  pcre/pcre-jit .
>  Any comment from experts eyes welcomed.
>  On my side, I see very comparable results between ppc64/ppc64el so no major
>  issue on ppc64el. Now, between x86 and ppc64el, the results for the latter
>  seem overall weaker, all the more that the x86 VM has lower freq.
>  Results would need maybe more repetition ? and percentage to compare but I
>  already see some x2 or x3 time slower results for pcre-jit :
>  .{0,3}(Tom|Sawyer|Huckleberry|Finn)
>  [a-zA-Z]+ing
>  ^[a-zA-Z]{0,4}ing[^a-zA-Z]
>  [a-zA-Z]+ing$
>  ^[a-zA-Z ]{5,}$
>  ^.{16,20}$
>  "[^"]{0,30}[?!\.]"
>  Tom.{10,25}river|river.{10,25}Tom
>
>  Any special treatment for these that could make the code generated on Power
>  weaker?
>
>  Fred
>
>-- 
>## List details at https://lists.exim.org/mailman/listinfo/pcre-dev 


-- 
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev
