Branch: refs/heads/blead
  Home:   https://github.com/Perl/perl5
  Commit: 4e9d64bf17d377e8e2f1412ef936b804fb0ee215
      
https://github.com/Perl/perl5/commit/4e9d64bf17d377e8e2f1412ef936b804fb0ee215
  Author: Richard Leach <[email protected]>
  Date:   2025-12-02 (Tue, 02 Dec 2025)

  Changed paths:
    M pp.c

  Log Message:
  -----------
  pp_reverse - chunk-at-a-time string reversal

The performance characteristics of string reversal in blead are highly
variable, depending upon the capabilities of the C compiler. Some
compilers are able to vectorize some cases for better performance.

This commit introduces explicit reversal and swapping of whole
registers at a time, which all builds seem to be able to
benefit from.

The `_swab_xx_` macros for doing this already exist in perl.h;
using them for this purpose was inspired by
https://dev.to/wunk/fast-array-reversal-with-simd-j3p
The bit shifting done by these macros should be portable and reasonably
performant even if not optimised further, but it is likely that they will
be optimised to bswap, rev, or movbe instructions.
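
As a rough illustration of the chunk-at-a-time idea (not the code in pp.c),
the sketch below reverses a string into a separate buffer eight bytes at a
time. The names `swab64_sketch` and `reverse_copy_sketch` are hypothetical
and stand in for the perl.h macros and the pp_reverse logic; the 8-byte
chunk size and the byte-wise tail loop are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a perl.h byte-swap macro: reverse the byte
 * order of a 64-bit value using only shifts and masks. */
static uint64_t swab64_sketch(uint64_t x)
{
    x = ((x & 0x00000000FFFFFFFFULL) << 32) | ((x >> 32) & 0x00000000FFFFFFFFULL);
    x = ((x & 0x0000FFFF0000FFFFULL) << 16) | ((x >> 16) & 0x0000FFFF0000FFFFULL);
    x = ((x & 0x00FF00FF00FF00FFULL) << 8)  | ((x >> 8)  & 0x00FF00FF00FF00FFULL);
    return x;
}

/* Copy len bytes from src into dst in reverse order.
 * Assumes src and dst do not overlap. */
static void reverse_copy_sketch(char *dst, const char *src, size_t len)
{
    const char *s = src;       /* reads forwards from the start of src */
    char *d = dst + len;       /* writes backwards from the end of dst */

    /* Whole chunks: load 8 bytes, reverse them in a register, store. */
    while ((size_t)(d - dst) >= 8) {
        uint64_t chunk;
        memcpy(&chunk, s, 8);  /* memcpy avoids unaligned-access issues */
        chunk = swab64_sketch(chunk);
        d -= 8;
        memcpy(d, &chunk, 8);
        s += 8;
    }

    /* Tail: fewer than 8 bytes remain, copy them one at a time. */
    while (d > dst) {
        *--d = *s++;
    }
}
```

Because a byte swap reverses the in-memory order of the loaded bytes on both
little-endian and big-endian machines, the sketch needs no endianness checks.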

Some performance comparisons:

1. Large string reversal, with different source & destination buffers
    my $x = "X"x(1024*1000*10); my $y; for (0..1_000) { $y = reverse $x }

gcc blead:
          2,388.30 msec task-clock                       #    0.993 CPUs utilized
    10,574,195,388      cycles                           #    4.427 GHz
    61,520,672,268      instructions                     #    5.82  insn per cycle
    10,255,049,869      branches                         #    4.294 G/sec

clang blead:
            688.37 msec task-clock                       #    0.946 CPUs utilized
     3,161,754,439      cycles                           #    4.593 GHz
     8,986,420,860      instructions                     #    2.84  insn per cycle
       324,734,391      branches                         #  471.745 M/sec

gcc patched:
            408.39 msec task-clock                       #    0.936 CPUs utilized
     1,617,273,653      cycles                           #    3.960 GHz
     6,422,991,675      instructions                     #    3.97  insn per cycle
       644,856,283      branches                         #    1.579 G/sec

clang patched:
            397.61 msec task-clock                       #    0.924 CPUs utilized
     1,655,838,316      cycles                           #    4.165 GHz
     5,782,487,237      instructions                     #    3.49  insn per cycle
       324,586,437      branches                         #  816.350 M/sec

2. Large string reversal, but reversing the buffer in-place
    my $x = "X"x(1024*1000*10); my $y; for (0..1_000) { $y = reverse "foo",$x }

gcc blead:
          6,038.06 msec task-clock                       #    0.996 CPUs utilized
    27,109,273,840      cycles                           #    4.490 GHz
    41,987,097,139      instructions                     #    1.55  insn per cycle
     5,211,350,347      branches                         #  863.083 M/sec

clang blead:
          5,815.86 msec task-clock                       #    0.995 CPUs utilized
    26,962,768,616      cycles                           #    4.636 GHz
    47,111,208,664      instructions                     #    1.75  insn per cycle
     5,211,117,921      branches                         #  896.018 M/sec

gcc patched:
          1,003.49 msec task-clock                       #    0.999 CPUs utilized
     4,298,242,624      cycles                           #    4.283 GHz
     7,387,822,303      instructions                     #    1.72  insn per cycle
       725,892,855      branches                         #  723.367 M/sec

clang patched:
            970.78 msec task-clock                       #    0.973 CPUs utilized
     4,436,489,695      cycles                           #    4.570 GHz
     8,028,374,567      instructions                     #    1.81  insn per cycle
       725,867,979      branches                         #  747.713 M/sec

3. Short string reversal, different source & destination (checking performance
on smaller string reversals - note: this one's very variable due to noise)
    my $x = "1234567"; my $y; for (0..10_000_000) { $y = reverse $x }

gcc blead:
            401.20 msec task-clock                       #    0.916 CPUs utilized
     1,672,263,966      cycles                           #    4.168 GHz
     5,564,078,603      instructions                     #    3.33  insn per cycle
     1,250,983,219      branches                         #    3.118 G/sec

clang blead:
            380.58 msec task-clock                       #    0.998 CPUs utilized
     1,615,634,265      cycles                           #    4.245 GHz
     5,583,854,366      instructions                     #    3.46  insn per cycle
     1,300,935,443      branches                         #    3.418 G/sec

gcc patched:
            381.62 msec task-clock                       #    0.999 CPUs utilized
     1,566,807,988      cycles                           #    4.106 GHz
     5,474,069,670      instructions                     #    3.49  insn per cycle
     1,240,983,221      branches                         #    3.252 G/sec

clang patched:
            346.21 msec task-clock                       #    0.999 CPUs utilized
     1,600,780,787      cycles                           #    4.624 GHz
     5,493,773,623      instructions                     #    3.43  insn per cycle
     1,270,915,076      branches                         #    3.671 G/sec
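
For the in-place case measured in benchmark 2, one plausible shape for the
"reversal and swapping of whole registers" is to take a chunk from each end
of the buffer, byte-reverse both, and store them crossed over. The sketch
below is an illustration under that assumption only, not the pp.c code;
`bswap64_portable` and `reverse_inplace_sketch` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical portable 64-bit byte swap using shifts and masks. */
static uint64_t bswap64_portable(uint64_t x)
{
    x = ((x & 0x00000000FFFFFFFFULL) << 32) | ((x >> 32) & 0x00000000FFFFFFFFULL);
    x = ((x & 0x0000FFFF0000FFFFULL) << 16) | ((x >> 16) & 0x0000FFFF0000FFFFULL);
    x = ((x & 0x00FF00FF00FF00FFULL) << 8)  | ((x >> 8)  & 0x00FF00FF00FF00FFULL);
    return x;
}

/* Reverse len bytes of buf in place. */
static void reverse_inplace_sketch(char *buf, size_t len)
{
    char *lo = buf;            /* first unprocessed byte          */
    char *hi = buf + len;      /* one past last unprocessed byte  */

    /* While two non-overlapping 8-byte chunks remain, reverse each
     * chunk in a register and write it to the opposite end. */
    while ((size_t)(hi - lo) >= 16) {
        uint64_t a, b;
        memcpy(&a, lo, 8);
        memcpy(&b, hi - 8, 8);
        a = bswap64_portable(a);
        b = bswap64_portable(b);
        memcpy(lo, &b, 8);
        memcpy(hi - 8, &a, 8);
        lo += 8;
        hi -= 8;
    }

    /* Middle section of fewer than 16 bytes: plain byte swaps. */
    while (hi - lo > 1) {
        char tmp = *lo;
        *lo++ = *--hi;
        *hi = tmp;
    }
}
```

Both chunks are loaded before either is stored, so the two stores never
clobber data that has yet to be read.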


  Commit: dd8309d23d7193f614a9c35def5da5bc2fed9300
      
https://github.com/Perl/perl5/commit/dd8309d23d7193f614a9c35def5da5bc2fed9300
  Author: Richard Leach <[email protected]>
  Date:   2025-12-02 (Tue, 02 Dec 2025)

  Changed paths:
    M pp.c

  Log Message:
  -----------
  Add MSVC support for chunk-at-a-time string reversal

The heavy lifting for this commit was done by bulk88 in GH#23374.

Any deficiencies in transcription are down to richardleach. ;)


Compare: https://github.com/Perl/perl5/compare/407267be97c8...dd8309d23d71
