Surprised no one has yet mentioned:
    
    
    import std/algorithm
    proc main =
      var s = "Hello world"
      when defined(faster): reverse(s, 0, s.len - 1)
      for i in 1..200_000_000: reverse(s, 0, s.len - 1)
    main()
    
    

which sidesteps all the allocation stuff. Someone coming from Rust / new to 
Nim might be unaware of `var` vs. `let`, or that this was even an allocation 
benchmark, as @treeform observes. As with any measurement, it depends on what 
the point of said measurement is. Since @walkr did not show his Rust, I suspect 
it was not actually intended to measure alloc/dealloc cycles, which Rust folk 
tend to avoid. So, the above might help @walkr.
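To make the `var` vs. `let` point concrete, here is a minimal sketch contrasting the two styles. `reverseCopy` is a hypothetical stand-in for an allocating reverse (it is not from the original benchmark, which was not shown in full); the in-place call is the `std/algorithm` one used above.

```nim
import std/algorithm

# Hypothetical allocating version: every call builds and returns a new string.
proc reverseCopy(s: string): string =
  result = newString(s.len)
  for i in 0 ..< s.len:
    result[i] = s[s.len - 1 - i]

proc demo(): (string, string) =
  let original = "Hello world"   # `let`: immutable, so it can only be read
  let fresh = reverseCopy(original)  # allocates a fresh string per call
  var buf = "Hello world"        # `var`: mutable, so it can be reversed in place
  reverse(buf, 0, buf.len - 1)   # std/algorithm: swaps chars, no new string
  (fresh, buf)

when isMainModule:
  let (a, b) = demo()
  echo a   # dlrow olleH
  echo b   # dlrow olleH
```

Both give the same text; only the first pays for an alloc/dealloc per call, which is what a tight loop around it ends up measuring.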

FWIW, at least my backend gcc-13.2 compiler is **_NOT_** smart enough to 
optimize out the work, even with LTO & PGO. (Those do make that program about 
2.2X faster, incidentally, down to about 0.316 seconds on an otherwise idle 
i7-6700k, a not atypical pattern of speed-up that @Yardanico and I tend to 
observe; he could try clang if he likes :-) ) It is a good point to check 
that, though. Also, you may do much better with `--passC:-march=native`, which 
netted 1.6X for me; my `perf report` shows `vpshufb` (i.e. AVX2) in the inner 
loop. { Also, [tim](https://github.com/c-blake/bu/blob/main/doc/tim.md) 
is more careful about "reproduction" / "meaningfulness of ±" on time-sharing 
OSes (i.e. almost all of 'em) than `benchy` currently is, but both times & 
deltas here are big enough that timing is not subtle. }
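For anyone wanting a quick sanity check that the work is not being optimized away, a rough sketch along these lines may help. It is only an illustration, not what I ran: `timeReversals` is a made-up name, the iteration count is arbitrary, and a single wall-clock sample like this is much cruder than `tim` or `benchy`.

```nim
import std/[algorithm, monotimes, times]

# Hypothetical helper: reverse in place `n` times and return the final
# string together with the elapsed time. Returning (and printing) the
# string gives the result an observable use.
proc timeReversals(n: int): (string, Duration) =
  var s = "Hello world"
  let t0 = getMonoTime()
  for _ in 1 .. n:
    reverse(s, 0, s.len - 1)
  let dt = getMonoTime() - t0
  (s, dt)

when isMainModule:
  let (s, dt) = timeReversals(1_000_000)
  echo s, " ", dt.inMilliseconds, " ms"
```

An even `n` leaves the string as it started and an odd `n` leaves it reversed, so the printed string doubles as a cheap correctness check.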

* * *

As far as that allocation stuff goes, it seems like what is going on is that 
the extra call before the loop suppresses a destructor call (in both Nim-1.6 
and Nim-2.0/Nim-devel). Compiling once without and once with `-d:faster` and 
diffing the generated C files in the Nim cache got me the following (I just 
moved the first version, `~/.cache/nim/rWalkr_r`, to `~/.cache/nim/rWalkr_r0` 
before recompiling):
    
    
    diff -Nu rWalkr_r0/@mrWalkr.nim.c rWalkr_r/@mrWalkr.nim.c
    --- rWalkr_r0/@mrWalkr.nim.c    2023-08-08 06:08:32.887747209 -0400
    +++ rWalkr_r/@mrWalkr.nim.c     2023-08-08 06:08:50.344965932 -0400
    @@ -122,9 +122,20 @@
     #line 6 "/tmp/k/rWalkr.nim"
     N_LIB_PRIVATE N_NIMCALL(void, main__r87alkr_u7)(void) {
     #line 7
    -NimStringV2 s;NIM_BOOL* nimErr_;{nimErr_ = nimErrorFlag();     s.len = 0; s.p = NIM_NIL;//  let s = "Hello world"
    +NimStringV2 s;
    +#line 8
    +NimStringV2 colontmpD_;NIM_BOOL* nimErr_;{nimErr_ = nimErrorFlag();    s.len = 0; s.p = NIM_NIL;       colontmpD_.len = 0; colontmpD_.p = NIM_NIL;//  let s = "Hello world"
    +
    +#line 7
     //  let s = "Hello world"
    -       s = TM__3QhDSXtO082ujkyrhqLedQ_3;       {
    +       s = TM__3QhDSXtO082ujkyrhqLedQ_3;//  when defined(faster): discard reverse(s)
    +
    +#line 8
    +//  when defined(faster): discard reverse(s)
    +//  when defined(faster): discard reverse(s)
    +//  when defined(faster): discard reverse(s)
    +       colontmpD_ = reverse__r87alkr_u1(s);    if (NIM_UNLIKELY(*nimErr_)) goto BeforeRet_;    (void)(colontmpD_);
    +       {
     #line 9
     NI i;
     #line 90 "/usr/lib/nim/lib/system/iterators_1.nim"
    @@ -153,7 +164,8 @@
                                    if (_.p && !(_.p->cap & NIM_STRLIT_FLAG)) { dealloc(_.p);}                      } LA3: ;
                    }
            }
    -       }BeforeRet_: ;
    +//  proc `=destroy`*(x: string) {.inline, magic: "Destroy".} =
    +       if (colontmpD_.p && !(colontmpD_.p->cap & NIM_STRLIT_FLAG)) { dealloc(colontmpD_.p);}   }BeforeRet_: ;
     }
     
     N_LIB_PRIVATE void PreMainInner(void) {
    
    

`nim c --mm:arc --expandArc:reverse` does not show that `=destroy`, either 
with or without `-d:faster` { not sure if it is supposed to show _every_ 
`=op`... }. So, there may be a real issue or two here...
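For anyone wanting to reproduce the diff, here is a sketch of the kind of program it appears to come from, pieced together from the `//` source comments in the generated C (`let s = "Hello world"` and `when defined(faster): discard reverse(s)`). The body of `reverse` is my assumption, since the original was not shown, and the loop count is shortened for illustration.

```nim
# Assumed stand-in for the original allocating `reverse`; only its
# signature (string in, new string out) is implied by the generated C.
proc reverse(s: string): string =
  result = newString(s.len)
  for i in 0 ..< s.len:
    result[i] = s[s.len - 1 - i]

proc main =
  let s = "Hello world"
  # With -d:faster this call's result lands in a temporary (`colontmpD_`
  # in the diff) that is only destroyed at the end of `main`.
  when defined(faster): discard reverse(s)
  for i in 1 .. 1_000:   # loop count shortened for illustration
    discard reverse(s)   # each result is a temporary that must be freed

main()
```

Compile it once plain and once with `-d:faster`, keeping a copy of the first cache directory, and diff the two generated `.c` files as above.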
