Surprised no one has yet mentioned:
import std/algorithm
proc main =
  var s = "Hello world"
  when defined(faster): reverse(s, 0, s.len - 1)
  for i in 1..200_000_000: reverse(s, 0, s.len - 1)
main()
which sidesteps all the allocation stuff. Someone coming from Rust / new
to Nim might be unaware of `var` vs. `let` or that this was even an allocation
benchmark, as @treeform observes. As with any measurement, it depends on what
the point of said measurement is. Since @walkr did not show his Rust, I suspect
it was not actually intended to measure alloc/dealloc cycles, which Rust folk
tend to avoid. So, the above might help @walkr.
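
To make the `var` vs. `let` / allocation distinction concrete, here is a minimal sketch. `reversedCopy` and `reversedInPlace` are hypothetical names of mine; `reversedCopy` only stands in for an allocating reverse and is not @walkr's actual proc, which I am guessing at.

```nim
import std/algorithm

proc reversedCopy(s: string): string =
  ## Hypothetical stand-in for an allocating reverse: every call builds
  ## and returns a fresh string, which later has to be destroyed.
  result = newString(s.len)
  for i in 0 ..< s.len:
    result[i] = s[s.high - i]

proc reversedInPlace(s: var string) =
  ## Same std/algorithm call as in the snippet above: swaps chars inside
  ## the existing buffer, so no new string is allocated.
  if s.len > 1:                  # guard: `last` must not be negative
    reverse(s, 0, s.len - 1)

when isMainModule:
  let fixed = "Hello world"      # `let`: immutable, cannot be passed as `var`
  var scratch = fixed            # `var`: a mutable copy we may reverse in place
  reversedInPlace(scratch)
  echo scratch                   # reversed in place
  echo reversedCopy(fixed)       # reversed into a new string
  echo fixed                     # the `let` binding is unchanged
```

Calling `reversedCopy` 200 million times and discarding the result is an alloc/dealloc benchmark; calling `reversedInPlace` is not.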
FWIW, at least my backend gcc-13.2 compiler is **_NOT_** smart enough to
optimize out the work, even with LTO & PGO. Those do make that program about
2.2X faster, incidentally, down to about 0.316 seconds on an otherwise idle
i7-6700k (a not atypical pattern of speed-up that @Yardanico and I tend to
observe; he could try clang if he likes :-) ). It is a good point to check
that, though. Also, you may do much better with `--passC:-march=native`, since
my `perf report` shows `vpshufb` (i.e. AVX2) in the inner loop, which netted
1.6X for me. { Also, [tim](https://github.com/c-blake/bu/blob/main/doc/tim.md)
is more careful about "reproduction" / "meaningfulness of ±" on time-sharing
OSes (i.e. almost all of 'em) than `benchy` currently is, but both times &
deltas here are big enough that timing is not subtle. }
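
If you want a quick in-program sanity check on timing noise, one common trick is to repeat the run and keep the minimum. The sketch below does only that; it is not a description of what `tim` or `benchy` do internally, and `workload` / `bestOf` are hypothetical names. The iteration count in the demo is cut down so it finishes quickly even in a debug build.

```nim
import std/[algorithm, monotimes, times]

proc workload(n: int): string =
  ## The in-place reversal loop from the snippet above, with `n` iterations.
  var s = "Hello world"
  for i in 1..n: reverse(s, 0, s.len - 1)
  result = s                     # return the string so the work is observable

proc bestOf(runs, n: int): float =
  ## Hypothetical helper: minimum wall-clock seconds over `runs` repetitions.
  result = Inf
  for r in 1..runs:
    let t0 = getMonoTime()
    let s = workload(n)
    let dt = (getMonoTime() - t0).inNanoseconds.float / 1e9
    doAssert s.len == 11         # use the result of the timed work
    result = min(result, dt)

when isMainModule:
  echo "best of 5: ", bestOf(5, 2_000_000), " s"
```

The minimum is less sensitive to other processes stealing time slices than a single run or a mean, but it still says nothing about how meaningful a small delta is.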
* * *
As far as that allocation stuff goes, it seems like what is going on is that
the extra call before the loop suppresses a destructor call (in both Nim-1.6
and Nim-2.0/Nim-devel). Compiling without and then with `-d:faster` and diffing
the generated C files under `~/.cache/nim` got me the following (I just moved
the first version's `~/.cache/nim/rWalkr_r` to `~/.cache/nim/rWalkr_r0`):
diff -Nu rWalkr_r0/@mrWalkr.nim.c rWalkr_r/@mrWalkr.nim.c
--- rWalkr_r0/@mrWalkr.nim.c 2023-08-08 06:08:32.887747209 -0400
+++ rWalkr_r/@mrWalkr.nim.c 2023-08-08 06:08:50.344965932 -0400
@@ -122,9 +122,20 @@
#line 6 "/tmp/k/rWalkr.nim"
N_LIB_PRIVATE N_NIMCALL(void, main__r87alkr_u7)(void) {
#line 7
-NimStringV2 s;NIM_BOOL* nimErr_;{nimErr_ = nimErrorFlag(); s.len = 0; s.p = NIM_NIL;// let s = "Hello world"
+NimStringV2 s;
+#line 8
+NimStringV2 colontmpD_;NIM_BOOL* nimErr_;{nimErr_ = nimErrorFlag(); s.len = 0; s.p = NIM_NIL; colontmpD_.len = 0; colontmpD_.p = NIM_NIL;// let s = "Hello world"
+
+#line 7
// let s = "Hello world"
- s = TM__3QhDSXtO082ujkyrhqLedQ_3; {
+ s = TM__3QhDSXtO082ujkyrhqLedQ_3;// when defined(faster): discard reverse(s)
+
+#line 8
+// when defined(faster): discard reverse(s)
+// when defined(faster): discard reverse(s)
+// when defined(faster): discard reverse(s)
+ colontmpD_ = reverse__r87alkr_u1(s); if (NIM_UNLIKELY(*nimErr_)) goto BeforeRet_; (void)(colontmpD_);
+ {
#line 9
NI i;
#line 90 "/usr/lib/nim/lib/system/iterators_1.nim"
@@ -153,7 +164,8 @@
if (_.p && !(_.p->cap & NIM_STRLIT_FLAG)) {
dealloc(_.p);} } LA3: ;
}
}
- }BeforeRet_: ;
+// proc `=destroy`*(x: string) {.inline, magic: "Destroy".} =
+ if (colontmpD_.p && !(colontmpD_.p->cap & NIM_STRLIT_FLAG)) { dealloc(colontmpD_.p);} }BeforeRet_: ;
}
N_LIB_PRIVATE void PreMainInner(void) {
`nim c --mm:arc --expandArc:reverse` does not show that `=destroy`, with or
without `-d:faster` { not sure if it is supposed to show _every_ `=op` }.
So, there may be a real issue or two here...