@Stefan_Salewski, you can also just call the libc `memchr`, which is what the
current memfiles module does to delimit lines (as slices, with no string
conversion until you ask for one). A good `memchr` will use SSE/AVX
internally. For example, this program:
import memfiles, os, times
var f = memfiles.open(paramStr(1))
var cnt = 0
let t0 = epochTime()
for x in memSlices(f): inc(cnt) # The action here is just one line
let dt = (epochTime() - t0) * 1e9
echo "cnt: ", cnt, " ns: ", dt, " ns/cnt: ", dt / float(cnt),
     " B/ns: ", float(f.size) / dt
f.close
On my Linux machine this Nim program runs about as fast as the `memchrcount`
version in Daniel Lemire's approach (which is about 4x as fast as the
"naive"/"strawman" approach). While Dan does get another 3x speed-up over that
`memchr` approach with vector code, that also loses the ability to do anything
besides _count_ newlines. Since some kind of processing beyond mere counting is
usually desired, the `memchr` approach is probably what most people would want.
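
To make the "more than counting" point concrete, here is a minimal sketch that calls libc `memchr` directly and does a little per-line work (tracking the longest line) on top of the count. The `lineSpans` iterator and its `start`/`size` fields are names I made up for illustration, not part of memfiles, and it scans an in-memory string rather than a mapped file to keep the example self-contained:

```nim
# Bind libc memchr; a good libc implementation vectorizes this internally.
proc memchr(s: pointer, c: cint, n: csize_t): pointer
  {.importc, header: "<string.h>".}

# Hypothetical helper (not in memfiles): yields the offset and length of
# each newline-delimited line in `buf`, excluding the newline itself.
iterator lineSpans(buf: string): tuple[start, size: int] =
  var pos = 0
  while pos < buf.len:
    let base = unsafeAddr buf[pos]
    let hit = memchr(base, cint(ord('\n')), csize_t(buf.len - pos))
    let n =
      if hit.isNil: buf.len - pos              # last line, no trailing newline
      else: cast[int](hit) - cast[int](base)   # distance to the newline
    yield (pos, n)
    pos += n + 1                               # skip past the newline

when isMainModule:
  let text = "ab\ncdef\n\nxyz"
  var count, longest = 0
  for span in lineSpans(text):
    inc count                                  # the counting part
    longest = max(longest, span.size)          # some extra per-line work
  echo "lines: ", count, " longest: ", longest
```

The same loop shape should carry over to a memory-mapped file by passing the mapping's pointer and size to `memchr`, which is roughly what `memSlices` already packages up for you.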