Hi,
fefe (a German blogger, [http://blog.fefe.de/](http://blog.fefe.de/)) has
called for a little benchmark. The challenge is to count the words in a web server
log file (read from stdin), then output them sorted by count in descending order, like so:
150 foobaa
80 faa
12 www
Blog entries:
* [https://blog.fefe.de/?ts=a2689de5](https://blog.fefe.de/?ts=a2689de5)
* [https://blog.fefe.de/?ts=b7c295e7](https://blog.fefe.de/?ts=b7c295e7)
Benchmarks:
* [http://ptrace.fefe.de/wp/timings2019.txt](http://ptrace.fefe.de/wp/timings2019.txt)
* [https://ptrace.fefe.de/wp/timings.txt](https://ptrace.fefe.de/wp/timings.txt)
Competing scripts:
* [https://ptrace.fefe.de/wp/](https://ptrace.fefe.de/wp/)
My idiomatic version:
## compile with: nim c -d:release --opt:speed --passL:-s wp_idiomatic.nim
import tables, strutils

proc main() =
  var countTable = newCountTable[string]()
  var line: string = ""
  while true:
    if not readLine(stdin, line): break
    for word in line.split():
      countTable.inc(word)
  countTable.sort()
  for key, val in countTable.pairs():
    echo val, " ", key

main()
For my test file I concatenated a few nginx logs:
292M /tmp/bigger.log
Timings of the C version:
time cat /tmp/bigger.log | ./a.out
real 0m2,373s
user 0m2,276s
sys 0m0,276s
Timings of the idiomatic Nim version:
time cat /tmp/bigger.log | ./wp_idiomatic
real 0m2,869s
user 0m2,766s
sys 0m0,280s
I think that is already a good speed for this naive approach. So, can one
beat the C version? :D
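One direction that might be worth trying is to avoid creating a fresh string for every word. The sketch below is untested against the timings above and is not one of the competing scripts. It scans each line character by character into a single scratch buffer and only hands that buffer to the count table. The names `addWords` and `wp_buffer.nim` are made up for illustration. Whether this actually closes the gap to the C version would have to be measured.

```nim
## compile with: nim c -d:release wp_buffer.nim   (hypothetical file name)
import tables, strutils

proc addWords(counts: var CountTable[string], line: string, word: var string) =
  ## Adds every whitespace-separated word of `line` to `counts`.
  ## `word` is a scratch buffer owned by the caller and reused across calls.
  word.setLen(0)
  for c in line:
    if c in Whitespace:
      if word.len > 0:
        counts.inc(word)   # the table stores its own copy of a new key
        word.setLen(0)     # reuse the same buffer for the next word
    else:
      word.add(c)
  if word.len > 0:         # the last word has no trailing separator
    counts.inc(word)
    word.setLen(0)

proc main() =
  var counts = initCountTable[string]()
  var line = ""
  var word = newStringOfCap(64)
  while stdin.readLine(line):
    counts.addWords(line, word)
  counts.sort()            # highest count first
  for key, val in counts.pairs():
    echo val, " ", key

when isMainModule:
  main()
```

It is used the same way as the idiomatic version, e.g. `cat /tmp/bigger.log | ./wp_buffer`. One behavioural difference to be aware of: runs of consecutive whitespace are skipped here, whereas `line.split()` should, as far as I know, yield an empty string between two adjacent separators, so the counts can differ slightly on such input.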