Here's another optimized version. Benchmark performance is the same as the [official one](https://github.com/benhoyt/countwords/blob/master/optimized.nim) on my machine, but it uses an iterator for chunking and zero_functional for the main loop. Looks a lot tidier IMHO:

```nim
import zero_functional
from tables import newCountTable, inc, sort, pairs
from algorithm import SortOrder
from strutils import split, toLowerAscii

const
  minChunkSize = 64*1024
  whiteSpace = {' ', '\n'}

iterator chunks(buffer: var string, inFile: File): var string =
  while(not inFile.endOfFile):
    buffer.setLen minChunkSize
    buffer.setLen inFile.readChars(buffer, 0, minChunkSize)
    var rest: string
    if inFile.readLine(rest):
      buffer.add rest
    yield buffer

proc main() =
  var
    buf = newString(minChunkSize)
    table = newCountTable[string]()
  chunks(buf, stdin) --> map(it.toLowerAscii).foreach(
    strutils.split(it, whiteSpace) --> filter(it.len > 0).foreach(
      table.inc it
    )
  )
  table.sort SortOrder.Descending
  table.pairs --> foreach(
    echo(it[0], ' ', it[1])
  )

when isMainModule:
  main()
```
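In case the `-->` chains look like magic: zero_functional is a macro library that fuses `map`/`filter`/`foreach` chains into plain loops at compile time, so the pipeline above shouldn't allocate intermediate sequences. Conceptually, the inner pipeline boils down to something like this hand-written loop (a rough sketch for illustration, not the macro's actual expansion; `countChunk` is just a name I made up here):

```nim
from tables import CountTable, inc
from strutils import split, toLowerAscii

# Rough hand-written equivalent of the inner `-->` pipeline above:
# lower-case the chunk, split on the whitespace set, and count every
# non-empty word. One pass, no index fiddling.
proc countChunk(table: var CountTable[string], chunk: string) =
  for word in chunk.toLowerAscii.split({' ', '\n'}):
    if word.len > 0:
      table.inc word
```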
Some casual optimization rules for mere mortals:

* Try proven abstractions like `iterator`, `string`, and (in this case) zero_functional's `-->` first. They tend to be pretty efficient in Nim. Manually fiddling around with indices probably just means standing in the way of the optimizer.
* Tidy, efficient code has a chance to stay efficient in big, real-life code bases when change happens. A "hand-optimized" code desert doesn't: most people won't touch what they don't understand, for fear of introducing bugs.
* Don't overdo it. For the above code, I wrote an in-place version of `toLowerAscii` (roughly the sketch after this list) to be super smart about avoiding memory allocation overhead. It didn't help at all (the official benchmark script uses `--gc:arc`).
* Keep it simple. Over-complicated "optimal" code with a bug is much worse than correct, not-so-optimal code.
* The only benchmark that is not at least somewhat of a lie is production code running in a production environment.
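For the curious, the in-place lowering mentioned above was roughly along these lines (a hypothetical reconstruction, not the exact code I benchmarked; `toLowerAsciiInPlace` is just an illustrative name):

```nim
# Hypothetical sketch of an in-place ASCII lowering: mutates the buffer
# instead of allocating a lowered copy. With --gc:arc it made no
# measurable difference for this workload.
proc toLowerAsciiInPlace(s: var string) =
  for i in 0 ..< s.len:
    if s[i] in {'A'..'Z'}:
      s[i] = chr(ord(s[i]) + 32)
```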