First, always best to have reproducible test data! Great initiative, @Zoom!
@tcheran did not specify whether the grid was fixed over samples or varying. Either could make sense (e.g. with wandering sensors that use the GPS satellite network to self-locate), but very different perf numbers & optimization ideas arise in the two situations (a small hash table with long lists vs. Zoom's giant hash table with short lists). For example, this generator program is similar, but uses a fixed grid:

```nim
import std/[os, random, strutils, strformat], cligen/osUt

const NLines = 3000                     # grid points shared by every file
var rng = initRand(0xDEADBEEF'i64)

if paramCount() != 2: quit "Usage: start end", 1
let a = parseInt(paramStr(1))
let b = parseInt(paramStr(2))

var grid: seq[(uint64, uint64)]         # one fixed grid reused for all files
for _ in 1..NLines:
  let x = rng.next mod 1_000_000
  let y = rng.next mod 1_000_000
  grid.add (x, y)

for fNum in a..b:
  var f = open(&"{fNum:05}.txt", fmWrite)
  if not f.isNil:
    for (x, y) in grid:
      let c = rng.rand(40000) - 20000   # random value, roughly -200.00..200.00
      let d = c div 100
      let e = abs(c) mod 100
      f.urite x, '\t', y, "\tMX890M1E\t", d, '.', &"{e:02}\n"  # urite = cligen/osUt unlocked write
    f.close
```

I ran the above with "coarse grained parallelism" (usually fine), i.e.:

```
zoomDat 1 4500 &
zoomDat 4501 9000 &
zoomDat 9001 13500 &
zoomDat 13501 18000 &
```

My prior programs have 2 bugs. First, to match results, `emptySeq` should be declared simply as `emptySeq: seq[string]`. Second, there needs to be a `write "\n"` after the loop in the CSV output part. Oops. (A sketch of the shape of both fixes is at the end of this post.)

I haven't compared RAM disks on Windows (someone should post more details on that), but on Linux `/dev/shm` on a box with an i7-6700k at 4.8 GHz and 65 ns latency / 40 GB/s DIMMs, I get these runtimes (in seconds, big enough & well separated enough to not worry about measurement error..):

| Program | RanGrid | FixedGrid | TinyGrid |
|---|---|---|---|
| Orig | 48 | 40 | 27 |
| cb1 | 36 | 30 | 19 |
| cb2 | 25 | 20 | 8 |

That last TinyGrid column uses only 4 distinct grid points (by changing `x` & `y` in the output line to `a` & `b` - an early accidental bug). So, across columns we mostly see the effect of `seq` being faster than `Table`.

One can maybe get a decent speed-up by going parallel & merging preliminary gridtable.csv's - that depends on which grid-diversity mode obtains (see the merge sketch at the end of this post). Unless/until such parallel scale-up happens, this should not be an IO-bound problem on an SSD. Even a SATA SSD can probably do 750 MB/s and this problem is only 1365 MB, i.e. maybe 2 seconds of IO, but processing takes much more. With the above generated data, for example, `cat *.txt >/dev/null` takes only 0.22 seconds. So, at a minimum, one would need something like 20/0.22 ≈ 90 cores without contention for IO time to equal CPU time. @tcheran's "2nd run times" almost surely have the data cached in DIMMs anyway.
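For concreteness, here is a minimal sketch of the shape of those two fixes. The scaffolding around them (the `emitRow` proc and its tab-separated field loop) is invented for illustration since the prior programs aren't repeated here; only the `emptySeq` declaration and the post-loop `write "\n"` correspond to the actual fixes:

```nim
# Sketch only: emitRow & its loop are hypothetical scaffolding; the two
# marked lines illustrate the fixes named above.
var emptySeq: seq[string]          # fix 1: plain declaration, no initializer

proc emitRow(f: File; fields: seq[string]) =
  for i, field in fields:
    if i > 0: f.write '\t'
    f.write field
  f.write "\n"                     # fix 2: newline written after the field loop
```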
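And here is a hedged sketch of the parallel-merge idea: launch workers as separate processes (as with zoomDat above), each writing its own partial gridtable, then fold the partials into one final table. The file names and the per-cell `key<TAB>sum<TAB>count` layout are assumptions for illustration, not @tcheran's actual format:

```nim
import std/[tables, strutils]

# Merge step only; the per-slice partials would be produced beforehand by
# separate worker processes, one per slice of input files. Assumes each
# partial line is "key<TAB>sum<TAB>count".
proc mergeParts(parts: seq[string]; outPath: string) =
  var acc = initTable[string, (float, int)]()
  for path in parts:
    for line in lines(path):
      let cols = line.split('\t')
      let cur = acc.getOrDefault(cols[0], (0.0, 0))
      acc[cols[0]] = (cur[0] + parseFloat(cols[1]), cur[1] + parseInt(cols[2]))
  let f = open(outPath, fmWrite)
  for key, v in acc:
    f.write key, '\t', v[0] / v[1].float, '\n'   # emit per-cell mean
  f.close

when isMainModule:
  mergeParts(@["part1.csv", "part2.csv"], "gridtable.csv")
```

Whether this wins depends on that grid-diversity mode: with few distinct cells the partials are tiny and the merge is nearly free; with many distinct cells the merge itself starts to look like the original problem.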