@ingo \- this is on the same machine with SQLite:
import db_sqlite, times # inMem.nim
const n = 1_500_000
let t0 = epochTime()
let db = open(":memory:", "", "", "") # In RAM!!
db.exec sql"DROP TABLE IF EXISTS bars"
db.exec sql"""CREATE TABLE bars (t INTEGER PRIMARY KEY,
o DECIMAL(15), h DECIMAL(15), l DECIMAL(15),
c DECIMAL(15), v DECIMAL(15))"""
let q="INSERT INTO bars(t,o,h,l,c,v) VALUES (?,?,?,?,?,?)"
db.exec sql"BEGIN"
for i in 1..n: db.exec q.sql, i, 0.0,0.0,0.0,0.0, i.float
db.exec sql"COMMIT" # 4.31 s (/0.0398 =~ 108X slower)
let t1 = epochTime()
for x in db.fastRows(sql"SELECT SUM(v) FROM bars"): echo x
echo t1 - t0, " seconds to create; "
echo epochTime() - t1, " seconds to read"
db.close # 52 ms (/5.2 =~ 10X worse)
Run
10..100x slower than my (as noted) not optimized to within 3x `memfiles`
version. If you are satisfied with 30x slower than what is possible, well,
that's ok. Maybe you don't have big data.
@Scotpip \- I never use Windows and have no experience performance tuning it.
Others may be able to help you. I will say what I can. I would start with a RAM
filesystem if there is any available to factor out device IO. On Linux I
usually get mmap IO only about 1.2x to 1.5x better (for reasons too in the
weeds to explain here). Typically because of how DLLs/.so's work it is well
attended by OS writers.
As a same OS experiment, I can suggest that you try `readBuffer` on a `File`
(with `setFilePos` aka `fseek` if you need random access - i.e. if separating
input files into 2019, 2020, etc. is not enough time window definition
granularity). You will not have as much "built in" syntax support like `[]`,
but Nim lets you add that back in easily. You could even add an optional
parameter to `bopen` saying what style of IO you would like. For this simple
experiment, you would just declare a `var bar: Bar` and do a while loop
`myfile.readBuffer(bar.addr, bar.sizeof)`.
One thing of note is that on Linux there is an `fread_unlocked` in the C
library which reverses the MT-safe default and can (sometimes) be much faster
when you are close to hardware limits. (The default allows multiple threads to
`fread` coherently from the same `File` which has a buffer they would share.)
I have no idea if this is still true, but 1990s era advice was that the raw
system call reads on Windows were faster than the libc buffered IO. To use that
you might have to learn how to use a special Nim `importc` declaration.
Another experiment is to time two passes for the totalling - a first which will
demand page pages into your address space and a second that should be faster.
(for cache reasons, too, but your cache is likely << 72MB).