I think your assumption about the file system is correct - It is hard for the code to produce widely differing times under different
systems as the basic algorithms do not change between systems, only dependencies on file-system or VFS specific api's etc.
The NTFS file system in Windows (hoping this is what you are on) is actually impressively fast in most modern implementations
(though not as fast as some more process-specific linux flavours), but the point being it should not be taking more than 10 to 15%
over the same query run for the same SQLite release in any other system, unless something is wrong. 1 hour+ would set off red lights
in your case, nvm 16hrs+.
Best guess is some other system is trying to also look into that file, making the Windows file manager stutter quite possibly the
Win7 Preview pane, a 3rd party file indexer service, an anti-virus system or some other real-time PC protection/enhancing kind of
software or even memory juggling at low-diskspace situations. Usually any of the above you should have installed explicitly so you
should know which of them are on your system.
If none of the above options are found to be the culprit, I would suggest
looking at some testing of the core system.
a simple test operation would be:
create file #1 for input/output
create file #2 for output
L = 20 * 1024 * 1024 * 1024 // 20 Gig chosen arbitrarily
wInterval = L mod 4; / 4 additional read-write tests along the way
for n = 0 to L-1 do -
write in file#1 byte(randomvalue)
if ((n mod winterval)=0) read new data From #1 into #2
end for;
close #1 & #2
delete
The total time for the above on any moderately modern Windows PC using internal drives should bench under 10 minutes, certainly not
more than 30 mins.
Take care to not use memory buffers for any of this, use file api calls directly and only - and ensure you have 40 GB available on
your HDD.
You could follow the File-System buffering efforts by looking at the
Task-manager Performance graphs while it runs.
if it takes too long, break it and remove the data transfer statement - if it doesn't help much (<40% improvement), you have a 3rd
party problem. If it improves performance by around ~50%, it works normally, if it improves performance to over 80%, you have a file
system or kernel which gets bogged down between read/writes and needs checking. (this last situation usually spells impending disk
failure or such if the FS is healthy, time to backup).
If you don't feel adventurous and coding the above - You could also just download and use such a utility I've made for windows
already from:
www.rifin.co.za/software/utils/FBench.exe
(Disclaimer - just one of my personal non-professional tools - Accuracy < Utility, but still pretty close on higher filesizes, use
at own risk, etc. etc.).
On 2013/06/28 14:54, Christopher W. Steenwyk wrote:
Hi,
I have been struggling with a problem and was hoping I could get some
insight.
I have a rather large database (11 GB) that has two tables (one with
approximately 500,000 rows and another with approximately 50,000,000 rows).
In this database I am performing a query that joins these two tables to
produce approximately 4.4 billion rows and then uses a GROUP BY and COUNT
to return some values to me.
I struggled with the slowness of this query for a while in linux and
through covering indexes, ANALYZE, and a SSD I was able to get the query to
run in about 15 minutes which I am OK with. Unfortunately the same query in
Windows has been running for 16 hours and still hasn't returned.
In my investigation I found that the pre-compiled exe was faster than the
C++ wrapper I was using, so my tool actually spawns a process using the
pre-compiled binary. In linux I am using v 3.7.13 and in Windows I am using
v 3.7.17.
Does anybody have a suggestion on how to either improve my overall
performance or increase my performance on Windows? My general thought is
that this is caused by the file manager layer in Windows, but I don't know.
Any help would be appreciated!
Thanks,
Chris
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users