> I have a fairly large DB that I need to only read (not write) as fast
> as possible. I open DB with flags
> SQLITE_OPEN_READONLY|SQLITE_OPEN_PRIVATECACHE and then run select
> queries in 8 threads.
>
> When each thread opens its own connection, DB is read in 8 wallclock
> seconds using 23 user seconds.
>
> When DB connection is shared by threads, though, the process reads the
> same data in 17 wallclock seconds using 20 user seconds, much slower
> overall, killing the parallelism benefit.
> Reusing RO connection for some reason makes threads wait for each
> other too much. What makes the single connection slow the process
> down?

Of course it is slower, since only ONE THREAD at a time can be performing a database operation. This makes your code inherently single-threaded: only application processing (outside of the database engine) can be overlapped with database operations. Entries into the database engine are serialized.

If your application does only trivial processing between attempts to enter the database engine code (i.e., you have a very low multiprogramming ratio), then passing through a serialized gateway will of course be slower than not doing so. In other words, you need sufficient compute usage between each entry into the database engine to take advantage of the multiprocessing capabilities you have engaged. In fact, if you have 8 threads, then each thread has to do sufficient compute between database engine accesses to permit the remaining 7 threads to proceed through the serialization gate and return before the original thread wants to go through the gate again.

This is a basic principle of multiprogramming derived from Amdahl's Law:

https://en.wikipedia.org/wiki/Amdahl%27s_law

You can look up information on multiprogramming ratios and what they mean yourself. Basically, any single-threaded process will run at its maximum speed; that is, it will consume 100% of some resource it requires, after which no speed increase for that process is possible. Running two processes sequentially will also run at the maximum speed that each can achieve, constrained by whatever resource hits 100% utilization.
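To put a number on the Amdahl's Law point above, here is a toy calculation (the serial fractions chosen are illustrative, not measured from your workload): even a modest serialized portion caps what 8 threads can buy you.

```python
def amdahl_speedup(serial_fraction: float, n_workers: int) -> float:
    """Maximum speedup for a workload where `serial_fraction` of the
    work must pass through a serialized gate (Amdahl's Law)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_workers)

# With 8 threads, the ceiling falls off quickly as the serialized
# fraction grows:
for s in (0.05, 0.25, 0.50):
    print(f"serial={s:.0%}  max speedup with 8 threads: "
          f"{amdahl_speedup(s, 8):.2f}x")
```

At 50% serialized work the 8 threads can never even double the throughput, which is roughly the shape of the slowdown you observed with the shared connection.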
If both processes are multiprogrammed (whether as separate threads in the same process or separate processes on the same machine -- side-by-each, in newfy-speak), then the total time taken will be less than the time taken to run the two processes sequentially (one after each, in newfy-speak), but more than the time taken to run the single process which takes the longest to complete. This ratio is the multiprogramming ratio, since now both of the side-by-each processes are constrained by the single resource, which they both require, which is 100% utilized. In your case this appears to be serialized access to the database engine.

Therefore, in order to obtain (and utilize) the maximum capabilities of a computer system, the "multiprogramming mix" must be carefully designed to achieve as close to 100% usage of all resources simultaneously. Any resource that is not 100% consumed represents a waste of resources (and therefore a waste of the money spent to purchase that resource).

That is why mainframe computers in the dinosaur pen are so fast. It is not because on a single-process basis they are fast (they are not, in comparison to modern bitty-boxen) but because they are designed to achieve extremely high multiprogramming rates by offloading processing from the main processor to dedicated processors (mostly I/O processors) wherever possible. This permits many thousands of simultaneous processes to run at a perceived speed that is only slightly slower than a single process running on the machine by itself. This is something that bitty-boxes have been unable to achieve because they have not been designed to do so. So instead of a single refrigerator-sized dinosaur you now require a datacenter with a thousand bitty-boxes -- the cost in $ being about the same -- to achieve the same result.
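The bound stated above can be made concrete with a tiny arithmetic sketch (the runtimes here are hypothetical, picked to echo the 8-second figure in the original question, not measured):

```python
# Two jobs run side-by-each finish somewhere between max(t1, t2) and
# t1 + t2 wallclock seconds, depending on how heavily they contend for
# the same 100%-utilized resource.
t1, t2 = 8.0, 8.0      # hypothetical runtimes of the two processes
lower = max(t1, t2)    # perfect overlap: no shared bottleneck
upper = t1 + t2        # full serialization on the shared resource
print(f"multiprogrammed wallclock time is in [{lower}, {upper}] seconds")
```

Where the actual time lands inside that interval is exactly what the multiprogramming ratio describes.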
> In an attempt to speed it up as much as possible, I was trying to
> first copy into :memory: db, so that threads would read only from
> memory, but this requires the shared connection and it is slower.

Yes, because your limitation is access to the database engine by multiple threads. Maximum MPR (multiprogramming ratio) will be achieved by using one connection per thread and setting the "MULTITHREADED" threading mode in the database engine, so that the serialization calls (which are no longer required) are not made (and thus do not consume the small amount of unnecessary processor time they would otherwise consume).

You will also have to make sure that your connection cache sizes are set correctly, such that you get a decent cache hit ratio on each thread, and ensure that you do not simply move the bottleneck somewhere else (like the I/O channel or memory). Done properly, your bottleneck will likely become CPU architecture (or the OS implementation of that architecture): too small a Level 1/Level 2 cache, too much cache flushing at the processor level, or attempting to dispatch "full compute" threads to a "fake" processor -- as when the CPU is hyperthreaded and there are insufficient execution resources on the CPU to execute the hyperthreads simultaneously, such as in most Intel processors that are not specifically designated as SMT cores -- all of which will appear as a CPU usage constraint. You may (though not likely) hit I/O constraints.

If you are using Windows you may also hit OS constraints. (Windows is the only operating system I know of where the rule that the limiting factor in progress is whatever resource is 100% consumed does not apply -- Windows manages to sit around doing nothing and running tasks that make no progress although no resource is 100% consumed; in fact, it can sit and play with itself for quite a while doing nothing at all.)
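A minimal sketch of the one-connection-per-thread approach, here in Python (whose sqlite3 module wraps the same C engine). The database file, table name, and row counts are made up for illustration; the point is only that each worker opens its OWN read-only connection (mode=ro in the URI is the equivalent of SQLITE_OPEN_READONLY), so threads never serialize on a shared handle:

```python
import os
import sqlite3
import tempfile
import threading

def make_demo_db(path):
    """Create a small throwaway database so the sketch is runnable."""
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
    con.executemany("INSERT INTO t (v) VALUES (?)",
                    [(f"row-{i}",) for i in range(1000)])
    con.commit()
    con.close()

def worker(db_path, lo, hi, out, idx):
    # Each thread gets its own private read-only connection.
    con = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        (n,) = con.execute(
            "SELECT COUNT(*) FROM t WHERE id BETWEEN ? AND ?",
            (lo, hi)).fetchone()
        out[idx] = n
    finally:
        con.close()

path = os.path.join(tempfile.mkdtemp(), "data.db")
make_demo_db(path)

# 8 workers, each counting its own 125-row slice of the 1000 rows.
results = [0] * 8
threads = [
    threading.Thread(target=worker,
                     args=(path, i * 125 + 1, (i + 1) * 125, results, i))
    for i in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sum(results))
```

In C you would get the same effect by passing SQLITE_OPEN_READONLY | SQLITE_OPEN_NOMUTEX (plus your SQLITE_OPEN_PRIVATECACHE) to sqlite3_open_v2() for each thread's connection; the NOMUTEX flag is what selects the multithreaded mode per connection.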
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users