Well, actually, no. It was a single process that spins up 64 threads, each of which accesses its own per-thread in-memory database using a connection that is opened and used only in that thread.
Making some simple modifications (changing the number of threads to 6 and the insertions/thread to 1,000,000, so as to consume no more resources than are available; this is on a 4-core Xeon (2 SMT threads per core) with 32 GB RAM running Win10 1909) yields interesting results. CPU usage was about 90% for all of them, though of course the linear runs consumed only 1 thread on one core, so about 12.5% CPU:

SERIALIZED  MEMSTATUS ON   Elapsed Time : 00:00:19.944  MPR=0.75
SERIALIZED  MEMSTATUS OFF  Elapsed Time : 00:00:04.408  MPR=3.97
MULTITHREAD MEMSTATUS ON   Elapsed Time : 00:00:19.815  MPR=0.72
MULTITHREAD MEMSTATUS OFF  Elapsed Time : 00:00:04.456  MPR=2.88

*MPR is the Multiprogramming Ratio:
 (elapsed time running the workload linearly)/(elapsed time running the workload in parallel)

In this case the "linear" time is obtained by running the same test with the pthread_join immediately following the pthread_create, so that only one thread of the workload runs at a time, one after another.

So I conclude that SERIALIZED/MULTITHREAD makes very little difference and that MEMSTATUS ON/OFF makes a huge difference.

The difference between MULTITHREAD and SERIALIZED is a thread mutex on the connection, so I can conclude that with zero contention for that mutex, whether it is being used or not makes little difference. That mutex, when used, is reached by a double-indexed indirect call through a global data area. If eliminating that call entirely has negligible effect, then replacing the double-indexed indirect call with an immediate call cannot possibly have any significant effect.

From the MPR _for_this_workload_on_this_particular_CPU_ I can conclude that if one wishes to have MEMSTATUS enabled, then using multiple threads is detrimental to performance (MPR below 1) and linear processing is significantly more efficient. However, when MEMSTATUS is turned off, multiple threads give a significant multiprogramming benefit, and serialized mode (MPR=3.97) benefits more than multithread mode (MPR=2.88).
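For anyone wanting to reproduce the linear-versus-parallel timing, the harness amounts to roughly the following. This is only a sketch under stated assumptions: the worker body is a hypothetical stand-in for the per-thread SQLite insertions (the real test opens its own in-memory database in each thread), and the names job_t, worker, run_workload, mpr and MAX_THREADS are made up for illustration, not taken from test.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <time.h>

#define MAX_THREADS 64   /* hypothetical upper bound for this sketch */

/* Per-thread state. Nothing here is shared between threads, mirroring
   the one-database-and-one-connection-per-thread setup of the test. */
typedef struct {
    long iterations;          /* units of work for this thread */
    unsigned long long sum;   /* private result written by the thread */
} job_t;

/* Stand-in workload (an assumption, NOT the SQLite inserts): sums
   0..iterations-1. volatile keeps the compiler from folding the loop. */
static void *worker(void *arg)
{
    job_t *job = arg;
    volatile unsigned long long acc = 0;
    for (long i = 0; i < job->iterations; i++)
        acc += (unsigned long long)i;
    job->sum = acc;
    return NULL;
}

static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Run nthreads workers and return elapsed wall-clock seconds, or -1.0
   on error. With linear != 0 each thread is joined immediately after it
   is created, so only one thread of the workload runs at a time. */
static double run_workload(int nthreads, long iterations, int linear,
                           job_t *jobs)
{
    pthread_t tid[MAX_THREADS];
    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1.0;

    double start = now_seconds();
    for (int i = 0; i < nthreads; i++) {
        jobs[i].iterations = iterations;
        jobs[i].sum = 0;
        if (pthread_create(&tid[i], NULL, worker, &jobs[i]) != 0) {
            if (!linear)   /* reap the threads already started */
                for (int j = 0; j < i; j++)
                    pthread_join(tid[j], NULL);
            return -1.0;
        }
        if (linear)
            pthread_join(tid[i], NULL);
    }
    if (!linear)
        for (int i = 0; i < nthreads; i++)
            pthread_join(tid[i], NULL);
    return now_seconds() - start;
}

/* Multiprogramming Ratio: linear elapsed time / parallel elapsed time. */
static double mpr(double linear_seconds, double parallel_seconds)
{
    return linear_seconds / parallel_seconds;
}
```

Timing run_workload(6, n, 1, jobs) and run_workload(6, n, 0, jobs) and passing the two results to mpr() gives the ratio as defined above; a value above 1 means the parallel run finished sooner than the linear one.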
GCC 8.1.0 (MinGW-x64)

gcc -m64 -mwin32 -mconsole -mthreads -O3 -s -pipe -Wl,-Bstatic,--nxcompat,--dynamicbase test.c -ID:\Source\bld -ID:\Source\bld\tsrc -Ld:\source\bld\gcc\64 -lsqlite3.dll -o test.exe -static-libgcc -lpthread

sqlite3.dll compiled using:

gcc -s -O3 -pipe -D_HAVE_SQLITE_CONFIG_H -DSQLITE_EXTRA_INIT=core_init -DSQLITE_HAVE_ZLIB -Itsrc -march=native -mtune=native -m64 -mdll -mthreads -Wl,-Bstatic,--nxcompat,--dynamicbase,--high-entropy-va,--image-base,0x180000000,--out-implib,gcc/64/libsqlite3.dll.a,--output-def,gcc/64/sqlite3.def sqlite3.c sqlite3.def -ladvapi32 -lrpcrt4 -lwinmm -lz -static-libgcc -o gcc/64/sqlite3.dll

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a lot about anticipated traffic volume.

>-----Original Message-----
>From: sqlite-users <sqlite-users-boun...@mailinglists.sqlite.org> On
>Behalf Of Doug
>Sent: Saturday, 4 January, 2020 10:42
>To: 'SQLite mailing list' <sqlite-users@mailinglists.sqlite.org>
>Subject: Re: [sqlite] FW: Questions about your "Performance Matters" talk
>re SQLite
>
>Thanks, Jens. I got it. The benchmark sounds like it isn't a real
>benchmark, but a made-up scenario to exercise the Coz code. I've let go
>now.
>Doug
>
>> -----Original Message-----
>> From: sqlite-users <sqlite-users-boun...@mailinglists.sqlite.org>
>> On Behalf Of Jens Alfke
>> Sent: Friday, January 03, 2020 10:58 PM
>> To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
>> Cc: em...@cs.umass.edu; curtsin...@grinnell.edu
>> Subject: Re: [sqlite] FW: Questions about your "Performance
>> Matters" talk re SQLite
>>
>>
>> > On Jan 2, 2020, at 11:54 AM, Doug <dougf....@comcast.net> wrote:
>> >
>> > I know there has been a lot of talk about what can and cannot be
>> > done with the C calling interface because of compatibility issues
>> > and the myriad set of wrappers on various forms. I’m having a hard
>> > time letting go of a possible 25% performance improvement.
>>
>> This was a heavily multithreaded benchmark (64 threads accessing
>> the same connection) on a very hefty server-class CPU. From Dr
>> Hipp’s results, it sounds like the speed up may be only in similar
>> situations, not to more normal SQLite usage.
>>
>> —Jens
>> _______________________________________________
>> sqlite-users mailing list
>> sqlite-users@mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users