Well, actually, no.  

It was a single process that spins up 64 threads, each of which accesses its own 
per-thread in-memory database through a connection that is opened and used only 
in that thread.

Making some simple modifications (changing the number of threads to 6 and the 
insertions/thread to 1,000,000, so as to consume no more resources than are 
available; this is on a 4-core Xeon with 2 SMT threads per core, 32 GB RAM, 
Win10 1909) yields interesting results.  CPU usage was about 90% for all of 
them, though of course the linear runs consumed only 1 thread on one core, so 
about 12.5% CPU:

SERIALIZED   MEMSTATUS ON   Elapsed Time :  00:00:19.944  MPR=0.75
SERIALIZED   MEMSTATUS OFF  Elapsed Time :  00:00:04.408  MPR=3.97
MULTITHREAD  MEMSTATUS ON   Elapsed Time :  00:00:19.815  MPR=0.72
MULTITHREAD  MEMSTATUS OFF  Elapsed Time :  00:00:04.456  MPR=2.88

*MPR is the Multiprogramming Ratio:  (elapsed time running the workload 
linearly)/(elapsed time running the workload in parallel)

In this case the "linear" time is obtained by running the same test with the 
pthread_join immediately following the pthread_create so that only one thread 
of the workload runs at a time, one after another.
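
The two run modes and the ratio can be sketched as follows.  This is a minimal 
sketch and not the actual test program: workload is a hypothetical CPU-bound 
stand-in for the inserts, and run, mpr, now_seconds and NUM_THREADS are 
illustrative names.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <time.h>

#define NUM_THREADS 6   /* assumed for illustration */

/* Hypothetical stand-in for the per-thread insert workload. */
static void *workload(void *arg)
{
    volatile unsigned long acc = 0;   /* volatile keeps the loop from being folded */
    for (unsigned long i = 0; i < 5000000UL; i++)
        acc += i;
    *(unsigned long *)arg = acc;
    return NULL;
}

/* Wall-clock seconds from a monotonic clock. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* linear != 0: pthread_join immediately follows each pthread_create, so
   only one workload thread runs at a time.
   linear == 0: all threads are created first, then all are joined.
   Returns the elapsed wall-clock time in seconds. */
static double run(int linear)
{
    pthread_t tid[NUM_THREADS];
    int created[NUM_THREADS];
    unsigned long results[NUM_THREADS] = {0};
    double start = now_seconds();

    for (int i = 0; i < NUM_THREADS; i++) {
        created[i] = (pthread_create(&tid[i], NULL, workload, &results[i]) == 0);
        if (linear && created[i])
            pthread_join(tid[i], NULL);
    }
    if (!linear) {
        for (int i = 0; i < NUM_THREADS; i++)
            if (created[i])
                pthread_join(tid[i], NULL);
    }
    return now_seconds() - start;
}

/* Multiprogramming Ratio: linear elapsed time / parallel elapsed time. */
static double mpr(double linear_elapsed, double parallel_elapsed)
{
    return linear_elapsed / parallel_elapsed;
}
```

Calling mpr(run(1), run(0)) gives the ratio for the stand-in workload.  A value 
above 1 means the parallel run finished sooner than the linear one, and a value 
below 1 means running the threads concurrently was slower.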

So I conclude that SERIALIZED/MULTITHREAD makes very little difference and that 
MEMSTATUS ON/OFF makes a huge difference.  Since the difference between 
MULTITHREAD and SERIALIZED is a thread mutex on the connection, I can conclude 
that with zero contention for that mutex, whether it is being used or not makes 
little difference.  That mutex, when used, is reached via a double-indexed 
indirect call through a global data area.  If eliminating that call entirely 
has negligible effect, then replacing the double-indexed indirect call with an 
immediate call cannot possibly have any significant effect either.

From the MPR _for_this_workload_on_this_particular_CPU_ I can conclude that if 
one wishes to have MEMSTATUS enabled, then using multiple threads is 
detrimental to performance (MPR below 1) and that linear processing is 
significantly more efficient.  However, when MEMSTATUS is turned off, 
serialized mode yields a significantly greater multiprogramming benefit (MPR 
3.97, versus 2.88 for multithread mode).

GCC 8.1.0 (MinGW-x64)
gcc -m64 -mwin32 -mconsole -mthreads -O3 -s -pipe 
-Wl,-Bstatic,--nxcompat,--dynamicbase test.c -ID:\Source\bld 
-ID:\Source\bld\tsrc -Ld:\source\bld\gcc\64 -lsqlite3.dll -o test.exe 
-static-libgcc -lpthread

sqlite3.dll compiled using:
gcc -s -O3 -pipe -D_HAVE_SQLITE_CONFIG_H -DSQLITE_EXTRA_INIT=core_init 
-DSQLITE_HAVE_ZLIB -Itsrc -march=native -mtune=native -m64 -mdll -mthreads 
-Wl,-Bstatic,--nxcompat,--dynamicbase,--high-entropy-va,--image-base,0x180000000,--out-implib,gcc/64/libsqlite3.dll.a,--output-def,gcc/64/sqlite3.def
 sqlite3.c sqlite3.def -ladvapi32 -lrpcrt4 -lwinmm -lz -static-libgcc -o 
gcc/64/sqlite3.dll

-- 
The fact that there's a Highway to Hell but only a Stairway to Heaven says a 
lot about anticipated traffic volume.

>-----Original Message-----
>From: sqlite-users <sqlite-users-boun...@mailinglists.sqlite.org> On
>Behalf Of Doug
>Sent: Saturday, 4 January, 2020 10:42
>To: 'SQLite mailing list' <sqlite-users@mailinglists.sqlite.org>
>Subject: Re: [sqlite] FW: Questions about your "Performance Matters" talk
>re SQLite
>
>Thanks, Jens. I got it. The benchmark sounds like it isn't a real
>benchmark, but a made-up scenario to exercise the Coz code. I've let go
>now.
>Doug
>
>> -----Original Message-----
>> From: sqlite-users <sqlite-users-boun...@mailinglists.sqlite.org>
>> On Behalf Of Jens Alfke
>> Sent: Friday, January 03, 2020 10:58 PM
>> To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
>> Cc: em...@cs.umass.edu; curtsin...@grinnell.edu
>> Subject: Re: [sqlite] FW: Questions about your "Performance
>> Matters" talk re SQLite
>>
>>
>> > On Jan 2, 2020, at 11:54 AM, Doug <dougf....@comcast.net> wrote:
>> >
>> > I know there has been a lot of talk about what can and cannot be
>> done with the C calling interface because of compatibility issues
>> and the myriad set of wrappers on various forms. I’m having a hard
>> time letting go of a possible 25% performance improvement.
>>
>> This was a heavily multithreaded benchmark (64 threads accessing
>> the same connection) on a very hefty server-class CPU. From Dr
>> Hipp’s results, it sounds like the speed up may be only in similar
>> situations, not to more normal SQLite usage.
>>
>> —Jens
>> _______________________________________________
>> sqlite-users mailing list
>> sqlite-users@mailinglists.sqlite.org
>> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-
>> users


