Re: [sqlite] Massively multithreaded SQLite queries

2017-06-07 Thread Andrew Brown
Although I had experimented with this at great length while I was still running 
a multithreaded model, after switching to the multi process model, I did some 
minor experimentation but it was already extremely fast and it didn't show 
significant gains. Going forward I do intend to experiment with it further now 
that my service is live in production.

Thanks,
Andrew

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Vikas Aditya
Sent: Wednesday, June 7, 2017 10:56 AM
To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Massively multithreaded SQLite queries

Hi Andrew,

This is a great use case for SQLite. Did you have to use any special flags or 
settings for SQLite to achieve this e.g. cache size, page size Read only etc?

Thanks,
Vikas


> On Jun 7, 2017, at 10:07 AM, Andrew Brown <andrew.br...@economicmodeling.com> 
> wrote:
> 
> Wanted to just let you all know that I really appreciated the help you 
> offered when I asked this a few months ago.
> 
> Since then, I've jettisoned the idea of using multiple threads and moved to 
> using multiple processes instead. I have built out an API that accepts http 
> requests for queries, chunks them into deterministic jobs (which improves the 
> probability of a cache hit), which are combined with the operation that needs 
> to be run on the job and hashed, sent via 0mq to the many dbnode processes 
> that are running, where the dbnode checks the cache for the hash in question, 
> and if it does not exist, then queries the Sqlite databases for the result. 
> From there the subqueries are sent back to the coordinator in the API node 
> and combined into the result set for the client.
> 
> It's really fast. It wouldn't be great if I had to write data, but as a data 
> slicer against fixed data, it's pretty slick.
> 
> Just want to thank those who responded and also those who have created and 
> contributed to SQLite as it's a pretty fantastic way to filter through a few 
> terabytes of information - many billions of rows, in mere milliseconds - even 
> more so when parallelized like this.
> 
> 
> 
> 
> 
> From: Andrew Brown
> Sent: Friday, March 3, 2017 12:14 PM
> To: 'sqlite-users@mailinglists.sqlite.org' 
> <sqlite-users@mailinglists.sqlite.org>
> Subject: Massively multithreaded SQLite queries
> 
> Hello,
> 
> Based on my reading of the documentation it seems like SQLite is a great 
> candidate for cases where you have a large number of threads that only need 
> to read a database simultaneously, in our case for fast data lookup, 
> aggregation, etc. I've been able to generate SQL queries that do this, but 
> once we start running them on a large server with a lot of multithreading 
> going on, I find that we spend a lot of time in __raw_spin_lock - perhaps 
> 75%+ of the time (In one case, 87% of 350 seconds x 72 cores was spent in 
> __raw_spin_lock). This is being run on 64 and 72 core machines, and the more 
> cores I run it on, the slower it ends up going.
> 
> To give a bit more detail, I'm working with dotnet core, have written a 
> custom sqlite wrapper (since the dotnet core one lacks the ability to set 
> connection flags beyond readonly, and doesn't have sqlite_prepare_v2() 
> implemented), and I'm running on linux against a bunch of SQLite files in the 
> 2gb-400gb size range. Individual queries are wicked fast, but once I start 
> spreading the load over all the cores by running simultaneous queries I lose 
> the performance advantage and it actually becomes significantly slower.
> 
> Have tried setting SQLITE_OPEN_NOMUTEX, SQLITE_OPEN_FULLMUTEX. Tried shared 
> cache, read uncommitted. Tried without shared cache, read uncommitted. Tried 
> WAL. If I write a less efficient query, I spend less time in __raw_spin_lock, 
> but of course then it takes longer for the queries themselves to return.
> 
> Any tips to handle massively multithreaded side by side chunked queries on 
> the same database?
> 
> Thank you in advance,
> Andrew Brown
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmailinglists.sqlite.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fsqlite-users=01%7C01%7Candrew.brown%40economicmodeling.com%7C4ca732c6b31d453093b408d4adce8ae4%7C7cc1677566a34e8b80fd5b1f1db15061%7C0=M6UfvrnKuMTk3CiL8%2BBfRwgXj9J1aPK1%2By0H9Om8Fmw%3D=0

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmailinglists.sqlite.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fsqlite-users=01%7C01%7Candrew.brown%40economicmode

Re: [sqlite] Massively multithreaded SQLite queries

2017-06-07 Thread Vikas Aditya
Hi Andrew,

This is a great use case for SQLite. Did you have to use any special flags or 
settings for SQLite to achieve this e.g. cache size, page size Read only etc?

Thanks,
Vikas


> On Jun 7, 2017, at 10:07 AM, Andrew Brown  
> wrote:
> 
> Wanted to just let you all know that I really appreciated the help you 
> offered when I asked this a few months ago.
> 
> Since then, I've jettisoned the idea of using multiple threads and moved to 
> using multiple processes instead. I have built out an API that accepts http 
> requests for queries, chunks them into deterministic jobs (which improves the 
> probability of a cache hit), which are combined with the operation that needs 
> to be run on the job and hashed, sent via 0mq to the many dbnode processes 
> that are running, where the dbnode checks the cache for the hash in question, 
> and if it does not exist, then queries the Sqlite databases for the result. 
> From there the subqueries are sent back to the coordinator in the API node 
> and combined into the result set for the client.
> 
> It's really fast. It wouldn't be great if I had to write data, but as a data 
> slicer against fixed data, it's pretty slick.
> 
> Just want to thank those who responded and also those who have created and 
> contributed to SQLite as it's a pretty fantastic way to filter through a few 
> terabytes of information - many billions of rows, in mere milliseconds - even 
> more so when parallelized like this.
> 
> 
> 
> 
> 
> From: Andrew Brown
> Sent: Friday, March 3, 2017 12:14 PM
> To: 'sqlite-users@mailinglists.sqlite.org' 
> 
> Subject: Massively multithreaded SQLite queries
> 
> Hello,
> 
> Based on my reading of the documentation it seems like SQLite is a great 
> candidate for cases where you have a large number of threads that only need 
> to read a database simultaneously, in our case for fast data lookup, 
> aggregation, etc. I've been able to generate SQL queries that do this, but 
> once we start running them on a large server with a lot of multithreading 
> going on, I find that we spend a lot of time in __raw_spin_lock - perhaps 
> 75%+ of the time (In one case, 87% of 350 seconds x 72 cores was spent in 
> __raw_spin_lock). This is being run on 64 and 72 core machines, and the more 
> cores I run it on, the slower it ends up going.
> 
> To give a bit more detail, I'm working with dotnet core, have written a 
> custom sqlite wrapper (since the dotnet core one lacks the ability to set 
> connection flags beyond readonly, and doesn't have sqlite_prepare_v2() 
> implemented), and I'm running on linux against a bunch of SQLite files in the 
> 2gb-400gb size range. Individual queries are wicked fast, but once I start 
> spreading the load over all the cores by running simultaneous queries I lose 
> the performance advantage and it actually becomes significantly slower.
> 
> Have tried setting SQLITE_OPEN_NOMUTEX, SQLITE_OPEN_FULLMUTEX. Tried shared 
> cache, read uncommitted. Tried without shared cache, read uncommitted. Tried 
> WAL. If I write a less efficient query, I spend less time in __raw_spin_lock, 
> but of course then it takes longer for the queries themselves to return.
> 
> Any tips to handle massively multithreaded side by side chunked queries on 
> the same database?
> 
> Thank you in advance,
> Andrew Brown
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Massively multithreaded SQLite queries

2017-06-07 Thread Andrew Brown
Wanted to just let you all know that I really appreciated the help you offered 
when I asked this a few months ago.

Since then, I've jettisoned the idea of using multiple threads and moved to 
using multiple processes instead. I have built out an API that accepts http 
requests for queries, chunks them into deterministic jobs (which improves the 
probability of a cache hit), which are combined with the operation that needs 
to be run on the job and hashed, sent via 0mq to the many dbnode processes that 
are running, where the dbnode checks the cache for the hash in question, and if 
it does not exist, then queries the Sqlite databases for the result. From there 
the subqueries are sent back to the coordinator in the API node and combined 
into the result set for the client.

It's really fast. It wouldn't be great if I had to write data, but as a data 
slicer against fixed data, it's pretty slick.

Just want to thank those who responded and also those who have created and 
contributed to SQLite as it's a pretty fantastic way to filter through a few 
terabytes of information - many billions of rows, in mere milliseconds - even 
more so when parallelized like this.





From: Andrew Brown
Sent: Friday, March 3, 2017 12:14 PM
To: 'sqlite-users@mailinglists.sqlite.org' 

Subject: Massively multithreaded SQLite queries

Hello,

Based on my reading of the documentation it seems like SQLite is a great 
candidate for cases where you have a large number of threads that only need to 
read a database simultaneously, in our case for fast data lookup, aggregation, 
etc. I've been able to generate SQL queries that do this, but once we start 
running them on a large server with a lot of multithreading going on, I find 
that we spend a lot of time in __raw_spin_lock - perhaps 75%+ of the time (In 
one case, 87% of 350 seconds x 72 cores was spent in __raw_spin_lock). This is 
being run on 64 and 72 core machines, and the more cores I run it on, the 
slower it ends up going.

To give a bit more detail, I'm working with dotnet core, have written a custom 
sqlite wrapper (since the dotnet core one lacks the ability to set connection 
flags beyond readonly, and doesn't have sqlite_prepare_v2() implemented), and 
I'm running on linux against a bunch of SQLite files in the 2gb-400gb size 
range. Individual queries are wicked fast, but once I start spreading the load 
over all the cores by running simultaneous queries I lose the performance 
advantage and it actually becomes significantly slower.

Have tried setting SQLITE_OPEN_NOMUTEX, SQLITE_OPEN_FULLMUTEX. Tried shared 
cache, read uncommitted. Tried without shared cache, read uncommitted. Tried 
WAL. If I write a less efficient query, I spend less time in __raw_spin_lock, 
but of course then it takes longer for the queries themselves to return.

Any tips to handle massively multithreaded side by side chunked queries on the 
same database?

Thank you in advance,
Andrew Brown
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Massively multithreaded SQLite queries

2017-03-04 Thread J Decker
in my curiosity since any spin should be scheduled until something wake it;
I wondered what __raw_spin_lock was and if sqlite could use it
differently...
it appears to be a linux kernel thing, wrapped around read/write accesses
to devices probably the bottleneck is actually IO rather than
contention.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Massively multithreaded SQLite queries

2017-03-04 Thread skywind mailing lists
Hi,

just my few cents: if you are using the RTree module I made very bad 
experiences running SQLite in parallel, even when running only two or three 
threads in parallel. In this case I use a single thread-safe queue that is 
handling all SQLite access.

Regards,
Hartwig

> Am 2017-03-04 um 16:54 schrieb James K. Lowden :
> 
> On Fri, 3 Mar 2017 20:13:52 +
> Andrew Brown  wrote:
> 
>> once we start running them on a large server with a lot of
>> multithreading going on, I find that we spend a lot of time in
>> __raw_spin_lock - perhaps 75%+ of the time (In one case, 87% of 350
>> seconds x 72 cores was spent in __raw_spin_lock). This is being run
>> on 64 and 72 core machines, and the more cores I run it on, the
>> slower it ends up going.
> 
> My initial reaction is that your 72 cores are apt to block on I/O
> contention.  Your description is consistent with that hypothesis.  
> 
> Whatever else you do, I recommend SQLightning.  It's a version of
> SQLite that uses a memory-mapped key-value store, LMDB.  It's designed
> for problems like yours, and demonstrated to sometimes be 10-100x
> faster.  
> 
> You don't mention RAM size.  Even if your data fit in memory, though,
> you still have to pass through the kernel to retrieve each row.
> Memory-mapped files bring the data into userspace.  
> 
> https://github.com/LMDB/sqlightning
> 
> --jkl
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> 

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Massively multithreaded SQLite queries

2017-03-04 Thread James K. Lowden
On Fri, 3 Mar 2017 20:13:52 +
Andrew Brown  wrote:

> once we start running them on a large server with a lot of
> multithreading going on, I find that we spend a lot of time in
> __raw_spin_lock - perhaps 75%+ of the time (In one case, 87% of 350
> seconds x 72 cores was spent in __raw_spin_lock). This is being run
> on 64 and 72 core machines, and the more cores I run it on, the
> slower it ends up going.

My initial reaction is that your 72 cores are apt to block on I/O
contention.  Your description is consistent with that hypothesis.  

Whatever else you do, I recommend SQLightning.  It's a version of
SQLite that uses a memory-mapped key-value store, LMDB.  It's designed
for problems like yours, and demonstrated to sometimes be 10-100x
faster.  

You don't mention RAM size.  Even if your data fit in memory, though,
you still have to pass through the kernel to retrieve each row.
Memory-mapped files bring the data into userspace.  

https://github.com/LMDB/sqlightning

--jkl
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Massively multithreaded SQLite queries

2017-03-04 Thread Bob Friesenhahn

On Fri, 3 Mar 2017, Andrew Brown wrote:


Have tried setting SQLITE_OPEN_NOMUTEX, SQLITE_OPEN_FULLMUTEX. Tried 
shared cache, read uncommitted. Tried without shared cache, read 
uncommitted. Tried WAL. If I write a less efficient query, I spend 
less time in __raw_spin_lock, but of course then it takes longer for 
the queries themselves to return.


Any tips to handle massively multithreaded side by side chunked 
queries on the same database?


I have not seen mention of trying MMAP mode.  Assuming that your many 
cores comes with a lot of RAM, then using a recent sqlite which 
supports memory-mapping the database may speed access by decreasing 
programmed I/O for queries which only do reads.  This assumes that 
there is some commonality among the data used by the queries so there 
is a high probability that often accessed data is already in RAM.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Massively multithreaded SQLite queries

2017-03-04 Thread Clemens Ladisch
Andrew Brown wrote:
> I find that we spend a lot of time in __raw_spin_lock

Please try to find out (stack trace, etc.) which particular resource is
protected by that lock.

If, for example, this would happen to be file lock operations, then you
would need to use fewer transactions (by wrapping multiple queries into
a single transaction).


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Massively multithreaded SQLite queries

2017-03-03 Thread Keith Medcalf

You must never have used a Microsoft Operating system ...

> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of Deon Brewis
> Sent: Saturday, 4 March, 2017 00:33
> To: SQLite mailing list
> Subject: Re: [sqlite] Massively multithreaded SQLite queries
> 
> Can you give an example of an Intel Processor SKU with half-assed cores?
> 
> There's HyperThreading of course, but I don't think anybody has ever
> considered HyperThreading to be separate cores.
> 
> - Deon
> 
> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of Keith Medcalf
> Sent: Friday, March 3, 2017 4:52 PM
> To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
> Subject: Re: [sqlite] Massively multithreaded SQLite queries
> 
> 
> No, the good rule of thumb is to allocate one thread per CPU.  Depending
> on the particular multi-core CPU you "may" be able to use all the cores
> simultaneously but in many cases diminishing returns will set in long
> before you can execute one thread per core.  If this is an Intel processor
> that claims it has more than one thread per core be *extremely* careful as
> that will give you one thread and one half-assed thread per core.
> Sometimes, half-assed cores are presented as real cores when they lack a
> separate execution unit.  Be vary wary.
> 
> I presume you are forcing separate threads to separate cores by setting
> processor affinity on the threads and not just blindly hoping that the OS
> scheduler does "the right thing"?
> 
> > -Original Message-
> > From: sqlite-users
> > [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> > On Behalf Of Jens Alfke
> > Sent: Friday, 3 March, 2017 17:19
> > To: SQLite mailing list
> > Subject: Re: [sqlite] Massively multithreaded SQLite queries
> >
> >
> > > On Mar 3, 2017, at 3:51 PM, Simon Slavin <slav...@bigfraud.org> wrote:
> > >
> > > Then mess with the '5' until you find a good value.
> >
> > A common rule of thumb with thread pools is to allocate one thread per
> > CPU core.
> >
> > —Jens
> > ___
> > sqlite-users mailing list
> > sqlite-users@mailinglists.sqlite.org
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> 
> 
> 
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Massively multithreaded SQLite queries

2017-03-03 Thread Deon Brewis
Can you give an example of an Intel Processor SKU with half-assed cores?

There's HyperThreading of course, but I don't think anybody has ever considered 
HyperThreading to be separate cores.

- Deon

-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Keith Medcalf
Sent: Friday, March 3, 2017 4:52 PM
To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Massively multithreaded SQLite queries


No, the good rule of thumb is to allocate one thread per CPU.  Depending on the 
particular multi-core CPU you "may" be able to use all the cores simultaneously 
but in many cases diminishing returns will set in long before you can execute 
one thread per core.  If this is an Intel processor that claims it has more 
than one thread per core be *extremely* careful as that will give you one 
thread and one half-assed thread per core.  Sometimes, half-assed cores are 
presented as real cores when they lack a separate execution unit.  Be vary wary.

I presume you are forcing separate threads to separate cores by setting 
processor affinity on the threads and not just blindly hoping that the OS 
scheduler does "the right thing"?

> -Original Message-
> From: sqlite-users 
> [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of Jens Alfke
> Sent: Friday, 3 March, 2017 17:19
> To: SQLite mailing list
> Subject: Re: [sqlite] Massively multithreaded SQLite queries
> 
> 
> > On Mar 3, 2017, at 3:51 PM, Simon Slavin <slav...@bigfraud.org> wrote:
> >
> > Then mess with the '5' until you find a good value.
> 
> A common rule of thumb with thread pools is to allocate one thread per 
> CPU core.
> 
> —Jens
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Massively multithreaded SQLite queries

2017-03-03 Thread Keith Medcalf

As an aside, you may likely find that the OS will schedule multiple processes 
far more efficiently than it schedules multiple threads (especially from the 
perspective of scheduling the cores independently).  You may find that it is 
far more efficient to perhaps schedule a dozen processes with six to a dozen 
worker threads each (or whatever number seems to be optimal) and use some kind 
of IPC to submit workload to the individual processes.  

Also, make sure the database is in WAL journal mode -- this will help even if 
you are only running read transactions + if you have enough memory make sure 
the page cache and temp space is big enough on each connection and force temp 
work into memory - but do not overcommit memory.

> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of Andrew Brown
> Sent: Friday, 3 March, 2017 18:24
> To: SQLite mailing list
> Subject: Re: [sqlite] Massively multithreaded SQLite queries
> 
> Now this is a interesting question. As it happens I /am/ blindly assuming
> the os would schedule it properly. I will look at my options this evening
> and see about addressing that.
> 
> On Mar 3, 2017 4:52 PM, Keith Medcalf <kmedc...@dessus.com> wrote:
> 
> No, the good rule of thumb is to allocate one thread per CPU.  Depending
> on the particular multi-core CPU you "may" be able to use all the cores
> simultaneously but in many cases diminishing returns will set in long
> before you can execute one thread per core.  If this is an Intel processor
> that claims it has more than one thread per core be *extremely* careful as
> that will give you one thread and one half-assed thread per core.
> Sometimes, half-assed cores are presented as real cores when they lack a
> separate execution unit.  Be vary wary.
> 
> I presume you are forcing separate threads to separate cores by setting
> processor affinity on the threads and not just blindly hoping that the OS
> scheduler does "the right thing"?
> 
> > -Original Message-
> > From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> > On Behalf Of Jens Alfke
> > Sent: Friday, 3 March, 2017 17:19
> > To: SQLite mailing list
> > Subject: Re: [sqlite] Massively multithreaded SQLite queries
> >
> >
> > > On Mar 3, 2017, at 3:51 PM, Simon Slavin <slav...@bigfraud.org> wrote:
> > >
> > > Then mess with the '5' until you find a good value.
> >
> > A common rule of thumb with thread pools is to allocate one thread per
> CPU
> > core.
> >
> > —Jens
> > ___
> > sqlite-users mailing list
> > sqlite-users@mailinglists.sqlite.org
> >
> https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmailinglis
> ts.sqlite.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fsqlite-
> users=01%7C01%7Candrew.brown%40economicmodeling.com%7Ca4729ccfb1754b9
> a534008d46298b2d9%7C7cc1677566a34e8b80fd5b1f1db15061%7C0=C6yNkbM7p0u
> oib5OgXXYS7%2BEy8cJR1I6N8JjaQ4iMCU%3D=0
> 
> 
> 
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmailinglis
> ts.sqlite.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fsqlite-
> users=01%7C01%7Candrew.brown%40economicmodeling.com%7Ca4729ccfb1754b9
> a534008d46298b2d9%7C7cc1677566a34e8b80fd5b1f1db15061%7C0=C6yNkbM7p0u
> oib5OgXXYS7%2BEy8cJR1I6N8JjaQ4iMCU%3D=0
> 
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Massively multithreaded SQLite queries

2017-03-03 Thread Scott Hess
You might want to try enabling mmap mode:
  pragma mmap_size = 4294967296;
or something like that.  Try to make it larger than your databases.  I'd
expect that if you're running with that many cores, you're _probably_
running in a 64-bit address space, so it'll probably work.

-scott


On Fri, Mar 3, 2017 at 5:22 PM, Andrew Brown <
andrew.br...@economicmodeling.com> wrote:

> Yes, each thread has its own connection.
>
> On Mar 3, 2017 4:45 PM, Keith Medcalf  wrote:
>
> Does each thread have its own connection?
>
> > -Original Message-
> > From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> > On Behalf Of Andrew Brown
> > Sent: Friday, 3 March, 2017 13:14
> > To: sqlite-users@mailinglists.sqlite.org
> > Subject: [sqlite] Massively multithreaded SQLite queries
> >
> > Hello,
> >
> > Based on my reading of the documentation it seems like SQLite is a great
> > candidate for cases where you have a large number of threads that only
> > need to read a database simultaneously, in our case for fast data lookup,
> > aggregation, etc. I've been able to generate SQL queries that do this,
> but
> > once we start running them on a large server with a lot of multithreading
> > going on, I find that we spend a lot of time in __raw_spin_lock - perhaps
> > 75%+ of the time (In one case, 87% of 350 seconds x 72 cores was spent in
> > __raw_spin_lock). This is being run on 64 and 72 core machines, and the
> > more cores I run it on, the slower it ends up going.
> >
> > To give a bit more detail, I'm working with dotnet core, have written a
> > custom sqlite wrapper (since the dotnet core one lacks the ability to set
> > connection flags beyond readonly, and doesn't have sqlite_prepare_v2()
> > implemented), and I'm running on linux against a bunch of SQLite files in
> > the 2gb-400gb size range. Individual queries are wicked fast, but once I
> > start spreading the load over all the cores by running simultaneous
> > queries I lose the performance advantage and it actually becomes
> > significantly slower.
> >
> > Have tried setting SQLITE_OPEN_NOMUTEX, SQLITE_OPEN_FULLMUTEX. Tried
> > shared cache, read uncommitted. Tried without shared cache, read
> > uncommitted. Tried WAL. If I write a less efficient query, I spend less
> > time in __raw_spin_lock, but of course then it takes longer for the
> > queries themselves to return.
> >
> > Any tips to handle massively multithreaded side by side chunked queries
> on
> > the same database?
> >
> > Thank you in advance,
> > Andrew Brown
> > ___
> > sqlite-users mailing list
> > sqlite-users@mailinglists.sqlite.org
> > https://na01.safelinks.protection.outlook.com/?url=
> http%3A%2F%2Fmailinglists.sqlite.org%2Fcgi-bin%
> 2Fmailman%2Flistinfo%2Fsqlite-users=01%7C01%7Candrew.
> brown%40economicmodeling.com%7Cc0018bf7b8cd49bb588608d46297c70b%
> 7C7cc1677566a34e8b80fd5b1f1db15061%7C0=
> gad9Y8BC67c4ZAC05GckCghNHgKmm8i5piwgjM2MXgw%3D=0
>
>
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> https://na01.safelinks.protection.outlook.com/?url=
> http%3A%2F%2Fmailinglists.sqlite.org%2Fcgi-bin%
> 2Fmailman%2Flistinfo%2Fsqlite-users=01%7C01%7Candrew.
> brown%40economicmodeling.com%7Cc0018bf7b8cd49bb588608d46297c70b%
> 7C7cc1677566a34e8b80fd5b1f1db15061%7C0=
> gad9Y8BC67c4ZAC05GckCghNHgKmm8i5piwgjM2MXgw%3D=0
>
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Massively multithreaded SQLite queries

2017-03-03 Thread Warren Young
On Mar 3, 2017, at 6:24 PM, Andrew Brown  
wrote:
> 
> I /am/ blindly assuming the os would schedule it properly.

Why would it?  Windows won’t do that for you, either.

Now, if you’re using .NET’s thread pool mechanisms, it *should* be making sure 
it doesn’t oversubscribe the cores too much, but then we’re assuming Microsoft 
is writing good Linux code here.

The .NET Core code is open, so you can see what it does on Linux.

(I could see, too, but I can’t say I care enough to bother. :) )
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Massively multithreaded SQLite queries

2017-03-03 Thread Andrew Brown
Now this is a interesting question. As it happens I /am/ blindly assuming the 
os would schedule it properly. I will look at my options this evening and see 
about addressing that.

On Mar 3, 2017 4:52 PM, Keith Medcalf <kmedc...@dessus.com> wrote:

No, the good rule of thumb is to allocate one thread per CPU.  Depending on the 
particular multi-core CPU you "may" be able to use all the cores simultaneously 
but in many cases diminishing returns will set in long before you can execute 
one thread per core.  If this is an Intel processor that claims it has more 
than one thread per core be *extremely* careful as that will give you one 
thread and one half-assed thread per core.  Sometimes, half-assed cores are 
presented as real cores when they lack a separate execution unit.  Be vary wary.

I presume you are forcing separate threads to separate cores by setting 
processor affinity on the threads and not just blindly hoping that the OS 
scheduler does "the right thing"?

> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of Jens Alfke
> Sent: Friday, 3 March, 2017 17:19
> To: SQLite mailing list
> Subject: Re: [sqlite] Massively multithreaded SQLite queries
>
>
> > On Mar 3, 2017, at 3:51 PM, Simon Slavin <slav...@bigfraud.org> wrote:
> >
> > Then mess with the '5' until you find a good value.
>
> A common rule of thumb with thread pools is to allocate one thread per CPU
> core.
>
> —Jens
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmailinglists.sqlite.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fsqlite-users=01%7C01%7Candrew.brown%40economicmodeling.com%7Ca4729ccfb1754b9a534008d46298b2d9%7C7cc1677566a34e8b80fd5b1f1db15061%7C0=C6yNkbM7p0uoib5OgXXYS7%2BEy8cJR1I6N8JjaQ4iMCU%3D=0



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmailinglists.sqlite.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fsqlite-users=01%7C01%7Candrew.brown%40economicmodeling.com%7Ca4729ccfb1754b9a534008d46298b2d9%7C7cc1677566a34e8b80fd5b1f1db15061%7C0=C6yNkbM7p0uoib5OgXXYS7%2BEy8cJR1I6N8JjaQ4iMCU%3D=0

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Massively multithreaded SQLite queries

2017-03-03 Thread Andrew Brown
Yes, each thread has its own connection.

On Mar 3, 2017 4:45 PM, Keith Medcalf  wrote:

Does each thread have its own connection?

> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of Andrew Brown
> Sent: Friday, 3 March, 2017 13:14
> To: sqlite-users@mailinglists.sqlite.org
> Subject: [sqlite] Massively multithreaded SQLite queries
>
> Hello,
>
> Based on my reading of the documentation it seems like SQLite is a great
> candidate for cases where you have a large number of threads that only
> need to read a database simultaneously, in our case for fast data lookup,
> aggregation, etc. I've been able to generate SQL queries that do this, but
> once we start running them on a large server with a lot of multithreading
> going on, I find that we spend a lot of time in __raw_spin_lock - perhaps
> 75%+ of the time (In one case, 87% of 350 seconds x 72 cores was spent in
> __raw_spin_lock). This is being run on 64 and 72 core machines, and the
> more cores I run it on, the slower it ends up going.
>
> To give a bit more detail, I'm working with dotnet core, have written a
> custom sqlite wrapper (since the dotnet core one lacks the ability to set
> connection flags beyond readonly, and doesn't have sqlite_prepare_v2()
> implemented), and I'm running on linux against a bunch of SQLite files in
> the 2gb-400gb size range. Individual queries are wicked fast, but once I
> start spreading the load over all the cores by running simultaneous
> queries I lose the performance advantage and it actually becomes
> significantly slower.
>
> Have tried setting SQLITE_OPEN_NOMUTEX, SQLITE_OPEN_FULLMUTEX. Tried
> shared cache, read uncommitted. Tried without shared cache, read
> uncommitted. Tried WAL. If I write a less efficient query, I spend less
> time in __raw_spin_lock, but of course then it takes longer for the
> queries themselves to return.
>
> Any tips to handle massively multithreaded side by side chunked queries on
> the same database?
>
> Thank you in advance,
> Andrew Brown
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmailinglists.sqlite.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fsqlite-users=01%7C01%7Candrew.brown%40economicmodeling.com%7Cc0018bf7b8cd49bb588608d46297c70b%7C7cc1677566a34e8b80fd5b1f1db15061%7C0=gad9Y8BC67c4ZAC05GckCghNHgKmm8i5piwgjM2MXgw%3D=0



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmailinglists.sqlite.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fsqlite-users=01%7C01%7Candrew.brown%40economicmodeling.com%7Cc0018bf7b8cd49bb588608d46297c70b%7C7cc1677566a34e8b80fd5b1f1db15061%7C0=gad9Y8BC67c4ZAC05GckCghNHgKmm8i5piwgjM2MXgw%3D=0

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Massively multithreaded SQLite queries

2017-03-03 Thread Scott Hess
Yes, if they are lock bound, then they need to have the number of cores
which reduces the locking overhead to the point where it's not degrading
performance too much.  Though I guess the OP really didn't say that (more
CPUs may spend more time in spinlocks and still spend less wallclock time).

Another thing to look at it whether any queries can be more effectively
scheduled.  Having hundreds of completely-unrelated queries seems unlikely
to me.  More likely is that you have a smaller number of queries which are
targeting various different bind parameters.  Preparing a particular query
once, then looping and running each set of bind parameters on one thread is
probably going to be _much_ more efficient.

-scott


On Fri, Mar 3, 2017 at 5:03 PM, Warren Young  wrote:

> On Mar 3, 2017, at 5:51 PM, Keith Medcalf  wrote:
> >
> > No, the good rule of thumb is to allocate one thread per CPU.
>
> It depends on the workload.  Parallel make (e.g. “make -jN” in GNU make)
> typically improves in speed past N=core count to about 1.5x the core count.
>
> SQLite seems like a similar kind of workload: lots of CPU *and* disk I/O,
> so that you need a bit of oversubscription to keep all the cores busy,
> because some threads/processes will be stalled on I/O.
>
> Not that any of this is relevant at the current point, since the OP is
> currently neither I/O bound nor CPU-bound, but lock-bound.
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Massively multithreaded SQLite queries

2017-03-03 Thread Warren Young
On Mar 3, 2017, at 5:51 PM, Keith Medcalf  wrote:
> 
> No, the good rule of thumb is to allocate one thread per CPU.

It depends on the workload.  Parallel make (e.g. “make -jN” in GNU make) 
typically improves in speed past N=core count to about 1.5x the core count.

SQLite seems like a similar kind of workload: lots of CPU *and* disk I/O, so 
that you need a bit of oversubscription to keep all the cores busy, because 
some threads/processes will be stalled on I/O.

Not that any of this is relevant at the current point, since the OP is 
currently neither I/O bound nor CPU-bound, but lock-bound.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Massively multithreaded SQLite queries

2017-03-03 Thread Keith Medcalf

No, the good rule of thumb is to allocate one thread per CPU.  Depending on the 
particular multi-core CPU you "may" be able to use all the cores simultaneously 
but in many cases diminishing returns will set in long before you can execute 
one thread per core.  If this is an Intel processor that claims it has more 
than one thread per core be *extremely* careful as that will give you one 
thread and one half-assed thread per core.  Sometimes, half-assed cores are 
presented as real cores when they lack a separate execution unit.  Be vary wary.

I presume you are forcing separate threads to separate cores by setting 
processor affinity on the threads and not just blindly hoping that the OS 
scheduler does "the right thing"?

> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of Jens Alfke
> Sent: Friday, 3 March, 2017 17:19
> To: SQLite mailing list
> Subject: Re: [sqlite] Massively multithreaded SQLite queries
> 
> 
> > On Mar 3, 2017, at 3:51 PM, Simon Slavin <slav...@bigfraud.org> wrote:
> >
> > Then mess with the '5' until you find a good value.
> 
> A common rule of thumb with thread pools is to allocate one thread per CPU
> core.
> 
> —Jens
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Massively multithreaded SQLite queries

2017-03-03 Thread Keith Medcalf

Does each thread have its own connection?

> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of Andrew Brown
> Sent: Friday, 3 March, 2017 13:14
> To: sqlite-users@mailinglists.sqlite.org
> Subject: [sqlite] Massively multithreaded SQLite queries
> 
> Hello,
> 
> Based on my reading of the documentation it seems like SQLite is a great
> candidate for cases where you have a large number of threads that only
> need to read a database simultaneously, in our case for fast data lookup,
> aggregation, etc. I've been able to generate SQL queries that do this, but
> once we start running them on a large server with a lot of multithreading
> going on, I find that we spend a lot of time in __raw_spin_lock - perhaps
> 75%+ of the time (In one case, 87% of 350 seconds x 72 cores was spent in
> __raw_spin_lock). This is being run on 64 and 72 core machines, and the
> more cores I run it on, the slower it ends up going.
> 
> To give a bit more detail, I'm working with dotnet core, have written a
> custom sqlite wrapper (since the dotnet core one lacks the ability to set
> connection flags beyond readonly, and doesn't have sqlite_prepare_v2()
> implemented), and I'm running on linux against a bunch of SQLite files in
> the 2gb-400gb size range. Individual queries are wicked fast, but once I
> start spreading the load over all the cores by running simultaneous
> queries I lose the performance advantage and it actually becomes
> significantly slower.
> 
> Have tried setting SQLITE_OPEN_NOMUTEX, SQLITE_OPEN_FULLMUTEX. Tried
> shared cache, read uncommitted. Tried without shared cache, read
> uncommitted. Tried WAL. If I write a less efficient query, I spend less
> time in __raw_spin_lock, but of course then it takes longer for the
> queries themselves to return.
> 
> Any tips to handle massively multithreaded side by side chunked queries on
> the same database?
> 
> Thank you in advance,
> Andrew Brown
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Massively multithreaded SQLite queries

2017-03-03 Thread Andrew Brown
So in this case I'm running on a 72 core machine.

Also the databases have one table in them each... The goal here is to bring all 
the cores to bear on the many queries - each grabbing the next query to be run 
and running it, until finally there are no more chunks to run.

Then within my own apis I aggregate the results and form my response.

I'm going to try preparing all the statements before I run any later tonight. 
I'm also toying with the idea of using a shared nothing architecture, in which 
I run 72 processes instead of threads, in the hope that there will be less 
contention that way. Thoughts on that idea?

I really appreciate everyone's responsiveness.

On Mar 3, 2017 4:19 PM, Jens Alfke  wrote:

> On Mar 3, 2017, at 3:51 PM, Simon Slavin  wrote:
>
> Then mess with the '5' until you find a good value.

A common rule of thumb with thread pools is to allocate one thread per CPU core.

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmailinglists.sqlite.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fsqlite-users=01%7C01%7Candrew.brown%40economicmodeling.com%7C1941086b72c84122e80e08d462942220%7C7cc1677566a34e8b80fd5b1f1db15061%7C0=ytN2iaMT9eiK%2BktzJva1shKgrBfYhxeUHyesJscJnB8%3D=0

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Massively multithreaded SQLite queries

2017-03-03 Thread Warren Young
On Mar 3, 2017, at 1:13 PM, Andrew Brown  
wrote:
> 
> This is being run on 64 and 72 core machines, and the more cores I run it on, 
> the slower it ends up going.

What happens if you rework the app to do only one thing, single-threaded, but 
run between 64 and 108 instances in parallel on the same system?  (108 == 72 * 
1.5, a common work factor value for parallel applications.)

You can use a tool like GNU parallel to manage the work queue:

https://www.gnu.org/software/parallel/

If nothing else, this would separate out the in-process locking from the file 
locking, thereby making more clear where the delays are coming from.

Your __raw_spin_lock() result isn’t terribly enlightening because that’s a very 
low level Linux kernel mechanism.  It doesn’t tell us whether the slowdown is 
due to file locking in SQLite, mutexes in SQLite, some non-optimal pthreads use 
in .NET Core, etc.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Massively multithreaded SQLite queries

2017-03-03 Thread Jens Alfke

> On Mar 3, 2017, at 3:51 PM, Simon Slavin  wrote:
> 
> Then mess with the '5' until you find a good value.

A common rule of thumb with thread pools is to allocate one thread per CPU 
core. 

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Massively multithreaded SQLite queries

2017-03-03 Thread Simon Slavin

On 3 Mar 2017, at 11:47pm, Scott Hess  wrote:

> I'd say you should consider switching to some sort of queue feeding a
> worker pool, then experimenting with pool sizes.

Agreed.  Set up 5 threads which have actual access to the database.  Have your 
1835 jobs queue up their jobs for servicing by those 5 threads.

Then mess with the '5' until you find a good value.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Massively multithreaded SQLite queries

2017-03-03 Thread Scott Hess
I'd say you should consider switching to some sort of queue feeding a
worker pool, then experimenting with pool sizes.  Often problems reward the
first few threads you add, but at some point additional threads become a
negative unless the system is specifically designed for high thread counts
(and such design can be annoying for low-thread-count users).  There also
may be caching interactions which improve with a smaller number of threads.

Something else to try is to have multiple databases which are not sharing
page caches (to reduce locking).  It is entirely possible that having 4
databases each with 8 threads could be faster than one database with 32
threads, because they each keep out of each other's way, more.

[None of the above is really SQLite specific.]

-scott


On Fri, Mar 3, 2017 at 3:37 PM, Andrew Brown <
andrew.br...@economicmodeling.com> wrote:

> Well, in the situation I'm working with, my API is constructing 1835 small
> SQL jobs to be run, and then passing them off to a structure in which 72
> threads are running, each with their own db connection (I assume that's
> what you mean by a database handle, a DB connection, but please, correct me
> if I'm wrong!). So in this case, 72 database handles on my bigger server.
>
> Unfortunately, I'm not running the same queries over and over (one example
> is a 400gb database with 3-5 dimension columns and a few data columns, and
> this is slicing on that data) so preparing them will have somewhat less
> benefit in that sense than in other cases. That said, I can still try
> preparing all the statements before I run any.
>
>
>
> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of Jens Alfke
> Sent: Friday, March 3, 2017 3:25 PM
> To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
> Subject: Re: [sqlite] Massively multithreaded SQLite queries
>
>
> > On Mar 3, 2017, at 2:52 PM, Josh Hunsaker <josh+sql...@nispio.net>
> wrote:
> >
> > Your problem might be mitigated if you could compile your queries in
> advance.
>
> Precompiled statements are a must if you want the best performance (and
> you’re running the same queries over and over.)
>
> Andrew, how many database handles are open?
>
> —Jens
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> https://na01.safelinks.protection.outlook.com/?url=
> http%3A%2F%2Fmailinglists.sqlite.org%2Fcgi-bin%
> 2Fmailman%2Flistinfo%2Fsqlite-users=01%7C01%7Candrew.
> brown%40economicmodeling.com%7C5fba3607e2164acbf45808d4628c85a6%
> 7C7cc1677566a34e8b80fd5b1f1db15061%7C0=
> 1xM5isGNHfqUu7yWfoohbYBryDxcgzed0Qlz37K0FDw%3D=0
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Massively multithreaded SQLite queries

2017-03-03 Thread Andrew Brown
Well, in the situation I'm working with, my API is constructing 1835 small SQL 
jobs to be run, and then passing them off to a structure in which 72 threads 
are running, each with their own db connection (I assume that's what you mean 
by a database handle, a DB connection, but please, correct me if I'm wrong!). 
So in this case, 72 database handles on my bigger server.

Unfortunately, I'm not running the same queries over and over (one example is a 
400gb database with 3-5 dimension columns and a few data columns, and this is 
slicing on that data) so preparing them will have somewhat less benefit in that 
sense than in other cases. That said, I can still try preparing all the 
statements before I run any.



-Original Message-
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Jens Alfke
Sent: Friday, March 3, 2017 3:25 PM
To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org>
Subject: Re: [sqlite] Massively multithreaded SQLite queries


> On Mar 3, 2017, at 2:52 PM, Josh Hunsaker <josh+sql...@nispio.net> wrote:
> 
> Your problem might be mitigated if you could compile your queries in advance.

Precompiled statements are a must if you want the best performance (and you’re 
running the same queries over and over.)

Andrew, how many database handles are open?

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmailinglists.sqlite.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fsqlite-users=01%7C01%7Candrew.brown%40economicmodeling.com%7C5fba3607e2164acbf45808d4628c85a6%7C7cc1677566a34e8b80fd5b1f1db15061%7C0=1xM5isGNHfqUu7yWfoohbYBryDxcgzed0Qlz37K0FDw%3D=0
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Massively multithreaded SQLite queries

2017-03-03 Thread Jens Alfke

> On Mar 3, 2017, at 2:52 PM, Josh Hunsaker  wrote:
> 
> Your problem might be mitigated if you could compile your queries in advance.

Precompiled statements are a must if you want the best performance (and you’re 
running the same queries over and over.)

Andrew, how many database handles are open?

—Jens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Massively multithreaded SQLite queries

2017-03-03 Thread Josh Hunsaker
On Fri, Mar 3, 2017 at 1:13 PM, Andrew Brown wrote:
>
> Any tips to handle massively multithreaded side by side chunked
> queries on the same database?

In my (limited) experience, it seems that multithreaded SQLite
acquires a lock on a shared, in-memory b-tree every time a
statement is prepared.  It might not be the database read itself,
but statement preparation that is causing threads to wait for
each other.  Your problem might be mitigated if you could compile
your queries in advance.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users