Although I had experimented with this at great length while I was still running a multithreaded model, after switching to the multi process model, I did some minor experimentation but it was already extremely fast and it didn't show significant gains. Going forward I do intend to experiment with it further now that my service is live in production.
Thanks, Andrew -----Original Message----- From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Vikas Aditya Sent: Wednesday, June 7, 2017 10:56 AM To: SQLite mailing list <sqlite-users@mailinglists.sqlite.org> Subject: Re: [sqlite] Massively multithreaded SQLite queries Hi Andrew, This is a great use case for SQLite. Did you have to use any special flags or settings for SQLite to achieve this e.g. cache size, page size Read only etc? Thanks, Vikas > On Jun 7, 2017, at 10:07 AM, Andrew Brown <andrew.br...@economicmodeling.com> > wrote: > > Wanted to just let you all know that I really appreciated the help you > offered when I asked this a few months ago. > > Since then, I've jettisoned the idea of using multiple threads and moved to > using multiple processes instead. I have built out an API that accepts http > requests for queries, chunks them into deterministic jobs (which improves the > probability of a cache hit), which are combined with the operation that needs > to be run on the job and hashed, sent via 0mq to the many dbnode processes > that are running, where the dbnode checks the cache for the hash in question, > and if it does not exist, then queries the Sqlite databases for the result. > From there the subqueries are sent back to the coordinator in the API node > and combined into the result set for the client. > > It's really fast. It wouldn't be great if I had to write data, but as a data > slicer against fixed data, it's pretty slick. > > Just want to thank those who responded and also those who have created and > contributed to SQLite as it's a pretty fantastic way to filter through a few > terabytes of information - many billions of rows, in mere milliseconds - even > more so when parallelized like this. > > > > > > From: Andrew Brown > Sent: Friday, March 3, 2017 12:14 PM > To: 'sqlite-users@mailinglists.sqlite.org' > <sqlite-users@mailinglists.sqlite.org> > Subject: Massively multithreaded SQLite queries > > Hello, > > Based on my reading of the documentation it seems like SQLite is a great > candidate for cases where you have a large number of threads that only need > to read a database simultaneously, in our case for fast data lookup, > aggregation, etc. I've been able to generate SQL queries that do this, but > once we start running them on a large server with a lot of multithreading > going on, I find that we spend a lot of time in __raw_spin_lock - perhaps > 75%+ of the time (In one case, 87% of 350 seconds x 72 cores was spent in > __raw_spin_lock). This is being run on 64 and 72 core machines, and the more > cores I run it on, the slower it ends up going. > > To give a bit more detail, I'm working with dotnet core, have written a > custom sqlite wrapper (since the dotnet core one lacks the ability to set > connection flags beyond readonly, and doesn't have sqlite_prepare_v2() > implemented), and I'm running on linux against a bunch of SQLite files in the > 2gb-400gb size range. Individual queries are wicked fast, but once I start > spreading the load over all the cores by running simultaneous queries I lose > the performance advantage and it actually becomes significantly slower. > > Have tried setting SQLITE_OPEN_NOMUTEX, SQLITE_OPEN_FULLMUTEX. Tried shared > cache, read uncommitted. Tried without shared cache, read uncommitted. Tried > WAL. If I write a less efficient query, I spend less time in __raw_spin_lock, > but of course then it takes longer for the queries themselves to return. > > Any tips to handle massively multithreaded side by side chunked queries on > the same database? > > Thank you in advance, > Andrew Brown > _______________________________________________ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmailinglists.sqlite.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fsqlite-users&data=01%7C01%7Candrew.brown%40economicmodeling.com%7C4ca732c6b31d453093b408d4adce8ae4%7C7cc1677566a34e8b80fd5b1f1db15061%7C0&sdata=M6UfvrnKuMTk3CiL8%2BBfRwgXj9J1aPK1%2By0H9Om8Fmw%3D&reserved=0 _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmailinglists.sqlite.org%2Fcgi-bin%2Fmailman%2Flistinfo%2Fsqlite-users&data=01%7C01%7Candrew.brown%40economicmodeling.com%7C4ca732c6b31d453093b408d4adce8ae4%7C7cc1677566a34e8b80fd5b1f1db15061%7C0&sdata=M6UfvrnKuMTk3CiL8%2BBfRwgXj9J1aPK1%2By0H9Om8Fmw%3D&reserved=0 _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users