Re: [sqlite] Massively multithreaded SQLite queries
You must never have used a Microsoft Operating system ...

> -----Original Message-----
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of Deon Brewis
> Sent: Saturday, 4 March, 2017 00:33
> To: SQLite mailing list
> Subject: Re: [sqlite] Massively multithreaded SQLite queries
>
> Can you give an example of an Intel Processor SKU with half-assed cores?
>
> There's HyperThreading of course, but I don't think anybody has ever
> considered HyperThreading to be separate cores.
>
> - Deon

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Massively multithreaded SQLite queries
Can you give an example of an Intel Processor SKU with half-assed cores?

There's HyperThreading of course, but I don't think anybody has ever considered HyperThreading to be separate cores.

- Deon

-----Original Message-----
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
On Behalf Of Keith Medcalf
Sent: Friday, March 3, 2017 4:52 PM
To: SQLite mailing list
Subject: Re: [sqlite] Massively multithreaded SQLite queries

No, the good rule of thumb is to allocate one thread per CPU. Depending on the particular multi-core CPU, you "may" be able to use all the cores simultaneously, but in many cases diminishing returns will set in long before you can execute one thread per core. If this is an Intel processor that claims it has more than one thread per core, be *extremely* careful, as that will give you one thread and one half-assed thread per core. Sometimes half-assed cores are presented as real cores when they lack a separate execution unit. Be very wary.

I presume you are forcing separate threads to separate cores by setting processor affinity on the threads and not just blindly hoping that the OS scheduler does "the right thing"?

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] [BUG REPORT] Open blob is invalidated by the update to unrelated fields in the same row
The write operation using the open sqlite3_blob object fails after some other field in the same row is updated. The testcase below illustrates the problem.

Yuri

---testcase---

#include <stdio.h>
#include <stdlib.h>
#include "sqlite3.h"

int main(int argc, char **argv)
{
  if(argc < 3) {
    fprintf(stderr, "Usage: %s <database> <file>\n", argv[0]);
    exit(1);
  }

  // file to insert
  FILE *f = fopen(argv[2], "rb");
  if(NULL == f) {
    fprintf(stderr, "Couldn't open file %s\n", argv[2]);
    exit(1);
  }

  // calculate size of file
  fseek(f, 0, SEEK_END);
  long filesize = ftell(f);
  fseek(f, 0, SEEK_SET);

  // table and column name we're going to use
  char tablename[] = "testblob";
  char columnname[] = "blobby";

  // actual database handle
  sqlite3 *db = NULL;

  // database commands
  char create_sql[1024];
  snprintf(create_sql, sizeof(create_sql),
           "CREATE TABLE IF NOT EXISTS %s ("
           "id INTEGER PRIMARY KEY, fld INTEGER, %s BLOB)",
           tablename, columnname);

  // going to insert a zeroblob of the size of the file
  char insert_sql[1024];
  snprintf(insert_sql, sizeof(insert_sql),
           "INSERT INTO %s (%s) VALUES (?)", tablename, columnname);

  // open the database
  int rc = sqlite3_open(argv[1], &db);
  if(SQLITE_OK != rc) {
    fprintf(stderr, "Can't open database %s (%i): %s\n",
            argv[1], rc, sqlite3_errmsg(db));
    exit(1);
  }

  char *exec_errmsg;
  rc = sqlite3_exec(db, create_sql, NULL, NULL, &exec_errmsg);
  if(SQLITE_OK != rc) {
    fprintf(stderr, "Can't create table (%i): %s\n", rc, sqlite3_errmsg(db));
    sqlite3_close(db);
    exit(1);
  }

  sqlite3_stmt *insert_stmt;
  rc = sqlite3_prepare_v2(db, insert_sql, -1, &insert_stmt, NULL);
  if(SQLITE_OK != rc) {
    fprintf(stderr, "Can't prepare insert statement %s (%i): %s\n",
            insert_sql, rc, sqlite3_errmsg(db));
    sqlite3_close(db);
    exit(1);
  }

  // bind a block of zeros the size of the file we're going to insert later
  sqlite3_bind_zeroblob(insert_stmt, 1, filesize);
  if(SQLITE_DONE != (rc = sqlite3_step(insert_stmt))) {
    fprintf(stderr, "Insert statement didn't work (%i): %s\n",
            rc, sqlite3_errmsg(db));
    exit(1);
  }

  sqlite3_int64 rowid = sqlite3_last_insert_rowid(db);
  printf("Created a row, id %i, with a blank blob size %i\n",
         (int)rowid, (int)filesize);

  sqlite3_blob *blob;
  rc = sqlite3_blob_open(db, "main", tablename, columnname, rowid, 1, &blob);
  if(SQLITE_OK != rc) {
    fprintf(stderr, "Couldn't get blob handle (%i): %s\n",
            rc, sqlite3_errmsg(db));
    exit(1);
  }

  const int BLOCKSIZE = 1024;
  int len;
  void *block = malloc(BLOCKSIZE);
  int offset = 0;
  while(0 < (len = fread(block, 1, BLOCKSIZE, f))) {
    if(SQLITE_OK != (rc = sqlite3_blob_write(blob, block, len, offset))) {
      fprintf(stderr, "Error writing to blob handle. Offset %i, len %i, rc=%d\n",
              offset, len, rc);
      exit(1);
    }
    offset += len;
    printf("... wrote block @off=%d\n", offset);
    // update that breaks the following sqlite3_blob_write
    if(SQLITE_OK != (rc = sqlite3_exec(db, "UPDATE testblob SET fld=1",
                                       NULL, NULL, &exec_errmsg))) {
      fprintf(stderr, "Error while updating the integer field, error=%d\n", rc);
      exit(1);
    }
  }

  sqlite3_blob_close(blob);
  printf("Successfully wrote to blob\n");
  free(block);
  fclose(f);
  sqlite3_finalize(insert_stmt);
  sqlite3_close(db);
  return 0;
}

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Massively multithreaded SQLite queries
As an aside, you may well find that the OS schedules multiple processes far more efficiently than it schedules multiple threads (especially from the perspective of scheduling the cores independently). It may be far more efficient to run, say, a dozen processes with six to a dozen worker threads each (or whatever number proves optimal) and use some kind of IPC to submit workload to the individual processes.

Also, make sure the database is in WAL journal mode -- this will help even if you are only running read transactions. And if you have enough memory, make sure the page cache and temp space are big enough on each connection and force temp work into memory -- but do not overcommit memory.

> -----Original Message-----
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of Andrew Brown
> Sent: Friday, 3 March, 2017 18:24
> To: SQLite mailing list
> Subject: Re: [sqlite] Massively multithreaded SQLite queries
>
> Now this is an interesting question. As it happens I /am/ blindly assuming
> the OS would schedule it properly. I will look at my options this evening
> and see about addressing that.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] UNION
Thanks, Don.

-----Original Message-----
From: don v nielsen
Sent: Friday, March 3, 2017 3:21 PM
To: SQLite mailing list
Subject: Re: [sqlite] UNION

Might I suggest: https://www.w3schools.com/sql/

dvn

On 03/01/2017 09:02 AM, jose isaias cabrera wrote:

Ryan, I just want to thank you for your kindness and display of goodwill to mankind. This is a great response. I even learned something from this post. Thanks so much for your responses. There are others like us in this group that love to learn, and your posts are always well received. Thanks. In Spanish we say, "muchas gracias."

josé

On 2017-03-01 09:51, R Smith wrote:

On 2017/03/01 3:40 AM, do...@mail.com wrote:

# SELECT * FROM processors UNION SELECT * FROM storage;
Error: SELECTs to the left and right do not have the same number of result columns.

All tables that I created in my database have differing column names, values, and numbers of columns, with the noted exception of the one column which is common (board). I've no idea what to do now. Why is this an error?

I think that perhaps you are new to SQL and other replies assumed you simply wanted what you wrote. I could be wrong, but just in case, here are my suggestions. Perhaps what you would rather like to do is JOIN these tables and not UNION them?

Do you wish to match /every/ processor with /every/ board? In this case, the statement should read (warning: this could produce excessively long listings):

SELECT * FROM processors, storage;

Do you wish to match only processors and storage that fit on the same boards? In this case the statement might read something like:

SELECT * FROM processors JOIN storage USING (board);

Do you wish to list /all/ processors and add the storage options for the same board /if/ there are any? In this case the statement might read something like:

SELECT * FROM processors LEFT JOIN storage ON storage.board = processors.board;

As you can see, lots of different things can be achieved.
A quick course in SQL via perhaps W3Schools will teach all these in a few hours.

Cheers,
Ryan

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Massively multithreaded SQLite queries
You might want to try enabling mmap mode:

  pragma mmap_size = 4294967296;

or something like that. Try to make it larger than your databases. I'd expect that if you're running with that many cores, you're _probably_ running in a 64-bit address space, so it'll probably work.

-scott

On Fri, Mar 3, 2017 at 5:22 PM, Andrew Brown wrote:

> Yes, each thread has its own connection.
>
> On Mar 3, 2017 4:45 PM, Keith Medcalf wrote:
>
> Does each thread have its own connection?

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Massively multithreaded SQLite queries
On Mar 3, 2017, at 6:24 PM, Andrew Brown wrote: > > I /am/ blindly assuming the os would schedule it properly. Why would it? Windows won’t do that for you, either. Now, if you’re using .NET’s thread pool mechanisms, it *should* be making sure it doesn’t oversubscribe the cores too much, but then we’re assuming Microsoft is writing good Linux code here. The .NET Core code is open, so you can see what it does on Linux. (I could see, too, but I can’t say I care enough to bother. :) ) ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Massively multithreaded SQLite queries
Now this is an interesting question. As it happens I /am/ blindly assuming the OS would schedule it properly. I will look at my options this evening and see about addressing that.

On Mar 3, 2017 4:52 PM, Keith Medcalf wrote:

No, the good rule of thumb is to allocate one thread per CPU. Depending on the particular multi-core CPU, you "may" be able to use all the cores simultaneously, but in many cases diminishing returns will set in long before you can execute one thread per core. If this is an Intel processor that claims it has more than one thread per core, be *extremely* careful, as that will give you one thread and one half-assed thread per core. Sometimes half-assed cores are presented as real cores when they lack a separate execution unit. Be very wary.

I presume you are forcing separate threads to separate cores by setting processor affinity on the threads and not just blindly hoping that the OS scheduler does "the right thing"?

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Massively multithreaded SQLite queries
Yes, each thread has its own connection.

On Mar 3, 2017 4:45 PM, Keith Medcalf wrote:

Does each thread have its own connection?

> -----Original Message-----
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of Andrew Brown
> Sent: Friday, 3 March, 2017 13:14
> To: sqlite-users@mailinglists.sqlite.org
> Subject: [sqlite] Massively multithreaded SQLite queries
>
> Based on my reading of the documentation it seems like SQLite is a great
> candidate for cases where you have a large number of threads that only
> need to read a database simultaneously [...]

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Massively multithreaded SQLite queries
Yes, if they are lock bound, then they need to use the number of threads which reduces the locking overhead to the point where it's not degrading performance too much. Though I guess the OP really didn't say that (more CPUs may spend more time in spinlocks and still spend less wallclock time).

Another thing to look at is whether any queries can be more effectively scheduled. Having hundreds of completely unrelated queries seems unlikely to me. More likely is that you have a smaller number of queries which are targeting various different bind parameters. Preparing a particular query once, then looping and running each set of bind parameters on one thread, is probably going to be _much_ more efficient.

-scott

On Fri, Mar 3, 2017 at 5:03 PM, Warren Young wrote:

> On Mar 3, 2017, at 5:51 PM, Keith Medcalf wrote:
> >
> > No, the good rule of thumb is to allocate one thread per CPU.
>
> It depends on the workload. Parallel make (e.g. "make -jN" in GNU make)
> typically improves in speed past N=core count to about 1.5x the core count.
>
> Not that any of this is relevant at the current point, since the OP is
> currently neither I/O bound nor CPU-bound, but lock-bound.

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Massively multithreaded SQLite queries
On Mar 3, 2017, at 5:51 PM, Keith Medcalf wrote: > > No, the good rule of thumb is to allocate one thread per CPU. It depends on the workload. Parallel make (e.g. “make -jN” in GNU make) typically improves in speed past N=core count to about 1.5x the core count. SQLite seems like a similar kind of workload: lots of CPU *and* disk I/O, so that you need a bit of oversubscription to keep all the cores busy, because some threads/processes will be stalled on I/O. Not that any of this is relevant at the current point, since the OP is currently neither I/O bound nor CPU-bound, but lock-bound. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Massively multithreaded SQLite queries
No, the good rule of thumb is to allocate one thread per CPU. Depending on the particular multi-core CPU, you "may" be able to use all the cores simultaneously, but in many cases diminishing returns will set in long before you can execute one thread per core. If this is an Intel processor that claims it has more than one thread per core, be *extremely* careful, as that will give you one thread and one half-assed thread per core. Sometimes half-assed cores are presented as real cores when they lack a separate execution unit. Be very wary.

I presume you are forcing separate threads to separate cores by setting processor affinity on the threads and not just blindly hoping that the OS scheduler does "the right thing"?

> -----Original Message-----
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of Jens Alfke
> Sent: Friday, 3 March, 2017 17:19
> To: SQLite mailing list
> Subject: Re: [sqlite] Massively multithreaded SQLite queries
>
> > On Mar 3, 2017, at 3:51 PM, Simon Slavin wrote:
> >
> > Then mess with the '5' until you find a good value.
>
> A common rule of thumb with thread pools is to allocate one thread per CPU
> core.
>
> —Jens

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
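On Linux, the processor-affinity pinning mentioned above is typically done with the glibc-specific pthread_setaffinity_np; a minimal sketch (requires _GNU_SOURCE; the helper name pin_to_core is illustrative):

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/* Pin the calling thread to a single CPU core so the scheduler cannot
 * migrate it. Returns 0 on success, or an errno value on failure. */
int pin_to_core(int core) {
  cpu_set_t set;
  CPU_ZERO(&set);
  CPU_SET(core, &set);
  return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}
```

Each worker thread would call pin_to_core(i) with its own index at startup; on Windows the analogous call is SetThreadAffinityMask.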
Re: [sqlite] Massively multithreaded SQLite queries
Does each thread have its own connection?

> -----Original Message-----
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of Andrew Brown
> Sent: Friday, 3 March, 2017 13:14
> To: sqlite-users@mailinglists.sqlite.org
> Subject: [sqlite] Massively multithreaded SQLite queries
>
> Hello,
>
> Based on my reading of the documentation it seems like SQLite is a great
> candidate for cases where you have a large number of threads that only
> need to read a database simultaneously, in our case for fast data lookup,
> aggregation, etc. I've been able to generate SQL queries that do this, but
> once we start running them on a large server with a lot of multithreading
> going on, I find that we spend a lot of time in __raw_spin_lock - perhaps
> 75%+ of the time (in one case, 87% of 350 seconds x 72 cores was spent in
> __raw_spin_lock). This is being run on 64 and 72 core machines, and the
> more cores I run it on, the slower it ends up going.
>
> To give a bit more detail, I'm working with dotnet core, have written a
> custom sqlite wrapper (since the dotnet core one lacks the ability to set
> connection flags beyond readonly, and doesn't have sqlite3_prepare_v2()
> implemented), and I'm running on linux against a bunch of SQLite files in
> the 2gb-400gb size range. Individual queries are wicked fast, but once I
> start spreading the load over all the cores by running simultaneous
> queries I lose the performance advantage and it actually becomes
> significantly slower.
>
> Have tried setting SQLITE_OPEN_NOMUTEX, SQLITE_OPEN_FULLMUTEX. Tried
> shared cache, read uncommitted. Tried without shared cache, read
> uncommitted. Tried WAL. If I write a less efficient query, I spend less
> time in __raw_spin_lock, but of course then it takes longer for the
> queries themselves to return.
>
> Any tips to handle massively multithreaded side by side chunked queries on
> the same database?
>
> Thank you in advance,
> Andrew Brown

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Massively multithreaded SQLite queries
So in this case I'm running on a 72 core machine. Also the databases have one table in them each... The goal here is to bring all the cores to bear on the many queries - each grabbing the next query to be run and running it, until finally there are no more chunks to run. Then within my own APIs I aggregate the results and form my response.

I'm going to try preparing all the statements before I run any later tonight. I'm also toying with the idea of using a shared-nothing architecture, in which I run 72 processes instead of threads, in the hope that there will be less contention that way. Thoughts on that idea?

I really appreciate everyone's responsiveness.

On Mar 3, 2017 4:19 PM, Jens Alfke wrote:

> On Mar 3, 2017, at 3:51 PM, Simon Slavin wrote:
>
> > Then mess with the '5' until you find a good value.
>
> A common rule of thumb with thread pools is to allocate one thread per CPU
> core.
>
> —Jens

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Massively multithreaded SQLite queries
On Mar 3, 2017, at 1:13 PM, Andrew Brown wrote: > > This is being run on 64 and 72 core machines, and the more cores I run it on, > the slower it ends up going. What happens if you rework the app to do only one thing, single-threaded, but run between 64 and 108 instances in parallel on the same system? (108 == 72 * 1.5, a common work factor value for parallel applications.) You can use a tool like GNU parallel to manage the work queue: https://www.gnu.org/software/parallel/ If nothing else, this would separate out the in-process locking from the file locking, thereby making more clear where the delays are coming from. Your __raw_spin_lock() result isn’t terribly enlightening because that’s a very low level Linux kernel mechanism. It doesn’t tell us whether the slowdown is due to file locking in SQLite, mutexes in SQLite, some non-optimal pthreads use in .NET Core, etc. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Massively multithreaded SQLite queries
> On Mar 3, 2017, at 3:51 PM, Simon Slavin wrote:
> > Then mess with the '5' until you find a good value.

A common rule of thumb with thread pools is to allocate one thread per CPU core.

—Jens
Re: [sqlite] Can sqlite3_blob_write be done through SQL?
On 03/01/2017 23:41, Clemens Ladisch wrote:
> It would certainly be possible to add your own user-defined SQL function to call sqlite3_blob_write().

I think this should be added to SQLite itself. Writing a portion of a blob is very much like updating a field, and there is currently no efficient way to do it in SQL. Such a command could then be chained with other SQL statements and would make the client code much simpler.

Yuri
Re: [sqlite] Massively multithreaded SQLite queries
On 3 Mar 2017, at 11:47pm, Scott Hess wrote:
> I'd say you should consider switching to some sort of queue feeding a worker pool, then experimenting with pool sizes.

Agreed. Set up 5 threads which have actual access to the database. Have your 1835 jobs queue up their work for servicing by those 5 threads. Then mess with the '5' until you find a good value.

Simon.
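The queue-plus-N-threads design suggested above can be sketched with Python's stdlib queue, threading, and sqlite3 modules (the trivial `SELECT ? + 1` jobs are stand-ins for the real 1835 queries):

```python
import queue
import sqlite3
import threading

NUM_WORKERS = 5          # the '5' to tune
jobs = queue.Queue()
results = queue.Queue()

def worker():
    # Each worker owns its own connection; a SQLite connection should
    # only be used by one thread at a time.
    con = sqlite3.connect(":memory:")
    while True:
        job = jobs.get()
        if job is None:          # poison pill: shut down
            break
        sql, args = job
        results.put(con.execute(sql, args).fetchone()[0])

threads = [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
for t in threads:
    t.start()
for n in range(20):              # stand-ins for the real jobs
    jobs.put(("SELECT ? + 1", (n,)))
for _ in threads:
    jobs.put(None)
for t in threads:
    t.join()
total = sum(results.get() for _ in range(20))
print(total)  # 210
```

Because only NUM_WORKERS connections ever touch the database, raising or lowering that one constant is all the experimentation the tuning requires.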
Re: [sqlite] Massively multithreaded SQLite queries
I'd say you should consider switching to some sort of queue feeding a worker pool, then experimenting with pool sizes. Often problems reward the first few threads you add, but at some point additional threads become a negative unless the system is specifically designed for high thread counts (and such a design can be annoying for low-thread-count users). There may also be caching interactions which improve with a smaller number of threads.

Something else to try is to have multiple databases which do not share page caches (to reduce locking). It is entirely possible that having 4 databases, each with 8 threads, could be faster than one database with 32 threads, because they each keep out of each other's way more.

[None of the above is really SQLite specific.]

-scott
Re: [sqlite] Massively multithreaded SQLite queries
Well, in the situation I'm working with, my API is constructing 1835 small SQL jobs to be run, and then passing them off to a structure in which 72 threads are running, each with its own db connection (I assume that's what you mean by a database handle, a DB connection, but please correct me if I'm wrong!). So in this case, 72 database handles on my bigger server.

Unfortunately, I'm not running the same queries over and over (one example is a 400 GB database with 3-5 dimension columns and a few data columns, and this is slicing on that data), so preparing them will have somewhat less benefit in that sense than in other cases. That said, I can still try preparing all the statements before I run any.
Re: [sqlite] Massively multithreaded SQLite queries
> On Mar 3, 2017, at 2:52 PM, Josh Hunsaker wrote:
> > Your problem might be mitigated if you could compile your queries in advance.

Precompiled statements are a must if you want the best performance (and you're running the same queries over and over).

Andrew, how many database handles are open?

—Jens
Re: [sqlite] Massively multithreaded SQLite queries
On Fri, Mar 3, 2017 at 1:13 PM, Andrew Brown wrote:
> Any tips to handle massively multithreaded side by side chunked queries on the same database?

In my (limited) experience, it seems that multithreaded SQLite acquires a lock on a shared, in-memory b-tree every time a statement is prepared. It might not be the database read itself, but statement preparation, that is causing threads to wait for each other. Your problem might be mitigated if you could compile your queries in advance.
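As one concrete illustration of amortizing preparation cost: Python's sqlite3 binding keeps a per-connection statement cache (sized by the `cached_statements` parameter of `connect()`), so re-issuing the same parameterized SQL text reuses the compiled statement instead of calling the prepare step again. Other bindings expose an explicit prepare API instead. A sketch:

```python
import sqlite3

# cached_statements (default 128) sizes the per-connection statement cache.
con = sqlite3.connect(":memory:", cached_statements=256)
con.execute("CREATE TABLE t(x INTEGER)")
con.executemany("INSERT INTO t(x) VALUES (?)", [(i,) for i in range(1000)])

# Same SQL text, different parameters: after the first call the cached
# compiled statement is reused rather than re-prepared.
sql = "SELECT COUNT(*) FROM t WHERE x < ?"
counts = [con.execute(sql, (n,)).fetchone()[0] for n in (10, 100, 1000)]
print(counts)  # [10, 100, 1000]
```

The key point is to keep the SQL text identical and vary only the bound parameters; any change to the text forces a fresh preparation.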
Re: [sqlite] sqlite3 feature or regression
On 3 Mar 2017, at 1:12pm, Vermes Mátyás wrote:
> The attached ruby script demonstrates a feature of the newer sqlite3 libraries, which may be a regression.

Just a note that you cannot attach files to posts to this list. If your file is text, just paste it into a message. Otherwise please put it on the web and post a URL.

Simon.
Re: [sqlite] UNION
Might I suggest: https://www.w3schools.com/sql/

dvn

On 03/01/2017 09:02 AM, jose isaias cabrera wrote:

Ryan, I just want to thank you for your kindness and display of goodwill to mankind. This is a great response; I even learned something from this post. Thanks so much for your responses. There are others like us in this group who love to learn, and your posts are always well received. Thanks. In Spanish we say, "muchas gracias."

josé

On 2017-03-01 09:51, R Smith wrote:

On 2017/03/01 3:40 AM, do...@mail.com wrote:
> # SELECT * FROM processors UNION SELECT * FROM storage;
> Error: SELECTs to the left and right do not have the same number of result columns.
>
> All tables that I created in my database have differing column names, values, and numbers of columns, with the noted exception of the one column which is common (board). I've no idea what to do now. Why is this an error?

I think that perhaps you are new to SQL, and the other replies assumed you simply wanted what you wrote. I could be wrong, but just in case, here are my suggestions. Perhaps what you would rather like to do is JOIN these tables, not UNION them?

Do you wish to match /every/ processor with /every/ board? In this case, the statement should read (warning: this could produce excessively long listings):

SELECT * FROM processors, storage;

Do you wish to match only processors and storage that fit on the same boards? In this case the statement might read something like:

SELECT * FROM processors JOIN storage USING (board);

Do you wish to list /all/ processors and add the storage options for the same board /if/ there are any? In this case the statement might read something like:

SELECT * FROM processors LEFT JOIN storage ON storage.board = processors.board;

As you can see, lots of different things can be achieved. A quick course in SQL, via perhaps W3Schools, will teach all these in a few hours.

Cheers,
Ryan
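The three join variants above can be tried out against a toy schema; a sketch using Python's sqlite3 (the column names beyond `board` are invented for illustration):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE processors(board TEXT, cpu TEXT);
    CREATE TABLE storage(board TEXT, disk TEXT);
    INSERT INTO processors VALUES ('b1', 'i7'), ('b2', 'ryzen');
    INSERT INTO storage VALUES ('b1', 'ssd');
""")

# Cross join: every processor paired with every storage row.
cross = con.execute("SELECT * FROM processors, storage").fetchall()
# Inner join on the shared column (note the parentheses after USING);
# USING collapses the two board columns into one.
inner = con.execute(
    "SELECT * FROM processors JOIN storage USING (board)").fetchall()
# Left join: all processors, with NULLs where no storage matches.
left = con.execute(
    "SELECT * FROM processors LEFT JOIN storage "
    "ON storage.board = processors.board").fetchall()
print(len(cross), inner, left)
```

Running this shows the row-count differences directly: the cross join produces every pairing, the inner join only matched boards, and the left join keeps the unmatched processor with NULL storage columns.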
[sqlite] Massively multithreaded SQLite queries
Hello,

Based on my reading of the documentation, it seems like SQLite is a great candidate for cases where you have a large number of threads that only need to read a database simultaneously; in our case, for fast data lookup, aggregation, etc. I've been able to generate SQL queries that do this, but once we start running them on a large server with a lot of multithreading going on, I find that we spend a lot of time in __raw_spin_lock, perhaps 75%+ of the time (in one case, 87% of 350 seconds x 72 cores was spent in __raw_spin_lock). This is being run on 64 and 72 core machines, and the more cores I run it on, the slower it ends up going.

To give a bit more detail, I'm working with dotnet core, have written a custom sqlite wrapper (since the dotnet core one lacks the ability to set connection flags beyond readonly, and doesn't have sqlite3_prepare_v2() implemented), and I'm running on Linux against a bunch of SQLite files in the 2 GB-400 GB size range. Individual queries are wicked fast, but once I start spreading the load over all the cores by running simultaneous queries, I lose the performance advantage and it actually becomes significantly slower.

Have tried setting SQLITE_OPEN_NOMUTEX, SQLITE_OPEN_FULLMUTEX. Tried shared cache, read uncommitted. Tried without shared cache, read uncommitted. Tried WAL. If I write a less efficient query, I spend less time in __raw_spin_lock, but of course then the queries themselves take longer to return.

Any tips to handle massively multithreaded side-by-side chunked queries on the same database?

Thank you in advance,
Andrew Brown
[sqlite] sqlite3 feature or regression
Hi! The attached ruby script demonstrates a feature of the newer sqlite3 libraries, which may be a regression.

-- Vermes Mátyás
Re: [sqlite] Deleting records from a large table
Thanks all for your input, it has really helped. In my real-world application, tmp_keep is a temporary table populated by examining a number of other tables, and I suddenly realised that it could even contain duplicate ids. Sloppy thinking on my part. I get the best results by creating another table:

CREATE TEMPORARY TABLE tmp_keep_unique (id1 INTEGER PRIMARY KEY);
INSERT INTO tmp_keep_unique SELECT DISTINCT id1 FROM tmp_keep;

It takes far longer to create an index on tmp_keep than the index saves on the above query. Then, *with a primary key on both table1 and tmp_keep_unique*:

DELETE FROM table1 WHERE id1 NOT IN (SELECT id1 FROM tmp_keep_unique);

is acceptably efficient. On SQLite, NOT EXISTS is quicker than NOT IN, but I also need to use the same SQL on a MySQL implementation, and there it behaves the opposite way.
Re: [sqlite] Deleting records from a large table
Dave Blake wrote:
> Yes the id1 are integer primary keys.

In both tables? If yes, then there is not much you could do, and the problem probably is all the modifications done to the actual table and its indexes.

Regards,
Clemens
Re: [sqlite] Deleting records from a large table
On 3 Mar 2017, at 10:53am, Dave Blake wrote:
> DELETE FROM table1 WHERE id1 NOT IN (SELECT id1 FROM tmp_keep);
>
> But this becomes inefficient when tmp_keep is large.

Do you have an appropriate index on tmp_keep?

CREATE INDEX tk_id1 ON tmp_keep (id1);

then try it again.

Simon.
Re: [sqlite] Deleting records from a large table
DELETE FROM table1 WHERE NOT EXISTS (SELECT 1 FROM tmp_keep WHERE id1 = table1.id1);

Does it in a single pass, by running a correlated subquery on each row in table1 to see if the id is in tmp_keep.
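A runnable sketch of the correlated-subquery delete, using Python's sqlite3 with made-up sample data (keep ids 2, 5, 9; everything else is removed in one statement):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE table1(id1 INTEGER PRIMARY KEY, val TEXT)")
con.execute("CREATE TEMP TABLE tmp_keep(id1 INTEGER PRIMARY KEY)")
con.executemany("INSERT INTO table1(id1, val) VALUES (?, ?)",
                [(i, f"row{i}") for i in range(1, 11)])
con.executemany("INSERT INTO tmp_keep(id1) VALUES (?)", [(2,), (5,), (9,)])

# Single pass over table1; the subquery probes tmp_keep's primary key
# for each row, so no temporary index needs to be built.
con.execute(
    "DELETE FROM table1 WHERE NOT EXISTS "
    "(SELECT 1 FROM tmp_keep WHERE tmp_keep.id1 = table1.id1)")
kept = [r[0] for r in con.execute("SELECT id1 FROM table1 ORDER BY id1")]
print(kept)  # [2, 5, 9]
```

With tmp_keep.id1 declared INTEGER PRIMARY KEY, each probe is a rowid lookup rather than a scan, which is why this scales better than NOT IN with a large keep-list.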
[sqlite] Fossil version 2.0
Fossil is the version control system (VCS) written specifically to support SQLite. See https://www.fossil-scm.org/ for details.

Version 2.0 of Fossil has just been released. The 2.0 release supports the use of SHA3-256 hashes for naming files and check-ins in the repository. SHA1 hashes are also still supported for backwards compatibility. In other words, Fossil 2.0 will read both SHA1 and SHA3-256 hashes. On the other hand, older Fossil 1.x releases will only read SHA1 hashes and will generate errors if they encounter SHA3-256 hashes.

To be clear: Fossil 2.0 and Fossil 1.x are fully compatible and will seamlessly interoperate on repositories that contain only SHA1 content. However, once a repository acquires SHA3-256 content, Fossil 1.x will stop working and you will need to upgrade to Fossil 2.0.

If you are using Fossil to access the SQLite source repository, you should upgrade, since we will soon begin pushing SHA3-256 hashed check-ins and files to the main SQLite repository. Once that happens, you will no longer be able to pull or clone the SQLite repository using older 1.x versions of Fossil.

The web interface to Fossil continues to operate exactly as it did before. All legacy URLs (even those involving SHA1 hash names) continue to work without change. If you only access Fossil through the web interface, no action is necessary.

-- D. Richard Hipp
d...@sqlite.org
Re: [sqlite] Deleting records from a large table
It could be keeping almost all of the records, so ~50, but it varies greatly; sometimes it will keep just 10. I can adjust the approach depending on size if necessary. Yes, the id1 columns are integer primary keys. Table1 has a number of indexes and views, so the create-new-table approach is less attractive.
Re: [sqlite] Index usefulness for GROUP BY
Jeffrey Mattox wrote:
> is my index on weekday worthwhile, time-wise and space-wise? (Query speed is not a big issue for me, and the DB is relatively small

Indexes are optimizations. In a small DB, the effect is probably not noticeable, which implies that you should not bother. Where exactly the point is at which the index becomes useful in your system is something you have to measure yourself.

Regards,
Clemens
Re: [sqlite] Deleting records from a large table
Dave Blake wrote:
> For deleting a large number of records (tmp_keep is small), this works fine:
> DELETE FROM table1 WHERE id1 NOT IN (SELECT id1 FROM tmp_keep);
>
> But this becomes inefficient when tmp_keep is large.

SQLite usually creates a temporary index for the values in the IN clause. How large is "large"? Are the id1 values integers? Then you can make tmp_keep.id1 the INTEGER PRIMARY KEY.

Regards,
Clemens
Re: [sqlite] Deleting records from a large table
On 2017/03/03 12:53 PM, Dave Blake wrote:
> What is the best strategy for doing the deletion?

BEGIN TRANSACTION;
ALTER TABLE table1 RENAME TO tmp1;
CREATE TABLE table1 (
  -- Your standard table creation code for table1...
);
INSERT INTO table1 SELECT tmp1.* FROM tmp1 JOIN tmp_keep ON tmp_keep.id1 = tmp1.id1;
DROP TABLE tmp1;
COMMIT;
[sqlite] Deleting records from a large table
Say table1 has more than 50 records, and there is a second table tmp_keep with the ids of the records in table1 to be kept; the rest need to be deleted. The number of records in tmp_keep can vary from 0 to all the records in table1, with any value in between.

What is the best strategy for doing the deletion? For deleting a large number of records (when tmp_keep is small), this works fine:

DELETE FROM table1 WHERE id1 NOT IN (SELECT id1 FROM tmp_keep);

But this becomes inefficient when tmp_keep is large. Any suggestions?
Re: [sqlite] Index usefulness for GROUP BY
YES. AFAIK, if SQLite detects that the rows are (or can be made to be) returned in GROUP BY order, it can use internal variables to accumulate the group results. This is expected to be significantly faster than locating and updating a temporary B-tree row for each record scanned.

Gunter Hick
Software Engineer
Scientific Games International GmbH
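The effect described above can be inspected with EXPLAIN QUERY PLAN; a sketch using Python's sqlite3, with a fixed numeric cutoff standing in for strftime('%s','now','-28 days') and invented sample rows (the plan text varies by SQLite version, so it is only printed, not checked):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE History (history_ID INTEGER PRIMARY KEY, "
            "gameCount INTEGER, weekday INTEGER, hour INTEGER, "
            "datetime INTEGER)")
rows = [(10, 0, 1000), (5, 0, 2000), (7, 3, 1500), (1, 3, 500)]
con.executemany(
    "INSERT INTO History(gameCount, weekday, datetime) VALUES (?, ?, ?)",
    rows)

def plan(sql, args=()):
    # The last column of each EXPLAIN QUERY PLAN row is the detail text.
    return [r[3] for r in con.execute("EXPLAIN QUERY PLAN " + sql, args)]

cutoff = 900  # stand-in for strftime('%s','now','-28 days')
sql = ("SELECT TOTAL(gameCount), weekday FROM History "
       "WHERE datetime >= ? GROUP BY weekday ORDER BY 1 DESC")
print(plan(sql, (cutoff,)))   # temp B-tree needed for the GROUP BY
con.execute("CREATE INDEX Idx_weekday ON History(weekday)")
print(plan(sql, (cutoff,)))   # index scan delivers rows in GROUP BY order
result = con.execute(sql, (cutoff,)).fetchall()
print(result)  # [(15.0, 0), (7.0, 3)]
```

With the index in place the scan itself returns rows in weekday order, so only the ORDER BY still needs a temporary B-tree, matching the two query plans quoted in the question.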
[sqlite] Index usefulness for GROUP BY
Given this DB schema (simplified; there are other columns):

CREATE TABLE History (
    history_ID INTEGER PRIMARY KEY,
    gameCount  INTEGER,
    weekday    INTEGER,  /* 0=Sunday, 6=Saturday */
    hour       INTEGER,  /* (0..23) */
    datetime   INTEGER   /* unix datetime */
);
CREATE INDEX Idx_weekday ON History( weekday );

Now, I look at a recent set of rows...

SELECT TOTAL(gameCount), weekday
FROM History
WHERE datetime >= strftime('%s','now','-28 days')
GROUP BY weekday
ORDER BY 1 DESC

QUERY PLANS:

without the index:
0 0 0 SCAN TABLE History
0 0 0 USE TEMP B-TREE FOR GROUP BY  <-- weekday (7 groups)
0 0 0 USE TEMP B-TREE FOR ORDER BY

with the index:
0 0 0 SCAN TABLE History USING INDEX Idx_weekday
0 0 0 USE TEMP B-TREE FOR ORDER BY

Either way, the entire table is scanned (right?). My index covers the entire table, but the TEMP B-TREE FOR GROUP BY contains only the rows matching the WHERE clause, so the TEMP B-TREE is much smaller (right?). So, is my index on weekday worthwhile, time-wise and space-wise? (Query speed is not a big issue for me, and the DB is relatively small: there are, at most, 60 rows added per day. Memory is plentiful, OSX.)

--- Jeff