Note that, as I understand it, if you use only a single connection for the CherryPi server, all the threads on the server will be running the queries sequentially. Try using a database connection per thread?
On Thu, May 18, 2017, 8:47 PM Gabriele Lanaro <gabriele.lan...@gmail.com> wrote: > Thanks everyone for all the tips! This is all very useful. > > We are using SQLite’s FTS5 feature to search a large number of text files. > There are 50M records in total but they are split across 1000 smaller > databases of 50K records each. Each DB is 250MB in size. > > I am trying to test query performance and for that I am issuing queries for > same term over and over on a random subset of the databases. > > Each query will execute on its own thread (I am using Python’s CherryPy > server) and the connection to each DB will be kept alive (multiple requests > will certainly hit the same connection). I am using PRAGMA > cache_size=256MB, so it should cache the entire DB in memory. > > If I only use a single database then I would expect that the query could be > served entirely from cached the copy of DB. Since we have 100 of those, I > could reasonably expect that there will be some reading from disk. > > But for a smaller number of DBs, say 10, I would expect the performance > similar to in-memory. > > The schema is roughly constituted of two tables, one is a full text index, > fts5_table, while the other is called datatable. > > The query in question is a full text query on the full text index joined > with another table. > > *SELECT* fts5.column, datatable.column2 *FROM* fts5_table, datatable > *WHERE* > fts5_table *MATCH* ‘term’ *AND* datatable.id = fts5_column.rowid *ORDER > BY* > rank *ASC* *LIMIT* 10; > > The total number 'term' matches in the documents is about 30000 documents > (given the orderby query, this means that all of those documents will need > to be ranked). > > As I increase the number of threads I reach a peak performance at 4 > threads, which is less than the number of available cores (after that, the > performance degrades). If I make my application use more threads (i.e. more > requests get processed in paralle) the performance linearly degrades. > > All I am getting is ~40 requests per seconds (when we have 100 users making > queries in parallel). But I believe SQLite is capable of doing much more. I > am running on a Red Hat Linux on an Intel machine with 8-cores, 16-threads > and 64GB of system memory. Disks are SSD. > > Thanks, > > Gabriele > > > On Thu, May 18, 2017 at 7:51 AM, Eduardo Morras <emorr...@yahoo.es> wrote: > > > On Wed, 17 May 2017 22:18:19 -0700 > > Gabriele Lanaro <gabriele.lan...@gmail.com> wrote: > > > > > Hi, I'm trying to assess if the performance of my application is > > > dependent on disk access from sqlite. > > > > > > To rule this out I wanted to make sure that the SQLite DB is > > > completely accessed from memory and there are no disk accesses. > > > > > > Is it possible to obtain this effect by using pragmas such as > > > cache_size? > > > > > > Another solution is to copy the existing db to a :memory: db but I'd > > > like to achieve the same effect without doing so (because it will > > > require substantial modification of the application). For the sake of > > > argument, let's image that using :memory: db is not an option. > > > > > > Also using a ramdisk is not an option because I don't have root > > > access to the machine. > > > > What OS are you using? > > > > You can next tips to make the app less dependant on disk I/O access: > > > > a) change where store temporal tables (mat views, subqueries, temp > > tables) and indices (transient, to use ram always (pragma > > temp_store=2), > > > > b) increase cache size, the more, the better (os disk cache is shared > > with other processes and is slower), if cache is equal or bigger than > > your db, it'll be fit in ram, > > > > c) set config SQLITE_CONFIG_STMTJRNL_SPILL=-1 if you use statement > > journal files (check http://www.sqlite.org/tempfiles.html#stmtjrnl), > > > > d) use wal mode to avoid *-shm files, and set wal_checkpointing, > > > > e) use mmap_size pragma to minimize I/O (check > > http://www.sqlite.org/mmap.html it has disadvanteges too) > > > > > > > Thanks, > > > > > > Gabriele > > > > HTH > > > > --- --- > > Eduardo Morras <emorr...@yahoo.es> > > > _______________________________________________ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users