Note that, as I understand it, if you use only a single connection for the
CherryPi server, all the threads on the server will be running the queries
sequentially. Try using a database connection per thread?

On Thu, May 18, 2017, 8:47 PM Gabriele Lanaro <gabriele.lan...@gmail.com>
wrote:

> Thanks everyone for all the tips! This is all very useful.
>
> We are using SQLite’s FTS5 feature to search a large number of text files.
> There are 50M records in total but they are split across 1000 smaller
> databases of 50K records each. Each DB is 250MB in size.
>
> I am trying to test query performance and for that I am issuing queries for
> same term over and over on a random subset of the databases.
>
> Each query will execute on its own thread (I am using Python’s CherryPy
> server) and the connection to each DB will be kept alive (multiple requests
> will certainly hit the same connection). I am using PRAGMA
> cache_size=256MB, so it should cache the entire DB in memory.
>
> If I only use a single database then I would expect that the query could be
> served entirely from cached the copy of DB. Since we have 100 of those, I
> could reasonably expect that there will be some reading from disk.
>
> But for a smaller number of DBs, say 10, I would expect the performance
> similar to in-memory.
>
> The schema is roughly constituted of two tables, one is a full text index,
> fts5_table, while the other is called datatable.
>
> The query in question is a full text query on the full text index joined
> with another table.
>
> *SELECT* fts5.column, datatable.column2 *FROM* fts5_table, datatable
> *WHERE*
>  fts5_table *MATCH* ‘term’ *AND* datatable.id = fts5_column.rowid *ORDER
> BY*
>  rank *ASC* *LIMIT* 10;
>
> The total number 'term' matches in the documents is about 30000 documents
> (given the orderby query, this means that all of those documents will need
> to be ranked).
>
> As I increase the number of threads I reach a peak performance at 4
> threads, which is less than the number of available cores (after that, the
> performance degrades). If I make my application use more threads (i.e. more
> requests get processed in paralle) the performance linearly degrades.
>
> All I am getting is ~40 requests per seconds (when we have 100 users making
> queries in parallel). But I believe SQLite is capable of doing much more. I
> am running on a Red Hat Linux on an Intel machine with 8-cores, 16-threads
> and 64GB of system memory. Disks are SSD.
>
> Thanks,
>
> Gabriele
>
>
> On Thu, May 18, 2017 at 7:51 AM, Eduardo Morras <emorr...@yahoo.es> wrote:
>
> > On Wed, 17 May 2017 22:18:19 -0700
> > Gabriele Lanaro <gabriele.lan...@gmail.com> wrote:
> >
> > > Hi, I'm trying to assess if the performance of my application is
> > > dependent on disk access from sqlite.
> > >
> > > To rule this out I wanted to make sure that the SQLite DB is
> > > completely accessed from memory and there are no disk accesses.
> > >
> > > Is it possible to obtain this effect by using pragmas such as
> > > cache_size?
> > >
> > > Another solution is to copy the existing db to a :memory: db but I'd
> > > like to achieve the same effect without doing so (because it will
> > > require substantial modification of the application). For the sake of
> > > argument, let's image that using :memory: db is not an option.
> > >
> > > Also using a ramdisk is not an option because I don't have root
> > > access to the machine.
> >
> > What OS are you using?
> >
> > You can next tips to make the app less dependant on disk I/O access:
> >
> > a) change where store temporal tables (mat views, subqueries, temp
> > tables) and indices (transient, to use ram always (pragma
> > temp_store=2),
> >
> > b) increase cache size, the more, the better (os disk cache is shared
> > with other processes and is slower), if cache is equal or bigger than
> > your db, it'll be fit in ram,
> >
> > c) set config SQLITE_CONFIG_STMTJRNL_SPILL=-1 if you use statement
> > journal files (check http://www.sqlite.org/tempfiles.html#stmtjrnl),
> >
> > d) use wal mode to avoid *-shm files, and set wal_checkpointing,
> >
> > e) use mmap_size pragma to minimize I/O (check
> > http://www.sqlite.org/mmap.html it has disadvanteges too)
> >
> >
> > > Thanks,
> > >
> > > Gabriele
> >
> > HTH
> >
> > ---   ---
> > Eduardo Morras <emorr...@yahoo.es>
> >
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to