On Wed, Sep 26, 2018 at 02:33:27AM +0000, Vladimir Panteleev via Digitalmars-d wrote: > On Wednesday, 26 September 2018 at 01:52:31 UTC, H. S. Teoh wrote: [...] > > but basically, any column used in a WHERE clause is a candidate for > > indexing. > > Yep, I think we're past that already. > > The last issue I ran into was subscriptions. Some people seem to be > creating subscriptions to collect and email them frequently, sometimes > on every post - not that those work well, because the forum stops > emailing people as soon as they have unread messages in their > subscriptions, but they still get saved to the queue. Still, the > longer the forum was online, the more subscriptions have accumulated, > and every new post resulted in all those subscriptions getting > triggered. Now, every time a subscription with an email action was > triggered, we had to check if there are any unread messages in their > subscription queue, and there can be a lot of messages in there - > thus, this caused something like an O(m*n) database operation (with > the underlying database implementation also not having a constant > execution time of course). I fixed this by limiting the check to the > first unread post instead of reusing a function to count all unread > messages in the subscription queue: [...]
Hmm. I wonder if it might help if you separated the subscription queue into its own database. You're right that SQLite locks the entire database when writing, so if there's a lot of write activity going on, readers will be frequently blocked. Separating part of the data into its own DB may help increase the parallelizability of the system. In my experience in working with SQLite, I find that generally you want to design your schema so that writes are as short as possible -- the global DB write lock can be a big bottleneck, as you said, so the less time you spend holding the write lock, the better. If it's possible to split up data for different functionalities into different DBs, that might help improve performance by avoiding waiting for the global write lock on a single DB all the time. Now glancing over your schema, I wonder if it might make a difference if you used the implicit rowId for your 'ID' fields instead of strings. The rowId in SQLite is special, because it exists for every table implicitly, is always unique, and AFAIK allows fast lookups (or faster lookups than strings, AIUI). It may not be practical to do that now, given the large amount of data already stored with string IDs, but it could potentially make a difference. Of course, you may need to map it to strings somewhere, so I'm not sure if the tradeoff is worth it, but it might be instructive to experiment with it in an offline system to see if you could gain some performance that way. T -- We are in class, we are supposed to be learning, we have a teacher... Is it too much that I expect him to teach me??? -- RL