Hi mailing list gurus!
I will start with TL;DR version as this may be enough for some of you:
 * We are trying to investigate an issue that we see in diagnostic data of our C++ product.  * The issue was pinpointed to be caused by timeout on `sqlite3_open_v2` which supposedly takes over 60s to complete (we only give it 60s).  * We tried multiple different configurations, but never were able to reproduce even 5s delay on this call.

So the question is if maybe there are some known scenarios in which `sqlite3_open_v2` can take that long (on windows)?

Now to the details:
 * We are using version `3.10.2` of SQLite. We went through changelogs from this version till now and nothing we've found in bugfixes section seems to suggest that there was some issue that was addressed in consecutive sqlite releases and may have caused our problem.  * The issue we see affects around 0.1% unique user across all supported versions of windows (Win 7, Win 8, Win 10). There are no manual user complains/reports about that - this can suggest that problem happens in the context where something serious enough is happening with user machine/system that he doesn't expect anything to work. So something that indicates system wide failure is a valid possibility as long as it can possibly happen for 0.1% of random windows users.  * There are no data indicating that the same issue ever occurred on Mac which is also supported platform with large enough sample of diagnostic data.  * We are using Poco (https://github.com/pocoproject/poco, version: 1.7.2) as a tool for accessing our SQLite database, but we've analyzed the Poco code and it seems that failure on this code level can only (possibly) explain ~1% of all collected samples. This is how we've determined that problem lies in `sqlite3_open_v2` taking long time.
 * This happens on both `DELETE` journal mode as well as on `WAL`.
 * It seems like after this problem happens first time for a particular user each consecutive call to `sqlite3_open_v2` takes that long until user restarts whole application (possibly machine, no way to tell from our data).
 * We are using following flags setup for `sqlite3_open_v2` (as in Poco):

 > sqlite3_open_v2(..., ..., SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE | SQLITE_OPEN_URI, NULL);

 * This usually doesn't happen on startup of the application so it's not likely to be caused by something happening while our application is not running. This includes power cuts offs causing data destruction (which tend to return SQLITE_CORRUPT anyway, as mentioned in: https://www.sqlite.org/howtocorrupt.html).  * We were never able to reproduce this issue locally even though we tried different things:    ** Multiple threads writing and reading from DB with synchronization required by particular journaling system.    ** Keeping sqlite connection open for long time and working on db normally in a mean while.    ** Trying to hit HDD hard with other data (dumping /dev/rand (WSL) to multiple files from different processes while accessing DB normally).    ** Trying to force antivirus software to scan db on every file access (tested with Avast with basically everything enabled including "scan on open" and "scan on write").    ** Breaking our internal synchronization required by particular journaling systems.    ** Calling WinAPI CreateFile with all possible combinations of file sharing options on db file - this caused issues but `sqlite3_open_v2` always returned fast - just with error.    ** Calling WinAPI LockFile on random parts of DB file which is btw. nice way of reproducing `SQLITE_IOERR`, but no luck with reproducing the discussed issue.    ** Some additional attempts to actually stretch Poco layer and double check if our static analysis of codes are right.  * We've tried to look for similar issues online but anything somewhat relevant we've found was here http://sqlite.1065341.n5.nabble.com/sqlite3-open-v2-performance-degrades-as-number-of-opens-increase-td37482.html . This doesn't seem to explain our case though, as the numbers of parallel connections are way beyond what we have as well as what would typical windows user have (unless there is some somewhat popular app exploiting sqlite which we don't know about).

Do you have any ideas what can cause that issue?

Maybe some hints what else should we check or what additional diagnostic data that we can collect from users would be useful to pinpoint the real reason why that happens?

Thanks in advance :)
Andrzej 'Yester' Fiedukowicz

_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to