Re: [sqlite] expected performance of selecting large numbers of records
On 1/31/2015 12:21 AM, Godmar Back wrote: If I want to do a dump of the table, as in "SELECT value FROM " I'll quickly get "database is locked" errors. Googling revealed that those are because a long running select keeps a cursor, and thus readlock on the entire database. I have since rewritten the query using multiple SELECT * FROM ... LIMIT a, b where b = 50 and a = 0, 50, 100, 150, . However, it takes 20 hours to fully extract the table's 850,000 records, with only minimal per record processing. That's a quadratic algorithm you have here. In order to start reading from the b-th record, SQLite needs to read and discard the previous b records. You would be better off with a "WHERE id > $LastID ORDER BY id " clause, where $LastID would be the key of the last record in the previous chunk. Better still, switch to WAL mode ( https://www.sqlite.org/wal.html ) It allows readers to work simultaneously with a single writer. Then just run a single query across the whole table. My question: is this performance expected, or am I doing something wrong? Is there a quick way to count and/or dump an approximate snapshot of a single 2-column table such as the one shown above, even while keeping the database available for concurrent writes? Do you ever delete records from this table? If you don't, you can get a count with select max(rowid) from mytable; That should be nearly instantaneous. If you do delete rows, and all you need is a quick count, then I would suggest adding an ON INSERT and ON DELETE triggers that would update a count stored in a separate, singleton table. I'm not sure what you mean by "dump an approximate snapshot". -- Igor Tandetnik ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] expected performance of selecting large numbers of records
On Sat, 31 Jan 2015 00:21:18 -0500, Godmar Back wrote: >I have a single SQLite 2-column table with a primary TEXT key and a value, >like so (id TEXT PRIMARY KEY, value TEXT). One process adds new records >periodically, perhaps 1-10 per minute. The database currently has 850,000 >entries and is 7.3GB large on disk. You could benchmark an alternative structure with CREATE TABLE ... ( id TEXT PRIMARY KEY, value TEXT ) WITHOUT ROWID; http://www.sqlite.org/withoutrowid.html >I also need to perform bulk queries, which appear to be very slow. On an >average desktop PC, for instance, a "SELECT COUNT(*)" takes over 5 minutes. count(*) does a full table scan, which takes time. You could add ON INSERT/ON DELETE triggers to maintain a running count in a separate table. >If I want to do a dump of the table, as in "SELECT value FROM " I'll >quickly get "database is locked" errors. Googling revealed that those are >because a long running select keeps a cursor, and thus readlock on the >entire database. > > I have since rewritten the query using multiple SELECT * >FROM ... LIMIT a, b where b = 50 and a = 0, 50, 100, 150, . However, >it takes 20 hours to fully extract the table's 850,000 records, with only >minimal per record processing. With the LIMIT clause SELECT will usually be slower, especially when ORDER BY is something else than the primary key. >My question: is this performance expected, or am I doing something wrong? >Is there a quick way to count and/or dump an approximate snapshot of a >single 2-column table such as the one shown above, even while keeping the >database available for concurrent writes? You probably should use PRAGMA journal_mode=WAL; , which allows one writer and many readers concurrently. http://www.sqlite.org/pragma.html#pragma_journal_mode Also, have a look at http://www.sqlite.org/pragma.html#pragma_busy_timeout >I'm using Ubuntu 12.10 with 3.7.9-2ubuntu1.1 libsqlite3 using the Python >binding. Check whether that the version reported by SELECT sqlite_version(); has the features I mentioned above (WITHOUT ROWID and WAL mode); If necessary, compile your own library from source, http://www.sqlite.org/download.html (bottom) Hope this helps. -- Regards, Kees Nuyt ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] expected performance of selecting large numbers of records
Thank you very much for those suggestions! Select max(rowid) completes immediately, I haven't tried WAL mode and the idea of sorting yet. https://www.sqlite.org/wal.html appears to be down right now, but http://www.sqlite.org/wal.html is up. On Sat, Jan 31, 2015 at 5:31 AM, Igor Tandetnik wrote: > > I'm not sure what you mean by "dump an approximate snapshot". I meant that I don't care whether records that were recently added, say during the last few seconds, are included in the total count()/dump of all records or not. So I'm fine with an approximate snapshot that may not have a few of the latest updates. - Godmar ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] "database disk image is malformed" error occurs more
1. No client-server, I use MySQL, SQL-Server or ORACLE for that. 2. No access to the SQLite database ever by more than one process concurrently in writable mode. In readable mode, yes. But the reported damage cases were always single user, one PC. 3. I cannot prevent or disallow users to keep their databases on NAS or remote server storage. Telling them that keeping a file on a NAS box will probably damage the file would be sales venom. Despite, the reported cases were all databases stored on local disks, except one. I keep databases between 0.5 and 10 GB on NAS (Linux/SAMBA) and Windows servers for my test scenarios and perf/load tests. No troubles, no corruption. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] "database disk image is malformed" error occurs more
As I wrote above, damaged databases are replaced. No user continues working with a damaged database once it has been identified. The issue here is to detect this early and avoid it altogether. > One column of one row of one table may get corrupted. > If that's the case then the database can be used for years > without any problem being noticed. > Theoretically "PRAGMA integrity_check" will notice it, however. a) As I wrote above. b) integrity_check must find such issues. That’s how I understand it and Richard told me once. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] expected performance of selecting large numbers of records
On 1/31/15, Godmar Back wrote: > Thank you very much for those suggestions! > > Select max(rowid) completes immediately, I haven't tried WAL mode and the > idea of sorting yet. > > https://www.sqlite.org/wal.html appears to be down right now, but > http://www.sqlite.org/wal.html is up. > There was some server maintenance. And for whatever reason, the HTTPS side didn't come back up after reboot. I've manually restarted it so it should be working now. For future reference, if www.sqlite.org is ever down, you can always visit one of the (geographically dispersed) backup sites: http://www2.sqlite.org/ http://www3.sqlite.org/ -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] v3.5.0 of DB Browser for SQLite released
Hi all, We've just released v3.5.0 of DB Browser for SQLite: https://github.com/sqlitebrowser/sqlitebrowser/releases/tag/v3.5.0 This release has a fair amount of bug fixes (eg in CVS import), plus several usability tweaks, and now supports Simplified Chinese. This is the first release to support encrypted databases too, via SQLCipher. Only available in the non-Windows builds for now (eg OSX, Linux, etc). Windows and MacOS X binaries are available from the above URL. Linux, FreeBSD, OS/2 users will need to compile it themselves (pretty easy) for now, until ports/packages/etc become available. Hope that's helpful for people. :) Regards and best wishes, Justin Clift ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] "database disk image is malformed" error occurs more
On Fri, 30 Jan 2015 13:17:26 -0500 Stephen Chrzanowski wrote: > 2.1 Filesystems with broken or missing lock implementations > > SQLite depends on the underlying filesystem to do locking as the > documentation says it will. But some filesystems contain bugs in their > locking logic such that the locks do not always behave as advertised. The problem is even deeper than that. NFS does not implement Posix semantics. The actual behavior is hard to reason about and far outside SQLite's scope. On a local file system the kernel guarantees filebuffer cache coherency. If process A reads a block from the disk, and process B modifies the same block, the next time process A consults that block it will see B's changes. Note this happens whether or not locking is involved, and regardless of the state of the disk. It's a by-product of a unified buffer cache. On a network filesystem there is no unified buffer cache. Writes by B are not seen when A consults its cached block. NFS does not promise that a second read by A will reflect changes made by B. Even if all locks are implemented corrected and honored, A stands to read invalid data unless steps are taken to manage the cache, something SQLite doesn't do afaik. The subject has been discussed here before, as it turns out. The Googles returned http://sqlite.1065341.n5.nabble.com/SQLite-on-NFS-cache-coherency-td33697.html, which contains much more detail and references. DBMS implementations are always about arbitrating access to shared data. They require a single, unified view of the data. Getting that view over a remote filesystem is difficult in the best of circumstances and requires explicit measures be taken. SQLite doesn't attempt to do so, by design. (A reasonable choice IMO.) If you want multi-node access to a database over a network, there are many other options. Unsurprisingly, none of them use a network filesystem either. HTH. --jkl ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Truncation of floating point numbers in SQLite?
On Fri, 30 Jan 2015 10:39:31 +0200 RSmith wrote: > At first I thought SQLite (or any RDBMS) should really strore > whatever you give and return it untouched, but that cannot be true > for an RDBMS because it has to interpret the data, it isn't just a > binary store. It has to answer questions like SELECT ("colA"+3), > (7/"colB"); or sort by colA or use a collation on strings etc. etc. > - all of which means it must care about what the value relates to and > cannot simply ignore it unless stored as an ignorant type (i.e Blob). Yes, that's true. Whenever a column value is interpreted by SQLite, that interpretation might differ from the one used by the application that inserted it. For example, I might have an 8-bit unsigned integer stored in column A with the value 0xFF (all bits on). The clause "WHERE A > 128" would not return that row because SQLite interprets the column as signed and the value as -1. Similarly ORDER BY would not work in accordance with the application's interpretation. That said, https://www.sqlite.org/datatype3.html says REAL is "stored as an 8-byte IEEE floating point number". I can't see why a non-signalling NaN couldn't be stored and interpreted. You'd need some convention for collation, and a function like is_nan() would be helpful for WHERE clauses. It could be argued that SQLite should *not* do that, in the interest of simplicity. NaN represents an invalid output, say sqrt(-2). For any function F, NaN = F (NaN), and NaN <> NaN. Because SQL NULL works similarly, ISTM a defensible alternative would be to say all NaNs become NULL in the database, or raise an error. That loses the distinction between "missing" and "not computable", but that probably doesn't matter in most practical senses. On the third hand, many numerical and stats packages, e.g. R, use NaN (mistakenly, in my view) to represent missing data. Integration with such packages might be facilitated by "NaN fidelity" if we can call it that, so that they can retrieve what they saved. > > Trying to retrieve a stored qNaN or sNaN returns a column type of > > NULL and a value of 0. That's consistent with division-by-zero yielding zero, and is an impediment to using SQLite for scientific work. --jkl ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Truncation of floating point numbers in SQLite?
On 31 Jan 2015, at 11:14pm, James K. Lowden wrote: > That said, https://www.sqlite.org/datatype3.html says REAL is "stored > as an 8-byte IEEE floating point number". I can't see why a > non-signalling NaN couldn't be stored and interpreted. You'd need some > convention for collation, and a function like is_nan() would be helpful > for WHERE clauses. So, having established that NaN and -0 do not make the round trip from a C variable through a database and back into a C variable ... at least I think we have ... There's a bit representation for storing NaN in a binary64. Presumably, if you have NaN in a C variable, it's using that representation in memory. Then you use sqlite3_bind_double() to get this into an SQLite variable, and use INSERT to get SQLite to put that into a database. Do we expect to be able to see that pattern by looking through the database file using hexdump ? Because there's nothing in the SQLite documentation that says it can store values like NaN or -0.0. The documentation just says it can store numbers. It should be possible to find out when the distinction between 0.0 and -0.0 gets lost. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Truncation of floating point numbers in SQLite?
On Sun, Feb 1, 2015 at 2:07 AM, Simon Slavin wrote: > So, having established that NaN and -0 do not make the round trip from a C > variable through a database and back into a C variable ... at least I think > we have ... > If you're assuming C89 (which sqlite3 is, by and large), it's a technical fact that there is no standard representation of either negative zero, NaN, or Infinity. Any such support would be non-C-standard. > Because there's nothing in the SQLite documentation that says it can store > values like NaN or -0.0. The documentation just says it can store > numbers. It should be possible to find out when the distinction between > 0.0 and -0.0 gets lost. > sqlite's platform (C89) doesn't natively support it, so if it's there then sqlite3 is either supporting it itself (custom code) or is relying on C99-specific APIs (which, to the best of my (fallible) knowledge, it does not do). -- - stephan beal http://wanderinghorse.net/home/stephan/ http://gplus.to/sgbeal "Freedom is sloppy. But since tyranny's the only guaranteed byproduct of those who insist on a perfect world, freedom will have to do." -- Bigby Wolf ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users