Re: [sqlite] Scrolling through results of select
>>> 1. Is there any significant overhead on SQLite from my selecting from a
>>> view representing the original arbitrary select? That is, will SQLite
>>> still use any indexes etc correctly? Or do I need to dissect/parse the
>>> original select statement, changing the where statement etc?
>>
>> You can work out the answer yourself by using EXPLAIN and EXPLAIN QUERY
>> PLAN of some representative examples.
>
> Yes, but I wondered if there was some overriding logic that SQLite uses
> that would provide a theoretical/logical rather than experimental guide.

There isn't, as trying EXPLAIN would show you.

>> The rest of your questions assume a particular solution. The only thing
>> that will reliably work is to reissue the query using skip and limit
>
> By "skip" do you mean select where rowid > , or offset or something else?

No, I mean skipping the first N results. You can't use rowid since it won't
exist in many cases. For example 'select 3+4' or 'select x+y from a,b where ...'.

> Hmm, true. I hadn't thought of user defined function side effects. I don't
> have to allow for that at the moment, but I'll keep it in mind. Is it
> common or even good practice for a user function (used in a select
> statement) to modify the table from which it's selecting? That seems like
> bad practice to me, and I can't see why you'd do that rather than use
> update, insert or delete to make changes.

The UDF could take a filename as a parameter and return the size or last
access time. By changing when bits of the query execute you'll get different
answers (eg the file size changes between page scrolls in the query).

>> then the solution is to 'CREATE TEMP TABLE results AS ...select...'. This
>> will also work if someone uses "ORDER BY random()" or any other udf that
>> depends on more than its arguments.
>
> Hmm, good thinking. I'll consider that.
> The downside is that creating a temporary table would require SQLite to
> process every row in the select, whereas prepare/step only processes the
> rows as they are shown. This would make a big difference for very large
> data sets or for a view/select containing a complex calculation for each
> row.

I think you are overthinking the problem. No one is going to scroll through
100,000 results, so there is no need to save 100,000 of them. Pick an
arbitrary number (eg 1000), use the CREATE TEMP TABLE ... AS select ...
approach, and add a 'LIMIT 1000' on the end. This will work with any query
and work reliably no matter what else happens to the database (eg other
processes modifying it).

If you want to refine things then there are several callbacks you can use.
For example if the database is coming back with one row per second then you
don't really want to wait 1000 seconds. You can have a monitoring thread and
call sqlite3_interrupt to abort the query. If you don't want to use another
thread then you can register a progress callback which knows when the query
started.

Roger
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
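Roger's two suggestions above — materialize at most N rows with CREATE TEMP TABLE ... LIMIT, and abort a runaway query from a progress callback — can be sketched with Python's stdlib sqlite3 bindings. The table, column names, and opcode/callback counts below are invented for illustration:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE t(x)")
db.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(5000)])

# Materialize at most 1000 rows of an arbitrary query into a temp table.
# The snapshot stays stable for scrolling even if 't' changes afterwards.
db.execute("CREATE TEMP TABLE results AS "
           "SELECT x * 2 AS y FROM t ORDER BY y LIMIT 1000")
print(db.execute("SELECT COUNT(*) FROM results").fetchone()[0])  # -> 1000

# A progress handler runs every N VM opcodes; returning non-zero aborts the
# running statement (same effect as calling sqlite3_interrupt from a thread).
calls = 0
def watchdog():
    global calls
    calls += 1
    return 1 if calls > 10 else 0   # give up after ~10 callbacks

db.set_progress_handler(watchdog, 100)   # invoked every 100 opcodes
aborted = False
try:
    # Deliberately slow: a cross join of t with itself.
    db.execute("SELECT COUNT(*) FROM t a, t b").fetchone()
except sqlite3.OperationalError:
    aborted = True                       # statement was interrupted
db.set_progress_handler(None, 0)
print("aborted:", aborted)
```

In a real application the watchdog would compare the current time against when the query started, rather than count callbacks.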
Re: [sqlite] COUNT() extremely slow first time!
On Mon, Feb 21, 2011 at 11:35 AM, Jim Wilcoxson wrote:
> On Mon, Feb 21, 2011 at 11:05 AM, Sven L wrote:
>>
>> Thank you for your detailed explanation!
>> First, can you please tell me how to purge the cache in Windows 7? This
>> could be very useful for my tests!
>
> Sorry, dunno for Windows. On Mac OSX it is the purge command, in the
> development tools. On Linux, you do: echo 3 > /proc/sys/vm/drop_caches

Just make sure you either (a) quote the 3 (echo '3' > /proc/sys/vm/drop_caches)
or (b) put a space between the 3 and the >. If you don't quote it, and you
don't put the space in (echo 3>/proc/sys/vm/drop_caches), it doesn't do
anything: the shell parses 3> as a redirection of file descriptor 3, so echo
runs with no arguments and nothing is written to drop_caches.

--
-- Stevie-O
Real programmers use COPY CON PROGRAM.EXE
Re: [sqlite] SQLite server using execnet ?
On 02/21/2011 12:37 PM, Jay A. Kreibich wrote:

> Yes, but in something like memcached, the database is not aware of
> that structure, and can't take advantage of it.

Memcached does understand some types and has atomic increment/decrement etc,
but does not understand structure.

> When storing serialized objects, it is all too common to see code that
> fetches an object, un-marshals it, alters some simple value, re-marshals
> the whole object, and then writes the whole thing back.

This all circles back to what you are doing and in particular what the OP
was doing. If you need to do queries, modifications and durability then what
you really need falls under the category of 'database'. On the other hand,
if there is no need for querying or changes, then something like memcached
is a great way for a bunch of machines/processes to get the data.

> Redis

Redis is a database :-)

> Yes and no. Redis, like memcached, is essentially an always in-memory
> key/value store.

*All* databases are in-memory for practical purposes. Their working set will
need to be in memory either explicitly due to their implementation, or
implicitly via the operating system or through administration (eg indices).
If accesses to the working set of data require disk accesses then the
performance will be dismal. (Some exceptions for data only accessed
sequentially.)

> Its main selling point is memcached-like speed,

Incidentally MongoDB claims the same thing :-) The places I use memcached
are where I do not want disk touched.

> ... but it is a good fit ...

It looks like we are seeing what happened with the first generation of DVCS.
Relational representation is being changed to be less constrained/structured.
There are numerous "databases" with varying and overlapping sweet spots in
terms of querying, persistence, performance, distribution etc.
I expect we'll see similar shakeouts and end up with a small number of
strong products.

> Like SQLite itself, I tend to do all my virtual table modules in
> extremely vanilla C.

Brave :-) My personal preference is to do the initial development in Python
and then reimplement in C if needed for portability/performance reasons. The
Python development is a lot quicker and then acts as a test suite for the C
implementation.

I wonder how many of the other bindings for SQLite have bothered to
implement virtual tables, as that probably holds back usage of virtual
tables a lot. (A 'hello world' virtual table in Python/APSW is about half a
screenful of code. An example one I have that represents information about
files on disk is just under a screenful.)

> I happen to think virtual tables are one of the more powerful features
> of SQLite, but also one of the most under-utilized features.

Agreed. Unfortunately it does require that the underlying data be
representable in a relational manner, which is also very constraining.

> Since a big part of writing these is to get them out for other people
> to use them,

Where do you publish them? It is probably also worth trying to encourage a
'contrib' location for SQLite that is more active and in wider use than
http://www.sqlite.org/contrib

> Working in C avoids adding complexity, like someone working
> in Java wanting to use your MongoDB module. I suppose it could be
> done, but I wouldn't want to be the one trying to make it all work.

MongoDB is client server, so this issue does not arise. (Nor do they have
virtual tables.) In order to perform "programming" on the server side, such
as for map/reduce, you have to use Javascript, which is slowly becoming the
most popular language for that kind of thing, including on the desktop.
(Eg see node.js and Seed.)
Roger
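Roger's files-on-disk virtual table is written against APSW, which isn't shown here. The stdlib sqlite3 module can't register virtual table modules at all, so as a rough stand-in, this sketch materializes the same kind of information (file name, size, mtime) into an ordinary table. The table and function names are invented; a real virtual table would compute these rows on demand at query time instead of snapshotting them:

```python
import os
import sqlite3

def load_dir_table(db, path):
    """Snapshot file metadata for 'path' into an ordinary table 'dirents'.

    This is the materialized equivalent of a virtual table over the
    filesystem: with a real vtab module the rows would be produced
    lazily during the query instead of being copied in up front.
    """
    db.execute("CREATE TABLE IF NOT EXISTS "
               "dirents(name TEXT, size INT, mtime REAL)")
    db.execute("DELETE FROM dirents")
    with os.scandir(path) as it:
        rows = [(e.name, e.stat().st_size, e.stat().st_mtime)
                for e in it if e.is_file()]
    db.executemany("INSERT INTO dirents VALUES (?, ?, ?)", rows)

db = sqlite3.connect(":memory:")
load_dir_table(db, ".")
# Ordinary SQL now works over filesystem metadata:
for name, size in db.execute(
        "SELECT name, size FROM dirents ORDER BY size DESC LIMIT 3"):
    print(name, size)
```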
Re: [sqlite] Scrolling through results of select
On 21/02/2011, at 12:41 PM, Roger Binns wrote:

>> How can I best scroll though the results of an arbitrary select query?
>
> If you have infinite time and memory then there is no problem.

Memory and processor are limited (typically iPad/iPhone). Disk space would
cope with creating temporary tables if necessary. In general, I am trying to
cater for a table or view (ie arbitrary select), the results of which would
not fit entirely into memory.

> You haven't stated your constraints or how arbitrary "arbitrary" really is.

By arbitrary, I mean that the user can type any select query that SQLite
allows (or have a view in their schema), which might make use of order by,
group by, where, union etc.

>> 1. Is there any significant overhead on SQLite from my selecting from a
>> view representing the original arbitrary select? That is, will SQLite
>> still use any indexes etc correctly? Or do I need to dissect/parse the
>> original select statement, changing the where statement etc?
>
> You can work out the answer yourself by using EXPLAIN and EXPLAIN QUERY
> PLAN of some representative examples.

Yes, but I wondered if there was some overriding logic that SQLite uses that
would provide a theoretical/logical rather than experimental guide.

>> 2. If the arbitrary select statement already contains an "order by"
>> clause, then I obviously need to use the order by column(s) specified
>> there as the keyColumn for the scrolling, rather than override the order
>> by. Is there any way to get the sort order of an arbitrary select
>> statement?
>
> Results are returned in the order requested or randomly(*) if not. Given
> you can have subqueries with ordering, collations and all sorts of other
> things, trying to extract the actual ordering is as difficult as
> implementing the SQLite engine itself.
>
> (*) In practise it is in btree iteration order but that is not something
> you should depend on.
Yes, I understand that the sort order of results cannot be counted on as
consistent if no order by clause is given. But if I am imposing a sort order
(eg by rowid) I want to match the undefined sort order as closely as
possible, so the results look the same. Is sorting by rowid in a table as
close as I could get to this? What order by sequence could I best use to
match the results of a select from joined tables?

> You can even "ORDER BY random()".

Hmm, good point. I guess in that case (are there other cases?) I can't count
on the results being the same from one select to the next, so preparing the
statement, extracting some rows, closing, then preparing and extracting
again when the user scrolls won't work, since the results will change. If
there is a random() component then I would have to leave the query/prepare
open, denying all other access to that database file, until there will
definitely be no more scrolling. Correct?

> The rest of your questions assume a particular solution. The only thing
> that will reliably work is to reissue the query using skip and limit

By "skip" do you mean select where rowid > , or offset or something else?

> assuming no changes in between.

Yes, I can assume no changes in between (though the random() possibility
above will make this approach fail, I think).

> This is if you are trying to save memory/disk and there is no possibility
> of changes between scrolling operations.

Yes.

> If you need to be resilient to that too (implied by "arbitrary" since user
> defined functions could have side effects)

Hmm, true. I hadn't thought of user defined function side effects. I don't
have to allow for that at the moment, but I'll keep it in mind. Is it common
or even good practice for a user function (used in a select statement) to
modify the table from which it's selecting? That seems like bad practice to
me, and I can't see why you'd do that rather than use update, insert or
delete to make changes.
> then the solution is to 'CREATE TEMP TABLE results AS ...select...'. This
> will also work if someone uses "ORDER BY random()" or any other udf that
> depends on more than its arguments.

Hmm, good thinking. I'll consider that. The downside is that creating a
temporary table would require SQLite to process every row in the select,
whereas prepare/step only processes the rows as they are shown. This would
make a big difference for very large data sets or for a view/select
containing a complex calculation for each row.

Thanks for taking the time to explore some possibilities for me.

Tom
BareFeetWare

--
Comparison of SQLite GUI tools:
http://www.barefeetware.com/sqlite/compare/?ml
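The two pagination strategies debated in this thread, skip/limit versus seeking on a unique key, can be contrasted in a few lines of Python over stdlib sqlite3. The schema and page size here are invented; the OR-expanded keyset predicate is used because SQLite of this era predates row-value comparisons (those arrived much later, in 3.15):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE people(id INTEGER PRIMARY KEY, name TEXT)")
db.executemany("INSERT INTO people(name) VALUES (?)",
               [("row%03d" % i,) for i in range(100)])

PAGE = 10

# Option 1: skip/limit. Works for ANY query (no rowid needed), but the
# engine still generates and discards the skipped rows, so page N costs
# O(N * PAGE).
page2 = db.execute(
    "SELECT name FROM people ORDER BY name LIMIT ? OFFSET ?",
    (PAGE, PAGE)).fetchall()

# Option 2: keyset pagination on (name, id). Requires a deterministic
# unique ordering, but each page is a cheap indexed seek from the last
# row shown (here: the last row of page 1).
last_name, last_id = "row009", 10
page2_keyset = db.execute(
    """SELECT name FROM people
       WHERE name > ? OR (name = ? AND id > ?)
       ORDER BY name, id LIMIT ?""",
    (last_name, last_name, last_id, PAGE)).fetchall()

assert page2 == page2_keyset   # same page, very different costs at scale
```

Note that option 2 is exactly why the thread keeps returning to "keyColumn plus rowid": appending rowid makes any key deterministic and unique, so no rows are skipped or repeated between pages.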
Re: [sqlite] Scrolling through results of select
On 21/02/2011, at 12:41 PM, Simon Slavin wrote:

>> How can I best scroll though the results of an arbitrary select query?
>
> Suppose the results of the SELECT change between your original decision to
> do the scrolling and the time the user decides to scroll. Should what's
> shown on the display reflect the data as it originally was, or up-to-date
> data? Or could there never be any such changes?

Thanks for the reply Simon. Good question. In most or all cases for now, the
select results will not change mid scroll. The user interface will be
clearly either scrolling or editing. In the longer term I will probably
allow editing mid scroll, but that's fine, since I can flag within my app
that results need to be re-selected.

>> 1. Is there any significant overhead on SQLite from my selecting from a
>> view representing the original arbitrary select? That is, will SQLite
>> still use any indexes etc correctly? Or do I need to dissect/parse the
>> original select statement, changing the where statement etc?
>
> A VIEW is a saved SELECT query -- the query, not the results, are saved.
> So I think you don't need to make the extra effort you describe.

Yes, I understand that a view doesn't save any results. I'm unclear,
however, as to how smart SQLite's query optimizer is: for instance, will it
realise that when it's selecting from a view, it can use the indexes of
source columns for the where filter?

>> 2. If the arbitrary select statement already contains an "order by"
>> clause, then I obviously need to use the order by column(s) specified
>> there as the keyColumn for the scrolling, rather than override the order
>> by. Is there any way to get the sort order of an arbitrary select
>> statement?
>
> I don't know of any.

Hmm, OK. I guess I'll need to impose a sort order on otherwise unordered
query results. For a table, I'll just sort by rowid. For a view, I'll
probably just look for any indexed column.

>> 3.
>> This method requires that keyColumn is defined as unique (or primary
>> key), otherwise it can skip rows of data. Is there any way to allow for a
>> non-unique keyColumn?
>
> No, but instead of using just keyColumn you could use (keyColumn, rowid).
> This would ensure your key was always unique, and will work on arbitrary
> SQLite tables unless someone is intentionally messing with how SQLite
> works.

Yes, I was thinking along those lines (ie order by keyColumn, rowid). That
will work for tables. I'll have to adapt it somehow for views.

>> 5. I understand that "Rule Number 1" is to "not leave queries open".
>
> Correct. Don't allow a user to create and close a query just by choosing
> when they want to scroll through a list.
>
>> So what's the best way to minimize the overhead of repeatedly running the
>> same query but with a different where clause and limit (and order if
>> reversing)? I'm thinking I would be best to actually keep the query (ie
>> prepared statement) open while live scrolling (eg flicking through rows
>> on an iPhone/iPad), not using a limit clause at all, but instead just
>> keep getting more rows as needed to fill the scrolling, until the user
>> stops scrolling, then finalize, close etc. When they begin scrolling
>> again, fire up a new prepare (with a new maxVisibleKeyValue).
>
> To get the following or previous line to one which is already being shown,
> find the key for that row (which you should save in memory as you're
> displaying the line) and use
>
> SELECT ... FROM ... WHERE (keyColumn||rowid) > lastlineKey
> ORDER BY keyColumn, rowid LIMIT 1
>
> to get the following line, or
>
> SELECT ... FROM ... WHERE (keyColumn||rowid) < lastlineKey
> ORDER BY keyColumn, rowid DESC LIMIT 1
>
> to get the preceding line.

Yes, that's where I was headed, but I wondered if running that query
continuously might have unnecessary overhead, since it's constantly
preparing a new query for each row.
I wondered if I'd be better just having

SELECT ... FROM ... WHERE (keyColumn||rowid) > lastlineKey
ORDER BY keyColumn, rowid

ie with no limit, leaving the query open and getting new rows as the scroll
progresses, finally closing the query when the scrolling stops. Obviously I
would need to guarantee that no changes are attempted on the database while
the query is open and the rows are scrolling.

Thanks for your thoughts,

Tom
BareFeetWare

--
Comparison of SQLite GUI tools:
http://www.barefeetware.com/sqlite/compare/?ml
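The "leave the statement open and pull rows as the user scrolls" approach Tom describes maps directly onto stepping a prepared statement. In Python's stdlib sqlite3 a cursor does exactly that: rows are produced lazily as you fetch them, and fetchmany() pulls one screenful per call. The screen size, schema, and the user_still_scrolling stub below are all invented stand-ins for the real UI:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE t(x)")
db.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])

SCREEN = 25

# Stand-in for the real UI event source: user scrolls twice, then stops.
scroll_events = iter([True, True, False])
def user_still_scrolling():
    return next(scroll_events)

cur = db.execute("SELECT x FROM t ORDER BY x")  # prepared, stepped lazily
shown = cur.fetchmany(SCREEN)                   # first screenful
while shown and user_still_scrolling():
    shown = cur.fetchmany(SCREEN)               # next screenful on demand
cur.close()  # "Rule Number 1": close the query as soon as scrolling stops
```

Only the fetched rows are ever computed; the remaining ~900 rows are never materialized, which is the advantage over the CREATE TEMP TABLE snapshot when result sets are huge.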
Re: [sqlite] SQLite GUI comparison
On 22/02/2011, at 4:31 AM, skywind mailing lists wrote:

> "Supports SQLite extension" would be an accurate feature description. And
> in the cell (value) I suggest to put - if supported - FTS2, FTS3, RTree
> etc., otherwise a "-". A yes or no is insufficient because some support
> RTree but not FTS and vice versa.

OK, that sounds good. I'll probably use "no" or "none" if no extension is
supported. Can anyone please tell me what should go in this cell for any
SQLite GUI app they know (ie either FTS2, FTS3, RTree or none)? Or how can
I easily test this feature?

Thanks,
Tom
BareFeetWare

--
Comparison of SQLite GUI tools:
http://www.barefeetware.com/sqlite/compare/?ml
Re: [sqlite] sqlite3_busy_handler
On 21 Feb 2011, at 9:46pm, Frank Chang wrote:

> I was wondering why other selects who need to only read from a sqlite
> database need to use the sqlite3_busy_handlers. Is it because database
> connections which are writing to the sqlite database have a higher
> priority than database connections which are reading from the sqlite
> database?

Correct. You cannot read from a database while something is making changes
in it. You might get part of a batch of changes. For instance, you might get
a new transaction, but not the updated account balance.

Simon.
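The situation Simon and Michael describe is easy to reproduce with two connections to the same (rollback-journal mode) database file: while a writer holds an exclusive transaction, a reader with no busy handler installed (timeout 0) gets SQLITE_BUSY immediately rather than waiting. This is a sketch with an invented throwaway file path; BEGIN EXCLUSIVE is used so the writer's lock actually blocks readers, which a plain BEGIN would not do until the commit:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

writer = sqlite3.connect(path, isolation_level=None)  # autocommit mode
writer.execute("CREATE TABLE t(x)")

reader = sqlite3.connect(path, timeout=0)  # timeout=0: no busy waiting

writer.execute("BEGIN EXCLUSIVE")          # take the exclusive lock
writer.execute("INSERT INTO t VALUES (1)")

got_busy = False
try:
    reader.execute("SELECT COUNT(*) FROM t").fetchone()
except sqlite3.OperationalError:
    got_busy = True                        # SQLITE_BUSY: "database is locked"

writer.execute("COMMIT")                   # writer done; readers proceed
print(reader.execute("SELECT COUNT(*) FROM t").fetchone()[0])
```

With a nonzero timeout (the Python-level equivalent of installing a busy handler), the reader would instead retry for up to that many seconds before giving up.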
Re: [sqlite] EXT :Re: sqlite3_busy_handler
If I'm not mistaken, only WAL mode supports simultaneous read/write. For any
other mode, any write function will lock the database. So... selects may run
into a need to wait until a write finishes.

Michael D. Black
Senior Scientist
NG Information Systems
Advanced Analytics Directorate

From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on
behalf of Frank Chang [frank_chan...@hotmail.com]
Sent: Monday, February 21, 2011 3:46 PM
To: sqlite-users@sqlite.org
Subject: EXT :Re: [sqlite] sqlite3_busy_handler

Michael D. Black, I will try different batch sizes so that other processes
can do their thing. I was wondering why other selects who need to only read
from a sqlite database need to use the sqlite3_busy_handlers. Is it because
database connections which are writing to the sqlite database have a higher
priority than database connections which are reading from the sqlite
database? Thank you for your suggestions.
Re: [sqlite] sqlite3_busy_handler
Michael D. Black, I will try different batch sizes so that other processes
can do their thing. I was wondering why other selects who need to only read
from a sqlite database need to use the sqlite3_busy_handlers. Is it because
database connections which are writing to the sqlite database have a higher
priority than database connections which are reading from the sqlite
database? Thank you for your suggestions.
Re: [sqlite] SQLite server using execnet ?
On Feb 21, 2011, at 9:37 PM, Jay A. Kreibich wrote:

> I was once forced to look at SOAP over SMTP

Ah, yes... double S!

The S stands for Simple:
http://wanderingbarque.com/nonintersecting/2006/11/15/the-s-stands-for-simple/
Re: [sqlite] SQLite server using execnet ?
On Sun, Feb 20, 2011 at 08:37:04PM -0800, Roger Binns scratched on the wall:

> On 02/20/2011 06:23 PM, Jay A. Kreibich wrote:
> > On Sun, Feb 20, 2011 at 05:23:09PM -0800, Roger Binns scratched on the
> > wall:
> >> If you want to use SQL then use Postfix.
> >
> > I might suggest PostgreSQL instead.
> > (Sorry, Roger, I couldn't resist.)
>
> Yeah, long night :-) However, technically SQL over SMTP is possible and
> would actually work. And if anyone is insane enough to try that then
> using Postfix and Postgres are a good combination.

I was once forced to look at SOAP over SMTP. Thankfully we were able to talk
them out of it. As if SOAP wasn't bad enough, I sure didn't want to mix it
with SMTP. "Any networking protocol with 'Simple' in the name isn't."

> >> If you need lots of processes on the network to access data quickly
> >> then consider memcached.
> >
> > More seriously, in this category you might also consider Redis.
> > Redis allows your data to have some structure,
>
> The Python binding pylibmc does structure the data for you automagically.

Yes, but in something like memcached, the database is not aware of that
structure, and can't take advantage of it.

When storing serialized objects, it is all too common to see code that
fetches an object, un-marshals it, alters some simple value, re-marshals the
whole object, and then writes the whole thing back. While this is fine for
complex or less frequently accessed objects, where the ease of use outweighs
the overhead, it seems like a lot of work for simpler items, such as lists
and basic dictionaries or hashes. Of course, these types of objects are also
the most frequently modified data-types in many applications. The best way
to do key lists in memcached is practically a religious topic. Having the
database aware of a few very simple and primitive structures can provide
significant improvements for these basic, most common operations.
For example, Redis can append an element to a list with a single O(1)
database operation that is independent of list size, and the only payload
data that crosses the network is the list key and the value to insert. It
has a few other tricks, like atomically incrementing counter values.

> > plus it has the ability to persist the data to disk.
>
> The moment you talk about persistence you then have significant overlap
> with databases. My personal favourite is MongoDB but there are loads of
> others such as Cassandra, HBase, Tokyo Cabinet etc.

Yes and no. Redis, like memcached, is essentially an always in-memory
key/value store. Functionally, it is much more like memcached than most of
these other examples. Its main selling point is memcached-like speed,
without the "cache" aspects of memcached -- your data stays there until you
get rid of it, even across restarts. In many cases, that also allows you to
get rid of your backing database.

Redis also provides just enough internal structure to be useful, without
really getting in the way, in much the same way that a very basic container
library provides building-block tools without defining a whole class tree.
Those features definitely bring it closer into the domain of NoSQL
databases, but I'd argue that's only because memcached is so far removed
from the rest of these. Thanks to its caching nature, it can't really be
considered a "database" at all (useful tool, yes; database, not really).

Each of these products has its place. I don't mean to sound like such a
Redis fan-boy, but I've been messing around with it a lot lately, and found
it to be both extremely simple to set up and configure (something most of
these other products cannot claim), while also extremely useful. You're not
going to use Redis to replace something like Cassandra or HBase, but it is
a good fit for situations where memcached is a good fit, but you want more
structure and a known life-cycle to your data.
> What programming language are you using to implement the virtual tables?

Like SQLite itself, I tend to do all my virtual table modules in extremely
vanilla C.

I happen to think virtual tables are one of the more powerful features of
SQLite, but also one of the most under-utilized features. As a way to relax
and explore, I sometimes write virtual table modules to bolt together odd
data stores or storage formats. Partly this is just for fun, but the process
also helps develop a deep understanding of the native data model used by
these different products, and that knowledge is useful for other work I do.
If it opens some eyes and helps promote virtual tables, and SQLite in
general, that's also a great bonus.

As such, much of the virtual table development work I do is somewhat
isolated, outside of any environment or problem context. My approach tends
to be extremely general-purpose, since I'm often approaching the problem
from a very
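The fetch/un-marshal/alter/re-marshal round trip Jay criticizes earlier in this message can be sketched with a plain dict standing in for a memcached-style client and stdlib pickle as the serializer. All key names are invented. The point: appending one element pays for serializing and shipping the entire object both ways, which is exactly the per-operation cost a structure-aware store like Redis avoids with its O(1) list push:

```python
import pickle

store = {}  # stand-in for an opaque key/value store such as memcached

# Seed the store with a serialized list under one key.
store["user:1:events"] = pickle.dumps(["login"])

def append_event(key, event):
    """The common anti-pattern: full round trip for a one-element change."""
    events = pickle.loads(store[key])   # fetch + un-marshal the whole object
    events.append(event)                # the one small change we wanted
    store[key] = pickle.dumps(events)   # re-marshal + rewrite the whole object

append_event("user:1:events", "click")
append_event("user:1:events", "logout")
print(pickle.loads(store["user:1:events"]))  # -> ['login', 'click', 'logout']
```

With a real networked store, both the loads and the dumps payloads also cross the wire, so the cost grows with the object even though the logical change is constant-sized.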
Re: [sqlite] process monitor: wal file access
> I find especially the *-wal access attempt pretty strange as we do not
> have WAL enabled for our database: it is set to the default journal mode
> (DELETE). Is this normal behaviour? Or is this influenced by some setting
> I don't know of?

I think when SQLite opens the database it can't trust its contents right
away, so it can't know yet that it's in DELETE journal mode. SQLite must
check for the presence of both files, *-wal and *-journal. If neither of
them exists (hence the failed QueryOpen events), then the database should be
in a complete and trusted state, so SQLite can read its contents and see
what journal mode is in it. That's what you see, and it can't be changed.

Pavel

On Mon, Feb 21, 2011 at 9:15 AM, Gert Corthout wrote:
>
> hello,
>
> we recently switched from SQLite version 3.5.4 to 3.7.5. When I run
> process monitor on our machines I see a lot of failed file access
> attempts that weren't there before (v 3.5.4). It is failed QueryOpen
> (NAME NOT FOUND) events for DBASE-NAME.db3-wal and DBASE-NAME.db3-journal.
>
> I find especially the *-wal access attempt pretty strange as we do not
> have WAL enabled for our database: it is set to the default journal mode
> (DELETE). Is this normal behaviour? Or is this influenced by some setting
> I don't know of?
>
> kind regards,
> Gert
Re: [sqlite] SQLite GUI comparison (was: ANN: Base 2.0, Mac SQLite GUI)
Hi Tom,

"Supports SQLite extension" would be an accurate feature description. And in
the cell (value) I suggest to put - if supported - FTS2, FTS3, RTree etc.,
otherwise a "-". A yes or no is insufficient because some support RTree but
not FTS and vice versa. Alternatively you may have a row for each type of
extension, but I find this a bit too much.

Hartwig

On 20.02.2011 at 21:48, BareFeetWare wrote:

> On 21/02/2011, at 3:20 AM, skywind mailing lists wrote:
>
>> in your comparison chart it would also be nice to see which software is
>> able to support SQLite extensions. A couple of them do not support the
>> FTS nor RTree capabilities of SQLite.
>
> Sure, I'd be happy to add that. How do you suggest that the feature is
> worded in the table? Is "Supports SQLite extension" accurate? Please let
> me know what value (eg yes or no) I should show for this feature for any
> apps you know so I can add those entries.
>
> Thanks,
> Tom
> BareFeetWare
>
> --
> Comparison of SQLite GUI tools:
> http://www.barefeetware.com/sqlite/compare/?ml
Re: [sqlite] update of a blob
On Mon, 2011-02-21 at 15:13 +0100, Dietmar Hummel wrote:

> std::string strStatement( "UPDATE persistence SET name=\"blub\",expiration=\"2011-04-02\",value=\"?\" WHERE id=\"1\"" );

In addition to what Igor said, it isn't really proper (standard) SQL to put
double quotes around value literals, because those should be reserved for
identifiers (e.g. schema, column or table names). I know that MS-Access (and
probably SQL Server) allows it; SQLite accepts it too, as a compatibility
fallback, but other databases won't -- you need to enclose them in single
quotes (but only if the value is a string literal, or a date-time value
formatted as a string). With some RDBMSs the character used to enclose
identifiers is optional or configurable, e.g. the backtick character (`)
used by MySQL.

Bob
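Both points, single quotes for string literals and keeping the ? placeholder outside any quotes, come together in a corrected form of the OP's UPDATE. A quoted "?" is just a literal string; a bare ? is a bindable parameter, which is also the only safe way to get a blob into the statement. This sketch uses Python instead of the OP's C++, with the persistence schema reconstructed from the quoted statement:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE persistence("
           "id INTEGER PRIMARY KEY, name TEXT, expiration TEXT, value BLOB)")
db.execute("INSERT INTO persistence(id) VALUES (1)")

blob = b"\x00\x01binary payload\xff"

# String literals take single quotes; the ? must be unquoted to act as a
# parameter (a quoted '?' would store the literal one-character string).
db.execute(
    "UPDATE persistence SET name='blub', expiration='2011-04-02', value=? "
    "WHERE id=1",
    (blob,))

print(db.execute("SELECT value FROM persistence WHERE id=1").fetchone()[0])
```

The same applies in the C API: bind the blob with sqlite3_bind_blob against an unquoted ? rather than splicing it into the SQL text.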
Re: [sqlite] COUNT() extremely slow first time!
Thanks :) This did the trick:

First make a copy of the database:
copy HugeDatabase.db HugeDatabase_copy.db

Then for each run, replace the database with its copy.

This is why I thought the COUNT operation was somehow written to the
database after its first run... :P

> From: slav...@bigfraud.org
> Date: Mon, 21 Feb 2011 16:56:01 +
> To: sqlite-users@sqlite.org
> Subject: Re: [sqlite] COUNT() extremely slow first time!
>
> On 21 Feb 2011, at 4:35pm, Jim Wilcoxson wrote:
>
> > On Mon, Feb 21, 2011 at 11:05 AM, Sven L wrote:
> >>
> >> Thank you for your detailed explanation!
> >> First, can you please tell me how to purge the cache in Windows 7?
> >> This could be very useful for my tests!
> >
> > Sorry, dunno for Windows. On Mac OSX it is the purge command, in the
> > development tools.
>
> In current versions it's just /usr/bin/purge, which should be in your
> path. No idea why a non-programmer should need it, but there it is.
>
> > On Linux, you do: echo 3 > /proc/sys/vm/drop_caches
>
> And in Windows it's ... almost impossible. You can sync to disk, using
> fflush(), and there's no reason you can't do this from the command-line:
>
> http://technet.microsoft.com/en-us/sysinternals/bb897438.aspx
>
> But for a purge you have to invalidate the contents of cache, and there's
> no way for an application to tell Windows to do this. For a start, each
> application has its own user mode address cache, so if you run another
> utility to flush cache, it just messes with its own space. Second,
> there's no system call that allows you access to the cache because
> Microsoft considers it private to the device level. I am not dissing
> Microsoft for this: there are fair arguments that this is actually the
> Right Way to do it.
>
> So the way to do it is to overrun the cache by yourself. If you know your
> cache is 2Gig, find a 2Gig disk file that has nothing to do with your
> test suite and read it. Or generate 2Gig of gibberish and write it to
> disk, then delete that file.
Pah. > > Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] COUNT() extremely slow first time!
On 21 Feb 2011, at 4:35pm, Jim Wilcoxson wrote: > On Mon, Feb 21, 2011 at 11:05 AM, Sven L wrote: >> >> Thank you for your detailed explanation! >> First, can you please tell me how to purge the cache in Windows 7? This >> could be very useful for my tests! > > Sorry, dunno for Windows. On Mac OSX it is the purge command, in the > development tools. In current versions it's just /usr/bin/purge, which should be in your path. No idea why a non-programmer should need it, but there it is. > On Linux, you do: echo 3 > > /proc/sys/vm/drop_caches And in Windows it's ... almost impossible. You can sync to disk, using fflush(), and there's no reason you can't do this from the command-line: http://technet.microsoft.com/en-us/sysinternals/bb897438.aspx But for a purge you have to invalidate the contents of cache, and there's no way for an application to tell Windows to do this. For a start, each application has its own user mode address cache, so if you run another utility to flush cache, it just messes with its own space. Second, there's no system call that allows you access to the cache because Microsoft considers it private to the device level. I am not dissing Microsoft for this: there are fair arguments that this is actually the Right Way to do it. So the way to do it is to overrun the cache by yourself. If you know your cache is 2Gig, find a 2Gig disk file that has nothing to do with your test suite and read it. Or generate 2Gig of gibberish and write it to disk, then delete that file. Pah. Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] COUNT() extremely slow first time!
On Mon, Feb 21, 2011 at 11:05 AM, Sven L wrote: > > Thank you for your detailed explanation! > First, can you please tell me how to purge the cache in Windows 7? This could > be very useful for my tests! Sorry, dunno for Windows. On Mac OSX it is the purge command, in the development tools. On Linux, you do: echo 3 > /proc/sys/vm/drop_caches > > I'm quite sure my database itself is not fragmented, since I have only > inserted data. If you insert the data in primary key order, or you don't have any key (SQLite generates one) it's probably not very fragmented. But keep in mind that while to you, things are happening sequentially as you add records, inside SQLite, things are happening quite differently. For example, let's say you have 4 indexes on this database. So for every insert you do, there are 5 database pages being affected (one for the data row, one for each index). These pages will fill up at different rates, depending on the size of your keys, and will be written to disk at different times. So you are very likely to have data pages and various index pages all intermixed in your SQLite file. Also with multiple indexes, it's unlikely that every index will be in sorted order, based on the records you are inserting. So IMO, there's no practical way to avoid fragmentation within the SQLite file. > The file system is in good shape too; Windows reports 0% fragmentation. > Perhaps there is some other bottleneck, like disk performance in general > (this is a 5400 rpm drive)? Or antivirus-related CPU holds? (I did turn real-time > scanning off though.) > I have even turned Windows Search off (and got a 20% performance gain!). My > 32-bit application is running under Windows 7 (64-bit). Could WOW64 have > something to do with this performance issue? > > The size of the database is 1.98 GB, with 1.5M rows. Couldn't it be that the > table scan simply is darn slow for huge tables? How much free RAM do you have?
Windows being the pig that it is, I'm guessing not much, unless your system has > 4GB of RAM. For comparison, you could write a program that just read through the file, 4K at a time (I think that's the default page size for SQLite on Windows). Starting with a cold cache, this is near the best time you will ever see for count(*) in SQLite. > > In an ideal world the COUNT() would be performed on the primary key in RAM. > That's not possible? :P Sure, if all of the primary key records are in either the SQLite cache or OS cache. If they aren't, you have seek time. I just did a small test to compare count(*) with count(primary key). Here is count(*):

sqlite> explain select count(*) from logs;
addr  opcode        p1  p2  p3  p4                 p5  comment
----  ------------  --  --  --  -----------------  --  -------
0     Trace         0   0   0                      00
1     Goto          0   8   0                      00
2     OpenRead      1   12  0   keyinfo(1,BINARY)  00
3     Count         1   1   0                      00
4     Close         1   0   0                      00
5     SCopy         1   2   0                      00
6     ResultRow     2   1   0                      00
7     Halt          0   0   0                      00
8     Transaction   0   0   0                      00
9     VerifyCookie  0   23  0                      00
10    TableLock     0   8   0   logs               00
11    Goto          0   2   0                      00

Here is count(primary key):

sqlite> explain select count(logid) from logs;
addr  opcode        p1  p2  p3  p4        p5  comment
----  ------------  --  --  --  --------  --  -------
0     Trace         0   0   0             00
1     Null          0   2   0             00
2     Null          0   1   0             00
3     Goto          0   14  0             00
4     OpenRead      0   8   0   0         00
5     Rewind        0   9   0             00
6     Rowid         0   3   0             00
7     AggStep       0   3   1   count(1)  01
8     Next          0   6   0             01
9     Close         0   0   0             00
10    AggFinal      1   1   0   count(1)  00
11    SCopy         1   4   0             00
12    ResultRow     4   1   0             00
13    Halt          0   0   0             00
14    Transaction   0   0   0             00
15    VerifyCookie  0   23  0             00
16    TableLock     0   8   0   logs      00
17    Goto          0   4   0             00

It's not clear to me whether for count(*), SQLite is reading each data page or reading the primary key index pages. But when timed, count(*) is 16x faster than count(logid) the first time, and 8x faster once all pages are cached. This is just
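[Editorial note] Jim's suggestion above — reading through the file in page-sized chunks to approximate the best-case cold-cache time for a full scan — can be sketched roughly as follows. This is a hypothetical standalone script, not from the thread; the 4096-byte chunk size is the page size Jim mentions, and the scratch file merely stands in for the database file:

```python
import os
import tempfile

def scan_pages(path, page_size=4096):
    """Read a file sequentially in page-size chunks, approximating the
    I/O a cold-cache full table scan has to perform."""
    pages = 0
    with open(path, "rb") as f:
        while f.read(page_size):
            pages += 1
    return pages

# Demo on a small scratch file standing in for the database file.
fd, path = tempfile.mkstemp()
os.write(fd, b"\0" * (4096 * 3 + 100))  # three full pages plus a partial one
os.close(fd)
n_pages = scan_pages(path)
os.remove(path)
print(n_pages)  # 4
```

Timing this loop over the real database file (with the OS cache purged first) gives a floor against which the count(*) time can be compared.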
Re: [sqlite] COUNT() extremely slow first time!
On Mon, Feb 21, 2011 at 03:37:50PM +0100, Sven L scratched on the wall: > I've learnt that COUNT(*) is slower than COUNT(ID), That's usually not true. > since * means the engine has to traverse all columns Actually, count(*) is the one case when the engine does *not* need to traverse any columns. > and it might even return another value if there are NULL-values... Yes, but that's how count() is defined to work. The expression "count(id)" only counts rows where id is not NULL. This requires that the database engine retrieve the value of the id column from each row in order to test for NULL. The expression "count(*)" strictly counts rows, without concern over NULLs. As such, there is no need to actually retrieve any row data, because there is no need to test any row values. The count(*) function can scan the table tree, counting the number of rows, without actually loading or decoding the row data. -j -- Jay A. Kreibich < J A Y @ K R E I B I.C H > "Intelligence is like underwear: it is important that you have it, but showing it to the wrong people has the tendency to make them feel uncomfortable." -- Angela Johnson ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
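[Editorial note] Jay's point about NULL handling is easy to verify from any SQLite binding; a quick illustrative sketch (the table and values are made up for the demo):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (id INTEGER)")
con.executemany("INSERT INTO t (id) VALUES (?)", [(1,), (None,), (3,)])

count_star = con.execute("SELECT COUNT(*) FROM t").fetchone()[0]
count_id = con.execute("SELECT COUNT(id) FROM t").fetchone()[0]

print(count_star)  # 3 -- counts rows, NULLs included
print(count_id)    # 2 -- rows with a NULL id are skipped
```

The two results differ exactly when the counted column contains NULLs, which is why count(*) never needs to decode row data while count(id) must.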
Re: [sqlite] COUNT() extremely slow first time!
On Mon, Feb 21, 2011 at 10:38 AM, Puneet Kishor wrote: > On Mon, Feb 21, 2011 at 10:17:03AM -0500, Jim Wilcoxson wrote: >> This is a common issue on the mailing list. The first time you do >> count(*), SQLite (actually your OS) has to load data into memory. >> Most OS's will keep the file in a buffer cache, so the 2nd count(*) >> doesn't have to read from disk. > > > One question I have related to the above -- how long does that buffer > cache remain filled with the data? I am assuming it is until new stuff > to be cached pushes out old stuff, no? For most OS's, the time data remains in the cache and the size of the cache will vary as a function of available RAM. > > I was doing some R*Tree selects, and the first query was dog slow, > although benchmarking showed that the actual CPU time was very small. > Subsequent queries were lightning fast. I am assuming that the buffer is > not getting filled with the results as much as it is getting filled with > whatever part of the db that the program needs to open to do its work. Right. SQLite doesn't cache query results. It does cache database pages in its own cache, which by default is rather small: 2000 pages. At the default page size of 1K, that's a 2MB cache. To fetch records not in its own cache, SQLite will use the OS. If the page is in the OS cache, there is no seek time and no read latency (for spinning media).
Jim > > >> >> Here's a timing from my own system, after a purge command to clear the >> buffer cache: >> >> $ time sqlite3 hb.db 'select count(*) from logs' >> -- Loading resources from /Users/jim/.sqliterc >> count(*) >> -- >> 734909 >> >> real 0m0.580s >> user 0m0.190s >> sys 0m0.034s >> >> Same command again, with the file cached: >> >> $ time sqlite3 hb.db 'select count(*) from logs' >> -- Loading resources from /Users/jim/.sqliterc >> count(*) >> -- >> 734909 >> >> real 0m0.189s >> user 0m0.165s >> sys 0m0.019s >> >> This time is consistent no matter how many times I run it, because the >> file is still cached. Doing a purge command to clear the cache and >> re-running the query, we get: >> >> $ purge >> $ time sqlite3 hb.db 'select count(*) from logs' >> -- Loading resources from /Users/jim/.sqliterc >> count(*) >> -- >> 734909 >> >> real 0m0.427s >> user 0m0.175s >> sys 0m0.024s >> >> On my system, there is not a huge difference, but it is consistent. >> Now, if you have a fragmented file system, you will see a much larger >> difference. There are many posts on the mailing list about both file >> system fragmentation and logical fragmentation within the SQLite file >> itself. Your first count(*) is subject to these fragmentation >> effects, while your 2nd usually is not, because the file is in memory. >> >> Some people on the list believe fragmentation is an unimportant detail >> you shouldn't worry about, because you have little control over it. >> That may be true, but it's useful to understand how it can affect >> performance. I think you are seeing this first hand. 
>> >> Jim >> -- >> HashBackup: easy onsite and offsite Unix backup >> http://www.hashbackup.com >> >> >> >> On Mon, Feb 21, 2011 at 9:37 AM, Sven L wrote: >> > >> > Same result :( >> > Note that I have compiled SQLite with the following switches: >> > SQLITE_ENABLE_STAT2 >> > SQLITE_THREADSAFE=2 >> > >> > I've learnt that COUNT(*) is slower than COUNT(ID), since * means the >> > engine has to traverse all columns and it might even return another value >> > if there are NULL-values... >> > >> > Also, this is quite interesting: >> > >> > sqlite> EXPLAIN QUERY PLAN SELECT COUNT(ItemID) FROM Items; >> > 0|0|0|SCAN TABLE Items (~100 rows) >> > sqlite> EXPLAIN QUERY PLAN SELECT COUNT(*) FROM Items; >> > sqlite> >> > >> > I would expect an index scan on the first statement. The second statement >> > tells me nada?! >> > >> > Thanks for your help! >> > >> > >> >> From: slav...@bigfraud.org >> >> Date: Mon, 21 Feb 2011 14:24:50 + >> >> To: sqlite-users@sqlite.org >> >> Subject: Re: [sqlite] COUNT() extremely slow first time! >> >> >> >> >> >> On 21 Feb 2011, at 2:23pm, Sven L wrote: >> >> >> >> > SELECT COUNT(ItemID) FROM Items; >> >> > >> >> > This takes around 40 seconds the first time! WHY?! >> >> >> >> Try again, doing everything identically except that instead of the above >> >> line use >> >> >> >> SELECT COUNT(*) FROM Items; >> >> >> >> Simon. >> >> ___ >> >> sqlite-users mailing list >> >> sqlite-users@sqlite.org >> >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >> > >> > ___ >> > sqlite-users mailing list >> > sqlite-users@sqlite.org >> > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users >> > >> ___ >> sqlite-users mailing list >> sqlite-users@sqlite.org >>
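[Editorial note] The page-cache figures Jim describes (2000 pages of 1 KB each, by default, in SQLite of that era) can be inspected and tuned from any binding; a hypothetical sketch (on newer builds PRAGMA cache_size reports a negative KB-based default, so only the value we set ourselves is checked):

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Page size in bytes for this database (1024 was the old default).
page_size = con.execute("PRAGMA page_size").fetchone()[0]

# Enlarge the page cache for this connection to 10000 pages.
con.execute("PRAGMA cache_size = 10000")
cache_pages = con.execute("PRAGMA cache_size").fetchone()[0]
print(page_size > 0, cache_pages)  # True 10000
```

A larger per-connection cache reduces trips to the OS, but the OS buffer cache still determines whether those trips hit disk.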
Re: [sqlite] COUNT() extremely slow first time!
Thank you for your detailed explanation! First, can you please tell me how to purge the cache in Windows 7? This could be very useful for my tests! I'm quite sure my database itself is not fragmented, since I have only inserted data. The file system is in good shape too; Windows reports 0% fragmentation. Perhaps there is some other bottleneck, like disk performance in general (this is a 5400 rpm drive)? Or antivirus-related CPU holds? (I did turn real-time scanning off though.) I have even turned Windows Search off (and got a 20% performance gain!). My 32-bit application is running under Windows 7 (64-bit). Could WOW64 have something to do with this performance issue? The size of the database is 1.98 GB, with 1.5M rows. Couldn't it be that the table scan simply is darn slow for huge tables? In an ideal world the COUNT() would be performed on the primary key in RAM. That's not possible? :P Thanks again! > Date: Mon, 21 Feb 2011 10:17:03 -0500 > From: pri...@gmail.com > To: sqlite-users@sqlite.org > Subject: Re: [sqlite] COUNT() extremely slow first time! > > This is a common issue on the mailing list. The first time you do > count(*), SQLite (actually your OS) has to load data into memory. > Most OS's will keep the file in a buffer cache, so the 2nd count(*) > doesn't have to read from disk. > > Here's a timing from my own system, after a purge command to clear the > buffer cache: > > $ time sqlite3 hb.db 'select count(*) from logs' > -- Loading resources from /Users/jim/.sqliterc > count(*) > -- > 734909 > > real 0m0.580s > user 0m0.190s > sys 0m0.034s > > Same command again, with the file cached: > > $ time sqlite3 hb.db 'select count(*) from logs' > -- Loading resources from /Users/jim/.sqliterc > count(*) > -- > 734909 > > real 0m0.189s > user 0m0.165s > sys 0m0.019s > > This time is consistent no matter how many times I run it, because the > file is still cached.
Doing a purge command to clear the cache and > re-running the query, we get: > > $ purge > $ time sqlite3 hb.db 'select count(*) from logs' > -- Loading resources from /Users/jim/.sqliterc > count(*) > -- > 734909 > > real 0m0.427s > user 0m0.175s > sys 0m0.024s > > On my system, there is not a huge difference, but it is consistent. > Now, if you have a fragmented file system, you will see a much larger > difference. There are many posts on the mailing list about both file > system fragmentation and logical fragmentation within the SQLite file > itself. Your first count(*) is subject to these fragmentation > effects, while your 2nd usually is not, because the file is in memory. > > Some people on the list believe fragmentation is an unimportant detail > you shouldn't worry about, because you have little control over it. > That may be true, but it's useful to understand how it can affect > performance. I think you are seeing this first hand. > > Jim > -- > HashBackup: easy onsite and offsite Unix backup > http://www.hashbackup.com > > > > On Mon, Feb 21, 2011 at 9:37 AM, Sven Lwrote: > > > > Same result :( > > Note that I have compiled SQLite with the following switches: > > SQLITE_ENABLE_STAT2 > > SQLITE_THREADSAFE=2 > > > > I've learnt that COUNT(*) is slower than COUNT(ID), since * means the > > engine has to traverse all columns and it might even return another value > > if there are NULL-values... > > > > Also, this is quite interesting: > > > > sqlite> EXPLAIN QUERY PLAN SELECT COUNT(ItemID) FROM Items; > > 0|0|0|SCAN TABLE Items (~100 rows) > > sqlite> EXPLAIN QUERY PLAN SELECT COUNT(*) FROM Items; > > sqlite> > > > > I would expect an index scan on the first statement. The second statement > > tells me nada?! > > > > Thanks for your help! > > > > > >> From: slav...@bigfraud.org > >> Date: Mon, 21 Feb 2011 14:24:50 + > >> To: sqlite-users@sqlite.org > >> Subject: Re: [sqlite] COUNT() extremely slow first time! 
> >> > >> > >> On 21 Feb 2011, at 2:23pm, Sven L wrote: > >> > >> > SELECT COUNT(ItemID) FROM Items; > >> > > >> > This takes around 40 seconds the first time! WHY?! > >> > >> Try again, doing everything identically except that instead of the above > >> line use > >> > >> SELECT COUNT(*) FROM Items; > >> > >> Simon. > >> ___ > >> sqlite-users mailing list > >> sqlite-users@sqlite.org > >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > > > ___ > > sqlite-users mailing list > > sqlite-users@sqlite.org > > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] COUNT() extremely slow first time!
On Mon, Feb 21, 2011 at 10:17:03AM -0500, Jim Wilcoxson wrote: > This is a common issue on the mailing list. The first time you do > count(*), SQLite (actually your OS) has to load data into memory. > Most OS's will keep the file in a buffer cache, so the 2nd count(*) > doesn't have to read from disk. One question I have related to the above -- how long does that buffer cache remain filled with the data? I am assuming it is until new stuff to be cached pushes out old stuff, no? I was doing some R*Tree selects, and the first query was dog slow, although benchmarking showed that the actual CPU time was very small. Subsequent queries were lightning fast. I am assuming that the buffer is not getting filled with the results as much as it is getting filled with whatever part of the db that the program needs to open to do its work. > > Here's a timing from my own system, after a purge command to clear the > buffer cache: > > $ time sqlite3 hb.db 'select count(*) from logs' > -- Loading resources from /Users/jim/.sqliterc > count(*) > -- > 734909 > > real 0m0.580s > user 0m0.190s > sys 0m0.034s > > Same command again, with the file cached: > > $ time sqlite3 hb.db 'select count(*) from logs' > -- Loading resources from /Users/jim/.sqliterc > count(*) > -- > 734909 > > real 0m0.189s > user 0m0.165s > sys 0m0.019s > > This time is consistent no matter how many times I run it, because the > file is still cached. Doing a purge command to clear the cache and > re-running the query, we get: > > $ purge > $ time sqlite3 hb.db 'select count(*) from logs' > -- Loading resources from /Users/jim/.sqliterc > count(*) > -- > 734909 > > real 0m0.427s > user 0m0.175s > sys 0m0.024s > > On my system, there is not a huge difference, but it is consistent. > Now, if you have a fragmented file system, you will see a much larger > difference. There are many posts on the mailing list about both file > system fragmentation and logical fragmentation within the SQLite file > itself. 
Your first count(*) is subject to these fragmentation > effects, while your 2nd usually is not, because the file is in memory. > > Some people on the list believe fragmentation is an unimportant detail > you shouldn't worry about, because you have little control over it. > That may be true, but it's useful to understand how it can affect > performance. I think you are seeing this first hand. > > Jim > -- > HashBackup: easy onsite and offsite Unix backup > http://www.hashbackup.com > > > > On Mon, Feb 21, 2011 at 9:37 AM, Sven Lwrote: > > > > Same result :( > > Note that I have compiled SQLite with the following switches: > > SQLITE_ENABLE_STAT2 > > SQLITE_THREADSAFE=2 > > > > I've learnt that COUNT(*) is slower than COUNT(ID), since * means the > > engine has to traverse all columns and it might even return another value > > if there are NULL-values... > > > > Also, this is quite interesting: > > > > sqlite> EXPLAIN QUERY PLAN SELECT COUNT(ItemID) FROM Items; > > 0|0|0|SCAN TABLE Items (~100 rows) > > sqlite> EXPLAIN QUERY PLAN SELECT COUNT(*) FROM Items; > > sqlite> > > > > I would expect an index scan on the first statement. The second statement > > tells me nada?! > > > > Thanks for your help! > > > > > >> From: slav...@bigfraud.org > >> Date: Mon, 21 Feb 2011 14:24:50 + > >> To: sqlite-users@sqlite.org > >> Subject: Re: [sqlite] COUNT() extremely slow first time! > >> > >> > >> On 21 Feb 2011, at 2:23pm, Sven L wrote: > >> > >> > SELECT COUNT(ItemID) FROM Items; > >> > > >> > This takes around 40 seconds the first time! WHY?! > >> > >> Try again, doing everything identically except that instead of the above > >> line use > >> > >> SELECT COUNT(*) FROM Items; > >> > >> Simon. 
> >> ___ > >> sqlite-users mailing list > >> sqlite-users@sqlite.org > >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > > > ___ > > sqlite-users mailing list > > sqlite-users@sqlite.org > > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] COUNT() extremely slow first time!
This is a common issue on the mailing list. The first time you do count(*), SQLite (actually your OS) has to load data into memory. Most OS's will keep the file in a buffer cache, so the 2nd count(*) doesn't have to read from disk. Here's a timing from my own system, after a purge command to clear the buffer cache: $ time sqlite3 hb.db 'select count(*) from logs' -- Loading resources from /Users/jim/.sqliterc count(*) -- 734909 real 0m0.580s user 0m0.190s sys 0m0.034s Same command again, with the file cached: $ time sqlite3 hb.db 'select count(*) from logs' -- Loading resources from /Users/jim/.sqliterc count(*) -- 734909 real 0m0.189s user 0m0.165s sys 0m0.019s This time is consistent no matter how many times I run it, because the file is still cached. Doing a purge command to clear the cache and re-running the query, we get: $ purge $ time sqlite3 hb.db 'select count(*) from logs' -- Loading resources from /Users/jim/.sqliterc count(*) -- 734909 real 0m0.427s user 0m0.175s sys 0m0.024s On my system, there is not a huge difference, but it is consistent. Now, if you have a fragmented file system, you will see a much larger difference. There are many posts on the mailing list about both file system fragmentation and logical fragmentation within the SQLite file itself. Your first count(*) is subject to these fragmentation effects, while your 2nd usually is not, because the file is in memory. Some people on the list believe fragmentation is an unimportant detail you shouldn't worry about, because you have little control over it. That may be true, but it's useful to understand how it can affect performance. I think you are seeing this first hand.
Jim -- HashBackup: easy onsite and offsite Unix backup http://www.hashbackup.com On Mon, Feb 21, 2011 at 9:37 AM, Sven Lwrote: > > Same result :( > Note that I have compiled SQLite with the following switches: > SQLITE_ENABLE_STAT2 > SQLITE_THREADSAFE=2 > > I've learnt that COUNT(*) is slower than COUNT(ID), since * means the engine > has to traverse all columns and it might even return another value if there > are NULL-values... > > Also, this is quite interesting: > > sqlite> EXPLAIN QUERY PLAN SELECT COUNT(ItemID) FROM Items; > 0|0|0|SCAN TABLE Items (~100 rows) > sqlite> EXPLAIN QUERY PLAN SELECT COUNT(*) FROM Items; > sqlite> > > I would expect an index scan on the first statement. The second statement > tells me nada?! > > Thanks for your help! > > >> From: slav...@bigfraud.org >> Date: Mon, 21 Feb 2011 14:24:50 + >> To: sqlite-users@sqlite.org >> Subject: Re: [sqlite] COUNT() extremely slow first time! >> >> >> On 21 Feb 2011, at 2:23pm, Sven L wrote: >> >> > SELECT COUNT(ItemID) FROM Items; >> > >> > This takes around 40 seconds the first time! WHY?! >> >> Try again, doing everything identically except that instead of the above >> line use >> >> SELECT COUNT(*) FROM Items; >> >> Simon. >> ___ >> sqlite-users mailing list >> sqlite-users@sqlite.org >> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] COUNT() extremely slow first time!
Same result :( Note that I have compiled SQLite with the following switches: SQLITE_ENABLE_STAT2 SQLITE_THREADSAFE=2 I've learnt that COUNT(*) is slower than COUNT(ID), since * means the engine has to traverse all columns and it might even return another value if there are NULL-values... Also, this is quite interesting: sqlite> EXPLAIN QUERY PLAN SELECT COUNT(ItemID) FROM Items; 0|0|0|SCAN TABLE Items (~100 rows) sqlite> EXPLAIN QUERY PLAN SELECT COUNT(*) FROM Items; sqlite> I would expect an index scan on the first statement. The second statement tells me nada?! Thanks for your help! > From: slav...@bigfraud.org > Date: Mon, 21 Feb 2011 14:24:50 + > To: sqlite-users@sqlite.org > Subject: Re: [sqlite] COUNT() extremely slow first time! > > > On 21 Feb 2011, at 2:23pm, Sven L wrote: > > > SELECT COUNT(ItemID) FROM Items; > > > > This takes around 40 seconds the first time! WHY?! > > Try again, doing everything identically except that instead of the above line > use > > SELECT COUNT(*) FROM Items; > > Simon. > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] update of a blob
Dietmar Hummel wrote: > Maybe someone could help me with a little problem. I am trying to update > an existing entry in the db > with an update statement where one of the columns is a blob type. I have > code that looks like this: > > sqlite3_stmt* m_pStatement = NULL; > > std::string strStatement( "UPDATE persistence SET > name=\"blub\",expiration=\"2011-04-02\",value=\"?\" WHERE id=\"1\"" ); std::string strStatement( "UPDATE persistence SET name='blub',expiration='2011-04-02',value=? WHERE id=1" ); '?' (let alone "?") is not a parameter placeholder - it's a string literal consisting of a single character ?. -- Igor Tandetnik ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
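[Editorial note] Igor's distinction — a quoted "?" is a one-character string literal, while a bare ? is a bindable parameter — can be demonstrated directly. This is an illustrative sketch using the schema from the question (column types guessed from the statement):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE persistence "
            "(id INTEGER PRIMARY KEY, name TEXT, expiration TEXT, value BLOB)")
con.execute("INSERT INTO persistence (id) VALUES (1)")

# Quoted '?' is a literal: the column now holds the one-character string "?".
con.execute("UPDATE persistence SET value='?' WHERE id=1")
literal = con.execute("SELECT value FROM persistence WHERE id=1").fetchone()[0]
print(literal)  # ?

# Bare ? is a real placeholder: the bound blob is stored.
con.execute("UPDATE persistence SET value=? WHERE id=1", (b"\x01\x02",))
bound = con.execute("SELECT value FROM persistence WHERE id=1").fetchone()[0]
print(bound == b"\x01\x02")  # True
```

Separately, note that in the C API parameter indexes for sqlite3_bind_blob() start at 1, not 0, which is the other reason the original code reports "bind or column index out of range".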
Re: [sqlite] COUNT() extremely slow first time!
On 21 Feb 2011, at 2:23pm, Sven L wrote: > SELECT COUNT(ItemID) FROM Items; > > This takes around 40 seconds the first time! WHY?! Try again, doing everything identically except that instead of the above line use SELECT COUNT(*) FROM Items; Simon. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] COUNT() extremely slow first time!
SQLite 3.7.5. I have a table (13 columns with 1 primary key, 1 index on a date column, 1 UNIQUE-constraint on two columns). I insert ~130 rows, 1000 at a time (transaction-wise). I close the database, reopen it and immediately perform a COUNT-operation: SELECT COUNT(ItemID) FROM Items; This takes around 40 seconds the first time! WHY?! I believe the value is somehow cached inside the database after the above call. The table used:

CREATE TABLE IF NOT EXISTS Item (
    ItemID INTEGER PRIMARY KEY,
    A INTEGER NOT NULL,
    B INTEGER NOT NULL,
    C INTEGER NOT NULL,
    D INTEGER NOT NULL,
    E INTEGER NOT NULL,
    F INTEGER NOT NULL,
    G INTEGER NOT NULL,
    H DATE NOT NULL,
    I CHAR(3) NOT NULL,
    J INTEGER NOT NULL,
    K INTEGER NOT NULL,
    L INTEGER NOT NULL,
    UNIQUE (B, A)
);
CREATE INDEX IF NOT EXISTS Item_H_Index ON Item (H);

___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
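[Editorial note] The insert pattern described here — batches of 1000 rows per transaction — looks roughly like this sketch (a trimmed two-column version of the schema, with made-up data; the real table has more columns):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE IF NOT EXISTS Item (
    ItemID INTEGER PRIMARY KEY,
    A INTEGER NOT NULL,
    B INTEGER NOT NULL,
    UNIQUE (B, A))""")

rows = [(i, i + 1) for i in range(5000)]
batch = 1000
for start in range(0, len(rows), batch):
    with con:  # one transaction per batch of 1000 inserts
        con.executemany("INSERT INTO Item (A, B) VALUES (?, ?)",
                        rows[start:start + batch])

total = con.execute("SELECT COUNT(*) FROM Item").fetchone()[0]
print(total)  # 5000
```

Batching like this keeps fsync costs down, but as the replies in this thread explain, it does not prevent data and index pages from ending up interleaved in the file, which is what makes the first cold-cache scan slow.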
[sqlite] process monitor: wal file access
hello, we recently switched from SQLite version 3.5.4 to 3.7.5. When I run process monitor on our machines I see a lot of failed file access attempts that weren't there before (v 3.5.4). It is failed QueryOpen (NAME NOT FOUND) events for DBASE-NAME.db3-wal and DBASE-NAME.db3-journal. I find especially the *-wal access attempt pretty strange as we do not have WAL enabled for our database: it is set to the default journal mode (DELETE). Is this normal behaviour? Or is this influenced by some setting I don't know of? kind regards, Gert ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] update of a blob
Hi list! Maybe someone could help me with a little problem. I am trying to update an existing entry in the db with an update statement where one of the columns is a blob type. I have code that looks like this:

sqlite3_stmt* m_pStatement = NULL;

std::string strStatement( "UPDATE persistence SET name=\"blub\",expiration=\"2011-04-02\",value=\"?\" WHERE id=\"1\"" );

int iResult = sqlite3_prepare_v2( m_pHandle, strStatement.c_str(), strStatement.length() + 1, &m_pStatement, NULL );
/* error checking is done here */

!! the error is on the following line. Whatever I enter for the index counter variable, I always get the error: "bind or column index out of range"

iResult = sqlite3_bind_blob( m_pStatement, 0, p_bsValue.data(), p_bsValue.length(), SQLITE_TRANSIENT );
/* here again error checking */

iResult = sqlite3_step( m_pStatement );
/* here again error checking */

//dh done
sqlite3_finalize( m_pStatement );

Maybe someone could give me a hint? Regards Dietmar ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Upgrade from 3.5.8 -> 3.7.5 - increase memory usage
I am looking at the private bytes counter from Windows (MSDN Description below) for my process. Is there another counter or stat I should be looking at? (from MSDN - Private Bytes: Displays the current number of bytes this process has allocated that cannot be shared with other processes.) I receive similar behavior (although it does not climb as fast, because it doesn't run as fast) with Journal_mode=delete. When using WAL, I set the wal_autocheckpoint to 250 and called WAL_CHECKPOINT without any measurable result to see if I see any difference. I have left the application running and the private bytes continue to increase; I am at 150MB and counting. My code has to be leaking; but why not with the older version? Thanks for all the help Todd -----Original Message----- From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] On Behalf Of Roger Binns Sent: Sunday, February 20, 2011 7:48 PM To: sqlite-users@sqlite.org Subject: Re: [sqlite] Upgrade from 3.5.8 -> 3.7.5 - increase memory usage -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 02/20/2011 02:48 PM, Todd Shutts wrote: > However; memory usage is growing like crazy. It is a very frequent occurrence that what people think is being measured is not what is actually being measured. Make sure you understand exactly what the tool showing you memory consumption is measuring and especially understand if it includes memory shared with other processes or memory mapped files. Additionally you need to understand the difference between memory as the kernel sees and reports it versus memory within particular APIs such as C's malloc. > The application > never used more than 10MB and it is currently using 57+MB and continues > to climb. The single most likely explanation is this is WAL in action, the memory is from a memory mapped file and a WAL checkpoint will release it.
http://www.sqlite.org/wal.html Roger -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk1hxDYACgkQmOOfHg372QRikgCdHrEuzE5p71LTaiF+WRHfG6j2 9S0An100kCApkwZI74XGYR6zxczr2m7u =d0xw -END PGP SIGNATURE- ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
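[Editorial note] The WAL knobs discussed in this exchange — journal_mode, wal_autocheckpoint, and an explicit checkpoint — can be exercised from any binding. A minimal illustrative sketch (the file path is a throwaway temp file; WAL requires a real file, not :memory:):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test.db")
con = sqlite3.connect(path)

# Switch to WAL; the pragma returns the journal mode actually in effect.
mode = con.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)  # wal

# Checkpoint every 250 WAL pages instead of the default 1000.
con.execute("PRAGMA wal_autocheckpoint = 250")
auto = con.execute("PRAGMA wal_autocheckpoint").fetchone()[0]
print(auto)  # 250

# Force a checkpoint now, transferring WAL content back into the main db
# (and, per Roger's explanation, releasing memory-mapped WAL pages).
con.execute("PRAGMA wal_checkpoint")
con.close()
```

If private bytes keep climbing even after checkpoints, the growth is coming from somewhere other than the WAL's shared-memory mapping.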
Re: [sqlite] Contributing to System.Data.Sqlite
On Sat, Feb 19, 2011 at 10:11 PM, Alexander Spence < alexander.spe...@demandmedia.com> wrote: > Who would I talk to if I wanted to contribute to the System.Data.Sqlite > project? We use this code in an enterprise environment and would like to > contribute our changes to the community. > I need to work out a Contributor License Agreement for System.Data.SQLite - probably something along the lines of http://www.fossil-scm.org/fossil/doc/trunk/www/copyright-release.html used for Fossil. Let me work on that and get back to you. Thanks for volunteering! > > > Please NOTE: This electronic message, including any attachments, may > include privileged, confidential and/or inside information owned by Demand > Media, Inc. Any distribution or use of this communication by anyone other > than the intended recipient(s) is strictly prohibited and may be unlawful. > If you are not the intended recipient, please notify the sender by replying > to this message and then delete it from your system. Thank you. > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] owner/permissions on wal/shm and journal files
On 21 Feb 2011, at 09:36, Simon Slavin wrote:

> On 21 Feb 2011, at 9:35am, Philip Graham Willoughby wrote:
>
>> On 20 Feb 2011, at 16:18, thilo wrote:
>>
>>> My DB is owned by a different user (www) and I as root insert some data.
>> ...
>>> Are there any suggestions on how to make this workable?
>>
>> Don't do that. Unless you can guarantee that no malicious user has been
>> able to alter your database it isn't safe to interact with it as root if
>> you're using a version of the sqlite3 shell that has the .load built-in.
>> Use su or sudo to become www and interact with your data as the owning
>> user.
>
> Would it be feasible to copy whatever protection is on the database file
> to any temporary files like journal files ?

On systems with traditional Unix permissions, if you have authority to create a file in a directory you also have authority to set the created file's permissions. You can also change the group of the file to any group of which you are a member. Conceptually, the relevant VFSs would need to do something like this when creating subsidiary files:

  /* Allow us to set whatever file mode we want */
  mode_t oldmode = umask(0);
  struct stat dbstat;
  /* Retrieve database permissions etc */
  fstat(dbfh, &dbstat);
  /* Create file with the right permissions if necessary */
  int newfd = open(…, …|O_EXCL|O_CREAT, dbstat.st_mode);
  /* Match the group id - should usually work */
  if (fchown(newfd, -1, dbstat.st_gid)) {
    /* Could not change gid to match db - log warning? */
  }
  /* Match the user id - should usually fail */
  if (fchown(newfd, dbstat.st_uid, -1)) {
    /* Could not change uid to match db - log warning? */
  }
  /* Restore the umask */
  umask(oldmode);

For me, the mode-matching is definitely worth it: if you have a group-writable database you almost certainly want a group-writable log. Similarly the gid-matching code is worthwhile for the same reason. The uid-matching code is only usable by root, or by processes given that privilege by some other mechanism, and is therefore less worthwhile.
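For readers who want to try the idea without wiring it into a VFS, the same umask/stat/fchown dance can be sketched with Python's stdlib; the paths, the -wal suffix, and the 0o664 mode are illustrative stand-ins, not SQLite's own code:

```python
import os
import stat
import tempfile

d = tempfile.mkdtemp()
db = os.path.join(d, "app.db")          # stand-in for the main database
open(db, "w").close()
os.chmod(db, 0o664)                     # pretend the db is group writable

old = os.umask(0)                       # allow any mode bits on new files
try:
    st = os.stat(db)                    # permissions/ownership of the db
    wal = db + "-wal"                   # stand-in for a subsidiary file
    fd = os.open(wal, os.O_CREAT | os.O_EXCL | os.O_WRONLY,
                 stat.S_IMODE(st.st_mode))  # create with the db's mode
    try:
        os.fchown(fd, -1, st.st_gid)    # match group; should usually work
    except (PermissionError, AttributeError):
        pass                            # log a warning in real code
    try:
        os.fchown(fd, st.st_uid, -1)    # match owner; fails unless root
    except (PermissionError, AttributeError):
        pass
    os.close(fd)
finally:
    os.umask(old)                       # restore the previous umask

wal_mode = stat.S_IMODE(os.stat(wal).st_mode)
```

Because the umask is cleared before the open() and restored afterwards, the subsidiary file comes out with exactly the database's mode bits, which is the property Simon asked about.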
Best Regards,

Phil Willoughby
--
Managing Director, StrawberryCat Limited

StrawberryCat Limited is registered in England and Wales with Company No. 7234809. The registered office address of StrawberryCat Limited is:

107 Morgan Le Fay Drive
Eastleigh
SO53 4JH
Re: [sqlite] owner/permissions on wal/shm and journal files
On 21 Feb 2011, at 9:35am, Philip Graham Willoughby wrote:

> On 20 Feb 2011, at 16:18, thilo wrote:
>
>> My DB is owned by a different user (www) and I as root insert some data.
> ...
>> Are there any suggestions on how to make this workable?
>
> Don't do that. Unless you can guarantee that no malicious user has been
> able to alter your database it isn't safe to interact with it as root if
> you're using a version of the sqlite3 shell that has the .load built-in.
> Use su or sudo to become www and interact with your data as the owning
> user.

Would it be feasible to copy whatever protection is on the database file to any temporary files like journal files ?

Simon.
Re: [sqlite] Upgrade from 3.5.8 -> 3.7.5 - increase memory usage
On 21 Feb 2011, at 5:56am, Dan Kennedy wrote:

> WAL file growth has (apparently) been a problem for people with
> really busy systems. But so far it has been the actual size of
> the file on disk that bothers them, not the memory mapped bit.

If that's what you think is raising concern then fine, someone needs to write something about that. But looking at the actual posts to this list gives me questions about the reported memory usage.

Simon.
Re: [sqlite] owner/permissions on wal/shm and journal files
On 20 Feb 2011, at 16:18, thilo wrote:

> My DB is owned by a different user (www) and I as root insert some data.
...
> Are there any suggestions on how to make this workable?

Don't do that. Unless you can guarantee that no malicious user has been able to alter your database it isn't safe to interact with it as root if you're using a version of the sqlite3 shell that has the .load built-in. Use su or sudo to become www and interact with your data as the owning user.

Best Regards,

Phil Willoughby
--
Managing Director, StrawberryCat Limited