Re: [sqlite] Performance regression between 3.7.17 and 3.8.0.2?
On 2014/09/21 15:39, Merike wrote: Now I could very well be wrong about that as you say in your other reply that "It might simply be that Analyze did not get your QP to react on that size DB as it did for us". You seem to be saying that analyze behaves differently depending on database size... This is very much what I'm saying but maybe not exactly like you think. ANALYZE really interprets the "Shape" of your data (average width, cardinality and so on) and from that provides figures to the Query Planner by which to make decisions. For this reason, running ANALYZE on any amount of databases, different sizes, different kinds of data, will all provide different figures to the Query planner and it is very possible to produce a set of figures running analyze on one DB which changes the QP's mind on what plan to follow, whereas the very same Schema DB but with other data in it might make the QP choose a different plan - either of which may or may not hide the bug in question. Glad to hear the trunk fixed your problem though - that will be standard in the next release. Cheers! Ryan ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance regression between 3.7.17 and 3.8.0.2?
21.09.2014 00:08, Richard Hipp kirjutas: > On Sat, Sep 20, 2014 at 12:45 PM, Merikewrote: > >> 19.09.2014 04:21, Richard Hipp kirjutas: >>> A simple script to reproduce the problem in the latest SQLite is as >>> follows: CREATE TABLE t1(a INTEGER PRIMARY KEY, b INTEGER, c INTEGER, >>> d INTEGER); CREATE INDEX t1b ON t1(b); CREATE TABLE t2(x INTEGER >>> PRIMARY KEY, y); explain query plan SELECT * FROM t1, t2 WHERE x=c AND >>> b=?1 ORDER BY a; .print -- CREATE INDEX t1bd ON >>> t1(b,d); explain query plan SELECT * FROM t1, t2 WHERE x=c AND b=?1 >>> ORDER BY a; If you copy/paste the above script into an sqlite3 >>> command-line shell, you'll see that, for some reason, the query >>> planner decides to sort manually rather than using an index for >>> sorting after the t1bd index is added. We are still investigating to >>> try understand why that is. Please note that if you run ANALYZE on >>> your sample database, the query is fast again. On my (Ubuntu) desktop, >>> the query takes 648 milliseconds before being analyzed and 600 >>> microseconds afterwards - a 1000-fold speedup. >> A question: is the query being fast again after analyze call indicative >> of the bug being fixed? Because I tried it on my original database too >> and there I don't see a speedup after analyze. Should I try to minimize >> it to a smaller database again where the bug still occurs, even after >> analyze? Or will the change you made fix my original database speed as >> well despite the analyze call not helping it? >> > The change fixes the problem (for us) *without* requiring ANALYZE. > Right, that's not what I was trying to ask so I'll try to explain again. Is there a way to tell if this fix helps the query speed only on the example database I provided or also the original one without waiting for a new release of sqlite? Since you said that analyze helps the example database even without code fix I tried running analyze on original database too, hoping it might speed up that as well. And there it did not help. This might mean nothing, I don't know sqlite well enough to tell, but could it also mean that the fix (when released) might not speed up the same query on my original database? The table structure is the same there but data is different and there's more of it. It could very well be that waiting for the release is the only way to find it out as I'm not interested enough to try to build sqlite myself but I was trying to figure out if there's another way :) Otherwise there could be a chance that next release comes around and might not help the query on actual database and then I would need to post here again and wait for another release. Merike ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance regression between 3.7.17 and 3.8.0.2?
21.09.2014 15:12, RSmith kirjutas: > Merike: Running Analyze did not fix the bug, it simply changed some > internal-use values that allowed the bug to be circumvented. If you > re-make a table as you had without running analyze, the problem will > surely remain using the same codebase. Updating your SQLite code from > the Trunk specified by Richard will fix the bug for real, meaning that > analyze would make no real difference (for the specifed use case). > Naturally the trunk is also included in the next release which is > probably due within the next month at which time the bug fix will be > standard across all released versions. So, as I understand it analyze allows the bug to be circumvented on example database without upgrading sqlite itself. What I'm seeing with original database is that analyze does not allow circumvention of the bug there. Which could mean that the bug fixed is not the only one affecting original database even if it's the only one affecting example one. If both of the databases were affected by the exact same bug I would expect that analyze would allow circumvention in both cases, not just with example database. Now I could very well be wrong about that as you say in your other reply that "It might simply be that Analyze did not get your QP to react on that size DB as it did for us". You seem to be saying that analyze behaves differently depending on database size. This would explain why analyze didn't allow circumvention on original database. In terms of size "backlog" table has 160 472 rows in the example I provided while originally it has ~1 100 000 rows. As I discovered it's much simpler to build sqlite than I expected I just checked original database with new version. And the query was fast there so the fix actually works on original database as well :) Merike ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance regression between 3.7.17 and 3.8.0.2?
On 2014/09/21 14:12, RSmith wrote: On 2014/09/20 23:08, Richard Hipp wrote: On Sat, Sep 20, 2014 at 12:45 PM, Merikewrote: A question: is the query being fast again after analyze call indicative of the bug being fixed? Because I tried it on my original database too and there I don't see a speedup after analyze. Should I try to minimize it to a smaller database again where the bug still occurs, even after analyze? Or will the change you made fix my original database speed as well despite the analyze call not helping it? The change fixes the problem (for us) *without* requiring ANALYZE. Richard, I think the OP meant to ask whether or not running analyze fixed the bug, maybe a slight misunderstanding and you probably assumed correct understanding as a benefit of the doubt scenario - so allow me to just add the following in case the OP did think of it the other way... On second thought, I think the first interpretation was correct. Merike - just so I understand clearly - you DID download and is using the new trunk yes? If so, how big is the old DB you tried it on? Could you try the minimize and send the file via a file-upload service? It might simply be that Analyze did not get your QP to react on that size DB as it did for us - but if the bug is not fixed it needs to be seen. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance regression between 3.7.17 and 3.8.0.2?
On 2014/09/20 23:08, Richard Hipp wrote: On Sat, Sep 20, 2014 at 12:45 PM, Merikewrote: A question: is the query being fast again after analyze call indicative of the bug being fixed? Because I tried it on my original database too and there I don't see a speedup after analyze. Should I try to minimize it to a smaller database again where the bug still occurs, even after analyze? Or will the change you made fix my original database speed as well despite the analyze call not helping it? The change fixes the problem (for us) *without* requiring ANALYZE. Richard, I think the OP meant to ask whether or not running analyze fixed the bug, maybe a slight misunderstanding and you probably assumed correct understanding as a benefit of the doubt scenario - so allow me to just add the following in case the OP did think of it the other way: Merike: Running Analyze did not fix the bug, it simply changed some internal-use values that allowed the bug to be circumvented. If you re-make a table as you had without running analyze, the problem will surely remain using the same codebase. Updating your SQLite code from the Trunk specified by Richard will fix the bug for real, meaning that analyze would make no real difference (for the specifed use case). Naturally the trunk is also included in the next release which is probably due within the next month at which time the bug fix will be standard across all released versions. ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance regression between 3.7.17 and 3.8.0.2?
On Sat, Sep 20, 2014 at 12:45 PM, Merikewrote: > 19.09.2014 04:21, Richard Hipp kirjutas: > > A simple script to reproduce the problem in the latest SQLite is as > > follows: CREATE TABLE t1(a INTEGER PRIMARY KEY, b INTEGER, c INTEGER, > > d INTEGER); CREATE INDEX t1b ON t1(b); CREATE TABLE t2(x INTEGER > > PRIMARY KEY, y); explain query plan SELECT * FROM t1, t2 WHERE x=c AND > > b=?1 ORDER BY a; .print -- CREATE INDEX t1bd ON > > t1(b,d); explain query plan SELECT * FROM t1, t2 WHERE x=c AND b=?1 > > ORDER BY a; If you copy/paste the above script into an sqlite3 > > command-line shell, you'll see that, for some reason, the query > > planner decides to sort manually rather than using an index for > > sorting after the t1bd index is added. We are still investigating to > > try understand why that is. Please note that if you run ANALYZE on > > your sample database, the query is fast again. On my (Ubuntu) desktop, > > the query takes 648 milliseconds before being analyzed and 600 > > microseconds afterwards - a 1000-fold speedup. > > A question: is the query being fast again after analyze call indicative > of the bug being fixed? Because I tried it on my original database too > and there I don't see a speedup after analyze. Should I try to minimize > it to a smaller database again where the bug still occurs, even after > analyze? Or will the change you made fix my original database speed as > well despite the analyze call not helping it? > The change fixes the problem (for us) *without* requiring ANALYZE. -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance regression between 3.7.17 and 3.8.0.2?
19.09.2014 04:21, Richard Hipp kirjutas: > A simple script to reproduce the problem in the latest SQLite is as > follows: CREATE TABLE t1(a INTEGER PRIMARY KEY, b INTEGER, c INTEGER, > d INTEGER); CREATE INDEX t1b ON t1(b); CREATE TABLE t2(x INTEGER > PRIMARY KEY, y); explain query plan SELECT * FROM t1, t2 WHERE x=c AND > b=?1 ORDER BY a; .print -- CREATE INDEX t1bd ON > t1(b,d); explain query plan SELECT * FROM t1, t2 WHERE x=c AND b=?1 > ORDER BY a; If you copy/paste the above script into an sqlite3 > command-line shell, you'll see that, for some reason, the query > planner decides to sort manually rather than using an index for > sorting after the t1bd index is added. We are still investigating to > try understand why that is. Please note that if you run ANALYZE on > your sample database, the query is fast again. On my (Ubuntu) desktop, > the query takes 648 milliseconds before being analyzed and 600 > microseconds afterwards - a 1000-fold speedup. A question: is the query being fast again after analyze call indicative of the bug being fixed? Because I tried it on my original database too and there I don't see a speedup after analyze. Should I try to minimize it to a smaller database again where the bug still occurs, even after analyze? Or will the change you made fix my original database speed as well despite the analyze call not helping it? Merike ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance regression between 3.7.17 and 3.8.0.2?
19.09.2014 05:24, Richard Hipp kirjutas: > Thanks again for the regression report. This problem is now fixed on > the SQLite trunk. See > http://www.sqlite.org/src/info/72727b68cd0796?dc=22 for the check-in > that fixes the problem. > > If all you want to know is that the problem has been fixed, you may > stop reading now. If you are curious about the cause of the problem > was, you may continue reading. > > That's great to hear! Looking forward to next release :) Merike ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance regression between 3.7.17 and 3.8.0.2?
On Thu, Sep 18, 2014 at 8:54 AM, Merikewrote: > Hi everyone, > > Since upgrading to Kubuntu 14.04 I've had an issue with Quassel irc > client being slow on startup when it retrieves backlog from database. > I've tracked it down to different sqlite version being installed. > Previously I had 3.7.17 and now have 3.8.2. I've tried various versions > from https://launchpad.net/ubuntu/trusty/+source/sqlite3/+builds and can > only pinpoint it to between 3.7.17 and 3.8.0.2 because there doesn't > appear to be intermediate builds. > Thanks again for the regression report. This problem is now fixed on the SQLite trunk. See http://www.sqlite.org/src/info/72727b68cd0796?dc=22 for the check-in that fixes the problem. If all you want to know is that the problem has been fixed, you may stop reading now. If you are curious about the cause of the problem was, you may continue reading. Here is a simplified test case that illustrates the problem: CREATE TABLE t1(x INTEGER PRIMARY KEY, a, b); CREATE INDEX t1a ON t1(a); CREATE INDEX t1ab ON t1(a,b); explain query plan SELECT * FROM t1 WHERE a=?1 ORDER BY x; The query could use either index, t1a or t1ab, to lookup the appropriate rows based on "WHERE a=?1". If index t1a is used for the lookup, then the rows will come out ordered by x, since every index contains the primary key after the indexed columns. It is as if the t1a index was really over two columns, a and x, and the t1ab index was over three columns, a, b, and x. So when t1a is used for lookup, SQLite searches for the first entry in the index where a=?1 then it starts reading successive entries, as long as the a=?1 condition holds, to find the primary key and hence to look up the table content. But notice how doing that extracts the rows in ascending "x" order - exactly as requested by the ORDER BY clause. SQLite recognizes this and suppresses the sort operation since the rows are going to come out in the correct order naturally. But if t1ab is used to lookup the rows, there is that pesky 'b" column in between the "a" which is constrained by "WHERE a=?" and the "x" column. There might be two or more different "b" values for the same "a", and hence the "x" values will not necessarily be in the right order. So the output must be sorted by a separate post-processing step. That is much slower. Normally, the query planner looks at all possible indexes and tries each one to see which one will generate a plan with the least amount of work. In that case, t1a would be selected over t1ab since t1a can omit the sort operation at the end, which is usually one of the most expensive parts of a query. Unfortunately, the query planner (incorrectly) rejected the t1a index in favor of t1ab very early in the query planning process. There is a prefilter that asks whether or not a index might give output in the ORDER BY order naturally. Such indexes are given special protection from early rejection so that they can be thoroughly analyzed. But in this case, because the ORDER BY was on the INTEGER PRIMARY KEY rather than on a key column, that prefilter failed to recognize that t1a might naturally render the output in the desired order. Hence, t1a ended up being rejected in favor of t1ab, which is a covering index and hence involves less lookup work. The one-line change shown above fixes the prefilter so that it recognizes that t1a might be useful in implementing the ORDER BY clause, thus giving t1a protection from being overridden by t1ab. Both t1a and t1ab indexes have to compete in the final selection process which t1a clearly wins because the satisfies the ORDER BY clause without having to do a separate sort. > > I don't know what exact queries Quassel runs but most likely (after > looking at some Quassel source on github) it's something like the > following for a single channel: > SELECT messageid, time, type, flags, sender, message > FROM backlog > JOIN sender ON backlog.senderid = sender.senderid > WHERE bufferid = 102 > ORDER BY messageid DESC > LIMIT 100; > > When I run this query on 3.7.17 it returns nearly instantly. When I do > it on 3.8.0.2 or 3.8.6 which I also tried then it takes about 4 seconds > on my not so fast spinning drive laptop. I've posted example database > (minimized from original) at > https://docs.google.com/file/d/0Bzx3gCDqfzVdcDNhdzlfVlh4ZTA/. Could > someone either confirm or prove false that this query has become slower > with newer versions? > > Merike > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Performance regression between 3.7.17 and 3.8.0.2?
On Thu, Sep 18, 2014 at 8:54 AM, Merikewrote: > Hi everyone, > > Since upgrading to Kubuntu 14.04 I've had an issue with Quassel irc > client being slow on startup when it retrieves backlog from database. > I've tracked it down to different sqlite version being installed. > Previously I had 3.7.17 and now have 3.8.2. I've tried various versions > from https://launchpad.net/ubuntu/trusty/+source/sqlite3/+builds and can > only pinpoint it to between 3.7.17 and 3.8.0.2 because there doesn't > appear to be intermediate builds. > Thanks for the performance regression report. A simple script to reproduce the problem in the latest SQLite is as follows: CREATE TABLE t1(a INTEGER PRIMARY KEY, b INTEGER, c INTEGER, d INTEGER); CREATE INDEX t1b ON t1(b); CREATE TABLE t2(x INTEGER PRIMARY KEY, y); explain query plan SELECT * FROM t1, t2 WHERE x=c AND b=?1 ORDER BY a; .print -- CREATE INDEX t1bd ON t1(b,d); explain query plan SELECT * FROM t1, t2 WHERE x=c AND b=?1 ORDER BY a; If you copy/paste the above script into an sqlite3 command-line shell, you'll see that, for some reason, the query planner decides to sort manually rather than using an index for sorting after the t1bd index is added. We are still investigating to try understand why that is. Please note that if you run ANALYZE on your sample database, the query is fast again. On my (Ubuntu) desktop, the query takes 648 milliseconds before being analyzed and 600 microseconds afterwards - a 1000-fold speedup. > > I don't know what exact queries Quassel runs but most likely (after > looking at some Quassel source on github) it's something like the > following for a single channel: > SELECT messageid, time, type, flags, sender, message > FROM backlog > JOIN sender ON backlog.senderid = sender.senderid > WHERE bufferid = 102 > ORDER BY messageid DESC > LIMIT 100; > > When I run this query on 3.7.17 it returns nearly instantly. When I do > it on 3.8.0.2 or 3.8.6 which I also tried then it takes about 4 seconds > on my not so fast spinning drive laptop. I've posted example database > (minimized from original) at > https://docs.google.com/file/d/0Bzx3gCDqfzVdcDNhdzlfVlh4ZTA/. Could > someone either confirm or prove false that this query has become slower > with newer versions? > > Merike > ___ > sqlite-users mailing list > sqlite-users@sqlite.org > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users > -- D. Richard Hipp d...@sqlite.org ___ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users