Re: [sqlite] Non-optimal query plan

2014-09-25 Thread Rob Golsteijn

>> Hi List,
>>
>> I was looking at the query plan of a rather simple query, but I don't 
>> understand why sqlite would choose this query plan.
>>
>> ...I was surprised that sqlite came up with the inferior query plan...
>>
>> Note: After an "analyze aaa" (on a decently populated table) sqlite chooses 
>> the full table scan instead of creating an automatic index (but our 
>> application never uses 'analyze' to avoid that other (bad performing) query 
>> plans are used during operation than during testing)
>> Note 2: Adding "NOT INDEXED" to aa1 gives the desired query plan, but of 
>> course I prefer that sqlite choses the right query plan.

>I think this issue is fixed since it does not do it in my version, but... even 
>if it isn't, let me just note that basically SQLite 
>chooses a Query Plan that might in many circumstances work perfectly since it 
>has no information available about the data.

I do realize that it is impossible to find a QP that works correct in all 
cases. Certainly if sqlite has no knowledge about the tables's contents.
We encounter query plans that do not work out for our data occasionally. And if 
needed, we "fix" them by pushing the query planner in the desired
direction by using INDEXED BY, NOT INDEXED, CROSS JOINs, and adding +es, or 
rewriting the queries. 
The reason I reported this QP problem, is that the choice of this query plan is 
not optimal no matter what the table contents is, and I expected
it to find this without data analysis. 
Appearently, the further optimizations/tuning in the query planner solved this 
issue.

>Further 
>to this SQLite provides not one, but two explicit methods for you to improve 
>the query planning should the default not fit the best 
>for your specific query - yet you choose to ignore BOTH of them and expects 
>SQLite to pick a good plan without any knowledge of the 
>data, by default. [and ironically you are very happy to circumvent a whole 
>part of SQLite Query planning prowess to hide "other" QP 
>issues in stead of fixing it or requesting a fix, but won't add anything legal 
>and valid that would actually improve it. That is 
>just bizarre.]
We use databases in a data format conversion processes. Each conversion uses 
another set of (fresh) databases, on which a few thousand different queries
are executed once. The type and amount of data available per conversion differs 
significantly.
Hence query plans when using ANALYZE on these database will differ very much. 
In the past we used ANALYZE,
and we occasionally had problems with "never ending queries" for queries that 
usually only run minutes to a few hours. This was due to an unfortunate query 
plan.
Then my company decided not to use "ANALYZE" anymore, and only rely on fixed 
QP. (In the future we might introduce the ANALYZE results of a "standard
database" for all our databases to have the best of both worlds: predictable QP 
and sqlite having a bit of knowledge about our database contents, even if this
might differ from actual content in many cases).
Generally sqlite picks good QPs even if ANALYZE results are absent. There are 
some queries that need manual fine tuning for reasonable performance, and we do 
this.
But of couse we want to avoid this tuning as much as possible.

I don't just complain about any QP that doesn't work for me. We are regularly 
using the methods to finetune queries. But I reported this specific query plan 
because
I think even without knowledge of the table contents the QP should have come up 
with a better plan: I expected that it should have found that the a single full 
table scan
is always cheaper than creating an index on that same table and using this 
index for searching.

> If I was the asker of this question I would concentrate on what you mention 
> in passing in Note1 with regards to avoiding "other" bad 
> plans.
> If you do use Analyze and then at any point find the QP comes up with a bad 
> plan WITH access to analyze data - now THAT would 
> be a reason to complain and I have seen requests such as that cause a QP 
> overhaul many times here, you'd actually improve it for 
> everyone should you find a use-case with real QP problems based on proper 
> knowledge of data shape.




As explained our main problem with ANALYZE is predictability: for some specific 
database contents and query combination we might suddenly get a bad query plan, 
that we never encountered during our application testing. Aborting our 
conversion process for this reason means that a lot of time is lost, and 
reconversions are needed.
We want to prevent this. I guess there are more users that do not use ANALYZE 
for this reason.


Regards,
Rob


 
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Non-optimal query plan

2014-09-24 Thread RSmith


On 2014/09/24 16:52, Rob Golsteijn wrote:

Hi List,

I was looking at the query plan of a rather simple query, but I don't 
understand why sqlite would choose this query plan.

...I was surprised that sqlite came up with the inferior query plan...

Note: After an "analyze aaa" (on a decently populated table) sqlite chooses the 
full table scan instead of creating an automatic index (but our application never uses 
'analyze' to avoid that other (bad performing) query plans are used during operation than 
during testing)
Note 2: Adding "NOT INDEXED" to aa1 gives the desired query plan, but of course 
I prefer that sqlite choses the right query plan.


I think this issue is fixed since it does not do it in my version, but... even if it isn't, let me just note that basically SQLite 
chooses a Query Plan that might in many circumstances work perfectly since it has no information available about the data. Further 
to this SQLite provides not one, but two explicit methods for you to improve the query planning should the default not fit the best 
for your specific query - yet you choose to ignore BOTH of them and expects SQLite to pick a good plan without any knowledge of the 
data, by default. [and ironically you are very happy to circumvent a whole part of SQLite Query planning prowess to hide "other" QP 
issues in stead of fixing it or requesting a fix, but won't add anything legal and valid that would actually improve it. That is 
just bizarre.]


If I was the asker of this question I would concentrate on what you mention in passing in Note1 with regards to avoiding "other" bad 
plans. If you do use Analyze and then at any point find the QP comes up with a bad plan WITH access to analyze data - now THAT would 
be a reason to complain and I have seen requests such as that cause a QP overhaul many times here, you'd actually improve it for 
everyone should you find a use-case with real QP problems based on proper knowledge of data shape.



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Non-optimal query plan

2014-09-24 Thread Richard Hipp
On Wed, Sep 24, 2014 at 10:52 AM, Rob Golsteijn 
wrote:

>
> Sqlite decides to create an AUTOMATIC INDEX (time complexity O(n log n))
> which it then uses to iterate table aaa1. This index is not re-used for
> anything else (it can't be re-used since 'type' is not used anywhere else)
> so only the traversal of table aaa1 benefits from this index. However, I
> think, a full table scan of aaa1 (time complexity O(n)) would always be
> faster, since for creating the index it has to read that entire table
> anyway.
>
> I'm using sqlite version 3.8.4.3
>


Thanks for the report.  That problem appears to have been fixed in SQLite
3.8.6.

-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users