[sqlite] Fastest way to find whether at least one row has a certain column value

R Smith Sun, 6 Mar 2016 10:09:33 +0200


On 2016/03/05 6:21 PM, Paul Sanderson wrote:
> Thanks Richard
>
> so suppose I have two tables table1 and table2 each with 1000 rows and
> say 100 columns some containing large blobs.
>
> My user choose a query "select * from table1, table2"
>
> I can modify the query and do a "select count(*)  from table1, table2"
> to determine that 1M rows will be returned and ask him if this is what
> he really wants to do.
>
> But what if he has "select * from table1, table2 limit 1000"
>
> In this instance there is no need for me to warn him as he has already
> applied a limit to the amount of data returned.
>
> Any suggestions as to how I can detect this , other than parsing the
> query for "limit x"


I see your point, but I think you are missing something - The amount of 
rows returned by a query is a very bad indicator of how long such a 
query will run - AND, if it is a long query, you will waste twice the 
time for the end user.  Further to this, any amount of queries can be 
made that takes hours to complete, but produces perhaps less than 5 
rows, or even just 1.

Similarly, but on a very different tack, you can have a query like 
"SELECT xxx FROM A,B,C,D WHERE yyyy" without the limit clause, which 
might itself really only produce 10 or less rows (and perhaps take 
really long) due to the WHERE clause. So even if you did parse the LIMIT 
clause, you are no closer to a true tale.

My strategy has always been to run queries as the user enters them and 
limit them to 1000 or such when a LIMIT is not already present. If I do 
actually produce 1000 rows (which means there may be more) then I prompt 
the user with "This query has more (perhaps significantly more) rows, 
would you like to: Load all (May take a long time), Load Next 1000, Stop 
here." or some other mechanism that achieves the same, though the "load 
next 1000" option is harder to implement when you connect via 3rd party 
tools.

A last thought, LIMIT is not a Query word or in any way affects the 
Query idea that the Query planner sees, it has no algebraic influence on 
the query. It is an output limitation only. If there is room with 100 
people in it, and I were to ask the curator: Please give me the names of 
the people limited to 10 values, like so:
SELECT Name FROM People LIMIT 10
he would take the list of 100 people and read off the 1st 10 of them. 
This limitation doesn't shorten his list or affects the number of people 
in the room, he just stops reading at some point to please me. So if I 
ask, how many people are in the room, but stop reading at the 10th 
result. he is just going to say: 100 people and stop there. Your 
limiting clause did not change that fact.

Were I to ask: How many people would be in the list if I asked you for a 
list of people limited to 10 output values? He would answer 10 - of 
course, that is why this query works:
SELECT COUNT(*) FROM (SELECT * FROM People LIMIT 10);

Cheers,
Ryan

[sqlite] Fastest way to find whether at least one row has a certain column value

Reply via email to