Re: [sqlite] Best way to optimize this query?

2007-05-02 Thread Dennis Cote

Tito Ciuro wrote:


Assume the following scenario: I store people in a table, like this:

People
ROWID, idx
GUID, idx
First, idx
Last, idx
Email
...

In the app, the user can select People GUIDs from different sources 
and then retrieve the info from the database. The easy/suboptimal 
route to retrieve the records would be to perform a SELECT per GUID 
selected. I thought of something like this:


SELECT * FROM People where GUID in ("ABC", "RDT", "TUV");

Is there a better way to include all these GUIDs on a single SQL 
statement to speed things up?


Questions I have:

1) Is this the best way to solve the problem? Suggestions?


Tito,

You might want to use a temp table to hold your list of GUIDs. Then you 
can use a single precompiled query to do the select.


   select * from People where GUID in (select GUID from temp_guids)

As it is, you will have to prepare your query for each lookup. The temp 
table would replace the loop that binds the parameters with a loop that 
inserts the guids into the temp table.


Also, you are using double quotes for string literals, these should be 
single quotes instead.


   ... in ('ABC', 'RTD', 'TUV')

2) Is there a limit on the number of parameters I can pass to "in"? 
What if I have, say, 500 to retrieve? Will SQLite complain about this?




As far as I know there is no hard limit on the number of parameters you 
can use in a query. If you want to precompile the query you will need to 
have as many as your maximum requirement. In that case you will have to 
bind all the parameters before the query, and then reset them to null 
after the query so that the values are reset for the next query (i.e you 
will have to bind each variable twice). If you don't do this a query 
with many parameters followed by a query with a few will still have 
values bound to the higher numbered parameters left over from the first 
query.



In the alternate, temp table approach, you would  create the table if it 
doesn't exist


   execute("begin")
   execute("create temp table if not exists temp_guids(guid primary key)")
  
Then loop to insert the GUIDs using a precompiled insert with one parameter


   s = prepare(insert into t values(?))
   for each guid in guid_list:
  bind(s, 1, guid)
  execute(s)

   execute ("commit")

Now execute your guid lookup query

people = execute("select * from People where GUID in (select guid 
from temp_guids)"
  
And finally clear the temp table


   execute("delete from temp_guid")


Of course you would have to try each methods to see which is fastest. 
The single query  requires compilation for each lookup, a single 
precompiled query requires binding each variable twice but has the 
advantage that the VDBE code will build an index from the parameters on 
the fly (without an associated table), the temp table approach can be 
done by executing only prepared sql statements but will build two 
records (one in the table and one in the index) for each record.


I seem to be rambling so I will stop now.

HTH
Dennis Cote
 


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Best way to optimize this query?

2007-05-02 Thread Tito Ciuro

Hi Donald,

On May 2, 2007, at 11:25 AM, Griggs, Donald wrote:


The ROWID is indexed implicitly I believe, so it may be slowing things
slightly if you index it explicitly.


Yes, I was aware of that, thanks for the heads up.


Regarding:  "What if I have, say, 500 to retrieve?"

You can create a temporary table, perhaps in ram memory, where you  
store

the GUIDS, e.g.
   CREATE TEMP TABLE MyGUIDS(GUID);
Then pull them all out of the people table all at once with:

SELECT * FROM People
  WHERE GUID IN (SELECT GUID FROM MyGUIDS)
  ORDER BY 


Thank you very much,

-- Tito


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Best way to optimize this query?

2007-05-02 Thread Tito Ciuro

Hello,

On May 2, 2007, at 11:11 AM, P Kishor wrote:


On 5/2/07, Tito Ciuro <[EMAIL PROTECTED]> wrote:
When you say "speed things up," is it not fast enough yet? Numbers
would be helpful.


I've just tested it and the query is *very* fast. I was just  
wondering whether this type of query looked right.



try it, if SQLite complains, you will know, and you will have to
approach the problem differently ;-)


I'll try that. Thanks a lot,

-- Tito

-
To unsubscribe, send email to [EMAIL PROTECTED]
-



RE: [sqlite] Best way to optimize this query?

2007-05-02 Thread Griggs, Donald
Hi Tito, 

People
ROWID, idx
GUID, idx
First, idx
Last, idx
Email
...

The ROWID is indexed implicitly I believe, so it may be slowing things
slightly if you index it explicitly.


Regarding:  "What if I have, say, 500 to retrieve?"

You can create a temporary table, perhaps in ram memory, where you store
the GUIDS, e.g.
   CREATE TEMP TABLE MyGUIDS(GUID);
Then pull them all out of the people table all at once with:

SELECT * FROM People
  WHERE GUID IN (SELECT GUID FROM MyGUIDS)
  ORDER BY 



[opinions mine, not my company's]


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Best way to optimize this query?

2007-05-02 Thread P Kishor

On 5/2/07, Tito Ciuro <[EMAIL PROTECTED]> wrote:

Hello,

Assume the following scenario: I store people in a table, like this:

People
ROWID, idx
GUID, idx
First, idx
Last, idx
Email
...

In the app, the user can select People GUIDs from different sources
and then retrieve the info from the database. The easy/suboptimal
route to retrieve the records would be to perform a SELECT per GUID
selected. I thought of something like this:

SELECT * FROM People where GUID in ("ABC", "RDT", "TUV");

Is there a better way to include all these GUIDs on a single SQL
statement to speed things up?


When you say "speed things up," is it not fast enough yet? Numbers
would be helpful.


Questions I have:

1) Is this the best way to solve the problem? Suggestions?


unless you change your attribution (for example, tag your data by
GUID-types, if there is such a thing in your case), the above is the
best way, perhaps the only way. Internally, I believe the IN clause
gets converted to a set of OR matches as in
GUID = 'ABC' OR GUID = 'XYZ' OR... (or is it the other way around?)


2) Is there a limit on the number of parameters I can pass to "in"?


dunno... try it.


What if I have, say, 500 to retrieve? Will SQLite complain about this?


try it, if SQLite complains, you will know, and you will have to
approach the problem differently ;-)




-
To unsubscribe, send email to [EMAIL PROTECTED]
-





--
Puneet Kishor http://punkish.eidesis.org/
Nelson Inst. for Env. Studies, UW-Madison http://www.nelson.wisc.edu/
Open Source Geospatial Foundation http://www.osgeo.org/education/
-
collaborate, communicate, compete
=

-
To unsubscribe, send email to [EMAIL PROTECTED]
-