I was hoping they wouldn't block each other because it's a read lock. I tried 
making an index on all the columns, but R*tree table + JOIN that I'm using runs 
about 10x faster. I might have done something wrong, so I'm open to suggestions 
on a better index, though.

I don't think that UPDATEing 99.99% of 17 million columns is going to run 
faster than the pure select statement that I have. Each select statement is 
fairly random, a small selection out of an N-dimensional space.

Am I understanding you? Any other ideas?
~Seth


On Aug 3, 2011, at 8:43 PM, Simon Slavin wrote:

> 
> On 4 Aug 2011, at 3:07am, Seth Price wrote:
> 
>> The full DB has around 17 million rows in it, and for each row I am trying 
>> to count all rows with similar characteristics, divided into different 
>> classifications. I was already able to improve speed 10x by using the R*tree 
>> extension to narrow my search. My queries look like this:
>> 
>> SELECT class, COUNT(*) AS count FROM data_r JOIN data USING (rowNum) WHERE 
>> 57 < col0min AND col0max < 61 AND 52 < col1min AND col1max < 56 AND 66 < 
>> col2min AND col2max < 70 AND 88 < col3min AND col3max < 92 AND 133 < col4min 
>> AND col4max < 137 AND 57 < col0 AND col0 < 61 AND 52 < col1 AND col1 < 56 
>> AND 66 < col2 AND col2 < 70 AND 88 < col3 AND col3 < 92 AND 133 < col4 AND 
>> col4 < 137 AND 81 < col5 AND col5 < 85 GROUP BY class;
> 
> Forget multithreading.  You have a bottleneck of accessing the database file 
> and the threads will just block each other.
> 
> There're two ways to make that SELECT run quickly.  One is to create one 
> index on all those columns.
> 
> The other is to do the SELECT in parts, by disqualifying records you don't 
> want.  Create another column called, perhaps, notThisOne.  Then do a 
> succession of disqualifying checks:
> 
> UPDATE data SET notThisOne=1 WHERE col0min<=57 OR col0min>=61;
> 
> etc.  Then all the rows you didn't disqualify are the ones you want.  
> Obviously, separate indices on each of those columns will speed up the 
> disqualifying check.
> 
> Simon.
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to