On Saturday 09 February 2008 06:30:05 Army wrote: > Can you can provide a simple example to demonstrate the behavior that > you are seeing? I tried playing around with some simple (nonsense) > tables but I wasn't able to come up with anything that sounds like what > you are describing. > > Are you simply referring to something like: > > select count(*) from jobitems; > > vs > > select count(*) from jobitems where jobid = 4288; > select count(*) from jobitems where jobid > 1000;
Schema: CREATE TABLE jobitems (jobid INTEGER NOT NULL, itemid INTEGER NOT NULL, state INTEGER NOT NULL) CREATE INDEX jobitem_state_idx ON jobitems(state) CREATE INDEX jobitem_jobid_state_idx ON jobitems(jobid, state) I created bogus test data where jobid was always 0 and state was cycled between 0,1,2. Query: SELECT COUNT(*) FROM jobitems WHERE jobid = 0 AND state = 0 Initial results were around 600ms for 100000 rows, when I realised it would probably speed up if I put in the second index listed above. The performance is still linear but it does speed up as now it doesn't have to iterate through every row in the table, only the ones which match the WHERE clause. 1-100 rows: <1ms 1000 rows: 2ms 10000 rows: 7ms 100000 rows: 66ms 1000000 rows: 1176ms I naively thought an index would also result in being able to determine the count quickly for any given query on that index. If I log the query plan it confirms what's happening. It estimates the number of rows which will come back and then says it's iterating through each row to determine the count. I don't suppose there is some advanced way to get an approximate count. When the number of items is so large it seems a little pointless to get an accurate count anyway; within 1% would be acceptable if there were a way to get it quickly. My initial thought is to have total columns on the table I'm joining to, and update both whenever anything is changed. Problem is that two users adding or removing items at the same time would be updating the same table and I'm not sure how clever Derby is if two users do an UPDATE at the same time which adds or subtracts from the existing value. If it merged them together as expected then I'm guessing would be a viable solution. Daniel
