Hi everyone, first of all, it's great to have a RDBMS like derby around! Now, on to the 2 problems I'm trying to resolve...
Situation: 1 table, 15+ million rows, 4 integer values (id, id_group1, id_group2, id_group3), 750MB size on disk, derby in network mode Problem 1: select id_group, count(id) group by id_group This takes a couple of minutes, basically regardless of what I tried with indexes. I'd like it to take <10s if possible. Problem 2: It seemed to me that the query triggers a sequential scan (I see a lot of disk activity) and so I created a disk in memory and restored the database to this disk, expecting the query to be close to instantaneous. Better, but still horrible (80 sec). Which brings me to the crux of the 2nd problem: this in-memory exercise was without indexes, so I tried to create some (id_group1 and id for a start). However, the indexes seem to be _huge_ at 300-400MB each and spike at about 700MB before the index is completely created - I was quite surprised to see the indexes ttake more than several dozen MB. So, is there a way to reduce index size? And would any kind of indexes help the type of query I'm interested in? (Machine: Ubuntu Jaunty, 1.7GHz Pentium M, 2GB RAM, 5400 rpm disk) Thanks, Tomislav -- www.PanBI.org: business intelligence everywhere!
