performance, memory consumption...and a big hello :)

Tomi N/A Fri, 24 Jul 2009 14:31:36 -0700

Hi everyone,

first of all, it's great to have a RDBMS like derby around!
Now, on to the 2 problems I'm trying to resolve...


Situation: 1 table, 15+ million rows, 4 integer values (id, id_group1,
id_group2, id_group3), 750MB size on disk, derby in network mode

Problem 1:

select id_group, count(id)
group by id_group

This takes a couple of minutes, basically regardless of what I tried
with indexes. I'd like it to take <10s if possible.

Problem 2:

It seemed to me that the query triggers a sequential scan (I see a lot
of disk activity) and so I created a disk in memory and restored the
database to this disk, expecting the query to be close to
instantaneous. Better, but still horrible (80 sec).
Which brings me to the crux of the 2nd problem: this in-memory
exercise was without indexes, so I tried to create some (id_group1 and
id for a start). However, the indexes seem to be _huge_ at 300-400MB
each and spike at about 700MB before the index is completely created -
I was quite surprised to see the indexes ttake more than several dozen
MB.
So, is there a way to reduce index size? And would any kind of indexes
help the type of query I'm interested in?

(Machine: Ubuntu Jaunty, 1.7GHz Pentium M, 2GB RAM, 5400 rpm disk)

Thanks,
Tomislav

--
www.PanBI.org: business intelligence everywhere!

performance, memory consumption...and a big hello :)

Reply via email to