[basedb-devel] Performance/delete issues on Postgres

2008-10-16 Thread Pawel Sztromwasser
Hello,

I found something interesting regarding BASE2 performance on Postgres 
(we have 8.3 but I suppose it is valid for previous versions as well). 
Recently I was unable to delete ArrayDesign (Illumina Human 8Ref, no 
items connected to it, features imported), because database query was 
endless. I stopped it after 3 days and started similar on development 
installation (much, much smaller). It took almost 20h to complete.

We started digging and found (among other similar) this: 
http://archives.postgresql.org/pgsql-admin/2006-02/msg00238.php
In short, postgres does not create indexes on foreign keys by default. 
That has huge impact on performance when updating/deleting items from 
table that is referenced by other big table. In case above Features 
(that are deleted with ArrayDesign) are referenced by all RawData* 
tables. I created indexes on feature_id column in all of these tables 
and delete took less than a minute.

That reminds me of another case where performance for postgres could be 
improved. When testing variuos kinds of jobs I noticed that concurent 
jobs creating root bioassayset took surprisingly long and did not scale 
well. That was due to sequential lookup in RawData* table on 
rawbioassay_id column. Creating index on that column improved 
performance by more than 50% (see: 
http://picasaweb.google.com/pawel.sztromwasser/Other#5257654388791782770).

I know that indexes have drawbacks as well, but in cases above they are 
a huge advantage. Could these indexes be incorporated in 
updateDb/indexes scripts? If you know about any other possible places in 
BASE database where similar things may occur, let me know and I will 
test them.

Best regards,
Pawel

-
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK  win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100url=/
___
basedb-devel mailing list
basedb-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/basedb-devel


Re: [basedb-devel] Performance/delete issues on Postgres

2008-10-16 Thread Nicklas Nordborg
Oh, I didn't know this. We are mainly working with MySQL which 
implicitly creates an index for every foreign key. I agree that indexes 
are certainly needed on the big tables.

I guess we could make the 'updateindexes.sh' script check that each 
foreign key definition has an associated index. Right now it can only be 
used to check that indexes that have been explicitly defined really 
exists in the database.

I'll add this for the 2.9 release.

/Nicklas


Pawel Sztromwasser wrote:
 Hello,
 
 I found something interesting regarding BASE2 performance on Postgres 
 (we have 8.3 but I suppose it is valid for previous versions as well). 
 Recently I was unable to delete ArrayDesign (Illumina Human 8Ref, no 
 items connected to it, features imported), because database query was 
 endless. I stopped it after 3 days and started similar on development 
 installation (much, much smaller). It took almost 20h to complete.
 
 We started digging and found (among other similar) this: 
 http://archives.postgresql.org/pgsql-admin/2006-02/msg00238.php
 In short, postgres does not create indexes on foreign keys by default. 
 That has huge impact on performance when updating/deleting items from 
 table that is referenced by other big table. In case above Features 
 (that are deleted with ArrayDesign) are referenced by all RawData* 
 tables. I created indexes on feature_id column in all of these tables 
 and delete took less than a minute.
 
 That reminds me of another case where performance for postgres could be 
 improved. When testing variuos kinds of jobs I noticed that concurent 
 jobs creating root bioassayset took surprisingly long and did not scale 
 well. That was due to sequential lookup in RawData* table on 
 rawbioassay_id column. Creating index on that column improved 
 performance by more than 50% (see: 
 http://picasaweb.google.com/pawel.sztromwasser/Other#5257654388791782770).
 
 I know that indexes have drawbacks as well, but in cases above they are 
 a huge advantage. Could these indexes be incorporated in 
 updateDb/indexes scripts? If you know about any other possible places in 
 BASE database where similar things may occur, let me know and I will 
 test them.
 
 Best regards,
 Pawel
 
 -
 This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
 Build the coolest Linux based applications with Moblin SDK  win great prizes
 Grand prize is a trip for two to an Open Source event anywhere in the world
 http://moblin-contest.org/redirect.php?banner_id=100url=/
 ___
 basedb-devel mailing list
 basedb-devel@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/basedb-devel


-
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK  win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100url=/
___
basedb-devel mailing list
basedb-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/basedb-devel