Hi Emanuel, > Do you tried partitioned tables? diferent tablespaces? > set the storage external for more important columns? > what kind of indexes do you have?
I think the table design is ok. I don't know what "set the storage external" means. Maybe you can explain it to me in lay-person's terms? I think indices are ok, but I welcome your comments. canon=# \d genotype Table "public.genotype" Column | Type | Modifiers -----------------+-----------------------------+--------------------------------------------------------------- genotypeid | integer | not null default nextval('genotype_genotypeid_seq'::regclass) subjectid | integer | markerid | integer | allele1id | integer | allele2id | integer | datecreated | timestamp without time zone | not null datereplaced | timestamp without time zone | not null ignore | character(1) | not null default 'N'::bpchar inconsistent | character(1) | not null default 'N'::bpchar sourcetablename | character varying | not null sourceid | character varying | not null Indexes: "genotype_pkey" PRIMARY KEY, btree (genotypeid, datecreated) "genotype_genotypeid_idx" btree (genotypeid) "genotype_markerid_idx" btree (markerid) "genotype_source_idx" btree (sourceid, sourcetablename) "genotype_subjectid_idx" btree (subjectid) canon=# \d allele Table "public.allele" Column | Type | Modifiers --------------+-----------------------------+----------------------------------------------------------- alleleid | integer | not null default nextval('allele_alleleid_seq'::regclass) markerid | integer | value | character varying | not null datecreated | timestamp without time zone | not null datereplaced | timestamp without time zone | not null Indexes: "allele_pkey" PRIMARY KEY, btree (alleleid, datecreated) "allele_markerid_idx" btree (markerid) > try to run explain analyze for those querys, then > post the results. Yes, it is running for last 13+ hours and I have no idea how much longer it might take. > I recommend follow this thread: > http://archives.postgresql.org/pgsql-hackers/2009-02/msg00718.php That's pretty interesting stuff. I need to experiment tweaking parameters Joshua mentions. Alas, now is not a good time to do so. Regards, Tena Sakai -----Original Message----- From: pgsql-admin-ow...@postgresql.org on behalf of Emanuel Calvo Franco Sent: Fri 2/20/2009 4:10 AM To: pgsql-admin@postgresql.org Subject: Re: [ADMIN] very, very slow performance 2009/2/20 Tena Sakai <tsa...@gallo.ucsf.edu>: > Hi Everybody, > > Here's the query (please read them in fixed-size font, if > you can): > > select subjectid, genotype.markerid, a1.value as allele1, > a2.value as allele2, genotype.dateCreated, > genotype.dateReplaced, genotype.ignore, > genotype.inconsistent > from genotype, allele a1, allele a2 > where > allele1id = a1.alleleid > and > allele2id = a2.alleleid; > > Genotype table mentioned above has about 600,000,000+ rows. As > I mentioned, there are 20 more of them running concurrently. > 3 other jobs look like: > Do you tried partitioned tables? diferent tablespaces? set the storage external for more important columns? what kind of indexes do you have? > SELECT a.markerid,a.type,a.localname,b.ncbibuild, > a.chromosome,a.geneticposition,b.physicalposition, > a.strand,a.stdflanks,a.maxflanks,a.datecreated, > a.datereplaced,a.sourcetablename,a.sourceid, > b.dbsnprsid,a.ignore,a.gene > FROM public.marker a, public.snpposition b > WHERE > a.ignore= 'N' > AND a.datecreated <= (timestamp'Wed Oct 29 09:35:54.266 2008') > AND a.datereplaced > (timestamp'Wed Oct 29 09:35:54.266 2008') > AND a.localname IN > ('RS10757474','RS7859598','RS6148','RS9792663','RS1541125', > 'RS10511446','RS10814410','RS12338622','RS875587', > 'RS1590979', 'RS748786','RS958505','RS12352961', > and on and on and on...); > > > insert into summarystats > select 'Marker by Chromosomes', chromosome, > sourcetablename, > count(*), null, to_timestamp('2009-02-18 > 20:29:40.125', > 'yyyy-mm-dd hh:mi:ss.ms') > from marker > where ignore = 'N' > and datereplaced = '3000-01-01 12:00:00.000' > and exists (select 1 > from genotype > where genotype.markerid = > marker.markerid > and genotype.ignore = 'N' > and genotype.datereplaced = > '3000-01-01 12:00:00.000') > group by chromosome, sourcetablename; > > COPY public.genotype (genotypeid, subjectid, markerid, > allele1id, allele2id, datecreated, datereplaced, > ignore, inconsistent, sourcetablename, sourceid) > TO stdout; > Like Scott saids, try to run explain analyze for those querys, then post the results. > > Can anybody suggest anything that I can do to gain speed? > Any help is much appreciated. > I recommend follow this thread: http://archives.postgresql.org/pgsql-hackers/2009-02/msg00718.php > Regards, > > Tena Sakai > tsa...@gallo.ucsf.edu > > > -- Emanuel Calvo Franco Sumate al ARPUG ! (www.postgres-arg.org - www.arpug.com.ar) ArPUG / AOSUG Member Postgresql Support & Admin -- Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-admin