Hi Everybody,

I am running postgres v8.3.3 on Red Hat Linux (Dell hardware) with 4 CPUs. This machine is terribly bogged down and I would like a bit of help as to what can be done.
For the last 18 or so hours, there have been 24 queries running. What's odd is that 21 of them are identical. This happens once in a while (roughly once a month) because of the data I need to process. Basically, I fire up an application for each data set (there are 21 of them); the application gets data out of postgres, transforms matrices, and puts them into a form the subsequent processes can handle. I know it sounds dumb, but that's how it is for the foreseeable future. Here's the query (please read it in a fixed-size font, if you can):

  select subjectid, genotype.markerid,
         a1.value as allele1, a2.value as allele2,
         genotype.dateCreated, genotype.dateReplaced,
         genotype.ignore, genotype.inconsistent
    from genotype, allele a1, allele a2
   where allele1id = a1.alleleid
     and allele2id = a2.alleleid;

The genotype table mentioned above has about 600,000,000+ rows. As I mentioned, there are 20 more of these running concurrently. The 3 other jobs look like:

  SELECT a.markerid, a.type, a.localname, b.ncbibuild,
         a.chromosome, a.geneticposition, b.physicalposition,
         a.strand, a.stdflanks, a.maxflanks, a.datecreated,
         a.datereplaced, a.sourcetablename, a.sourceid,
         b.dbsnprsid, a.ignore, a.gene
    FROM public.marker a, public.snpposition b
   WHERE a.ignore = 'N'
     AND a.datecreated <= (timestamp 'Wed Oct 29 09:35:54.266 2008')
     AND a.datereplaced > (timestamp 'Wed Oct 29 09:35:54.266 2008')
     AND a.localname IN ('RS10757474','RS7859598','RS6148','RS9792663',
                         'RS1541125','RS10511446','RS10814410','RS12338622',
                         'RS875587','RS1590979','RS748786','RS958505',
                         'RS12352961', and on and on and on...);

  insert into summarystats
  select 'Marker by Chromosomes', chromosome, sourcetablename, count(*), null,
         to_timestamp('2009-02-18 20:29:40.125', 'yyyy-mm-dd hh:mi:ss.ms')
    from marker
   where ignore = 'N'
     and datereplaced = '3000-01-01 12:00:00.000'
     and exists (select 1
                   from genotype
                  where genotype.markerid = marker.markerid
                    and genotype.ignore = 'N'
                    and genotype.datereplaced = '3000-01-01 12:00:00.000')
   group by
     chromosome, sourcetablename;

  COPY public.genotype (genotypeid, subjectid, markerid, allele1id,
                        allele2id, datecreated, datereplaced, ignore,
                        inconsistent, sourcetablename, sourceid) TO stdout;

The one in the middle (the insert job), which runs 6 days a week, takes anywhere between 30 minutes and 1.5 hours when the load average is in the neighborhood of 3.

The total memory the machine has is 32 GB and nearly 100% of it is consumed. Swap is twice as large as physical memory, but very little of it is used. The load average of the machine when I am in this fix is somewhere between 25 and 27. Each postgres process consumes very little CPU time: the copy takes maybe 4% (+/-), and the rest of them use somewhere between 0.3% and 0.7%. As for memory, the copy takes 3% to 4% and the rest take something like 1.7% each.

In terms of postgres configuration:

  max_connections     = 100
  shared_buffers      = 1024MB
  temp_buffers        = 128MB
  max_fsm_pages       = 153600
  vacuum_cost_delay   = 0
  checkpoint_segments = 3
  checkpoint_timeout  = 5min
  checkpoint_warning  = 30s

I don't think I am doing anything wild... Am I?

Oh, one more thing: I said that there are 24 queries/jobs happening, but there are also a bunch that say <IDLE> or <IDLE> in transaction, according to the pg_stat_activity view.

Can anybody suggest anything that I can do to gain speed? Any help is much appreciated.

Regards,
Tena Sakai
tsa...@gallo.ucsf.edu
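P.S. In case the join plan of the big select matters here, this is what I would run to capture it. It is simply the query shown above prefixed with EXPLAIN (no assumptions beyond the table and column names already given), and I am happy to post the output:

  EXPLAIN
  select subjectid, genotype.markerid,
         a1.value as allele1, a2.value as allele2,
         genotype.dateCreated, genotype.dateReplaced,
         genotype.ignore, genotype.inconsistent
    from genotype, allele a1, allele a2
   where allele1id = a1.alleleid
     and allele2id = a2.alleleid;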
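P.P.S. Regarding the <IDLE> in transaction entries, if it helps, a query along these lines against pg_stat_activity should show how long each of those transactions has been sitting open (column names as in 8.3: procpid, xact_start, query_start, current_query):

  SELECT procpid, usename, xact_start, query_start, current_query
    FROM pg_stat_activity
   WHERE current_query LIKE '<IDLE>%'
   ORDER BY xact_start;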