Re: [PERFORM] This query is still running after 10 hours...
What does observations_trigger do?

On Tue, 28 Sep 2004 08:19:57 -0600, Robert Creager [EMAIL PROTECTED] wrote:

Help? Normally, this query takes from 5 minutes to 2 hours to run. On this
update, it's been running for more than 10 hours. Can it be helped?

UPDATE obs_v
SET mag = obs_v.imag + zp.zero_v + cg.color_v * (obs_v.imag - i.imag),
    use = true
FROM color_groups AS cg, zero_pair AS zp, obs_i AS i, files AS f, groups AS g
WHERE obs_v.star_id = i.star_id
  AND obs_v.file_id = f.file_id
  AND cg.group_id = g.group_id
  AND g.night_id = f.night_id
  AND g.group_id = $group_id
  AND zp.pair_id = f.pair_id

Hash Join  (cost=130079.22..639663.94 rows=1590204 width=63)
  Hash Cond: ("outer".star_id = "inner".star_id)
  ->  Seq Scan on obs_i i  (cost=0.00..213658.19 rows=10391319 width=8)
  ->  Hash  (cost=129094.19..129094.19 rows=77211 width=59)
        ->  Nested Loop  (cost=250.69..129094.19 rows=77211 width=59)
              ->  Hash Join  (cost=250.69..307.34 rows=67 width=12)
                    Hash Cond: ("outer".pair_id = "inner".pair_id)
                    ->  Seq Scan on zero_pair zp  (cost=0.00..43.32 rows=2532 width=8)
                    ->  Hash  (cost=250.40..250.40 rows=118 width=12)
                          ->  Hash Join  (cost=4.80..250.40 rows=118 width=12)
                                Hash Cond: ("outer".night_id = "inner".night_id)
                                ->  Seq Scan on files f  (cost=0.00..199.28 rows=9028 width=12)
                                ->  Hash  (cost=4.80..4.80 rows=1 width=8)
                                      ->  Nested Loop  (cost=0.00..4.80 rows=1 width=8)
                                            ->  Seq Scan on color_groups cg  (cost=0.00..2.84 rows=1 width=8)
                                                  Filter: (171 = group_id)
                                            ->  Seq Scan on groups g  (cost=0.00..1.95 rows=1 width=8)
                                                  Filter: (group_id = 171)
              ->  Index Scan using obs_v_file_id_index on obs_v  (cost=0.00..1893.23 rows=2317 width=51)
                    Index Cond: (obs_v.file_id = "outer".file_id)

Table definitions:

tassiv=# \d color_groups
                  Table "public.color_groups"
    Column    |  Type   |                  Modifiers
--------------+---------+--------------------------------------------------------------
 group_id     | integer | not null default nextval('color_groups_group_id_seq'::text)
 color_u      | real    |
 color_b      | real    |
 color_v      | real    |
 color_r      | real    |
 color_i      | real    |
 max_residual | real    |
Indexes:
    color_groups_pkey primary key, btree (group_id)
    color_group_group_id_index btree (group_id)

tassiv=# \d zero_pair
      Table "public.zero_pair"
 Column  |  Type   | Modifiers
---------+---------+-----------
 pair_id | integer | not null
 zero_u  | real    | default 0
 zero_b  | real    | default 0
 zero_v  | real    | default 0
 zero_r  | real    | default 0
 zero_i  | real    | default 0
Indexes:
    zero_pair_pkey primary key, btree (pair_id)
Foreign-key constraints:
    $1 FOREIGN KEY (pair_id) REFERENCES pairs(pair_id) ON DELETE CASCADE

tassiv=# \d obs_v
                 Table "public.obs_v"
 Column  |  Type   |                  Modifiers
---------+---------+----------------------------------------------
 x       | real    | not null
 y       | real    | not null
 imag    | real    | not null
 smag    | real    | not null
 loc     | spoint  | not null
 obs_id  | integer | not null default nextval('obs_id_seq'::text)
 file_id | integer | not null
 use     | boolean | default false
 solve   | boolean | default false
 star_id | integer |
 mag     | real    |
Indexes:
    obs_v_file_id_index btree (file_id)
    obs_v_loc_index gist (loc)
    obs_v_obs_id_index btree (obs_id)
    obs_v_star_id_index btree (star_id)
    obs_v_use_index btree (use)
Foreign-key constraints:
    obs_v_files_constraint FOREIGN KEY (file_id) REFERENCES files(file_id) ON DELETE CASCADE
    obs_v_star_id_constraint FOREIGN KEY (star_id) REFERENCES catalog(star_id) ON DELETE SET NULL
Triggers:
    obs_v_trig BEFORE INSERT OR DELETE OR UPDATE ON obs_v FOR EACH ROW EXECUTE PROCEDURE observations_trigger()

tassiv=# \d files
                        Table "public.files"
  Column  |            Type             |              Modifiers
----------+-----------------------------+------------------------------------------------------
 file_id  | integer                     | not null default nextval('files_file_id_seq'::text)
 night_id | integer                     |
 pair_id  | integer                     |
 name     | character varying           | not null
 date     | timestamp without time zone |
Indexes:
    files_pkey primary key, btree (file_id)
    files_name_key
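One way to see where the time actually goes (a sketch, not from the original thread) is to run the statement under EXPLAIN ANALYZE inside a transaction and roll it back, so the test update is discarded. Here `$group_id` is replaced with the literal 171 that appears in the plan above:

```sql
-- Sketch: EXPLAIN ANALYZE executes the statement and reports actual
-- per-node times, which would show whether the hash join or something
-- else (e.g. trigger work) dominates the runtime.
BEGIN;
EXPLAIN ANALYZE
UPDATE obs_v
SET mag = obs_v.imag + zp.zero_v + cg.color_v * (obs_v.imag - i.imag),
    use = true
FROM color_groups AS cg, zero_pair AS zp, obs_i AS i, files AS f, groups AS g
WHERE obs_v.star_id = i.star_id
  AND obs_v.file_id = f.file_id
  AND cg.group_id = g.group_id
  AND g.night_id = f.night_id
  AND g.group_id = 171
  AND zp.pair_id = f.pair_id;
ROLLBACK;  -- discard the test update
```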
Re: [PERFORM] This query is still running after 10 hours...
On Tue, 28 Sep 2004 20:21:40 -0600, Robert Creager [EMAIL PROTECTED] wrote:

The trigger keeps another table (catalog) up to date with the information from the obs_v and obs_i tables. There are no direct insert/update/deletes on the catalog table, only through the trigger.

It's possible that the update to catalog is what is really taking a long time. You might wish to explain that query just to make sure. You might also wish to disable the trigger just to rule it out.

Does catalog have any triggers on it? Does it have any foreign keys? I've shot myself in the foot on this before, which is the only reason I ask about it.

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
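A hedged sketch of ruling the trigger out. Note the `ALTER TABLE ... DISABLE TRIGGER` syntax assumed here is from later PostgreSQL releases (8.1+); on the 7.x versions current in this thread you would instead DROP TRIGGER and recreate it from its definition:

```sql
-- Sketch only: temporarily disable the trigger for a timed test run.
BEGIN;
ALTER TABLE obs_v DISABLE TRIGGER obs_v_trig;
-- ... run the slow UPDATE here and time it ...
ALTER TABLE obs_v ENABLE TRIGGER obs_v_trig;
ROLLBACK;  -- throw the whole test away, trigger state included
```

If the update is fast with the trigger disabled, the time is going into the trigger's work on catalog rather than the join itself.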
Re: [PERFORM] Table UPDATE is too slow
Do all of the commands to swap tables in a transaction. The table gets locked briefly, but that should have a lot less impact than the update command.

On Mon, 06 Sep 2004 01:28:04 +0200, Marinos J. Yannikos [EMAIL PROTECTED] wrote:

That said, I'm not entirely sure how well postgres' client libraries can deal with tables being renamed while in use; perhaps someone can shed some light on this.
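The swap-inside-a-transaction approach can be sketched as follows (table names are illustrative, not from the original thread; ALTER TABLE ... RENAME is transactional in PostgreSQL):

```sql
-- new_table was built and fully populated before this point.
BEGIN;
ALTER TABLE live_table RENAME TO old_table;
ALTER TABLE new_table RENAME TO live_table;
COMMIT;  -- readers block only for the duration of the two renames

DROP TABLE old_table;  -- once nothing references it any more
```

Because both renames commit atomically, queries see either the old table or the new one, never a missing table.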
Re: [PERFORM] Anyone familiar with Apple Xserve RAID
Actually, you are both right and wrong. The XRaid uses FibreChannel to communicate with the host machine(s). The RAID controller is a FibreChannel controller. After that there is a FibreChannel-to-UltraATA conversion for each drive, with a separate ATA bus for each drive. What I am curious about is whether this setup gets around the ATA fsync problems, where the drive reports the write complete before it is actually performed.

Josh Berkus wrote:

Guys, the XServe/XRaid comes with FibreChannel

I stand corrected. That should help things some; it makes it more of a small tradeoff between performance and storage size for the drives.

--
Kevin Barnard
Speed Fulfillment and Call Center
[EMAIL PROTECTED]
214-258-0120
Re: [PERFORM] Timestamp-based indexing
Harmon S. Nine wrote:

monitor=# explain analyze select * from eventtable where timestamp > CURRENT_TIMESTAMP - INTERVAL '10 minutes';
QUERY PLAN

Try:

SELECT * FROM eventtable
WHERE timestamp BETWEEN (CURRENT_TIMESTAMP - INTERVAL '10 minutes')
                    AND CURRENT_TIMESTAMP;

This uses a range of valid times. What your query is doing is looking from 10 minutes ago to an infinite future. Statistically speaking, that should encompass most of the table because you have an infinite range, so no index will be used. If you assign a range, the planner can figure out what you are looking for.

--
Kevin Barnard
Speed Fulfillment and Call Center
[EMAIL PROTECTED]
214-258-0120
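A hedged sketch of verifying the fix (the table and column names come from the post; the index name is illustrative, and the column is quoted because `timestamp` is also a type name):

```sql
-- With a btree index on the timestamp column, the bounded BETWEEN
-- predicate can become an index range scan instead of a seq scan.
CREATE INDEX eventtable_ts_idx ON eventtable ("timestamp");
ANALYZE eventtable;

EXPLAIN ANALYZE
SELECT *
FROM eventtable
WHERE "timestamp" BETWEEN CURRENT_TIMESTAMP - INTERVAL '10 minutes'
                      AND CURRENT_TIMESTAMP;
```

The EXPLAIN ANALYZE output should show an Index Scan on eventtable_ts_idx once both bounds are in place.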
Re: [PERFORM] atrocious update performance
On 5 Apr 2004 at 12:05, Rosser Schwarz wrote:

Just this morning, however, I created a copy of the target table (all 4.7M rows), with absolutely no foreign keys referring to it, and ran the update against the copy. That update took 2300 seconds. The join columns were indexed in both cases.

Have you added indexes for the custid column on the tables account.acct, account.orgacct and note? I haven't followed the entire thread, but if you have cascading FKs on those tables without an index on that column, that could cause your delay.

Kevin Barnard
SpeedFC
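The suggestion amounts to something like the following (a sketch; the table and column names are taken from the message above, the index names are illustrative):

```sql
-- Index the referencing custid columns so the cascading foreign-key
-- checks fired by the update can use index scans instead of a
-- sequential scan of the referencing table for every modified row.
CREATE INDEX acct_custid_idx ON account.acct (custid);
CREATE INDEX orgacct_custid_idx ON account.orgacct (custid);
CREATE INDEX note_custid_idx ON note (custid);
```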
Re: [PERFORM] FreeBSD config
On 26 Feb 2004 at 13:58, Dror Matalon wrote:

which brings me back to my question why not make FreeBSD use more of its memory for disk caching and then tell postgres about it.

I think there is some confusion about maxbufsize and hibufspace. I was looking at a comment in the FreeBSD 4.9 source that explains this. I think you will want to increase effective_cache_size to match maxbufsize, not hibufspace, but I could be wrong.

$FreeBSD: src/sys/kern/vfs_bio.c,v 1.242.2.21 line 363
Re: [PERFORM] Increasing number of PG connections.
On 2 Feb 2004 at 16:45, scott.marlowe wrote:

Do you have the cache set to write back or write through? Write through can be a performance killer. But I don't think your RAID is the problem; it looks to me like postgresql is doing a lot of I/O. When you run top, do the postgresql processes show a lot of D status? That's usually waiting on I/O.

Actually, I'm not sure. It's set up with the factory defaults from IBM. When I start hitting the limit I was surprised to find only a few D status indicators. Most of the processes were sleeping.

what you want to do is get the machine to a point where the kernel cache is about twice the size or larger than the shared_buffers. I'd start at 1 shared buffers and 4096 sort mem and see what happens. If you've still got 2 gig kernel cache at that point, then increase both a bit (2x or so) and see how much kernel cache you've got. If your kernel cache stays above 1 gig, and the machine is running faster, you're doing pretty good.

I've set shared to 1 and sort to 4096. I just have to wait until the afternoon before I see system load start to max out. Thanks for the tips; I'm crossing my fingers.

--
Kevin Barnard
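A hedged postgresql.conf sketch along the lines of the advice above. All values here are illustrative assumptions, not settings from the thread; the idea is moderate shared_buffers and sort_mem, with the bulk of RAM left to the kernel cache and described to the planner via effective_cache_size:

```ini
# Illustrative starting point for a 4 GB machine; tune and re-measure,
# keeping kernel cache at least ~2x shared_buffers.
max_connections = 700
shared_buffers = 10000          # ~78 MB at 8 kB pages (assumed value)
sort_mem = 4096                 # per-sort kB; 700 backends can each use this
effective_cache_size = 250000   # ~2 GB of expected kernel cache, in 8 kB pages
```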
[PERFORM] Increasing number of PG connections.
I am running a dual Xeon hyperthreaded server with 4GB RAM and RAID-5. The only thing running on the server is Postgres, running under Fedora. I have a 700 connection limit.

The DB is set up as a backend for a very high volume website. Most of the queries are simple, such as logging accesses, user login verification, etc. There are a few bigger things such as reporting, but for the most part each transaction lasts less than a second. The connections are not persistent (I'm using pg_connect in PHP).

The system was at 2 GB with a 400 connection limit. We ran into problems because we hit the limit of connections during high volume.

1. Does 400 connections sound consistent with the 2GB of RAM? Does 700 sound good with 4 GB? I've read a little on optimizing postgres. Is there anything else I can do, maybe OS-wise, to increase how many connections I get before I start swapping?

2. Are there any clustering technologies that will work with postgres? Specifically I'm looking at increasing the number of connections.

The bottom line is that since the website launched (middle of January) we have increased the number of http connections and increased bandwidth allowances by over 10 times. The site continues to grow and we are looking at our options. Some of the ideas have been possible DB replication: write to a master and read from multiple slaves. Other ideas include increasing hardware.

This is the biggest site I have ever worked with. Almost everything else fits in a T1 with a single DB server handling multiple sites. Does anybody with experience in this realm have any suggestions?

Thank you in advance for whatever help you can provide.

--
Kevin Barnard
Re: [PERFORM] Increasing number of PG connections.
On 2 Feb 2004 at 13:58, scott.marlowe wrote:

what do you mean at 2 GB? Is that how much is in kernel cache plus buffer, plus used, plus etc??? Could you give us the top of top output to make sure? If most of that is kernel cache, then that's fine.

2GB was total system memory. We upgraded to 4GB prior to increasing the number of connections. Here's the top of top:

 16:14:17  up 2 days, 16:15,  1 user,  load average: 7.60, 6.56, 4.61
730 processes: 721 sleeping, 9 running, 0 zombie, 0 stopped
CPU states:   cpu    user    nice  system    irq  softirq  iowait    idle
            total    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
            cpu00    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
            cpu01    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
            cpu02    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
            cpu03    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
Mem:  3747644k av, 3298344k used,  449300k free,       0k shrd,  147880k buff
                   2158532k active,           760040k inactive
Swap: 1048088k av,       0k used, 1048088k free                 2262156k cached

The DB is pretty close to max connections at this point in time. I don't know why CPU shows 0% in every bucket. It looks like I can increase the number of connections a little from here. This is a fairly standard Fedora install. It's using version 2.4.22 of the kernel. Postgres is a compiled version of 7.4.1.

experience has been that individual postgresql backends only weigh in at a megabyte at most, and they share buffer, so 700 connections can be anywhere from 300 meg to 1 gig. the rest would be buffer memory. It's not a good idea to give up too much to shared buffers, as the database isn't as good at caching as the kernel.

OK, I take this as I should keep shared buffers around 2x connections then, correct?

What do you have in postgresql.conf? sort_mem, shared_buffers, etc???

Here is what I have that is not set from the defaults:

max_connections = 700
shared_buffers = 1500
sort_mem = 512
random_page_cost = 2
stats_start_collector = true
stats_command_string = true
stats_block_level = true
stats_row_level = true

sort_mem can be a real killer if it lets the processes chew up too much memory. Once sort_mem gets high enough to make the machine start swapping it is doing more harm than good being that high, and should usually be lowered a fair bit.

I dropped it down to 512, as you can see. Should I be running with all of the stats on? I am no longer using pg_autovacuum. I seem to be getting better results with an hourly VACUUM ANALYZE.

How many disks in your RAID5? The more the better. Is it hardware with battery backed cache? If you write much to it, it will help to have battery backed cache on board. If it's a megaraid / LSI board, get the megaraid2 driver; it's supposedly much faster.

4 disk IBM ServeRAID 5i with battery backed cache.

You may find it hard to get postgresql to use any more memory than you have, as 32 bit apps can only address 2 gigs anyway, but the extra can certainly be used by the kernel as cache, which will help.

Isn't that only true for each individual process space? Shouldn't each process have access to at most 2GB? If each backend is in its own process space, is this really a limit, since all of my queries are pretty small?

I have been monitoring the system as it gets up to load. For most of the time the system sits around 100-300 connections. Once it ramps up, it ramps up hard. Top starts cycling at 0 and 133% CPU for irq, softirq and iowait. The system stays at 700 connections until users give up. I can watch bandwidth utilization drop to almost nothing right before the DB catches up.

--
Kevin Barnard
Speed Fulfillment and Call Center
[EMAIL PROTECTED]
214-258-0120