Re: [PERFORM] This query is still running after 10 hours...

2004-09-28 Thread Kevin Barnard
What does observations_trigger do?



On Tue, 28 Sep 2004 08:19:57 -0600, Robert Creager
[EMAIL PROTECTED] wrote:

 Help?

 Normally, this query takes from 5 minutes to 2 hours to run.  On this update, it's 
 been running for more than 10 hours.

 Can it be helped?

 UPDATE obs_v
 SET mag = obs_v.imag + zp.zero_v + cg.color_v * (obs_v.imag - i.imag),
 use = true
 FROM color_groups AS cg, zero_pair AS zp, obs_i AS i, files AS f, groups AS g
 WHERE  obs_v.star_id = i.star_id
AND obs_v.file_id = f.file_id
AND cg.group_id = g.group_id
AND g.night_id = f.night_id
AND g.group_id = $group_id
AND zp.pair_id = f.pair_id

 Hash Join  (cost=130079.22..639663.94 rows=1590204 width=63)
   Hash Cond: ("outer".star_id = "inner".star_id)
   ->  Seq Scan on obs_i i  (cost=0.00..213658.19 rows=10391319 width=8)
   ->  Hash  (cost=129094.19..129094.19 rows=77211 width=59)
         ->  Nested Loop  (cost=250.69..129094.19 rows=77211 width=59)
               ->  Hash Join  (cost=250.69..307.34 rows=67 width=12)
                     Hash Cond: ("outer".pair_id = "inner".pair_id)
                     ->  Seq Scan on zero_pair zp  (cost=0.00..43.32 rows=2532 width=8)
                     ->  Hash  (cost=250.40..250.40 rows=118 width=12)
                           ->  Hash Join  (cost=4.80..250.40 rows=118 width=12)
                                 Hash Cond: ("outer".night_id = "inner".night_id)
                                 ->  Seq Scan on files f  (cost=0.00..199.28 rows=9028 width=12)
                                 ->  Hash  (cost=4.80..4.80 rows=1 width=8)
                                       ->  Nested Loop  (cost=0.00..4.80 rows=1 width=8)
                                             ->  Seq Scan on color_groups cg  (cost=0.00..2.84 rows=1 width=8)
                                                   Filter: (171 = group_id)
                                             ->  Seq Scan on groups g  (cost=0.00..1.95 rows=1 width=8)
                                                   Filter: (group_id = 171)
               ->  Index Scan using obs_v_file_id_index on obs_v  (cost=0.00..1893.23 rows=2317 width=51)
                     Index Cond: (obs_v.file_id = "outer".file_id)

 Table definitions:

 tassiv=# \d color_groups
                              Table "public.color_groups"
     Column    |  Type   |                          Modifiers
 --------------+---------+--------------------------------------------------------------
  group_id     | integer | not null default nextval('color_groups_group_id_seq'::text)
  color_u      | real    |
  color_b      | real    |
  color_v      | real    |
  color_r      | real    |
  color_i      | real    |
  max_residual | real    |
 Indexes:
     "color_groups_pkey" primary key, btree (group_id)
     "color_group_group_id_index" btree (group_id)

 tassiv=# \d zero_pair
       Table "public.zero_pair"
  Column  |  Type   | Modifiers
 ---------+---------+-----------
  pair_id | integer | not null
  zero_u  | real    | default 0
  zero_b  | real    | default 0
  zero_v  | real    | default 0
  zero_r  | real    | default 0
  zero_i  | real    | default 0
 Indexes:
     "zero_pair_pkey" primary key, btree (pair_id)
 Foreign-key constraints:
     "$1" FOREIGN KEY (pair_id) REFERENCES pairs(pair_id) ON DELETE CASCADE

 tassiv=# \d obs_v
                       Table "public.obs_v"
  Column  |  Type   |                  Modifiers
 ---------+---------+----------------------------------------------
  x       | real    | not null
  y       | real    | not null
  imag    | real    | not null
  smag    | real    | not null
  loc     | spoint  | not null
  obs_id  | integer | not null default nextval('obs_id_seq'::text)
  file_id | integer | not null
  use     | boolean | default false
  solve   | boolean | default false
  star_id | integer |
  mag     | real    |
 Indexes:
     "obs_v_file_id_index" btree (file_id)
     "obs_v_loc_index" gist (loc)
     "obs_v_obs_id_index" btree (obs_id)
     "obs_v_star_id_index" btree (star_id)
     "obs_v_use_index" btree (use)
 Foreign-key constraints:
     "obs_v_files_constraint" FOREIGN KEY (file_id) REFERENCES files(file_id) ON DELETE CASCADE
     "obs_v_star_id_constraint" FOREIGN KEY (star_id) REFERENCES catalog(star_id) ON DELETE SET NULL
 Triggers:
     obs_v_trig BEFORE INSERT OR DELETE OR UPDATE ON obs_v FOR EACH ROW EXECUTE PROCEDURE observations_trigger()

 tassiv=# \d files
                            Table "public.files"
   Column  |            Type             |                      Modifiers
 ----------+-----------------------------+------------------------------------------------------
  file_id  | integer                     | not null default nextval('files_file_id_seq'::text)
  night_id | integer                     |
  pair_id  | integer                     |
  name     | character varying           | not null
  date     | timestamp without time zone |
 Indexes:
     "files_pkey" primary key, btree (file_id)
     "files_name_key"

Re: [PERFORM] This query is still running after 10 hours...

2004-09-28 Thread Kevin Barnard
On Tue, 28 Sep 2004 20:21:40 -0600, Robert Creager
[EMAIL PROTECTED] wrote:
 
 The trigger keeps another table (catalog) up to date with the information from
 the obs_v and obs_i tables.  There are no direct inserts/updates/deletes on the
 catalog table, only through the trigger.
 

It's possible that the update to catalog is what is really taking a long time.
You might wish to EXPLAIN that query just to make sure.  You might also wish to
disable the trigger just to rule it out.  Does catalog have any triggers on it?
Does it have any foreign keys?

I've shot myself in the foot on this before, which is the only reason I ask
about it.
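
A minimal sketch of ruling the trigger out, reusing the trigger definition from
your earlier post (7.4-era PostgreSQL has no ALTER TABLE ... DISABLE TRIGGER, so
dropping and recreating it is the usual approach):

BEGIN;
-- Remove the trigger so the UPDATE runs without it.
DROP TRIGGER obs_v_trig ON obs_v;
-- ... run and time the slow UPDATE here ...
-- Recreate the trigger exactly as it was defined.
CREATE TRIGGER obs_v_trig BEFORE INSERT OR DELETE OR UPDATE ON obs_v
    FOR EACH ROW EXECUTE PROCEDURE observations_trigger();
COMMIT;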

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Table UPDATE is too slow

2004-09-05 Thread Kevin Barnard
Do all of the commands to swap tables inside a single transaction.  The table
gets locked briefly, but that should have far less impact than the UPDATE
command.
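
A minimal sketch of the swap, assuming the replacement data has already been
loaded into a staging table (the table names here are hypothetical):

BEGIN;
-- Until COMMIT, other sessions block on the lock and then see the new table.
ALTER TABLE t RENAME TO t_old;
ALTER TABLE t_new RENAME TO t;
COMMIT;
-- Once satisfied, drop the old copy:
-- DROP TABLE t_old;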


On Mon, 06 Sep 2004 01:28:04 +0200, Marinos J. Yannikos [EMAIL PROTECTED] wrote:
 
 That said, I'm not entirely sure how well postgres' client libraries can
 deal with tables being renamed while in use, perhaps someone can shed
 some light on this.


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Anyone familiar with Apple Xserve RAID

2004-08-26 Thread Kevin Barnard
Actually, you are both right and wrong.  The XRaid uses FibreChannel to
communicate with the host machine(s).  The RAID controller is a FibreChannel
controller; after that there is a FibreChannel-to-UltraATA conversion for each
drive, with a separate ATA bus per drive.

What I am curious about is whether this setup gets around the ATA fsync
problem, where the drive reports a write as complete before it is actually
performed.


Josh Berkus wrote:

 Guys,

  the XServe/XRaid comes with FibreChannel

 I stand corrected.  That should help things some; it makes it more of a small
 tradeoff between performance and storage size for the drives.

-- 
Kevin Barnard
Speed Fulfillment and Call Center
[EMAIL PROTECTED]
214-258-0120




Re: [PERFORM] Timestamp-based indexing

2004-07-26 Thread Kevin Barnard

Harmon S. Nine wrote:
 monitor=# explain analyze select * from eventtable where timestamp >
 CURRENT_TIMESTAMP - INTERVAL '10 minutes';
 QUERY PLAN

Try

SELECT * FROM eventtable WHERE timestamp BETWEEN (CURRENT_TIMESTAMP -
INTERVAL '10 minutes') AND CURRENT_TIMESTAMP;

This will use a range of valid times.  What your query is doing is looking for
everything from 10 minutes ago to an infinite future.  Statistically speaking,
that encompasses most of the table, because you have an unbounded range, so no
index will be used.  If you give the planner a bounded range, it can figure out
what you are looking for.
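
As a hedged sketch, assuming eventtable has (or is given) a btree index on its
timestamp column:

-- Hypothetical index; skip if one already exists.
CREATE INDEX eventtable_timestamp_idx ON eventtable (timestamp);

EXPLAIN ANALYZE
SELECT * FROM eventtable
WHERE timestamp BETWEEN (CURRENT_TIMESTAMP - INTERVAL '10 minutes')
                    AND CURRENT_TIMESTAMP;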

--
Kevin Barnard
Speed Fulfillment and Call Center
[EMAIL PROTECTED]
214-258-0120
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] atrocious update performance

2004-04-05 Thread Kevin Barnard
On 5 Apr 2004 at 12:05, Rosser Schwarz wrote:

 Just this morning, however, I created a copy of the target table (all
 4.7M rows), with absolutely no foreign keys referring to it, and ran
 the update against the copy.  That update took 2300 seconds.  The
 join columns were indexed in both cases.

Have you added indexes on the custid column for the tables account.acct,
account.orgacct and note?

I haven't followed the entire thread, but if you have cascading FKs on those
tables without an index on that column, that could cause your delay.
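
For illustration, something along these lines (the index names are made up, and
the table/column names are taken from the thread, so adjust to the real schema):

-- A cascading FK has to find matching rows in the referencing table whenever a
-- referenced row is updated or deleted; without an index that lookup is a
-- sequential scan per row.
CREATE INDEX acct_custid_idx    ON account.acct (custid);
CREATE INDEX orgacct_custid_idx ON account.orgacct (custid);
CREATE INDEX note_custid_idx    ON note (custid);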

Kevin Barnard
SpeedFC



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] FreeBSD config

2004-02-26 Thread Kevin Barnard
On 26 Feb 2004 at 13:58, Dror Matalon wrote:

 
 which brings me back to my question why not make Freebsd use more of its
 memory for disk caching and then tell postgres about it. 
 


I think there is some confusion about maxbufsize and hibufspace.  I was looking
at a comment in the FreeBSD 4.9 source that explains this.  I think you will
want to increase effective_cache_size to match maxbufsize rather than
hibufspace, but I could be wrong.


$FreeBSD: src/sys/kern/vfs_bio.c,v 1.242.2.21 line 363
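
As a hedged sketch of the arithmetic (the sysctl value below is hypothetical):

-- FreeBSD reports buffer space in bytes, e.g.:
--   $ sysctl -n vfs.maxbufspace
--   235929600
-- effective_cache_size is counted in 8 kB pages: 235929600 / 8192 = 28800,
-- so in postgresql.conf:  effective_cache_size = 28800
SHOW effective_cache_size;  -- confirm what the running server is using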





Re: [PERFORM] Increasing number of PG connections.

2004-02-03 Thread Kevin Barnard
On 2 Feb 2004 at 16:45, scott.marlowe wrote:

 Do you have the cache set to write back or write through?  Write through 
 can be a performance killer.  But I don't think your RAID is the problem, 
 it looks to me like postgresql is doing a lot of I/O.  When you run top, 
 do the postgresql processes show a lot of D status? That's usually waiting 
 on I/O
 

Actually, I'm not sure.  It's set up with the factory defaults from IBM.  When I
start hitting the limit, I was surprised to find only a few D status indicators;
most of the processes were sleeping.

 what you want to do is get the machine to a point where the kernel cache 
 is about twice the size or larger, than the shared_buffers.  I'd start at 
 1 shared buffers and 4096 sort mem and see what happens.  If you've 
 still got 2 gig kernel cache at that point, then increase both a bit (2x 
 or so) and see how much kernel cache you've got.  If your kernel cache 
 stays above 1Gig, and the machine is running faster, you're doing pretty 
 good.
 

I've set shared to 1 and sort to 4096.  I just have to wait until the afternoon
before I see system load start to max out.  Thanks for the tips; I'm crossing my
fingers.

--
Kevin Barnard


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[PERFORM] Increasing number of PG connections.

2004-02-02 Thread Kevin Barnard
I am running a dual Xeon hyperthreaded server with 4 GB RAM and RAID-5.  The
only thing running on the server is Postgres, running under Fedora.  I have a
700 connection limit.

The DB is set up as a backend for a very high volume website.  Most of the
queries are simple, such as logging accesses, user login verification, etc.
There are a few bigger things such as reporting, but for the most part each
transaction lasts less than a second.  The connections are not persistent (I'm
using pg_connect in PHP).

The system was at 2 GB with a 400 connection limit.  We ran into problems
because we hit the connection limit during high volume.

1.  Does 400 connections sound consistent with 2 GB of RAM?  Does 700 sound
good with 4 GB?  I've read a little on optimizing Postgres.  Is there anything
else I can do, maybe OS-wise, to increase how many connections I get before I
start swapping?

2.  Are there any clustering technologies that will work with Postgres?
Specifically, I'm looking at increasing the number of connections.

The bottom line is, since the website launched (middle of January) we have
increased the number of HTTP connections and increased bandwidth allowances by
over 10 times.  The site continues to grow and we are looking at our options.
Some of the ideas have been DB replication (write to a master and read from
multiple slaves); other ideas include adding hardware.

This is the biggest site I have ever worked with.  Almost everything else fits
in a T1 with a single DB server handling multiple sites.  Does anybody with
experience in this realm have any suggestions?

Thank you in advance for whatever help you can provide.
--
Kevin Barnard



---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Increasing number of PG connections.

2004-02-02 Thread Kevin Barnard
On 2 Feb 2004 at 13:58, scott.marlowe wrote:

 what do you mean at 2 GB?  Is that how much is in kernel cache plus 
 buffer, plus used, plus etc???  Could you give us the top of top output to 
 make sure?  If most of that is kernel cache, then that's fine.  

2GB was total system memory.  We upgraded to 4GB prior to increasing the number
of connections.

Here's the top of top

 16:14:17  up 2 days, 16:15,  1 user,  load average: 7.60, 6.56, 4.61
730 processes: 721 sleeping, 9 running, 0 zombie, 0 stopped
CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
           total    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
           cpu00    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
           cpu01    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
           cpu02    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
           cpu03    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
Mem:  3747644k av, 3298344k used,  449300k free,       0k shrd,  147880k buff
      2158532k active, 760040k inactive
Swap: 1048088k av,       0k used, 1048088k free            2262156k cached


The DB is pretty close to max connections at this point in time.  I don't know
why CPU shows 0% in every bucket.  It looks like I can increase the number of
connections a little from here.  This is a fairly standard Fedora install,
using version 2.4.22 of the kernel.  Postgres is compiled from source, version
7.4.1.

 experience has been that individual postgresql backends only weigh in at a
 megabyte at most, and they share buffers, so 700 connections can be anywhere
 from 300 meg to 1 gig.  The rest would be buffer memory.  It's not a good idea
 to give up too much to shared buffers, as the database isn't as good at
 caching as the kernel.

OK, I take this to mean I should keep shared_buffers at around 2x connections
then, correct?

 
 What do you have in postgresql.conf?  sort_mem, shared_buffers, etc???

Here is what I have that is not set from the defaults.

max_connections = 700
shared_buffers = 1500
sort_mem = 512
random_page_cost = 2
stats_start_collector = true
stats_command_string = true
stats_block_level = true
stats_row_level = true


 sort_mem can be a real killer if it lets the processes chew up too much 
 memory.  Once sort_mem gets high enough to make the machine start swapping 
 it is doing more harm than good being that high, and should usually be 
 lowered a fair bit.

I dropped it down to 512, as you can see.  Should I be running with all of the
stats on?  I am no longer using pg_autovacuum; I seem to be getting better
results with an hourly VACUUM ANALYZE.
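
For reference, the hourly job is nothing fancy; a sketch (the database name and
schedule are hypothetical):

-- crontab entry:  0 * * * *  psql -d mydb -c 'VACUUM ANALYZE;'
VACUUM ANALYZE;  -- reclaims dead rows and refreshes planner statistics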

 How many disks in your RAID5?  The more the better.  Is it hardware with 
 battery backed cache?  If you write much to it it will help to have 
 battery backed cache on board.  If it's a megaraid / LSI board, get the 
 megaraid2 driver, it's supposedly much faster.

4 disk IBM ServeRAID 5i with battery backed cache.

 You may find it hard to get postgresql to use any more memory than you 
 have, as 32 bit apps can only address 2 gigs anyway, but the extra can 
 certainly be used by the kernel as cache, which will help.

Isn't that only true for each individual process space?  Shouldn't each process
have access to at most 2 GB?  If each backend is in its own process space, is
this really a limit, since all of my queries are pretty small?

I have been monitoring the system as it gets up to load.  Most of the time the
system sits around 100-300 connections.  Once it ramps up, it ramps up hard.
Top starts cycling between 0 and 133% CPU for irq, softirq and iowait.  The
system stays at 700 connections until users give up.  I can watch bandwidth
utilization drop to almost nothing right before the DB catches up.

--
Kevin Barnard
Speed Fulfillment and Call Center
[EMAIL PROTECTED]
214-258-0120

---(end of broadcast)---
TIP 8: explain analyze is your friend