Re: [PERFORM] Fwd: FreeBSD, PostgreSQL, semwait and sbwait!

2004-03-24 Thread Mark Kirkwood


Josh Berkus wrote:

Forget hyperthreading. Look at their postgresql.conf settings. 8mb shared

mem, 16mb sort mem per connection for 512 connections, default 
effective_cache_size.

 

Umm... it's 64MB of shared buffers, isn't it? (Presumably shared_buffers = 8192,
i.e. 8192 * 8KB = 64MB.)

However, I agree completely with the general thrust of the message, particularly
about the 16MB of sort mem per connection - a very bad idea unless you are
running a data warehouse box for only a few users (not 512 of them, which is
potentially 512 * 16MB = 8GB of sort memory...)

regards

Mark



Re: [PERFORM] Fwd: FreeBSD, PostgreSQL, semwait and sbwait!

2004-03-24 Thread Mark Kirkwood
Darcy Buskermolen wrote:

--  Forwarded Message  --

Subject: FreeBSD, PostgreSQL, semwait and sbwait!
Date: March 23, 2004 12:02 pm
From: Jason Coene [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Hello all,

We're having a substantial problem with our FreeBSD 5.2 database server
running PostgreSQL - it's getting a lot of traffic (figure about 3,000
queries per second), but queries are slow, and it's seemingly waiting on
other things than CPU time
 

Could this be a 5.2 performance issue?

Despite certain areas where 5.x performance is known to be much better
than 4.x (e.g. networking), that may not show up in practice for a
complete application.
(e.g. I am still running 4.9 as it vastly outperformed 5.1 for ~100
database sessions running queries - note that I have not tried 5.2,
so I am happy to be corrected on this.)

regards

Mark



Re: [PERFORM] Benchmarking postgres on Solaris/Linux

2004-03-24 Thread Mark Kirkwood


Josh Berkus wrote:

Mark,

 

It might be worth considering Apple if you want a 64-bit chip that has a
clock speed comparable to Intel's - the Xserv is similarly priced to Sun
V210 (both dual cpu 1U's).
   

Personally I'd stay *far* away from the XServs until Apple learns to build
some real server hardware. The current XServs have internal parts more
appropriate to a Dell desktop (Promise controller, low-speed commodity IDE
drives) than to a server.

If Apple has priced these 1U desktop machines similarly to Sun, then I sense
doom ahead for the Apple Server Division.

 

(thinks...) Point taken - the Xserv is pretty entry level...

However, having recently benchmarked a 280R vs a PIII Dell using a
Promise IDE RAID controller - and finding the Dell comparable (with
write cache *disabled*) - I suspect that the Xserv has a pretty good
chance of outperforming a V210 (it would certainly be interesting to
try out).

What I think has happened is that over the last few years the cheap/slow
IDE stuff has gotten pretty fast - even when you make "write" mean
write.

cheers

Mark



[PERFORM] slow vacuum performance

2004-03-24 Thread pginfo
Hi,

I am running pg 7.4.1 on a Linux box.
I have a middle-sized DB with many updates, and after them I try to run
vacuum full analyze.
It takes about 2 h.
If I try to dump and reload the DB it takes 20 min.

How can I improve the vacuum full analyze time?

My configuration:

shared_buffers = 15000  # min 16, at least max_connections*2,
8KB each
sort_mem = 1# min 64, size in KB
vacuum_mem = 32000  # min 1024, size in KB
effective_cache_size = 4# typically 8KB each
#max_fsm_pages = 2  # min max_fsm_relations*16, 6 bytes each

#max_fsm_relations = 1000   # min 100, ~50 bytes each


regards,
ivan.




Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux

2004-03-24 Thread Matt Clark
  Now if these vendors could somehow eliminate downtime due to human error
  we'd be talking *serious* reliablity.

 You mean making the OS smart enough to know when clearing the arp
 cache is a bonehead operation, or just making the hardware smart
 enough to realise that the keyswitch really shouldn't be turned
 while 40 people are logged in?  (Either way, I agree this'd be an
 improvement.  It'd sure make colocation a lot less painful.)

Well I was joking really, but those are two very good examples!  Yes, machines should 
require extra confirmation for operations like
those.  Hell, even a simple 'init 0' would be well served by a prompt that says "There
are currently 400 network sockets open, 50 remote users logged in, and 25 disk IOs per
second.  What's more, there's nobody logged in at the console to boot me up again
afterwards - are you _sure_ you want to shut the machine down?".  It's also crazy that
there's no prompt after an 'rm -rf' (we could have
'rm -rf --iacceptfullresponsibility' for an unprompted version).

Stuff like that would have saved me from a few embarrassments in the past for sure ;-)

It drives me absolutely nuts every time I see a $staggeringly_expensive clustered 
server whose sysadmins are scared to do a failover
test in case something goes wrong!  Or which has worse uptime than my desktop PC 
because the cluster software's poorly set up or
administered.  Or which has both machines on the same circuit breaker.  I could go on 
but it's depressing me.

Favourite anecdote:  A project manager friend of mine had a new 'lights out' 
datacenter to set up.  The engineers, admins and
operators swore blind that everything had been tested in every possible way, and that 
incredible uptime was guaranteed.  'So if I
just pull this disk out everything will keep working?' he asked, and then pulled the 
disk out without waiting for an answer...

Ever since he told me that story I've done exactly that with every piece of so-called 
'redundant' hardware a vendor tries to flog
me.  Ask them to set it up, then just do nasty things to it without asking for 
permission.  Less than half the gear makes it through
that filter, and actually you can almost tell from the look on the technical sales 
rep's face as you reach for the
drive/cable/card/whatever whether it will or won't.

M








Re: [PERFORM] Help with query plan inconsistencies

2004-03-24 Thread George Woodring
I currently have it set up to vacuum/analyze every 2 hours.  However, for
QUERY PLAN #1 & #2 in my example I ran my explain immediately after a
vacuum/analyze.



-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Joseph
Shraibman
Sent: Tuesday, March 23, 2004 2:17 PM
To: [EMAIL PROTECTED]
Subject: Re: [PERFORM] Help with query plan inconsistencies


I'm going to ask because someone else surely will:

Do you regularily vacuum/analyze the database?

Woody Woodring wrote:
 Hello,
 
 I am using postgres 7.4.2 as a backend for geocode data for a mapping 
 application.  My question is why can't I get a consistent use of my 
 indexes during a query, I tend to get a lot of seq scan results.
 
 I use a standard query:
 
 SELECT lat, long, mac, status FROM (
SELECT text(mac) as mac, lat, long, CASE status WHEN 0 THEN 0 WHEN 1
 THEN 1 ELSE -1 END
   as status FROM cable_billing LEFT OUTER JOIN davic USING(mac)
 WHERE boxtype='d' ) AS FOO WHERE (long >= X1) AND (long <= X2) AND
 (lat >= Y1) AND (lat <= Y2)
 
 Where X1,X2,Y1,Y2 are the coordinates for the rectangle of the map 
 viewing area.
 
 QUERY PLAN #1 & #2 are from when I get a view from 10 miles out,
 sometimes it uses the index(#1) and most of the time not(#2).  I do 
 run into plans that seq scan both sides of the join.
 
 QUERY PLAN #3 is when I view from 5 miles out, and I have much greater 
 chance of getting index scans ( about 90% of the time).
 
 I have listed information about the database below.
 
 Cable_billing ~500,000 rows updated once per day
 Davic  ~500,000 rows, about 100 rows update per minute
 
 Any info or suggestions would be appreciated.
 
 Woody
 
 
 twc-ral-overview=# \d cable_billing;
  Table public.cable_billing
  Column  |  Type  | Modifiers 
 -++---
  cable_billingid | integer| not null
  mac | macaddr| not null
  account | integer| 
  number  | character varying(10)  | 
  address | character varying(200) | 
  region  | character varying(30)  | 
  division| integer| 
  franchise   | integer| 
  node| character varying(10)  | 
  lat | numeric| 
  long| numeric| 
  trunk   | character varying(5)   | 
  ps  | character varying(5)   | 
  fd  | character varying(5)   | 
  le  | character varying(5)   | 
  update  | integer| 
  boxtype | character(1)   | 
 Indexes: cable_billing_pkey primary key btree (mac),
  cable_billing_account_index btree (account),
  cable_billing_lat_long_idx btree (lat, long),
  cable_billing_node_index btree (node),
  cable_billing_region_index btree (region)
 
 twc-ral-overview=# \d davic
 Table public.davic
  Column  | Type  | Modifiers 
 -+---+---
  davicid | integer   | not null
  mac | macaddr   | not null
  source  | character varying(20) | 
  status  | smallint  | 
  updtime | integer   | 
  type| character varying(10) | 
  avail1  | integer   | 
 Indexes: davic_pkey primary key btree (mac)
 
 
 
 twc-ral-overview=# vacuum analyze;
 VACUUM
 twc-ral-overview=# explain analyze SELECT lat, long, mac, status FROM 
 (SELECT text(mac) as mac, lat, long, CASE status WHEN 0 THEN 0 WHEN 1 
 THEN 1 ELSE -1 END as status FROM cable_billing LEFT OUTER JOIN davic 
 USING(mac) WHERE boxtype='d') AS foo  WHERE (long >= -78.70723462816063)
 AND
 (long <= -78.53096764204116) AND (lat >= 35.5741118787) AND
 (lat <= 35.66366331376857);
 QUERY PLAN #1
 
 -----------------------------------------------------------------------------
  Nested Loop Left Join  (cost=0.00..23433.18 rows=1871 width=34) (actual
 time=0.555..5095.434 rows=3224 loops=1)
-  Index Scan using cable_billing_lat_long_idx on cable_billing
 (cost=0.00..12145.85 rows=1871 width=32) (actual time=0.431..249.931
 rows=3224 loops=1)
  Index Cond: ((lat >= 35.5741118787) AND (lat <=
 35.66366331376857) AND (long >= -78.70723462816063) AND (long <=
 -78.53096764204116))
  Filter: (boxtype = 'd'::bpchar)
-  Index Scan using davic_pkey on davic  (cost=0.00..6.01 rows=1
 width=8) (actual time=1.476..1.480 rows=1 loops=3224)
  Index Cond: (outer.mac = davic.mac)
  Total runtime: 5100.028 ms
 (7 rows)
 
 
 
 twc-ral-overview=# vacuum analyze;
 VACUUM
 twc-ral-overview=# explain analyze SELECT lat, long, mac, status FROM 
 (SELECT text(mac) as mac, lat, long, CASE status WHEN 0 THEN 0 WHEN 1 
 THEN 1 ELSE -1 END as status FROM cable_billing LEFT OUTER JOIN davic 
 USING(mac) WHERE boxtype='d') AS foo  

Re: [PERFORM] slow vacuum performance

2004-03-24 Thread Bill Moran
pginfo wrote:
Hi,

I am running pg 7.4.1 on linux box.
I have a midle size DB with many updates and after it I try to run
vacuum full analyze.
It takes about 2 h.
If I try to dump and reload the DB it take 20 min.
How can I improve the vacuum full analyze time?
How often are you vacuuming?  If you've gone a LONG time since the last vacuum,
it can take quite a while, to the point where a dump/restore is faster.
A recent realization that I've had some misconceptions about vacuuming led me
to re-read section 8.2 of the admin guide (on vacuuming) ... I highly suggest
a review of those 3 pages of the admin manual, as they contain an excellent
description of why databases need to be vacuumed, which you can use to determine
how often vacuuming is necessary.
--
Bill Moran
Potential Technologies
http://www.potentialtech.com


Re: [PERFORM] slow vacuum performance

2004-03-24 Thread pginfo
Hi Bill,
I am vacuuming every 24 h.
I have a cron script for it.
But if I make a massive update (for example one that affects 1M rows) and then
start vacuum, it takes the 2 h.
I will also note that this massive update runs in one transaction (I cannot
update 100K rows at a time and vacuum after each batch).

regards,
ivan.

Bill Moran wrote:

 pginfo wrote:
  Hi,
 
  I am running pg 7.4.1 on linux box.
  I have a midle size DB with many updates and after it I try to run
  vacuum full analyze.
  It takes about 2 h.
  If I try to dump and reload the DB it take 20 min.
 
  How can I improve the vacuum full analyze time?

 How often are you vacuuming?  If you've gone a LONG time since the last vacuum,
 it can take quite a while, to the point where a dump/restore is faster.

 A recent realization that I've had some misconceptions about vacuuming led me
 to re-read section 8.2 of the admin guide (on vacuuming) ... I highly suggest
 a review of these 3 pages of the admin manual, as it contains an excellent
 description of why databases need vacuumed, that one can use to determine how
 often vacuuming is necessary.

 --
 Bill Moran
 Potential Technologies
 http://www.potentialtech.com






Re: [PERFORM] slow vacuum performance

2004-03-24 Thread scott.marlowe
On Wed, 24 Mar 2004, pginfo wrote:

 Hi,
 
 I am running pg 7.4.1 on linux box.
 I have a midle size DB with many updates and after it I try to run
 vacuum full analyze.

Is there a reason to not use just regular vacuum / analyze (i.e. NOT 
full)?  

 It takes about 2 h.

Full vacuums, by their nature, tend to be a bit slow.  It's better to let 
the database achieve a kind of steady state with regards to number of 
dead tuples, and use regular vacuums to reclaim said space rather than a 
full vacuum.
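
A minimal sketch of the regular (non-FULL) variant, run frequently instead of
one big full vacuum (the single-table name below is only an illustration):

    -- Reclaims dead tuples for reuse without VACUUM FULL's exclusive lock
    -- and physical table compaction; safe to run during normal operation.
    VACUUM ANALYZE;                    -- whole database
    VACUUM VERBOSE ANALYZE orders;     -- one (hypothetical) table, with stats output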

 How can I improve the vacuum full analyze time?
 
 My configuration:
 
 shared_buffers = 15000  # min 16, at least max_connections*2,
 8KB each
 sort_mem = 1# min 64, size in KB

You might want to look at dropping sort_mem.  It would appear you've been 
going through the postgresql.conf file and bumping up numbers to see what 
works and what doesn't.  While most of the settings aren't too dangerous 
to crank up a little high, sort_mem is quite dangerous to crank up high, 
should you have a lot of people connected who are all sorting.  Note that 
sort_mem is a limit PER SORT, not per backend, or per database, or per 
user, or even per table, but per sort.  IF a query needs to run three or 
four sorts, it can use 3 or 4x sort_mem.  If a hundred users do this at 
once, they can then use 300 or 400x sort_mem.  You can see where I'm 
heading.

Note that for individual sorts in batch files, like import processes, you
can bump up sort_mem with the SET command, so you don't have to keep a
large setting in postgresql.conf to use a lot of sort mem when you need
to - you can just grab it during that one session (see the sketch below).
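
For example, a minimal sketch of a per-session override (PostgreSQL 7.4
syntax - sort_mem is in KB, and the 65536 figure is only an illustration):

    -- Raise sort_mem for this session only (65536 KB = 64MB).
    -- This does not change postgresql.conf or affect other backends.
    SET sort_mem = 65536;

    -- ... run the big import / index build / heavily-sorting query here ...

    -- Return to the server default for the rest of the session.
    RESET sort_mem;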

 vacuum_mem = 32000  # min 1024, size in KB

If you've got lots of memory, crank up vacuum_mem to the 200 to 500 meg 
range and see what happens.
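
Along the same lines, a minimal sketch of trying a larger vacuum_mem for a
single maintenance session (again 7.4 syntax, value in KB, and the 250000
figure is only an illustration):

    -- ~250MB of vacuum working memory for this session only.
    SET vacuum_mem = 250000;
    VACUUM FULL ANALYZE;
    RESET vacuum_mem;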

For a good tuning guide, go here:

http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html




Re: [PERFORM] atrocious update performance

2004-03-24 Thread Rosser Schwarz
Greg Spiegelberg wrote:

  Will advise.

After creating 100, 1K, 10K, 100K and 1M-row subsets of account.cust and
the corresponding rows/tables with foreign key constraints referring to
the table, I'm unable to reproduce the behavior at issue.

explain analyze looks like the following, showing the query run with the
join column indexed and not, respectively:

# explain analyze update test.cust100 set prodid = tempprod.prodid,
subprodid = tempprod.subprodid where origid = tempprod.debtid;
-- with index
QUERY PLAN
---
 Merge Join  (cost=0.00..25.64 rows=500 width=220) (actual
 time=0.241..13.091 rows=100 loops=1)
   Merge Cond: ((outer.origid)::text = (inner.debtid)::text)
   -  Index Scan using ix_origid_cust100 on cust100  (cost=0.00..11.50
   rows=500 width=204) (actual time=0.125..6.465 rows=100 loops=1)
   -  Index Scan using ix_debtid on tempprod  (cost=0.00..66916.71
   rows=4731410 width=26) (actual time=0.057..1.497 rows=101 loops=1)
 Total runtime: 34.067 ms
(5 rows)

-- without index
QUERY PLAN
--
 Merge Join  (cost=7.32..16.71 rows=100 width=220) (actual
 time=4.415..10.918 rows=100 loops=1)
   Merge Cond: ((outer.debtid)::text = inner.?column22?)
   -  Index Scan using ix_debtid on tempprod  (cost=0.00..66916.71
   rows=4731410 width=26) (actual time=0.051..1.291 rows=101 loops=1)
   -  Sort  (cost=7.32..7.57 rows=100 width=204) (actual
   time=4.311..4.450 rows=100 loops=1)
 Sort Key: (cust100.origid)::text
 -  Seq Scan on cust100  (cost=0.00..4.00 rows=100 width=204)
 (actual time=0.235..2.615 rows=100 loops=1)
 Total runtime: 25.031 ms
(7 rows)

With the join column indexed, it takes roughly .32ms/row on the first
four tests (100..100K), and about .48ms/row on 1M rows.  Without the
index, it runs 100 rows @ .25ms/row, 1K @ .26, 10K @ .27, 100K @
.48 and .5 @ 1M rows.

In no case does the query plan reflect foreign key validation.  Failing
any other suggestions for diagnosis soon, I'm going to nuke the
PostgreSQL install, scour it from the machine and start from scratch.
Failing that, I'm going to come in some weekend and re-do the machine.

 Problem is when I recreate the indexes and add the constraints back
 on ORIG I end up with the same long running process.  The original
 UPDATE runs for about 30 minutes on a table of 400,000 with the
 WHERE matching about 70% of the rows.  The above runs for about 2
 minutes without adding the constraints or indexes however adding the
 constraints and creating the dropped indexes negates any gain.

Is this a frequently-run update?

In my experience, with my seemingly mutant install, dropping indices
and constraints to shave 14/15 off the update time would be worth the
effort.  Just script dropping, updating and recreating into one large
transaction.  It's a symptom-level fix, but re-creating the fifteen
indices on one of our 5M row tables doesn't take 28 minutes, and your
hardware looks to be rather less IO and CPU bound than ours.  I'd also
second Tom's suggestion of moving to 7.4.
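
A minimal sketch of that script, with hypothetical index and constraint names
standing in for the real ones (the actual definitions would come from pg_dump
or psql's \d output):

    BEGIN;
    -- Drop whatever slows the update down (names here are hypothetical).
    ALTER TABLE account.cust DROP CONSTRAINT cust_prodid_fk;
    DROP INDEX ix_cust_origid;

    -- The big update itself.
    UPDATE account.cust
       SET prodid = tempprod.prodid, subprodid = tempprod.subprodid
      FROM tempprod
     WHERE cust.origid = tempprod.debtid;

    -- Recreate what was dropped, all inside the same transaction.
    CREATE INDEX ix_cust_origid ON account.cust (origid);
    ALTER TABLE account.cust
      ADD CONSTRAINT cust_prodid_fk
      FOREIGN KEY (prodid) REFERENCES prod (prodid);
    COMMIT;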

/rls




Re: [PERFORM] Help with query plan inconsistencies

2004-03-24 Thread Richard Huxton
On Tuesday 23 March 2004 18:49, Woody Woodring wrote:
 Hello,

 I am using postgres 7.4.2 as a backend for geocode data for a mapping
 application.  My question is why can't I get a consistent use of my indexes
 during a query, I tend to get a lot of seq scan results.

I'm not sure it wants to be using the indexes all of the time.

  Nested Loop Left Join  (cost=0.00..23433.18 rows=1871 width=34) (actual
 time=0.555..5095.434 rows=3224 loops=1)
  Total runtime: 5100.028 ms

  Nested Loop Left Join  (cost=0.00..76468.90 rows=9223 width=34) (actual
 time=0.559..17387.427 rows=19997 loops=1)
  Total runtime: 17416.501 ms

  Nested Loop Left Join  (cost=0.00..29160.02 rows=2327 width=34) (actual
 time=0.279..510.773 rows=5935 loops=1)
  Total runtime: 516.782 ms

#1 = 630 rows/sec (with index on cable_billing)
#2 = 1,148 rows/sec (without index)
#3 = 11,501 rows/sec (with index)

The third case is so much faster, I suspect the data wasn't cached at the 
beginning of this run.

In any case #2 is faster than #1. If the planner is getting things wrong, 
you're not showing it here.

-- 
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux

2004-03-24 Thread Vivek Khera
 "SS" == Stalin Subbiah writes:

SS We are looking into Sun V210 (2 x 1 GHz cpu, 2 gig ram, 5.8Os)
SS vs. Dell 1750 (2 x 2.4 GHz xeon, 2 gig ram, RH3.0). database will
SS mostly be write intensive and disks will be on raid 10. Wondering
SS if 64bit 1 GHz to 32bit 2.4 GHz make a big difference here.

Spend all your money speeding up your disk system.  If you're mostly
writing (like my main app) then that's your bottleneck.  I use a dell
2650 with external RAID 5 on 14 spindles.  I didn't need that much
disk space, but went for  maxing out the number of spindles.  RAID 5
was faster than RAID10 or RAID50 with this configuration for me.


-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D.Khera Communications, Inc.
Internet: [EMAIL PROTECTED]   Rockville, MD  +1-301-869-4449 x806
AIM: vivekkhera Y!: vivek_khera   http://www.khera.org/~vivek/



Re: [PERFORM] slow vacuum performance

2004-03-24 Thread pginfo
Hi,

scott.marlowe wrote:

 On Wed, 24 Mar 2004, pginfo wrote:

  Hi,
 
  I am running pg 7.4.1 on linux box.
  I have a midle size DB with many updates and after it I try to run
  vacuum full analyze.

 Is there a reason to not use just regular vacuum / analyze (i.e. NOT
 full)?


Yes: when I make massive updates (only in my case, of course) - for example
2M rows - I do not expect to have 2M new rows in the next 180 days. That is the
reason for running vacuum full.
My idea was to free the unneeded space and so have a faster system.
It is possible that I am wrong.


  It takes about 2 h.

 Full vacuums, by their nature, tend to be a bit slow.  It's better to let
 the database achieve a kind of steady state with regards to number of
 dead tuples, and use regular vacuums to reclaim said space rather than a
 full vacuum.

  How can I improve the vacuum full analyze time?
 
  My configuration:
 
  shared_buffers = 15000  # min 16, at least max_connections*2,
  8KB each
  sort_mem = 1# min 64, size in KB

 You might want to look at dropping sort_mem.  It would appear you've been
 going through the postgresql.conf file and bumping up numbers to see what
 works and what doesn't.  While most of the settings aren't too dangerous
 to crank up a little high, sort_mem is quite dangerous to crank up high,
 should you have a lot of people connected who are all sorting.  Note that
 sort_mem is a limit PER SORT, not per backend, or per database, or per
 user, or even per table, but per sort.  IF a query needs to run three or
 four sorts, it can use 3 or 4x sort_mem.  If a hundred users do this at
 once, they can then use 300 or 400x sort_mem.  You can see where I'm
 heading.

 Note that for individual sorts in batch files, like import processes, you
 can bump up sort_mem with the set command, so you don't have to have a
 large setting in postgresql.conf to use a lot of sort mem when you need
 to, you can just grab it during that one session.


I know. In my case we are using many IDs declared as varchar/name (I know it
is a bad idea, but we are migrating this system from Oracle) and pg has very
bad performance with varchar/name indexes.
The only solution I found was to increase the sort mem.
But I will try decreasing it and see the result.

  vacuum_mem = 32000  # min 1024, size in KB

 If you've got lots of memory, crank up vacuum_mem to the 200 to 500 meg
 range and see what happens.


I will try it today. It is a good idea and I hope it will help.

 For a good tuning guide, go here:

 http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

  I know it. It is the best guide I have found, and the best site too.

Thanks for the help.
ivan.




Re: [PERFORM] slow vacuum performance

2004-03-24 Thread scott.marlowe
On Wed, 24 Mar 2004, pginfo wrote:

 Hi,
 
 scott.marlowe wrote:
 
  On Wed, 24 Mar 2004, pginfo wrote:
 
   Hi,
  
   I am running pg 7.4.1 on linux box.
   I have a midle size DB with many updates and after it I try to run
   vacuum full analyze.
 
  Is there a reason to not use just regular vacuum / analyze (i.e. NOT
  full)?
 
 
 Yes, in case I make massive updates (only in my case of cource)   for example
 2 M rows, I do not expect to have 2M new rows in next 180 days.That is the
 reaso for running vacuum full.
 My idea was to free unneedet space and so to have faster system.
 It is possible that I am wrong.

It's all about percentages.  If you've got an average of 5% dead tuples
with regular vacuuming, then full vacuums won't gain you much, if
anything.  If you've got 20 dead tuples for each live one, then a full
vacuum is pretty much a necessity.  The generally accepted best
performance comes with 5 to 50% or so dead tuples.  Keep in mind, having a
few dead tuples is actually a good thing, as your database won't grow then
shrink the file all the time, but keep it in a steady state size-wise.
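
One way to see where a table sits on that spectrum is VACUUM VERBOSE, whose
per-table output reports removable vs. nonremovable row versions (the table
name below is hypothetical):

    -- The per-table report of removable vs. nonremovable row versions
    -- gives a rough dead-vs-live ratio.
    VACUUM VERBOSE orders;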






Re: [PERFORM] slow vacuum performance

2004-03-24 Thread pginfo


scott.marlowe wrote:

 On Wed, 24 Mar 2004, pginfo wrote:

  Hi,
 
  scott.marlowe wrote:
 
   On Wed, 24 Mar 2004, pginfo wrote:
  
Hi,
   
I am running pg 7.4.1 on linux box.
I have a midle size DB with many updates and after it I try to run
vacuum full analyze.
  
   Is there a reason to not use just regular vacuum / analyze (i.e. NOT
   full)?
  
 
  Yes, in case I make massive updates (only in my case of cource)   for example
  2 M rows, I do not expect to have 2M new rows in next 180 days.That is the
  reaso for running vacuum full.
  My idea was to free unneedet space and so to have faster system.
  It is possible that I am wrong.

 It's all about percentages.  If you've got an average of 5% dead tuples
 with regular vacuuming, then full vacuums won't gain you much, if
 anything.  If you've got 20 dead tuples for each live one, then a full
 vacuum is pretty much a necessity.  The generally accepted best
 performance comes with 5 to 50% or so dead tuples.  Keep in mind, having a
 few dead tuples is actually a good thing, as your database won't grow then
 srhink the file all the time, but keep it in a steady state size wise.

Thanks for the good analysis,
ivan.


