[PERFORM] Regarding Timezone
Hello, we have installed Postgres 8.2.0. The default time zone the Postgres server is using is: template1=# SHOW timezone; TimeZone --- ETC/GMT-5 (1 row) But we want to set this timezone parameter to IST. Our system timezone is also IST. We are using Solaris. Please provide some help regarding this. Thanks, Soni
Re: [PERFORM] Regarding Timezone
soni de wrote: But we want to set this timezone parameter to IST. Our system timezone is also in IST. We are using solaris. This is the performance list, and this is not a performance-related question. Please use the pgsql-general or pgsql-novice list for this kind of question. PostgreSQL should pick up the correct timezone from the system configuration. I don't know why that's not happening in your case, but you can use the "timezone" parameter in postgresql.conf to set it manually. See the manual: http://www.postgresql.org/docs/8.2/interactive/runtime-config-client.html#GUC-TIMEZONE -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Regarding Timezone
On Tue, 19.06.2007 at 13:12:58 +0530, soni de wrote the following: > Hello, > > We have installed postgres 8.2.0 > > default time zone which postgres server using is > > template1=# SHOW timezone; > TimeZone > --- > ETC/GMT-5 > (1 row) > > > But we want to set this timezone parameter to IST. > Our system timezone is also in IST. We are using solaris. ALTER DATABASE foo SET TIMEZONE TO 'bla'; You can alter the template database. Andreas -- Andreas Kretschmer Contact: Heynitz: 035242/47150, D1: 0160/7141639 (more: -> header) GnuPG-ID: 0x3FFF606C, private 0x7F4584DA http://wwwkeys.de.pgp.net ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
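A minimal sketch of the options mentioned in this thread, assuming the zoneinfo name 'Asia/Calcutta' is the IST entry available on the system (the exact name can vary) and using mydb as a placeholder database name:

  -- per-session, to verify the zone name is accepted at all
  SET timezone TO 'Asia/Calcutta';
  SHOW timezone;

  -- per-database default, as suggested above
  ALTER DATABASE mydb SET timezone TO 'Asia/Calcutta';

  -- alternatively, set the timezone parameter in postgresql.conf
  -- and reload the server configuration (e.g. with pg_ctl reload)

New sessions pick up the ALTER DATABASE setting the next time they connect.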
[PERFORM] Hardware suggestions
Hi list members, I have a question regarding hardware for an SDI (spatial data infrastructure). It will consist of PostgreSQL with PostGIS and a UMN Mapserver/pmapper setup. At our institute we are currently establishing a small GIS working group. The central PostGIS system should be the data storage for vector data; raster data will be held in the file system. Mostly the users access the database in read-only mode. From the client side there is not much write access; that will only be done by the admin of the system to load new datasets. A prototype is currently running on an old desktop PC with Ubuntu Dapper - not very powerful, of course! We have about 1 € to spend for a new server including the storage. Do you have any recommendations for us? I have read a lot of introductions on tuning PostgreSQL systems. Since I don't have the possibility to tune the soft parameters like cache and memory sizes etc., I wondered about the hardware. Most of the advice was about the I/O of hard disks, RAM and the file system. Is the filesystem that relevant? We want to stay with Ubuntu because of the software support, especially for the GIS systems. I think we need at least about 300-500 GB of storage, and the servers you get for this price have about two dual-core 2.0 - 2.8 GHz Opterons. Do you have any suggestions for the hardware of a spatial database in that pricing category? Thanks in advance and greetings from Luxembourg, Christian
Re: [PERFORM] Performance query about large tables, lots of concurrent access
Tom Lane wrote: Karl Wright <[EMAIL PROTECTED]> writes: - At any given time, there are up to 100 of these operations going on at once against the same database. It sounds like your hardware is far past "maxed out". Which is odd since tables with a million or so rows are pretty small for modern hardware. What's the CPU and disk hardware here, exactly? What do you see when watching vmstat or iostat (as appropriate for OS, which you didn't mention either)? regards, tom lane Yes, I was surprised as well, which is why I decided to post. The hardware is a Dell 2950, two processor, dual-core each processor, 16 GB memory, with a RAID disk controller. The operating system is Debian Linux (sarge plus mods, currently using the Postgresql 8.1 backport). Also, as I said before, I have done extensive query analysis and found that the plans for the queries that are taking a long time are in fact very reasonable. Here's an example from the application log of a query that took way more time than its plan would seem to indicate it should: >> [2007-06-18 09:39:49,783]ERROR Found a query that took more than a minute: [UPDATE intrinsiclink SET isnew=? WHERE ((jobid=? AND childidhash=? AND childid=?)) AND (isnew=? OR isnew=?)] [2007-06-18 09:39:49,783]ERROR Parameter 0: 'B' [2007-06-18 09:39:49,783]ERROR Parameter 1: '1181766706097' [2007-06-18 09:39:49,783]ERROR Parameter 2: '7E130F3B688687757187F1638D8776ECEF3009E0' [2007-06-18 09:39:49,783]ERROR Parameter 3: 'http://norwich.openguides.org/?action=index;index_type=category;index_value=Cafe;format=atom' [2007-06-18 09:39:49,783]ERROR Parameter 4: 'E' [2007-06-18 09:39:49,783]ERROR Parameter 5: 'N' [2007-06-18 09:39:49,797]ERROR Plan: Index Scan using i1181764142395 on intrinsiclink (cost=0.00..14177.29 rows=5 width=253) [2007-06-18 09:39:49,797]ERROR Plan: Index Cond: ((jobid = $2) AND ((childidhash)::text = ($3)::text)) [2007-06-18 09:39:49,797]ERROR Plan: Filter: ((childid = ($4)::text) AND ((isnew = ($5)::bpchar) OR (isnew = ($6)::bpchar))) [2007-06-18 09:39:49,797]ERROR << (The intrinsiclink table above is the "child table" I was referring to earlier, with 13,000,000 rows at the moment.) Overnight I shut things down and ran a VACUUM operation to see if that might help. I'll post again when I find out if indeed that changed any performance numbers. If not, I'll be able to post vmstat output at that time. Karl ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Hardware suggestions
At our institute we are currently establishing a small GIS working group. The data storage for vector data should be the central PostGIS system. Raster data will be held in file system. Mostly the users are accessing the data base in read only mode. From the client side there is not much write access this only will be done by the admin of the system to load new datasets. A prototype is currently running on an old desktop pc with ubuntu dapper - not very powerfull, of course! We have about 1 € to spend for a new server including the storage. Do you have any recommendations for us? When it comes to server hardware I'd go for Intel's dual-core (Woodcrest) or quad-core. They seem to perform better at the moment compared to Opterons. -- regards Claus When lenity and cruelty play for a kingdom, the gentlest gamester is the soonest winner. Shakespeare ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] dbt2 NOTPM numbers
On Jun 13, 2007, at 11:43 AM, Markus Schiltknecht wrote: In the mean time, I've figured out that the box in question peaked at about 1450 NOTPMs with 120 warehouses with RAID 1+0. I'll try to compare again to RAID 6. Is there any place where such results are collected? There is the ill-used -benchmarks list, but perhaps it would be better if we setup a wiki for this... -- Jim Nasby[EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Performance query about large tables, lots of concurrent access
An overnight VACUUM helped things quite a bit. I am now getting throughput of around 75 transactions per minute, where before I was getting 30. Also, the CPU is no longer pegged, and the machines load average has dropped to an acceptable 6-10 from somewhere above 20. While this is still pretty far off the best performance I saw (when the tables were smaller), it's reasonably consistent with O(log(n)) performance at least. This particular run lasted four days before a VACUUM became essential. The symptom that indicates that VACUUM is needed seems to be that the CPU usage of any given postgresql query skyrockets. Is this essentially correct? Karl Karl Wright wrote: Tom Lane wrote: Karl Wright <[EMAIL PROTECTED]> writes: - At any given time, there are up to 100 of these operations going on at once against the same database. It sounds like your hardware is far past "maxed out". Which is odd since tables with a million or so rows are pretty small for modern hardware. What's the CPU and disk hardware here, exactly? What do you see when watching vmstat or iostat (as appropriate for OS, which you didn't mention either)? regards, tom lane Yes, I was surprised as well, which is why I decided to post. The hardware is a Dell 2950, two processor, dual-core each processor, 16 GB memory, with a RAID disk controller. The operating system is Debian Linux (sarge plus mods, currently using the Postgresql 8.1 backport). Also, as I said before, I have done extensive query analysis and found that the plans for the queries that are taking a long time are in fact very reasonable. Here's an example from the application log of a query that took way more time than its plan would seem to indicate it should: >> [2007-06-18 09:39:49,783]ERROR Found a query that took more than a minute: [UPDATE intrinsiclink SET isnew=? WHERE ((jobid=? AND childidhash=? AND childid=?)) AND (isnew=? OR isnew=?)] [2007-06-18 09:39:49,783]ERROR Parameter 0: 'B' [2007-06-18 09:39:49,783]ERROR Parameter 1: '1181766706097' [2007-06-18 09:39:49,783]ERROR Parameter 2: '7E130F3B688687757187F1638D8776ECEF3009E0' [2007-06-18 09:39:49,783]ERROR Parameter 3: 'http://norwich.openguides.org/?action=index;index_type=category;index_value=Cafe;format=atom' [2007-06-18 09:39:49,783]ERROR Parameter 4: 'E' [2007-06-18 09:39:49,783]ERROR Parameter 5: 'N' [2007-06-18 09:39:49,797]ERROR Plan: Index Scan using i1181764142395 on intrinsiclink (cost=0.00..14177.29 rows=5 width=253) [2007-06-18 09:39:49,797]ERROR Plan: Index Cond: ((jobid = $2) AND ((childidhash)::text = ($3)::text)) [2007-06-18 09:39:49,797]ERROR Plan: Filter: ((childid = ($4)::text) AND ((isnew = ($5)::bpchar) OR (isnew = ($6)::bpchar))) [2007-06-18 09:39:49,797]ERROR << (The intrinsiclink table above is the "child table" I was referring to earlier, with 13,000,000 rows at the moment.) Overnight I shut things down and ran a VACUUM operation to see if that might help. I'll post again when I find out if indeed that changed any performance numbers. If not, I'll be able to post vmstat output at that time. Karl ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Volunteer to build a configuration tool
On Mon, Jun 18, 2007 at 04:35:11PM -0700, Steve Atkins wrote: > > On Jun 18, 2007, at 4:09 PM, [EMAIL PROTECTED] wrote: > > The tricky bits are going to be defining the problem and creating the algorithm to do the maths from input to output. Why not methodically discuss the algorithms on pgsql-performance, thus improving the chance of being on target up front. Plus, us newbies get to see what you are thinking, thus expanding our universe. I know I'd read every word. Thanks for doing this, btw. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [pgsql-advocacy] [PERFORM] [ADMIN] Postgres VS Oracle
Can we please trim this down to just advocacy? On Jun 18, 2007, at 1:17 PM, Joshua D. Drake wrote: Jonah H. Harris wrote: On 6/18/07, Andreas Kostyrka <[EMAIL PROTECTED]> wrote: As a cynic, I might ask, what Oracle is fearing? As a realist, I might ask, how many times do we have to answer this type of anti-commercial-database flamewar-starting question? Depends? How many times are you going to antagonize the people that ask? 1. It has *nothing* to do with anti-commercial. It is anti-proprietary which is perfectly legitimate. 2. Oracle, Microsoft, and IBM have a "lot" to fear in the sense of a database like PostgreSQL. We can compete in 90-95% of cases where people would traditionally purchase a proprietary system for many, many thousands (if not hundreds of thousands) of dollars. Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly -- Jim Nasby [EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Performance query about large tables, lots of concurrent access
Karl Wright wrote: > This particular run lasted four days before a VACUUM became essential. > The symptom that indicates that VACUUM is needed seems to be that the > CPU usage of any given postgresql query skyrockets. Is this essentially > correct? Are you saying you weren't used to run VACUUM all the time? If so, that's where the problem lies. -- Alvaro Herrera http://www.amazon.com/gp/registry/CTMLCN8V17R4 "Cómo ponemos nuestros dedos en la arcilla del otro. Eso es la amistad; jugar al alfarero y ver qué formas se pueden sacar del otro" (C. Halloway en La Feria de las Tinieblas, R. Bradbury) ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Performance query about large tables, lots of concurrent access
Alvaro Herrera wrote: Karl Wright wrote: This particular run lasted four days before a VACUUM became essential. The symptom that indicates that VACUUM is needed seems to be that the CPU usage of any given postgresql query skyrockets. Is this essentially correct? Are you saying you weren't used to run VACUUM all the time? If so, that's where the problem lies. Postgresql 7.4 VACUUM runs for so long that starting it with a cron job even every 24 hours caused multiple instances of VACUUM to eventually be running in my case. So I tried to find a VACUUM schedule that permitted each individual vacuum to finish before the next one started. A vacuum seemed to require 4-5 days with this particular database - or at least it did for 7.4. So I had the VACUUM schedule set to run every six days. I will be experimenting with 8.1 to see how long it takes to complete a vacuum under load conditions tonight. Karl ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Performance query about large tables, lots of concurrent access
"Karl Wright" <[EMAIL PROTECTED]> writes: > This particular run lasted four days before a VACUUM became essential. The > symptom that indicates that VACUUM is needed seems to be that the CPU usage of > any given postgresql query skyrockets. Is this essentially correct? Postgres is designed on the assumption that VACUUM is run regularly. By "regularly" we're talking of an interval usually on the order of hours, or even less. On some workloads some tables need to be vacuumed every 5 minutes, for example. VACUUM doesn't require shutting down the system, it doesn't lock any tables or otherwise prevent other jobs from making progress. It does add extra i/o but there are knobs to throttle its i/o needs. The intention is that VACUUM run in the background more or less continually using spare i/o bandwidth. The symptom of not having run vacuum regularly is that tables and indexes bloat to larger sizes than necessary. If you run "VACUUM VERBOSE" it'll tell you how much bloat your tables and indexes are suffering from (though the output is a bit hard to interpret). Table and index bloat slow things down but not generally by increasing cpu usage. Usually they slow things down by causing queries to require more i/o. It's only UPDATES and DELETES that create garbage tuples that need to be vacuumed though. If some of your tables are mostly insert-only they might need to be vacuumed as frequently or at all. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] Performance query about large tables, lots of concurrent access
Karl Wright wrote: > Alvaro Herrera wrote: > >Karl Wright wrote: > > > >>This particular run lasted four days before a VACUUM became essential. > >>The symptom that indicates that VACUUM is needed seems to be that the > >>CPU usage of any given postgresql query skyrockets. Is this essentially > >>correct? > > > >Are you saying you weren't used to run VACUUM all the time? If so, > >that's where the problem lies. > > Postgresql 7.4 VACUUM runs for so long that starting it with a cron job > even every 24 hours caused multiple instances of VACUUM to eventually be > running in my case. So I tried to find a VACUUM schedule that permitted > each individual vacuum to finish before the next one started. A vacuum > seemed to require 4-5 days with this particular database - or at least > it did for 7.4. So I had the VACUUM schedule set to run every six days. How large is the database? I must admit I have never seen a database that took 4 days to vacuum. This could mean that your database is humongous, or that the vacuum strategy is wrong for some reason. You know that you can run vacuum on particular tables, right? It would be probably a good idea to run vacuum on the most updated tables, and leave alone those that are not or little updated (hopefully the biggest; this would mean that an almost-complete vacuum run would take much less than a whole day). Or maybe vacuum was stuck waiting on a lock somewhere. > I will be experimenting with 8.1 to see how long it takes to complete a > vacuum under load conditions tonight. You can also turn autovacuum on in 8.1, which might help quite a bit with finding a good vacuum schedule (you would need a bit of tuning it though, of course). In any case, if you are struggling for performance you are strongly adviced to upgrade to 8.2. -- Alvaro Herrera http://www.amazon.com/gp/registry/DXLWNGRJD34J "No single strategy is always right (Unless the boss says so)" (Larry Wall) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
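A sketch of the per-table approach described above, using one table name from this thread and one hypothetical one; the busy tables get their own frequent pass while a full-database vacuum runs much less often:

  -- run frequently (e.g. hourly from cron) against the heavily updated tables
  VACUUM ANALYZE intrinsiclink;
  VACUUM ANALYZE hopcount;          -- hypothetical name for another busy table

  -- run much less often for everything else
  VACUUM ANALYZE;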
Re: [PERFORM] Performance query about large tables, lots of concurrent access
On Tue, 2007-06-19 at 09:37 -0400, Karl Wright wrote: > Alvaro Herrera wrote: > > Karl Wright wrote: > > > >> This particular run lasted four days before a VACUUM became essential. > >> The symptom that indicates that VACUUM is needed seems to be that the > >> CPU usage of any given postgresql query skyrockets. Is this essentially > >> correct? > > > > Are you saying you weren't used to run VACUUM all the time? If so, > > that's where the problem lies. > > > > Postgresql 7.4 VACUUM runs for so long that starting it with a cron job > even every 24 hours caused multiple instances of VACUUM to eventually be > running in my case. So I tried to find a VACUUM schedule that permitted > each individual vacuum to finish before the next one started. A vacuum > seemed to require 4-5 days with this particular database - or at least > it did for 7.4. So I had the VACUUM schedule set to run every six days. > > I will be experimenting with 8.1 to see how long it takes to complete a > vacuum under load conditions tonight. The longer you wait between vacuuming, the longer each vacuum is going to take. There is of course a point of diminishing returns for vacuum where this no longer holds true; if you vacuum too frequently the overhead of running the vacuum will dominate the running time. But 6 days for a busy database is probably way, way, way past that threshold. Generally, the busier the database the more frequently you need to vacuum, not less. If your update/delete transaction rate is high enough then you may need to vacuum multiple times per hour, at least on some tables. Playing with autovacuum might help you out here, because it can look at how badly a vacuum is needed and adjust the vacuuming rate on the fly on a per-table basis. Be sure to look up some reasonable autovacuum settings first; the 8.1 defaults aren't. -- Mark ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
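For reference, a sketch of where the relevant 8.1 autovacuum settings can be inspected; the example values in the comments are common starting points for a busy database, not recommendations from this thread (the parameters themselves are changed in postgresql.conf):

  SHOW autovacuum;                       -- must be 'on' (off by default in 8.1)
  SHOW autovacuum_naptime;               -- e.g. 60 seconds between checks
  SHOW autovacuum_vacuum_threshold;      -- e.g. 500 row changes before a vacuum
  SHOW autovacuum_vacuum_scale_factor;   -- e.g. 0.2 (20% of the table)
  SHOW autovacuum_vacuum_cost_delay;     -- e.g. 20 ms, to throttle autovacuum I/O

Note that autovacuum in 8.1 also requires stats_start_collector and stats_row_level to be on.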
[PERFORM] Maintenance question / DB size anomaly...
Gang, Hoping you all can help me with a rather bizarre issue that I've run across. I don't really need a solution, I think I have one, but I'd really like to run it by everyone in case I'm headed in the wrong direction. I'm running a small Slony (v1.1.5)/postgresql 8.0.4 cluster (on RedHat) that contains one master database, and two slaves. The db1 (the master) has been up for about 1.5 years, db2 (slave 1) for about 9 months, and db3 (second slave) for about two months. I do a VACUUM ANALYZE every morning on all three databases. However, the vacuum on db1 takes approximately 4.5 hours, and on the slaves it takes about 1/2 hour. As far as I can tell, my FSM settings are correct. This is concerning because the vacuum on db1 is starting to run into production hours. The master receives all inserts, updates and deletes (as well as a fair number of selects). The slaves are select-only. In my investigation of this anomaly, I noticed that the data/ dir on db1 (the master) is around 60 Gigs. The data directory on the slaves is around 25Gb. After about 3 months of head scratching, someone on the irc channel suggested that it may be due to index bloat. Although, doing some research, it would seem that those problems were resolved in 7.4(ish), and it wouldn't account for one database being 2.5x bigger. Another unknown is Slony overhead (both in size and vacuum times). The ONLY thing I can think of is that I DROPped a large number of tables from db1 a few months ago (they weren't getting replicated). This is on the order of 1700+ fairly largeish (50,000+ row) tables. I do not remember doing a vacuum full after dropping them, so perhaps that's my problem. I'm planning on doing some maintenance this weekend, during which I will take the whole system down, then on db1, run a VACUUM FULL ANALYZE on the whole database, then a REINDEX on my very large tables. I may drop and recreate the indexes on my big tables, as I hear that may be faster than a REINDEX. I will probably run a VACUUM FULL ANALYZE on the slaves as well. Thoughts? Suggestions? Anyone think this will actually help my problem of size and vacuum times? Do I need to take Slony down while I do this? Will the VACUUM FULL table locking interfere with Slony? Thanks for any light you all can shed on these issues... /kurt ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Performance query about large tables, lots of concurrent access
Gregory Stark wrote: "Karl Wright" <[EMAIL PROTECTED]> writes: This particular run lasted four days before a VACUUM became essential. The symptom that indicates that VACUUM is needed seems to be that the CPU usage of any given postgresql query skyrockets. Is this essentially correct? Postgres is designed on the assumption that VACUUM is run regularly. By "regularly" we're talking of an interval usually on the order of hours, or even less. On some workloads some tables need to be vacuumed every 5 minutes, for example. Fine - but what if the previous vacuum is still in progress, and does not finish in 5 minutes? VACUUM doesn't require shutting down the system, it doesn't lock any tables or otherwise prevent other jobs from making progress. It does add extra i/o but there are knobs to throttle its i/o needs. The intention is that VACUUM run in the background more or less continually using spare i/o bandwidth. This spare bandwidth is apparently hard to come by in my particular application. That's the only way I can reconcile your information with it taking 4 days to complete. The symptom of not having run vacuum regularly is that tables and indexes bloat to larger sizes than necessary. If you run "VACUUM VERBOSE" it'll tell you how much bloat your tables and indexes are suffering from (though the output is a bit hard to interpret). Table and index bloat slow things down but not generally by increasing cpu usage. Usually they slow things down by causing queries to require more i/o. Yes, that's what I understood, which is why I was puzzled by the effects I was seeing. It's only UPDATES and DELETES that create garbage tuples that need to be vacuumed though. If some of your tables are mostly insert-only they might not need to be vacuumed as frequently, or at all. Well, the smaller tables don't change much, but the bigger tables have a lively mix of inserts and updates, so I would expect these would need vacuuming often. I'll post again when I can find a vacuum schedule that seems to work. Karl ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Performance query about large tables, lots of concurrent access
Alvaro Herrera wrote: Karl Wright wrote: Alvaro Herrera wrote: Karl Wright wrote: This particular run lasted four days before a VACUUM became essential. The symptom that indicates that VACUUM is needed seems to be that the CPU usage of any given postgresql query skyrockets. Is this essentially correct? Are you saying you weren't used to run VACUUM all the time? If so, that's where the problem lies. Postgresql 7.4 VACUUM runs for so long that starting it with a cron job even every 24 hours caused multiple instances of VACUUM to eventually be running in my case. So I tried to find a VACUUM schedule that permitted each individual vacuum to finish before the next one started. A vacuum seemed to require 4-5 days with this particular database - or at least it did for 7.4. So I had the VACUUM schedule set to run every six days. How large is the database? I must admit I have never seen a database that took 4 days to vacuum. This could mean that your database is humongous, or that the vacuum strategy is wrong for some reason. The database is humongus, and the machine is under intense load. On the instance where this long vacuum occurred, there were several large tables - one with 7,000,000 rows, one with 14,000,000, one with 140,000,000, and one with 250,000,000. You know that you can run vacuum on particular tables, right? It would be probably a good idea to run vacuum on the most updated tables, and leave alone those that are not or little updated (hopefully the biggest; this would mean that an almost-complete vacuum run would take much less than a whole day). Yeah, sorry, that doesn't apply here. Or maybe vacuum was stuck waiting on a lock somewhere. I will be experimenting with 8.1 to see how long it takes to complete a vacuum under load conditions tonight. You can also turn autovacuum on in 8.1, which might help quite a bit with finding a good vacuum schedule (you would need a bit of tuning it though, of course). In any case, if you are struggling for performance you are strongly adviced to upgrade to 8.2. Ok - that's something I should be able to do once we can go to debian's etch release. There's a backport of 8.2 available there. (The one for sarge is still considered 'experimental'). Karl ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Maintenance question / DB size anomaly...
Kurt Overberg <[EMAIL PROTECTED]> writes: > In my investigation of this anomaly, I noticed that the data/ dir on > db1 (the master) is around 60 Gigs. The data directory on the slaves > is around 25Gb. After about 3 months of head scratching, someone on > the irc channel suggested that it may be due to index bloat. This is not something you need to guess about. Compare the table and index sizes, one by one, between the master and slaves. Do a VACUUM VERBOSE on the one(s) that are radically bigger on the master, and look at what it has to say. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Maintenance question / DB size anomaly...
Kurt Overberg wrote: In my investigation of this anomaly, I noticed that the data/ dir on db1 (the master) is around 60 Gigs. The data directory on the slaves is around 25Gb. After about 3 months of head scratching, someone on the irc channel suggested that it may be due to index bloat. Although, doing some research, it would seem that those problems were resolved in 7.4(ish), and it wouldn't account for one database being 2.5x bigger. Another unknown is Slony overhead (both in size and vacuum times). Check the oid2name/dbsize utilities in the contrib RPM for 8.0.x http://www.postgresql.org/docs/8.0/static/diskusage.html Shouldn't be too hard to find out where the disk space is going. Oh and 8.0.13 is the latest release of 8.0 series, so you'll want to use your maintenance window to upgrade too. Lots of good bugfixes there. The ONLY thing I can think of is that I DROPped a large number of tables from db1 a few months ago (they weren't getting replicated). This is on the order of 1700+ fairly largeish (50,000+ row) tables. I do not remember doing a vacuum full after dropping them, so perhaps that's my problem. I'm planning on doing some maintenance this weekend, during which I will take the whole system down, then on db1, run a VACUUM FULL ANALYZE on the whole database, then a REINDEX on my very large tables. I may drop and recreate the indexes on my big tables, as I hear that may be faster than a REINDEX. I will probably run a VACUUM FULL ANALYZE on the slaves as well. You'll probably find CLUSTER to be quicker than VACUUM FULL, although you need enough disk-space free for temporary copies of the table/indexes concerned. Dropping and recreating indexes should prove much faster than VACUUMING with them. Shouldn't matter for CLUSTER afaict. Thoughts? Suggestions? Anyone think this will actually help my problem of size and vacuum times? Do I need to take Slony down while I do this? Will the VACUUM FULL table locking interfere with Slony? Well, I'd take the opportunity to uninstall/reinstall slony just to check my scripts/procedures are working. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
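For readers planning similar maintenance, a sketch of the CLUSTER-based route on a hypothetical bloated table (bigtable and bigtable_pkey are placeholders); note the pre-8.3 syntax is CLUSTER indexname ON tablename, and CLUSTER needs enough free disk space for a complete copy of the table and its indexes:

  -- rewrite the table in index order, discarding dead space,
  -- and rebuild all of its indexes as a side effect
  CLUSTER bigtable_pkey ON bigtable;
  ANALYZE bigtable;                 -- CLUSTER does not refresh statistics

  -- the VACUUM FULL alternative discussed above; indexes usually
  -- need a REINDEX afterwards because VACUUM FULL tends to bloat them
  VACUUM FULL ANALYZE bigtable;
  REINDEX TABLE bigtable;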
Re: [PERFORM] Performance query about large tables, lots of concurrent access
In response to Karl Wright <[EMAIL PROTECTED]>: > Alvaro Herrera wrote: > > Karl Wright wrote: > >> Alvaro Herrera wrote: > >>> Karl Wright wrote: > >>> > This particular run lasted four days before a VACUUM became essential. > The symptom that indicates that VACUUM is needed seems to be that the > CPU usage of any given postgresql query skyrockets. Is this essentially > correct? > >>> Are you saying you weren't used to run VACUUM all the time? If so, > >>> that's where the problem lies. > >> Postgresql 7.4 VACUUM runs for so long that starting it with a cron job > >> even every 24 hours caused multiple instances of VACUUM to eventually be > >> running in my case. So I tried to find a VACUUM schedule that permitted > >> each individual vacuum to finish before the next one started. A vacuum > >> seemed to require 4-5 days with this particular database - or at least > >> it did for 7.4. So I had the VACUUM schedule set to run every six days. > > > > How large is the database? I must admit I have never seen a database > > that took 4 days to vacuum. This could mean that your database is > > humongous, or that the vacuum strategy is wrong for some reason. > > The database is humongus, and the machine is under intense load. On the > instance where this long vacuum occurred, there were several large > tables - one with 7,000,000 rows, one with 14,000,000, one with > 140,000,000, and one with 250,000,000. Don't rule out the possibility that the only way to fix this _might_ be to throw more hardware at it. Proper configuration can buy you a lot, but if your usage is exceeding the available bandwidth of the IO subsystem, the only way you're going to get better performance is to put in a faster IO subsystem. > > You know that you can run vacuum on particular tables, right? It would > > be probably a good idea to run vacuum on the most updated tables, and > > leave alone those that are not or little updated (hopefully the biggest; > > this would mean that an almost-complete vacuum run would take much less > > than a whole day). > > Yeah, sorry, that doesn't apply here. Why not? I see no reason why an appropriate autovaccum schedule would not apply to your scenario. I'm not saying it does, only that your response does not indicate that it doesn't, and thus I'm concerned that you're writing autovacuum off without proper research. -- Bill Moran Collaborative Fusion Inc. http://people.collaborativefusion.com/~wmoran/ [EMAIL PROTECTED] Phone: 412-422-3463x4023 ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Performance query about large tables, lots of concurrent access
Bill Moran wrote: In response to Karl Wright <[EMAIL PROTECTED]>: Alvaro Herrera wrote: Karl Wright wrote: Alvaro Herrera wrote: Karl Wright wrote: This particular run lasted four days before a VACUUM became essential. The symptom that indicates that VACUUM is needed seems to be that the CPU usage of any given postgresql query skyrockets. Is this essentially correct? Are you saying you weren't used to run VACUUM all the time? If so, that's where the problem lies. Postgresql 7.4 VACUUM runs for so long that starting it with a cron job even every 24 hours caused multiple instances of VACUUM to eventually be running in my case. So I tried to find a VACUUM schedule that permitted each individual vacuum to finish before the next one started. A vacuum seemed to require 4-5 days with this particular database - or at least it did for 7.4. So I had the VACUUM schedule set to run every six days. How large is the database? I must admit I have never seen a database that took 4 days to vacuum. This could mean that your database is humongous, or that the vacuum strategy is wrong for some reason. The database is humongus, and the machine is under intense load. On the instance where this long vacuum occurred, there were several large tables - one with 7,000,000 rows, one with 14,000,000, one with 140,000,000, and one with 250,000,000. Don't rule out the possibility that the only way to fix this _might_ be to throw more hardware at it. Proper configuration can buy you a lot, but if your usage is exceeding the available bandwidth of the IO subsystem, the only way you're going to get better performance is to put in a faster IO subsystem. You know that you can run vacuum on particular tables, right? It would be probably a good idea to run vacuum on the most updated tables, and leave alone those that are not or little updated (hopefully the biggest; this would mean that an almost-complete vacuum run would take much less than a whole day). Yeah, sorry, that doesn't apply here. Why not? I see no reason why an appropriate autovaccum schedule would not apply to your scenario. I'm not saying it does, only that your response does not indicate that it doesn't, and thus I'm concerned that you're writing autovacuum off without proper research. I'm not writing off autovacuum - just the concept that the large tables aren't the ones that are changing. Unfortunately, they *are* the most dynamically updated. Karl ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Performance query about large tables, lots of concurrent access
"Karl Wright" <[EMAIL PROTECTED]> writes: > Fine - but what if the previous vacuum is still in progress, and does not > finish in 5 minutes? Yes, well, there are problems with this design but the situation is already much improved in 8.2 and there are more improvements on the horizon. But it's likely that much of your pain is artificial here and once your database is cleaned up a bit more it will be easier to manage. > Well, the smaller tables don't change much, but the bigger tables have a > lively > mix of inserts and updates, so I would expect these would need vacuuming > often. Hm, I wonder if you're running into a performance bug that was fixed sometime back around then. It involved having large numbers of tuples indexed with the same key value. Every search for a single record required linearly searching through the entire list of values. If you have thousands of updates against the same tuple between vacuums you'll have the same kind of situation and queries against that key will indeed require lots of cpu. To help any more you'll have to answer the basic questions like how many rows are in the tables that take so long to vacuum, and how large are they on disk. On 7.4 I think the best way to get the table size actually is by doing "select relfilenode from pg_class where relname = 'tablename'" and then looking in the postgres directory for the files in base/*/* The best information would be to do vacuum verbose and report the data it prints out. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Performance query about large tables, lots of concurrent access
A useful utility that I've found is PgFouine. It has an option to analyze VACUUM VERBOSE logs. It has been instrumental in helping me figure out whats been going on with my VACUUM that is taking 4+ hours, specifically tracking the tables that are taking the longest. I highly recommend checking it out. It would also perhaps be a good idea rather than simply starting a vacuum every 6 days, set it so that it starts again as soon as it finishes (using a lock file or something that is polled for every few hours or minutes). This way, a vacuum will kick off right when the other one finishes, hopefully slowly decreasing in time over time. Hope this helps... /kurt On Jun 19, 2007, at 10:06 AM, Karl Wright wrote: Alvaro Herrera wrote: Karl Wright wrote: Alvaro Herrera wrote: Karl Wright wrote: This particular run lasted four days before a VACUUM became essential. The symptom that indicates that VACUUM is needed seems to be that the CPU usage of any given postgresql query skyrockets. Is this essentially correct? Are you saying you weren't used to run VACUUM all the time? If so, that's where the problem lies. Postgresql 7.4 VACUUM runs for so long that starting it with a cron job even every 24 hours caused multiple instances of VACUUM to eventually be running in my case. So I tried to find a VACUUM schedule that permitted each individual vacuum to finish before the next one started. A vacuum seemed to require 4-5 days with this particular database - or at least it did for 7.4. So I had the VACUUM schedule set to run every six days. How large is the database? I must admit I have never seen a database that took 4 days to vacuum. This could mean that your database is humongous, or that the vacuum strategy is wrong for some reason. The database is humongus, and the machine is under intense load. On the instance where this long vacuum occurred, there were several large tables - one with 7,000,000 rows, one with 14,000,000, one with 140,000,000, and one with 250,000,000. You know that you can run vacuum on particular tables, right? It would be probably a good idea to run vacuum on the most updated tables, and leave alone those that are not or little updated (hopefully the biggest; this would mean that an almost-complete vacuum run would take much less than a whole day). Yeah, sorry, that doesn't apply here. Or maybe vacuum was stuck waiting on a lock somewhere. I will be experimenting with 8.1 to see how long it takes to complete a vacuum under load conditions tonight. You can also turn autovacuum on in 8.1, which might help quite a bit with finding a good vacuum schedule (you would need a bit of tuning it though, of course). In any case, if you are struggling for performance you are strongly adviced to upgrade to 8.2. Ok - that's something I should be able to do once we can go to debian's etch release. There's a backport of 8.2 available there. (The one for sarge is still considered 'experimental'). Karl ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Performance query about large tables, lots of concurrent access
Karl Wright <[EMAIL PROTECTED]> writes: > Also, as I said before, I have done extensive query analysis and found > that the plans for the queries that are taking a long time are in fact > very reasonable. Here's an example from the application log of a query > that took way more time than its plan would seem to indicate it should: > [2007-06-18 09:39:49,797]ERROR Plan: Index Scan using i1181764142395 on > intrinsiclink (cost=0.00..14177.29 rows=5 width=253) > [2007-06-18 09:39:49,797]ERROR Plan: Index Cond: ((jobid = $2) AND > ((childidhash)::text = ($3)::text)) > [2007-06-18 09:39:49,797]ERROR Plan: Filter: ((childid = ($4)::text) > AND ((isnew = ($5)::bpchar) OR (isnew = ($6)::bpchar))) I see the discussion thread has moved on to consider lack-of-vacuuming as the main problem, but I didn't want to let this pass without comment. The above plan is not necessarily good at all --- it depends on how many rows are selected by the index condition alone (ie, jobid and childidhash) versus how many are selected by the index and filter conditions. If the index retrieves many rows, most of which are eliminated by the filter condition, it's still gonna take a long time. In this case it looks like the planner is afraid that that's exactly what will happen --- a cost of 14177 suggests that several thousand row fetches are expected to happen, and yet it's only predicting 5 rows out after the filter. It's using this plan anyway because it has no better alternative, but you should think about whether a different index definition would help. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
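As a sketch of the kind of index change being hinted at here (assuming the common lookup really is on jobid, childidhash and childid together, and that the childid URLs are short enough to fit in a btree entry):

  -- lets the childid comparison be part of the index condition
  -- instead of a filter applied after thousands of heap fetches
  CREATE INDEX i_intrinsiclink_job_hash_child
      ON intrinsiclink (jobid, childidhash, childid);

Whether this actually helps depends on how many rows share the same (jobid, childidhash) pair, which is exactly the question raised above.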
Re: [PERFORM] Performance query about large tables, lots of concurrent access
Tom Lane wrote: Karl Wright <[EMAIL PROTECTED]> writes: Also, as I said before, I have done extensive query analysis and found that the plans for the queries that are taking a long time are in fact very reasonable. Here's an example from the application log of a query that took way more time than its plan would seem to indicate it should: [2007-06-18 09:39:49,797]ERROR Plan: Index Scan using i1181764142395 on intrinsiclink (cost=0.00..14177.29 rows=5 width=253) [2007-06-18 09:39:49,797]ERROR Plan: Index Cond: ((jobid = $2) AND ((childidhash)::text = ($3)::text)) [2007-06-18 09:39:49,797]ERROR Plan: Filter: ((childid = ($4)::text) AND ((isnew = ($5)::bpchar) OR (isnew = ($6)::bpchar))) I see the discussion thread has moved on to consider lack-of-vacuuming as the main problem, but I didn't want to let this pass without comment. The above plan is not necessarily good at all --- it depends on how many rows are selected by the index condition alone (ie, jobid and childidhash) versus how many are selected by the index and filter conditions. If the index retrieves many rows, most of which are eliminated by the filter condition, it's still gonna take a long time. In this case it looks like the planner is afraid that that's exactly what will happen --- a cost of 14177 suggests that several thousand row fetches are expected to happen, and yet it's only predicting 5 rows out after the filter. It's using this plan anyway because it has no better alternative, but you should think about whether a different index definition would help. regards, tom lane Well, that's odd, because the hash in question that it is using is the SHA-1 hash of a URL. There's essentially one row per URL in this table. Even with a large table I would not expect more than a couple of collisions at most. How does it arrive at that estimate of 14,000? Karl ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Performance query about large tables, lots of concurrent access
Karl Wright <[EMAIL PROTECTED]> writes: > [2007-06-18 09:39:49,797]ERROR Plan: Index Scan using i1181764142395 on > intrinsiclink (cost=0.00..14177.29 rows=5 width=253) > [2007-06-18 09:39:49,797]ERROR Plan: Index Cond: ((jobid = $2) AND > ((childidhash)::text = ($3)::text)) > [2007-06-18 09:39:49,797]ERROR Plan: Filter: ((childid = ($4)::text) > AND ((isnew = ($5)::bpchar) OR (isnew = ($6)::bpchar))) >> In this case it looks like the planner is afraid that that's exactly >> what will happen --- a cost of 14177 suggests that several thousand row >> fetches are expected to happen, and yet it's only predicting 5 rows out >> after the filter. > Well, that's odd, because the hash in question that it is using is the > SHA-1 hash of a URL. There's essentially one row per URL in this table. What about isnew? Also, how many rows do *you* expect out of the query? The planner is not going to be aware of the hashed relationship between childidhash and childid --- it'll think those are independent conditions which they evidently aren't. So it may be that the query really does retrieve thousands of rows, and the rows=5 estimate is bogus because it's double-counting the selectivity of the childid condition. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] Maintenance question / DB size anomaly...
[EMAIL PROTECTED] (Kurt Overberg) writes: > In my investigation of this anomaly, I noticed that the data/ dir on > db1 (the master) is around 60 Gigs. The data directory on the slaves > is around 25Gb. After about 3 months of head scratching, someone on > the irc channel suggested that it may be due to index bloat. > Although, doing some research, it would seem that those problems were > resolved in 7.4(ish), and it wouldn't account for one database being > 2.5x bigger. Another unknown is Slony overhead (both in size and > vacuum times). There are three tables in Slony-I that would be of interest; on the master, do a VACUUM VERBOSE on: - [clustername].sl_log_1 - [clustername].sl_log_2 - [clustername].sl_seqlog If one or another is really bloated, that could be the cause of *some* problems. Though that shouldn't account for 35GB of space :-). > The ONLY thing I can think of is that I DROPped a large number of > tables from db1 a few months ago (they weren't getting replicated). > This is on the order of 1700+ fairly largeish (50,000+ row) tables. > I do not remember doing a vacuum full after dropping them, so perhaps > that's my problem. I'm planning on doing some maintenance this > weekend, during which I will take the whole system down, then on db1, > run a VACUUM FULL ANALYZE on the whole database, then a REINDEX on my > very large tables. I may drop and recreate the indexes on my big > tables, as I hear that may be faster than a REINDEX. I will probably > run a VACUUM FULL ANALYZE on the slaves as well. When tables are dropped, so are the data files. So even if they were bloated, they should have simply disappeared. So I don't think that's the problem. > Thoughts? Suggestions? Anyone think this will actually help my > problem of size and vacuum times? Do I need to take Slony down while > I do this? Will the VACUUM FULL table locking interfere with Slony? I'd be inclined to head to the filesystem level, and try to see what tables are bloated *there*. You should be able to search for bloated tables via the command: $ find $PGDATA/base -type f | egrep '/[0-9]+\.[0-9]+$' That would be likely to give you a listing of filenames that look something like: 12341.1 12341.2 12341.3 12341.4 12341.5 12341.6 231441.1 231441.2 231441.3 which indicates all table (or index) data files that had to be extended past 1GB. In the above, the relation with relfilenode 12341 would be >6GB in size, because it has been extended to have 6 additional files (in addition to the "bare" filename, 12341). You can then go into a psql session, and run the query: select * from pg_class where relfilenode = 12341; and thereby figure out what table is involved. I'll bet that if you do this on the "origin" node, you'll find that there is some small number of tables that have *way* more 1GB partitions than there are on the subscriber nodes. Those are the tables that will need attention. You could probably accomplish the reorganization more quickly via the "CLUSTER" statement; that will reorganize the table based on the ordering of one specified index, and then regenerate all the other indices. It's not MVCC-safe, so if you have reports running concurrently, this could confuse them, but if you take the apps down, as you surely should, it won't be a problem. You don't necessarily have to take Slony-I down during this, but the locks taken out on tables by CLUSTER/VACUUM FULL will block slons from doing any work until those transactions complete.
I wouldn't think you need to do VACUUM FULL or CLUSTER against the subscribers if they haven't actually bloated (and based on what you have said, there is no indication that they have). -- output = reverse("ofni.secnanifxunil" "@" "enworbbc") http://www3.sympatico.ca/cbbrowne/linuxdistributions.html The quickest way to a man's heart is through his chest, with an axe. ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] Performance query about large tables, lots of concurrent access
Tom Lane wrote: Karl Wright <[EMAIL PROTECTED]> writes: [2007-06-18 09:39:49,797]ERROR Plan: Index Scan using i1181764142395 on intrinsiclink (cost=0.00..14177.29 rows=5 width=253) [2007-06-18 09:39:49,797]ERROR Plan: Index Cond: ((jobid = $2) AND ((childidhash)::text = ($3)::text)) [2007-06-18 09:39:49,797]ERROR Plan: Filter: ((childid = ($4)::text) AND ((isnew = ($5)::bpchar) OR (isnew = ($6)::bpchar))) In this case it looks like the planner is afraid that that's exactly what will happen --- a cost of 14177 suggests that several thousand row fetches are expected to happen, and yet it's only predicting 5 rows out after the filter. Well, that's odd, because the hash in question that it is using is the SHA-1 hash of a URL. There's essentially one row per URL in this table. What about isnew? Isnew is simply a flag which I want to set for all rows that belong to this particular child, but only if it's one of two particular values. Also, how many rows do *you* expect out of the query? The planner is not going to be aware of the hashed relationship between childidhash and childid --- it'll think those are independent conditions which they evidently aren't. So it may be that the query really does retrieve thousands of rows, and the rows=5 estimate is bogus because it's double-counting the selectivity of the childid condition. This can vary, but I expect there to be at on average a few dozen rows returned from the overall query. The only way the index-condition part of the query can be returning thousands of rows would be if: (a) there is really a lot of data of this kind, or (b) the hash function is basically not doing its job and there are thousands of collisions occurring. In fact, that's not the case. In psql I just did the following analysis: >> metacarta=> explain select count(*) from intrinsiclink where jobid=1181766706097 and childidhash='7E130F3B688687757187F1638D8776ECEF3009E0'; QUERY PLAN Aggregate (cost=14992.23..14992.24 rows=1 width=0) -> Index Scan using i1181764142395 on intrinsiclink (cost=0.00..14971.81 rows=8167 width=0) Index Cond: ((jobid = 1181766706097::bigint) AND ((childidhash)::text = '7E130F3B688687757187F1638D8776ECEF3009E0'::text)) (3 rows) metacarta=> select count(*) from intrinsiclink where jobid=1181766706097 and childidhash='7E130F3B688687757187F1638D8776ECEF3009E0'; count --- 0 (1 row) << Granted this is well after-the-fact, but you can see that the cost estimate is wildly wrong in this case. I did an ANALYZE on that table and repeated the explain, and got this: >> metacarta=> analyze intrinsiclink; ANALYZE metacarta=> explain select count(*) from intrinsiclink where jobid=1181766706097 and childidhash='7E130F3B688687757187F1638D8776ECEF3009E0'; QUERY PLAN Aggregate (cost=15276.36..15276.37 rows=1 width=0) -> Index Scan using i1181764142395 on intrinsiclink (cost=0.00..15255.53 rows=8333 width=0) Index Cond: ((jobid = 1181766706097::bigint) AND ((childidhash)::text = '7E130F3B688687757187F1638D8776ECEF3009E0'::text)) (3 rows) << ... even more wildly wrong. Karl regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Hardware suggestions
[EMAIL PROTECTED] writes: sizes etc., I wondered about the hardware. Most things were about the I/O of harddisks, RAM and file system. Is the filesystem that relevant? Because wo want to stay at Ubuntu because of the software support, espacially for the GIS-Systems. I think we need at least about 300-500Gb for storage and the server you get for this price are about two dualcore 2.0 - 2.8 GHz Opterons. I would suggest 8 GB of RAM, four 500 GB (Seagate) drives in RAID 10, a dual-core CPU (AMD or Intel) and a 3ware or Areca controller. If you don't need a 1U case and you can use a tower case you should be able to get those specs within your budget. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Performance query about large tables, lots of concurrent access
Gregory Stark writes: VACUUM doesn't require shutting down the system, it doesn't lock any tables or otherwise prevent other jobs from making progress. It does add extra i/o but In addition to what Gregory pointed out, you may want to also consider using Autovacuum. That may also help. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Performance query about large tables, lots of concurrent access
Alvaro Herrera writes: How large is the database? I must admit I have never seen a database that took 4 days to vacuum. This could mean that your database is humongous, or that the vacuum strategy is wrong for some reason. Especially with 16GB of RAM. I have a setup with several databases (the largest of which is a 1TB database) and I do a nightly vacuum analyze for ALL databases. It takes about 22 hours. And this is with constant updates to the large 1TB database. This is with PostgreSQL 8.1.3. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Performance query about large tables, lots of concurrent access
"Karl Wright" <[EMAIL PROTECTED]> writes: >> In this case it looks like the planner is afraid that that's exactly >> what will happen --- a cost of 14177 suggests that several thousand row >> fetches are expected to happen, and yet it's only predicting 5 rows out >> after the filter. It's using this plan anyway because it has no better >> alternative, but you should think about whether a different index >> definition would help. Another index won't help if the reason the cost is so high isn't because the index isn't very selective but because there are lots of dead tuples. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Performance query about large tables, lots of concurrent access
Karl Wright <[EMAIL PROTECTED]> writes: > I did an ANALYZE on that table and repeated the explain, and got this: > ... > ... even more wildly wrong. Hmm. You might need to increase the statistics target for your larger tables. It's probably not a big deal for queries like this one, but I'm worried that you may be getting bad plans for complicated joins. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Performance query about large tables, lots of concurrent access
Karl Wright wrote:
> I did an ANALYZE on that table and repeated the explain, and got this:
> >>
> metacarta=> analyze intrinsiclink;
> ANALYZE
> metacarta=> explain select count(*) from intrinsiclink where jobid=1181766706097 and childidhash='7E130F3B688687757187F1638D8776ECEF3009E0';
> QUERY PLAN
> Aggregate (cost=15276.36..15276.37 rows=1 width=0)
>   -> Index Scan using i1181764142395 on intrinsiclink (cost=0.00..15255.53 rows=8333 width=0)
>       Index Cond: ((jobid = 1181766706097::bigint) AND ((childidhash)::text = '7E130F3B688687757187F1638D8776ECEF3009E0'::text))
> (3 rows)
> <<
> ... even more wildly wrong.
Interesting. What is the statistics target for this table? Try increasing it, with ALTER TABLE ... SET STATISTICS, rerun analyze, and try again. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
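For reference, the suggested statistics-target change would look roughly like this; the target value 200 and the choice of column are illustrative only, not a recommendation made in the thread.

  -- Sketch: raise the per-column statistics target (the default was 10 in this
  -- era), then refresh the statistics.
  ALTER TABLE intrinsiclink ALTER COLUMN childidhash SET STATISTICS 200;
  ANALYZE intrinsiclink;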
Re: [PERFORM] Performance query about large tables, lots of concurrent access
Karl Wright writes:
> I'm not writing off autovacuum - just the concept that the large tables aren't the ones that are changing. Unfortunately, they *are* the most dynamically updated.
Would it be possible for you to partition the tables? By date or in some other fashion, to try to have some tables not get affected by the updates/inserts? I am in the process of breaking a DB up into tables by date; our historical data never changes. Also, what is the physical size of all this data? ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
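For readers unfamiliar with partitioning in the 8.1/8.2 era, a minimal sketch of the inheritance-based approach being suggested follows; all table and column names here are hypothetical, not from the poster's schema, and rows must be routed into the correct child table (directly, or via a rule or trigger).

  -- Parent table plus one monthly child partition.
  CREATE TABLE history (id bigint, created date, payload text);

  CREATE TABLE history_2007_06 (
      CHECK (created >= DATE '2007-06-01' AND created < DATE '2007-07-01')
  ) INHERITS (history);

  -- With constraint_exclusion = on in postgresql.conf, a query like this only
  -- scans the children whose CHECK constraints can possibly match.
  SELECT count(*) FROM history
  WHERE created >= DATE '2007-06-01' AND created < DATE '2007-07-01';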
Re: [PERFORM] Maintenance question / DB size anomaly...
Chris Browne <[EMAIL PROTECTED]> writes: > [EMAIL PROTECTED] (Kurt Overberg) writes: >> In my investigation of this anomaly, I noticed that the data/ dir on >> db1 (the master) is around 60 Gigs. The data directory on the slaves >> is around 25Gb. After about 3 months of head scratching, someone on >> the irc channel suggested that it may be due to index bloat. > I'd be inclined to head to the filesystem level, and try to see what > tables are bloated *there*. At least as a first cut, it should be sufficient to look at pg_class.relpages, which'd be far easier to correlate with table names ;-). The relpages entry should be accurate as of the most recent VACUUM on each table, which ought to be close enough unless I missed something about the problem situation. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
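A sketch of the kind of pg_class query being suggested here; the sizes are approximate because relpages is only refreshed by VACUUM/ANALYZE, and the calculation assumes the default 8 kB block size.

  -- Largest tables and indexes according to the last VACUUM/ANALYZE.
  SELECT relname, relkind, relpages, relpages * 8 / 1024 AS approx_mb
  FROM pg_class
  ORDER BY relpages DESC
  LIMIT 20;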
Re: [PERFORM] Performance query about large tables, lots of concurrent access
"Gregory Stark" <[EMAIL PROTECTED]> writes: > "Karl Wright" <[EMAIL PROTECTED]> writes: > >>> In this case it looks like the planner is afraid that that's exactly >>> what will happen --- a cost of 14177 suggests that several thousand row >>> fetches are expected to happen, and yet it's only predicting 5 rows out >>> after the filter. It's using this plan anyway because it has no better >>> alternative, but you should think about whether a different index >>> definition would help. > > Another index won't help if the reason the cost is so high isn't because the > index isn't very selective but because there are lots of dead tuples. Sorry, I didn't mean to say that was definitely the case, only that having bloated tables with lots of dead index pointers could have similar symptoms because the query still has to follow all those index pointers. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Regarding Timezone
"soni de" <[EMAIL PROTECTED]> writes: > But we want to set this timezone parameter to IST. Which "IST" are you interested in? Irish, Israel, or Indian Standard Time? Postgres prefers to use the zic timezone names, which are less ambiguous. Try this to see likely options: regression=# select * from pg_timezone_names where abbrev = 'IST'; name | abbrev | utc_offset | is_dst ---+++ Asia/Calcutta | IST| 05:30:00 | f Asia/Colombo | IST| 05:30:00 | f Europe/Dublin | IST| 01:00:00 | t Eire | IST| 01:00:00 | t (4 rows) If you're after Indian Standard Time, set timezone to 'Asia/Calcutta'. You'll probably also want to set timezone_abbreviations to 'India' so that "IST" is interpreted the way you want in timestamp datatype input. See http://www.postgresql.org/docs/8.2/static/datetime-config-files.html regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[PERFORM] PostgreSQL Configuration Tool for Dummies
Please read the whole email before replying: I love the feedback I have received but I feel that somehow I did not communicate the intent of this mini project very well. So let me outline a few basics and who the audience was intended for. Mini project title: Initial Configuration Tool for PostgreSQL for Dummies 1) This is intended for newbie's. Not for experienced users or advanced DBAs. 2) This tool is NOT intended to monitor your PostgreSQL efficiency. 3) I suggested JavaScript because most people that get started with PostgreSQL will go to the web in order to find out about issues relating to configuration. I wanted a very simple way for people to access the tool that would not be tied to any particular environment or OS. If there is someone that is using a text browser to view the web then they are probably geeky enough not to want to bother with using this tool. 4) The intent is just to give people that have no clue a better starting point than some very generic defaults. Please think simple. I stress the word simple. The real challenge here is getting the formulas correct. Someone mentioned to not focus on the values but just get something out there for everyone to help tweak. I agree! What questions do you think should be asked in order to figure out what values should go into the formulas for the configuration suggestions? My thoughts: What version of PostgreSQL are you using? How much memory will be available to PostgreSQL? How many connections will be made to PostgreSQL? Thanks, Lance Campbell Project Manager/Software Architect Web Services at Public Affairs University of Illinois 217.333.0382 http://webservices.uiuc.edu
Re: [PERFORM] Maintenance question / DB size anomaly...
Richard, Thanks for the feedback! I found oid2name and have been mucking about with it, but haven't really found anything that stands out yet. Most of the tables/indexes I'm comparing across machines seem to take up a similar amount of disk space. I think I'm going to have to get fancy and write some shell scripts. Regarding the slony configuration scripts, you're assuming that I have such scripts. Our slony install was originally installed by a contractor, and modified since then so "getting my act together with respect to slony" is kinda beyond the scope of what I'm trying to accomplish with this maintenance. I really just want to figure out whats going on with db1, and want to do so in a way that won't ruin slony since right now it runs pretty well, and I doubt I'd be able to fix it if it seriously broke. Upon a cursory pass with oid2name, it seems that my sl_log_1_idx1 index is out of hand: -bash-3.00$ oid2name -d mydb -f 955960160 From database "mydb": Filenode Table Name -- 955960160 sl_log_1_idx1 -bash-3.00$ ls -al 955960160* -rw--- 1 postgres postgres 1073741824 Jun 19 11:08 955960160 -rw--- 1 postgres postgres 1073741824 Jun 13 2006 955960160.1 -rw--- 1 postgres postgres 909844480 Jun 19 10:47 955960160.10 -rw--- 1 postgres postgres 1073741824 Jul 31 2006 955960160.2 -rw--- 1 postgres postgres 1073741824 Sep 12 2006 955960160.3 -rw--- 1 postgres postgres 1073741824 Oct 19 2006 955960160.4 -rw--- 1 postgres postgres 1073741824 Nov 27 2006 955960160.5 -rw--- 1 postgres postgres 1073741824 Feb 3 12:57 955960160.6 -rw--- 1 postgres postgres 1073741824 Mar 2 11:57 955960160.7 -rw--- 1 postgres postgres 1073741824 Mar 29 09:46 955960160.8 -rw--- 1 postgres postgres 1073741824 Mar 29 09:46 955960160.9 I know that slony runs its vacuuming in the background, but it doesn't seem to be cleaning this stuff up. Interestingly, from my VACUUM pgfouine output, that index doesn't take that long at all to vacuum analyze (compared to my other, much larger tables). Am I making the OID->filename translation properly? Running this: SELECT relname, relpages FROM pg_class ORDER BY relpages DESC; ...gives me... sl_log_1_idx1 | 1421785 xrefmembergroup | 1023460 answerselectinstance| 565343 ...does this jibe with what I'm seeing above? I guess I'll run a full vacuum on the slony tables too? I figured something would else would jump out bigger than this. FWIW, the same table on db2 and db3 is very small, like zero. I guess this is looking like it is overhead from slony? Should I take this problem over to the slony group? Thanks again, gang- /kurt On Jun 19, 2007, at 10:13 AM, Richard Huxton wrote: Kurt Overberg wrote: In my investigation of this anomaly, I noticed that the data/ dir on db1 (the master) is around 60 Gigs. The data directory on the slaves is around 25Gb. After about 3 months of head scratching, someone on the irc channel suggested that it may be due to index bloat. Although, doing some research, it would seem that those problems were resolved in 7.4(ish), and it wouldn't account for one database being 2.5x bigger. Another unknown is Slony overhead (both in size and vacuum times). Check the oid2name/dbsize utilities in the contrib RPM for 8.0.x http://www.postgresql.org/docs/8.0/static/diskusage.html Shouldn't be too hard to find out where the disk space is going. Oh and 8.0.13 is the latest release of 8.0 series, so you'll want to use your maintenance window to upgrade too. Lots of good bugfixes there. 
The ONLY thing I can think of is that I DROPped a large number of tables from db1 a few months ago (they weren't getting replicated). This is on the order of 1700+ fairly largeish (50,000 + row) tables. I do not remember doing a vacuum full after dropping them, so perhaps that's my problem. I'm planning on doing some maintenance this weekend, during which I will take the whole system down, then on db1, run a VACUUM FULL ANALYZE on the whole database, then a REINDEX on my very large tables. I may drop and recreate the indexes on my big tables, as I hear that may be faster than a REINDEX. I will probably run a VACUUM FULL ANALYZE on the slaves as well. You'll probably find CLUSTER to be quicker than VACUUM FULL, although you need enough disk-space free for temporary copies of the table/indexes concerned. Dropping and recreating indexes should prove much faster than VACUUMING with them. Shouldn't matter for CLUSTER afaict. Thoughts? Suggestions? Anyone think this will actually help my problem of size and vacuum times? Do I need to take Slony down while I do this? Will the VACUUM FULL table locking interfere with Slony?
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
Campbell, Lance wrote: Please think simple. I stress the word simple. The real challenge here is getting the formulas correct. Someone mentioned to not focus on the values but just get something out there for everyone to help tweak. I agree! What questions do you think should be asked in order to figure out what values should go into the formulas for the configuration suggestions? My thoughts: What version of PostgreSQL are you using? OK, obviously not needed if embedded in the manuals. > How many connections will be made to PostgreSQL? OK (but changed order) How much memory will be available to PostgreSQL? Would structure it like: - What is total memory of your machine? - How much do you want to reserve for other apps (e.g. apache/java)? Also: - How many disks will PG be using? - How much data do you think you'll store? - Will your usage be: mostly reads|balance of read+write|mostly writes - Are your searches: all very simple|few complex|lots of complex queries Then, with the output provide a commentary stating reasons why for the chosen values. e.g. random_page_cost = 1.0 Because you have [effective_cache_size = 1GB] and [total db size = 0.5GB] the cost of fetching a page is the same no matter what order you fetch them in. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
Campbell, Lance writes: 3) I suggested JavaScript because most people that get started with PostgreSQL will go to the web in order to find out about issues relating Why not c? It could then go into contrib. Anyways.. language is likely the least important issue.. As someone mentioned.. once the formulas are worked out it can be done in a few languages.. as people desire.. How much memory will be available to PostgreSQL? How many connections will be made to PostgreSQL? Will this be a dedicated Postgresql server? Will there be mostly reads or will there also be significant amount of writes? Are you on a RAID system or do you have several disks over which you would like to run postgresql on? ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
Francisco and Richard, Why ask about disk or raid? How would that impact any settings in postgresql.conf? I did forget the obvious question: What OS are you using? Thanks, Lance Campbell Project Manager/Software Architect Web Services at Public Affairs University of Illinois 217.333.0382 http://webservices.uiuc.edu -Original Message- From: Francisco Reyes [mailto:[EMAIL PROTECTED] Sent: Tuesday, June 19, 2007 11:58 AM To: Campbell, Lance Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] PostgreSQL Configuration Tool for Dummies Campbell, Lance writes: > 3) I suggested JavaScript because most people that get started with > PostgreSQL will go to the web in order to find out about issues relating Why not c? It could then go into contrib. Anyways.. language is likely the least important issue.. As someone mentioned.. once the formulas are worked out it can be done in a few languages.. as people desire.. > How much memory will be available to PostgreSQL? > How many connections will be made to PostgreSQL? Will this be a dedicated Postgresql server? Will there be mostly reads or will there also be significant amount of writes? Are you on a RAID system or do you have several disks over which you would like to run postgresql on? ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Maintenance question / DB size anomaly...
Kurt Overberg wrote: Richard, Thanks for the feedback! I found oid2name and have been mucking about with it, but haven't really found anything that stands out yet. Most of the tables/indexes I'm comparing across machines seem to take up a similar amount of disk space. I think I'm going to have to get fancy and write some shell scripts. Regarding the slony configuration scripts, you're assuming that I have such scripts. Our slony install was originally installed by a contractor, and modified since then so "getting my act together with respect to slony" is kinda beyond the scope of what I'm trying to accomplish with this maintenance. I really just want to figure out whats going on with db1, and want to do so in a way that won't ruin slony since right now it runs pretty well, and I doubt I'd be able to fix it if it seriously broke. Upon a cursory pass with oid2name, it seems that my sl_log_1_idx1 index is out of hand: If the sl_log_1 table is large too, it'll be worth reading throught the FAQ to see if any of its notes apply. http://cbbrowne.com/info/faq.html -bash-3.00$ oid2name -d mydb -f 955960160 From database "mydb": Filenode Table Name -- 955960160 sl_log_1_idx1 -bash-3.00$ ls -al 955960160* -rw--- 1 postgres postgres 1073741824 Jun 19 11:08 955960160 -rw--- 1 postgres postgres 1073741824 Jun 13 2006 955960160.1 -rw--- 1 postgres postgres 909844480 Jun 19 10:47 955960160.10 -rw--- 1 postgres postgres 1073741824 Jul 31 2006 955960160.2 -rw--- 1 postgres postgres 1073741824 Sep 12 2006 955960160.3 -rw--- 1 postgres postgres 1073741824 Oct 19 2006 955960160.4 -rw--- 1 postgres postgres 1073741824 Nov 27 2006 955960160.5 -rw--- 1 postgres postgres 1073741824 Feb 3 12:57 955960160.6 -rw--- 1 postgres postgres 1073741824 Mar 2 11:57 955960160.7 -rw--- 1 postgres postgres 1073741824 Mar 29 09:46 955960160.8 -rw--- 1 postgres postgres 1073741824 Mar 29 09:46 955960160.9 I know that slony runs its vacuuming in the background, but it doesn't seem to be cleaning this stuff up. Interestingly, from my VACUUM pgfouine output, that index doesn't take that long at all to vacuum analyze (compared to my other, much larger tables). Am I making the OID->filename translation properly? Looks OK to me Running this: SELECT relname, relpages FROM pg_class ORDER BY relpages DESC; ...gives me... sl_log_1_idx1 | 1421785 xrefmembergroup | 1023460 answerselectinstance| 565343 ...does this jibe with what I'm seeing above? I guess I'll run a full vacuum on the slony tables too? I figured something would else would jump out bigger than this. FWIW, the same table on db2 and db3 is very small, like zero. I guess this is looking like it is overhead from slony? Should I take this problem over to the slony group? Well, pages are 8KB each (by default), so that'd be about 10.8GB, which seems to match your filesizes above. Read through the FAQ I linked to - for some reason Slony's not clearing out transactions it's replicated to your slaves (they *are* in sync, aren't they?). Could be a transaction preventing vacuuming, or perhaps a partially dropped node? Check the size of the sl_log_1 table and see if that tallies. -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
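To cross-check the file listing against the catalog, something along these lines should work; this is a sketch, and again relpages is only as fresh as the last VACUUM/ANALYZE on each object.

  -- relpages * 8192 should roughly match the sum of the 1 GB segment files
  -- seen on disk for each object (default 8 kB block size assumed).
  SELECT relname, relpages, relpages::bigint * 8192 AS approx_bytes
  FROM pg_class
  WHERE relname IN ('sl_log_1', 'sl_log_1_idx1', 'sl_log_1_idx2');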
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
Campbell, Lance wrote: Francisco and Richard, Why ask about disk or raid? How would that impact any settings in postgresql.conf? Well, random_page_cost will depend on how fast your disk system can locate a non-sequential page. If you have a 16-disk RAID-10 array that's noticably less time than a single 5400rpm IDE in a laptop. I did forget the obvious question: What OS are you using? Tricky to keep simple, isn't it :-) -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
> What version of PostgreSQL are you using?
I think newbies should be pushed a bit to use the latest versions, maybe with some advice on how to set up the apt sources (in debian/ubuntu) to get them.
> How much memory will be available to PostgreSQL? How many connections will be made to PostgreSQL?
I also think Postgres newbies using PHP should be encouraged to use something like lighttpd/fastcgi instead of Apache. The fastcgi model permits use of very few database connections and working PHP processes, since lighttpd handles all the slow transfers to the client asynchronously. You can do the same with two Apache instances, one serving static pages and acting as a proxy for the second Apache serving dynamic pages. With this setup, even low-end server setups hold up fine (for our personal sites, a friend and I share a dedicated server with 256MB of RAM, which we rent for 20€ a month). This thing will never run 200 Apache processes, but we have no problem with lighttpd/php-fcgi and postgres. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Maintenance question / DB size anomaly...
Chris, I took your advice, and I had found that sl_log_1 seems to be causing some of the problem. Here's the result of a VACUUM VERBOSE:
mydb # vacuum verbose _my_cluster.sl_log_1 ;
INFO: vacuuming "_my_cluster.sl_log_1"
INFO: index "sl_log_1_idx1" now contains 309404 row versions in 1421785 pages
DETAIL: 455001 index row versions were removed. 1419592 index pages have been deleted, 1416435 are currently reusable. CPU 16.83s/5.07u sec elapsed 339.19 sec.
INFO: index "sl_log_1_idx2" now contains 312864 row versions in 507196 pages
DETAIL: 455001 index row versions were removed. 506295 index pages have been deleted, 504998 are currently reusable. CPU 6.44s/2.27u sec elapsed 138.70 sec.
INFO: "sl_log_1": removed 455001 row versions in 7567 pages
DETAIL: CPU 0.56s/0.40u sec elapsed 6.63 sec.
INFO: "sl_log_1": found 455001 removable, 309318 nonremovable row versions in 13764 pages
DETAIL: 0 dead row versions cannot be removed yet. There were 51972 unused item pointers. 0 pages are entirely empty. CPU 24.13s/7.85u sec elapsed 486.49 sec.
INFO: vacuuming "pg_toast.pg_toast_955960155"
INFO: index "pg_toast_955960155_index" now contains 9 row versions in 2 pages
DETAIL: 0 index pages have been deleted, 0 are currently reusable. CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_955960155": found 0 removable, 9 nonremovable row versions in 3 pages
DETAIL: 0 dead row versions cannot be removed yet. There were 3 unused item pointers. 0 pages are entirely empty. CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM
...I then checked the disk and those pages are still there. If I do a: select count(*) from _my_cluster.sl_log_1; count --- 6366 (1 row)
Would a VACUUM FULL take care of this? It seems to me that it's not clearing up the indexes properly. You are correct in that I do see things getting much bigger on the master than on the subscriber nodes. Could this cause my slony replication to bog down? Also- I have a question about this comment:
> You don't forcibly have to take Slony-I down during this, but the locks taken out on tables by CLUSTER/VACUUM FULL will block slons from doing any work until those transactions complete.
That's because no writing will be done to the tables, thus, no slony triggers will get triggered, correct? I'd rather not shut down slony if I don't have to, but will if it "is safer/better/more badass". For those playing along at home,
> $ find $PGDATA/base -name "[0-9]+\.[0-9]+"
...I had to use: find $PGDATA/base -name "[0-9]*\.[0-9]*" ...but the pluses should have worked too. Still a much better way than how I was doing it. Thanks again for helping me with this, it's greatly appreciated! /kurt ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
Campbell, Lance writes: Francisco and Richard, Why ask about disk or raid? How would that impact any settings in postgresql.conf? If the user has 2 disks and says that he will do a lot of updates he could put pg_xlog in the second disk. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
On 6/19/07, Francisco Reyes <[EMAIL PROTECTED]> wrote: Campbell, Lance writes: > Francisco and Richard, > Why ask about disk or raid? How would that impact any settings in > postgresql.conf? If the user has 2 disks and says that he will do a lot of updates he could put pg_xlog in the second disk. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster Let's not ask about disk or raid at this level of sanity tuning. It is important for a newbie to take the right first step. When it comes to disks, we start talking I/O, SATA, SCSI and the varying degrees of SATA and SCSI, and controller cards. Then we throw in RAID and the different levels therein. Add to that, we can talk about drivers controlling these drives and which OS is faster, more stable, etc. As you can see, a newbie would get drowned. So, please keep it simple. I know many people on this list are Gurus. We know you are the best in this field, but we are not and are just trying to improve what we have. -- Yudhvir Singh Sidhu 408 375 3134 cell
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
Yudhvir, I completely agree. I was just putting together a similar email. Thanks, Lance Campbell Project Manager/Software Architect Web Services at Public Affairs University of Illinois 217.333.0382 http://webservices.uiuc.edu From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Y Sidhu Sent: Tuesday, June 19, 2007 12:49 PM To: pgsql-performance@postgresql.org Subject: Re: [PERFORM] PostgreSQL Configuration Tool for Dummies On 6/19/07, Francisco Reyes <[EMAIL PROTECTED]> wrote: Campbell, Lance writes: > Francisco and Richard, > Why ask about disk or raid? How would that impact any settings in > postgresql.conf? If the user has 2 disks and says that he will do a lot of updates he could put pg_xlog in the second disk. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster Let's not ask about disk or raid at this level of sanity tuning. It is important for a newbie to take the right first step. When it comes to disks, we start talking I/O, SATA, SCSI and the varying degrees of SATA and SCSI, and controller cards. Then we throw in RAID and the different levels therein. Add to that, we can talk about drivers controlling these drives and which OS is faster, more stable, etc. As you can see, a newbie would get drowned. So, please keep it simple. I know many people on this list are Gurus. We know you are the best in this field, but we are not and are just trying to improve what we have. -- Yudhvir Singh Sidhu 408 375 3134 cell
Re: [PERFORM] Performance query about large tables, lots of concurrent access
Francisco Reyes wrote: I have a setup with several databases (the largest of which is 1TB database) and I do a nightly vacuum analyze for ALL databases. It takes about 22 hours. And this is with constant updates to the large 1TB database. This is with Postgresql 8.1.3 22h nightly? Wow, you have long nights ;-). On a serious note, the index vacuum improvements in 8.2 might help you to cut that down. You seem to be happy with your setup, but I thought I'd mention it.. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
On Tue, 19 Jun 2007 12:58:26 -0400 Francisco Reyes <[EMAIL PROTECTED]> wrote: > Campbell, Lance writes: > > 3) I suggested JavaScript because most people that get started with > > PostgreSQL will go to the web in order to find out about issues relating > > Why not c? Why not whatever and install it on www.PostgreSQL.org? Is there any reason that this tool would need to be run on every installation. Run it on the site and it can always be up to date and can be written in whatever language is easiest to maintain on the mother system. I would also like to make a pitch for a JavaScript-free tool. Just collect all the pertinent information, work it out and display the results in a second page. Some people just don't like JavaScript and turn it off even if we can run it in our browser. -- D'Arcy J.M. Cain <[EMAIL PROTECTED]> | Democracy is three wolves http://www.druid.net/darcy/| and a sheep voting on +1 416 425 1212 (DoD#0082)(eNTP) | what's for dinner. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
D'Arcy, I wanted to put it on the www.postgresql.org site. That is what I said in my original email. I don't believe anyone from the actual project has contacted me. I am setting up a JavaScript version first. If someone wants to do a different one feel free. I will have all of the calculations in the JavaScript so it should be easy to do it in any language. Thanks, Lance Campbell Project Manager/Software Architect Web Services at Public Affairs University of Illinois 217.333.0382 http://webservices.uiuc.edu -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of D'Arcy J.M. Cain Sent: Tuesday, June 19, 2007 12:32 PM To: Francisco Reyes Cc: pgsql-performance@postgresql.org Subject: Re: [PERFORM] PostgreSQL Configuration Tool for Dummies On Tue, 19 Jun 2007 12:58:26 -0400 Francisco Reyes <[EMAIL PROTECTED]> wrote: > Campbell, Lance writes: > > 3) I suggested JavaScript because most people that get started with > > PostgreSQL will go to the web in order to find out about issues relating > > Why not c? Why not whatever and install it on www.PostgreSQL.org? Is there any reason that this tool would need to be run on every installation. Run it on the site and it can always be up to date and can be written in whatever language is easiest to maintain on the mother system. I would also like to make a pitch for a JavaScript-free tool. Just collect all the pertinent information, work it out and display the results in a second page. Some people just don't like JavaScript and turn it off even if we can run it in our browser. -- D'Arcy J.M. Cain <[EMAIL PROTECTED]> | Democracy is three wolves http://www.druid.net/darcy/| and a sheep voting on +1 416 425 1212 (DoD#0082)(eNTP) | what's for dinner. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[PERFORM] PostgreSQL Configuration Tool for Dummies
Below is a link to the HTML JavaScript configuration page I am creating: http://www.webservices.uiuc.edu/postgresql/ I had many suggestions. Based on the feedback I received, I put together the initial list of questions. This list of questions can be changed. Memory: There are many different ways to ask about memory. Rather than ask a series of questions I went with a single question, #2. If it is better to ask about the memory in a series of questions then please give me the questions you would ask and why you would ask each of them. From my understanding the primary memory issue as it relates to PostgreSQL is "how much memory is available to PostgreSQL". Remember that this needs to be as simple as possible. My next step is to list the PostgreSQL parameters found in the postgresql.conf file and how I will generate their values based on the questions I have so far. I will primarily focus on PostgreSQL 8.2.x. Once I have a consensus from everyone then I will put functionality behind the "Generate Suggested Settings" button. Thanks for all of the feedback, Lance Campbell Project Manager/Software Architect Web Services at Public Affairs University of Illinois 217.333.0382 http://webservices.uiuc.edu
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
On 6/19/07, Campbell, Lance <[EMAIL PROTECTED]> wrote: Below is a link to the HTML JavaScript configuration page I am creating: http://www.webservices.uiuc.edu/postgresql/ I had many suggestions. Based on the feedback I received, I put together the initial list of questions. This list of questions can be change. Memory There are many different ways to ask about memory. Rather than ask a series of questions I went with a single question, #2. If it is better to ask about the memory in a series of questions then please give me the questions you would ask and why you would ask each of them. From my understanding the primary memory issue as it relates to PostgreSQL is "how much memory is available to PostgreSQL". Remember that this needs to be as simple as possible. My next step is to list the PostgreSQL parameters found in the postgresql.conf file and how I will generate their values based on the questions I have so far. I will primarily focus on PostgreSQL 8.2.x. Once I have a consensus from everyone then I will put functionality behind the "Generate Suggested Settings" button. Thanks for all of the feedback, Lance Campbell Project Manager/Software Architect Web Services at Public Affairs University of Illinois 217.333.0382 http://webservices.uiuc.edu Lance, Simply awesome! -- Yudhvir Singh Sidhu 408 375 3134 cell
Re: [PERFORM] Performance query about large tables, lots of concurrent access
Heikki Linnakangas writes: On a serious note, the index vacuum improvements in 8.2 might help you to cut that down. You seem to be happy with your setup, but I thought I'd mention it.. I am really, really trying.. to go to 8.2. I have a thread on "general" going on for about a week. I am unable to restore a database on 8.2.4.. on a particular machine. Don't know if the issue is the machine configuration or whether I have found a Postgresql bug. The plan is to copy the data over and work on migrating to the second machine. Also we are splitting the database so historical information (which never changes for us) will be in one DB and all the active/current data will be on another. This way our backups/vacuums will be faster. ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
Campbell, Lance writes: For the "6) Are your searches:" How about having "many simple" ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
On Tue, 19 Jun 2007, Y Sidhu wrote: On 6/19/07, Francisco Reyes <[EMAIL PROTECTED]> wrote: Campbell, Lance writes: > Francisco and Richard, > Why ask about disk or raid? How would that impact any settings in > postgresql.conf? If the user has 2 disks and says that he will do a lot of updates he could put pg_xlog in the second disk. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster Let's not ask about disk or raid at this level of sanity tuning. It is important for a newbie to take the right first step. When it comes to disks, we start talking I/O, SATA, SCSI and the varying degrees of SATA and SCSI, and controller cards. Then we throw in RAID and the different levels therein. Add to that, we can talk about drivers controlling these drives and which OS is faster, more stable, etc. As you can see, a newbie would get drowned. So, please keep it simple. I know many people on this list are Gurus. We know you are the best in this field, but we are not and are just trying to improve what we have. I strongly agree. besides, the number and types of drives, raid configurations, etc is so variable that I strongly believe that the right answer is going to be something along the lines of 'run this tool and then enter the number(s) that the tool reports' and then let the tool measure the end result of all the variables rather then trying to calculate the results. David Lang ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
On Tue, 19 Jun 2007, Campbell, Lance wrote: Memory There are many different ways to ask about memory. Rather than ask a series of questions I went with a single question, #2. If it is better to ask about the memory in a series of questions then please give me the questions you would ask and why you would ask each of them. From my understanding the primary memory issue as it relates to PostgreSQL is "how much memory is available to PostgreSQL". Remember that this needs to be as simple as possible.
There are three categories of memory usage:
1. needed by other software
2. available for postgres
3. needed by the OS
It's not clear if what you are asking is #2 or a combination of #2 and #3. IMHO you should ask for #2 and #3, possibly along the lines of "how much memory is in the machine that isn't already used by other applications". David Lang ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] Best use of second controller with faster disks?
Vivek Khera writes: no file will ever be larger than 1Gb I didn't need to make any adjustments to the newfs parameters. You should consider using "newfs -i 65536" for partitions to be used for postgresql. You will get more usable space and will still have lots of free inodes. For my next postgresql server I am likely going to do "newfs -i 262144" On my current primary DB I have 2049 inodes in use and 3,539,389 free. That was with newfs -i 65536. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
> there are three catagories of memory useage > > 1. needed by other software > 2. available for postgres > 3. needed by the OS There's actually only two required memory questions: M1) How much RAM do you have on this machine? M2) Is this: () Dedicated PostgreSQL Server? () Server shared with a few other applications? () Desktop? I don't think the "mostly reads / mostly writes" question covers anything, nor is it likely to produce accurate answers. Instead, we need to ask the users to characterize what type of application they are running: T1) Please characterize the general type of workload you will be running on this database. Choose one of the following four: () WEB: any scripting-language application which mainly needs to support 90% or more data reads, and many rapid-fire small queries over a large number of connections. Examples: forums, content management systems, directories. () OLTP: this application involves a large number of INSERTs, UPDATEs and DELETEs because most users are modifying data instead of just reading it. Examples: accounting, ERP, logging tools, messaging engines. () Data Warehousing: also called "decision support" and "BI", these database support a fairly small number of large, complicated reporting queries, very large tables, and large batch data loads. () Mixed/Other: if your application doesn't fit any of the above, our script will try to pick "safe, middle-of-the-road" values. Hmmm, drop question (6) too. (2) should read: "What is the maximum number of database connections which you'll need to support? If you don't know, we'll pick a default." Other questions we need: How many/how fast processors do you have? Pick the option which seems closest to what you have: () A single laptop processor () Single or dual older processors (1ghz) () Dual or quad current consumer processors (2ghz+) () Large, recent multi-core server system "What OS Are You Using", of course, needs to have Linux, Solaris, BSD, OSX and Windows. At some point, this tool will also need to generate for the user any shmem settings that they need to make on the OS. -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
On Tue, Jun 19, 2007 at 10:49:01AM -0700, Y Sidhu wrote: > On 6/19/07, Francisco Reyes <[EMAIL PROTECTED]> wrote: > > > >Campbell, Lance writes: > > > >> Francisco and Richard, > >> Why ask about disk or raid? How would that impact any settings in > >> postgresql.conf? > > > >If the user has 2 disks and says that he will do a lot of updates he could > >put pg_xlog in the second disk. > > > > > >---(end of broadcast)--- > >TIP 2: Don't 'kill -9' the postmaster > > > > Let's not ask about disk or raid at this level of sanity tuning. It is > important for a newbie to take the right first step. When it comes to disks, > we start talking I/O, SATA, SCSI and the varying degrees of SATA and SCSI, > and controller cards. Then we throw in RAID and the different levels > therein. Add to that, we can talk about drivers controlling these drives and > which OS is faster, more stable, etc. As you can see, a newbie would get > drowned. So, please keep it simple. I know many people on this list are > Gurus. We know you are the best in this field, but we are not and are just > trying to improve what we have. Ignoring the i/o subsystem in db configuration, there's an idea. You could request some bonnie++ output (easy to aquire) as a baseline, do your magic analysis based on this, and skip it if it is not provided with a warning. Course the magic may be harder to come by. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[PERFORM] PostgreSQL Configuration Tool for Dummies
Now I am at the difficult part: what parameters to calculate and how to calculate them. Everything below has to do with PostgreSQL version 8.2. The parameters I would think we should calculate are: max_connections, shared_buffers, work_mem, maintenance_work_mem, effective_cache_size, random_page_cost. Any other variables? I am open to suggestions. Calculations based on values supplied in the questions at the top of the page:
max_connections = question #3, or a minimum of 8
effective_cache_size = {question #2} MB
maintenance_work_mem = ({question #2} * 0.1) MB
Any thoughts on the other variables based on the questions found at the top of the below web page? http://www.webservices.uiuc.edu/postgresql/ Thanks, Lance Campbell Project Manager/Software Architect Web Services at Public Affairs University of Illinois 217.333.0382 http://webservices.uiuc.edu
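To make the formulas concrete, this is the sort of fragment such a generator might emit for a hypothetical dedicated server with 2 GB of RAM and 100 connections; the numbers are illustrative only, not values endorsed anywhere in this thread (8.2 unit-suffix syntax assumed).

  # postgresql.conf fragment -- hypothetical example output
  max_connections = 100
  shared_buffers = 256MB            # a common starting point on a 2 GB box
  work_mem = 4MB                    # per sort/hash operation, per backend
  maintenance_work_mem = 200MB      # the "10% of available memory" formula above
  effective_cache_size = 1GB        # memory the OS is expected to use for caching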
Re: [PERFORM] [ADMIN] Postgres VS Oracle
On Mon, 2007-06-18 at 17:55 +0200, David Tokmatchi wrote: > I am a DBA for Oracle and a beginner on Postgres. For a company in > France, I must make a comparative study between Postgres and Oracle. > Can you send any useful document which can help me? > Scalability? Performance? Benchmark? Availability? Architecture? > Limitation: users, volumes? Resources needed? Support? I would suggest you make your comparison based upon your specific needs, not a purely abstract comparison. If you're not sure what your requirements are, research those first. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Replication
On Thu, 2007-06-14 at 16:14 -0700, Craig James wrote: > Looking for replication solutions, I find: > > Slony-I > Seems good, single master only, master is a single point of failure, > no good failover system for electing a new master or having a failed > master rejoin the cluster. Slave databases are mostly for safety or > for parallelizing queries for performance. Suffers from O(N^2) > communications (N = cluster size). > There's MOVE SET which transfers the origin (master) from one node to another without losing any committed transactions. There's also FAILOVER, which can set a new origin even if the old origin is completely gone, however you will lose the transactions that haven't been replicated yet. To have a new node join the cluster, you SUBSCRIBE SET, and you can MOVE SET to it later if you want that to be the master. Regards, Jeff Davis ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Maintenance question / DB size anomaly...
Kurt Overberg <[EMAIL PROTECTED]> writes: > mydb # vacuum verbose _my_cluster.sl_log_1 ; > INFO: "sl_log_1": found 455001 removable, 309318 nonremovable row > versions in 13764 pages > DETAIL: 0 dead row versions cannot be removed yet. Hmm. So you don't have a long-running-transactions problem (else that DETAIL number would have been large). What you do have is a failure to vacuum sl_log_1 on a regular basis (because there are so many dead/removable rows). I suspect also some sort of Slony problem, because AFAIK a properly operating Slony system shouldn't have that many live rows in sl_log_1 either --- don't they all represent as-yet-unpropagated events? I'm no Slony expert though. You probably should ask about that on the Slony lists. > ...I then checked the disk and those pages are still there. Yes, regular VACUUM doesn't try very hard to shorten the disk file. > Would a VACUUM FULL take care of this? It would, but it will take an unpleasantly long time with so many live rows to reshuffle. I'd advise first working to see if you can get the table down to a few live rows. Then a VACUUM FULL will be a snap. Also, you might want to do REINDEX after VACUUM FULL to compress the indexes --- VACUUM FULL isn't good at that. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
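For reference, the cleanup Tom describes would run along these lines, once sl_log_1 is down to a handful of live rows; both statements take exclusive locks on the table.

  VACUUM FULL _my_cluster.sl_log_1;
  REINDEX TABLE _my_cluster.sl_log_1;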
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
PFC <[EMAIL PROTECTED]> writes: >> What version of PostgreSQL are you using? > I think newbies should be pushed a bit to use the latest versions, How about pushed *hard* ? I'm constantly amazed at the number of people who show up in the lists saying they installed 7.3.2 or whatever random version they found in a dusty archive somewhere. "Please upgrade" is at least one order of magnitude more valuable configuration advice than anything else we could tell them. If the configurator is a live tool on the website, then it could be aware of the latest release numbers and prod people with an appropriate amount of urgency depending on how old they say their version is. This may be the one good reason not to provide it as a standalone program. (No, we shouldn't make it try to "phone home" for latest release numbers --- in the first place, that won't work if the machine is really isolated from the net, and in the second place people will be suspicious of the motives.) regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] Maintenance question / DB size anomaly...
That's the thing thats kinda blowing my mind here, when I look at that table: db1=# select count(*) from _my_cluster.sl_log_1 ; count --- 6788 (1 row) As far as my DB is concerned, there's only ~7000 rows (on average) when I look in there (it does fluctuate, I've seen it go as high as around 12k, but then its gone back down, so I know events are moving around in there). So from what I can tell- from the disk point of view, there's ~11Gb of data; from the vacuum point of view there's 309318 rows. From the psql point of view, there's only around 7,000. Am I missing something? Unless there's something going on under the hood that I don't know about (more than likely), it seems like my sl_log_1 table is munged or somehow otherwise very screwed up. I fear that a re-shuffling or dropping/recreating the index will mess it up further. Maybe when I take my production systems down for maintenance, can I wait until sl_log_1 clears out, so then I can just drop that table altogether (and re-create it of course)? Thanks! /kurt On Jun 19, 2007, at 5:33 PM, Tom Lane wrote: Kurt Overberg <[EMAIL PROTECTED]> writes: mydb # vacuum verbose _my_cluster.sl_log_1 ; INFO: "sl_log_1": found 455001 removable, 309318 nonremovable row versions in 13764 pages DETAIL: 0 dead row versions cannot be removed yet. Hmm. So you don't have a long-running-transactions problem (else that DETAIL number would have been large). What you do have is a failure to vacuum sl_log_1 on a regular basis (because there are so many dead/removable rows). I suspect also some sort of Slony problem, because AFAIK a properly operating Slony system shouldn't have that many live rows in sl_log_1 either --- don't they all represent as-yet-unpropagated events? I'm no Slony expert though. You probably should ask about that on the Slony lists. ...I then checked the disk and those pages are still there. Yes, regular VACUUM doesn't try very hard to shorten the disk file. Would a VACUUM FULL take care of this? It would, but it will take an unpleasantly long time with so many live rows to reshuffle. I'd advise first working to see if you can get the table down to a few live rows. Then a VACUUM FULL will be a snap. Also, you might want to do REINDEX after VACUUM FULL to compress the indexes --- VACUUM FULL isn't good at that. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
Lance, > The parameters I would think we should calculate are: > > max_connections > > shared_buffers > > work_mem > > maintenance_work_mem > > effective_cache_size > > random_page_cost Actually, I'm going to argue against messing with random_page_cost. It's a cannon being used when a slingshot is called for. Instead (and this was the reason for the "What kind of CPU?" question) you want to reduce the cpu_* costs. I generally find that if cpu_* are reduced as appropriate to modern faster cpus, and effective_cache_size is set appropriately, a random_page_cost of 3.5 seems to work for appropriate choice of index scans. If you check out my spreadsheet version of this: http://pgfoundry.org/docman/view.php/1000106/84/calcfactors.sxc ... you'll see that the approach I found most effective was to create profiles for each of the types of db applications, and then adjust the numbers based on those. Other things to adjust: wal_buffers checkpoint_segments commit_delay vacuum_delay autovacuum Anyway, do you have a pgfoundry ID? I should add you to the project. -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 6: explain analyze is your friend
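A session-level way to experiment with the cost settings Josh mentions before committing anything to postgresql.conf; the values below are illustrative reductions only, and the final query is a placeholder.

  SET effective_cache_size = 131072;   -- 1 GB expressed in 8 kB pages
  SET cpu_tuple_cost = 0.005;          -- default in this era: 0.01
  SET cpu_index_tuple_cost = 0.002;    -- default: 0.005
  SET cpu_operator_cost = 0.001;       -- default: 0.0025
  EXPLAIN ANALYZE SELECT count(*) FROM some_table;   -- hypothetical query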
Re: [PERFORM] Maintenance question / DB size anomaly...
Kurt Overberg <[EMAIL PROTECTED]> writes: > That's the thing thats kinda blowing my mind here, when I look at > that table: > db1=# select count(*) from _my_cluster.sl_log_1 ; > count > --- >6788 > (1 row) Well, that's real interesting. AFAICS there are only two possibilities: 1. VACUUM sees the other 300k tuples as INSERT_IN_PROGRESS; a look at the code shows that these are counted the same as plain live tuples, but they'd not be visible to other transactions. I wonder if you could have any really old open transactions that might have inserted all those tuples? 2. The other 300k tuples are committed good, but they are not seen as valid by a normal MVCC-aware transaction, probably because of transaction wraparound. This would require the sl_log_1 table to have escaped vacuuming for more than 2 billion transactions, which seems a bit improbable but maybe not impossible. (You did say you were running PG 8.0.x, right? That's the last version without any strong defenses against transaction wraparound...) The way to get some facts, instead of speculating, would be to get hold of the appropriate version of pg_filedump from http://sources.redhat.com/rhdb/ and dump out sl_log_1 with it (probably the -i option would be sufficient), then take a close look at the tuples that aren't visible to other transactions. (You could do "select ctid from sl_log_1" to determine which ones are visible.) regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
On Tue, 19 Jun 2007, Josh Berkus wrote: "What OS Are You Using", of course, needs to have Linux, Solaris, BSD, OSX and Windows. At some point, this tool will also need to generate for the user any shmem settings that they need to make on the OS. I also noticed that on FreeBSD (6.2) at least the stock config simply won't run without building a new kernel that bumps up all the SHM stuff or dropping down resource usage in the postgres config... Overall, I like the idea. I've been slowly working on weaning myself off of mysql and I think removing any roadblocks that new users might stumble upon seems like an excellent way to get more exposure. Charles -- --Josh Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Maintenance question / DB size anomaly...
Kurt Overberg <[EMAIL PROTECTED]> wrote: > > That's the thing thats kinda blowing my mind here, when I look at > that table: > > db1=# select count(*) from _my_cluster.sl_log_1 ; > count > --- >6788 > (1 row) > > As far as my DB is concerned, there's only ~7000 rows (on average) > when I look > in there (it does fluctuate, I've seen it go as high as around 12k, > but then its > gone back down, so I know events are moving around in there). This is consistent with my experience with Slony and sl_log_[12] I'm pretty sure that the slon processes vacuum sl_log_* on a fairly regular basis. I'm absolutely positive that slon occasionally switches from using sl_log_1, to sl_log_2, then truncates sl_log_1 (then, after some time, does the same in reverse) So, in order for you to get massive bloat of the sl_log_* tables, you must be doing a LOT of transactions in the time before it switches logs and truncates the unused version. Either that, or something is going wrong. > So from what I can tell- from the disk point of view, there's ~11Gb > of data; from the > vacuum point of view there's 309318 rows. From the psql point of > view, there's only > around 7,000. Am I missing something? Something seems wrong here. Correct me if I'm missing something, but you're saying the table takes up 11G on disk, but vacuum says there are ~14000 pages. That would mean your page size is ~800K. Doesn't seem right. > Unless there's something > going on under the > hood that I don't know about (more than likely), it seems like my > sl_log_1 table is munged or > somehow otherwise very screwed up. I fear that a re-shuffling or > dropping/recreating > the index will mess it up further. Maybe when I take my production > systems down for > maintenance, can I wait until sl_log_1 clears out, so then I can just > drop that > table altogether (and re-create it of course)? Possibly drop this node from the Slony cluster and re-add it. Unless it's the origin node, in which case you'll have to switchover, then redo the origin then switch back ... > > Thanks! > > /kurt > > > > > On Jun 19, 2007, at 5:33 PM, Tom Lane wrote: > > > Kurt Overberg <[EMAIL PROTECTED]> writes: > >> mydb # vacuum verbose _my_cluster.sl_log_1 ; > >> INFO: "sl_log_1": found 455001 removable, 309318 nonremovable row > >> versions in 13764 pages > >> DETAIL: 0 dead row versions cannot be removed yet. > > > > Hmm. So you don't have a long-running-transactions problem (else that > > DETAIL number would have been large). What you do have is a failure > > to vacuum sl_log_1 on a regular basis (because there are so many > > dead/removable rows). I suspect also some sort of Slony problem, > > because AFAIK a properly operating Slony system shouldn't have that > > many live rows in sl_log_1 either --- don't they all represent > > as-yet-unpropagated events? I'm no Slony expert though. You probably > > should ask about that on the Slony lists. > > > >> ...I then checked the disk and those pages are still there. > > > > Yes, regular VACUUM doesn't try very hard to shorten the disk file. > > > >> Would a VACUUM FULL take care of this? > > > > It would, but it will take an unpleasantly long time with so many live > > rows to reshuffle. I'd advise first working to see if you can get the > > table down to a few live rows. Then a VACUUM FULL will be a snap. > > Also, you might want to do REINDEX after VACUUM FULL to compress the > > indexes --- VACUUM FULL isn't good at that. 
> > regards, tom lane
-- Bill Moran Collaborative Fusion Inc. [EMAIL PROTECTED] Phone: 412-422-3463x4023
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
Campbell, Lance writes: > max_connections Shouldn't that come straight from the user? ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] Volunteer to build a configuration tool
It would be cool if someone started a generic configuration+benchmark utility that could be used with virtually any software. Something like this: 1. Create a configuration file parser for your specific application, be it PostgreSQL, MySQL, Apache, whatever. 2. Create a min/max or X,Y,Z configuration option file that determines which options to try. ie: shared_buffers = 1000-2[1000] //1000 is the increment by wal_buffers = 8,16,32 ... 3. Create start/stop scripts for the specific application 4. Create a benchmark script for the application that returns relevant metrics. In PGSQL's case, it would be tied in to PG bench probably. In Apache's case AB. This utility would of course need to know how to read the metrics to determine what is "best". 5. Run the utility. Ideally it would use some sort of genetic algorithm to benchmark the application initially to get base numbers, then one-by-one apply the different configuration options and re-run the benchmark. It would output the metrics for each run and once it is done, pick the best run and let you know what those settings are. I don't think something like this would be very difficult at all to write, and it would be modular enough to work for virtually any application. For a database it would take a while to run depending on the benchmark script, but even then you could have a "fast" and "slow" benchmark script that could be easily run when you first install PostgreSQL. This way, too, you're not worrying about how much memory the system has, or how many disks they have, etc... The system will figure out the best possible settings for a specific benchmark. Not to mention people could easily take a SQL log of their own application running, and use that as the benchmark to get "real world" numbers. Any other sort of configuration "suggestion" utility will always have the question of what do you recommend? How much data do you try to get and what can be determined from that data to get the best settings? Is it really going to be that much better than the default, at least enough better to warrant the work and effort put into it? On Mon, 2007-06-18 at 10:04 -0500, Campbell, Lance wrote: > I am a Java Software architect, DBA, and project manager for the > University of Illinois, Department of Web Services. We use PostgreSQL > to serve about 2 million pages of dynamic content a month; everything > from calendars, surveys, forms, discussion boards, RSS feeds, etc. I > am really impressed with this tool. > > > > The only major problem area I have found where PostgreSQL is really > lacking is in “what should my initial configuration settings be?” I > realize that there are many elements that can impact a DBA’s specific > database settings but it would be nice to have a “configuration tool” > that would get someone up and running better in the beginning. > > > > This is my idea: > > > > A JavaScript HTML page that would have some basic questions at the > top: > > 1) How much memory do you have? > > 2) How many connections will be made to the database? > > 3) What operating system do you use? > > 4) Etc… > > > > Next the person would press a button, “generate”, found below the > questions. The JavaScript HTML page would then generate content for > two Iframes at the bottom on the page. One Iframe would contain the > contents of the postgresql.conf file. The postgresql.conf settings > would be tailored more to the individual’s needs than the standard > default file.
The second Iframe would contain the default settings > one should consider using with their operating system. > > > > My web team would be very happy to develop this for the PostgreSQL > project. It would have saved us a lot of time by having a > configuration tool in the beginning. I am willing to make this a very > high priority for my team. > > > > Thanks, > > > > Lance Campbell > > Project Manager/Software Architect > > Web Services at Public Affairs > > University of Illinois > > 217.333.0382 > > http://webservices.uiuc.edu > > > > -- Mike Benoit <[EMAIL PROTECTED]>
Re: [PERFORM] Maintenance question / DB size anomaly...
On Jun 19, 2007, at 7:26 PM, Tom Lane wrote: Kurt Overberg <[EMAIL PROTECTED]> writes: That's the thing that's kinda blowing my mind here, when I look at that table: db1=# select count(*) from _my_cluster.sl_log_1 ; count --- 6788 (1 row) Well, that's real interesting. AFAICS there are only two possibilities: 1. VACUUM sees the other 300k tuples as INSERT_IN_PROGRESS; a look at the code shows that these are counted the same as plain live tuples, but they'd not be visible to other transactions. I wonder if you could have any really old open transactions that might have inserted all those tuples? Unlikely - the database has been stopped and restarted, which I think closes out transactions? Or could that cause the problems? 2. The other 300k tuples are committed good, but they are not seen as valid by a normal MVCC-aware transaction, probably because of transaction wraparound. This would require the sl_log_1 table to have escaped vacuuming for more than 2 billion transactions, which seems a bit improbable but maybe not impossible. (You did say you were running PG 8.0.x, right? That's the last version without any strong defenses against transaction wraparound...) Yep, this is 8.0.4. It has been running for over a year, fairly heavy updates, so I would guess it's possible. The way to get some facts, instead of speculating, would be to get hold of the appropriate version of pg_filedump from http://sources.redhat.com/rhdb/ and dump out sl_log_1 with it (probably the -i option would be sufficient), then take a close look at the tuples that aren't visible to other transactions. (You could do "select ctid from sl_log_1" to determine which ones are visible.) Okay, I've grabbed pg_filedump and got it running on the appropriate server. I really have No Idea how to read its output though. Where does the ctid from sl_log_1 appear in the following listing? Block 0 - Block Offset: 0x Offsets: Lower 20 (0x0014) Block: Size 8192 Version 2 Upper 8176 (0x1ff0) LSN: logid 949 recoff 0xae63b06c Special 8176 (0x1ff0) Items: 0 Free Space: 8156 Length (including item array): 24 BTree Meta Data: Magic (0x00053162) Version (2) Root: Block (1174413) Level (3) FastRoot: Block (4622) Level (1) -- Empty block - no items listed - BTree Index Section: Flags: 0x0008 (META) Blocks: Previous (0) Next (0) Level (0) ... this was taken from the first page file (955960160.0 I guess you could call it). Does this look interesting to you, Tom? FWIW - this IS on my master DB. I've been slowly preparing an upgrade to 8.2, I guess I'd better get that into gear, hmmm? :-( /kurt regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Maintenance question / DB size anomaly...
Kurt Overberg <[EMAIL PROTECTED]> writes: > Okay, I've grabbed pg_filedump and got it running on the appropriate > server. > I really have No Idea how to read its output though. Where does the > ctid from sl_log_1 > appear in the following listing? ctid is (block number, item number) > Block0 > BTree Meta Data: Magic (0x00053162) Version (2) > Root: Block (1174413) Level (3) > FastRoot: Block (4622) Level (1) This seems to be an index, not the sl_log_1 table. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
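As a follow-up sketch (not from the thread) for matching data files to relations before running pg_filedump: the file name on disk is the relation's relfilenode, so pg_class can say what the 955960160 file belongs to and which file holds the sl_log_1 heap itself.

-- relkind 'r' means a table, 'i' means an index.
SELECT relname, relkind FROM pg_class WHERE relfilenode = 955960160;

-- The file that actually contains the sl_log_1 heap:
SELECT relname, relfilenode FROM pg_class WHERE relname = 'sl_log_1' AND relkind = 'r';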
Re: [PERFORM] Volunteer to build a configuration tool
On Wed, 20 Jun 2007, Mike Benoit wrote: It would be cool if someone started a generic configuration+benchmark utility that could be used with virtually any software. It would be cool. It would also be impossible. Create a benchmark script for the application that returns relevant metrics. In PGSQL's case, it would be tied in to PG bench probably. In Apache's case AB. This utility would of course need to know how to read the metrics to determine what is "best". The usual situation in these benchmarks is that you get parameters that adjust along a curve where there's a trade-off between, say, total throughput and worse-case latency. Specifying "best" here would require a whole specification language if you want to model how real tuning efforts work. The AB case is a little simpler, but for PostgreSQL you'd want something like "With this database and memory sizing, I want the best throughput possible where maximum latency is usually <5 seconds with 1-30 clients running this transaction, while still maintaining at least 400 TPS with up to 100 clients, and the crash recovery time can't take more than 10 minutes". There are all sorts of local min/max situations and non-robust configurations an automated tool will put you into if you don't force an exhaustive search by being very specific like this. I don't think something like this would be very difficult at all to write Here I just smile and say that proves you've never tried to write one :) It's a really hard problem that gets harder the more you poke at it. There's certainly lots of value to writing a utility that automatically tests out multiple parameter values in a batch and compares the results. If you're not doing that now, you should consider scripting something up that does. Going beyond that to having it pick the optimal parameters more automatically would take AI much stronger than just a genetic algorithm approach. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] Volunteer to build a configuration tool
Greg Smith <[EMAIL PROTECTED]> writes: > On Wed, 20 Jun 2007, Mike Benoit wrote: >> I don't think something like this would be very difficult at all to >> write > Here I just smile and say that proves you've never tried to write one :) I'm with Greg on this. It's not that easy to optimize in a multi-parameter space even if all conditions are favorable, and they never are. I think what would be much more useful in the long run is some serious study of the parameters themselves. For instance, random_page_cost is a self-admitted oversimplification of reality. We know that good settings for it depend critically on how large your DB is relative to your RAM; which means there are at least two parameters there, but no one's done any serious thinking about how to disentangle 'em. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
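One way to get a first handle on the "DB size relative to RAM" question is simply to measure it; a small sketch assuming the size functions available in 8.1 and later:

-- Overall size of the current database, then its ten largest tables.
SELECT pg_size_pretty(pg_database_size(current_database())) AS db_size;

SELECT relname, pg_size_pretty(pg_relation_size(oid)) AS table_size
FROM pg_class
WHERE relkind = 'r'
ORDER BY pg_relation_size(oid) DESC
LIMIT 10;

Comparing those numbers against physical memory at least tells you which side of the "mostly cached" line a given installation sits on.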
Re: [PERFORM] PostgreSQL Configuration Tool for Dummies
On Tue, 19 Jun 2007, Josh Berkus wrote: I don't think the "mostly reads / mostly writes" question covers anything, nor is it likely to produce accurate answers. Instead, we need to ask the users to characterize what type of application they are running: T1) Please characterize the general type of workload you will be running on this database. Choose one of the following four... We've hashed through this area before, but for Lance's benefit I'll reiterate my dissenting position on this subject. If you're building a "tool for dummies", my opinion is that you shouldn't ask for any of this information. I think there's an enormous benefit to providing something that takes basic sizing information and gives conservative guidelines based on that--as you say, "safe, middle-of-the-road values"--that are still way, way more useful than the default values. The risk in trying to make a complicated tool that satisfies all the users Josh is aiming his more sophisticated effort at is that you'll lose the newbies. Scan the archives of this mailing list for a bit. If you look at what people discover they're being nailed by, it's rarely because they need to optimize something like random_page_cost. It's usually because they have a brutally wrong value for one of the memory or vacuum parameters that are very easy to provide reasonable suggestions for without needing a lot of information about the server. I wouldn't even bother asking how many CPUs somebody has for what Lance is building. The kind of optimizations you'd do based on that are just too complicated to expect a tool to get them right and still be accessible to a novice. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 6: explain analyze is your friend
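For what it's worth, the kind of conservative, size-based output such a tool could emit might look like the fragment below. The numbers are purely illustrative, assuming a hypothetical dedicated 8.2 server with roughly 4 GB of RAM; they are not recommendations made anywhere in this thread:

# Illustrative postgresql.conf fragment only; assumes a dedicated 8.2 server with ~4 GB of RAM.
shared_buffers = 256MB          # a modest slice of RAM for the buffer cache
effective_cache_size = 2GB      # rough guess at memory available for caching
work_mem = 8MB                  # per sort/hash operation, so keep it small with many connections
maintenance_work_mem = 128MB    # speeds up VACUUM and index builds
autovacuum = on                 # the vacuum side is the other easy thing to get badly wrong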
Re: [PERFORM] Volunteer to build a configuration tool
On Wed, 20 Jun 2007, Tom Lane wrote: I think what would be much more useful in the long run is some serious study of the parameters themselves. For instance, random_page_cost is a self-admitted oversimplification of reality. If I could figure out who would sponsor such a study that's what I'd be doing right now. I have studies on many of the commit-related parameters I'll have ready in another few days; those are straightforward to map out. But you know what I have never found? A good benchmark that demonstrates how well complicated queries perform, to run studies on things like random_page_cost against. Many of the tuning knobs on the query optimizer seem very opaque to me so far, and I'm not sure how to put together a proper test to illuminate their operation and map out their useful range. Here's an example of one of the simplest questions in this area to demonstrate things I wonder about. Let's say I have a properly indexed database of some moderate size such that you're in big trouble if you do a sequential scan. How can I tell if effective_cache_size is in the right ballpark so it will do what I want to effectively navigate that? People back into a setting for that parameter right now based on memory in their system, but I never see anybody going "since your main table is X GB large, and its index is Y GB, you really need enough memory to set effective_cache_size to Z GB if you want queries/joins on that table to perform well". -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
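A sketch of how to put numbers behind that last sentence, assuming the 8.1-and-later size functions; 'my_main_table' is a placeholder, not a table from this discussion:

-- Heap size, index (plus TOAST) size, and the current effective_cache_size setting.
SELECT pg_size_pretty(pg_relation_size('my_main_table')) AS heap_size,
       pg_size_pretty(pg_total_relation_size('my_main_table')
                      - pg_relation_size('my_main_table')) AS indexes_and_toast,
       current_setting('effective_cache_size') AS effective_cache_size;

If the table and its indexes fit comfortably inside what effective_cache_size claims, the planner's assumption that repeated index fetches are cheap is at least plausible; if they dwarf it, the setting is probably too optimistic.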