Re: [PERFORM] Question on REINDEX

2005-04-19 Thread Tambet Matiisen

> 
> Josh Berkus  writes:
> >> 1) When is it necessary to run REINDEX or drop/create
> >> an index?  All I could really find in the docs is:
> 
> > If you need to VACUUM FULL, you need to REINDEX as well.  
> For example, 
> > if you drop millions of rows from a table.
> 
> That's probably a pretty good rule of thumb.  It's worth 
> noting that VACUUM FULL tends to actively bloat indexes, not 
> reduce them in size, because it has to create new index 
> entries for the rows it moves before it can delete the old 
> ones.  So if a VACUUM FULL moves many rows you are likely to 
> see the indexes get bigger not smaller.
> 

Is my current understanding correct:

1) VACUUM defragments each page locally - moves free space to the end of
page.

2) VACUUM FULL defragments table globally - tries to fill up all
partially free pages and deletes all resulting empty pages.

3) Both VACUUM and VACUUM FULL do only local defragment for indexes.

4) If you want indexes to become fully defragmented, you need to
REINDEX.


If you happen to use triggers for denormalization, like I do, then you
have a lot of updates, which means that tables and indexes quickly become
cluttered with pages that contain mostly dead tuples. If those tables
and indexes fill up shared buffers, then PostgreSQL slows down, because
it has to do much more IO than normal. Regular VACUUM FULL helped, but
I needed REINDEX as well; otherwise the indexes grew bigger than the
tables themselves!
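The maintenance cycle described above might look roughly like this (a sketch only; the table name is hypothetical):

```sql
-- Reclaim cross-page free space from a heavily updated table;
-- note this takes an exclusive lock on the table
VACUUM FULL my_denormalized_table;
-- VACUUM FULL tends to bloat indexes rather than shrink them,
-- so rebuild them afterwards
REINDEX TABLE my_denormalized_table;
```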

> > Better to up your max_fsm_pages and VACUUM regularly and
> > frequently so that you don't have to REINDEX at all.
> 
> Yes, definitely.  Also consider using CLUSTER rather than 
> VACUUM FULL when you need to clean up after massive deletions 
> from a table.  It's not any less intrusive in terms of 
> locking, but it's often faster and it avoids the index bloat 
> problem (since it effectively does a REINDEX).
> 

Hmm, thanks for the tip. BTW, is the output of 

select count(1), sum(relpages) from pg_class where relkind in
('r','i','t')

a good estimate for max_fsm_relations and max_fsm_pages?
Are these parameters used only during VACUUM, or at runtime too?

  Tambet

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Dave Held
> -Original Message-
> From: Alex Turner [mailto:[EMAIL PROTECTED]
> Sent: Monday, April 18, 2005 5:50 PM
> To: Bruce Momjian
> Cc: Kevin Brown; pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] How to improve db performance with $7K?
> 
> Does it really matter at which end of the cable the queueing is done
> (Assuming both ends know as much about drive geometry etc..)?
> [...]

The parenthetical is an assumption I'd rather not make.  If my
performance depends on my kernel knowing how my drive is laid
out, I would always be wondering if a new drive is going to 
break any of the kernel's geometry assumptions.  Drive geometry
doesn't seem like a kernel's business any more than a kernel
should be able to decode the ccd signal of an optical mouse.
The kernel should queue requests at a level of abstraction that
doesn't depend on intimate knowledge of drive geometry, and the
drive should queue requests on the concrete level where geometry
matters.  A drive shouldn't guess whether a process is trying to
read a file sequentially, and a kernel shouldn't guess whether
sector 30 is contiguous with sector 31 or not.

__
David B. Held
Software Engineer/Array Services Group
200 14th Ave. East,  Sartell, MN 56377
320.534.3637 320.253.7800 800.752.8129



Re: [PERFORM] Question on REINDEX

2005-04-19 Thread Tom Lane
"Tambet Matiisen" <[EMAIL PROTECTED]> writes:
> Is my current understanding correct:

> 1) VACUUM defragments each page locally - moves free space to the end of
> page.

> 2) VACUUM FULL defragments table globally - tries to fill up all
> partially free pages and deletes all resulting empty pages.

Both versions of VACUUM do within-page defragmentation.  Also, both
versions will remove entirely-empty pages at the end of a table.
The difference is that VACUUM FULL actively attempts to make pages
at the end empty, by moving their contents into free space in earlier
pages.  Plain VACUUM never does cross-page data movement, which is
why it doesn't need as strong a lock.

BTW, VACUUM FULL does the data movement back-to-front, and stops as soon
as it finds a tuple it cannot move down; which is a reasonable strategy
since the goal is merely to make the file shorter.  But it's entirely
likely that there will be lots of empty space left at the end.  For
instance the final state could have one 4K tuple in the last page and
up to 4K-1 free bytes in every earlier page.

> 3) Both VACUUM and VACUUM FULL do only local defragment for indexes.

> 4) If you want indexes to become fully defragmented, you need to
> REINDEX.

I don't think "defragment" is a notion that applies to indexes, at least
not in the same way as for tables.  It's true that there is no
cross-page data movement in either case.  In the last release or two
we've been able to recognize and recycle entirely-empty pages in both
btree and hash indexes, but such pages are almost never returned to the
OS; they're put on a freelist for re-use within the index, instead.

If you allow the table to grow to much more than its "normal" size,
ie, you allow many dead tuples to be formed, then getting back to
"normal" size is going to require VACUUM FULL + REINDEX (or you can use
CLUSTER or some varieties of ALTER TABLE).  This is not the recommended
maintenance process however.  Sufficiently frequent plain VACUUMs should
generally hold the free space to a tolerable level without requiring
any exclusive locking.
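Tom's suggested alternative, CLUSTER, might be sketched like this (8.0-era syntax; the table and index names are hypothetical):

```sql
-- Rewrites the table in index order and rebuilds all of its indexes,
-- avoiding the index bloat that VACUUM FULL can cause
CLUSTER my_table_pkey ON my_table;
-- Re-gathering statistics afterwards is a common precaution
ANALYZE my_table;
```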

> Hmm, thanks for a tip. BTW, is output of 
> select count(1), sum(relpages) from pg_class where relkind in
> ('r','i','t')
> good estimate for max_fsm_relations and max_fsm_pages?

Within that one database, yes --- don't forget you must sum these
numbers across all DBs in the cluster.  Also you need some slop
in the max_fsm_pages setting because of quantization in the space
usage.  It's probably easier to let VACUUM VERBOSE do the calculation
for you.
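The per-database sum Tom describes could be gathered like this (run it in each database in the cluster, then add the numbers together; a sketch, not a precise sizing):

```sql
-- Rough per-database input for max_fsm_relations / max_fsm_pages;
-- remember to leave slop on top of the summed totals
SELECT count(1)      AS relations,
       sum(relpages) AS pages
FROM pg_class
WHERE relkind IN ('r', 'i', 't');
```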

regards, tom lane



Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Mohan, Ross
Good question.  If the SCSI system was moving the head from track 1 to 10, and 
a request then came in for track 5, could the system make the head stop at 
track 5 on its way to track 10?  That is something that only the controller 
could do.  However, I have no idea if SCSI does that.

||  SCSI, AFAIK, does NOT do this. What SCSI can do is allow "next" request
insertion into the head of the request queue (queue-jumping), and/or defer
request ordering to be done by the drive per se (queue re-ordering). I have
looked, in vain, for evidence that SCSI somehow magically "stops in the
middle of a request to pick up data" (my words, not yours).

The only part I am pretty sure about is that real-world experience shows SCSI 
is better for a mixed I/O environment.  Not sure why, exactly, but the command 
queueing obviously helps, and I am not sure what else does.

||  TCQ is the secret sauce, no doubt. I think NCQ (the SATA version of per se
drive request reordering) should go a looong way (but not all the way) toward
making SATA 'enterprise acceptable'. Multiple initiators (e.g. more than one
host being able to talk to a drive) is a biggie, too. AFAIK only SCSI
drives/controllers do that for now.




-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073




Re: [PERFORM] Postgresql works too slow

2005-04-19 Thread Simon Riggs
On Mon, 2005-04-18 at 08:50 +0400, Nurlan Mukhanov (AL/EKZ) wrote:
> I'm trying to restore my database from a dump in several parallel processes, 
> but the restore process is very slow.
> Number of rows about 100 000 000,
> RAM: 8192M
> CPU: Ultra Sparc 3
> Number of CPU: 4
> OS: SunOS sun 5.8
> RDBMS: PostgreSQL 8.0

> How to increase postgresql speed? Why does postgres take only 5.0% of CPU time?

When you say restore... what are you actually doing? 
An archive recovery?
A reload?
A file-level restore of the database?

Best Regards, Simon Riggs




Re: [PERFORM] Postgresql works too slow

2005-04-19 Thread Brad Nicholson
Simon Riggs wrote:
> On Mon, 2005-04-18 at 08:50 +0400, Nurlan Mukhanov (AL/EKZ) wrote:
>> I'm trying to restore my database from a dump in several parallel processes, but the restore process is very slow.
>> Number of rows about 100 000 000,
>> RAM: 8192M
>> CPU: Ultra Sparc 3
>> Number of CPU: 4
>> OS: SunOS sun 5.8
>> RDBMS: PostgreSQL 8.0
>>
>> How to increase postgresql speed? Why does postgres take only 5.0% of CPU time?
>
> When you say restore... what are you actually doing?
> An archive recovery?
> A reload?
> A file-level restore of the database?
If you are doing a restore off a pg_dump, did you dump the data as 
inserts?  This takes a lot more time to restore.

--
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp. 



Re: [PERFORM] Question on REINDEX

2005-04-19 Thread Josh Berkus
Tambet,

> Hmm, thanks for a tip. BTW, is output of
>
> select count(1), sum(relpages) from pg_class where relkind in
> ('r','i','t')

Well, if you do that for all databases in the cluster, it's the number you 
start with.  However, setting FSM_pages to that would be assuming that you 
expected 100% of the rows to be replaced by UPDATEs or DELETEs before you 
ran VACUUM.   I generally run VACUUM a little sooner than that.

See the end portion of:
http://www.powerpostgresql.com/PerfList

-- 
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Bruce Momjian
Mohan, Ross wrote:
> The only part I am pretty sure about is that real-world experience shows SCSI 
> is better for a mixed I/O environment.  Not sure why, exactly, but the 
> command queueing obviously helps, and I am not sure what else does.
> 
> ||  TCQ is the secret sauce, no doubt. I think NCQ (the SATA version of per 
> se drive request reordering) 
>should go a looong way (but not all the way) toward making SATA 
> 'enterprise acceptable'. Multiple 
>initiators (e.g. more than one host being able to talk to a drive) is a 
> biggie, too. AFAIK only SCSI
>drives/controllers do that for now. 

What is 'multiple initiators' used for in the real world?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Mohan, Ross
Clustered file systems are the first/best example that
comes to mind. Host A and Host B can both issue requests to the disk farm, e.g. 



-Original Message-
From: Bruce Momjian [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, April 19, 2005 12:10 PM
To: Mohan, Ross
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?


Mohan, Ross wrote:
> The only part I am pretty sure about is that real-world experience 
> shows SCSI is better for a mixed I/O environment.  Not sure why, 
> exactly, but the command queueing obviously helps, and I am not sure 
> what else does.
> 
> ||  TCQ is the secret sauce, no doubt. I think NCQ (the SATA version 
> || of per se drive request reordering)
>should go a looong way (but not all the way) toward making SATA 
> 'enterprise acceptable'. Multiple 
>initiators (e.g. more than one host being able to talk to a drive) is a 
> biggie, too. AFAIK only SCSI
>drives/controllers do that for now.

What is 'multiple initiators' used for in the real world?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Bruce Momjian
Mohan, Ross wrote:
> Clustered file systems is the first/best example that
> comes to mind. Host A and Host B can both request from diskfarm, eg. 

So one host writes to part of the disk and another host writes to a
different part?

---

> -Original Message-
> From: Bruce Momjian [mailto:[EMAIL PROTECTED] 
> Sent: Tuesday, April 19, 2005 12:10 PM
> To: Mohan, Ross
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] How to improve db performance with $7K?
> 
> 
> Mohan, Ross wrote:
> > The only part I am pretty sure about is that real-world experience 
> > shows SCSI is better for a mixed I/O environment.  Not sure why, 
> > exactly, but the command queueing obviously helps, and I am not sure 
> > what else does.
> > 
> > ||  TCQ is the secret sauce, no doubt. I think NCQ (the SATA version 
> > || of per se drive request reordering)
> >should go a looong way (but not all the way) toward making SATA 
> > 'enterprise acceptable'. Multiple 
> >initiators (e.g. more than one host being able to talk to a drive) is a 
> > biggie, too. AFAIK only SCSI
> >drives/controllers do that for now.
> 
> What is 'multiple initiators' used for in the real world?
> 
> -- 
>   Bruce Momjian|  http://candle.pha.pa.us
>   pgman@candle.pha.pa.us   |  (610) 359-1001
>   +  If your life is a hard drive, |  13 Roberts Road
>   +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073
> 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Richard_D_Levine


[EMAIL PROTECTED] wrote on 04/19/2005 11:10:22 AM:
>
> What is 'multiple initiators' used for in the real world?

I asked this same question and got an answer off list:  Somebody said their
SAN hardware used multiple initiators.  I would try to check the archives
for you, but this thread is becoming more of a rope.

Multiple initiators means multiple sources on the bus issuing I/O
instructions to the drives.  In theory you can have two computers on the
same SCSI bus issuing I/O requests to the same drive, or to anything else
on the bus, but I've never seen this implemented.  Others have noted this
feature as being a big deal, so somebody is benefiting from it.

Rick
>
> --
>   Bruce Momjian|  http://candle.pha.pa.us
>   pgman@candle.pha.pa.us   |  (610) 359-1001
>   +  If your life is a hard drive, |  13 Roberts Road
>   +  Christ can be your backup.|  Newtown Square, Pennsylvania
19073
>




Re: [PERFORM] Question on REINDEX

2005-04-19 Thread Tom Lane
Josh Berkus  writes:
>> select count(1), sum(relpages) from pg_class where relkind in
>> ('r','i','t')

> Well, if you do that for all databases in the cluster, it's the number you 
> start with.  However, setting FSM_pages to that would be assuming that you 
> expected 100% of the rows to be replaced by UPDATEs or DELETEs before you 
> ran VACUUM.   I generally run VACUUM a little sooner than that.

Not at all.  What it says is that you expect 100% of the pages to have
useful amounts of free space, which is a *much* weaker criterion.

I think you can usually get away with setting max_fsm_pages to less than
your actual disk footprint, but I'm not sure how much less.  It'd
probably depend a lot on your usage pattern --- for instance,
insert-only history tables don't need any FSM space.

regards, tom lane



Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Mohan, Ross
Well, more like they both are allowed to issue disk
requests and the magical "clustered file system" manages
locking, etc. 

In reality, any disk is only reading/writing to one part of
the disk at any given time, of course; but in the multiple-initiator
case, multiple streams of requests from multiple hosts can be
queued. 



-Original Message-
From: Bruce Momjian [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, April 19, 2005 12:16 PM
To: Mohan, Ross
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?


Mohan, Ross wrote:
> Clustered file systems is the first/best example that
> comes to mind. Host A and Host B can both request from diskfarm, eg.

So one host writes to part of the disk and another host writes to a different 
part?

---

> -Original Message-
> From: Bruce Momjian [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, April 19, 2005 12:10 PM
> To: Mohan, Ross
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] How to improve db performance with $7K?
> 
> 
> Mohan, Ross wrote:
> > The only part I am pretty sure about is that real-world experience
> > shows SCSI is better for a mixed I/O environment.  Not sure why, 
> > exactly, but the command queueing obviously helps, and I am not sure 
> > what else does.
> > 
> > ||  TCQ is the secret sauce, no doubt. I think NCQ (the SATA version
> > || of per se drive request reordering)
> >should go a looong way (but not all the way) toward making SATA 
> > 'enterprise acceptable'. Multiple 
> >initiators (e.g. more than one host being able to talk to a drive) is a 
> > biggie, too. AFAIK only SCSI
> >drives/controllers do that for now.
> 
> What is 'multiple initiators' used for in the real world?
> 
> -- 
>   Bruce Momjian|  http://candle.pha.pa.us
>   pgman@candle.pha.pa.us   |  (610) 359-1001
>   +  If your life is a hard drive, |  13 Roberts Road
>   +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073
> 
> 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [PERFORM] Question on REINDEX

2005-04-19 Thread Josh Berkus
Tom,

> Not at all.  What it says is that you expect 100% of the pages to have
> useful amounts of free space, which is a *much* weaker criterion.

Hmmm.  Good point.   

This seems to be another instance where my rule-of-thumb was based on false 
logic but nevertheless arrived at correct numbers.  I've seldom, if ever, set 
FSM_pages above 50% of the pages in the active database ... and never run 
out.
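Josh's 50% rule of thumb could be sketched as a starting point like this (per database; the totals would still need to be summed across the cluster, and this is a heuristic, not a guarantee):

```sql
-- Rough guess at max_fsm_pages: half the pages in the active database
SELECT sum(relpages) / 2 AS fsm_pages_guess
FROM pg_class
WHERE relkind IN ('r', 'i', 't');
```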

Hmmm... actually, it seems like, if you are vacuuming regularly, you only 
*do* need to track pages that have been touched by DELETE or UPDATE.   Other 
pages would have already been vacuumed and would not have any useful free 
space left.   Yes?

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] Question on REINDEX

2005-04-19 Thread Tom Lane
Josh Berkus  writes:
>> Not at all.  What it says is that you expect 100% of the pages to have
>> useful amounts of free space, which is a *much* weaker criterion.

> Hmmm... actually, it seems like, if you are vacuuming regularly, you only 
> *do* need to track pages that have been touched by DELETE or UPDATE.   Other 
> pages would have already been vacuumed and would not have any useful free 
> space left.   Yes?

Well, the space has to be remembered until it's reused.  On the other
hand, there's nothing that says FSM has to be aware of all the free
space available at all times --- the real criterion to avoid bloat
is that after a VACUUM, enough space is logged in FSM to satisfy all
the insertions that will happen before the next VACUUM.  So you could
have situations where free space is temporarily forgotten (for lack
of slots in FSM), but other free space gets used instead, and eventually
a later VACUUM re-finds that free space and puts it into FSM.

I think it's true that the more often you vacuum, the less FSM you need,
but this doesn't have much to do with how much free space is actually
out there on disk.  It's because you only need enough FSM to record the
free space you'll need until the next vacuum.

regards, tom lane



Re: [PERFORM] Question on REINDEX

2005-04-19 Thread Alvaro Herrera
On Tue, Apr 19, 2005 at 10:06:40AM -0400, Tom Lane wrote:

> BTW, VACUUM FULL does the data movement back-to-front, and stops as soon
> as it finds a tuple it cannot move down; which is a reasonable strategy
> since the goal is merely to make the file shorter.  But it's entirely
> likely that there will be lots of empty space left at the end.  For
> instance the final state could have one 4K tuple in the last page and
> up to 4K-1 free bytes in every earlier page.

Am I right in thinking that vacuum does at least two passes: one
front-to-back to find removable tuples, and another back-to-front for
movement?  Because if it doesn't work this way, it wouldn't relabel
(change Xmin/Xmax) tuples in early pages.  Or does it do something
different?

I know maintenance_work_mem is used for storing TIDs of to-be-moved
tuples for index cleanup ... how does it relate to the above?

-- 
Alvaro Herrera (<[EMAIL PROTECTED]>)
"Crear es tan difĂ­cil como ser libre" (Elsa Triolet)



Re: [PERFORM] Question on REINDEX

2005-04-19 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Am I right in thinking that vacuum does at least two passes: one
> front-to-back to find removable tuples, and another back-to-front for
> movement?

VACUUM FULL, yes.  VACUUM only does the first one.

> I know maintenance_work_mem is used for storing TIDs of to-be-moved
> tuples for index cleanup ... how does it relate to the above?

TIDs of to-be-deleted tuples, actually.  Movable tuples aren't stored,
they're just found on-the-fly during the back-to-front pass.

regards, tom lane



[PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-19 Thread Josh Berkus
Folks,

Params:  PostgreSQL 8.0.1 on Solaris 10
Statistics = 500
(tablenames have been changed to protect NDA)

e1=# select tablename, null_frac, correlation, n_distinct from pg_stats where 
tablename = 'clickstream1' and attname = 'session_id';
  tablename   | null_frac | correlation | n_distinct
--+---+-+
 clickstream1 | 0 |0.412034 | 378174
(2 rows)

e1=# select count(distinct session_id) from clickstream1;
  count
-
 3174813

As you can see, n_distinct estimation is off by a factor of 10 and it's 
causing query planning problems.   Any suggested hacks to improve the 
histogram on this?

(BTW, increasing the stats to 1000 only doubles n_distinct, and doesn't solve 
the problem)
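The statistics-target change being discussed is set per column and takes effect at the next ANALYZE, e.g. (table and column names taken from the post above):

```sql
-- Raise the sampling target for the problem column, then re-sample
ALTER TABLE clickstream1 ALTER COLUMN session_id SET STATISTICS 500;
ANALYZE clickstream1;
```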

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-19 Thread Dave Held
> -Original Message-
> From: Josh Berkus [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, April 19, 2005 2:09 PM
> To: pgsql-perform
> Subject: [PERFORM] Bad n_distinct estimation; hacks suggested?
> 
> [...]
> (BTW, increasing the stats to 1000 only doubles n_distinct, 
> and doesn't solve the problem)

Speaking of which, is there a reason why statistics are limited
to 1000?  Performance?

__
David B. Held
Software Engineer/Array Services Group
200 14th Ave. East,  Sartell, MN 56377
320.534.3637 320.253.7800 800.752.8129



Re: [PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-19 Thread Tom Lane
Josh Berkus  writes:
> As you can see, n_distinct estimation is off by a factor of 10x and it's 
> causing query planning problems.   Any suggested hacks to improve the 
> histogram on this?

What's the histogram itself look like?  (I'd like to see the whole
pg_stats row not just part of it ...)  There's probably no point in
showing the target=1000 version, but maybe target=100 would be
informative.
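The full row Tom asks for can be pulled with a query like this (names taken from the thread):

```sql
-- Show every statistics column, including most_common_vals,
-- most_common_freqs, and histogram_bounds
SELECT *
FROM pg_stats
WHERE tablename = 'clickstream1'
  AND attname = 'session_id';
```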

regards, tom lane



Re: [PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-19 Thread Josh Berkus
Tom,

> What's the histogram itself look like?  (I'd like to see the whole
> pg_stats row not just part of it ...)  There's probably no point in
> showing the target=1000 version, but maybe target=100 would be
> informative.

Here is the stats = 100 version.   Notice that n_distinct has gone down.

 schemaname |  tablename   |  attname   | null_frac | avg_width | 
n_distinct |   most_common_vals 

|most_common_freqs  
  
| histogram_bounds  | 
correlation
+--++---+---++---+--+-+-
 public | web_site_activity_fa | session_id | 0 | 8 |  
96107 | 
{4393922,6049228,6026260,4394034,60341,4393810,2562999,2573850,3006299,4705488,2561499,4705258,3007378,4705490,60327,60352,2560950,2567640,2569852,3006604,4394329,2570739,2406633,2407292,3006356,4393603,4394121,6449083,2565815,4387881,2406770,2407081,2564340,3007328,2406578,2407295,2562813,2567603,4387835,71014,2566253,2566900,6103079,2289424,2407597,2567627,2568333,3457448,23450,23670,60743,70739,2406818,2406852,2407511,2562816,3007446,6306095,60506,71902,591543,1169136,1447077,2285047,2406830,2573964,6222758,61393,70955,70986,71207,71530,262368,2289213,2406899,2567361,2775952,3006824,4387864,6239825,6244853,6422152,1739,58600,179293,278473,488407,1896390,2286976,2407020,2546720,2677019,2984333,3006133,3007497,3310286,3631413,3801909,4366116,4388025}
 
| 
{0.0017,0.00146667,0.0013,0.0011,0.00093,0.0009,0.0008,0.0008,0.00073,0.00073,0.0007,0.00063,0.0006,0.0006,0.00057,0.00057,0.00057,0.00057,0.00057,0.00057,0.00057,0.00053,0.0005,0.0005,0.0005,0.0005,0.0005,0.0005,0.00047,0.00047,0.00043,0.00043,0.00043,0.00043,0.0004,0.0004,0.0004,0.0004,0.0004,0.00037,0.00037,0.00037,0.00037,0.00033,0.00033,0.00033,0.00033,0.00033,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.00027,0.00027,0.00027,0.00027,0.00027,0.00027,0.00027,0.00027,0.00027,0.00023,0.00023,0.00023,0.00023,0.00023,0.00023,0.00023,0.00023,0.00023,0.00023,0.00

Re: [PERFORM] [HACKERS] PLM pulling from CVS nightly for testing in STP

2005-04-19 Thread Mark Wong
I have dbt-2 tests automatically running against each pull from CVS
and have started to automatically compile results here:
http://developer.osdl.org/markw/postgrescvs/

I did start with a bit of a minimalistic approach, so I'm open for any
comments, feedback, etc.

Mark



Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Jim C. Nasby
On Thu, Apr 14, 2005 at 10:51:46AM -0500, Matthew Nuzum wrote:
> So if you all were going to choose between two hard drives where:
> drive A has capacity C and spins at 15K rpms, and
> drive B has capacity 2 x C and spins at 10K rpms and
> all other features are the same, the price is the same and C is enough
> disk space which would you choose?
> 
> I've noticed that on IDE drives, as the capacity increases the data
> density increases and there is a pereceived (I've not measured it)
> performance increase.
> 
> Would the increased data density of the higher capacity drive be of
> greater benefit than the faster spindle speed of drive A?

The increased data density will help transfer speed off the platter, but
that's it. It won't help rotational latency.
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"



Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Jim C. Nasby
On Mon, Apr 18, 2005 at 07:41:49PM +0200, Jacques Caron wrote:
> It would be interesting to actually compare this to real-world (or 
> nearly-real-world) benchmarks to measure the effectiveness of features like 
> TCQ/NCQ etc.

I was just thinking that it would be very interesting to benchmark
different RAID configurations using dbt2. I don't know if this is
something that the lab is setup for or capable of, though.
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"



Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Jim C. Nasby
On Mon, Apr 18, 2005 at 10:20:36AM -0500, Dave Held wrote:
> Hmm...so you're saying that at some point, quantity beats quality?
> That's an interesting point.  However, it presumes that you can
> actually distribute your data over a larger number of drives.  If
> you have a db with a bottleneck of one or two very large tables,
> the extra spindles won't help unless you break up the tables and
> glue them together with query magic.  But it's still a point to
> consider.

Huh? Do you know how RAID10 works?
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"



Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Jim C. Nasby
On Mon, Apr 18, 2005 at 06:41:37PM -, Mohan, Ross wrote:
> Don't you think "optimal stripe width" would be
> a good question to research the binaries for? I'd
> think that drives the answer, largely.  (uh oh, pun alert)
> 
> EG, oracle issues IO requests (this may have changed _just_ 
> recently) in 64KB chunks, regardless of what you ask for. 
> So when I did my striping (many moons ago, when the Earth 
> was young...) I did it in 128KB widths, and set the oracle 
> "multiblock read count" accordingly. For oracle, any stripe size
> under 64KB=stupid, anything much over 128K/256K=wasteful. 
> 
> I am eager to find out how PG handles all this. 

AFAIK PostgreSQL requests data one database page at a time (normally
8k). Of course the OS might do something different.
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 


Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Jim C. Nasby
On Tue, Apr 19, 2005 at 11:22:17AM -0500, [EMAIL PROTECTED] wrote:
> 
> 
> [EMAIL PROTECTED] wrote on 04/19/2005 11:10:22 AM:
> >
> > What is 'multiple initiators' used for in the real world?
> 
> I asked this same question and got an answer off list:  Somebody said their
> SAN hardware used multiple initiators.  I would try to check the archives
> for you, but this thread is becoming more of a rope.
> 
> Multiple initiators means multiple sources on the bus issuing I/O
> instructions to the drives.  In theory you can have two computers on the
> same SCSI bus issuing I/O requests to the same drive, or to anything else
> on the bus, but I've never seen this implemented.  Others have noted this
> feature as being a big deal, so somebody is benefiting from it.

It's a big deal for Oracle clustering, which relies on shared drives. Of
course most people doing Oracle clustering are probably using a SAN and
not raw SCSI...
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 


Re: [PERFORM] immutable functions vs. join for lookups ?

2005-04-19 Thread Jim C. Nasby
You should re-run the function test using SQL as the function language
instead of plpgsql. There might be some performance to be had there.
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 


Re: [PERFORM] How to improve postgres performace

2005-04-19 Thread Jim C. Nasby
On Mon, Apr 18, 2005 at 11:36:01AM -0300, Rodrigo Moreno wrote:
> I'm really worried about that, because it's not enough anymore, and users
> complain about performance. But after running vacuumdb full, everything
> starts to run better again, so I think the problem is not related to a
> specific query.

Vacuum full will skew your results, unless you plan on running vacuum
full all the time. This is because you will always have some amount of
dead tuples in a table that has any update or delete activity. A regular
vacuum doesn't remove these tuples, it just marks them as available. So
over time, depending on how frequently a table is vacuumed, it will
settle down to a steady-state size that is greater than its size after
a vacuum full.
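A toy model makes that steady-state point concrete (the row counts and update rate below are made-up illustration, not anyone's real numbers):

```python
# Model: a table with L live rows gets U row-updates between plain VACUUMs.
# Plain VACUUM marks dead tuples' space reusable but never shrinks the file,
# so new row versions first refill freed slots, then extend the file.

def steady_state_size(live_rows, updates_per_interval, intervals):
    free = 0              # reusable slots left behind by the last vacuum
    size = live_rows      # total row slots allocated in the file
    for _ in range(intervals):
        size += max(0, updates_per_interval - free)  # extend only if no free slots
        free = updates_per_interval  # vacuum frees this interval's dead versions
    return size

print(steady_state_size(1_000_000, 100_000, 1))   # 1100000 -- grows once
print(steady_state_size(1_000_000, 100_000, 50))  # 1100000 -- then plateaus
```

The file settles about 10% above the 1,000,000-slot size that a vacuum full would restore, which is the steady state described above.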
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 


Re: [PERFORM] Sort and index

2005-04-19 Thread Jim C. Nasby
On Mon, Apr 18, 2005 at 10:44:43AM -0500, Dave Held wrote:
> Since you are fetching the entire table, you are touching all the rows.
> If the query were to fetch the rows in index order, it would be seeking
> all over the table's tracks.  By fetching in sequence order, it has a
> much better chance of fetching rows in a way that minimizes head seeks.
> Since disk I/O is generally 10-100x slower than RAM, the in-memory sort 
> can be surprisingly slow and still beat indexed disk access.  Of course,
> this is only true if the table can fit and be sorted entirely in memory
> (which, with 1500 rows, probably can).

Actually, the planner (at least in 7.4) isn't smart enough to consider
if the sort would fit in memory or not. I'm running a test right now to
see if it's actually faster to use an index in this case.
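To put rough numbers on the seek-versus-sequential trade-off quoted above (these per-page costs are illustrative assumptions, not PostgreSQL's actual cost model):

```python
# Compare fetching the same heap pages sequentially vs. in index (random) order.
pages = 100_000                   # heap pages the query must touch (assumed)
seq_ms, random_ms = 0.1, 5.0      # assumed per-page cost: streaming vs. seeking
print(pages * seq_ms / 1000, "s sequential")   # 10.0 s sequential
print(pages * random_ms / 1000, "s random")    # 500.0 s random
# A ~50x penalty per page leaves plenty of room for a sort of the
# sequentially-read rows to win, even when the sort spills to disk.
```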
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 


[PERFORM] What to do with 6 disks?

2005-04-19 Thread Jeff Frost
Now that we've hashed out which drives are quicker and more money equals 
faster...

Let's say you had a server with 6 separate 15k RPM SCSI disks, what raid 
option would you use for a standalone postgres server?

a) 3xRAID1 - 1 for data, 1 for xlog, 1 for os?
b) 1xRAID1 for OS/xlog, 1xRAID5 for data
c) 1xRAID10 for OS/xlog/data
d) 1xRAID1 for OS, 1xRAID10 for data
e) .
I was initially leaning towards b, but after talking to Josh a bit, I suspect 
that with only 4 disks the raid5 might be a performance detriment vs 3 raid 1s 
or some sort of split raid10 setup.

--
Jeff Frost, Owner   <[EMAIL PROTECTED]>
Frost Consulting, LLC   http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954


Re: [PERFORM] What to do with 6 disks?

2005-04-19 Thread Jim C. Nasby
http://stats.distributed.net is set up with the OS, WAL, and temp on a
RAID1 and the database on a RAID10. The drives are 200G SATA with a
3ware raid card. I don't think the controller has battery-backed cache,
but I'm not sure. In any case, it's almost never disk-bound on the
mirror; when it's disk-bound it's usually the RAID10. But this is a
read-mostly database. If it was write-heavy, that might not be the case.

Also, in general, I see very little disk activity from the OS itself, so
I don't think there's a large disadvantage to having it on the same
drives as part of your database. I would recommend different filesystems
for each, though. (ie: not one giant / partition)

On Tue, Apr 19, 2005 at 06:00:42PM -0700, Jeff Frost wrote:
> Now that we've hashed out which drives are quicker and more money equals 
> faster...
> 
> Let's say you had a server with 6 separate 15k RPM SCSI disks, what raid 
> option would you use for a standalone postgres server?
> 
> a) 3xRAID1 - 1 for data, 1 for xlog, 1 for os?
> b) 1xRAID1 for OS/xlog, 1xRAID5 for data
> c) 1xRAID10 for OS/xlog/data
> d) 1xRAID1 for OS, 1xRAID10 for data
> e) .
> 
> I was initially leaning towards b, but after talking to Josh a bit, I 
> suspect that with only 4 disks the raid5 might be a performance detriment 
> vs 3 raid 1s or some sort of split raid10 setup.
> 
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 


[PERFORM] Slow copy with little CPU/disk usage

2005-04-19 Thread Jim C. Nasby
A friend of mine has an application where he's copying in 4000 rows at a
time into a table that has about 4M rows. Each row is 40-50 bytes. This
is taking 25 seconds on a dual PIII-1GHz with 1G of RAM and a 2 disk
SATA mirror, running FBSD 4.10-stable. There's one index on the table.

What's really odd is that neither the CPU nor the disk is being
hammered. The box appears to be pretty idle; the postgresql process is
using 4-5% CPU.

I seem to recall others running into this before, but I can't remember
what the issue was and I can't find it in the archives.

This is version 8.0, btw.
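For scale, a back-of-the-envelope on the figures above shows how little raw data that COPY is actually moving:

```python
# How much row data is that COPY moving, and how fast?
rows, row_bytes, seconds = 4000, 50, 25
payload_kb = rows * row_bytes / 1024
print(f"{payload_kb:.0f} KB payload, {rows / seconds:.0f} rows/sec")
# ~195 KB in 25 seconds: even tripled for index and WAL writes, that is
# orders of magnitude below what a 2-disk SATA mirror can stream, which
# fits the observation that neither disk nor CPU is busy.
```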
-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 


Re: [PERFORM] Sort and index

2005-04-19 Thread Tom Lane
"Jim C. Nasby" <[EMAIL PROTECTED]> writes:
> Actually, the planner (at least in 7.4) isn't smart enough to consider
> if the sort would fit in memory or not.

Really?  Have you read cost_sort()?

It's certainly possible that the calculation is all wet, but to claim
that the issue is not considered is just wrong.

regards, tom lane



Re: [PERFORM] What to do with 6 disks?

2005-04-19 Thread Josh Berkus
Jeff,

> Let's say you had a server with 6 separate 15k RPM SCSI disks, what raid
> option would you use for a standalone postgres server?
>
> a) 3xRAID1 - 1 for data, 1 for xlog, 1 for os?
> b) 1xRAID1 for OS/xlog, 1xRAID5 for data
> c) 1xRAID10 for OS/xlog/data
> d) 1xRAID1 for OS, 1xRAID10 for data
> e) .
>
> I was initially leaning towards b, but after talking to Josh a bit, I
> suspect that with only 4 disks the raid5 might be a performance detriment
> vs 3 raid 1s or some sort of split raid10 setup.

Knowing that your installation is read-heavy, I'd recommend (d), with the WAL 
on the same disk as the OS, i.e.

RAID1 2 disks OS, pg_xlog
RAID 1+0 4 disks pgdata

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] What to do with 6 disks?

2005-04-19 Thread William Yu
My experience:
1xRAID10 for postgres
1xRAID1 for OS + WAL
Jeff Frost wrote:
> Now that we've hashed out which drives are quicker and more money equals 
> faster...
> 
> Let's say you had a server with 6 separate 15k RPM SCSI disks, what raid 
> option would you use for a standalone postgres server?
> 
> a) 3xRAID1 - 1 for data, 1 for xlog, 1 for os?
> b) 1xRAID1 for OS/xlog, 1xRAID5 for data
> c) 1xRAID10 for OS/xlog/data
> d) 1xRAID1 for OS, 1xRAID10 for data
> e) .
> 
> I was initially leaning towards b, but after talking to Josh a bit, I 
> suspect that with only 4 disks the raid5 might be a performance 
> detriment vs 3 raid 1s or some sort of split raid10 setup.



Re: [PERFORM] Slow copy with little CPU/disk usage

2005-04-19 Thread Tom Lane
"Jim C. Nasby" <[EMAIL PROTECTED]> writes:
> A friend of mine has an application where he's copying in 4000 rows at a
> time into a table that has about 4M rows. Each row is 40-50 bytes. This
> is taking 25 seconds on a dual PIII-1GHz with 1G of RAM and a 2 disk
> SATA mirror, running FBSD 4.10-stable. There's one index on the table.

If there's no hidden costs such as foreign key checks, that does seem
pretty dang slow.

> What's really odd is that neither the CPU nor the disk is being
> hammered. The box appears to be pretty idle; the postgresql process is
> using 4-5% CPU.

It's very hard to believe that *neither* disk nor CPU is maxed.
Can we see a reproducible test case, please?

regards, tom lane



Re: [PERFORM] Sort and index

2005-04-19 Thread Jim C. Nasby
On Tue, Apr 19, 2005 at 11:01:26PM -0400, Tom Lane wrote:
> "Jim C. Nasby" <[EMAIL PROTECTED]> writes:
> > Actually, the planner (at least in 7.4) isn't smart enough to consider
> > if the sort would fit in memory or not.
> 
> Really?  Have you read cost_sort()?
> 
> It's certainly possible that the calculation is all wet, but to claim
> that the issue is not considered is just wrong.

To be fair, no, I haven't looked at the code. This is based strictly on
anecdotal evidence on a 120M row table. I'm currently running a test to
see how an index scan compares to a seqscan. I also got the same results
when I added a where clause that would restrict it to about 7% of the
table.

Actually, after running some tests (below), the plan cost does change
when I change sort_mem (it was originally 5).

stats=# \d email_contrib
   Table "public.email_contrib"
   Column   |  Type   | Modifiers 
+-+---
 project_id | integer | not null
 id | integer | not null
 date   | date| not null
 team_id| integer | 
 work_units | bigint  | not null
Indexes:
"email_contrib_pkey" primary key, btree (project_id, id, date)
"email_contrib__pk24" btree (id, date) WHERE (project_id = 24)
"email_contrib__pk25" btree (id, date) WHERE (project_id = 25)
"email_contrib__pk8" btree (id, date) WHERE (project_id = 8)
"email_contrib__project_date" btree (project_id, date)
Foreign-key constraints:
    "fk_email_contrib__id" FOREIGN KEY (id) REFERENCES stats_participant(id) ON UPDATE CASCADE
    "fk_email_contrib__team_id" FOREIGN KEY (team_id) REFERENCES stats_team(team) ON UPDATE CASCADE

stats=# explain select * from email_contrib where project_id=8 order by 
project_id, id, date;
   QUERY PLAN   

 Sort  (cost=3613476.05..3635631.71 rows=8862263 width=24)
   Sort Key: project_id, id, date
   ->  Seq Scan on email_contrib  (cost=0.00..2471377.50 rows=8862263 width=24)
 Filter: (project_id = 8)
(4 rows)

stats=# explain select * from email_contrib order by project_id, id, date;
QUERY PLAN  
  
--
 Sort  (cost=25046060.83..25373484.33 rows=130969400 width=24)
   Sort Key: project_id, id, date
   ->  Seq Scan on email_contrib  (cost=0.00..2143954.00 rows=130969400 width=24)
(3 rows)

stats=# select 8862263::float/130969400;
  ?column?  

 0.067687027657
(1 row)

stats=# explain select * from email_contrib where project_id=8 order by 
project_id, id, date;
 QUERY PLAN 
 
-
 Index Scan using email_contrib_pkey on email_contrib  (cost=0.00..6832005.57 rows=8862263 width=24)
   Index Cond: (project_id = 8)
(2 rows)

stats=# explain select * from email_contrib order by project_id, id, date;
   QUERY PLAN   
 
-
 Index Scan using email_contrib_pkey on email_contrib  (cost=0.00..100055905.62 rows=130969400 width=24)
(1 row)

stats=# set enable_seqscan=on;
SET
stats=# set sort_mem=1000;
SET
stats=# explain select * from email_contrib order by project_id, id, date;
QUERY PLAN  
  
--
 Sort  (cost=28542316.63..28869740.13 rows=130969400 width=24)
   Sort Key: project_id, id, date
   ->  Seq Scan on email_contrib  (cost=0.00..2143954.00 rows=130969400 width=24)
(3 rows)

stats=# 

-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 


Re: [PERFORM] Slow copy with little CPU/disk usage

2005-04-19 Thread Greg Stark
"Jim C. Nasby" <[EMAIL PROTECTED]> writes:

> What's really odd is that neither the CPU nor the disk is being
> hammered. The box appears to be pretty idle; the postgresql process is
> using 4-5% CPU.

Is he committing every row? In that case you would see fairly low i/o
bandwidth usage because most of the time is being spent seeking and waiting
for rotational latency.
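A rough bound on that scenario, assuming a 7200 RPM drive with no write cache (an assumed figure; the original post doesn't say):

```python
# If every row is its own transaction, each commit fsyncs the WAL and,
# without a write cache, waits on the order of one platter rotation.
rpm = 7200                     # assumed drive speed
commits_per_sec = rpm / 60     # ~one commit per rotation = 120/s
rows = 4000
print(round(rows / commits_per_sec, 1), "s minimum in rotational waits")
# ~33 s of pure rotational latency for 4000 single-row commits -- the same
# order as the 25 s observed, with disk bandwidth and CPU nearly idle.
```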

-- 
greg




Re: [PERFORM] Slow copy with little CPU/disk usage

2005-04-19 Thread Mischa Sandberg
Quoting Tom Lane <[EMAIL PROTECTED]>: 
 
> "Jim C. Nasby" <[EMAIL PROTECTED]> writes: 
> > A friend of mine has an application where he's copying in 4000 rows at a 
> > time into a table that has about 4M rows. Each row is 40-50 bytes. This 
> > is taking 25 seconds on a dual PIII-1GHz with 1G of RAM and a 2 disk 
> > SATA mirror, running FBSD 4.10-stable. There's one index on the table. 
>  
> If there's no hidden costs such as foreign key checks, that does seem 
> pretty dang slow. 
>  
> > What's really odd is that neither the CPU nor the disk is being 
> > hammered. The box appears to be pretty idle; the postgresql process is 
> > using 4-5% CPU. 
This sounds EXACTLY like my problem, if you make the box a Xeon 2.4GHz with 2GB
RAM and two SCSI drives (xlog and base); loading 10K rows of about 200 bytes
each takes about 20 secs at best, and much longer at worst.

By any chance does your friend have several client machines/processes trying
to mass-load rows at the same time? Or at least some other processes updating
that table in a bulkish way? What I get is low disk I/O, low CPU, even low
context switches ... and I'm betting he should take a look at pg_locks. For my
own problem, I gather that an exclusive lock is necessary while updating
indexes and heap, and multiple processes doing the update can make that
pathological.

Anyway, have your friend check pg_locks.
 
 
"Dreams come true, not free." -- S.Sondheim, ITW 




Re: [PERFORM] Opteron vs Xeon (Was: What to do with 6 disks?)

2005-04-19 Thread Jeff Frost
> RAID1 2 disks OS, pg_xlog
> RAID 1+0 4 disks pgdata
Looks like the consensus is RAID 1 for OS, pg_xlog and RAID10 for pgdata.  Now 
here's another performance related question:

I've seen quite a few folks touting the Opteron as 2.5x faster with postgres 
than a Xeon box.  What makes the Opteron so quick?  Is it that Postgres 
really prefers to run in 64-bit mode?

When I look at AMD's TPC-C scores where they are showing off the Opteron 
http://www.amd.com/us-en/Processors/ProductInformation/0,,30_118_8796_8800~96125,00.html
It doesn't appear 2.5x as fast as the Xeon systems, though I have heard from a 
few Postgres folks that a dual Opteron is 2.5x as fast as a dual Xeon.  I 
would think that AMD would be all over that press if they could show it, so 
what am I missing?  Is it a bus speed thing?  Better south bridge on the 
boards?

--
Jeff Frost, Owner   <[EMAIL PROTECTED]>
Frost Consulting, LLC   http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954


[PERFORM] How to tell what your postgresql server is doing

2005-04-19 Thread Jeff Frost
Is there a way to look at the stats tables and tell what is jamming up your 
postgres server the most?  Other than seeing long-running queries and watching 
top, atop, iostat, vmstat in separate xterms... I'm wondering if postgres keeps 
some stats on what it spends the most time doing, or if there's a way to 
extract that sort of info from other metrics it keeps in the stats tables?

Maybe a script which polls the stats table and correlates the info with stats 
about the system in /proc?

--
Jeff Frost, Owner   <[EMAIL PROTECTED]>
Frost Consulting, LLC   http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954


Re: [PERFORM] How to tell what your postgresql server is doing

2005-04-19 Thread Christopher Kings-Lynne
> Is there a way to look at the stats tables and tell what is jamming up 
> your postgres server the most?  Other than seeing long running queries 
> and watch top, atop, iostat, vmstat in separate xterms...I'm wondering 
> if postgres keeps some stats on what it spends the most time doing or if 
> there's a way to extract that sort of info from other metrics it keeps 
> in the stats table?
> 
> Maybe a script which polls the stats table and correlates the info with 
> stats about the system in /proc?

Turn on logging of all queries and sample for a few hours or one day.  Then
run Practical Query Analyzer (PQA on pgfoundry.org) over it to get
aggregate query information.

Chris


Re: [PERFORM] Opteron vs Xeon (Was: What to do with 6 disks?)

2005-04-19 Thread J. Andrew Rogers
> I've seen quite a few folks touting the Opteron as 2.5x 
> faster with postgres than a Xeon box.  What makes the 
> Opteron so quick?  Is it that Postgres really prefers to 
> run in 64-bit mode?

I don't know about 2.5x faster (perhaps on specific types 
of loads), but the reason Opterons rock for database 
applications is their insanely good memory bandwidth and 
latency that scales much better than the Xeon.  Opterons 
also have a ccNUMA-esque I/O fabric and two dedicated 
on-die memory channels *per processor* -- no shared bus 
there, closer to real UNIX server iron than a glorified 
PC.

We run a large Postgres database on a dual Opteron in 
32-bit mode that crushes Xeons running at higher clock 
speeds.  It has little to do with bitness or theoretical 
instruction dispatch, and everything to do with the 
superior memory controller and I/O fabric.  Databases are 
all about moving chunks of data around and the Opteron 
systems were engineered to do this very well and in a very 
scalable fashion.  For the money, it is hard to argue with 
the price/performance of Opteron based servers.  We 
started with one dual Opteron postgres server just over a 
year ago (with an equivalent uptime) and have considered 
nothing but Opterons for database servers since.  Opterons 
really are clearly superior to Xeons for this application. 
I don't work for AMD, just a satisfied customer. :-)

re: 6 disks.  Unless you are tight on disk space, a hot 
spare might be nice as well depending on your needs.

Cheers,
J. Andrew Rogers


Re: [PERFORM] Opteron vs Xeon (Was: What to do with 6 disks?)

2005-04-19 Thread Jeff Frost
On Tue, 19 Apr 2005, J. Andrew Rogers wrote:
> I don't know about 2.5x faster (perhaps on specific types of loads), but the 
> reason Opterons rock for database applications is their insanely good memory 
> bandwidth and latency that scales much better than the Xeon.  Opterons also 
> have a ccNUMA-esque I/O fabric and two dedicated on-die memory channels *per 
> processor* -- no shared bus there, closer to real UNIX server iron than a 
> glorified PC.
Thanks J!  That's exactly what I was suspecting it might be.  Actually, I 
found an anandtech benchmark that shows the Opteron coming in at close to 2.0x 
performance:

http://www.anandtech.com/linux/showdoc.aspx?i=2163&p=2
It's an Opteron 150 (2.4ghz) vs. Xeon 3.6ghz from August.  I wonder if the 
differences are more pronounced with the newer Opterons.

-Jeff