Re: [PERFORM] COUNT(*) again (was Re: [HACKERS] Index/Function organized table layout)

2003-10-05 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 I think that's not happening, conditionally or otherwise.  The atomicity
 problems alone are sufficient reason why not, even before you look at
 the performance issues.

 What are the atomicity problems of adding a create/expire xid to the
 index tuples?

You can't update a tuple's status in just one place ... you have to
update the copies in the indexes too.

regards, tom lane



Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-05 Thread Matt Clark
 On Sat, Oct 04, 2003 at 12:29:55AM +0100, Matt Clark wrote:
  My real world experience on a *very* heavily updated OLTP type DB,
  following advice from this list (thanks guys!), is that there is
  essentially zero cost to going ahead and vacuuming as often as you
  feel like it.  Go crazy, and speed up your DB!

 That's not quite true.  If vacuums start running into each other, you
 can very easily start eating up all your I/O bandwidth.  Even if you've
 got lots of it.

Very true, which is why all my scripts write a lockfile and delete it when
they're finished, to prevent that happening.  I should have mentioned that.
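
By way of illustration, a minimal version of that lockfile guard might look
like this in C (the lockfile path, database name, and use of vacuumdb are
placeholders, not the actual scripts in question):

    /* vacuum_guard.c - sketch of a lockfile wrapper around a scheduled
     * vacuum, so two runs cannot overlap and fight over I/O bandwidth. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
        const char *lock = "/tmp/nightly_vacuum.lock";
        int fd, rc;

        /* O_CREAT|O_EXCL fails if the file already exists, so two
         * overlapping runs cannot both get past this point. */
        fd = open(lock, O_CREAT | O_EXCL | O_WRONLY, 0644);
        if (fd < 0) {
            fprintf(stderr, "another vacuum run appears to be in progress\n");
            return 1;
        }
        close(fd);

        /* run the actual maintenance; "vacuumdb mydb" is just an example */
        rc = system("vacuumdb mydb");

        unlink(lock);               /* let the next scheduled run proceed */
        return rc == 0 ? 0 : 1;
    }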

 Also, a vacuum pretty much destroys your shared buffers, so you have
 to be aware of that trade-off too.  Vacuum is not free.  It's _way_
 cheaper than it used to be, though.

That's _very_ interesting.  I've never been quite clear what's in shared
buffers apart from scratch space for currently running transactions.  Also
the docs imply that vacuum uses its own space for working in.  Do you have
more info on how it clobbers shared_buffers?

M




Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-05 Thread Andrew Sullivan
On Sun, Oct 05, 2003 at 12:14:24PM +0100, Matt Clark wrote:
 more info on how it clobbers shared_buffers?

Vacuum is like a seqscan: it touches every page in the table.  So it
doesn't exactly clobber your buffers; it just fills them with whatever it
read most recently.  It's unlikely your buffers are big enough to hold
your database, unless your database is small, so you'll end up expiring
potentially useful data from the buffer.

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Afilias Canada  Toronto, Ontario Canada
[EMAIL PROTECTED]  M2P 2A8
 +1 416 646 3304 x110




Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-05 Thread Tom Lane
Matt Clark [EMAIL PROTECTED] writes:
 OK I'm definitely missing something here.

The point is that a big seqscan (either VACUUM or a plain table scan)
hits a lot of pages, and thereby tends to fill your cache with pages
that aren't actually likely to get hit again soon, perhaps pushing out
pages that will be needed again soon.  This happens at both the
shared-buffer and kernel-disk-cache levels of caching.

It would be good to find some way to prevent big seqscans from
populating cache, but I don't know of any portable way to tell the OS
that we don't want it to cache a page we are reading.

regards, tom lane



Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-05 Thread Matt Clark
 The point is that a big seqscan (either VACUUM or a plain table scan)
 hits a lot of pages, and thereby tends to fill your cache with pages
 that aren't actually likely to get hit again soon, perhaps pushing out
 pages that will be needed again soon.  This happens at both the
 shared-buffer and kernel-disk-cache levels of caching.

OK, I had thought (wrongly it seems, as usual, but this is how we learn!)
that a plain VACUUM did not incur a read of all pages.  I still don't
understand *why* it does, but I'll take your word for it.

Clearly if it distorts the 'normal' balance of pages in any caches, PG's or
the OS's, that's a _bad thing_.  I am currently in the nice position of
having a DB that (just about) fits in RAM, so I pretty much don't care about
read performance, but I will have to soon, as it grows beyond 3GB :-(  These
conversations are invaluable in planning for that dread time...

 It would be good to find some way to prevent big seqscans from
 populating cache, but I don't know of any portable way to tell the OS
 that we don't want it to cache a page we are reading.

Quite.  The only natural way would be to read those pages through some
special device, but then you might as well do raw disk access from the
get-go.  Portability vs. Performance, the age old quandary.  FWIW I and many
others stand back in pure amazement at the sheer _quality_ of PostgreSQL.


Rgds,

Matt





Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-05 Thread Tom Lane
Matt Clark [EMAIL PROTECTED] writes:
 OK, I had thought (wrongly it seems, as usual, but this is how we learn!)
 that a plain VACUUM did not incur a read of all pages.  I still don't
 understand *why* it does, but I'll take your word for it.

Mainly 'cause it doesn't know where the dead tuples are till it's
looked.  Also, VACUUM is the data collector for the free space map,
and so it is also charged with finding out how much free space exists
on every page.

regards, tom lane



Re: [PERFORM] count(*) slow on large tables

2003-10-05 Thread Josh Berkus
Bruce,

 OK, I beefed up the TODO:

   * Use a fixed row count and a +/- count with MVCC visibility rules
 to allow fast COUNT(*) queries with no WHERE clause(?)

 I can always give the details if someone asks.  It doesn't seem complex
 enough for a separate TODO.detail item.

Hmmm ... this doesn't seem effort-worthy to me.   How often does anyone do 
COUNT with no WHERE clause, except GUIs that give you a record count?  (of 
course, as always, if someone wants to code it, feel free ...)

And for those GUIs, wouldn't it be 97% as good to run an ANALYZE and give the 
approximate record counts for large tables?
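
For reference, the estimate in question already lives in pg_class.reltuples,
which VACUUM and ANALYZE keep roughly current.  A minimal sketch in C using
libpq (the connection string and table name are placeholders) might look
like:

    /* approx_count.c - illustrative only: read the planner's row estimate
     * from pg_class instead of running COUNT(*). */
    #include <stdio.h>
    #include <libpq-fe.h>

    int main(void)
    {
        PGconn   *conn;
        PGresult *res;

        conn = PQconnectdb("dbname=mydb");
        if (PQstatus(conn) != CONNECTION_OK) {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
            return 1;
        }

        /* reltuples is refreshed by VACUUM and ANALYZE; it is an estimate,
         * not an exact count -- which is exactly the point here. */
        res = PQexec(conn,
            "SELECT reltuples::bigint FROM pg_class WHERE relname = 'bigtable'");
        if (PQresultStatus(res) == PGRES_TUPLES_OK && PQntuples(res) == 1)
            printf("approx rows: %s\n", PQgetvalue(res, 0, 0));

        PQclear(res);
        PQfinish(conn);
        return 0;
    }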

As for counts with a WHERE clause, this is obviously up to the user.  Joe 
Conway and I tested using a C trigger to track some COUNT ... GROUP BY values 
for large tables based on additive numbers.   It worked fairly well for 
accuracy, but the performance penalty on data writes was significant ... 8% 
to 25% penalty for UPDATEs, depending on the frequency and batch size 
(greater frequency and batch size meant a greater penalty).
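
A stripped-down sketch of that kind of C trigger looks roughly like the
following (this is not the code that was actually tested; the summary table
rowcounts and its columns are invented, and a real version would read the
grouping value from tg_trigtuple instead of hard-coding it):

    /* count_trig.c - illustrative sketch of a count-maintaining C trigger.
     * On INSERT it bumps a per-group counter in a summary table, on DELETE
     * it decrements it.  Newer servers also need PG_MODULE_MAGIC. */
    #include "postgres.h"
    #include "commands/trigger.h"
    #include "executor/spi.h"
    #include "fmgr.h"

    PG_FUNCTION_INFO_V1(maintain_rowcount);

    Datum
    maintain_rowcount(PG_FUNCTION_ARGS)
    {
        TriggerData *trigdata = (TriggerData *) fcinfo->context;

        if (!CALLED_AS_TRIGGER(fcinfo))
            elog(ERROR, "maintain_rowcount: not called by trigger manager");

        SPI_connect();

        /* +1 for an insert, -1 for a delete; the summary row is assumed
         * to exist already. */
        if (TRIGGER_FIRED_BY_INSERT(trigdata->tg_event))
            SPI_exec("UPDATE rowcounts SET n = n + 1 WHERE grp = 'example'", 0);
        else if (TRIGGER_FIRED_BY_DELETE(trigdata->tg_event))
            SPI_exec("UPDATE rowcounts SET n = n - 1 WHERE grp = 'example'", 0);

        SPI_finish();

        return PointerGetDatum(trigdata->tg_trigtuple);
    }

Installed with CREATE FUNCTION ... LANGUAGE C and attached FOR EACH ROW, 
every row change fires an extra UPDATE on the summary table, which is where 
a write penalty of this kind comes from.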

It's possible that this could be improved through some mechanism more tightly 
integrated with the source code.   However, the coding effort would be 
significant (12-20 hours), and it's possible that there would be no 
improvement, which is why we didn't do it.

We also discussed an asynchronous aggregates collector that would work 
something like the statistics collector, and keep pre-programmed aggregate 
data, updating it during low-activity periods.  This would significantly 
reduce the performance penalty, but at the cost of accuracy ... that is, a 
1%-5% variance on high-activity tables would be unavoidable, and all cached 
aggregates would have to be recalculated on database restart, significantly 
slowing down startup.   Again, we felt that the effort-result payoff was not 
worthwhile.

Overall, I think the stuff we already have planned ... the hash aggregates in 
7.4 and Tom's suggestion of adding an indexable flag to pg_aggs ... are far 
more likely to yield useful fruit than any caching plan.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] count(*) slow on large tables

2003-10-05 Thread Rod Taylor
 And for those GUIs, wouldn't it be 97% as good to run an ANALYZE and give the 
 approximate record counts for large tables?

Interfaces which run a COUNT(*) like that are broken by design.  They
fail to consider that the table may really be a view, which of course could
not be cached that way, and which may take days to return a full
result set (we had some pretty large views in an old billing system).




Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-05 Thread Christopher Browne
After a long battle with technology, [EMAIL PROTECTED] (Matt Clark), an earthling, 
wrote:
 The point is that a big seqscan (either VACUUM or a plain table scan)
 hits a lot of pages, and thereby tends to fill your cache with pages
 that aren't actually likely to get hit again soon, perhaps pushing out
 pages that will be needed again soon.  This happens at both the
 shared-buffer and kernel-disk-cache levels of caching.

 OK, I had thought (wrongly it seems, as usual, but this is how we learn!)
 that a plain VACUUM did not incur a read of all pages.  I still don't
 understand *why* it does, but I'll take your word for it.

How does it know what to do on any given page if it does not read it
in?  It has to evaluate whether tuples can be thrown away or not, and
that requires looking at the tuples.  It may only be looking at a
small portion of the page, but that still requires reading each page.

No free lunch, unfortunately...
-- 
wm(X,Y):-write(X),write('@'),write(Y). wm('cbbrowne','cbbrowne.com').
http://www3.sympatico.ca/cbbrowne/sgml.html
End users  are just test loads  for verifying that  the system works,
kind of like resistors in an electrical circuit.
-- Kaz Kylheku in c.o.l.d.s



Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-05 Thread Neil Conway
On Sat, 2003-10-04 at 11:22, Andrew Sullivan wrote:
 Also, a vacuum pretty much destroys your shared buffers, so you have
 to be aware of that trade-off too.

True, although there is no reason that this necessarily needs to be the
case (at least, as far as the PostgreSQL shared buffer goes). As has
been pointed out numerous times on -hackers and in the literature, using
LRU for a DBMS shared buffer cache is far from optimal, and better
algorithms have been proposed (e.g. LRU-K, ARC). We could even have the
VACUUM command inform the bufmgr that the pages it is in the process of
reading in are part of a seqscan, and so are unlikely to be needed in
the immediate future.
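
As a toy illustration of that last idea (not the PostgreSQL bufmgr API; every
name here is invented), a scan flagged as sequential could recycle a small
private ring of buffer slots, so it can never evict more than a handful of
anyone else's pages:

    /* Toy illustration only.  A "sequential" scan recycles a small private
     * ring of buffer slots, so no matter how big the table is it can evict
     * at most RING_SIZE pages of other data. */
    #include <stdio.h>

    #define RING_SIZE 8

    typedef struct {
        int slots[RING_SIZE];   /* buffer slot ids owned by the scan */
        int next;               /* next slot to recycle */
    } ScanRing;

    /* pick the slot the scan should overwrite for its next page */
    static int ring_next_slot(ScanRing *ring)
    {
        int s = ring->slots[ring->next];
        ring->next = (ring->next + 1) % RING_SIZE;
        return s;
    }

    int main(void)
    {
        ScanRing ring = {{100, 101, 102, 103, 104, 105, 106, 107}, 0};
        int page;

        /* a 1000-page "seqscan" touches only the same 8 buffer slots */
        for (page = 0; page < 1000; page++) {
            int slot = ring_next_slot(&ring);
            if (page < 4)
                printf("page %d read into buffer slot %d\n", page, slot);
        }
        return 0;
    }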

-Neil





Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-05 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 ... We could even have the
 VACUUM command inform the bufmgr that the pages it is in the process of
 reading in are part of a seqscan, and so are unlikely to be needed in
 the immediate future.

This would be relatively easy to fix as far as our own buffering is
concerned, but the thing that's needed to make it really useful is
to prevent caching of seqscan-read pages in the kernel disk buffers.
I don't know any portable way to do that :-(

regards, tom lane



Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-05 Thread Neil Conway
On Sun, 2003-10-05 at 19:43, Tom Lane wrote:
 This would be relatively easy to fix as far as our own buffering is
 concerned, but the thing that's needed to make it really useful is
 to prevent caching of seqscan-read pages in the kernel disk buffers.

True.

 I don't know any portable way to do that :-(

For the non-portable way of doing this, are you referring to O_DIRECT?

Even if it isn't available everywhere, it might be worth considering
this at least for the platforms on which it is supported.
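
For concreteness, the two non-portable knobs usually mentioned look roughly
like this; the file name is a placeholder, and this is only a sketch of the
system calls, not a claim about what PostgreSQL does or should do:

    /* cache_bypass.c - sketch of the non-portable options discussed above.
     * O_DIRECT (Linux and a few others) bypasses the kernel cache entirely;
     * posix_fadvise() reads normally but then tells the kernel the pages
     * won't be needed again.  Neither call is universally available. */
    #define _GNU_SOURCE             /* for O_DIRECT on Linux */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int fd;

        /* Option 1: open with O_DIRECT.  Reads must then use suitably
         * aligned buffers and transfer sizes, which is part of its cost. */
        fd = open("/tmp/datafile", O_RDONLY | O_DIRECT);
        if (fd < 0)
            perror("open with O_DIRECT");
        else
            close(fd);

        /* Option 2: read through the cache, then hint the pages can go. */
        fd = open("/tmp/datafile", O_RDONLY);
        if (fd >= 0) {
            /* ... sequential reads of the file would happen here ... */
            posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
            close(fd);
        }
        return 0;
    }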

-Neil





Re: [PERFORM] reindex/vacuum locking/performance?

2003-10-05 Thread Andrew Sullivan
On Sun, Oct 05, 2003 at 07:32:47PM -0400, Neil Conway wrote:

 been pointed out numerous times on -hackers and in the literature, using
 LRU for a DBMS shared buffer cache is far from optimal, and better
 algorithms have been proposed (e.g. LRU-K, ARC). We could even have the
 VACUUM command inform the bufmgr that the pages it is in the process of
 reading in are part of a seqscan, and so are unlikely to be needed in
 the immediate future.

Hey, when that happens, you'll find me first in line to praise the
implementor; but until then, it's important that people not get the
idea that vacuum is free.

It is _way_ improved, and on moderately loaded boxes, it's almost
unnoticeable.  But under heavy load, you need to be _real_ careful
about calling vacuum.  I think one of the biggest needs in the AVD
(autovacuum daemon) is some sort of intelligence about current load on
the postmaster, but I haven't the foggiest idea how to give it such
intelligence.

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Afilias Canada  Toronto, Ontario Canada
[EMAIL PROTECTED]  M2P 2A8
 +1 416 646 3304 x110




Re: [PERFORM] COUNT(*) again (was Re: [HACKERS] Index/Function organized table layout)

2003-10-05 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  I think that's not happening, conditionally or otherwise.  The atomicity
  problems alone are sufficient reason why not, even before you look at
  the performance issues.
 
  What are the atomicity problems of adding a create/expire xid to the
  index tuples?
 
 You can't update a tuple's status in just one place ... you have to
 update the copies in the indexes too.

But we don't update the tuple status for a commit, we just mark the xid
as committed.  We do have lazy status bits that prevent later lookups in
pg_clog, but we have those in the index already also.

What am I missing?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073
