Re: [PERFORM] I/O on select count(*)

2008-05-27 Thread Simon Riggs
On Mon, 2008-05-26 at 11:36 -0400, Tom Lane wrote: > "Heikki Linnakangas" <[EMAIL PROTECTED]> writes: > > Decibel! wrote: > >> Also, Simon and Tom had an idea at PGCon: Don't set hint-bits in the > >> back-end if the page isn't already dirty. > > > Or even better: set the hint-bits, but don't d

Re: [PERFORM] I/O on select count(*)

2008-05-24 Thread Decibel!
On May 18, 2008, at 1:28 AM, Greg Smith wrote: I just collected all the good internals information included in this thread and popped it onto http://wiki.postgresql.org/wiki/ Hint_Bits where I'll continue to hack away at the text until it's readable. Thanks to everyone who answered my questi

Re: [PERFORM] I/O on select count(*)

2008-05-24 Thread Decibel!
On May 18, 2008, at 1:28 AM, Greg Smith wrote: I just collected all the good internals information included in this thread and popped it onto http://wiki.postgresql.org/wiki/ Hint_Bits where I'll continue to hack away at the text until it's readable. Thanks to everyone who answered my questi

Re: [PERFORM] I/O on select count(*)

2008-05-22 Thread Luke Lonergan
t;[EMAIL PROTECTED]> To: Luke Lonergan Cc: Pavan Deolasee <[EMAIL PROTECTED]>; Greg Smith <[EMAIL PROTECTED]>; Alvaro Herrera <[EMAIL PROTECTED]>; pgsql-performance@postgresql.org Sent: Thu May 22 12:10:02 2008 Subject: Re: [PERFORM] I/O on select count(*) On Thu, 2008-05

Re: [PERFORM] I/O on select count(*)

2008-05-19 Thread PFC
The real question in my mind is why this turns into a bottleneck before the similar task of cleaning the 16MB XLOG segment does. I expected that one would need to be cracked before the CLOG switch time could possibly be an issue, but reports from the field seem to suggest otherwise. H

Re: [PERFORM] I/O on select count(*)

2008-05-19 Thread Tom Lane
Greg Smith <[EMAIL PROTECTED]> writes: > The real question in my mind is why this turns into a bottleneck before > the similar task of cleaning the 16MB XLOG segment does. Because we do the latter off-line, or at least try to. regards, tom lane -- Sent via pgsql-perform

Re: [PERFORM] I/O on select count(*)

2008-05-19 Thread Greg Smith
On Mon, 19 May 2008, Matthew Wakeling wrote: Does it really take that long to zero out 8kB of RAM? I thought CPUs were really quick at doing that! You don't get the whole CPU--you get time slices of one. Some of the cases complaints have come in about have over a thousand connections all fi

Re: [PERFORM] I/O on select count(*)

2008-05-19 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes: > Matthew Wakeling wrote: >> Does it really take that long to zero out 8kB of RAM? I thought CPUs were >> really quick at doing that! > Yea, that was my assumption too. You have to write the page (to be sure there is space for it on disk) not only zero i

Re: [PERFORM] I/O on select count(*)

2008-05-19 Thread Bruce Momjian
Matthew Wakeling wrote: > On Fri, 16 May 2008, Tom Lane wrote: > > One additional point: this means that one transaction in every 32K > > writing transactions *does* have to do extra work when it assigns itself > > an XID, namely create and zero out the next page of pg_clog. And that > > doesn't j

Re: [PERFORM] I/O on select count(*)

2008-05-19 Thread Matthew Wakeling
Alvaro Herrera <[EMAIL PROTECTED]> writes: pg_clog is allocated in pages of 8kB apiece(*). On allocation, pages are zeroed, which is the bit pattern for "transaction in progress". So when a transaction starts, it only needs to ensure that the pg_clog page that corresponds to it is allocated, bu

Re: [PERFORM] I/O on select count(*)

2008-05-17 Thread Greg Smith
I just collected all the good internals information included in this thread and popped it onto http://wiki.postgresql.org/wiki/Hint_Bits where I'll continue to hack away at the text until it's readable. Thanks to everyone who answered my questions here, that's good progress toward clearing up

Re: [PERFORM] I/O on select count(*)

2008-05-17 Thread Jeremy Harris
Tom Lane wrote: One additional point: this means that one transaction in every 32K writing transactions *does* have to do extra work when it assigns itself an XID, namely create and zero out the next page of pg_clog. And that doesn't just slow down the transaction in question, but the next few g

Re: [PERFORM] I/O on select count(*)

2008-05-16 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes: > Greg Smith wrote: >> After reading the code and that documentation a bit, the part I'm still >> not sure about is whether the CLOG entry is created when the XID is >> assigned and then kept current as the state changes, or whether that >> isn't even

Re: [PERFORM] I/O on select count(*)

2008-05-16 Thread Alvaro Herrera
Alvaro Herrera wrote: > pg_clog is allocated in pages of 8kB apiece(*). On allocation, pages are > zeroed, which is the bit pattern for "transaction in progress". So when > a transaction starts, it only needs to ensure that the pg_clog page that > corresponds to it is allocated, but it need not

Re: [PERFORM] I/O on select count(*)

2008-05-16 Thread Alvaro Herrera
Greg Smith wrote: > After reading the code and that documentation a bit, the part I'm still > not sure about is whether the CLOG entry is created when the XID is > assigned and then kept current as the state changes, or whether that > isn't even in CLOG until the transaction is committed. It

Re: [PERFORM] I/O on select count(*)

2008-05-16 Thread Simon Riggs
On Fri, 2008-05-16 at 14:05 -0400, Greg Smith wrote: > After reading the code and that documentation a bit, the part I'm > still not sure about is whether the CLOG entry is created when the XID > is assigned and then kept current as the state changes, or whether > that isn't even in CLOG until the

Re: [PERFORM] I/O on select count(*)

2008-05-16 Thread Greg Smith
On Thu, 15 May 2008, Alvaro Herrera wrote: Starting a transaction does not write anything to pg_clog. For Matt and others, some details here are in src/backend/access/transam/README: "pg_clog records the commit status for each transaction that has been assigned an XID." "Transactions and

Re: [PERFORM] I/O on select count(*)

2008-05-15 Thread Kevin Grittner
>>> On Thu, May 15, 2008 at 5:11 PM, in message <[EMAIL PROTECTED]>, James Mansion <[EMAIL PROTECTED]> wrote: > Alvaro Herrera wrote: >> Hint bits are used to mark tuples as created and/or deleted by >> transactions that are know committed or aborted. To determine the >> visibility of a tuple wi

Re: [PERFORM] I/O on select count(*)

2008-05-15 Thread Gregory Stark
"Luke Lonergan" <[EMAIL PROTECTED]> writes: > BTW ­ we¹ve removed HINT bit checking in Greenplum DB and improved the > visibility caching which was enough to provide performance at the same level > as with the HINT bit optimization, but avoids this whole ³write the data, > write it to the log also

Re: [PERFORM] I/O on select count(*)

2008-05-15 Thread James Mansion
Alvaro Herrera wrote: Hint bits are used to mark tuples as created and/or deleted by transactions that are know committed or aborted. To determine the visibility of a tuple without such bits set, you need to consult pg_clog and possibly pg_subtrans, so it is an expensive check. On the other

Re: [PERFORM] I/O on select count(*)

2008-05-15 Thread Robert Lor
Tom Lane wrote: Hmm, the problem would be trying to figure out what percentage of writes could be blamed solely on hint-bit updates and not any other change to the page. I don't think that the bufmgr currently keeps enough state to know that, but you could probably modify it easily enough, since

Re: [PERFORM] I/O on select count(*)

2008-05-15 Thread Tom Lane
Robert Lor <[EMAIL PROTECTED]> writes: > Tom Lane wrote: >> It's certainly true that hint-bit updates cost something, but >> quantifying how much isn't easy. > Maybe we can instrument the code with DTrace probes to quantify the > actual costs. Hmm, the problem would be trying to figure out what

Re: [PERFORM] I/O on select count(*)

2008-05-15 Thread Robert Lor
Tom Lane wrote: It's certainly true that hint-bit updates cost something, but quantifying how much isn't easy. Maybe we can instrument the code with DTrace probes to quantify the actual costs. I'm not familiar with the code, but if I know where to place the probes, I can easily do a quick tes

Re: [PERFORM] I/O on select count(*)

2008-05-15 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes: > Heikki Linnakangas escribió: >> We know what kind of a relation we're dealing with in ReadBuffer, so we >> could add a flag to BufferDesc to mark heap pages. > Hmm, I was thinking that it would need access to the catalogs to know > where the tuples ar

Re: [PERFORM] I/O on select count(*)

2008-05-15 Thread Alvaro Herrera
Matthew Wakeling wrote: Aside from the rest of commentary, a slight clarification: > So, as I understand it, Postgres works like this: > > 1. You begin a transaction. Postgres writes an entry into pg_clog. Starting a transaction does not write anything to pg_clog. -- Alvaro Herrera

Re: [PERFORM] I/O on select count(*)

2008-05-15 Thread Alvaro Herrera
Heikki Linnakangas escribió: > Alvaro Herrera wrote: >> The problem is that the bgwriter does not understand about the content >> of the pages it is writing -- they're opaque pages for all it knows. So >> it cannot touch the hint bits. > > We know what kind of a relation we're dealing with in Rea

Re: [PERFORM] I/O on select count(*)

2008-05-15 Thread Matthew Wakeling
On Thu, 15 May 2008, Heikki Linnakangas wrote: There's not much point optimizing something that only helps with aborted transactions. That's fair enough, but this list method is likely to speed up index writes anyway. The general problem with any idea that involves keeping a list of changes

Re: [PERFORM] I/O on select count(*)

2008-05-15 Thread Heikki Linnakangas
Alvaro Herrera wrote: Greg Smith escribió: On Thu, 15 May 2008, Pavan Deolasee wrote: I had suggested in the past that whenever we set hint bits for a tuple, we should check all other tuples in the page and set their hint bits too to avoid multiple writes of the same page. I guess the idea go

Re: [PERFORM] I/O on select count(*)

2008-05-15 Thread Heikki Linnakangas
Matthew Wakeling wrote: On Thu, 15 May 2008, Heikki Linnakangas wrote: > Is it really safe to update the hint bits in place? If there is a > power cut in the middle of writing a block, is there a guarantee from > the disc that the block will never be garbled? Don't know, to be honest. We've n

Re: [PERFORM] I/O on select count(*)

2008-05-15 Thread Tom Lane
Matthew Wakeling <[EMAIL PROTECTED]> writes: > Hmm. That problem is what WAL full-page-writes is meant to handle, isn't > it? So basically, if you're telling people that WAL full-page-writes is > safer than partial WAL, because it avoids updating pages in-place, then > you shouldn't be updating

Re: [PERFORM] I/O on select count(*)

2008-05-15 Thread Alvaro Herrera
Greg Smith escribió: > On Thu, 15 May 2008, Pavan Deolasee wrote: > >> I had suggested in the past that whenever we set hint bits for a tuple, >> we should check all other tuples in the page and set their hint bits >> too to avoid multiple writes of the same page. I guess the idea got >> rejecte

Re: [PERFORM] I/O on select count(*)

2008-05-15 Thread Ron Mayer
Matthew Wakeling wrote: On Thu, 15 May 2008, Luke Lonergan wrote: ...HINT bit optimization, but avoids this whole ³write the data, write it to the log also, then write it again just for good measure² ... The hint data will be four bits per tuple plus overheads, so it could be made very compact

Re: [PERFORM] I/O on select count(*)

2008-05-15 Thread Matthew Wakeling
On Thu, 15 May 2008, Heikki Linnakangas wrote: > Is it really safe to update the hint bits in place? If there is a > power cut in the middle of writing a block, is there a guarantee from > the disc that the block will never be garbled? Don't know, to be honest. We've never seen any reports of

Re: [PERFORM] I/O on select count(*)

2008-05-15 Thread Jan de Visser
On Thursday 15 May 2008 03:02:19 Tom Lane wrote: > "Jan de Visser" <[EMAIL PROTECTED]> writes: > > Obviously, this issue is tied to the slow count(*) one, as I found out > > the hard way. Consider the following scenario: > > * Insert row > > * Update that row a couple of times > > * Rinse and repea

Re: [PERFORM] I/O on select count(*)

2008-05-15 Thread Matthew Wakeling
On Thu, 15 May 2008, Luke Lonergan wrote: BTW ­ we¹ve removed HINT bit checking in Greenplum DB and improved the visibility caching which was enough to provide performance at the same level as with the HINT bit optimization, but avoids this whole ³write the data, write it to the log also, then wr

Re: [PERFORM] I/O on select count(*)

2008-05-15 Thread Heikki Linnakangas
Matthew Wakeling wrote: Is it really safe to update the hint bits in place? If there is a power cut in the middle of writing a block, is there a guarantee from the disc that the block will never be garbled? Don't know, to be honest. We've never seen any reports of corrupted data that would su

Re: [PERFORM] I/O on select count(*)

2008-05-15 Thread Matthew Wakeling
On Wed, 14 May 2008, Alvaro Herrera wrote: Hint bits are used to mark tuples as created and/or deleted by transactions that are know committed or aborted. To determine the visibility of a tuple without such bits set, you need to consult pg_clog and possibly pg_subtrans, so it is an expensive che

Re: [PERFORM] I/O on select count(*)

2008-05-15 Thread Tom Lane
Greg Smith <[EMAIL PROTECTED]> writes: > ... To put that another way: if I've run a manual vacuum, is it true > that it will have updated all the hint bits to XMIN_COMMITTED for all the > tuples that were all done when the vacuum started? Any examination whatsoever of a tuple --- whether by vac

Re: [PERFORM] I/O on select count(*)

2008-05-15 Thread Tom Lane
"Jan de Visser" <[EMAIL PROTECTED]> writes: > Obviously, this issue is tied to the slow count(*) one, as I found out > the hard way. Consider the following scenario: > * Insert row > * Update that row a couple of times > * Rinse and repeat many times > Now somewhere during that cycle, do a select

Re: [PERFORM] I/O on select count(*)

2008-05-14 Thread Tino Wildenhain
Hi, Luke Lonergan wrote: BTW – we’ve removed HINT bit checking in Greenplum DB and improved the visibility caching which was enough to provide performance at the same level as with the HINT bit optimization, but avoids this whole “write the data, write it to the log also, then write it again j

Re: [PERFORM] I/O on select count(*)

2008-05-14 Thread Joshua D. Drake
On Thu, 15 May 2008 10:52:01 +0800 Luke Lonergan <[EMAIL PROTECTED]> wrote: > BTW ­ we¹ve removed HINT bit checking in Greenplum DB and improved the > visibility caching which was enough to provide performance at the > same level as with the HINT bit optimization, but avoids this whole > ³write th

Re: [PERFORM] I/O on select count(*)

2008-05-14 Thread Greg Smith
On Thu, 15 May 2008, Pavan Deolasee wrote: I had suggested in the past that whenever we set hint bits for a tuple, we should check all other tuples in the page and set their hint bits too to avoid multiple writes of the same page. I guess the idea got rejected because of lack of benchmarks to

Re: [PERFORM] I/O on select count(*)

2008-05-14 Thread Luke Lonergan
BTW ­ we¹ve removed HINT bit checking in Greenplum DB and improved the visibility caching which was enough to provide performance at the same level as with the HINT bit optimization, but avoids this whole ³write the data, write it to the log also, then write it again just for good measure² behavior

Re: [PERFORM] I/O on select count(*)

2008-05-14 Thread Pavan Deolasee
On Thu, May 15, 2008 at 7:51 AM, Greg Smith <[EMAIL PROTECTED]> wrote: > > > So is vacuum helpful here because it will force all that to happen in one > batch? To put that another way: if I've run a manual vacuum, is it true > that it will have updated all the hint bits to XMIN_COMMITTED for all

Re: [PERFORM] I/O on select count(*)

2008-05-14 Thread Jan de Visser
On 5/14/08, Greg Smith <[EMAIL PROTECTED]> wrote: > On Wed, 14 May 2008, Alvaro Herrera wrote: > > > > If neither of the bits is set, then the transaction is either in progress > (which you can check by examining the list of running transactions in shared > memory) or your process is the first one

Re: [PERFORM] I/O on select count(*)

2008-05-14 Thread Greg Smith
On Wed, 14 May 2008, Alvaro Herrera wrote: If neither of the bits is set, then the transaction is either in progress (which you can check by examining the list of running transactions in shared memory) or your process is the first one to check (in which case, you need to consult pg_clog to kno

Re: [PERFORM] I/O on select count(*)

2008-05-14 Thread Alvaro Herrera
Greg Smith wrote: > On Wed, 14 May 2008, Kevin Grittner wrote: > >> If this is the first time that the rows are being read since they were >> inserted (or since the database was loaded, including from backup), it >> may be rewriting the rows to set hint bits, which can make subsequent >> access fas

Re: [PERFORM] I/O on select count(*)

2008-05-14 Thread Greg Smith
On Wed, 14 May 2008, Kevin Grittner wrote: If this is the first time that the rows are being read since they were inserted (or since the database was loaded, including from backup), it may be rewriting the rows to set hint bits, which can make subsequent access faster. This is the second time

Re: [PERFORM] I/O on select count(*)

2008-05-14 Thread Kevin Grittner
>>> Doug Eck <[EMAIL PROTECTED]> wrote: > I am attempting to run a query to determine the number of rows for a given > day using something like "select count(*) from tbl1 where ts between > '2008-05-12 00:00:00.000' and '2008-05-12 23:59:59.999'". Explain tells me > that the query will be do

Re: [PERFORM] I/O on select count(*)

2008-05-14 Thread Doug Eck
- Original Message From: Merlin Moncure <[EMAIL PROTECTED]> To: Doug Eck <[EMAIL PROTECTED]> Cc: pgsql-performance@postgresql.org Sent: Wednesday, May 14, 2008 3:38:23 PM Subject: Re: [PERFORM] I/O on select count(*) On Wed, May 14, 2008 at 4:09 PM, Doug Eck <[EMAIL PR

Re: [PERFORM] I/O on select count(*)

2008-05-14 Thread Merlin Moncure
On Wed, May 14, 2008 at 4:09 PM, Doug Eck <[EMAIL PROTECTED]> wrote: > I have a large table (~ 2B rows) that contains an indexed timestamp column. > I am attempting to run a query to determine the number of rows for a given > day using something like "select count(*) from tbl1 where ts between > '2

[PERFORM] I/O on select count(*)

2008-05-14 Thread Doug Eck
I have a large table (~ 2B rows) that contains an indexed timestamp column. I am attempting to run a query to determine the number of rows for a given day using something like "select count(*) from tbl1 where ts between '2008-05-12 00:00:00.000' and '2008-05-12 23:59:59.999'". Explain tells me