Re: [HACKERS] VACUUM always makes all pages dirty
"Pavan Deolasee" <[EMAIL PROTECTED]> writes: > The attached patch should fix this. We mark the buffer dirty only if there > is any state change in the page header. Applied, with minor additional tweak to avoid duplicate calls to SetBufferCommitInfoNeedsSave --- that seems (just) expensive enough to be worth the trouble. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] VACUUM always makes all pages dirty
On 10/24/07, Pavan Deolasee <[EMAIL PROTECTED]> wrote: > > > > I am looking at it. We must not call SetBufferCommitInfoNeedsSave unless > we make any state changes to the page. > > > The attached patch should fix this. We mark the buffer dirty only if there is any state change in the page header. Thanks, Pavan -- Pavan Deolasee EnterpriseDB http://www.enterprisedb.com Index: src/backend/access/heap/pruneheap.c === RCS file: /projects/cvsroot/pgsql/src/backend/access/heap/pruneheap.c,v retrieving revision 1.2 diff -c -r1.2 pruneheap.c *** src/backend/access/heap/pruneheap.c 21 Sep 2007 21:25:42 - 1.2 --- src/backend/access/heap/pruneheap.c 24 Oct 2007 09:32:21 - *** *** 146,162 intnredirected = 0; intndead = 0; intnunused = 0; START_CRIT_SECTION(); /* ! * Mark the page as clear of prunable tuples. If we find a tuple which ! * may soon become prunable, we shall set the hint again. Also clear ! * the "page is full" flag, since there's no point in repeating the ! * prune/defrag process until something else happens to the page. */ PageClearPrunable(page); ! PageClearFull(page); /* Scan the page */ maxoff = PageGetMaxOffsetNumber(page); --- 146,173 intnredirected = 0; intndead = 0; intnunused = 0; + TransactionId save_prune_xid; START_CRIT_SECTION(); /* ! * Save the current pd_prune_xid and mark the page as clear of prunable ! * tuples. If we find a tuple which may soon become prunable, we shall set ! * the hint again. */ + save_prune_xid = ((PageHeader) page)->pd_prune_xid; PageClearPrunable(page); ! ! /* ! * Also clear the "page is full" flag if it is set, since there's no point ! * in repeating the prune/defrag process until something else happens to the ! * page. ! */ ! if (PageIsFull(page)) ! { ! PageClearFull(page); ! SetBufferCommitInfoNeedsSave(buffer); ! } /* Scan the page */ maxoff = PageGetMaxOffsetNumber(page); *** *** 209,218 else { /* ! * If we didn't prune anything, we have nonetheless updated the * pd_prune_xid field; treat this as a non-WAL-logged hint. */ ! SetBufferCommitInfoNeedsSave(buffer); } END_CRIT_SECTION(); --- 220,230 else { /* ! * If we didn't prune anything, but have updated the * pd_prune_xid field; treat this as a non-WAL-logged hint. */ ! if (((PageHeader) page)->pd_prune_xid != save_prune_xid) ! SetBufferCommitInfoNeedsSave(buffer); } END_CRIT_SECTION(); ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] VACUUM always makes all pages dirty
On 10/24/07, Heikki Linnakangas <[EMAIL PROTECTED]> wrote: > > > Yeah, it's definitely a HOT-introdued thing. Vacuum calls > heap_page_prune on every page, and this in heap_page_prune is dirtying > the buffer: > > > else > > { > > /* > >* If we didn't prune anything, we have nonetheless > updated the > >* pd_prune_xid field; treat this as a non-WAL-logged > hint. > >*/ > > SetBufferCommitInfoNeedsSave(buffer); > > } > > I don't have time to dig deeper at this moment. I'll take a look later > today, unless someone beats me to it. We obviously don't want to call > SetBufferCommitInfoNeedsSave if we didn't really change the pd_prune_xid > field. > > I am looking at it. We must not call SetBufferCommitInfoNeedsSave unless we make any state changes to the page. Thanks, Pavan -- Pavan Deolasee EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] VACUUM always makes all pages dirty
ITAGAKI Takahiro wrote: > VACUUM in 8.3dev always makes all pages dirty even if there are no jobs. > In 8.2.5, VACUUM produces no dirty pages in the same workload. Therefore, > VACUUM on 8.3 takes longer time than 8.2. I doubt some bugs in the > HOT-related codes here, but I cannot point out the actual position yet... Yeah, it's definitely a HOT-introdued thing. Vacuum calls heap_page_prune on every page, and this in heap_page_prune is dirtying the buffer: > else > { > /* >* If we didn't prune anything, we have nonetheless updated the >* pd_prune_xid field; treat this as a non-WAL-logged hint. >*/ > SetBufferCommitInfoNeedsSave(buffer); > } I don't have time to dig deeper at this moment. I'll take a look later today, unless someone beats me to it. We obviously don't want to call SetBufferCommitInfoNeedsSave if we didn't really change the pd_prune_xid field. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
[HACKERS] VACUUM always makes all pages dirty
VACUUM in 8.3dev always makes all pages dirty even if there are no jobs. In 8.2.5, VACUUM produces no dirty pages in the same workload. Therefore, VACUUM on 8.3 takes longer time than 8.2. I doubt some bugs in the HOT-related codes here, but I cannot point out the actual position yet... Do you have any idea on this issue? # CREATE VIEW buffers AS SELECT nbuf, ndirty, N.nspname AS schemaname, relname FROM (SELECT count(*) AS nbuf, sum(CASE WHEN isdirty THEN 1 ELSE 0 END) AS ndirty, relnamespace, relname FROM pg_buffercache B, pg_class C WHERE B.relfilenode = C.relfilenode GROUP BY relnamespace, relname) AS T LEFT JOIN pg_namespace N ON relnamespace = N.oid ORDER BY schemaname, relname; # BEGIN; # CREATE TABLE test (i integer, filler char(100) default ''); # INSERT INTO test(i) SELECT generate_series(1, 100); # COMMIT; # SELECT * FROM buffers WHERE relname = 'test'; nbuf | ndirty | schemaname | relname ---+++- 17242 | 17242 | public | test First query makes all pages dirty because of hit bits. # CHECKPOINT; # SELECT count(*) FROM test; count - 100 # SELECT * FROM buffers WHERE relname = 'test'; nbuf | ndirty | schemaname | relname ---+++- 17242 | 17242 | public | test First vacuum makes all pages dirty. Why? # CHECKPOINT; # VACUUM test; # SELECT * FROM buffers WHERE relname = 'test'; nbuf | ndirty | schemaname | relname ---+++- 17242 | 17242 | public | test Second vacuum makes all pages dirty, too. Why? # CHECKPOINT; # VACUUM test; # SELECT * FROM buffers WHERE relname = 'test'; nbuf | ndirty | schemaname | relname ---+++- 17242 | 17242 | public | test Regards, --- ITAGAKI Takahiro NTT Open Source Software Center ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate