On Tue, 2013-03-26 at 02:50 +0900, Fujii Masao wrote:
> Hi,
>
> I found that the regression test failed when I created the database
> cluster with the checksum and set wal_level to archive. I think that
> there are some bugs around checksum feature. Attached is the regression.diff.
Thank you for the report. This was a significant oversight, but simple
to diagnose and fix.
There were several places that were doing something like:
PageSetChecksumInplace
if (use_wal)
log_newpage
smgrextend
Which is obviously wrong, because log_newpage set the LSN of the page,
invalidating the checksum. We need to set the checksum after
log_newpage.
Also, I noticed that copy_relation_data was doing smgrread without
validating the checksum (or page header, for that matter), so I also
fixed that.
Patch attached. Only brief testing done, so I might have missed
something. I will look more closely later.
Regards,
Jeff Davis
*** a/src/backend/access/heap/rewriteheap.c
--- b/src/backend/access/heap/rewriteheap.c
***************
*** 273,286 **** end_heap_rewrite(RewriteState state)
/* Write the last page, if any */
if (state->rs_buffer_valid)
{
- PageSetChecksumInplace(state->rs_buffer, state->rs_blockno);
-
if (state->rs_use_wal)
log_newpage(&state->rs_new_rel->rd_node,
MAIN_FORKNUM,
state->rs_blockno,
state->rs_buffer);
RelationOpenSmgr(state->rs_new_rel);
smgrextend(state->rs_new_rel->rd_smgr, MAIN_FORKNUM, state->rs_blockno,
(char *) state->rs_buffer, true);
}
--- 273,287 ----
/* Write the last page, if any */
if (state->rs_buffer_valid)
{
if (state->rs_use_wal)
log_newpage(&state->rs_new_rel->rd_node,
MAIN_FORKNUM,
state->rs_blockno,
state->rs_buffer);
RelationOpenSmgr(state->rs_new_rel);
+
+ PageSetChecksumInplace(state->rs_buffer, state->rs_blockno);
+
smgrextend(state->rs_new_rel->rd_smgr, MAIN_FORKNUM, state->rs_blockno,
(char *) state->rs_buffer, true);
}
***************
*** 616,623 **** raw_heap_insert(RewriteState state, HeapTuple tup)
{
/* Doesn't fit, so write out the existing page */
- PageSetChecksumInplace(page, state->rs_blockno);
-
/* XLOG stuff */
if (state->rs_use_wal)
log_newpage(&state->rs_new_rel->rd_node,
--- 617,622 ----
***************
*** 632,637 **** raw_heap_insert(RewriteState state, HeapTuple tup)
--- 631,639 ----
* end_heap_rewrite.
*/
RelationOpenSmgr(state->rs_new_rel);
+
+ PageSetChecksumInplace(page, state->rs_blockno);
+
smgrextend(state->rs_new_rel->rd_smgr, MAIN_FORKNUM,
state->rs_blockno, (char *) page, true);
*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
***************
*** 51,56 ****
--- 51,57 ----
#include "commands/tablespace.h"
#include "commands/trigger.h"
#include "commands/typecmds.h"
+ #include "common/relpath.h"
#include "executor/executor.h"
#include "foreign/foreign.h"
#include "miscadmin.h"
***************
*** 8902,8913 **** copy_relation_data(SMgrRelation src, SMgrRelation dst,
smgrread(src, forkNum, blkno, buf);
! PageSetChecksumInplace(page, blkno);
/* XLOG stuff */
if (use_wal)
log_newpage(&dst->smgr_rnode.node, forkNum, blkno, page);
/*
* Now write the page. We say isTemp = true even if it's not a temp
* rel, because there's no need for smgr to schedule an fsync for this
--- 8903,8923 ----
smgrread(src, forkNum, blkno, buf);
! if (!PageIsVerified(page, blkno))
! ereport(ERROR,
! (errcode(ERRCODE_DATA_CORRUPTED),
! errmsg("invalid page in block %u of relation %s",
! blkno,
! relpathbackend(src->smgr_rnode.node,
! src->smgr_rnode.backend,
! forkNum))));
/* XLOG stuff */
if (use_wal)
log_newpage(&dst->smgr_rnode.node, forkNum, blkno, page);
+ PageSetChecksumInplace(page, blkno);
+
/*
* Now write the page. We say isTemp = true even if it's not a temp
* rel, because there's no need for smgr to schedule an fsync for this
--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers