Hi all,

I use hot standby streaming replication in PostgreSQL 9.2.X.
After I shut down the master server in fast stop mode, I compared the
xlog dump files of the master and the slave and found that the shutdown
checkpoint was not replicated to the slave.
I then checked pg_log on the slave and found that the redo process failed
with "record with zero length at %X/%X" during the master shutdown. The
startup process terminated the current walreceiver and tried to reconnect to
the master, but failed because the master was shutting down.
In theory, all WAL records should be replicated to the slave when
the master is shut down in normal mode.
I read the source code and found that when we read a record to replay, we
use the last EndRecPtr to locate the xlog page containing the next record.
If EndRecPtr points to the end of the last page and the free space on
that page is less than SizeOfXLogRecord, we align it to the next page.
I noticed the following comment in that code:
                 * RecPtr is pointing to end+1 of the previous WAL record.  We 
                 * advance it if necessary to where the next record starts.  
                 * align to next page if no more records can fit on the current 
                if (XLOG_BLCKSZ - (RecPtr->xrecoff % XLOG_BLCKSZ) < 

                /* Check for crossing of xlog logid boundary */
                if (RecPtr->xrecoff >= XLogFileSize)
                        RecPtr->xrecoff = 0;
                 * If at page start, we must skip over the page header.  But we 
                 * do that until we've read in the page, since the header size 
                 * variable.
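
The alignment rule quoted above can be sketched as a small standalone
function. This is only an illustration, not the PostgreSQL code: it uses a
flat 64-bit offset instead of the xlogid/xrecoff pair, and PAGE_SIZE,
RECORD_HDR_SIZE and next_record_search_start are hypothetical names with
illustrative values standing in for XLOG_BLCKSZ and SizeOfXLogRecord.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only; the real ones depend on the build. */
#define PAGE_SIZE        8192u   /* stands in for XLOG_BLCKSZ */
#define RECORD_HDR_SIZE  32u     /* stands in for SizeOfXLogRecord */

/*
 * Given the offset just past the previous record, return the offset at
 * which the search for the next record starts: unchanged if a record
 * header still fits on the current page, otherwise the next page start.
 */
static uint64_t
next_record_search_start(uint64_t end_of_prev)
{
    uint64_t used = end_of_prev % PAGE_SIZE;
    uint64_t free_space = PAGE_SIZE - used;

    /* free_space is always in the range 1..PAGE_SIZE */
    assert(free_space > 0 && free_space <= PAGE_SIZE);

    if (free_space < RECORD_HDR_SIZE)
        return end_of_prev + free_space;    /* round up to next page start */
    return end_of_prev;                     /* a record header still fits */
}
```

Note that an offset already sitting exactly on a page boundary is returned
unchanged, so in this sketch the pointer handed to the page reader can be a
bare page boundary, which is the case discussed in this message.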
The scenario is as follows:
1. When we perform the shutdown checkpoint, we first advance the xlog buffer,
then run CheckPointGuts, then log the checkpoint record.
2. The slave's walreceiver receives only the xlog page header of the next
page, because the shutdown checkpoint record has not been assembled yet.
3. The slave's recovery requests the next record through XLogPageRead
with an LSN pointing exactly at the next page boundary.
4. XLogPageRead uses this condition to confirm that the receiver has
received some records:
[2]     /* See if we need to retrieve more data */
        if (readFile < 0 ||
                (readSource == XLOG_FROM_STREAM && !XLByteLT(*RecPtr, 
Here, RecPtr points to the page boundary[1] and receivedUpto points to the
end of the page header(2), so RecPtr is less than receivedUpto. XLogPageRead
therefore assumes that the receiver has just received some records and
returns the page to the caller (ReadRecord).
5. ReadRecord finds the page header valid, but when it tries to read
the record it gets nothing: the xlog page contains only a page header.
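
The steps above can be reproduced with plain integers. The sketch below is
an assumption-laden illustration rather than the real code: it models only
the pointer comparison from the quoted condition (not readFile or
readSource), uses flat 64-bit positions, and PAGE_SIZE, SHORT_HDR_SIZE,
need_more_data and record_bytes_available are hypothetical names with
illustrative values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE       8192u   /* stands in for XLOG_BLCKSZ */
#define SHORT_HDR_SIZE  24u     /* illustrative short page header size */

/* Pointer part of the quoted test: wait unless request < received. */
static bool
need_more_data(uint64_t request_ptr, uint64_t received_upto)
{
    return !(request_ptr < received_upto);
}

/* Bytes of record data received beyond the header of the page at page_start. */
static uint64_t
record_bytes_available(uint64_t page_start, uint64_t received_upto)
{
    uint64_t first_record;

    assert(page_start % PAGE_SIZE == 0);    /* caller passes a page boundary */

    first_record = page_start + SHORT_HDR_SIZE;
    return received_upto > first_record ? received_upto - first_record : 0;
}
```

With a request at page boundary 24576 and receivedUpto at 24576 +
SHORT_HDR_SIZE, need_more_data() returns false while
record_bytes_available() returns 0: the check is satisfied even though no
record bytes have arrived.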

I think the problem is that we try to get an xlog page containing the
"record", so the pointer should refer to a record, not a page boundary.

Can we use the current boundary RecPtr to calculate the true record position
in the next page? We know whether the next page has a long page header or a
short page header. I don't know why this problem was not fixed in Postgres
9.2, or even in 9.6 devel.
Would this work?
                if ((RecPtr->xrecoff % XLogSegSize) == 0)
                    XLByteAdvance((*RecPtr), SizeOfXLogLongPHD);
                else
                    XLByteAdvance((*RecPtr), SizeOfXLogShortPHD);
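
To make the intended effect of the snippet above concrete, here is a minimal
standalone sketch. It is not a patch: it uses flat 64-bit positions instead
of xlogid/xrecoff, it adds an explicit "only at a page boundary" guard that
is my assumption about where the snippet would be applied, and all names and
sizes (PAGE_SIZE, SEGMENT_SIZE, SHORT_HDR_SIZE, LONG_HDR_SIZE,
skip_page_header, must_wait) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE       8192u                 /* stands in for XLOG_BLCKSZ */
#define SEGMENT_SIZE    (16u * 1024u * 1024u) /* stands in for XLogSegSize */
#define SHORT_HDR_SIZE  24u                   /* illustrative sizes only */
#define LONG_HDR_SIZE   40u

/* If ptr sits on a page boundary, move it past that page's header. */
static uint64_t
skip_page_header(uint64_t ptr)
{
    uint64_t result = ptr;

    if (ptr % PAGE_SIZE == 0)
    {
        /* The first page of a segment carries the long header. */
        if (ptr % SEGMENT_SIZE == 0)
            result += LONG_HDR_SIZE;
        else
            result += SHORT_HDR_SIZE;
    }
    assert(result >= ptr);      /* the pointer never moves backwards */
    return result;
}

/* Same pointer comparison as the existing check, on the adjusted pointer. */
static bool
must_wait(uint64_t request_ptr, uint64_t received_upto)
{
    return !(skip_page_header(request_ptr) < received_upto);
}
```

In this model, a request at boundary 8192 with only the page header received
(receivedUpto = 8192 + SHORT_HDR_SIZE) makes must_wait() return true, so the
reader would keep waiting for the record instead of returning a page that
holds only a header.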



Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)