On Oct 26, 2011, at 15:12, Simon Riggs wrote:
> On Wed, Oct 26, 2011 at 12:54 PM, Aidan Van Dyk <ai...@highrise.ca> wrote:
>
>> The read fails because their is no data at the location it's trying to
>> read from, because clog hasn't been extended yet by recovery.
>
> You don't actually know that, though I agree it seems a reasonable
> guess and was my first thought also.
The actual error message also supports that theory. Here's the relevant
snippet from the OP's log (found in
ca9fd2fe.1d8d2%linas.virba...@continuent.com):

2011-09-21 13:41:05 CEST FATAL: could not access status of transaction 1188673
2011-09-21 13:41:05 CEST DETAIL: Could not read from file "pg_clog/0001" at offset 32768: Success.

Note that it says "Success" at the end of the second log entry. That can
only happen, I think, if we're trying to read the page adjacent to the
last page in the file. The seek would be successful, and the subsequent
read() would indicate EOF by returning zero bytes. None of the calls would
set errno. If there were a real I/O error, read() would set errno, and if
the page weren't adjacent to the last page in the file, seek() would set
errno. In both cases we'd see the corresponding error message, not
"Success".

> The error is very specifically referring to 22811359, which is the
> nextxid from pg_control and updated by checkpoint.

Where does that XID come from? The reference to that XID that I can find
in the archives is in your message
CA+U5nMKUUoA8kRG=itfso5nzue3x_kdjz78eaun3_fkmq-u...@mail.gmail.com

> 22811359 is mid-way through a clog page, so prior xids will already
> have been allocated, pages extended and then those pages fsyncd before
> the end of pg_start_backup(). So it shouldn't be possible for that
> page to be absent from the base backup, unless the base backup was
> taken without a preceding checkpoint, which seems is not the case from
> the script output.

Or unless the nextId we store in the checkpoint is for some reason higher
than it should be. Or unless nextId somehow gets mangled during recovery.
Or unless there's some interaction between VACUUM and CHECKPOINTS that
we're overlooking...

> Note that if you are correct, then the solution is to extend clog,
> which Florian disagrees with as a solution.

That's not what I said.
As you said, the CLOG page corresponding to nextId *should* always be
accessible at the start of recovery (unless the whole file has been
removed by VACUUM, that is). So we shouldn't need to extend CLOG. Yet the
error suggests that the CLOG is, in fact, too short. What I said is that
we shouldn't apply any fix (for the CLOG problem) before we understand the
reason for that apparent contradiction.

Extending the CLOG nevertheless, just to get rid of the error, seems
dangerous. What happens, for example, to the CLOG state of transactions
earlier than the checkpoint's nextId? Their COMMIT records may very well
lie before the checkpoint's REDO pointer, so the CLOG we copied had better
contain their correct state. Yet if it does, then why isn't the nextId's
CLOG page accessible?

best regards,
Florian Pflug

--
Sent via pgsql-hackers mailing list (firstname.lastname@example.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
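
PS: For anyone who wants to check the arithmetic behind the log snippet
quoted above, here is a minimal Python sketch. It assumes the default 8 kB
block size, 2 status bits per transaction and 32 pages per pg_clog segment
file; none of those constants are stated in this thread, and the names
clog_location, read_page and demo are made up for illustration. It maps an
XID to its segment file and page offset, then reads one page from a scratch
file that ends exactly where that page would begin.

```python
import os
import tempfile

# Assumptions (defaults of a stock build, not taken from the thread):
BLCKSZ = 8192                   # bytes per CLOG page
XACTS_PER_PAGE = BLCKSZ * 4     # 2 status bits per transaction, 4 per byte
PAGES_PER_SEGMENT = 32          # pages per pg_clog segment file


def clog_location(xid):
    """Return (segment file name, byte offset of the page) for an XID."""
    page = xid // XACTS_PER_PAGE
    segment = page // PAGES_PER_SEGMENT
    offset = (page % PAGES_PER_SEGMENT) * BLCKSZ
    return "%04X" % segment, offset


def read_page(path, offset):
    """Seek to offset and read up to one page from the file at path."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.lseek(fd, offset, os.SEEK_SET)   # positioning at EOF is not an error
        return os.read(fd, BLCKSZ)          # at EOF this returns b"", no OSError
    finally:
        os.close(fd)


def demo():
    """Create a segment that ends just before the wanted page and read it."""
    name, offset = clog_location(1188673)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, name)
        with open(path, "wb") as f:
            f.write(b"\0" * offset)         # file holds only the earlier pages
        data = read_page(path, offset)
    return name, offset, len(data)


if __name__ == "__main__":
    print(demo())
```

Under those assumptions, XID 1188673 maps to file 0001 at offset 32768,
which matches the DETAIL line, and the read at that offset returns zero
bytes without raising an OSError, i.e. without errno being set. The sketch
only exercises the page immediately after the end of the file.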