Re: [HACKERS] Serious problem: media recovery fails after system or PostgreSQL crash

Jeff Janes Mon, 17 Dec 2012 13:30:57 -0800

On Sun, Dec 16, 2012 at 8:38 AM, Tomas Vondra <[email protected]> wrote:
> On 8.12.2012 03:08, Jeff Janes wrote:
>>
>> It seems to me you need considerable expertise to figure out how to do
>> optimal recovery (i.e. losing the least transactions) in this
>> situation, and that that expertise cannot be automated.  Do you trust
>> a partial file from a good hard drive, or a complete file from a
>> partially melted pg_xlog?
>
> It clearly is a rather complex issue, no doubt about that. And yes,
> reliability of the devices with pg_xlog on them is an important detail.
> Although if the WAL is not written in a reliable way, you're hosed
> anyway I guess.
>
> The recommended archive command is based on the assumption that the
> local pg_xlog is intact (e.g. because it's located on a reliable RAID1
> array), which seems to be the assumption of the OP too.
>
> In my opinion it's more likely to meet an incomplete copy of WAL in the
> archive than a corrupted local WAL. And if it really is corrupted, it
> would be identified during replay.


But how would it be identified? Wouldn't the corrupted record simply
fail its checksum? Recovery would then assume it was garbage left over
from the previous WAL file, which was in the process of being
overwritten at the time of the crash, and so terminate recovery and
open the database.
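
To illustrate the concern, here is a toy sketch (not PostgreSQL's actual
replay code). It assumes each record carries a stored CRC and uses
zlib.crc32 as a stand-in checksum; make_record and replay are
hypothetical names. The point is that a corrupted record in the middle
of the log looks exactly like the end of valid WAL:

```python
import zlib
from typing import List, Tuple

# Toy WAL record: (payload, stored_crc). Hypothetical layout for illustration.
Record = Tuple[bytes, int]


def make_record(payload: bytes) -> Record:
    """Build a record whose stored CRC matches its payload."""
    return payload, zlib.crc32(payload)


def replay(records: List[Record]) -> int:
    """Apply records in order; stop at the first CRC mismatch.

    Returns the number of records applied. A mismatch is treated as
    "end of valid WAL", so later records are silently ignored even if
    they are intact.
    """
    applied = 0
    for payload, stored_crc in records:
        if zlib.crc32(payload) != stored_crc:
            break  # indistinguishable from leftover garbage in a recycled file
        applied += 1
    return applied


good = [make_record(b"a"), make_record(b"b"), make_record(b"c")]
# Corrupt the second record's payload but keep its old CRC.
corrupted = good[:1] + [(b"X", good[1][1])] + good[2:]
```

With these inputs, replay(corrupted) stops after the first record and
reports no error, even though the third record is still intact.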

Assuming your goal is to recover all the transactions you possibly can
(rather than restore to a known point), I think you would want to try
recovery both ways and keep whichever one got the furthest.
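
A minimal sketch of the "keep whichever got furthest" comparison,
assuming you have noted the position at which each recovery attempt
stopped (for example from the server log) in the usual hex "X/Y" form.
The function names and the sample positions are made up for
illustration:

```python
from typing import Dict, Tuple


def parse_lsn(text: str) -> Tuple[int, int]:
    """Parse a WAL position written as two hex numbers, e.g. '0/3000060'."""
    high, low = text.split("/")
    return int(high, 16), int(low, 16)


def pick_furthest(attempts: Dict[str, str]) -> str:
    """Return the name of the attempt whose recovery ended at the latest position.

    `attempts` maps a label for each recovery attempt to the position
    where that attempt stopped replaying WAL.
    """
    return max(attempts, key=lambda name: parse_lsn(attempts[name]))


# Hypothetical end-of-recovery positions for the two strategies.
attempts = {
    "archive_copy": "0/3000060",
    "local_pg_xlog": "0/2FFFF00",
}
best = pick_furthest(attempts)
```

Comparing the parsed pairs rather than the raw strings matters, because
"0/A000000" sorts before "0/9000000" as text but is the later position.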

Cheers,

Jeff


