Re: Trying to handle db corruption 9.6

Flo Rance Mon, 20 May 2019 07:50:16 -0700

Hi,

First of all, as stated in the wiki, you'll need to do a filesystem level
copy of the database files and put them on another drive before attempting
to do anything else !


https://wiki.postgresql.org/wiki/Corruption

regards,
Flo

On Mon, May 20, 2019 at 4:40 PM Mariel Cherkassky <
[email protected]> wrote:

> Hey,
> I'm trying to handle a corruption that one of our customers is facing.
> His disk space was full and as a result of that he decided to run
> pg_resetxlog a few times(bad idea..) .
> When I connected to the machine I saw that the db was down.
> When I started the db (service postgresql start) I saw the next error in
> the logs :
>
> DETAIL:  Could not open file "pg_multixact/offsets/0000": No such file or
> directory.
>
> The pg_multixact/offset dir contained one file (0025).
> The pg_multixact/members dir contains 2 files : 0000 and 0001.
>
> I tried to follow the documentation of pg_resetxlog, and run pg_resetxlog
> with -m 0xF0A604,0xEA50CE which are 0025*65536  and 0026*65536  in hexa.
> However, it didnt help and the same error appeared.
> So I tried to rename the file to 0000 and then the db searched for a file
> in members that wasnt exist.
> I followed the documentation and changed the multitransaction offset (-O)
> and the transactions id (-c ) based on the doc and then the db was started
> succesfully.
> However after it started I saw the next msg in the logs :
> Multixact member wraparound protections are disabled because oldest
> checkpointed Multixact 65536 doesnt exist. In addition, no one is able to
> connect to the db (we keep getting errors database doesnt exist or user
> doesnt exist , even for postgresql user).
>
> current relevant rows from the control data :
>
> pg_control version number:            960
>
> Catalog version number:               201608131
>
> Database system identifier:           6692952810876880414
>
> Database cluster state:               shut down
>
> pg_control last modified:             Mon 20 May 2019 07:07:30 AM PDT
>
> Latest checkpoint location:           1837/E3000028
>
> Prior checkpoint location:            1837/E2000028
>
> Latest checkpoint's REDO location:    1837/E3000028
>
> Latest checkpoint's REDO WAL file:    0000000100001837000000E3
>
> Latest checkpoint's TimeLineID:       1
>
> Latest checkpoint's PrevTimeLineID:   1
>
> Latest checkpoint's full_page_writes: on
>
> Latest checkpoint's NextXID:          0:3
>
> Latest checkpoint's NextOID:          10000
>
> Latest checkpoint's NextMultiXactId:  131072
>
> Latest checkpoint's NextMultiOffset:  52352
>
> Latest checkpoint's oldestXID:        3
>
> Latest checkpoint's oldestXID's DB:   0
>
> Latest checkpoint's oldestActiveXID:  0
>
> Latest checkpoint's oldestMultiXid:   65536
>
> Latest checkpoint's oldestMulti's DB: 0
>
> Latest checkpoint's oldestCommitTsXid:4604
>
> Latest checkpoint's newestCommitTsXid:5041
>
>
>
> I also checked and I saw that the customer has all the wals (backed up)
> but without any basebackup..
> Any recommendations how to handle the case ?
>

Re: Trying to handle db corruption 9.6

Reply via email to