At Thu, 18 Mar 2010 13:53:38 -0400 (EDT), Becky Ligon wrote:
>
> In my experience with this same problem, db_recovery didn't work because
> pvfs does not have journalling enabled.  Do you have a backup?
>

I'm afraid not, this is a scratch file system with no promises on data
recovery other than best effort.  We've had this particular pvfs file
system for several years with software and hardware
upgrades/migrations, so our users may have forgotten this fact.

Is there no way to excise the problematic entry, or is this a naive
view of the error?

Doug

> Becky
> --
> Becky Ligon
> PVFS Developer
> Clemson University
> 864-656-3865
>
> > Hi,
> >
> > We have experienced a crash on one of our data servers.  The log
> > entries from the crash are,
> >
> > [E 03/18 00:11] TROVE:DBPF:Berkeley DB: page 68107: illegal page type or format
> > [E 03/18 00:11] TROVE:DBPF:Berkeley DB: PANIC: Invalid argument
> > [E 03/18 00:11] TROVE:DBPF:Berkeley DB: dspace_db->put setattr: DB_RUNRECOVERY: Fatal error, run database recovery
> > [E 03/18 00:11] Error: DB_RUNRECOVERY encountered.
> > [E 03/18 00:11] trove_write_callback_fn: I/O error occurred
> > [E 03/18 00:11] TROVE:DBPF:Berkeley DB: PANIC: fatal region error detected; run recovery
> > [E 03/18 00:11] handle_io_error: flow proto error cleanup started on 0x2aaab8272ee0: Input/output error
> > [E 03/18 00:11] handle_io_error: flow proto 0x2aaab8272ee0 canceled 0 operations, will clean up.
> > [E 03/18 00:11] handle_io_error: flow proto 0x2aaab8272ee0 error cleanup finished: Input/output error
> > [E 03/18 00:11] TROVE:DBPF:Berkeley DB: DB->get: DB_RUNRECOVERY: Fatal error, run database recovery
> > [E 03/18 00:11] Error: DB_RUNRECOVERY encountered.
> > [E 03/18 00:11] TROVE:DBPF:Berkeley DB: PANIC: fatal region error detected; run recovery
> > [E 03/18 00:11] trove_write_callback_fn: I/O error occurred
> > [E 03/18 00:11] TROVE:DBPF:Berkeley DB: DB->get: DB_RUNRECOVERY: Fatal error, run database recovery
> > [E 03/18 00:11] handle_io_error: flow proto error cleanup started on 0x2aaab8151800: Input/output error
> > [E 03/18 00:11] Error: DB_RUNRECOVERY encountered.
> > [E 03/18 00:11] PVFS2 server: signal 11, faulty address is (nil), from (nil)
> > [E 03/18 00:11] [bt] [(nil)]
> >
> > The db_verify was successful on all BDB files except for
> > dataspace_attributes.db.
> >
> >   db_verify -o dataspace_attributes.db
> >   db_verify: Page 68107: item 23 of unrecognizable type
> >   db_verify: Page 68107: gap between items at offset 3136
> >   db_verify: Page 68107: item order check unsafe: skipping
> >   db_verify: dataspace_attributes.db: DB_VERIFY_BAD: Database verification failed
> >
> > This error appears to be very similar to what is described in the
> > thread that starts at,
> >
> >   http://www.beowulf-underground.org/pipermail/pvfs2-users/2010-February/003029.html
> >
> > After attempting 'db_recover', I have installed an alternate version
> > of db-4.3.29 with the pvfs-specific changes to 'db_dump'.
> > However, I have encountered an error,
> >
> >   opt2326:/fs/pvfs/pvfs/5810ab5d> db_dump -r \
> >       -f /tmp/dataspace_attributes.out \
> >       dataspace_attributes.db
> >
> >   db_dump: dataspace_attributes.db: DB_VERIFY_BAD: Database verification failed
> >
> > This error occurs regardless of whether I've run 'db_recover'.  Is
> > there any chance of recovery?
> >
> >
> > Doug
> > _______________________________________________
> > Pvfs2-users mailing list
> > [email protected]
> > http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
> >
>
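
For anyone repeating the per-file db_verify check described in the quoted message, it could be scripted roughly as follows. This is only a sketch under assumptions: the function name verify_databases and its verify_cmd parameter are made up for illustration, db_verify is assumed to be on the PATH, and a non-zero exit status is taken to mean the verification failed.

```python
import subprocess
from pathlib import Path


def verify_databases(db_dir, verify_cmd=("db_verify", "-o")):
    """Run the verify command once per *.db file found in db_dir.

    Returns a dict mapping each file name to True if the command
    exited with status 0, and False otherwise.
    verify_cmd is a hypothetical parameter; the default mirrors the
    'db_verify -o' invocation quoted in the message above.
    """
    results = {}
    for db_file in sorted(Path(db_dir).glob("*.db")):
        proc = subprocess.run(
            [*verify_cmd, str(db_file)],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
        # Record only pass/fail; the tool's own messages are discarded here.
        results[db_file.name] = proc.returncode == 0
    return results
```

Calling verify_databases() on the storage directory that holds the BDB files would then return one True/False entry per file, so a damaged dataspace_attributes.db would show up as the single False entry.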
