Hmm... my db-recovery docs say that "This example only works for the
keyval.db file. The dataspace_attributes.db file requires a different
modification (not provided here)." The file I'm having trouble with is
dataspace_attributes.db.
--Jim

On Wed, Apr 4, 2012 at 11:04 AM, Phil Carns <[email protected]> wrote:
> Another option to consider is the technique described in
> pvfs2/doc/db-recovery.txt. It describes how to dump and reload two
> types of db files. The latter is the one you want in this case
> (dataspace_attributes.db). Please make a backup copy of the original
> .db file if you try this.
>
> One thing to look out for that isn't mentioned in the doc is that the
> rebuilt dataspace_attributes.db will probably be _much_ smaller than
> the original. This doesn't mean that it lost data; it's just that
> Berkeley DB will pack it much more efficiently when all of the entries
> are rebuilt at once.
>
> -Phil
>
> On 04/02/2012 01:09 PM, Jim Kusznir wrote:
>> Thanks, Boyd:
>>
>> We have 3 I/O servers, each also running metadata servers. One will
>> not come up (that's the 3rd server). I did try to run the db check
>> command (I forget the specifics), and it did return a single chunk of
>> entries that are not readable. As you may guess from the above, I've
>> never interacted with BDB on a direct or low level. I don't have a
>> good answer for #3; I noticed about 1/3 of the directory entries were
>> "red" on the terminal, and several individuals contacted me with pvfs
>> problems.
>>
>> I will begin building new versions of BDB. Do I need to install this
>> just on the servers, or do the clients need it as well?
>>
>> --Jim
>>
>> On Sun, Apr 1, 2012 at 4:03 PM, Boyd Wilson <[email protected]> wrote:
>>> Jim,
>>> We have been discussing your issue internally. A few questions:
>>> 1. How many metadata servers do you have?
>>> 2. Do you know which one is affected (if there is more than one)?
>>> 3. How much of the file system can you currently see?
>>>
>>> The issue you mentioned seems to be the one we have seen with
>>> earlier versions of Berkeley DB, and we have not seen it with the
>>> newer versions, as Becky mentioned.
>>> In our discussions we can't recall whether we tried doing low-level
>>> BDB access to the MD to read the unaffected entries and back them up
>>> so they can be restored in a new BDB. If you are comfortable with
>>> lower-level BDB commands, you may want to see if you can read the
>>> entries up to the corruption and after it. If you can do both, you
>>> may be able to write a small program to read out all the entries
>>> into a file or another BDB, then rebuild the BDB with the valid
>>> entries.
>>>
>>> thx
>>> -boyd
>>>
>>> On Sat, Mar 31, 2012 at 6:07 PM, Becky Ligon <[email protected]> wrote:
>>>> Jim:
>>>>
>>>> I understand your situation. Here at Clemson University, we went
>>>> through the same situation a couple of years ago. Now, we back up
>>>> the metadata databases. We don't have the space to back up our data
>>>> either!
>>>>
>>>> Under no circumstances should you run pvfs2-fsck. If you run this
>>>> command in destructive mode, then we won't be able to help at all.
>>>> If you're willing, Omnibond MAY be able to write some utilities to
>>>> help you recover most of the data. You will have to speak to Boyd
>>>> Wilson ([email protected]) and work out something.
>>>>
>>>> Becky Ligon
>>>>
>>>> On Fri, Mar 30, 2012 at 5:55 PM, Jim Kusznir <[email protected]> wrote:
>>>>> I made no changes to my environment; it was up and running just
>>>>> fine. I ran db_recover, and it immediately returned, with no
>>>>> apparent sign of doing anything but creating a log.000000001 file.
>>>>>
>>>>> I have the CentOS DB installed, db4-4.3.29-10.el5.
>>>>>
>>>>> I have no backups; this is my high-performance filesystem of 99TB.
>>>>> It is the largest disk we have, and we therefore have no means of
>>>>> backing it up. We don't have anything big enough to hold that much
>>>>> data.
>>>>>
>>>>> Is there any hope? Can we just identify and delete the files that
>>>>> have the db damage on them?
>>>>> (Note that I don't even have anywhere to back up this data to
>>>>> temporarily if we do get it running, so I'd need to "fix in
>>>>> place".)
>>>>>
>>>>> thanks!
>>>>> --Jim
>>>>>
>>>>> On Fri, Mar 30, 2012 at 2:44 PM, Becky Ligon <[email protected]> wrote:
>>>>>> Jim:
>>>>>>
>>>>>> If you haven't made any recent changes to your pvfs environment
>>>>>> or Berkeley DB installation, then it looks like you have a
>>>>>> corrupted metadata database. There is no way to easily recover.
>>>>>> Sometimes the Berkeley DB command "db_recover" might work, but
>>>>>> PVFS doesn't have transactions turned on, so normally it doesn't
>>>>>> work. It's worth a try, just to be sure.
>>>>>>
>>>>>> Do you have any recent backups of the databases? If so, then you
>>>>>> will need to use a set of backups that were created around the
>>>>>> same time, so the databases will be somewhat consistent with each
>>>>>> other.
>>>>>>
>>>>>> Which version of Berkeley DB are you using? We have had
>>>>>> corruption issues with older versions of it. We strongly
>>>>>> recommend 4.8 or higher. There are some known problems with
>>>>>> threads in the older versions.
>>>>>>
>>>>>> Becky Ligon
>>>>>>
>>>>>> On Fri, Mar 30, 2012 at 3:28 PM, Jim Kusznir <[email protected]> wrote:
>>>>>>> Hi all:
>>>>>>>
>>>>>>> I got some notices from my users about "weirdness with pvfs2"
>>>>>>> this morning, and went and investigated. Eventually, I found the
>>>>>>> following on one of my 3 servers:
>>>>>>>
>>>>>>> [S 03/30 12:22] PVFS2 Server on node pvfs2-io-0-2 version 2.8.2
>>>>>>> starting...
>>>>>>> [E 03/30 12:23] Warning: got invalid handle or key size in
>>>>>>> dbpf_dspace_iterate_handles().
>>>>>>> [E 03/30 12:23] Warning: skipping entry.
>>>>>>> [E 03/30 12:23] c_get failed on iteration 3044
>>>>>>> [E 03/30 12:23] dbpf_dspace_iterate_handles_op_svc: Invalid argument
>>>>>>> [E 03/30 12:23] Error adding handle range
>>>>>>> 1431655768-2147483649,3579139414-4294967295 to filesystem pvfs2-fs
>>>>>>> [E 03/30 12:23] Error: Could not initialize server interfaces;
>>>>>>> aborting.
>>>>>>> [E 03/30 12:23] Error: Could not initialize server; aborting.
>>>>>>>
>>>>>>> ------------
>>>>>>> pvfs2-fs.conf:
>>>>>>> ------------
>>>>>>>
>>>>>>> <Defaults>
>>>>>>> UnexpectedRequests 50
>>>>>>> EventLogging none
>>>>>>> LogStamp datetime
>>>>>>> BMIModules bmi_tcp
>>>>>>> FlowModules flowproto_multiqueue
>>>>>>> PerfUpdateInterval 1000
>>>>>>> ServerJobBMITimeoutSecs 30
>>>>>>> ServerJobFlowTimeoutSecs 30
>>>>>>> ClientJobBMITimeoutSecs 300
>>>>>>> ClientJobFlowTimeoutSecs 300
>>>>>>> ClientRetryLimit 5
>>>>>>> ClientRetryDelayMilliSecs 2000
>>>>>>> StorageSpace /mnt/pvfs2
>>>>>>> LogFile /var/log/pvfs2-server.log
>>>>>>> </Defaults>
>>>>>>>
>>>>>>> <Aliases>
>>>>>>> Alias pvfs2-io-0-0 tcp://pvfs2-io-0-0:3334
>>>>>>> Alias pvfs2-io-0-1 tcp://pvfs2-io-0-1:3334
>>>>>>> Alias pvfs2-io-0-2 tcp://pvfs2-io-0-2:3334
>>>>>>> </Aliases>
>>>>>>>
>>>>>>> <Filesystem>
>>>>>>> Name pvfs2-fs
>>>>>>> ID 62659950
>>>>>>> RootHandle 1048576
>>>>>>> <MetaHandleRanges>
>>>>>>> Range pvfs2-io-0-0 4-715827885
>>>>>>> Range pvfs2-io-0-1 715827886-1431655767
>>>>>>> Range pvfs2-io-0-2 1431655768-2147483649
>>>>>>> </MetaHandleRanges>
>>>>>>> <DataHandleRanges>
>>>>>>> Range pvfs2-io-0-0 2147483650-2863311531
>>>>>>> Range pvfs2-io-0-1 2863311532-3579139413
>>>>>>> Range pvfs2-io-0-2 3579139414-4294967295
>>>>>>> </DataHandleRanges>
>>>>>>> <StorageHints>
>>>>>>> TroveSyncMeta yes
>>>>>>> TroveSyncData no
>>>>>>> </StorageHints>
>>>>>>> </Filesystem>
>>>>>>> -------------
>>>>>>> Any suggestions for recovery?
>>>>>>>
>>>>>>> Thanks!
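The error log above is consistent with the config: the range pair that
fails to load, 1431655768-2147483649,3579139414-4294967295, is exactly
the meta and data ranges assigned to the failing server pvfs2-io-0-2.
As a sanity check (values copied from pvfs2-fs.conf above), a quick
sketch confirming the six ranges tile the 32-bit handle space with no
gaps or overlap:

```python
# Handle ranges from pvfs2-fs.conf above (inclusive endpoints).
meta = [(4, 715827885), (715827886, 1431655767), (1431655768, 2147483649)]
data = [(2147483650, 2863311531), (2863311532, 3579139413), (3579139414, 4294967295)]

ranges = meta + data
# Each range must start immediately after the previous one ends.
for (_, prev_end), (start, _) in zip(ranges, ranges[1:]):
    assert start == prev_end + 1, (prev_end, start)

# Together they cover handles 4 .. 2**32 - 1.
assert ranges[0][0] == 4
assert ranges[-1][1] == 2**32 - 1
print("handle ranges are contiguous and cover the 32-bit space")
```

So the configuration itself is sound; the failure is in iterating the
dataspace_attributes.db entries, not in the range definitions.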
>>>>>>> --Jim
>>>>>>> _______________________________________________
>>>>>>> Pvfs2-users mailing list
>>>>>>> [email protected]
>>>>>>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
>>>>>>
>>>>>> --
>>>>>> Becky Ligon
>>>>>> OrangeFS Support and Development
>>>>>> Omnibond Systems
>>>>>> Anderson, South Carolina
>>>>
>>>> --
>>>> Becky Ligon
>>>> OrangeFS Support and Development
>>>> Omnibond Systems
>>>> Anderson, South Carolina
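Boyd's suggestion above (salvage the readable entries and rebuild a
fresh database) can often be done without writing C against the BDB
API: run `db_dump -r` on a backup copy to get a printable salvage dump,
filter out damaged records, then feed the result to `db_load` to build
a new, compact database. A minimal filter sketch in Python; the exact
dump layout assumed here (header lines up to HEADER=END, then
alternating hex-encoded key/value lines, ending in DATA=END) follows
db_dump's default byte-value output, but verify against your dump
before trusting it:

```python
import binascii

def filter_dump(lines):
    """Keep the db_dump header and only complete, valid key/value pairs.

    Pairing after a damaged record is best-effort: if a key line itself
    is unreadable, the records that follow may pair up shifted, so
    inspect the output before running db_load.
    """
    header, records = [], []
    it = iter(lines)
    for line in it:                       # copy the header verbatim
        header.append(line)
        if line.strip() == "HEADER=END":
            break
    pending_key = None
    for line in it:
        if line.strip() == "DATA=END":
            break
        try:                              # damaged entries show up as bad hex
            binascii.unhexlify(line.strip())
        except binascii.Error:
            pending_key = None            # drop the half-pair too
            continue
        if pending_key is None:
            pending_key = line
        else:
            records.extend([pending_key, line])
            pending_key = None
    return header + records + ["DATA=END"]
```

A hypothetical round trip, with file names purely illustrative:
`db_dump -r dataspace_attributes.db > dump.txt`, filter dump.txt with
the function above, then `db_load -f dump.filtered new.db`. As Phil
notes above, always work on a backup copy, and expect the rebuilt file
to be much smaller than the original.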
