On Mon, 26 Mar 2012 17:25:04 +0200 Matthias Gerstner <[email protected]> wrote:
> I'm recently experiencing trouble during my backup of OpenAFS volumes.
> I perform backups using the
>
> 'vos dump -server <server> -partition <partition> -clone -id <vol>'

<vol> I presume is an rw volume? Just so you know, a more common way of
doing this is to use 'vos backupsys' and then back up the .backup
volumes. Nothing 'wrong' with what you're doing, but it's a less common
way.

> However some days ago the backup of a specific volume failed with
> a bad exit code (255). My backup script thus stopped further processing.
> The concerned volume went offline as a result and did only show up in
> 'vos listvol' as "couldn't attach volume ...".

What did volserver say in VolserLog when that happened? It should give a
reason as to why it could not attach.

> After running a salvage on the affected volume it was brought back
> online but most of the contained data was deleted due to a supposed
> corruption of the directory structure detected during salvage.

SalvageLog will say specifically why. Or SalsrvLog if you are running
DAFS; are you running DAFS?

> Attached is the VolserLog from the time when the last of the incidents
> occurred.

What was the volume id for the volume in question? Possibly 536879790 or
536879793?

> I'm currently running openafs 1.6.1 on Gentoo Linux with kernel
> version 3.2.1.

1.6.1 is not a version that exists yet (or at least, certainly did not
exist on Friday). What version is the volserver, and what version is
'vos'? (Running `strings </path/to/bin> | grep built` is a sure way to
tell.)

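To check both binaries in one go, something like the following should
do. It is only a sketch: the function name build_string and the tr
fallback (for hosts without strings) are my own additions, and the
binary paths are whatever applies on your install.

```shell
#!/bin/sh
# Print the embedded build string of each binary given as an argument.
# build_string is a hypothetical helper name, not part of OpenAFS.
build_string() {
    if command -v strings >/dev/null 2>&1; then
        # Same check as suggested above.
        strings "$1" | grep built
    else
        # Assumed fallback: split the file on non-printable bytes instead.
        LC_ALL=C tr -c '[:print:]' '\n' < "$1" | grep built
    fi
}

for bin in "$@"; do
    echo "== $bin"
    build_string "$bin" || echo "(no 'built' string found)"
done
```

Save it as a script and pass it the paths of your volserver and vos
binaries as arguments; if the two build strings differ, that is worth
mentioning in your reply.
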
> Fri Mar 23 00:10:57 2012 1 Volser: Clone: Cloning volume 536879790 to new
> volume 536889517
> Fri Mar 23 00:16:04 2012 1 Volser: Delete: volume 536889517 deleted
> Fri Mar 23 00:16:04 2012 1 Volser: Clone: Cloning volume 536879793 to new
> volume 536889518
> Fri Mar 23 00:16:06 2012 VDestroyVolumeDiskHeader: Couldn't unlink disk
> header, error = 2
> Fri Mar 23 00:16:06 2012 VPurgeVolume: Error -1 when destroying volume
> 536889517 header
> Fri Mar 23 00:16:06 2012 1 Volser: Delete: volume 536889517 deleted
> Fri Mar 23 00:16:09 2012 1 Volser: Delete: volume 536889518 deleted
> Fri Mar 23 00:16:09 2012 VDestroyVolumeDiskHeader: Couldn't unlink disk
> header, error = 2
> Fri Mar 23 00:16:09 2012 VPurgeVolume: Error -1 when destroying volume
> 536889518 header
> Fri Mar 23 00:16:09 2012 1 Volser: Delete: volume 536889518 deleted
> Fri Mar 23 00:21:20 2012 trans 69 on volume 536889518 is older than 300
> seconds
> Fri Mar 23 00:21:20 2012 trans 66 on volume 536889517 is older than 300
> seconds

Hmm, are you sure 'vos dump' is the only thing you are running at the
time? (You're running more than one in parallel... how many do you run
at once?) This sequence of operations does not seem normal for just a
'vos dump'.

-- 
Andrew Deason
[email protected]
