It was determined off-list that the particular error I'm seeing in this case occurs because I was adding an RO volume on the same server as the RW volume, but on a different partition. While I know it's terrible practice, it did work in previous versions and I was using it for testing purposes. Apparently this is no longer allowed, and it produces the errors I'm seeing.
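For anyone who wants to catch this layout before running a release, here is a rough sketch of a check, not anything that ships with OpenAFS. It assumes the site lines look like the ones "vos rel -v" printed for me ("server ... partition ... RW Site"); the function name conflicting_ro_sites and the regular expression are my own inventions for illustration.

```python
import re

# Assumed line format, copied from the "vos rel -v" output in this thread:
#   server beth.cae.wisc.edu partition /vicepa RW Site
SITE_RE = re.compile(r"server\s+(\S+)\s+partition\s+(\S+)\s+(RW|RO)\s+Site")

# Sample site lines taken from the failing release in this thread.
SAMPLE = """
server beth.cae.wisc.edu partition /vicepa RW Site
server beth.cae.wisc.edu partition /vicepb RO Site -- Not released
"""


def conflicting_ro_sites(vos_output):
    """Return (server, partition) for each RO site that shares a server
    with the RW site but lives on a different partition.

    Hypothetical helper: it only parses text, it does not talk to AFS.
    """
    rw_partition = {}  # server -> partition holding the RW volume
    ro_sites = []      # (server, partition) for every RO site seen
    for server, partition, kind in SITE_RE.findall(vos_output):
        if kind == "RW":
            rw_partition[server] = partition
        else:
            ro_sites.append((server, partition))
    return [
        (server, partition)
        for server, partition in ro_sites
        if server in rw_partition and rw_partition[server] != partition
    ]


print(conflicting_ro_sites(SAMPLE))  # prints [('beth.cae.wisc.edu', '/vicepb')]
```

An RO site on the same partition as the RW volume, or on a different server entirely, is not flagged, which matches the layouts that still work for me.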
So that mystery is solved. Why I got stuck in salvage loops with 1.4.10 is not, however, and as I don't have logs and am wary of bringing production machines back to 1.4.10, I imagine it will remain a mystery for the foreseeable future.

Thanks for the help!

-stefan

On Thu, Aug 20, 2009 at 09:19:01AM -0500, Stefan Strandberg wrote:
> Hi,
>
> 1.4.11 isn't really doable until it's at least in lenny-backports as we
> don't want to roll our own versions of this.
>
> As for a stale replica existing, I may be misunderstanding. If you're
> saying that it's a replica of that volume, I don't see how that's the
> case.
>
> Here's the creation and subsequent release attempt for a brand new
> volume:
>
> ste...@cog ~ $ vos create beth a foo.bar
> Volume 536885610 created on partition /vicepa of beth
> ste...@cog ~ $ vos addsite beth b foo.bar
> Added replication site beth /vicepb for volume foo.bar
> ste...@cog ~ $ vos rel -v foo.bar
>
> foo.bar
> RWrite: 536885610
> number of sites -> 2
> server beth.cae.wisc.edu partition /vicepa RW Site
> server beth.cae.wisc.edu partition /vicepb RO Site -- Not released
> This is a complete release of volume 536885610
> Cloning RW volume 536885610 to temporary RO... done
> Getting status of RW volume 536885610... done
> Ending cloning transaction on RW volume 536885610... done
> Starting transaction on cloned volume 536885611... done
> Creating new volume 536885611 on replication site beth.cae.wisc.edu: Failed
> to create the ro volume: : Input/output error
> The volume 536885610 could not be released to the following 1 sites:
> beth.cae.wisc.edu /vicepb
> VOLSER: release could not be completed
> Error in vos release command.
> VOLSER: release could not be completed
>
> And here's the VolserLog output:
>
> Thu Aug 20 09:15:14 2009 1 Volser: CreateVolume: volume 536885610 (foo.bar)
> created
> Thu Aug 20 09:15:24 2009 1 Volser: Clone: Cloning volume 536885610 to new
> volume 536885611
> Thu Aug 20 09:15:24 2009 VAttachVolume: Failed to open
> /vicepb/V0536885611.vol (errno 2)
> Thu Aug 20 09:15:24 2009 1 Volser: CreateVolume: Unable to create the volume;
> aborted, error code 18
> Thu Aug 20 09:15:24 2009 : Invalid cross-device link
>
> Turning up debugging doesn't show any extra anything really.
>
> Thanks again,
>
> -stefan
>
> On Thu, Aug 20, 2009 at 09:42:46AM -0400, Derrick Brashear wrote:
> > On Thu, Aug 20, 2009 at 9:39 AM, Jeffrey
> > Altman<[email protected]> wrote:
> > > Stefan Strandberg wrote:
> > >> Anyone have any ideas? I would really like to get everything on 1.4.10
> > >> for the performance increases.
> > >
> > > The current version of OpenAFS is 1.4.11 which addresses:
> > >
> > > - Fix race in background sync code which could cause volumes to go
> > > offline. (124359)
> > >
> > > This is not the issue you are describing. However, please test with
> > > 1.4.11 and see if the problem is still present. If so, send logs and
> > > report to [email protected].
> >
> >
> > it will still be present. the real problem is you have a stale copy of
> > the replica elsewhere on the disk. there should be exactly one copy of
> > 536885604, and it should be on the same partition as 536885602, both
> > according to the vldb and in vos listvol output. arrange to make that
> > true, and your issue will go away.
> > _______________________________________________
> > OpenAFS-info mailing list
> > [email protected]
> > https://lists.openafs.org/mailman/listinfo/openafs-info
> >
>
> --
> Stefan Strandberg
> UNIX group
> Computer Aided Engineering - UW Madison
> [email protected]
