On Monday, June 28, 2004 18:17:02 +0200 Frode Nilsen <[EMAIL PROTECTED]> wrote:
- Is the problem reproducible? Does it happen every time?
Yes, the problem happens every time I try to move specific volumes; I have about 40 user volumes that give the same error.
- What versions of OpenAFS are you running on each server?
marvin is running 1.2.7, oliven is running 1.2.11
- How big is the volume?
# vos listvol marvin | grep 100554
user.h100554 536871605 RW 7183 K On-line
- What output do you get with -verbose ?
# vos move -fromserver marvin -frompartition /vicepa -toserver oliven -topartition /vicepa -id user.h100554 -verbose
Starting transaction on source volume 536871605 ... done
Cloning source volume 536871605 ... done
Ending the transaction on the source volume 536871605 ... done
Starting transaction on the cloned volume 536872062 ... done
Creating the destination volume 536871605 ... done
Dumping from clone 536872062 on source to volume 536871605 on destination ...Failed to move data for the volume 536871605
VOLSER: Problems encountered in doing the dump !
vos move: operation interrupted, cleanup in progress...
clear transaction contexts
access VLDB
move incomplete - attempt cleanup of target partition - no guarantee
cleanup complete - user verify desired result
OK. It looks like the failure is in the initial dump, not the final incremental. The volume is only about 7MB, which is not too large in the grand scheme of things. I cannot offhand think of a change since 1.2.7 that would break volume moves in this way, but I can't say for sure. And, I don't recall if you told us what platform these servers are running on.
I wonder if your source volserver is producing volume dumps that are broken in some fashion. I can't really debug the problem for you directly (well, I assume you're not interested in putting a volume dump containing a copy of all of your user's data someplace that I can see it). However, there are a few things you might be able to do to figure out what's going on...
- Dump the volume to a file (vos dump user.h100554 0 -file /some/temp/file)
- Get and compile my dump analysis tools, which can be found in /afs/cs.cmu.edu/project/systems-jhutz/dumpscan
- Run afsdump_scan -PHVv /some/temp/file

If the dump is normal, it should spit out a dump header, a volume header, and then a list of all the vnodes in the volume, followed by a "dump end" tag. This "end" tag is what is apparently missing according to the errors we've seen so far.
Hopefully once you describe the output (really, it's probably safe to put it somewhere on the web and send a pointer, there's not really anything secret in it), I'll have some idea what to suggest next. In the meantime, don't delete that dump file; we may have other ideas for analysis you can do on it.
Oh, one other thing you should try... Once you have a dump file, try restoring it to oliven, using a different volume name:
vos restore oliven a user.h100554.TEST /some/temp/file -verbose
It should be interesting to see if this indirect method works, and if not, it might help us determine where the problem might be.
Hm... Yet another thing you can check, though the afsdump_scan output should tell you this -- look at the last 5 bytes of the dump file. If the dump is terminated correctly, they should be 04 3a 21 4b 6e (this is the "dump end" tag and its magic number).
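If you'd rather not eyeball a hex dump, the trailer check can be sketched in a few lines of Python. This is only a rough illustration: the function names are made up for the example, and the only fact it relies on is the 5-byte sequence quoted above (04 3a 21 4b 6e). It says nothing about whether the rest of the dump is well-formed.

```python
# Expected trailer, as quoted in the message: the "dump end" tag byte (04)
# followed by its magic number (3a 21 4b 6e).
DUMP_END_TRAILER = bytes([0x04, 0x3A, 0x21, 0x4B, 0x6E])


def read_trailer(path, length=len(DUMP_END_TRAILER)):
    """Return the last `length` bytes of the file (fewer if the file is shorter)."""
    with open(path, "rb") as f:
        f.seek(0, 2)                    # seek to end of file
        size = f.tell()
        f.seek(max(0, size - length))   # back up by at most `length` bytes
        return f.read()


def format_hex(data):
    """Render bytes as space-separated hex pairs, e.g. '04 3a 21 4b 6e'."""
    return " ".join(f"{b:02x}" for b in data)


def has_dump_end(path):
    """True if the file ends with the expected 'dump end' tag and magic number."""
    return read_trailer(path) == DUMP_END_TRAILER
```

For example, print(format_hex(read_trailer("/some/temp/file"))) shows the bytes actually present at the end of the dump, and has_dump_end("/some/temp/file") reports whether they match. A mismatch would be consistent with the missing "end" tag, but the afsdump_scan output is still the more informative check.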
-- Jeffrey T. Hutzelman (N3NHS) <[EMAIL PROTECTED]>
   Sr. Research Systems Programmer
   School of Computer Science - Research Computing Facility
   Carnegie Mellon University - Pittsburgh, PA
_______________________________________________
OpenAFS-info mailing list
[EMAIL PROTECTED]
https://lists.openafs.org/mailman/listinfo/openafs-info
