Re: [OpenAFS] Shutdown/startup of entire cell
On Thursday, 10 January 2013, 22:50:27, Garance A Drosihn wrote:

> Due to circumstances way beyond my control (a major network
> upgrade), I am going to need to shutdown our entire AFS cell
> this Saturday. So, that is less than 36 hours from now.

Others have already commented on your issue, so I won't add to that. One further point, though: I don't know what version of OpenAFS you're running in your cell, but if you're on 1.6.x already, this may be a good opportunity to switch your fileservers to dafs (demand-attach fileserver), in case you haven't done so already. This will bring up your fileservers much faster after the outage.

Bye...

Dirk
-- 
Dirk Heinrichs
Tel: +49 (0)2471 209385 | Mobile: +49 (0)176 34473913
GPG Public Key C2E467BB | Jabber: dirk.heinri...@altum.de
Re: [OpenAFS] Rsync-ing a vice* partition
On Friday, 11 January 2013, 00:14:21, Derrick Brashear wrote:

> as long as you preserve owner, group and mode you're fine. -o (owner),
> -g (group) and -p (perms) are needed, but -a (archive)
> implies all of those, so the usual -auv that people use is fine.

My usual is -acv (c = checksum); it will take a bit longer, though. And, depending on the filesystem on /vicepx, I'd add "--exclude lost+found".

Bye...

Dirk
-- 
Dirk Heinrichs
Tel: +49 (0)2471 209385 | Mobile: +49 (0)176 34473913
GPG Public Key C2E467BB | Jabber: dirk.heinri...@altum.de
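For illustration, the rsync invocation discussed in this thread can be assembled as below. This is only a sketch: the function name and its parameters are hypothetical, the paths /vicepa and /nextpa come from the original question, and the flags are the ones mentioned in the thread (-auv, or -acv with checksums, plus "--exclude lost+found"). It builds the command but does not run it, and it assumes the AFS processes on the fileserver are already stopped.

```python
import shlex


def build_rsync_command(src="/vicepa", dst="/nextpa",
                        checksum=True, exclude_lost_found=True):
    """Return the rsync argument list for copying one vice partition.

    Hypothetical helper: -a (archive) implies -o, -g and -p, which is
    what preserves owner, group and mode.
    """
    # -acv compares file checksums (slower); -auv is the more usual form.
    flags = "-acv" if checksum else "-auv"
    cmd = ["rsync", flags]
    if exclude_lost_found:
        # Skip the lost+found directory that some filesystems create.
        cmd += ["--exclude", "lost+found"]
    # Trailing slashes: copy the contents of src into dst,
    # not the src directory itself.
    cmd += [src.rstrip("/") + "/", dst.rstrip("/") + "/"]
    return cmd


if __name__ == "__main__":
    # Print the command for review instead of executing it.
    print(shlex.join(build_rsync_command()))
```

Printing the command first makes it easy to review the source and destination before anything is copied; to run it, the list could be passed to subprocess.run().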
Re: [OpenAFS] Rsync-ing a vice* partition
As long as you preserve owner, group and mode you're fine. -o (owner), -g (group) and -p (perms) are needed, but -a (archive) implies all of those, so the usual -auv that people use is fine.

On Thu, Jan 10, 2013 at 11:02 PM, Garance A Drosihn wrote:
> Consider a fileserver with the following partitions on it:
>
>   /vicepa (in production use)
>   /vicepb (in production use)
>   /nextpa (totally empty)
>
> Assume that all the AFS processes will be shutdown on this
> fileserver for a few hours (for unrelated reasons).
>
> As far as AFS is concerned, would it be safe and reasonable
> to use rsync to duplicate all files on /vicepa to /nextpa,
> dismount both partitions, and then mount what was /nextpa
> as /vicepa? Or is that playing with fire, such that it'd be
> much safer to move the AFS volumes via standard AFS commands
> while AFS is running?
>
> It also happens that every volume on /vicepa is replicated
> on multiple AFS fileservers. (Some are RO's for volumes
> where the RW is on this /vicepa, and some are RO's for
> volumes where the RW is on other fileservers.)
>
> This is not an urgent issue. I'm just wondering.
>
> --
> Garance Alistair Drosehn = dro...@rpi.edu
> Senior Systems Programmer or g...@freebsd.org
> Rensselaer Polytechnic Institute; Troy, NY; USA

-- 
Derrick
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info
Re: [OpenAFS] Shutdown/startup of entire cell
On 1/10/13 11:45 PM, Thomas Kula wrote:
> On Thu, Jan 10, 2013, Garance A Drosihn wrote:
>> Hi.
>>
>> Due to circumstances way beyond my control (a major network
>> upgrade), I am going to need to shutdown our entire AFS cell
>> this Saturday. So, that is less than 36 hours from now.
>
> Others have addressed the proper order for shutting down
> AFS servers well, so I won't touch on them. I will point
> out that when I was at UMich we ensured that all of our AFS
> fileservers were restarted at least once a year. I did advocate
> for emptying fileservers before doing that, although that never
> took hold, for various reasons, but with modern hardware it
> wasn't too onerous of a process. The bulk of the time was
> "waiting for callbacks to be broken".

It happens that we have done controlled shutdowns and restarts of all our fileservers "recently", although that is more by dumb luck than good planning. We did one restart of them all in the summer of 2012, and one in the summer of 2011. Before that, I think they had gone four or five years without a restart.

I expect we're going to make a point of doing such restarts on a more regular and planned basis!

-- 
Garance Alistair Drosehn = dro...@rpi.edu
Senior Systems Programmer or g...@freebsd.org
Rensselaer Polytechnic Institute; Troy, NY; USA
Re: [OpenAFS] Shutdown/startup of entire cell
On Thu, Jan 10, 2013 at 10:50:27PM -0500, Garance A Drosihn wrote:
> Hi.
>
> Due to circumstances way beyond my control (a major network
> upgrade), I am going to need to shutdown our entire AFS cell
> this Saturday. So, that is less than 36 hours from now.

Others have addressed the proper order for shutting down AFS servers well, so I won't touch on them. I will point out that when I was at UMich we ensured that all of our AFS fileservers were restarted at least once a year. I did advocate for emptying fileservers before doing that, although that never took hold, for various reasons, but with modern hardware it wasn't too onerous a process. The bulk of the time was "waiting for callbacks to be broken".

We did this so that we knew that a fileserver would properly restart --- after a while, you've upgraded various things, fixed stuff, etc., and it was good to have a sanity check that nothing had crept in during that time. And, with Murphy around, you know that at some point your fileserver is going to get its power cord yanked, be the victim of someone hitting the wrong power switch, etc. It was nice to know that in that case, things would just come back when power was restored. And if something had crept in over the course of the year, it was nice that it surfaced when a couple of sysadmins were in the office, well-rested and well-caffeinated and ready to handle weird problems, rather than getting a bleary-eyed sleepy beep at 3am.

-- 
Thomas L. Kula | k...@tproa.net | http://kula.tproa.net/
Re: [OpenAFS] Shutdown/startup of entire cell
On 1/10/13 11:14 PM, Jeffrey Altman wrote:
> You can shutdown the file servers without shutting down the
> database servers. During the outage the database servers may
> lose the ability to elect a master. Therefore you should avoid
> making any database changes during the outage window.

I should have added that the database servers are all on a single network switch, so unless something goes REALLY wrong they won't lose contact with each other. I can't imagine we would be making any database changes. I'm just hoping to survive the upgrade.

> I would run one file server with a single local disk partition
> containing a readonly site for the root.afs and root.cell
> volumes. This could be on one of the database servers.

Hmm. Clever idea!

> I would shutdown any file server with network attached storage
> for the length of the outage window.
>
> I would also place a README file in the root.cell root directory
> describing the outage for those that might check.
>
> If you are going to shutdown the database servers, shut them
> down after the fileservers and restart them before the
> fileservers.
>
> Jeffrey Altman

Thanks for the quick answers!
Re: [OpenAFS] Shutdown/startup of entire cell
You can shutdown the file servers without shutting down the database servers. During the outage the database servers may lose the ability to elect a master. Therefore you should avoid making any database changes during the outage window.

I would run one file server with a single local disk partition containing a readonly site for the root.afs and root.cell volumes. This could be on one of the database servers.

I would shutdown any file server with network attached storage for the length of the outage window.

I would also place a README file in the root.cell root directory describing the outage for those that might check.

If you are going to shutdown the database servers, shut them down after the fileservers and restart them before the fileservers.

Jeffrey Altman

On 1/10/2013 10:50 PM, Garance A Drosihn wrote:
> Hi.
>
> Due to circumstances way beyond my control (a major network
> upgrade), I am going to need to shutdown our entire AFS cell
> this Saturday. So, that is less than 36 hours from now.
>
> Basically all our fileservers use disks which are connected
> via iSCSI, and the network upgrade may sever all network
> connectivity between the AFS fileservers and their vice*
> partitions for at least one hour, and possibly two. Thus I
> expect it would be wise to shutdown all fileservers.
>
> And if I'm shutting down all our fileservers, I assume I
> should also shut down all database servers. (True?)
>
> The one nice thing is that I can do a controlled shutdown, in
> whatever order seems appropriate. So I have two questions:
>
> 1) When shutting down, should all database servers be shutdown
>    before the fileservers, or should the fileservers be shutdown
>    first?
>
> 2) When starting up, should the fileservers be started first,
>    or should the database servers be started first?
>
> I realize this will disrupt many of our AFS clients. We're
> pretty much expecting we will reboot all of our systems by the
> time we're done with this. It's safe to assume that I'm not
> happy about any of this, but I have no choice in the matter.
>
> Apologies for running to the list on this. I expect the answer
> is somewhere in the documentation (or maybe even intuitively
> obvious), but this issue didn't come up until early this morning,
> and I've got about a dozen other (non-AFS) servers which are also
> affected by this network upgrade, so I'd like some confirmation
> of best practices for this case.
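The ordering recommended in this message (fileservers down first and database servers last; database servers up first and fileservers last) can be sketched as follows. This is a hedged illustration, not a tested procedure: the function name and host names are hypothetical, it assumes every host is either a database server or a fileserver but not both, and it assumes that `bos shutdown` and `bos startup` are run by someone with administrative rights in the cell. The script only builds the command lists; nothing is executed.

```python
import shlex


def plan_outage(db_servers, file_servers):
    """Return (shutdown_cmds, startup_cmds) in the recommended order.

    Hypothetical helper. Shutdown: fileservers first, database servers
    last. Startup: database servers first, fileservers last.
    """
    # -wait makes bos return only after the processes have stopped.
    shutdown = [["bos", "shutdown", "-server", host, "-wait"]
                for host in list(file_servers) + list(db_servers)]
    startup = [["bos", "startup", "-server", host]
               for host in list(db_servers) + list(file_servers)]
    return shutdown, startup


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    down, up = plan_outage(
        db_servers=["db1.example.edu", "db2.example.edu"],
        file_servers=["fs1.example.edu", "fs2.example.edu"],
    )
    print("# before the outage")
    for cmd in down:
        print(shlex.join(cmd))
    print("# after the outage")
    for cmd in up:
        print(shlex.join(cmd))
```

Keeping the plan as data makes it easy to review the order with colleagues before the outage window. A fileserver that is deliberately left running on a database server, as suggested above for root.afs and root.cell, would need to be left out of the shutdown list.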
RE: [OpenAFS] Shutdown/startup of entire cell
> 1) When shutting down, should all database servers be shutdown
>    before the fileservers, or should the fileservers be shutdown
>    first?
>
> 2) When starting up, should the fileservers be started first,
>    or should the database servers be started first?

You need the database servers to be up while you shut down/start up the fileservers, since one of the databases in question records which fileservers have which volumes and fileservers register/unregister volumes with that database during startup/shutdown.

If there's no need for the database servers themselves to be shut down, you can leave them up, but I'm not sure that actually gains you anything.

-- 
brandon s allbery kf8nh sine nomine associates
allber...@gmail.com ballb...@sinenomine.net
unix, openafs, kerberos, infrastructure, xmonad http://sinenomine.net
[OpenAFS] Rsync-ing a vice* partition
Consider a fileserver with the following partitions on it:

  /vicepa (in production use)
  /vicepb (in production use)
  /nextpa (totally empty)

Assume that all the AFS processes will be shutdown on this fileserver for a few hours (for unrelated reasons).

As far as AFS is concerned, would it be safe and reasonable to use rsync to duplicate all files on /vicepa to /nextpa, dismount both partitions, and then mount what was /nextpa as /vicepa? Or is that playing with fire, such that it'd be much safer to move the AFS volumes via standard AFS commands while AFS is running?

It also happens that every volume on /vicepa is replicated on multiple AFS fileservers. (Some are RO's for volumes where the RW is on this /vicepa, and some are RO's for volumes where the RW is on other fileservers.)

This is not an urgent issue. I'm just wondering.

-- 
Garance Alistair Drosehn = dro...@rpi.edu
Senior Systems Programmer or g...@freebsd.org
Rensselaer Polytechnic Institute; Troy, NY; USA
[OpenAFS] Shutdown/startup of entire cell
Hi.

Due to circumstances way beyond my control (a major network upgrade), I am going to need to shutdown our entire AFS cell this Saturday. So, that is less than 36 hours from now.

Basically all our fileservers use disks which are connected via iSCSI, and the network upgrade may sever all network connectivity between the AFS fileservers and their vice* partitions for at least one hour, and possibly two. Thus I expect it would be wise to shutdown all fileservers.

And if I'm shutting down all our fileservers, I assume I should also shut down all database servers. (True?)

The one nice thing is that I can do a controlled shutdown, in whatever order seems appropriate. So I have two questions:

1) When shutting down, should all database servers be shutdown
   before the fileservers, or should the fileservers be shutdown
   first?

2) When starting up, should the fileservers be started first,
   or should the database servers be started first?

I realize this will disrupt many of our AFS clients. We're pretty much expecting we will reboot all of our systems by the time we're done with this. It's safe to assume that I'm not happy about any of this, but I have no choice in the matter.

Apologies for running to the list on this. I expect the answer is somewhere in the documentation (or maybe even intuitively obvious), but this issue didn't come up until early this morning, and I've got about a dozen other (non-AFS) servers which are also affected by this network upgrade, so I'd like some confirmation of best practices for this case.

-- 
Garance Alistair Drosehn = dro...@rpi.edu
Senior Systems Programmer or g...@freebsd.org
Rensselaer Polytechnic Institute; Troy, NY; USA