> The best way I've found to deal with this problem is to have multiple
> servers behind a load-balancer and do a "rolling" restart. If you have
> servers A and B, you take A out of the load balancer temporarily,
> upgrade it, add it back in, take B out, upgrade it, add it back in.
> Using this technique, we were able to smoothly upgrade production
> servers on a very busy cluster of machines during normal business hours
> while customers were on the site.

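
For reference, the rolling restart described in the quote can be sketched roughly as below. This is only a minimal illustration: the `pool` set stands in for the load balancer's rotation, and `upgrade` is a hypothetical callback for whatever the real upgrade step is. Neither corresponds to any actual load-balancer API.

```python
from typing import Callable, Iterable, List, Set


def rolling_restart(
    servers: Iterable[str],
    pool: Set[str],
    upgrade: Callable[[str], None],
) -> List[List[str]]:
    """Upgrade servers one at a time, keeping the others in rotation.

    `pool` is a stand-in for the load balancer's active set (an assumption,
    not a real API). Returns, for each step, the servers that were still
    serving traffic while one server was out for its upgrade.
    """
    serving_during_step = []
    for server in servers:
        pool.remove(server)                       # take it out of rotation
        serving_during_step.append(sorted(pool))  # who carries the traffic now
        upgrade(server)                           # hypothetical upgrade step
        pool.add(server)                          # put it back in rotation
    return serving_during_step
```

With servers A and B, B carries all the traffic while A is upgraded, and then A carries it while B is upgraded.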
We do that frequently here - 7 servers behind a BigIP. I've always wondered, though, whether this approach is foolproof for major upgrades of applications that maintain state, since a user might have a session created on a new-code box and then hit an old-code box on the next page view. It takes us many minutes to work through restarting the entire array. Were you ever concerned about something like that?

--Geoff
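
To make the mixed-version concern concrete, here is a small toy model of the rollout. It assumes that any in-rotation server can receive any request (that is, no session persistence at the load balancer, which a real setup may well have), and the function name is made up for illustration. It just marks the phases of a rolling restart during which both old-code and new-code boxes are serving traffic.

```python
from typing import List


def mixed_version_phases(n_servers: int) -> List[bool]:
    """Return one flag per rollout phase: True if old and new code both serve.

    Each server contributes two phases: "out for upgrade" and "back in
    rotation". Toy model only; assumes requests can land on any server
    that is currently in rotation.
    """
    versions = ["old"] * n_servers
    phases = []
    for i in range(n_servers):
        # Phase 1: server i is out of rotation, the rest keep serving.
        in_rotation = versions[:i] + versions[i + 1:]
        phases.append(len(set(in_rotation)) > 1)
        # Phase 2: server i returns running the new code.
        versions[i] = "new"
        phases.append(len(set(versions)) > 1)
    return phases
```

Under this model, `sum(mixed_version_phases(7))` counts how many of the 14 phases expose users to both code versions, which is the window in which a session created on a new-code box could be handed to an old-code box.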