Wheeler, JF (Jonathan) wrote:
> One of our (3) AFS servers has a mounted read-write volume which must be
> available 24x7 to our batch system. The server is as resilient as we
> can make it, but still it may fail outside normal working hours for some
> reason. For technical reasons related to the software installed on the
> volume it is not possible to use read-only volumes mounted from our
> other servers (the software must be installed and served from the same
> directory name), so I have devised the following plan in the event of a
> failure:
>
> a) create read-only volumes on the other 2 servers, but do not mount
> them; use "vos release" whenever the software is updated
> b) in the event of a failure of server1 (which has the rw volume), drop
> the existing mount and mount one of the read-only volumes (we can live
> with the read-only copy whilst server1 is being repaired/replaced) in
> its place.
>
> Can anyone see problems with that scenario ? We could use "vos
> convertROtoRW"; how would that affect the process ?

Volumes "app" and "app.readonly" have not only different names but
different volume ids. Once an application has opened a directory or
file on "app" it will continue to try to access the "app" volume even
after the "app" volume is no longer present. Changing a mount point to
refer to "app.readonly" will only affect future attempts to evaluate the
path that resulted in "app" being accessed.

The behavior you are looking for requires that the client believe that
there is an alternate location for the "app" volume to fail over to. In
other words, you require that there be read-write replicas. This
support does not currently exist.

Using convertROtoRW will not provide the equivalent of a read-write
replica, because the client is currently accessing the "app" volume on
the one and only file server that the vlserver claimed the volume was
located on. When the vldb is updated, the client will not receive any
indication that the change occurred. During a volume move operation
that notification would have come from the file server from which the
volume was being moved. Of course, in this scenario that file server is
no longer responsive and is not involved in the move.

Volume location information is valid for two hours but can be manually
invalidated on the client using the "fs checkvolumes" command.

During the 2009 Google Summer of Code some progress was made towards
implementing a read-write replication model. If you are aware of
resources that could be contributed to help complete this effort, please
contact the OpenAFS gatekeepers.

Jeffrey Altman

_______________________________________________
OpenAFS-info mailing list
https://lists.openafs.org/mailman/listinfo/openafs-info
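
The point about cached volume location can be illustrated with a toy model. This is only a sketch and not how the OpenAFS cache manager is implemented: the `VLDB` and `Client` classes, the volume id 536870915, and the server names are all made up for illustration. It assumes nothing beyond what is stated above, namely that a client keeps location information for two hours unless it is invalidated by hand, and that updating the vldb does not notify the client.

```python
from dataclasses import dataclass, field

# The reply states that volume location information is valid for two hours.
LOCATION_TTL_SECONDS = 2 * 60 * 60


@dataclass
class VLDB:
    """Toy volume location database: volume id -> file server name."""
    locations: dict = field(default_factory=dict)


@dataclass
class Client:
    """Toy client that caches where each volume id lives (hypothetical)."""
    vldb: VLDB
    cache: dict = field(default_factory=dict)  # volume id -> (server, fetched_at)

    def locate(self, volume_id, now):
        """Return the server the client would contact at time `now` (seconds)."""
        entry = self.cache.get(volume_id)
        if entry is not None and now - entry[1] < LOCATION_TTL_SECONDS:
            return entry[0]  # cached answer, possibly stale
        server = self.vldb.locations[volume_id]
        self.cache[volume_id] = (server, now)
        return server

    def check_volumes(self):
        """Stand-in for "fs checkvolumes": discard all cached locations."""
        self.cache.clear()


APP_ID = 536870915  # hypothetical id of the "app" volume

vldb = VLDB({APP_ID: "server1"})
client = Client(vldb)

first = client.locate(APP_ID, now=0)               # client learns "server1"
vldb.locations[APP_ID] = "server2"                 # vldb updated, client not told
stale = client.locate(APP_ID, now=600)             # still the cached "server1"
client.check_volumes()                             # manual invalidation
fresh = client.locate(APP_ID, now=601)             # now sees "server2"
```

In this model the client keeps asking the dead server until either the two hours run out or the cache is cleared by hand, which is the gap a real read-write replica would close.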
