The following commit has been merged in the openafs-stable-1_6_x branch:
commit 3f29b253bcda761305b5567b3936b38f797e7848
Author: Andrew Deason <adea...@dson.org>
Date:   Sun May 1 11:24:30 2016 -0500

    ubik: Don't RECFOUNDDB if can't contact most sites
    
    Currently, the ubik recovery code will always set UBIK_RECFOUNDDB
    during recovery, after asking all other sites for their dbversions.
    This happens regardless of how many sites we were actually able to
    successfully contact, even if we couldn't contact any of them.
    
    This can cause problems when we are unable to contact a majority of
    sites with DISK_GetVersion: if we haven't contacted a majority of
    sites, we cannot say with confidence that we know what the best db
    version available is (which is what UBIK_RECFOUNDDB represents; that
    we've found which database is the one we should be using). This can
    also result in UBIK_RECHAVEDB being set in a similar situation,
    indicating that we have the best db version locally, even though we
    never actually asked anyone else what their db version was.
    
    For example, say site A is the sync site going through recovery, and
    DISK_GetVersion fails for the only other sites B and C. Site A will
    then set UBIK_RECFOUNDDB, and will claim that site A has the best db
    version available (UBIK_RECHAVEDB). This allows site A to process ubik
    write transactions (causing the db to be labelled with a new epoch),
    or possibly to send the db to the other sites via DISK_SendFile, if
    they quickly become available during recovery. Ubik write transactions
    can succeed in this situation, because our ContactQuorum_* calls will
    succeed if we never try to contact a remote site ('rcode' defaults to
    0).
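
    The 'rcode' behavior described above can be illustrated with a
    simplified model. This is only a sketch, not the actual
    ContactQuorum_* code: the struct, the function name, the rule for
    skipping sites, and the quorum test are all assumptions made for
    illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a remote site; not a real ubik structure. */
struct remote_site {
    bool believed_up;   /* assumption: sites not believed up are skipped */
    int call_result;    /* 0 on success, nonzero error code if the call fails */
};

/*
 * Hypothetical quorum call. Returns 0 if enough sites answered
 * (counting ourselves), otherwise the last error code seen.
 */
static int
contact_sites(const struct remote_site *sites, size_t nsites, size_t quorum)
{
    int rcode = 0;          /* defaults to "success" */
    size_t okcalls = 0;
    size_t i;

    for (i = 0; i < nsites; i++) {
        if (!sites[i].believed_up)
            continue;       /* call never attempted, so rcode is untouched */
        if (sites[i].call_result == 0)
            okcalls++;
        else
            rcode = sites[i].call_result;
    }

    if (okcalls + 1 >= quorum)  /* +1 for the local site */
        return 0;
    return rcode;           /* still 0 if no call was ever attempted */
}
```

    In this model, a caller with two remote sites that are both skipped
    gets 0 back even though no quorum was reached, which is the kind of
    false success the paragraph above describes.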
    
    This situation should be rather rare, because normally a majority of
    sites must be reachable by site A for site A to be voted the sync site
    in the first place. However, it is possible for site A to lose
    connectivity to all other sites immediately after sync site election.
    It is also possible for site A to proceed far enough in the recovery
    process to set UBIK_RECHAVEDB before it loses its sync site status.
    
    As a result of all of this, if a site with an old database comes
    online while there are network connectivity problems between the
    other sites, and a ubik write request comes in, it's possible for
    the "old" database to overwrite the "new" database. This makes it
    look as if the database has "rolled back" to an earlier version.
    
    This should be possible with any ubik database, though the exact
    conditions needed to trigger this bug can vary, because different
    ubik servers set different network timeouts. It is probably most
    likely to occur with the VLDB, because the VLDB is typically the
    most frequently written database.
    
    If a VLDB reverts to an earlier version, existing volumes can appear
    not to exist in the VLDB, and new volumes can re-use volume IDs from
    existing volumes. This can result in rather confusing errors.
    
    To fix this, ensure that we have contacted a majority of sites with
    DISK_GetVersion before indicating that we have located the best db
    version. If we've contacted a majority of sites, then we are
    guaranteed (under ubik assumptions) that we've found the best version,
    since previous writes to the database should be guaranteed to hit a
    majority of sites (otherwise they wouldn't be successful).
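
    The majority check can be sketched as follows. This is a simplified
    illustration, not the code in src/ubik/recovery.c: the types and
    function names are hypothetical, versions are compared by epoch and
    then by counter, and any special quorum rules are ignored.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified db version: compared by epoch, then counter. */
struct db_version {
    int epoch;
    int counter;
};

/* Hypothetical result of asking one remote site for its version. */
struct site_reply {
    bool reachable;             /* did the version query succeed? */
    struct db_version version;  /* meaningful only if reachable */
};

/* Returns true if 'a' is strictly newer than 'b'. */
static bool
version_newer(struct db_version a, struct db_version b)
{
    if (a.epoch != b.epoch)
        return a.epoch > b.epoch;
    return a.counter > b.counter;
}

/*
 * Returns true and stores the best known version in *best only if a
 * strict majority of all sites (including the local one) was contacted.
 * Otherwise returns false and leaves *best untouched.
 */
static bool
find_best_version(struct db_version local, const struct site_reply *remote,
                  size_t nremote, struct db_version *best)
{
    size_t nsites = nremote + 1;    /* remote sites plus ourselves */
    size_t ncontacted = 1;          /* we always know our own version */
    struct db_version candidate = local;
    size_t i;

    for (i = 0; i < nremote; i++) {
        if (!remote[i].reachable)
            continue;
        ncontacted++;
        if (version_newer(remote[i].version, candidate))
            candidate = remote[i].version;
    }

    if (ncontacted * 2 <= nsites)
        return false;               /* no majority: best version unknown */

    *best = candidate;
    return true;
}
```

    With three sites, reaching neither remote site yields no answer at
    all, while reaching just one of them is enough to form a majority
    together with the local site.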
    
    If we cannot reach a majority of sites, we just don't set
    UBIK_RECFOUNDDB, and the recovery process restarts. Presumably on the
    next iteration we'll be able to contact them, or we'll lose sync site
    status if we can't reach the other sites for long enough.
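
    The resulting flag handling for one recovery pass might look like
    the sketch below. The flag names, their values, and the function are
    illustrative stand-ins rather than the real ubik definitions, and
    the step of fetching a newer db from another site is left out.

```c
#include <stdbool.h>

/* Illustrative stand-ins for the recovery state flags; values are assumed. */
enum {
    REC_FOUNDDB = 0x2,  /* we know which db version is the best one */
    REC_HAVEDB  = 0x4   /* the best db version is available locally */
};

/*
 * Hypothetical single recovery pass. Returns the updated state flags.
 * If a majority was not contacted, the flags are left as they were, so
 * the caller simply runs another pass later.
 */
static unsigned
recovery_pass(unsigned state, bool contacted_majority, bool local_is_best)
{
    if (!contacted_majority)
        return state;           /* do not claim to have found the best db */

    state |= REC_FOUNDDB;
    if (local_is_best)
        state |= REC_HAVEDB;
    return state;
}
```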
    
    Reviewed-on: https://gerrit.openafs.org/12281
    Tested-by: BuildBot <build...@rampaginggeek.com>
    Reviewed-by: Benjamin Kaduk <ka...@mit.edu>
    (cherry picked from commit d3dbdade7e8eaf6da37dd6f1f53d9f1384626071)
    
    Change-Id: I4f4e7255efd3e16e3acfec8f90bf2019cab1fb63
    Reviewed-on: https://gerrit.openafs.org/12339
    Tested-by: BuildBot <build...@rampaginggeek.com>
    Reviewed-by: Mark Vitale <mvit...@sinenomine.net>
    Reviewed-by: Michael Meffie <mmef...@sinenomine.net>
    Reviewed-by: Stephan Wiesand <stephan.wies...@desy.de>

 src/ubik/recovery.c |   45 +++++++++++++++++++++++++++++----------------
 1 files changed, 29 insertions(+), 16 deletions(-)

-- 
OpenAFS Master Repository
_______________________________________________
OpenAFS-cvs mailing list
OpenAFS-cvs@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-cvs
