Author: rmacklem
Date: Tue Jun 12 19:36:32 2018
New Revision: 335012
URL: https://svnweb.freebsd.org/changeset/base/335012

Log:
  Merge the pNFS server code from projects/pnfs-planb-server into head.
  
  This code merge adds a pNFS service to the NFSv4.1 server. Although it is
  a large commit it should not affect behaviour for a non-pNFS NFS server.
  Some documentation on how this works can be found at:
  http://people.freebsd.org/~rmacklem/pnfs-planb-setup.txt
  and will hopefully be turned into a proper document soon.
  This is a merge of the kernel code. Userland and man page changes will
  come soon, once the dust settles on this merge.
  It has passed a "make universe", so I hope it will not cause build problems.
  It also adds NFSv4.1 server support for the "current stateid".
  
  Here is a brief overview of the pNFS service:
  A pNFS service separates the Read/Write operations from all the other NFSv4.1
  Metadata operations. It is hoped that this separation allows a pNFS service
  to be configured that exceeds the limits of a single NFS server for
  storage capacity and/or I/O bandwidth.
  It is possible to configure mirroring within the data servers (DSs) so that
  the data storage file for an MDS file will be mirrored on two or more of
  the DSs.
  When this is used, failure of a DS will not stop the pNFS service and a
  failed DS can be recovered once repaired while the pNFS service continues
  to operate.  Although two way mirroring would be the norm, it is possible
  to set a mirroring level of up to four or the number of DSs, whichever is
  less.
  The Metadata server will always be a single point of failure,
  just as a single NFS server is.
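
  As a rough illustration of the mirroring limit described above (up to four
  or the number of DSs, whichever is less), here is a minimal sketch. The
  helper name and the SKETCH_MAXMIRRORS constant are hypothetical and are not
  part of this commit; the constant merely mirrors the value 4 of
  NFSDEV_MAXMIRRORS in nfs.h. The real code may reject an out-of-range value
  rather than clamp it.

```c
#include <assert.h>

/* Assumption: same value as NFSDEV_MAXMIRRORS (4) in sys/fs/nfs/nfs.h. */
#define	SKETCH_MAXMIRRORS	4

/*
 * Hypothetical helper: clamp a requested mirror level to the lesser of
 * SKETCH_MAXMIRRORS and the number of configured DSs.  A result of 1
 * means "no mirroring" (one data storage file per MDS file).
 */
static int
sketch_mirror_level(int requested, int ds_count)
{
	int limit;

	assert(ds_count >= 0);
	limit = ds_count < SKETCH_MAXMIRRORS ? ds_count : SKETCH_MAXMIRRORS;
	if (requested > limit)
		requested = limit;
	if (requested < 1)
		requested = 1;
	return (requested);
}
```

  For example, requesting four-way mirroring with only three DSs would yield
  a level of three under this sketch.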
  
  A Plan B pNFS service consists of a single MetaData Server (MDS) and K
  Data Servers (DS), all of which are recent FreeBSD systems.
  Clients will mount the MDS as they would a single NFS server.
  When files are created, the MDS creates a file tree identical to what a
  single NFS server creates, except that all the regular (VREG) files will
  be empty. As such, if you look at the exported tree directly on the MDS
  server (not via an NFS mount), the files will all be of size 0.
  Each of these files will also have two extended attributes in the system
  attribute name space:
  pnfsd.dsfile - This extended attribute stores the information that
      the MDS needs to find the data storage file(s) on DS(s) for this file.
  pnfsd.dsattr - This extended attribute stores the Size, AccessTime, ModifyTime
      and Change attributes for the file, so that the MDS doesn't need to
      acquire the attributes from the DS for every Getattr operation.
  For each regular (VREG) file, the MDS creates a data storage file on one
  (or more if mirroring is enabled) of the DSs in one of the "dsNN"
  subdirectories.  The name of this file is the file handle
  of the file on the MDS in hexadecimal so that the name is unique.
  The DSs use subdirectories named "ds0" to "dsN" so that no one directory
  gets too large. The value of "N" is set via the sysctl vfs.nfsd.dsdirsize
  on the MDS, with the default being 20.
  For production servers that will store a lot of files, this value should
  probably be much larger.
  It can be increased when the "nfsd" daemon is not running on the MDS,
  once the additional "dsNN" directories are created.
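
  The naming scheme above (a "dsNN" subdirectory plus the MDS file handle in
  hexadecimal) can be sketched in userland C as follows. This is only an
  illustration: the function name is hypothetical, the use of lower-case hex
  digits is an assumption, and the way the server picks the subdirectory
  index is not described in this log, so the index is simply passed in by
  the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build "ds<diridx>/<hex of file handle>" in out.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int
sketch_dsfile_path(const unsigned char *fh, size_t fhlen, unsigned int diridx,
    char *out, size_t outlen)
{
	static const char hexdigits[] = "0123456789abcdef";
	size_t i, pos;
	int n;

	assert(fh != NULL && out != NULL);
	/* Directory prefix, e.g. "ds7/". */
	n = snprintf(out, outlen, "ds%u/", diridx);
	if (n < 0 || (size_t)n >= outlen)
		return (-1);
	pos = strlen(out);
	/* Two hex digits per file handle byte, plus the terminating NUL. */
	if (outlen - pos < 2 * fhlen + 1)
		return (-1);
	for (i = 0; i < fhlen; i++) {
		out[pos++] = hexdigits[fh[i] >> 4];
		out[pos++] = hexdigits[fh[i] & 0x0f];
	}
	out[pos] = '\0';
	return (0);
}
```

  Since the name is derived from the whole file handle, two different MDS
  files cannot end up with the same data storage file name.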
  
  For pNFS aware NFSv4.1 clients, the FreeBSD server will return two pieces
  of information to the client that allows it to do I/O directly to the DS.
  DeviceInfo - This is relatively static information that defines what a DS
               is. The critical bits of information returned by the FreeBSD
               server are the IP address of the DS and, for the Flexible
               File layout, that NFSv4.1 is to be used and that it is
               "tightly coupled".
               There is a "deviceid" which identifies the DeviceInfo.
  Layout     - This is per file and can be recalled by the server when it
               is no longer valid. For the FreeBSD server, there is support
               for two types of layout, called File and Flexible File layout.
               Both allow the client to do I/O on the DS via NFSv4.1 I/O
               operations. The Flexible File layout is a more recent variant
               that allows specification of mirrors, where the client is
               expected to do writes to all mirrors to maintain them in a
               consistent state. The Flexible File layout also allows the
               client to report I/O errors for a DS back to the MDS.
               The Flexible File layout supports two variants referred to as
               "tightly coupled" vs "loosely coupled". The FreeBSD server always
               uses the "tightly coupled" variant where the client uses the
               same credentials to do I/O on the DS as it would on the MDS.
               For the "loosely coupled" variant, the layout specifies a
               synthetic user/group that the client uses to do I/O on the DS.
               The FreeBSD server does not do striping and always returns
               layouts for the entire file. The critical information in a layout
               is Read vs Read/Write and DeviceID(s) that identify which
               DS(s) the data is stored on.
  
  At this time, the MDS generates File Layout layouts to NFSv4.1 clients
  that know how to do pNFS for the non-mirrored DS case unless the sysctl
  vfs.nfsd.default_flexfile is set non-zero, in which case Flexible File
  layouts are generated.
  The mirrored DS configuration always generates Flexible File layouts.
  For NFS clients that do not support NFSv4.1 pNFS, all I/O operations
  are done against the MDS which acts as a proxy for the appropriate DS(s).
  When the MDS receives an I/O RPC, it will do the RPC on the DS as a proxy.
  If the DS were on the same machine, the MDS/DS would proxy the RPC to
  itself and so on, recursively, until the machine ran out of some resource,
  such as session slots or mbufs.
  As such, DSs must be separate systems from the MDS.
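
  The layout type selection described above can be summarized by the
  following sketch. It follows the test that appears in the nfs_commonsubs.c
  part of this diff (nfsrv_doflexfile != 0 || nfsrv_maxpnfsmirror > 1), but
  the enum and function names here are hypothetical and exist only for
  illustration.

```c
#include <assert.h>

/* Hypothetical stand-ins for the two layout types the server supports. */
enum sketch_layout {
	SKETCH_LAYOUT_FILE,
	SKETCH_LAYOUT_FLEXFILE
};

/*
 * Hypothetical helper: default_flexfile stands in for the sysctl
 * vfs.nfsd.default_flexfile and mirror_level for the configured
 * mirroring level (1 means no mirroring).
 */
static enum sketch_layout
sketch_pick_layout(int default_flexfile, int mirror_level)
{
	assert(mirror_level >= 1);
	/* Mirrored configurations always use Flexible File layouts. */
	if (default_flexfile != 0 || mirror_level > 1)
		return (SKETCH_LAYOUT_FLEXFILE);
	return (SKETCH_LAYOUT_FILE);
}
```

  Only the non-mirrored case with the sysctl left at zero ends up with File
  Layout layouts.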
  
  Tested by:    james.r...@framestore.com
  Relnotes:     yes

Modified:
  head/sys/fs/nfs/nfs.h
  head/sys/fs/nfs/nfs_commonacl.c
  head/sys/fs/nfs/nfs_commonport.c
  head/sys/fs/nfs/nfs_commonsubs.c
  head/sys/fs/nfs/nfs_var.h
  head/sys/fs/nfs/nfsport.h
  head/sys/fs/nfs/nfsproto.h
  head/sys/fs/nfs/nfsrvstate.h
  head/sys/fs/nfsclient/nfs_clport.c
  head/sys/fs/nfsclient/nfs_clrpcops.c
  head/sys/fs/nfsclient/nfs_clstate.c
  head/sys/fs/nfsclient/nfs_clvfsops.c
  head/sys/fs/nfsserver/nfs_nfsdkrpc.c
  head/sys/fs/nfsserver/nfs_nfsdport.c
  head/sys/fs/nfsserver/nfs_nfsdserv.c
  head/sys/fs/nfsserver/nfs_nfsdsocket.c
  head/sys/fs/nfsserver/nfs_nfsdstate.c
  head/sys/fs/nfsserver/nfs_nfsdsubs.c
  head/sys/nfs/nfs_nfssvc.c
  head/sys/nfs/nfssvc.h

Modified: head/sys/fs/nfs/nfs.h
==============================================================================
--- head/sys/fs/nfs/nfs.h       Tue Jun 12 19:26:25 2018        (r335011)
+++ head/sys/fs/nfs/nfs.h       Tue Jun 12 19:36:32 2018        (r335012)
@@ -98,6 +98,7 @@
 #define        NFSSESSIONHASHSIZE      20      /* Size of server session hash table */
 #endif
 #define        NFSSTATEHASHSIZE        10      /* Size of server stateid hash table */
+#define        NFSLAYOUTHIGHWATER      1000000 /* Upper limit for # of layouts */
 #ifndef        NFSCLDELEGHIGHWATER
 #define        NFSCLDELEGHIGHWATER     10000   /* limit for client delegations */
 #endif
@@ -171,11 +172,20 @@ struct nfsd_addsock_args {
 
 /*
  * nfsd argument for new krpc.
+ * (New version supports pNFS, indicated by NFSSVC_NEWSTRUCT flag.)
  */
 struct nfsd_nfsd_args {
        const char *principal;  /* GSS-API service principal name */
        int     minthreads;     /* minimum service thread count */
        int     maxthreads;     /* maximum service thread count */
+       int     version;        /* Allow multiple variants */
+       char    *addr;          /* pNFS DS addresses */
+       int     addrlen;        /* Length of addrs */
+       char    *dnshost;       /* DNS names for DS addresses */
+       int     dnshostlen;     /* Length of DNS names */
+       char    *dspath;        /* DS Mount path on MDS */
+       int     dspathlen;      /* Length of DS Mount path on MDS */
+       int     mirrorcnt;      /* Number of mirrors to create on DSs */
 };
 
 /*
@@ -186,6 +196,23 @@ struct nfsd_nfsd_args {
 #define        NFSDEV_MAXMIRRORS       4
 #define        NFSDEV_MAXVERS          4
 
+struct nfsd_pnfsd_args {
+       int     op;             /* Which pNFSd op to perform. */
+       char    *mdspath;       /* Path of MDS file. */
+       char    *dspath;        /* Path of recovered DS mounted on dir. */
+       char    *curdspath;     /* Path of current DS mounted on dir. */
+};
+
+#define        PNFSDOP_DELDSSERVER     1
+#define        PNFSDOP_COPYMR          2
+
+/* Old version. */
+struct nfsd_nfsd_oargs {
+       const char *principal;  /* GSS-API service principal name */
+       int     minthreads;     /* minimum service thread count */
+       int     maxthreads;     /* maximum service thread count */
+};
+
 /*
  * Arguments for use by the callback daemon.
  */
@@ -593,8 +620,8 @@ struct nfsrv_descript {
        NFSSOCKADDR_T           nd_nam2;        /* return socket addr */
        caddr_t                 nd_dpos;        /* Current dissect pos */
        caddr_t                 nd_bpos;        /* Current build pos */
+       u_int64_t               nd_flag;        /* nd_flag */
        u_int16_t               nd_procnum;     /* RPC # */
-       u_int32_t               nd_flag;        /* nd_flag */
        u_int32_t               nd_repstat;     /* Reply status */
        int                     *nd_errp;       /* Pointer to ret status */
        u_int32_t               nd_retxid;      /* Reply xid */
@@ -613,6 +640,8 @@ struct nfsrv_descript {
        uint32_t                nd_slotid;      /* Slotid for this RPC */
        SVCXPRT                 *nd_xprt;       /* Server RPC handle */
        uint32_t                *nd_sequence;   /* Sequence Op. ptr */
+       nfsv4stateid_t          nd_curstateid;  /* Current StateID */
+       nfsv4stateid_t          nd_savedcurstateid; /* Saved Current StateID */
 };
 
 #define        nd_princlen     nd_gssnamelen
@@ -649,6 +678,9 @@ struct nfsrv_descript {
 #define        ND_CACHETHIS            0x08000000
 #define        ND_LASTOP               0x10000000
 #define        ND_LOOPBADSESS          0x20000000
+#define        ND_DSSERVER             0x40000000
+#define        ND_CURSTATEID           0x80000000
+#define        ND_SAVEDCURSTATEID      0x100000000
 
 /*
  * ND_GSS should be the "or" of all GSS type authentications.

Modified: head/sys/fs/nfs/nfs_commonacl.c
==============================================================================
--- head/sys/fs/nfs/nfs_commonacl.c     Tue Jun 12 19:26:25 2018        (r335011)
+++ head/sys/fs/nfs/nfs_commonacl.c     Tue Jun 12 19:36:32 2018        (r335012)
@@ -450,36 +450,6 @@ nfsrv_buildacl(struct nfsrv_descript *nd, NFSACL_T *ac
 }
 
 /*
- * Set an NFSv4 acl.
- */
-APPLESTATIC int
-nfsrv_setacl(vnode_t vp, NFSACL_T *aclp, struct ucred *cred,
-    NFSPROC_T *p)
-{
-       int error;
-
-       if (nfsrv_useacl == 0 || nfs_supportsnfsv4acls(vp) == 0) {
-               error = NFSERR_ATTRNOTSUPP;
-               goto out;
-       }
-       /*
-        * With NFSv4 ACLs, chmod(2) may need to add additional entries.
-        * Make sure it has enough room for that - splitting every entry
-        * into two and appending "canonical six" entries at the end.
-        * Cribbed out of kern/vfs_acl.c - Rick M.
-        */
-       if (aclp->acl_cnt > (ACL_MAX_ENTRIES - 6) / 2) {
-               error = NFSERR_ATTRNOTSUPP;
-               goto out;
-       }
-       error = VOP_SETACL(vp, ACL_TYPE_NFS4, aclp, cred, p);
-
-out:
-       NFSEXITCODE(error);
-       return (error);
-}
-
-/*
  * Compare two NFSv4 acls.
  * Return 0 if they are the same, 1 if not the same.
  */

Modified: head/sys/fs/nfs/nfs_commonport.c
==============================================================================
--- head/sys/fs/nfs/nfs_commonport.c    Tue Jun 12 19:26:25 2018        (r335011)
+++ head/sys/fs/nfs/nfs_commonport.c    Tue Jun 12 19:36:32 2018        (r335012)
@@ -69,6 +69,9 @@ int nfscl_debuglevel = 0;
 char nfsv4_callbackaddr[INET6_ADDRSTRLEN];
 struct callout newnfsd_callout;
 int nfsrv_lughashsize = 100;
+struct mtx nfsrv_dslock_mtx;
+struct nfsdevicehead nfsrv_devidhead;
+volatile int nfsrv_devidcnt = 0;
 void (*nfsd_call_servertimer)(void) = NULL;
 void (*ncl_call_invalcaches)(struct vnode *) = NULL;
 
@@ -768,6 +771,8 @@ nfscommon_modevent(module_t mod, int type, void *data)
                mtx_init(&nfs_req_mutex, "nfs_req_mutex", NULL, MTX_DEF);
                mtx_init(&nfsrv_nfsuserdsock.nr_mtx, "nfsuserd", NULL,
                    MTX_DEF);
+               mtx_init(&nfsrv_dslock_mtx, "nfs4ds", NULL, MTX_DEF);
+               TAILQ_INIT(&nfsrv_devidhead);
                callout_init(&newnfsd_callout, 1);
                newnfs_init();
                nfsd_call_nfscommon = nfssvc_nfscommon;
@@ -794,6 +799,7 @@ nfscommon_modevent(module_t mod, int type, void *data)
                mtx_destroy(&nfs_slock_mutex);
                mtx_destroy(&nfs_req_mutex);
                mtx_destroy(&nfsrv_nfsuserdsock.nr_mtx);
+               mtx_destroy(&nfsrv_dslock_mtx);
                loaded = 0;
                break;
        default:

Modified: head/sys/fs/nfs/nfs_commonsubs.c
==============================================================================
--- head/sys/fs/nfs/nfs_commonsubs.c    Tue Jun 12 19:26:25 2018        (r335011)
+++ head/sys/fs/nfs/nfs_commonsubs.c    Tue Jun 12 19:36:32 2018        (r335012)
@@ -70,15 +70,24 @@ gid_t nfsrv_defaultgid = GID_NOGROUP;
 int nfsrv_lease = NFSRV_LEASE;
 int ncl_mbuf_mlen = MLEN;
 int nfsd_enable_stringtouid = 0;
+int nfsrv_doflexfile = 0;
 static int nfs_enable_uidtostring = 0;
 NFSNAMEIDMUTEX;
 NFSSOCKMUTEX;
 extern int nfsrv_lughashsize;
+extern struct mtx nfsrv_dslock_mtx;
+extern volatile int nfsrv_devidcnt;
+extern int nfscl_debuglevel;
+extern struct nfsdevicehead nfsrv_devidhead;
 
 SYSCTL_DECL(_vfs_nfs);
 SYSCTL_INT(_vfs_nfs, OID_AUTO, enable_uidtostring, CTLFLAG_RW,
     &nfs_enable_uidtostring, 0, "Make nfs always send numeric owner_names");
 
+int nfsrv_maxpnfsmirror = 1;
+SYSCTL_INT(_vfs_nfs, OID_AUTO, pnfsmirror, CTLFLAG_RD,
+    &nfsrv_maxpnfsmirror, 0, "Mirror level for pNFS service");
+
 /*
  * This array of structures indicates, for V4:
  * retfh - which of 3 types of calling args are used
@@ -487,7 +496,7 @@ nfsm_fhtom(struct nfsrv_descript *nd, u_int8_t *fhp, i
 {
        u_int32_t *tl;
        u_int8_t *cp;
-       int fullsiz, bytesize = 0;
+       int fullsiz, rem, bytesize = 0;
 
        if (size == 0)
                size = NFSX_MYFH;
@@ -504,6 +513,7 @@ nfsm_fhtom(struct nfsrv_descript *nd, u_int8_t *fhp, i
        case ND_NFSV3:
        case ND_NFSV4:
                fullsiz = NFSM_RNDUP(size);
+               rem = fullsiz - size;
                if (set_true) {
                    bytesize = 2 * NFSX_UNSIGNED + fullsiz;
                    NFSM_BUILD(tl, u_int32_t *, NFSX_UNSIGNED);
@@ -1768,6 +1778,40 @@ nfsv4_loadattr(struct nfsrv_descript *nd, vnode_t vp,
                        }
                        attrsum += cnt;
                        break;
+               case NFSATTRBIT_FSLAYOUTTYPE:
+               case NFSATTRBIT_LAYOUTTYPE:
+                       NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED);
+                       attrsum += NFSX_UNSIGNED;
+                       i = fxdr_unsigned(int, *tl);
+                       if (i > 0) {
+                               NFSM_DISSECT(tl, u_int32_t *, i *
+                                   NFSX_UNSIGNED);
+                               attrsum += i * NFSX_UNSIGNED;
+                               j = fxdr_unsigned(int, *tl);
+                               if (i == 1 && compare && !(*retcmpp) &&
+                                   (((nfsrv_doflexfile != 0 ||
+                                      nfsrv_maxpnfsmirror > 1) &&
+                                     j != NFSLAYOUT_FLEXFILE) ||
+                                   (nfsrv_doflexfile == 0 &&
+                                    j != NFSLAYOUT_NFSV4_1_FILES)))
+                                       *retcmpp = NFSERR_NOTSAME;
+                       }
+                       if (nfsrv_devidcnt == 0) {
+                               if (compare && !(*retcmpp) && i > 0)
+                                       *retcmpp = NFSERR_NOTSAME;
+                       } else {
+                               if (compare && !(*retcmpp) && i != 1)
+                                       *retcmpp = NFSERR_NOTSAME;
+                       }
+                       break;
+               case NFSATTRBIT_LAYOUTALIGNMENT:
+               case NFSATTRBIT_LAYOUTBLKSIZE:
+                       NFSM_DISSECT(tl, u_int32_t *, NFSX_UNSIGNED);
+                       attrsum += NFSX_UNSIGNED;
+                       i = fxdr_unsigned(int, *tl);
+                       if (compare && !(*retcmpp) && i != NFS_SRVMAXIO)
+                               *retcmpp = NFSERR_NOTSAME;
+                       break;
                default:
                        printf("EEK! nfsv4_loadattr unknown attr=%d\n",
                                bitpos);
@@ -2024,7 +2068,8 @@ APPLESTATIC int
 nfsv4_fillattr(struct nfsrv_descript *nd, struct mount *mp, vnode_t vp,
     NFSACL_T *saclp, struct vattr *vap, fhandle_t *fhp, int rderror,
     nfsattrbit_t *attrbitp, struct ucred *cred, NFSPROC_T *p, int isdgram,
-    int reterr, int supports_nfsv4acls, int at_root, uint64_t mounted_on_fileno)
+    int reterr, int supports_nfsv4acls, int at_root, uint64_t mounted_on_fileno,
+    struct statfs *pnfssf)
 {
        int bitpos, retnum = 0;
        u_int32_t *tl;
@@ -2426,25 +2471,45 @@ nfsv4_fillattr(struct nfsrv_descript *nd, struct mount
                        break;
                case NFSATTRBIT_SPACEAVAIL:
                        NFSM_BUILD(tl, u_int32_t *, NFSX_HYPER);
-                       if (priv_check_cred(cred, PRIV_VFS_BLOCKRESERVE, 0))
-                               uquad = (u_int64_t)fs->f_bfree;
+                       if (priv_check_cred(cred, PRIV_VFS_BLOCKRESERVE, 0)) {
+                               if (pnfssf != NULL)
+                                       uquad = (u_int64_t)pnfssf->f_bfree;
+                               else
+                                       uquad = (u_int64_t)fs->f_bfree;
+                       } else {
+                               if (pnfssf != NULL)
+                                       uquad = (u_int64_t)pnfssf->f_bavail;
+                               else
+                                       uquad = (u_int64_t)fs->f_bavail;
+                       }
+                       if (pnfssf != NULL)
+                               uquad *= pnfssf->f_bsize;
                        else
-                               uquad = (u_int64_t)fs->f_bavail;
-                       uquad *= fs->f_bsize;
+                               uquad *= fs->f_bsize;
                        txdr_hyper(uquad, tl);
                        retnum += NFSX_HYPER;
                        break;
                case NFSATTRBIT_SPACEFREE:
                        NFSM_BUILD(tl, u_int32_t *, NFSX_HYPER);
-                       uquad = (u_int64_t)fs->f_bfree;
-                       uquad *= fs->f_bsize;
+                       if (pnfssf != NULL) {
+                               uquad = (u_int64_t)pnfssf->f_bfree;
+                               uquad *= pnfssf->f_bsize;
+                       } else {
+                               uquad = (u_int64_t)fs->f_bfree;
+                               uquad *= fs->f_bsize;
+                       }
                        txdr_hyper(uquad, tl);
                        retnum += NFSX_HYPER;
                        break;
                case NFSATTRBIT_SPACETOTAL:
                        NFSM_BUILD(tl, u_int32_t *, NFSX_HYPER);
-                       uquad = (u_int64_t)fs->f_blocks;
-                       uquad *= fs->f_bsize;
+                       if (pnfssf != NULL) {
+                               uquad = (u_int64_t)pnfssf->f_blocks;
+                               uquad *= pnfssf->f_bsize;
+                       } else {
+                               uquad = (u_int64_t)fs->f_blocks;
+                               uquad *= fs->f_bsize;
+                       }
                        txdr_hyper(uquad, tl);
                        retnum += NFSX_HYPER;
                        break;
@@ -2514,6 +2579,33 @@ nfsv4_fillattr(struct nfsrv_descript *nd, struct mount
                        NFSCLRBIT_ATTRBIT(&attrbits, NFSATTRBIT_TIMEACCESSSET);
                        retnum += nfsrv_putattrbit(nd, &attrbits);
                        break;
+               case NFSATTRBIT_FSLAYOUTTYPE:
+               case NFSATTRBIT_LAYOUTTYPE:
+                       if (nfsrv_devidcnt == 0)
+                               siz = 1;
+                       else
+                               siz = 2;
+                       if (siz == 2) {
+                               NFSM_BUILD(tl, u_int32_t *, 2 * NFSX_UNSIGNED);
+                               *tl++ = txdr_unsigned(1);       /* One entry. */
+                               if (nfsrv_doflexfile != 0 ||
+                                   nfsrv_maxpnfsmirror > 1)
+                                       *tl = txdr_unsigned(NFSLAYOUT_FLEXFILE);
+                               else
+                                       *tl = txdr_unsigned(
+                                           NFSLAYOUT_NFSV4_1_FILES);
+                       } else {
+                               NFSM_BUILD(tl, u_int32_t *, NFSX_UNSIGNED);
+                               *tl = 0;
+                       }
+                       retnum += siz * NFSX_UNSIGNED;
+                       break;
+               case NFSATTRBIT_LAYOUTALIGNMENT:
+               case NFSATTRBIT_LAYOUTBLKSIZE:
+                       NFSM_BUILD(tl, u_int32_t *, NFSX_UNSIGNED);
+                       *tl = txdr_unsigned(NFS_SRVMAXIO);
+                       retnum += NFSX_UNSIGNED;
+                       break;
                default:
                        printf("EEK! Bad V4 attribute bitpos=%d\n", bitpos);
                }
@@ -4238,5 +4330,40 @@ nfsv4_freeslot(struct nfsclsession *sep, int slot)
        sep->nfsess_slots &= ~bitval;
        wakeup(&sep->nfsess_slots);
        mtx_unlock(&sep->nfsess_mtx);
+}
+
+/*
+ * Search for a matching pnfsd mirror device structure, base on the nmp arg.
+ * Return one if found, NULL otherwise.
+ */
+struct nfsdevice *
+nfsv4_findmirror(struct nfsmount *nmp)
+{
+       struct nfsdevice *ds, *fndds;
+       int fndmirror;
+
+       mtx_assert(NFSDDSMUTEXPTR, MA_OWNED);
+       /*
+        * Search the DS server list for a match with nmp.
+        * Remove the DS entry if found and there is a mirror.
+        */
+       fndds = NULL;
+       fndmirror = 0;
+       if (nfsrv_devidcnt == 0)
+               return (fndds);
+       TAILQ_FOREACH(ds, &nfsrv_devidhead, nfsdev_list) {
+               if (ds->nfsdev_nmp == nmp) {
+                       NFSCL_DEBUG(4, "fnd main ds\n");
+                       fndds = ds;
+               } else if (ds->nfsdev_nmp != NULL)
+                       fndmirror = 1;
+               if (fndds != NULL && fndmirror != 0)
+                       break;
+       }
+       if (fndmirror == 0) {
+               NFSCL_DEBUG(4, "no mirror for DS\n");
+               return (NULL);
+       }
+       return (fndds);
 }
 

Modified: head/sys/fs/nfs/nfs_var.h
==============================================================================
--- head/sys/fs/nfs/nfs_var.h   Tue Jun 12 19:26:25 2018        (r335011)
+++ head/sys/fs/nfs/nfs_var.h   Tue Jun 12 19:36:32 2018        (r335012)
@@ -63,6 +63,7 @@ union nethostaddr;
 struct nfsstate;
 struct nfslock;
 struct nfsclient;
+struct nfslayout;
 struct nfsdsession;
 struct nfslockconflict;
 struct nfsd_idargs;
@@ -82,6 +83,9 @@ struct nfsv4lock;
 struct nfsvattr;
 struct nfs_vattr;
 struct NFSSVCARGS;
+struct nfsdevice;
+struct pnfsdsfile;
+struct pnfsdsattr;
 #ifdef __FreeBSD__
 NFS_ACCESS_ARGS;
 NFS_OPEN_ARGS;
@@ -112,9 +116,9 @@ int nfsrv_openctrl(struct nfsrv_descript *, vnode_t,
 int nfsrv_opencheck(nfsquad_t, nfsv4stateid_t *, struct nfsstate *,
     vnode_t, struct nfsrv_descript *, NFSPROC_T *, int);
 int nfsrv_openupdate(vnode_t, struct nfsstate *, nfsquad_t,
-    nfsv4stateid_t *, struct nfsrv_descript *, NFSPROC_T *);
+    nfsv4stateid_t *, struct nfsrv_descript *, NFSPROC_T *, int *);
 int nfsrv_delegupdate(struct nfsrv_descript *, nfsquad_t, nfsv4stateid_t *,
-    vnode_t, int, struct ucred *, NFSPROC_T *);
+    vnode_t, int, struct ucred *, NFSPROC_T *, int *);
 int nfsrv_releaselckown(struct nfsstate *, nfsquad_t, NFSPROC_T *);
 void nfsrv_zapclient(struct nfsclient *, NFSPROC_T *);
 int nfssvc_idname(struct nfsd_idargs *);
@@ -131,7 +135,7 @@ int nfsrv_checksetattr(vnode_t, struct nfsrv_descript 
     nfsv4stateid_t *, struct nfsvattr *, nfsattrbit_t *, struct nfsexstuff *,
     NFSPROC_T *);
 int nfsrv_checkgetattr(struct nfsrv_descript *, vnode_t,
-    struct nfsvattr *, nfsattrbit_t *, struct ucred *, NFSPROC_T *);
+    struct nfsvattr *, nfsattrbit_t *, NFSPROC_T *);
 int nfsrv_nfsuserdport(struct sockaddr *, u_short, NFSPROC_T *);
 void nfsrv_nfsuserddelport(void);
 void nfsrv_throwawayallstate(NFSPROC_T *);
@@ -140,6 +144,30 @@ int nfsrv_checksequence(struct nfsrv_descript *, uint3
 int nfsrv_checkreclaimcomplete(struct nfsrv_descript *);
 void nfsrv_cache_session(uint8_t *, uint32_t, int, struct mbuf **);
 void nfsrv_freeallbackchannel_xprts(void);
+int nfsrv_layoutcommit(struct nfsrv_descript *, vnode_t, int, int, uint64_t,
+    uint64_t, uint64_t, int, struct timespec *, int, nfsv4stateid_t *,
+    int, char *, int *, uint64_t *, struct ucred *, NFSPROC_T *);
+int nfsrv_layoutget(struct nfsrv_descript *, vnode_t, struct nfsexstuff *,
+    int, int *, uint64_t *, uint64_t *, uint64_t, nfsv4stateid_t *, int, int *,
+    int *, char *, struct ucred *, NFSPROC_T *);
+void nfsrv_flexmirrordel(char *, NFSPROC_T *);
+void nfsrv_recalloldlayout(NFSPROC_T *);
+int nfsrv_layoutreturn(struct nfsrv_descript *, vnode_t, int, int, uint64_t,
+    uint64_t, int, int, nfsv4stateid_t *, int, uint32_t *, int *,
+    struct ucred *, NFSPROC_T *);
+int nfsrv_getdevinfo(char *, int, uint32_t *, uint32_t *, int *, char **);
+void nfsrv_freeonedevid(struct nfsdevice *);
+void nfsrv_freealllayoutsanddevids(void);
+void nfsrv_freefilelayouts(fhandle_t *);
+int nfsrv_deldsserver(char *, NFSPROC_T *);
+struct nfsdevice *nfsrv_deldsnmp(struct nfsmount *, NFSPROC_T *);
+int nfsrv_createdevids(struct nfsd_nfsd_args *, NFSPROC_T *);
+int nfsrv_checkdsattr(struct nfsrv_descript *, vnode_t, NFSPROC_T *);
+int nfsrv_copymr(vnode_t, vnode_t, vnode_t, struct nfsdevice *,
+    struct pnfsdsfile *, struct pnfsdsfile *, int, struct ucred *, NFSPROC_T *);
+int nfsrv_mdscopymr(char *, char *, char *, char *, int *, char *, NFSPROC_T *,
+    struct vnode **, struct vnode **, struct pnfsdsfile **, struct nfsdevice **,
+    struct nfsdevice **);
 
 /* nfs_nfsdserv.c */
 int nfsrvd_access(struct nfsrv_descript *, int,
@@ -240,6 +268,14 @@ int nfsrvd_destroysession(struct nfsrv_descript *, int
     vnode_t, NFSPROC_T *, struct nfsexstuff *);
 int nfsrvd_freestateid(struct nfsrv_descript *, int,
     vnode_t, NFSPROC_T *, struct nfsexstuff *);
+int nfsrvd_layoutget(struct nfsrv_descript *, int,
+    vnode_t, NFSPROC_T *, struct nfsexstuff *);
+int nfsrvd_getdevinfo(struct nfsrv_descript *, int,
+    vnode_t, NFSPROC_T *, struct nfsexstuff *);
+int nfsrvd_layoutcommit(struct nfsrv_descript *, int,
+    vnode_t, NFSPROC_T *, struct nfsexstuff *);
+int nfsrvd_layoutreturn(struct nfsrv_descript *, int,
+    vnode_t, NFSPROC_T *, struct nfsexstuff *);
 int nfsrvd_teststateid(struct nfsrv_descript *, int,
     vnode_t, NFSPROC_T *, struct nfsexstuff *);
 int nfsrvd_notsupp(struct nfsrv_descript *, int,
@@ -306,6 +342,7 @@ int nfsv4_sequencelookup(struct nfsmount *, struct nfs
     int *, uint32_t *, uint8_t *);
 void nfsv4_freeslot(struct nfsclsession *, int);
 struct ucred *nfsrv_getgrpscred(struct ucred *);
+struct nfsdevice *nfsv4_findmirror(struct nfsmount *);
 
 /* nfs_clcomsubs.c */
 void nfsm_uiombuf(struct nfsrv_descript *, struct uio *, int);
@@ -339,7 +376,7 @@ void nfsrv_wcc(struct nfsrv_descript *, int, struct nf
     struct nfsvattr *);
 int nfsv4_fillattr(struct nfsrv_descript *, struct mount *, vnode_t, NFSACL_T *,
     struct vattr *, fhandle_t *, int, nfsattrbit_t *,
-    struct ucred *, NFSPROC_T *, int, int, int, int, uint64_t);
+    struct ucred *, NFSPROC_T *, int, int, int, int, uint64_t, struct statfs *);
 void nfsrv_fillattr(struct nfsrv_descript *, struct nfsvattr *);
 void nfsrv_adj(mbuf_t, int, int);
 void nfsrv_postopattr(struct nfsrv_descript *, int, struct nfsvattr *);
@@ -387,8 +424,6 @@ int nfsrv_dissectace(struct nfsrv_descript *, struct a
     int *, int *, NFSPROC_T *);
 int nfsrv_buildacl(struct nfsrv_descript *, NFSACL_T *, enum vtype,
     NFSPROC_T *);
-int nfsrv_setacl(vnode_t, NFSACL_T *, struct ucred *,
-    NFSPROC_T *);
 int nfsrv_compareacl(NFSACL_T *, NFSACL_T *);
 
 /* nfs_clrpcops.c */
@@ -603,8 +638,8 @@ int ncl_flush(vnode_t, int, NFSPROC_T *, int, int);
 void ncl_invalcaches(vnode_t);
 
 /* nfs_nfsdport.c */
-int nfsvno_getattr(vnode_t, struct nfsvattr *, struct ucred *,
-    NFSPROC_T *, int);
+int nfsvno_getattr(vnode_t, struct nfsvattr *, struct nfsrv_descript *,
+    NFSPROC_T *, int, nfsattrbit_t *);
 int nfsvno_setattr(vnode_t, struct nfsvattr *, struct ucred *,
     NFSPROC_T *, struct nfsexstuff *);
 int nfsvno_getfh(vnode_t, fhandle_t *, NFSPROC_T *);
@@ -618,7 +653,7 @@ int nfsvno_readlink(vnode_t, struct ucred *, NFSPROC_T
     mbuf_t *, int *);
 int nfsvno_read(vnode_t, off_t, int, struct ucred *, NFSPROC_T *,
     mbuf_t *, mbuf_t *);
-int nfsvno_write(vnode_t, off_t, int, int, int, mbuf_t,
+int nfsvno_write(vnode_t, off_t, int, int, int *, mbuf_t,
     char *, struct ucred *, NFSPROC_T *);
 int nfsvno_createsub(struct nfsrv_descript *, struct nameidata *,
     vnode_t *, struct nfsvattr *, int *, int32_t *, NFSDEV_T, NFSPROC_T *,
@@ -647,7 +682,7 @@ void nfsvno_open(struct nfsrv_descript *, struct namei
     nfsv4stateid_t *, struct nfsstate *, int *, struct nfsvattr *, int32_t *,
     int, NFSACL_T *, nfsattrbit_t *, struct ucred *, NFSPROC_T *,
     struct nfsexstuff *, vnode_t *);
-int nfsvno_updfilerev(vnode_t, struct nfsvattr *, struct ucred *,
+int nfsvno_updfilerev(vnode_t, struct nfsvattr *, struct nfsrv_descript *,
     NFSPROC_T *);
 int nfsvno_fillattr(struct nfsrv_descript *, struct mount *, vnode_t,
     struct nfsvattr *, fhandle_t *, int, nfsattrbit_t *,
@@ -667,6 +702,17 @@ int nfsvno_testexp(struct nfsrv_descript *, struct nfs
 uint32_t nfsrv_hashfh(fhandle_t *);
 uint32_t nfsrv_hashsessionid(uint8_t *);
 void nfsrv_backupstable(void);
+int nfsrv_dsgetdevandfh(struct vnode *, NFSPROC_T *, int *, fhandle_t *,
+    char *);
+int nfsrv_dsgetsockmnt(struct vnode *, int, char *, int *, int *,
+    NFSPROC_T *, struct vnode **, fhandle_t *, char *, char *,
+    struct vnode **, struct nfsmount **, struct nfsmount *, int *, int *);
+int nfsrv_dscreate(struct vnode *, struct vattr *, struct vattr *,
+    fhandle_t *, struct pnfsdsfile *, struct pnfsdsattr *, char *,
+    struct ucred *, NFSPROC_T *, struct vnode **);
+int nfsrv_updatemdsattr(struct vnode *, struct nfsvattr *, NFSPROC_T *);
+void nfsrv_killrpcs(struct nfsmount *);
+int nfsrv_setacl(struct vnode *, NFSACL_T *, struct ucred *, NFSPROC_T *);
 
 /* nfs_commonkrpc.c */
 int newnfs_nmcancelreqs(struct nfsmount *);

Modified: head/sys/fs/nfs/nfsport.h
==============================================================================
--- head/sys/fs/nfs/nfsport.h   Tue Jun 12 19:26:25 2018        (r335011)
+++ head/sys/fs/nfs/nfsport.h   Tue Jun 12 19:36:32 2018        (r335012)
@@ -701,10 +701,18 @@ void nfsrvd_rcv(struct socket *, void *, int);
 #define        NFSSESSIONMUTEXPTR(s)   (&((s)->mtx))
 #define        NFSLOCKSESSION(s)       mtx_lock(&((s)->mtx))
 #define        NFSUNLOCKSESSION(s)     mtx_unlock(&((s)->mtx))
+#define        NFSLAYOUTMUTEXPTR(l)    (&((l)->mtx))
 #define        NFSLOCKLAYOUT(l)        mtx_lock(&((l)->mtx))
 #define        NFSUNLOCKLAYOUT(l)      mtx_unlock(&((l)->mtx))
+#define        NFSDDSMUTEXPTR          (&nfsrv_dslock_mtx)
 #define        NFSDDSLOCK()            mtx_lock(&nfsrv_dslock_mtx)
 #define        NFSDDSUNLOCK()          mtx_unlock(&nfsrv_dslock_mtx)
+#define        NFSDDONTLISTMUTEXPTR    (&nfsrv_dontlistlock_mtx)
+#define        NFSDDONTLISTLOCK()      mtx_lock(&nfsrv_dontlistlock_mtx)
+#define        NFSDDONTLISTUNLOCK()    mtx_unlock(&nfsrv_dontlistlock_mtx)
+#define        NFSDRECALLMUTEXPTR      (&nfsrv_recalllock_mtx)
+#define        NFSDRECALLLOCK()        mtx_lock(&nfsrv_recalllock_mtx)
+#define        NFSDRECALLUNLOCK()      mtx_unlock(&nfsrv_recalllock_mtx)
 
 /*
  * Use these macros to initialize/free a mutex.
@@ -1036,6 +1044,15 @@ struct nfsreq {
  * used in both places that call getnewvnode().
  */
 extern const char nfs_vnode_tag[];
+
+/*
+ * Check for the errors that indicate a DS should be disabled.
+ * ENXIO indicates that the krpc cannot do an RPC on the DS.
+ * EIO is returned by the RPC as an indication of I/O problems on the
+ * server.
+ * Are there other fatal errors?
+ */
+#define        nfsds_failerr(e)        ((e) == ENXIO || (e) == EIO)
 
 #endif /* _KERNEL */
 

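As a rough illustration of the nfsds_failerr() check added to nfsport.h above, here is a minimal user-space sketch. The macro is copied from the hunk. The helper ds_should_disable() is hypothetical and is not part of this commit; the real callers are in kernel code not shown in this part of the diff.

```c
#include <assert.h>	/* only needed for the example checks in the note below */
#include <errno.h>
#include <stdbool.h>

/* Copied from the nfsport.h hunk above. */
#define	nfsds_failerr(e)	((e) == ENXIO || (e) == EIO)

/*
 * Hypothetical helper (not in the commit): report whether an error
 * returned by an RPC to a DS is one that should disable that DS.
 */
static bool
ds_should_disable(int error)
{

	return (nfsds_failerr(error));
}
```

With this sketch, assert(ds_should_disable(ENXIO)) and assert(ds_should_disable(EIO)) hold, while an error such as EACCES, or a return of 0, leaves the DS enabled.
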
Modified: head/sys/fs/nfs/nfsproto.h
==============================================================================
--- head/sys/fs/nfs/nfsproto.h  Tue Jun 12 19:26:25 2018        (r335011)
+++ head/sys/fs/nfs/nfsproto.h  Tue Jun 12 19:36:32 2018        (r335012)
@@ -260,6 +260,12 @@
 #define        NFSX_V4SETTIME          (NFSX_UNSIGNED + NFSX_V4TIME)
 #define        NFSX_V4SESSIONID        16
 #define        NFSX_V4DEVICEID         16
+#define        NFSX_V4PNFSFH           (sizeof(fhandle_t) + 1)
+#define        NFSX_V4FILELAYOUT       (4 * NFSX_UNSIGNED + NFSX_V4DEVICEID +  \
+                                NFSX_HYPER + NFSM_RNDUP(NFSX_V4PNFSFH))
+#define        NFSX_V4FLEXLAYOUT(m)    (NFSX_HYPER + 3 * NFSX_UNSIGNED +       \
+    ((m) * (NFSX_V4DEVICEID + NFSX_STATEID + NFSM_RNDUP(NFSX_V4PNFSFH) +       \
+    8 * NFSX_UNSIGNED)))
 
 /* sizes common to multiple NFS versions */
 #define        NFSX_FHMAX              (NFSX_V4FHMAX)
@@ -272,6 +278,11 @@
 /* variants for multiple versions */
 #define        NFSX_STATFS(v3)         ((v3) ? NFSX_V3STATFS : NFSX_V2STATFS)
 
+/*
+ * Beware.  NFSPROC_NULL and friends are defined in
+ * <rpcsvc/nfs_prot.h> as well and the numbers are different.
+ */
+#ifndef        NFSPROC_NULL
 /* nfs rpc procedure numbers (before version mapping) */
 #define        NFSPROC_NULL            0
 #define        NFSPROC_GETATTR         1
@@ -295,6 +306,7 @@
 #define        NFSPROC_FSINFO          19
 #define        NFSPROC_PATHCONF        20
 #define        NFSPROC_COMMIT          21
+#endif /* NFSPROC_NULL */
 
 /*
  * The lower numbers -> 21 are used by NFSv2 and v3. These define higher
@@ -652,6 +664,7 @@
 /* Flags for File Layout. */
 #define        NFSFLAYUTIL_DENSE               0x1
 #define        NFSFLAYUTIL_COMMIT_THRU_MDS     0x2
+#define        NFSFLAYUTIL_STRIPE_MASK         0xffffffc0
 
 /* Flags for Flex File Layout. */
 #define        NFSFLEXFLAG_NO_LAYOUTCOMMIT     0x00000001
@@ -668,6 +681,7 @@
 #define        NFSCDFS4_BACK           0x2
 #define        NFSCDFS4_BOTH           0x3
 
+#if defined(_KERNEL) || defined(KERNEL)
 /* Conversion macros */
 #define        vtonfsv2_mode(t,m)                                              \
                txdr_unsigned(((t) == VFIFO) ? MAKEIMODE(VCHR, (m)) :   \
@@ -819,6 +833,7 @@ struct nfsv3_sattr {
        u_int32_t sa_mtimetype;
        nfstime3  sa_mtime;
 };
+#endif /* _KERNEL */
 
 /*
  * The attribute bits used for V4.
@@ -1046,7 +1061,8 @@ struct nfsv3_sattr {
        NFSATTRBM_MOUNTEDONFILEID |                                     \
        NFSATTRBM_QUOTAHARD |                                           \
        NFSATTRBM_QUOTASOFT |                                           \
-       NFSATTRBM_QUOTAUSED)
+       NFSATTRBM_QUOTAUSED |                                           \
+       NFSATTRBM_FSLAYOUTTYPE)
 
 
 #ifdef QUOTA
@@ -1062,7 +1078,11 @@ struct nfsv3_sattr {
 #define        NFSATTRBIT_SUPP1        NFSATTRBIT_S1
 #endif
 
-#define        NFSATTRBIT_SUPP2        NFSATTRBM_SUPPATTREXCLCREAT
+#define        NFSATTRBIT_SUPP2                                                \
+       (NFSATTRBM_LAYOUTTYPE |                                         \
+       NFSATTRBM_LAYOUTBLKSIZE |                                       \
+       NFSATTRBM_LAYOUTALIGNMENT |                                     \
+       NFSATTRBM_SUPPATTREXCLCREAT)
 
 /*
  * NFSATTRBIT_SUPPSETONLY is the OR of NFSATTRBIT_TIMEACCESSSET and
@@ -1378,5 +1398,15 @@ struct nfsv4stateid {
        u_int32_t       other[NFSX_STATEIDOTHER / NFSX_UNSIGNED];
 };
 typedef struct nfsv4stateid nfsv4stateid_t;
+
+/* Notify bits and notify bitmap size. */
+#define        NFSV4NOTIFY_CHANGE      1
+#define        NFSV4NOTIFY_DELETE      2
+#define        NFSV4_NOTIFYBITMAP      1       /* # of 32bit values needed for bits */
+
+/* Layoutreturn kinds. */
+#define        NFSV4LAYOUTRET_FILE     1
+#define        NFSV4LAYOUTRET_FSID     2
+#define        NFSV4LAYOUTRET_ALL      3
 
 #endif /* _NFS_NFSPROTO_H_ */

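To make the two layout size macros added to nfsproto.h above more concrete, the sketch below evaluates them in user space. Only NFSX_V4DEVICEID and the three macro bodies come from the diff. Every other value is an assumption made for this example and should be checked against the tree: NFSX_UNSIGNED is 4, NFSX_HYPER is 8, NFSX_STATEID is 16, sizeof(fhandle_t) is 28 (stood in for by the hypothetical FHANDLE_SIZE), and NFSM_RNDUP() rounds up to a multiple of 4.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values, not taken from this diff. */
#define	NFSX_UNSIGNED		4
#define	NFSX_HYPER		8
#define	NFSX_STATEID		16
#define	FHANDLE_SIZE		28	/* hypothetical stand-in for sizeof(fhandle_t) */
#define	NFSM_RNDUP(a)		(((a) + 3) & ~0x3)

/* Shown as context in the nfsproto.h hunk. */
#define	NFSX_V4DEVICEID		16

/* Same shape as the macros added by the commit. */
#define	NFSX_V4PNFSFH		(FHANDLE_SIZE + 1)
#define	NFSX_V4FILELAYOUT	(4 * NFSX_UNSIGNED + NFSX_V4DEVICEID +	\
				NFSX_HYPER + NFSM_RNDUP(NFSX_V4PNFSFH))
#define	NFSX_V4FLEXLAYOUT(m)	(NFSX_HYPER + 3 * NFSX_UNSIGNED +	\
    ((m) * (NFSX_V4DEVICEID + NFSX_STATEID + NFSM_RNDUP(NFSX_V4PNFSFH) +	\
    8 * NFSX_UNSIGNED)))

/* The rounded file handle must occupy whole XDR words. */
static_assert(NFSM_RNDUP(NFSX_V4PNFSFH) % NFSX_UNSIGNED == 0,
    "file handle is not padded to an XDR word");

/* Bytes for a File Layout, under the assumptions above. */
static size_t
file_layout_size(void)
{

	return (NFSX_V4FILELAYOUT);
}

/* Bytes for a Flex File Layout with the given number of mirrors. */
static size_t
flex_layout_size(unsigned int mirrors)
{

	return (NFSX_V4FLEXLAYOUT(mirrors));
}
```

Under these assumptions a File Layout takes 72 bytes and a Flex File Layout takes 20 + 96 * m bytes, so 212 bytes for two-way mirroring. The per-mirror term is why the flex macro takes the mirror count as an argument.
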
Modified: head/sys/fs/nfs/nfsrvstate.h
==============================================================================
--- head/sys/fs/nfs/nfsrvstate.h        Tue Jun 12 19:26:25 2018        (r335011)
+++ head/sys/fs/nfs/nfsrvstate.h        Tue Jun 12 19:36:32 2018        (r335012)
@@ -31,6 +31,7 @@
 #ifndef _NFS_NFSRVSTATE_H_
 #define        _NFS_NFSRVSTATE_H_
 
+#if defined(_KERNEL) || defined(KERNEL)
 /*
  * Definitions for NFS V4 server state handling.
  */
@@ -46,6 +47,10 @@ LIST_HEAD(nfslockhead, nfslock);
 LIST_HEAD(nfslockhashhead, nfslockfile);
 LIST_HEAD(nfssessionhead, nfsdsession);
 LIST_HEAD(nfssessionhashhead, nfsdsession);
+TAILQ_HEAD(nfslayouthead, nfslayout);
+SLIST_HEAD(nfsdsdirhead, nfsdsdir);
+TAILQ_HEAD(nfsdevicehead, nfsdevice);
+LIST_HEAD(nfsdontlisthead, nfsdontlist);
 
 /*
  * List head for nfsusrgrp.
@@ -74,6 +79,13 @@ struct nfssessionhash {
 #define        NFSSESSIONHASH(f)                                               \
        (&nfssessionhash[nfsrv_hashsessionid(f) % nfsrv_sessionhashsize])
 
+struct nfslayouthash {
+       struct mtx              mtx;
+       struct nfslayouthead    list;
+};
+#define        NFSLAYOUTHASH(f)                                                \
+       (&nfslayouthash[nfsrv_hashfh(f) % nfsrv_layouthashsize])
+
 /*
  * Client server structure for V4. It is doubly linked into two lists.
  * The first is a hash table based on the clientid and the second is a
@@ -112,6 +124,31 @@ struct nfsclient {
 #define        CLOPS_RENEWOP           0x0004
 
 /*
+ * Structure for NFSv4.1 Layouts.
+ * Malloc'd to correct size for the lay_xdr.
+ */
+struct nfslayout {
+       TAILQ_ENTRY(nfslayout)  lay_list;
+       nfsv4stateid_t          lay_stateid;
+       nfsquad_t               lay_clientid;
+       fhandle_t               lay_fh;
+       fsid_t                  lay_fsid;
+       uint32_t                lay_layoutlen;
+       uint16_t                lay_mirrorcnt;
+       uint16_t                lay_trycnt;
+       uint16_t                lay_type;
+       uint16_t                lay_flags;
+       uint32_t                lay_xdr[0];
+};
+
+/* Flags for lay_flags. */
+#define        NFSLAY_READ     0x0001
+#define        NFSLAY_RW       0x0002
+#define        NFSLAY_RECALL   0x0004
+#define        NFSLAY_RETURNED 0x0008
+#define        NFSLAY_CALLB    0x0010
+
+/*
  * Structure for an NFSv4.1 session.
  * Locking rules for this structure.
  * To add/delete one of these structures from the lists, you must lock
@@ -290,9 +327,72 @@ struct nfsf_rec {
        u_int32_t       numboots;               /* Number of boottimes */
 };
 
-#if defined(_KERNEL) || defined(KERNEL)
 void nfsrv_cleanclient(struct nfsclient *, NFSPROC_T *);
 void nfsrv_freedeleglist(struct nfsstatehead *);
-#endif
+
+/*
+ * This structure is used to create the list of device info entries for
+ * a GetDeviceInfo operation and stores the DS server info.
+ * The nfsdev_addrandhost field has the fully qualified host domain name
+ * followed by the network address in XDR.
+ * It is allocated with nfsrv_dsdirsize nfsdev_dsdir[] entries.
+ */
+struct nfsdevice {
+       TAILQ_ENTRY(nfsdevice)  nfsdev_list;
+       vnode_t                 nfsdev_dvp;
+       struct nfsmount         *nfsdev_nmp;
+       char                    nfsdev_deviceid[NFSX_V4DEVICEID];
+       uint16_t                nfsdev_hostnamelen;
+       uint16_t                nfsdev_fileaddrlen;
+       uint16_t                nfsdev_flexaddrlen;
+       char                    *nfsdev_fileaddr;
+       char                    *nfsdev_flexaddr;
+       char                    *nfsdev_host;
+       uint32_t                nfsdev_nextdir;
+       vnode_t                 nfsdev_dsdir[0];
+};
+
+/*
+ * This structure holds the va_size, va_filerev, va_atime and va_mtime for the
+ * DS file and is stored in the metadata file's extended attribute pnfsd.dsattr.
+ */
+struct pnfsdsattr {
+       uint64_t        dsa_filerev;
+       uint64_t        dsa_size;
+       struct timespec dsa_atime;
+       struct timespec dsa_mtime;
+};
+
+/*
+ * This structure is a list element for a list the pNFS server uses to
+ * mark that the recovery of a mirror file is in progress.
+ */
+struct nfsdontlist {
+       LIST_ENTRY(nfsdontlist) nfsmr_list;
+       uint32_t                nfsmr_flags;
+       fhandle_t               nfsmr_fh;
+};
+
+/* nfsmr_flags bits. */
+#define        NFSMR_DONTLAYOUT        0x00000001
+
+#endif /* defined(_KERNEL) || defined(KERNEL) */
+
+/*
+ * This structure holds the information about the DS file and is stored
+ * in the metadata file's extended attribute called pnfsd.dsfile.
+ */
+#define        PNFS_FILENAME_LEN       (2 * sizeof(fhandle_t))
+struct pnfsdsfile {
+       fhandle_t       dsf_fh;
+       uint32_t        dsf_dir;
+       union {
+               struct sockaddr_in      sin;
+               struct sockaddr_in6     sin6;
+       } dsf_nam;
+       char            dsf_filename[PNFS_FILENAME_LEN + 1];
+};
+#define        dsf_sin         dsf_nam.sin
+#define        dsf_sin6        dsf_nam.sin6
 
 #endif /* _NFS_NFSRVSTATE_H_ */

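The comment on struct nfslayout above says it is "Malloc'd to correct size for the lay_xdr", and struct nfsdevice uses the same trailing-array idiom for nfsdev_dsdir[]. The user-space sketch below shows that allocation pattern in general terms. The names struct layout_sketch and layout_alloc() are hypothetical, most fields are omitted, and the kernel code would use its own allocator rather than malloc(3).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified, hypothetical stand-in for struct nfslayout. */
struct layout_sketch {
	uint32_t	lay_layoutlen;	/* bytes of XDR data in lay_xdr */
	uint16_t	lay_mirrorcnt;
	uint32_t	lay_xdr[];	/* flexible array; the diff spells it [0] */
};

/*
 * Allocate a layout with room for len bytes of already encoded XDR
 * data after the fixed fields.  Returns NULL if allocation fails.
 */
static struct layout_sketch *
layout_alloc(const void *xdr, uint32_t len, uint16_t mirrorcnt)
{
	struct layout_sketch *lyp;

	/* XDR data is always a whole number of 32-bit words. */
	assert(len % sizeof(uint32_t) == 0);

	lyp = malloc(sizeof(*lyp) + len);
	if (lyp == NULL)
		return (NULL);
	lyp->lay_layoutlen = len;
	lyp->lay_mirrorcnt = mirrorcnt;
	if (len > 0)
		memcpy(lyp->lay_xdr, xdr, len);
	return (lyp);
}
```

A caller passes the encoded layout and its length, then releases the whole structure, trailing data included, with a single free().
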
Modified: head/sys/fs/nfsclient/nfs_clport.c
==============================================================================
--- head/sys/fs/nfsclient/nfs_clport.c  Tue Jun 12 19:26:25 2018        (r335011)
+++ head/sys/fs/nfsclient/nfs_clport.c  Tue Jun 12 19:36:32 2018        (r335012)
@@ -86,6 +86,7 @@ extern int nfs_numnfscbd;
 extern int nfscl_inited;
 struct mtx ncl_iod_mutex;
 NFSDLOCKMUTEX;
+extern struct mtx nfsrv_dslock_mtx;
 
 extern void (*ncl_call_invalcaches)(struct vnode *);
 
@@ -930,7 +931,7 @@ nfscl_fillsattr(struct nfsrv_descript *nd, struct vatt
                if (vap->va_mtime.tv_sec != VNOVAL)
                        NFSSETBIT_ATTRBIT(&attrbits, NFSATTRBIT_TIMEMODIFYSET);
                (void) nfsv4_fillattr(nd, vp->v_mount, vp, NULL, vap, NULL, 0,
-                   &attrbits, NULL, NULL, 0, 0, 0, 0, (uint64_t)0);
+                   &attrbits, NULL, NULL, 0, 0, 0, 0, (uint64_t)0, NULL);
                break;
        }
 }
@@ -1383,6 +1384,13 @@ nfssvc_nfscl(struct thread *td, struct nfssvc_args *ua
                                    0 && strcmp(mp->mnt_stat.f_fstypename,
                                    "nfs") == 0 && mp->mnt_data != NULL) {
                                        nmp = VFSTONFS(mp);
+                                       NFSDDSLOCK();
+                                       if (nfsv4_findmirror(nmp) != NULL) {
+                                               NFSDDSUNLOCK();
+                                               error = ENXIO;
+                                               nmp = NULL;
+                                               break;
+                                       }
                                        mtx_lock(&nmp->nm_mtx);
                                        if ((nmp->nm_privflag &
                                            NFSMNTP_FORCEDISM) == 0) {
@@ -1394,6 +1402,7 @@ nfssvc_nfscl(struct thread *td, struct nfssvc_args *ua
                                                mtx_unlock(&nmp->nm_mtx);
                                                nmp = NULL;
                                        }
+                                       NFSDDSUNLOCK();
                                        break;
                                }
                        }
@@ -1418,7 +1427,7 @@ nfssvc_nfscl(struct thread *td, struct nfssvc_args *ua
                                nmp->nm_privflag &= ~NFSMNTP_CANCELRPCS;
                                wakeup(nmp);
                                mtx_unlock(&nmp->nm_mtx);
-                       } else
+                       } else if (error == 0)
                                error = EINVAL;
                }
                free(buf, M_TEMP);

Modified: head/sys/fs/nfsclient/nfs_clrpcops.c
==============================================================================
--- head/sys/fs/nfsclient/nfs_clrpcops.c        Tue Jun 12 19:26:25 2018        (r335011)
+++ head/sys/fs/nfsclient/nfs_clrpcops.c        Tue Jun 12 19:36:32 2018        (r335012)
@@ -4620,7 +4620,7 @@ nfsrpc_setaclrpc(vnode_t vp, struct ucred *cred, NFSPR
        NFSZERO_ATTRBIT(&attrbits);
        NFSSETBIT_ATTRBIT(&attrbits, NFSATTRBIT_ACL);
        (void) nfsv4_fillattr(nd, vnode_mount(vp), vp, aclp, NULL, NULL, 0,
-           &attrbits, NULL, NULL, 0, 0, 0, 0, (uint64_t)0);
+           &attrbits, NULL, NULL, 0, 0, 0, 0, (uint64_t)0, NULL);
        error = nfscl_request(nd, vp, p, cred, stuff);
        if (error)
                return (error);

Modified: head/sys/fs/nfsclient/nfs_clstate.c
==============================================================================
--- head/sys/fs/nfsclient/nfs_clstate.c Tue Jun 12 19:26:25 2018        (r335011)
+++ head/sys/fs/nfsclient/nfs_clstate.c Tue Jun 12 19:36:32 2018        (r335012)
@@ -3373,7 +3373,7 @@ nfscl_docb(struct nfsrv_descript *nd, NFSPROC_T *p)
                        if (!error)
                                (void) nfsv4_fillattr(nd, NULL, NULL, NULL, &va,
                                    NULL, 0, &rattrbits, NULL, p, 0, 0, 0, 0,
-                                   (uint64_t)0);
+                                   (uint64_t)0, NULL);
                        break;
                case NFSV4OP_CBRECALL:
                        NFSCL_DEBUG(4, "cbrecall\n");

Modified: head/sys/fs/nfsclient/nfs_clvfsops.c
==============================================================================
--- head/sys/fs/nfsclient/nfs_clvfsops.c        Tue Jun 12 19:26:25 2018        (r335011)
+++ head/sys/fs/nfsclient/nfs_clvfsops.c        Tue Jun 12 19:36:32 2018        (r335012)
@@ -86,6 +86,7 @@ extern enum nfsiod_state ncl_iodwant[NFS_MAXASYNCDAEMO
 extern struct nfsmount *ncl_iodmount[NFS_MAXASYNCDAEMON];
 extern struct mtx ncl_iod_mutex;
 NFSCLSTATEMUTEX;
+extern struct mtx nfsrv_dslock_mtx;
 
 MALLOC_DEFINE(M_NEWNFSREQ, "newnfsclient_req", "NFS request header");
 MALLOC_DEFINE(M_NEWNFSMNT, "newnfsmnt", "NFS mount struct");
@@ -1672,6 +1673,7 @@ nfs_unmount(struct mount *mp, int mntflags)
        if (mntflags & MNT_FORCE)
                flags |= FORCECLOSE;
        nmp = VFSTONFS(mp);
+       error = 0;
        /*
         * Goes something like this..
         * - Call vflush() to clear out vnodes for this filesystem
@@ -1680,6 +1682,12 @@ nfs_unmount(struct mount *mp, int mntflags)
         */
        /* In the forced case, cancel any outstanding requests. */
        if (mntflags & MNT_FORCE) {
+               NFSDDSLOCK();
+               if (nfsv4_findmirror(nmp) != NULL)
+                       error = ENXIO;
+               NFSDDSUNLOCK();
+               if (error)
+                       goto out;
                error = newnfs_nmcancelreqs(nmp);
                if (error)
                        goto out;

Modified: head/sys/fs/nfsserver/nfs_nfsdkrpc.c
==============================================================================
--- head/sys/fs/nfsserver/nfs_nfsdkrpc.c        Tue Jun 12 19:26:25 2018        (r335011)
+++ head/sys/fs/nfsserver/nfs_nfsdkrpc.c        Tue Jun 12 19:36:32 2018        (r335012)
@@ -105,6 +105,7 @@ static int nfs_proc(struct nfsrv_descript *, u_int32_t
 extern u_long sb_max_adj;
 extern int newnfs_numnfsd;
 extern struct proc *nfsd_master_proc;
+extern time_t nfsdev_time;
 
 /*
  * NFS server system calls
@@ -495,6 +496,7 @@ nfsrvd_nfsd(struct thread *td, struct nfsd_nfsd_args *
         */
        NFSD_LOCK();
        if (newnfs_numnfsd == 0) {
+               nfsdev_time = time_second;
                p = td->td_proc;
                PROC_LOCK(p);
                p->p_flag2 |= P2_AST_SU;
@@ -502,31 +504,36 @@ nfsrvd_nfsd(struct thread *td, struct nfsd_nfsd_args *
                newnfs_numnfsd++;
 
                NFSD_UNLOCK();
-
-               /* An empty string implies AUTH_SYS only. */
-               if (principal[0] != '\0') {
-                       ret2 = rpc_gss_set_svc_name_call(principal,
-                           "kerberosv5", GSS_C_INDEFINITE, NFS_PROG, NFS_VER2);
-                       ret3 = rpc_gss_set_svc_name_call(principal,
-                           "kerberosv5", GSS_C_INDEFINITE, NFS_PROG, NFS_VER3);
-                       ret4 = rpc_gss_set_svc_name_call(principal,
-                           "kerberosv5", GSS_C_INDEFINITE, NFS_PROG, NFS_VER4);
-
-                       if (!ret2 || !ret3 || !ret4)
-                               printf("nfsd: can't register svc name\n");
+               error = nfsrv_createdevids(args, td);
+               if (error == 0) {
+                       /* An empty string implies AUTH_SYS only. */
+                       if (principal[0] != '\0') {
+                               ret2 = rpc_gss_set_svc_name_call(principal,
+                                   "kerberosv5", GSS_C_INDEFINITE, NFS_PROG,
+                                   NFS_VER2);
+                               ret3 = rpc_gss_set_svc_name_call(principal,
+                                   "kerberosv5", GSS_C_INDEFINITE, NFS_PROG,
+                                   NFS_VER3);
+                               ret4 = rpc_gss_set_svc_name_call(principal,
+                                   "kerberosv5", GSS_C_INDEFINITE, NFS_PROG,
+                                   NFS_VER4);
+       
+                               if (!ret2 || !ret3 || !ret4)
+                                       printf(
+                                           "nfsd: can't register svc name\n");
+                       }
+       
+                       nfsrvd_pool->sp_minthreads = args->minthreads;
+                       nfsrvd_pool->sp_maxthreads = args->maxthreads;
+                               
+                       svc_run(nfsrvd_pool);
+       
+                       if (principal[0] != '\0') {
+                               rpc_gss_clear_svc_name_call(NFS_PROG, NFS_VER2);
+                               rpc_gss_clear_svc_name_call(NFS_PROG, NFS_VER3);
+                               rpc_gss_clear_svc_name_call(NFS_PROG, NFS_VER4);
+                       }
                }
-
-               nfsrvd_pool->sp_minthreads = args->minthreads;
-               nfsrvd_pool->sp_maxthreads = args->maxthreads;
-                       
-               svc_run(nfsrvd_pool);
-
-               if (principal[0] != '\0') {
-                       rpc_gss_clear_svc_name_call(NFS_PROG, NFS_VER2);
-                       rpc_gss_clear_svc_name_call(NFS_PROG, NFS_VER3);
-                       rpc_gss_clear_svc_name_call(NFS_PROG, NFS_VER4);
-               }
-
                NFSD_LOCK();
                newnfs_numnfsd--;
                nfsrvd_init(1);
@@ -555,6 +562,7 @@ nfsrvd_init(int terminating)
        if (terminating) {
                nfsd_master_proc = NULL;
                NFSD_UNLOCK();
+               nfsrv_freealllayoutsanddevids();

*** DIFF OUTPUT TRUNCATED AT 1000 LINES ***
_______________________________________________
svn-src-head@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"