Hi Frank, list. Frank Filz wrote on Mon, Jul 27, 2015 at 09:33:54AM -0700: > FSAL_LUSTRE actually also uses fcntl to get POSIX locks, so it has the same > issues as FSAL_VFS.
LUSTRE would actually like one fd per open or equivalent if we can reap them efficiently, for the same reason as GPFS -- sharing a fd kills readahead behaviour and lustre needs the client to do readahead and has us disable readahead disk-side, so ganesha performs quite worse with multiple clients. Well, I guess that's close enough to what we're doing anyway. > Now I don't know about other FSALs, though I wouldn't be surprised if at > least some other FSALs might benefit from some mechanism to associate > SOMETHING with each lock owner per file. > > So I came up with the idea of the FSAL providing the size of the "thing" > (which I have currently named fsal_fd, but we can change the name if it > would make folks feel more comfortable) with each Ganesha stateid. And then > to bring NFS v3 locks and share reservations into the picture, I've made > them able to use stateids (well, state_t structures), which also cleaned up > part of the SAL lock interface (since NFS v3 can hide it's "state" value in > the state_t and stop overloading state_t *state... 9P has a state_t per fid (which will translate into per open), so that should work well for us as well. > Now each FSAL gets to define exactly what an fsal_fd actually is. At the > moment I have the open mode in the generic structure, but maybe it can move > into the FSAL private fsal_fd. > > Not all FSALs will use both open and lock fds, and we could provide separate > sizes for them so space would only be allocated when necessary (and if zero, > a NULL pointer is passed). sounds good. > For most operations, the object_handle, and both a share_fd and lock_fd are > passed. This allows the FSAL to decide exactly which ones it needs (if it > needs a generic "thing" for anonymous I/O it can stash a generic fsal_fd in > it's object_handle). They're passed with what we have and it's left to the FSAL function to open if it needs something? One of the problem we have at the moment in VFS is fstat() racing on the "generic thing", which can be closed/open in other threads. Do we leave it up to FSALs to protect their generic bits then ? (since we can't know which FSAL call will use which bit, doing locking ourselves will have to be too much for most) > The benefit of this interface is that the FSAL can leverage cache inode AVL > tree and SAL hash tables without making upcalls (that would be fraught with > locking issues) or duplicating hash tables in order to store information > entirely separately. It also may open possibilities to better hinting > between the layers so garbage collection can be improved. yay. (just need to make sure we "close" anything that has been left open on GC, so need a common level flag to say it's used maybe?) > In the meantime, the legacy interface is still available. If truly some > FSALs will never need this function, but they want to benefit from the > atomic open/create/setattr capability of FSAL open_fd method, we can make a > more generic version of that (well, actually, it just needs to be ok having > NULL passed for the fsal_fd). I think we should make FSALs so that it's always OK whatever is given, but that's a bit more work (esp. internal locking for generic fd as mentioned previously) Looks like a good draft anyway, I'm sure we can work the details. -- Dominique ------------------------------------------------------------------------------ _______________________________________________ Nfs-ganesha-devel mailing list Nfs-ganesha-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel