GlusterFS also seems to use the d_off returned by the back-end filesystem (ext4/xfs) as its cookie, but with its server ID information encoded into it. As with VFS, this d_off is the offset of the next entry.
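
To make the idea of "d_off plus server ID in one cookie" concrete, here is a rough sketch. It is not taken from the GlusterFS sources: the bit layout, the 8-bit width, and the names cookie_encode, cookie_server and cookie_backend_off are all made up for illustration. It also assumes the back end tolerates having the low bits of its d_off cleared, which may not hold for every filesystem.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: the low SERVER_BITS bits of the 64-bit cookie carry
 * the index of the server (brick) that produced the entry; the remaining
 * upper bits carry the back-end d_off. The low bits of the back-end d_off
 * are dropped, so this encoding is lossy by design. */
#define SERVER_BITS 8u
#define SERVER_MASK ((UINT64_C(1) << SERVER_BITS) - 1)

/* Combine a back-end d_off and a server index into one cookie. */
static inline uint64_t cookie_encode(uint64_t backend_d_off, unsigned server_id)
{
    assert(server_id <= SERVER_MASK);
    return (backend_d_off & ~SERVER_MASK) | (uint64_t)server_id;
}

/* Which server should a READDIR with this whence be sent to? */
static inline unsigned cookie_server(uint64_t cookie)
{
    return (unsigned)(cookie & SERVER_MASK);
}

/* The (truncated) offset to hand back to that server's filesystem. */
static inline uint64_t cookie_backend_off(uint64_t cookie)
{
    return cookie & ~SERVER_MASK;
}
```

With a layout like this, a non-zero whence from the client tells us both which server to resume on and roughly where in that server's directory to seek, which is why the cookie still behaves like the offset of the next entry.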

Thanks,
Soumya

On 03/22/2017 05:39 AM, Frank Filz wrote:
> I am having challenges getting dirent chunking to work correctly in all
> scenarios...
>
> From the client side we have the following requirements:
>
> NFS client will send a READDIR request with a whence that may be non-zero
> NFS client is returned entries, each entry has a "cookie" that may be used
> as whence on a subsequent READDIR to start fetching entries starting with
> the entry following the one the cookie was associated with
> 9P seems to have the same requirements
>
> The above matches well with lseek and getdents (which is what FSAL_VFS uses)
>
> For FSAL_RGW, we would like the cookie to be the "address" of the entry
> rather than the next entry
> Which allows us to compute the cookie for an inserted dirent (from lookup,
> create, link, or rename)
>
> For continuing to read a directory having read some number of chunks in, we
> would like to use a whence that will find the next directory entry after the
> last one in the previous chunk.
>
> Now here is one problem, for FSAL_VFS if we use the d_off as the cookie,
> that is actually the "address" of the next entry AT THAT TIME. That means
> that if we do a lseek to the last cookie in a chunk, we may NOT find the
> actual next entry. There may also be an issue due to . and .. sorting
> somewhere in the middle of the directory (at least on my ext4 filesystem,
> the "address" of . is always 0x4c470ee8300a65ab (which means that will be
> the d_off for whichever entry precedes .) and .. is always
> 0x68ec4bc2e1982399).
>
> If we aren't trying to insert dirents, that may be ok. If so, we can
> probably live with RGW cookies being the address of the entry while VFS
> cookies are the address of the current next entry, and so long as those FSALs
> which return cookie as the address of the entry, do indeed provide the NEXT
> entry when we provide that cookie as whence on readdir, everything should
> work.
>
> But I'm also trying to test the dirent insert using FSAL_VFS, and it isn't
> working...
>
> The problem is an insert that becomes the new first directory entry, or an
> insert that slips in just before the . or .. entries.
>
> In order to make a workable ability to insert dirents, FSAL_VFS readdir
> COULD return the previous cookie as the cookie for an entry. In that case,
> after doing an lseek, it would just have to skip the first entry. For ext4
> it MIGHT work to actually lseek to whence+1...
>
> FSAL_VFS compute_readdir_cookie would of course just return the d_off from
> the entry prior to finding the named entry.
>
> Then one problem remains for FSAL_VFS. We can't get the actual "address" of
> the very first dirent. This could be handled by the following mechanism:
>
> If we insert a new dirent, and compute_readdir_cookie returns 0 for it, we
> must then call compute_readdir_cookie on the previous first entry (which
> will return its actual address now that it no longer is the first entry in
> the directory), and move it in the AVL tree so we can now insert the new 0.
>
> It would really help to understand how Gluster and Ceph readdir with a
> non-zero whence actually works. How do your cookies work?
>
> How serious do you feel the problem of chunking possibly missing new entries
> in a directory really is? Note that if we decide our current attributes are
> invalid, refresh them, and detect mtime changes, then we will flush the
> dirents, so this MAY not be that much of an issue. On the other hand, it also
> means that even if we dump the dirent cache, a client that doesn't give up,
> and sends a non-zero whence may miss entries that folks feel it should have
> found.
>
> Thanks
>
> Frank

_______________________________________________
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel