GlusterFS also seems to use the d_off returned by the back-end filesystem
(ext4/xfs), with its server ID information encoded into the cookie.
As in the VFS case, this d_off is the offset of the next entry.
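
To make "d_off plus server ID encoded as a cookie" concrete, here is a
minimal sketch. The bit layout (low 8 bits for the server ID, the rest for
the back-end d_off) and the names encode_cookie/decode_cookie are
assumptions made for illustration only; this is not Gluster's actual
on-wire format.

```python
# Hypothetical layout: the low ID_BITS bits carry the server ID and the
# remaining high bits carry the back-end d_off. Illustration only.
ID_BITS = 8
ID_MASK = (1 << ID_BITS) - 1
MAX_D_OFF = (1 << (64 - ID_BITS)) - 1


def encode_cookie(d_off, server_id):
    """Pack a back-end d_off and a server ID into one 64-bit cookie."""
    if not 0 <= server_id <= ID_MASK:
        raise ValueError("server_id does not fit in ID_BITS")
    if not 0 <= d_off <= MAX_D_OFF:
        raise ValueError("d_off is too large to leave room for a server ID")
    return (d_off << ID_BITS) | server_id


def decode_cookie(cookie):
    """Split a cookie back into (d_off, server_id)."""
    return cookie >> ID_BITS, cookie & ID_MASK
```

Note that ext4 d_off values such as 0x4c470ee8300a65ab (quoted below) use
far more than 56 bits, so this naive packing rejects them. Any real scheme
has to deal with that in some way, which the sketch does not attempt.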

Thanks,
Soumya

On 03/22/2017 05:39 AM, Frank Filz wrote:
> I am having challenges getting dirent chunking to work correctly in all
> scenarios...
>
> From the client side we have the following requirements:
>
> NFS client will send a READDIR request with a whence that may be non-zero
> NFS client is returned entries, each entry has a "cookie" that may be used
> as whence on a subsequent READDIR to start fetching entries starting with
> the entry following the one the cookie was associated with
> 9P seems to have the same requirements
>
> The above matches well with lseek and getdents (which is what FSAL_VFS uses)
>
> For FSAL_RGW, we would like the cookie to be the "address" of the entry
> rather than the next entry
> Which allows us to compute the cookie for an inserted dirent (from lookup,
> create, link, or rename)
>
> For continuing to read a directory having read some number of chunks in, we
> would like to use a whence that will find the next directory entry after the
> last one in the previous chunk.
>
> Now here is one problem, for FSAL_VFS if we use the d_off as the cookie,
> that is actually the "address" of the next entry AT THAT TIME. That means
> that if we do a lseek to the last cookie in a chunk, we may NOT find the
> actual next entry. There may also be an issue due to . and .. sorting
> somewhere in the middle of the directory (at least on my ext4 filesystem,
> the "address" of . is always 0x4c470ee8300a65ab (which means that will be
> the d_off for whichever entry precedes .) and .. is always
> 0x68ec4bc2e1982399).
>
> If we aren't trying to insert dirents, that may be ok. If so, we can
> probably live with RGW cookies being the address of the entry while VFS
> cookies are the address of the current next entry, and so long as those FSALs
> which return cookie as the address of the entry, do indeed provide the NEXT
> entry when we provide that cookie as whence on readdir, everything should
> work.
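
The difference between the two cookie conventions described above can be
shown with a toy model. This is only a simulation under stated
assumptions: SimDir, its methods, and the integer "addresses" are
hypothetical stand-ins, with entries kept sorted by a fixed per-name
address (as with ext4 hashes), and nothing here calls a real FSAL.

```python
import bisect

EOD = 2**63 - 1  # stand-in "end of directory" cookie for the last entry


class SimDir:
    """Toy directory whose entries are sorted by a fixed positive address."""

    def __init__(self):
        self.addrs = []   # sorted entry addresses
        self.names = {}   # address -> name

    def insert(self, addr, name):
        bisect.insort(self.addrs, addr)
        self.names[addr] = name

    def read_next_style(self, whence=0):
        """VFS-like: each cookie is the address of the following entry.
        Resuming at whence returns entries with address >= whence."""
        start = bisect.bisect_left(self.addrs, whence)
        out = []
        for j in range(start, len(self.addrs)):
            nxt = self.addrs[j + 1] if j + 1 < len(self.addrs) else EOD
            out.append((self.names[self.addrs[j]], nxt))
        return out

    def read_addr_style(self, whence=0):
        """RGW-like: each cookie is the entry's own address.
        Resuming at whence returns entries with address > whence."""
        start = bisect.bisect_right(self.addrs, whence)
        return [(self.names[a], a) for a in self.addrs[start:]]
```

With entries "a" at 10 and "c" at 30, reading "a" in the VFS-like style
yields cookie 30. If "b" is then inserted at 20, resuming from 30 returns
only "c" and skips "b", whereas resuming from the address-style cookie 10
returns "b" and then "c".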
>
> But I'm also trying to test the dirent insert using FSAL_VFS, and it isn't
> working...
>
> The problem is an insert that becomes the new first directory entry, or an
> insert that slips in just before the . or .. entries.
>
> In order to make a workable ability to insert dirents, FSAL_VFS readdir
> COULD return the previous cookie as the cookie for an entry. In that case,
> after doing an lseek, it would just have to skip the first entry. For ext4
> it MIGHT work to actually lseek to whence+1...
>
> FSAL_VFS compute_readdir_cookie would of course just return the d_off from
> the entry prior to finding the named entry.
>
> Then one problem remains for FSAL_VFS. We can't get the actual "address" of
> the very first dirent. This could be handled by the following mechanism:
>
> If we insert a new dirent, and compute_readdir_cookie returns 0 for it, we
> must then call compute_readdir_cookie on the previous first entry (which
> will return its actual address now that it no longer is the first entry in
> the directory), and move it in the AVL tree so we can now insert the new 0.
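
The re-keying step described above might look roughly like the following
sketch. It is not FSAL_VFS code: the listing is a plain ordered list of
(name, d_off) pairs as a full readdir would return them, a dict stands in
for the AVL tree, and cache_insert is a hypothetical helper name.

```python
def compute_readdir_cookie(listing, name):
    """Return the d_off of the entry preceding name, 0 if name is the
    first entry, or None if name is not in the listing."""
    prev = 0
    for entry_name, d_off in listing:
        if entry_name == name:
            return prev
        prev = d_off
    return None


def cache_insert(cache, listing, name):
    """Insert name into cache (cookie -> name), re-keying the old first
    entry when the new entry takes over cookie 0."""
    cookie = compute_readdir_cookie(listing, name)
    if cookie is None:
        raise KeyError(name)
    if cookie == 0 and cache.get(0) not in (None, name):
        old_first = cache.pop(0)
        new_cookie = compute_readdir_cookie(listing, old_first)
        if new_cookie is not None:
            # The old first entry now has a real predecessor d_off.
            cache[new_cookie] = old_first
    cache[cookie] = name
    return cookie
```

For example, with "b" cached under cookie 0 and "c" under 30, inserting a
new first entry "a" (whose d_off is 20) moves "b" to cookie 20 and gives
"a" cookie 0.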
>
> It would really help to understand how Gluster and Ceph readdir with a
> non-zero whence actually work. How do your cookies work?
>
> How big an issue do you feel chunking possibly missing new entries in a
> directory really is? Note that if we decide our current attributes are invalid,
> refresh them, and detect mtime changes, then we will flush the dirents, so
> this MAY not be that much of an issue. On the other hand, it also means that
> even if we dump the dirent cache, a client that doesn't give up, and sends a
> non-zero whence may miss entries that folks feel it should have found.
>
> Thanks
>
> Frank
>
>
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>
