On Thu, Nov 3, 2016 at 11:34 AM, Keiviw <kei...@163.com> wrote:

> If GlusterFS does not support POSIX seekdir, what problems will users or
> GlusterFS have?
>
Glusterfs won't have any problem if we don't support seekdir. I am also not
sure whether applications have a real use case for seekdir. However, it is a
POSIX requirement.

>
> Sent from NetEase Mail Master
> On 11/03/2016 12:52, Raghavendra G <raghaven...@gluster.com> wrote:
>
> On Wed, Nov 2, 2016 at 9:38 AM, Raghavendra Gowdappa <rgowd...@redhat.com>
> wrote:
>
>>
>> ----- Original Message -----
>> > From: "Keiviw" <kei...@163.com>
>> > To: gluster-devel@gluster.org
>> > Sent: Tuesday, November 1, 2016 12:41:02 PM
>> > Subject: [Gluster-devel] A question of GlusterFS dentries!
>> >
>> > Hi,
>> > In GlusterFS distributed volumes, listing a non-empty directory was
>> > slow. Then I read the DHT code and found the reason. But I was confused
>> > that GlusterFS DHT traverses all the bricks (in the volume)
>> > sequentially. Why not use multiple threads to read dentries from
>> > multiple bricks simultaneously? That's a question that has always
>> > puzzled me. Could you please tell me something about this?
>>
>> readdir across subvols is sequential mostly because we have to support
>> rewinddir(3).
>
> Sorry. seekdir(3) is the more relevant function here. Since rewinddir
> resets the dir stream to the beginning, it's not much of a difficulty to
> support rewinddir with parallel readdirs across subvols.
>
>> We need to maintain the mapping of offset and dentry across multiple
>> invocations of readdir. In other words, if someone did a rewinddir to an
>> offset corresponding to an earlier dentry, subsequent readdirs should
>> return the same set of dentries that the earlier invocation of readdir
>> returned. For example, in a hypothetical scenario, readdir returned the
>> following dentries:
>>
>> 1. a, off=10
>> 2. b, off=2
>> 3. c, off=5
>> 4. d, off=15
>> 5. e, off=17
>> 6. f, off=13
>>
>> Now if we did rewinddir to off 5 and issued readdir again, we should get
>> the following dentries:
>> (c, off=5), (d, off=15), (e, off=17), (f, off=13)
>>
>> Within a subvol, the backend filesystem provides the rewinddir guarantee
>> for the dentries present on that subvol. However, across subvols it is
>> the responsibility of DHT to provide the above guarantee. This means we
>> should have some well-defined order in which we send readdir calls (note
>> that the order is not well defined if we do a parallel readdir across all
>> subvols). So, DHT has sequential readdir, which is a well-defined order
>> of reading dentries.
>>
>> To give an example, suppose we have another subvol, subvol2 (in addition
>> to the subvol above, say subvol1), with the following listing:
>>
>> 1. g, off=16
>> 2. h, off=20
>> 3. i, off=3
>> 4. j, off=19
>>
>> With parallel readdir we can have many orderings, like
>> (a, b, g, h, i, c, d, e, f, j), (g, h, a, b, c, i, j, d, e, f) etc. Now
>> suppose we do the following (with readdir done in parallel):
>>
>> 1. A complete listing of the directory (which can be any one of
>>    10!/(6! * 4!) = 210 orderings, assuming each subvol's own order is
>>    preserved).
>> 2. Do rewinddir (20)
>>
>> We cannot predict which set of dentries comes _after_ offset 20. However,
>> if we do a readdir sequentially across subvols there is only one
>> directory listing, i.e., (a, b, c, d, e, f, g, h, i, j). So, it's easier
>> to support rewinddir.
>>
>> If there is no POSIX requirement for rewinddir support, I think a
>> parallel readdir can easily be implemented (which improves performance
>> too). But unfortunately rewinddir is still a POSIX requirement. This also
>> opens up another possibility: a "no-rewinddir-support" option in DHT,
>> which, if enabled, results in parallel readdirs across subvols. What I am
>> not sure of is how many users still use rewinddir. If there is a critical
>> mass that wants performance with a tradeoff of no rewinddir support, this
>> can be a good feature.
>>
>> +gluster-users to get an opinion on this.
>>
>> regards,
>> Raghavendra
>
> --
> Raghavendra G

--
Raghavendra G
_______________________________________________ Gluster-devel mailing list Gluster-devel@gluster.org http://www.gluster.org/mailman/listinfo/gluster-devel
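
[Editor's illustration] The ordering argument in the thread can be checked with a small simulation. This is only a sketch under stated assumptions, not GlusterFS or DHT code: the function names (sequential_listing, interleavings, resume_from) are made up for illustration, the two listings are the hypothetical subvol1 and subvol2 examples from the thread, parallel readdir is modelled as an arbitrary interleaving that keeps each subvol's own order, and "seeking to an offset" follows the thread's convention of returning the entry with that offset and everything after it.

```python
from itertools import combinations
from math import comb

# Hypothetical per-subvol listings from the thread: (name, offset) pairs in
# the order each backend filesystem returns them.
SUBVOL1 = [("a", 10), ("b", 2), ("c", 5), ("d", 15), ("e", 17), ("f", 13)]
SUBVOL2 = [("g", 16), ("h", 20), ("i", 3), ("j", 19)]


def sequential_listing(subvols):
    """Read each subvol to the end before moving on to the next one."""
    return [entry for subvol in subvols for entry in subvol]


def interleavings(first, second):
    """Yield every merge of two listings that keeps each listing's own order.

    This models a parallel readdir at single-entry granularity (assumption).
    """
    total = len(first) + len(second)
    for slots in combinations(range(total), len(second)):
        slots = set(slots)  # positions taken by entries of `second`
        it1, it2 = iter(first), iter(second)
        yield [next(it2) if i in slots else next(it1) for i in range(total)]


def resume_from(listing, offset):
    """Names returned after seeking to `offset`, using the thread's
    convention: the entry carrying that offset, then all later entries."""
    offsets = [off for _, off in listing]
    start = offsets.index(offset)
    return [name for name, _ in listing[start:]]


seq = sequential_listing([SUBVOL1, SUBVOL2])
print(resume_from(seq, 20))  # ['h', 'i', 'j']

all_orders = list(interleavings(SUBVOL1, SUBVOL2))
print(len(all_orders), comb(10, 4))  # 210 210

# Distinct results a reader could see after seeking to offset 20 when the
# listing order is not fixed.
distinct_tails = {tuple(resume_from(order, 20)) for order in all_orders}
print(len(distinct_tails))  # 84
```

With the sequential order there is exactly one possible answer for a seek to offset 20. With the interleaved model there are 84 different answers, depending on how the earlier listing happened to be ordered, which is the reason given in the thread for keeping readdir sequential across subvols.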