What is rewinddir() used for? In other words, in what situations do we use rewinddir()?
At 2016-11-02 12:08:46, "Raghavendra Gowdappa" <rgowd...@redhat.com> wrote:
>
>
>----- Original Message -----
>> From: "Keiviw" <kei...@163.com>
>> To: gluster-devel@gluster.org
>> Sent: Tuesday, November 1, 2016 12:41:02 PM
>> Subject: [Gluster-devel] A question of GlusterFS dentries!
>>
>> Hi,
>> In GlusterFS distributed volumes, listing a non-empty directory was slow.
>> Then I read the dht code and found the reason. But I was confused that
>> GlusterFS dht traversed all the bricks (in the volume) sequentially; why not
>> use multiple threads to read dentries from multiple bricks simultaneously?
>> That's a question that has always puzzled me. Could you please tell me
>> something about this?
>
>readdir across subvols is sequential mostly because we have to support
>rewinddir(3). We need to maintain the mapping of offset and dentry across
>multiple invocations of readdir. In other words, if someone did a rewinddir to
>an offset corresponding to an earlier dentry, subsequent readdirs should return
>the same set of dentries that the earlier invocation of readdir returned. For
>example, in a hypothetical scenario, readdir returned the following dentries:
>
>1. a, off=10
>2. b, off=2
>3. c, off=5
>4. d, off=15
>5. e, off=17
>6. f, off=13
>
>Now if we did rewinddir to off 5 and issued readdir again, we should get the
>following dentries:
>(c, off=5), (d, off=15), (e, off=17), (f, off=13)
>
>Within a subvol, the backend filesystem provides the rewinddir guarantee for
>the dentries present on that subvol. However, across subvols it is the
>responsibility of DHT to provide the above guarantee. This means we should have
>some well-defined order in which we send readdir calls (note that the order is
>not well defined if we do a parallel readdir across all subvols). So, DHT has
>sequential readdir, which is a well-defined order of reading dentries.
>
>To give an example, if we have another subvol - subvol2 - (in addition to the
>subvol above - say subvol1) with the following listing:
>1. g, off=16
>2. h, off=20
>3. i, off=3
>4. j, off=19
>
>With parallel readdir we can have many orderings like - (a, b, g, h, i, c, d,
>e, f, j), (g, h, a, b, c, i, j, d, e, f) etc. Now if we do (with readdir done
>in parallel):
>
>1. A complete listing of the directory (which can be any one of 10P1 = 10 ways
>- I hope math is correct here).
>2. Do rewinddir (20)
>
>We cannot predict what set of dentries comes _after_ offset 20.
>However, if we do a readdir sequentially across subvols there is only one
>directory listing, i.e., (a, b, c, d, e, f, g, h, i, j). So, it's easier to
>support rewinddir.
>
>If there were no POSIX requirement for rewinddir support, I think a parallel
>readdir could easily be implemented (which improves performance too). But
>unfortunately rewinddir is still a POSIX requirement. This also opens up
>another possibility of a "no-rewinddir-support" option in DHT, which, if
>enabled, results in parallel readdirs across subvols. What I am not sure of is
>how many users still use rewinddir. If there is a critical mass that wants
>performance with a tradeoff of no rewinddir support, this can be a good feature.
>
>+gluster-users to get an opinion on this.
>
>regards,
>Raghavendra
>
>>
>> _______________________________________________
>> Gluster-devel mailing list
>> Gluster-devel@gluster.org
>> http://www.gluster.org/mailman/listinfo/gluster-devel
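
To make the ordering argument in the quoted mail concrete, here is a small Python sketch. It is only a model, not GlusterFS code: the two listings and their offsets are the hypothetical ones from the mail, and the helper names (sequential_listing, interleavings, resume_from) are made up for illustration. It assumes a parallel readdir still preserves each subvol's own order, and it follows the mail's convention that resuming at an offset returns the entry carrying that offset. Note that in POSIX, rewinddir(3) takes no offset and resets the stream to the start; resuming at a saved offset is done with telldir(3) and seekdir(3), which is what the example in the mail describes.

```python
from itertools import combinations
from math import comb

# Hypothetical listings copied from the example in the quoted mail.
SUBVOL1 = [("a", 10), ("b", 2), ("c", 5), ("d", 15), ("e", 17), ("f", 13)]
SUBVOL2 = [("g", 16), ("h", 20), ("i", 3), ("j", 19)]


def sequential_listing(subvols):
    """Read each subvol to the end before moving on to the next one."""
    return [entry for subvol in subvols for entry in subvol]


def interleavings(first, second):
    """Yield every merge of two listings that keeps each listing's own order."""
    total = len(first) + len(second)
    # Choose which positions of the merged listing are taken by `second`.
    for slots in combinations(range(total), len(second)):
        slot_set = set(slots)
        it_first, it_second = iter(first), iter(second)
        yield [
            next(it_second) if i in slot_set else next(it_first)
            for i in range(total)
        ]


def resume_from(listing, offset):
    """Names returned from the entry carrying `offset` onward (inclusive)."""
    for index, (_name, off) in enumerate(listing):
        if off == offset:
            return [name for name, _off in listing[index:]]
    raise ValueError(f"offset {offset} not found")


if __name__ == "__main__":
    sequential = sequential_listing([SUBVOL1, SUBVOL2])
    print("sequential, resume at 20:", resume_from(sequential, 20))

    merges = list(interleavings(SUBVOL1, SUBVOL2))
    tails = {tuple(resume_from(merge, 20)) for merge in merges}
    print("order-preserving merges:", len(merges), "expected:", comb(10, 4))
    print("distinct results when resuming at 20:", len(tails))
```

With the sequential order there is exactly one listing, so resuming at a given offset always yields the same tail. Under the order-preserving assumption, a parallel read of these two subvols can produce C(10, 4) = 210 different merged listings, and the entries that follow a given offset differ from one merge to another.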