Hi Niklas,

Out of interest, have you tried testing performance with performance.stat-prefetch enabled?
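For reference, stat-prefetch (the md-cache translator) can be toggled per volume with `gluster volume set`. A sketch, using the volume name `myvol` from the `gluster vol info` output quoted below; the md-cache-timeout tuning is an assumption for a read-heavy workload, not something from this thread:

```shell
# Enable the md-cache (stat-prefetch) translator on the volume.
gluster volume set myvol performance.stat-prefetch on

# Assumption: let cached attributes live longer than the 1s default,
# which helps repeated stat()s of the same large directory.
gluster volume set myvol performance.md-cache-timeout 60

# Verify the option took effect.
gluster volume get myvol performance.stat-prefetch
```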
--
Sam McLeod
@s_mcleod
https://smcleod.net

> On 14 Sep 2017, at 10:42 pm, Niklas Hambüchen <[email protected]> wrote:
>
> Hi,
>
> I have a gluster 3.10 volume with a dir with ~1 million small files in
> it, say mounted at /mnt/dir with FUSE, and I'm observing something weird:
>
> When I list and stat them all using rsync, the lstat() calls that
> rsync does are incredibly fast (23 microseconds per call on average,
> definitely faster than a network round trip between my 3-machine bricks
> connected via Ethernet).
> But when I try to back them up with the `bup` tool
> (https://github.com/bup/bup), which (at least according to strace) does
> the same syscalls as rsync to stat all files, it takes 1700 microseconds
> per lstat() call, and the total time to go over all files is 50x higher.
>
> These are my strace statistics:
>
> rsync strace:
>
> $ strace -f -w -c rsync -a --dry-run /mnt/dir /tmp/nonexisting
>
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  52.35   69.682773       26617      2618         1 select
>  39.60   52.712292        1056     49907           getdents
>   8.00   10.655949          11    998102           lstat
>   0.02    0.022890          12      1900           read
>   0.01    0.017219          21       829           write
>   0.01    0.012093          14       868           munmap
>   0.01    0.006656          11       606           mmap
>   0.00    0.004168        1389         3           readlink
>   0.00    0.001796        1796         1           chdir
>   0.00    0.001019         510         2           clone
>   0.00    0.000841          44        19        13 open
>
> Took ~50 seconds real time to complete.
>
> bup strace (I interrupted it after a while):
>
> strace: Process 10749 attached
> strace: Process 10750 attached
> Indexing: 25600 (566 paths/s)
> ^C
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  89.55 1140.837325        1700    671016           lstat
>   3.92   49.875376         934     53387           getdents
>   3.58   45.655985       52238       874           read
>   2.14   27.293944         789     34588      5799 open
>   0.57    7.266342         141     51384           llistxattr
>   0.09    1.090689          42     25692     25692 getxattr
>   0.06    0.780977          26     29825      1019 close
>   0.05    0.601806          23     25739     25722 ioctl
>   0.03    0.373894        1851       202           select
>   0.00    0.055953          14      3879           brk
>
> real    20m52.150s
> user    0m2.608s
> sys     0m11.456s
>
> Note that I passed `-c -w` to strace to measure the wall time spent in
> the syscalls, not the system CPU time.
>
> Using
>
>   time strace -f -c -w ls -lU /mnt/dir > /dev/null
>
> shows that ls is as fast as rsync (also ~50 seconds).
>
> (Aside: I've filed the large number of getdents() calls as
> https://bugzilla.redhat.com/show_bug.cgi?id=1478411, but that's not the
> core question of this email.)
>
> My volume info:
>
> # gluster vol info
>
> Volume Name: myvol
> Type: Replicate
> Volume ID: ...
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: 10.0.0.1:/data/glusterfs/brick
> Brick2: 10.0.0.2:/data/glusterfs/brick
> Brick3: 10.0.0.3:/data/glusterfs/brick
> Options Reconfigured:
> changelog.capture-del-path: on
> changelog.changelog: on
> storage.build-pgfid: on
> nfs.disable: on
> transport.address-family: inet
> client.ssl: on
> server.ssl: on
> storage.linux-aio: on
> performance.io-thread-count: 64
> performance.readdir-ahead: on
> server.event-threads: 32
> client.event-threads: 32
> server.outstanding-rpc-limit: 64
> cluster.lookup-unhashed: auto
> performance.flush-behind: on
> performance.strict-write-ordering: off
> performance.high-prio-threads: 64
> performance.normal-prio-threads: 64
> performance.low-prio-threads: 64
> performance.write-behind-window-size: 10MB
> cluster.ensure-durability: on
> performance.lazy-open: yes
> cluster.use-compound-fops: off
> performance.open-behind: on
> features.cache-invalidation: off
> performance.quick-read: off
> performance.read-ahead: off
> performance.stat-prefetch: off
> changelog.rollover-time: 1
> cluster.self-heal-daemon: enable
>
> Questions:
>
> What could explain why bup's lstat()s are slow while rsync's lstat()s
> are fast?
>
> Also, how come rsync's lstat()s can be faster than a network round trip
> at all, given that I have all caching disabled (e.g. stat-prefetch: off,
> cache-invalidation: off)?
>
> Is there some caching going on? Might bup issue its syscalls in an order
> that destroys this caching, while rsync and ls use a favourable order?
>
> Thanks!
> _______________________________________________
> Gluster-users mailing list
> [email protected]
> http://lists.gluster.org/mailman/listinfo/gluster-users
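One way to probe the ordering hypothesis in the questions above is to lstat() the same set of files in readdir order versus a different order and compare the per-call averages. A minimal sketch, assuming Python is available on the client; it uses a local temp directory for a self-contained demo, but on the real system you would point `target` at the FUSE mount (e.g. /mnt/dir):

```python
import os
import time
import tempfile

def avg_lstat_seconds(dirpath, names):
    """Return the average wall-clock seconds per lstat() over `names`."""
    start = time.perf_counter()
    for name in names:
        os.lstat(os.path.join(dirpath, name))
    return (time.perf_counter() - start) / len(names)

# Self-contained demo on a temp dir; replace with the FUSE mount to
# reproduce the reported numbers (assumption: 1000 files is enough to
# see an order-dependent difference locally; the real dir has ~1M).
with tempfile.TemporaryDirectory() as target:
    for i in range(1000):
        open(os.path.join(target, f"f{i}"), "w").close()

    names = os.listdir(target)        # readdir order, as rsync/ls walk it
    t_readdir = avg_lstat_seconds(target, names)
    t_sorted = avg_lstat_seconds(target, sorted(names))  # a different order

    print(f"readdir order: {t_readdir * 1e6:.1f} us/call")
    print(f"sorted order:  {t_sorted * 1e6:.1f} us/call")
```

If the sorted (or otherwise reordered) pass is consistently slower on the Gluster mount but not on a local filesystem, that would point at an order-sensitive cache on the Gluster side rather than at bup itself.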
