Hi,

Sorry for the late response. No, the eager-lock experiment was mainly to check whether the implementation had any new bugs, and it doesn't look like it does. I think having it on would be the right thing to do: it will reduce the number of fops that have to go over the network.
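
For reference, turning it back on is just the earlier command with the value flipped; the second line below is only a sketch to confirm the value took effect (`volume get` should be available on your 3.11 release):

# gluster volume set <VOL> cluster.eager-lock on
# gluster volume get <VOL> cluster.eager-lock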

Coming to the performance drop: I compared the volume profile output for the stripe and 32MB-shard runs again. The only thing that stands out is the number of xattrops and inodelks, which is only 2-4 for the striped volume but much higher for the sharded volume. This is unfortunately expected with sharding, because the eager-locking and delayed post-op optimizations now only apply on a per-shard basis. The larger the shard size, the better, as a way to work around this issue. Meanwhile, let me think about how we can get this fixed in code.
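
As a concrete sketch of that workaround, moving to the 512MB size mentioned earlier in this thread would look like this (keep in mind that the new shard size only applies to files created after the change, so the dd test should write a fresh file):

# gluster volume set <VOL> features.shard-block-size 512MB
# gluster volume get <VOL> features.shard-block-size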

-Krutika

On Mon, Jul 10, 2017 at 7:59 PM, <[email protected]> wrote:

> Hi Krutika,
>
> May I kindly ping you and ask whether you have any idea yet, or have figured out what the issue may be?
>
> I am eagerly awaiting your reply :)
>
> Apologies for the ping :)
>
> -Gencer.
>
> From: [email protected] [mailto:[email protected]] On Behalf Of [email protected]
> Sent: Thursday, July 6, 2017 11:06 AM
> To: 'Krutika Dhananjay' <[email protected]>
> Cc: 'gluster-user' <[email protected]>
> Subject: Re: [Gluster-users] Very slow performance on Sharded GlusterFS
>
> Hi Krutika,
>
> I also did one more test. I re-created another volume (a single volume; the old one was destroyed and deleted), then ran two dd tests, one for 1GB and one for 2GB. Both use a 32MB shard size with eager-lock off.
>
> Samples:
>
> sr:~# gluster volume profile testvol start
> Starting volume profile on testvol has been successful
> sr:~# dd if=/dev/zero of=/testvol/dtestfil0xb bs=1G count=1
> 1+0 records in
> 1+0 records out
> 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 12.2708 s, 87.5 MB/s
> sr:~# gluster volume profile testvol info > /32mb_shard_and_1gb_dd.log
> sr:~# gluster volume profile testvol stop
> Stopping volume profile on testvol has been successful
> sr:~# gluster volume profile testvol start
> Starting volume profile on testvol has been successful
> sr:~# dd if=/dev/zero of=/testvol/dtestfil0xb bs=1G count=2
> 2+0 records in
> 2+0 records out
> 2147483648 bytes (2.1 GB, 2.0 GiB) copied, 23.5457 s, 91.2 MB/s
> sr:~# gluster volume profile testvol info > /32mb_shard_and_2gb_dd.log
> sr:~# gluster volume profile testvol stop
> Stopping volume profile on testvol has been successful
>
> Also here is the volume info:
>
> sr:~# gluster volume info testvol
>
> Volume Name: testvol
> Type: Distributed-Replicate
> Volume ID: 3cc06d95-06e9-41f8-8b26-e997886d7ba1
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 10 x 2 = 20
> Transport-type: tcp
> Bricks:
> Brick1: sr-09-loc-50-14-18:/bricks/brick1
> Brick2: sr-10-loc-50-14-18:/bricks/brick1
> Brick3: sr-09-loc-50-14-18:/bricks/brick2
> Brick4: sr-10-loc-50-14-18:/bricks/brick2
> Brick5: sr-09-loc-50-14-18:/bricks/brick3
> Brick6: sr-10-loc-50-14-18:/bricks/brick3
> Brick7: sr-09-loc-50-14-18:/bricks/brick4
> Brick8: sr-10-loc-50-14-18:/bricks/brick4
> Brick9: sr-09-loc-50-14-18:/bricks/brick5
> Brick10: sr-10-loc-50-14-18:/bricks/brick5
> Brick11: sr-09-loc-50-14-18:/bricks/brick6
> Brick12: sr-10-loc-50-14-18:/bricks/brick6
> Brick13: sr-09-loc-50-14-18:/bricks/brick7
> Brick14: sr-10-loc-50-14-18:/bricks/brick7
> Brick15: sr-09-loc-50-14-18:/bricks/brick8
> Brick16: sr-10-loc-50-14-18:/bricks/brick8
> Brick17: sr-09-loc-50-14-18:/bricks/brick9
> Brick18: sr-10-loc-50-14-18:/bricks/brick9
> Brick19: sr-09-loc-50-14-18:/bricks/brick10
> Brick20: sr-10-loc-50-14-18:/bricks/brick10
> Options Reconfigured:
> cluster.eager-lock: off
> features.shard-block-size: 32MB
> features.shard: on
> transport.address-family: inet
> nfs.disable: on
>
> See the attached results, and sorry for the multiple e-mails. I just want to make sure that I provided the correct results for the tests.
>
> Thanks,
> Gencer.
>
> From: [email protected] [mailto:[email protected]] On Behalf Of [email protected]
> Sent: Thursday, July 6, 2017 10:34 AM
> To: 'Krutika Dhananjay' <[email protected]>
> Cc: 'gluster-user' <[email protected]>
> Subject: Re: [Gluster-users] Very slow performance on Sharded GlusterFS
>
> Krutika, I'm sorry, I forgot to add the logs. I attached them now.
>
> Thanks,
> Gencer.
>
> From: [email protected] [mailto:[email protected]] On Behalf Of [email protected]
> Sent: Thursday, July 6, 2017 10:27 AM
> To: 'Krutika Dhananjay' <[email protected]>
> Cc: 'gluster-user' <[email protected]>
> Subject: Re: [Gluster-users] Very slow performance on Sharded GlusterFS
>
> Hi Krutika,
>
> After that setting:
>
> $ dd if=/dev/zero of=/mnt/ddfile bs=1G count=1
> 1+0 records in
> 1+0 records out
> 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 11.7351 s, 91.5 MB/s
>
> $ dd if=/dev/zero of=/mnt/ddfile2 bs=2G count=1
> 0+1 records in
> 0+1 records out
> 2147479552 bytes (2.1 GB, 2.0 GiB) copied, 23.7351 s, 90.5 MB/s
>
> $ dd if=/dev/zero of=/mnt/ddfile3 bs=1G count=1
> 1+0 records in
> 1+0 records out
> 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 12.1202 s, 88.6 MB/s
>
> $ dd if=/dev/zero of=/mnt/ddfile4 bs=1G count=2
> 2+0 records in
> 2+0 records out
> 2147483648 bytes (2.1 GB, 2.0 GiB) copied, 24.7695 s, 86.7 MB/s
>
> I see an improvement (from 70-75 MB/s to 90-100 MB/s) after turning eager-lock off. I am also monitoring the bandwidth between the two nodes; I see up to 102 MB/s.
>
> Is there anything I can do to optimize further, or is this the last stop?
>
> Note: I deleted all the files again, reformatted, re-created the volume with shard enabled and then mounted it. I tried 16MB, 32MB and 512MB shard sizes. The results are the same.
>
> Thanks,
> Gencer.
>
> From: Krutika Dhananjay [mailto:[email protected]]
> Sent: Thursday, July 6, 2017 3:30 AM
> To: [email protected]
> Cc: gluster-user <[email protected]>
> Subject: Re: [Gluster-users] Very slow performance on Sharded GlusterFS
>
> What if you disabled eager-lock and ran your test again on the sharded configuration, along with the profile output?
>
> # gluster volume set <VOL> cluster.eager-lock off
>
> -Krutika
>
> On Tue, Jul 4, 2017 at 9:03 PM, Krutika Dhananjay <[email protected]> wrote:
>
> Thanks. I think reusing the same volume was the cause of the lack of IO distribution.
>
> The latest profile output looks much more realistic and in line with what I would expect.
>
> Let me analyse the numbers a bit and get back.
>
> -Krutika
>
> On Tue, Jul 4, 2017 at 12:55 PM, <[email protected]> wrote:
>
> Hi Krutika,
>
> Thank you so much for your reply. Let me answer all:
>
> 1. I have no idea why it did not get distributed over all bricks.
> 2. Hm.. This is really weird.
>
> And the others:
>
> No. I use only one volume.
> When I tested sharded and striped volumes, I manually stopped the volume, deleted it, purged the data (the data inside the bricks/disks) and re-created it using this command:
>
> sudo gluster volume create testvol replica 2
>     sr-09-loc-50-14-18:/bricks/brick1 sr-10-loc-50-14-18:/bricks/brick1
>     sr-09-loc-50-14-18:/bricks/brick2 sr-10-loc-50-14-18:/bricks/brick2
>     sr-09-loc-50-14-18:/bricks/brick3 sr-10-loc-50-14-18:/bricks/brick3
>     sr-09-loc-50-14-18:/bricks/brick4 sr-10-loc-50-14-18:/bricks/brick4
>     sr-09-loc-50-14-18:/bricks/brick5 sr-10-loc-50-14-18:/bricks/brick5
>     sr-09-loc-50-14-18:/bricks/brick6 sr-10-loc-50-14-18:/bricks/brick6
>     sr-09-loc-50-14-18:/bricks/brick7 sr-10-loc-50-14-18:/bricks/brick7
>     sr-09-loc-50-14-18:/bricks/brick8 sr-10-loc-50-14-18:/bricks/brick8
>     sr-09-loc-50-14-18:/bricks/brick9 sr-10-loc-50-14-18:/bricks/brick9
>     sr-09-loc-50-14-18:/bricks/brick10 sr-10-loc-50-14-18:/bricks/brick10 force
>
> and of course after that, volume start was executed. If shard is to be used, I enable that feature BEFORE I start the sharded volume and then mount it.
>
> I tried converting from one type to the other, but then I saw that the documentation says a clean volume should be better. So I tried the clean method. Still the same performance.
>
> The test file grows from 1GB to 5GB, and the tests are dd. See this example:
>
> dd if=/dev/zero of=/mnt/testfile bs=1G count=5
> 5+0 records in
> 5+0 records out
> 5368709120 bytes (5.4 GB, 5.0 GiB) copied, 66.7978 s, 80.4 MB/s
>
> >> dd if=/dev/zero of=/mnt/testfile bs=5G count=1
> This also gives the same result (bs and count reversed).
>
> This example also generated a profile, which I attached to this e-mail.
>
> Is there anything that I can try? I am open to all kinds of suggestions.
>
> Thanks,
> Gencer.
>
> From: Krutika Dhananjay [mailto:[email protected]]
> Sent: Tuesday, July 4, 2017 9:39 AM
> To: [email protected]
> Cc: gluster-user <[email protected]>
> Subject: Re: [Gluster-users] Very slow performance on Sharded GlusterFS
>
> Hi Gencer,
>
> I just checked the volume-profile attachments.
>
> Things that seem really odd to me as far as the sharded volume is concerned:
>
> 1. Only the replica pair having bricks 5 and 6 on both nodes 09 and 10 seems to have witnessed all the IO. No other bricks witnessed any write operations. This is unacceptable for a volume that has 8 other replica sets. Why didn't the shards get distributed across all of these sets?
>
> 2. For the replica set consisting of bricks 5 and 6 of node 09, I see that brick 5 is spending 99% of its time in the FINODELK fop, when the fop that should have dominated its profile is in fact WRITE.
>
> Could you throw some more light on your setup from a gluster standpoint?
>
> * For instance, are you using two different gluster volumes to gather these numbers - one distributed-replicated-striped and another distributed-replicated-sharded? Or are you merely converting a single volume from one type to another?
>
> * And if there are indeed two volumes, could you share both their `volume info` outputs to eliminate any confusion?
>
> * If there's just one volume, are you taking care to remove all data from the mount point of this volume before converting it?
>
> * What size did the test file grow to?
>
> * Are the attached profiles from dd runs or from the file-download test?
>
> -Krutika
>
> On Mon, Jul 3, 2017 at 8:42 PM, <[email protected]> wrote:
>
> Hi Krutika,
>
> Have you been able to look at my profiles? Do you have any clue, idea or suggestion?
>
> Thanks,
> -Gencer
>
> From: Krutika Dhananjay [mailto:[email protected]]
> Sent: Friday, June 30, 2017 3:50 PM
> To: [email protected]
> Cc: gluster-user <[email protected]>
> Subject: Re: [Gluster-users] Very slow performance on Sharded GlusterFS
>
> Just noticed that the way you have configured your brick order during volume-create makes both replicas of every set reside on the same machine.
>
> That apart, do you see any difference if you change shard-block-size to 512MB? Could you try that?
>
> If it doesn't help, could you share the volume-profile output for both tests (separately)?
>
> Here's what you do:
>
> 1. Start profiling before starting your test - it could be dd or it could be a file download.
>
> # gluster volume profile <VOL> start
>
> 2. Run your test - again, either dd or the file download.
>
> 3. Once the test has completed, run `gluster volume profile <VOL> info` and redirect its output to a tmp file.
>
> 4. Stop profiling:
>
> # gluster volume profile <VOL> stop
>
> And attach the volume-profile output file that you saved at a temporary location in step 3.
>
> -Krutika
>
> On Fri, Jun 30, 2017 at 5:33 PM, <[email protected]> wrote:
>
> Hi Krutika,
>
> Sure, here is the volume info:
>
> root@sr-09-loc-50-14-18:/# gluster volume info testvol
>
> Volume Name: testvol
> Type: Distributed-Replicate
> Volume ID: 30426017-59d5-4091-b6bc-279a905b704a
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 10 x 2 = 20
> Transport-type: tcp
> Bricks:
> Brick1: sr-09-loc-50-14-18:/bricks/brick1
> Brick2: sr-09-loc-50-14-18:/bricks/brick2
> Brick3: sr-09-loc-50-14-18:/bricks/brick3
> Brick4: sr-09-loc-50-14-18:/bricks/brick4
> Brick5: sr-09-loc-50-14-18:/bricks/brick5
> Brick6: sr-09-loc-50-14-18:/bricks/brick6
> Brick7: sr-09-loc-50-14-18:/bricks/brick7
> Brick8: sr-09-loc-50-14-18:/bricks/brick8
> Brick9: sr-09-loc-50-14-18:/bricks/brick9
> Brick10: sr-09-loc-50-14-18:/bricks/brick10
> Brick11: sr-10-loc-50-14-18:/bricks/brick1
> Brick12: sr-10-loc-50-14-18:/bricks/brick2
> Brick13: sr-10-loc-50-14-18:/bricks/brick3
> Brick14: sr-10-loc-50-14-18:/bricks/brick4
> Brick15: sr-10-loc-50-14-18:/bricks/brick5
> Brick16: sr-10-loc-50-14-18:/bricks/brick6
> Brick17: sr-10-loc-50-14-18:/bricks/brick7
> Brick18: sr-10-loc-50-14-18:/bricks/brick8
> Brick19: sr-10-loc-50-14-18:/bricks/brick9
> Brick20: sr-10-loc-50-14-18:/bricks/brick10
> Options Reconfigured:
> features.shard-block-size: 32MB
> features.shard: on
> transport.address-family: inet
> nfs.disable: on
>
> -Gencer.
>
> From: Krutika Dhananjay [mailto:[email protected]]
> Sent: Friday, June 30, 2017 2:50 PM
> To: [email protected]
> Cc: gluster-user <[email protected]>
> Subject: Re: [Gluster-users] Very slow performance on Sharded GlusterFS
>
> Could you please provide the volume-info output?
>
> -Krutika
>
> On Fri, Jun 30, 2017 at 4:23 PM, <[email protected]> wrote:
>
> Hi,
>
> I have 2 nodes with 20 bricks in total (10+10).
>
> First test:
>
> 2 nodes with Distributed-Striped-Replicated (2 x 2)
> 10GbE speed between nodes
>
> "dd" performance: 400 MB/s and higher
> Downloading a large file from the internet directly onto the gluster mount: 250-300 MB/s
>
> Now the same test without stripe but with sharding. The results are the same whether I set the shard size to 4MB or 32MB. (Again 2x replica here.)
>
> dd performance: 70 MB/s
> Download directly onto the gluster mount: 60 MB/s
>
> Now, if we run this test twice at the same time (two dd runs or two downloads at the same time), it drops to below 25 MB/s each, or slower.
>
> I thought sharding would be at least equal, or maybe a little slower, but these results are terribly slow.
>
> I tried tuning (cache, window-size, etc.). Nothing helps.
>
> GlusterFS 3.11 and Debian 9 are used. The kernel is also tuned. The disks are xfs, 4TB each.
>
> Is there any tweak/tuning out there to make it fast?
>
> Or is this expected behavior? If it is, it is unacceptable; it is so slow that I cannot use this in production.
>
> The reason I use shard instead of stripe is that I would like to eliminate files that are bigger than the brick size.
>
> Thanks,
> Gencer.

_______________________________________________
Gluster-users mailing list
[email protected]
http://lists.gluster.org/mailman/listinfo/gluster-users
