[Gluster-users] Sharding on 7.x - file sizes are wrong after a large copy.

2020-07-08 Thread Claus Jeppesen
In April of this year I reported this problem with sharding on gluster 7.4:


We're using GlusterFS in a replicated brick setup with 2 bricks with
sharding turned on (shardsize 128MB).
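
For context, sharding is enabled on the volume roughly like this (a hedged
sketch - VOL_NAME is a placeholder and the exact option values on our volume
may differ):

   # enable the shard translator and set a 128MB shard size
   gluster volume set VOL_NAME features.shard on
   gluster volume set VOL_NAME features.shard-block-size 128MB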

There is something funny going on: if we copy large VM files to the volume,
we can end up with files that are a bit larger than the source files,
DEPENDING on the speed with which we copied them - e.g.:

   dd if=SOURCE bs=1M | pv -L NNm | ssh gluster_server "dd of=/gluster/VOL_NAME/TARGET bs=1M"

It seems that if NN is <= 25 (i.e. 25 MB/s) the size of SOURCE and TARGET
will be the same.

If we crank NN up to, say, 50, we sometimes risk that a 25G file ends up with
a slightly larger size, e.g. 26844413952 or 26844233728 bytes - larger than
the expected 26843545600.
Unfortunately this is not an illusion! If we dd the files out of Gluster we
will receive the amount of data that 'ls' showed us.

In the brick directory (incl. the .shard directory) we have the expected
number of shards for a 25G file (200), each with a size of precisely 128MB -
but there is an additional zero-size shard file created.
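
To illustrate, this is roughly how we look at the shards on a brick (a sketch
with placeholder paths - not our real brick or mount paths):

   # GFID of the base file, queried on the FUSE mount
   getfattr -n glusterfs.gfid.string /mnt/VOL_NAME/TARGET
   # shards live under <brick>/.shard and are named <GFID>.1, <GFID>.2, ...
   ls -l /path/to/brick/.shard/ | grep "<GFID>"
   # count them - for our 25G file we expect 200 shards of exactly 128MB
   ls -l /path/to/brick/.shard/ | grep "<GFID>" | wc -l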

Has anyone else seen a phenomenon like this?


After upgrading to 7.6 we're still seeing this problem. The extra bytes that
appear can now be removed using truncate on the mounted gluster volume, and
md5sum confirms that after the truncate the content is identical to the
source - however, this may point to an underlying issue.
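
For the record, the workaround looks roughly like this (a sketch - the mount
point and file names are placeholders):

   # expected size, taken from the source file
   EXPECTED=$(stat -c %s SOURCE)
   # trim the spurious trailing bytes on the fuse-mounted gluster volume
   truncate -s "$EXPECTED" /mnt/VOL_NAME/TARGET
   # after the truncate the checksums match again
   md5sum SOURCE /mnt/VOL_NAME/TARGET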

I hope someone can reproduce this behaviour,

Thanx,

Claus.

-- 
*Claus Jeppesen*
Manager, Network Services
Datto, Inc.
p +45 6170 5901 | Copenhagen Office
www.datto.com






Re: [Gluster-users] "Mismatching layouts" in glusterfs client logs after new brick addition and rebalance

2020-07-08 Thread Strahil Nikolov
At least for EL 7, there are 2 modules for sosreport:
gluster & gluster_block
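
Something like this should collect just those plugins (a hedged example -
plugin names can vary between sos versions):

   sosreport -o gluster,gluster_block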

Best Regards,
Strahil Nikolov

On 8 July 2020 at 9:02:10 GMT+03:00, Artem Russakovskii wrote:
>I think it'd be extremely helpful if gluster had a feature to grab all the
>necessary logs/debug info (maybe a few variations depending on the bug) so
>that all the user would have to do is enter a simple command and have
>gluster generate the whole bug report, ready to be sent to the gluster
>team.
>
>Sincerely,
>Artem
>
>--
>Founder, Android Police , APK Mirror
>, Illogical Robot LLC
>beerpla.net | @ArtemR 
>
>
>On Tue, Jul 7, 2020 at 1:47 AM Shreyansh Shah wrote:
>
>> Sounds good, thank you.
>>
>> On Tue, Jul 7, 2020 at 2:12 PM Barak Sason Rofman wrote:
>>
>>> Thanks Shreyansh,
>>>
>>> I'll look into it, however I'll likely need some help from more senior
>>> team members to perform RCA.
>>> I'll update once I have new insights.
>>>
>>> My regards,
>>>
>>> On Tue, Jul 7, 2020 at 11:40 AM Shreyansh Shah <
>>> shreyansh.s...@alpha-grep.com> wrote:
>>>
 Hi Barak,
 Thanks for looking into this and helping me out,
 The fix-layout was successful, and I ran a rebalance after completion of
 fix-layout.
 The rebalance status though did show failure for 3 nodes.

 On Tue, Jul 7, 2020 at 2:07 PM Barak Sason Rofman wrote:

> Greetings again Shreyansh,
>
> I'm indeed seeing a lot of errors in the log file - still unsure about
> the RC.
> You mentioned that prior to running rebalance you ran fix-layout, was
> the fix-layout successful?
> Another question - did you wait until fix-layout was completed before
> running rebalance?
>
> My thanks,
>
> On Mon, Jul 6, 2020 at 9:33 PM Shreyansh Shah <
> shreyansh.s...@alpha-grep.com> wrote:
>
>> Hi,
>> Attaching rebalance logs
>> FYI, we ran "gluster rebalance fix-layout" followed by "gluster
>> rebalance" on 20200701 and today we again ran "gluster rebalance
>fix-layout"
>>
>>
>> PFA
>>
>> On Mon, Jul 6, 2020 at 11:08 PM Barak Sason Rofman <
>> bsaso...@redhat.com> wrote:
>>
>>> I think it would be best.
>>> As I can't say at this point where the problem is originating from,
>>> brick logs might also be necessary (I assume I would have a better picture
>>> once I have the rebalance logs).
>>>
>>> Cheers,
>>>
>>> On Mon, Jul 6, 2020 at 8:16 PM Shreyansh Shah <
>>> shreyansh.s...@alpha-grep.com> wrote:
>>>
 Hi Barak,
 Can provide the rebalance logs. Do you require all the brick logs
 (14 in total)?

 On Mon, Jul 6, 2020 at 10:43 PM Barak Sason Rofman <
 bsaso...@redhat.com> wrote:

> Greetings Shreyansh,
>
> Off-hand I can't come up with a reason for these failures.
> In order to start looking into this, access to the full rebalance
> logs is required (possibly brick logs as well).
> Can you provide those?
>
> My regards,
>
>
> On Mon, Jul 6, 2020 at 11:41 AM Shreyansh Shah <
> shreyansh.s...@alpha-grep.com> wrote:
>
>> Hi,
>> Did anyone get a chance to look into this?
>>
>> On Thu, Jul 2, 2020 at 8:09 PM Shreyansh Shah <
>> shreyansh.s...@alpha-grep.com> wrote:
>>
>>> Hi All,
>>>
>>> *We are facing "Mismatching layouts for ,gfid = " errors.*
>>>
>>> We have a distributed glusterfs 5.10, no replication, 2 bricks
>>> (4TB each) on each node, 7 nodes in total. We added new bricks yesterday to
>>> the existing setup.
>>> Post that we did a rebalance fix-layout and then a rebalance
>>> (which is currently still in progress). The status shows "failed" on
>>> certain bricks but "in progress" for others. Adding output for gluster
>>> rebalance status below.
>>>
>>> The glusterfs client logs are flooded with "Mismatching layouts
>>> for ,gfid = "
>>> The performance too seems to have degraded due to this; even
>>> basic commands like `cd` and `ls` are taking more than a minute, compared
>>> to sub-second times before the brick addition.
>>> Apart from that we also experienced many binaries and files
>>> giving stale file handle errors even though the files were present.
>>>
>>>
>>> *gluster rebalance status :*
>>>
>>> Node Rebalanced-files  size   scanned  failures  skipped   status  run time in h:m:s
>>> -------------------------------------------------------------------------------------
>>> localhost

Re: [Gluster-users] Problems with qemu and disperse volumes (live merge)

2020-07-08 Thread Strahil Nikolov


See my comments inline.


On 8 July 2020 at 0:46:21 GMT+03:00, Marco Fais wrote:
>Hi Strahil
>
>first of all thanks a million for your help -- really appreciate it.
>Thanks also for the pointers on the debug. I have tried it, and while I
>can't interpret the results I think I might have found something.
>
>There is a lot of information so hopefully this is relevant. During the
>snapshot creation and deletion, I can see the following errors in the
>client log:
>
>[2020-07-07 21:23:06.837381] W [MSGID: 122019]
>[ec-helpers.c:401:ec_loc_gfid_check] 0-SSD_Storage-disperse-0:
>Mismatching
>GFID's in loc
>[2020-07-07 21:23:06.837387] D [MSGID: 0]
>[defaults.c:1328:default_mknod_cbk] 0-stack-trace: stack-address:
>0x7f0dc0001a78, SSD_Storage-disperse-0 returned -1 error: Input/output
>error [Input/output error]


You have to check the brick logs for the first brick in the volume list.
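
For example (a rough sketch - the brick log file name is derived from the
brick path, so the name below is only illustrative):

   # find the first brick of the volume
   gluster volume info SSD_Storage | grep '^Brick1:'
   # on that node, the brick log lives under /var/log/glusterfs/bricks/,
   # named after the brick path with '/' replaced by '-'
   less /var/log/glusterfs/bricks/<brick-path-with-dashes>.log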


>[2020-07-07 21:23:06.837392] W [MSGID: 109002]
>[dht-rename.c:1019:dht_rename_links_create_cbk] 0-SSD_Storage-dht:
>link/file
>/8d49207e-f6b9-41d1-8d35-f6e0fb121980/images/4802e66e-a7e3-42df-a570-7155135566ad/b51133ee-54e0-4001-ab4b-9f0dc1e5c6fc.meta

Check the meta file. There was a problem with Gluster where it healed the file
before the other replica had come up (in your case it is a little bit
different). Usually only the timestamp inside the file is changed, so you can
force gluster to update it by changing the timestamp inside.
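
For example (only a sketch - the mount point prefix is a placeholder, the rest
of the path comes from the log line above; keep a backup before changing
anything):

   MNT=/path/to/oVirt/glusterSD/mount    # placeholder mount point of the storage domain
   META=$MNT/8d49207e-f6b9-41d1-8d35-f6e0fb121980/images/4802e66e-a7e3-42df-a570-7155135566ad/b51133ee-54e0-4001-ab4b-9f0dc1e5c6fc.meta
   cp "$META" "$META.bak"                # keep a copy first
   cat "$META"                           # check the timestamp field inside and adjust it if it differs from the source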


>on SSD_Storage-disperse-0 failed [Input/output error]

Already mentioned it.

>[2020-07-07 21:23:06.837850] D [MSGID: 0] [stack.h:502:copy_frame]
>0-stack:
>groups is null (ngrps: 0) [Invalid argument]
>[2020-07-07 21:23:06.839252] D [dict.c:1168:data_to_uint32]
>(-->/lib64/libglusterfs.so.0(dict_foreach_match+0x77) [0x7f0ddb1855e7]
>-->/usr/lib64/glusterfs/7.5/xlator/cluster/disperse.so(+0x384cf)
>[0x7f0dd23c54cf] -->/lib64/libglusterfs.so.0(data_to_uint32+0x8e)
>[0x7f0ddb184f2e] ) 0-dict: key null, unsigned integer type asked, has
>integer type [Invalid argument]
>[2020-07-07 21:23:06.839272] D [MSGID: 0]
>[dht-common.c:6674:dht_readdirp_cbk] 0-SSD_Storage-dht: Processing
>entries
>from SSD_Storage-disperse-0
>[2020-07-07 21:23:06.839281] D [MSGID: 0]
>[dht-common.c:6681:dht_readdirp_cbk] 0-SSD_Storage-dht:
>SSD_Storage-disperse-0: entry = ., type = 4
>[2020-07-07 21:23:06.839291] D [MSGID: 0]
>[dht-common.c:6813:dht_readdirp_cbk] 0-SSD_Storage-dht:
>SSD_Storage-disperse-0: Adding entry = .
>[2020-07-07 21:23:06.839297] D [MSGID: 0]
>[dht-common.c:6681:dht_readdirp_cbk] 0-SSD_Storage-dht:
>SSD_Storage-disperse-0: entry = .., type = 4
>[2020-07-07 21:23:06.839324] D [MSGID: 0]
>[client-rpc-fops_v2.c:2641:client4_0_lookup_cbk] 0-stack-trace:
>stack-address: 0x7f0dc0034598, SSD_Storage-client-6 returned -1 error:
>Stale file handle [Stale file handle]


I see multiple of these, but as the message level is not 'W' or 'E', I assume
they can happen and are normal.

>[2020-07-07 21:23:06.839327] D [dict.c:1800:dict_get_int32]
>(-->/usr/lib64/glusterfs/7.5/xlator/cluster/disperse.so(+0x227d6)
>[0x7f0dd23af7d6]
>-->/usr/lib64/glusterfs/7.5/xlator/cluster/disperse.so(+0x17661)
>[0x7f0dd23a4661] -->/lib64/libglusterfs.so.0(dict_get_int32+0x107)
>[0x7f0ddb186437] ) 0-dict: key glusterfs.inodelk-count, integer type
>asked,
>has unsigned integer type [Invalid argument]
>[2020-07-07 21:23:06.839361] D [MSGID: 0]
>[client-rpc-fops_v2.c:2641:client4_0_lookup_cbk] 0-stack-trace:
>stack-address: 0x7f0dc0034598, SSD_Storage-client-11 returned -1 error:
>Stale file handle [Stale file handle]
>[2020-07-07 21:23:06.839395] D [MSGID: 0]
>[client-rpc-fops_v2.c:2641:client4_0_lookup_cbk] 0-stack-trace:
>stack-address: 0x7f0dc00395a8, SSD_Storage-client-15 returned -1 error:
>Stale file handle [Stale file handle]
>[2020-07-07 21:23:06.839419] D [MSGID: 0]
>[client-rpc-fops_v2.c:2641:client4_0_lookup_cbk] 0-stack-trace:
>stack-address: 0x7f0dc0034598, SSD_Storage-client-9 returned -1 error:
>Stale file handle [Stale file handle]
>[2020-07-07 21:23:06.839473] D [MSGID: 0]
>[client-rpc-fops_v2.c:2641:client4_0_lookup_cbk] 0-stack-trace:
>stack-address: 0x7f0dc009c108, SSD_Storage-client-18 returned -1 error:
>Stale file handle [Stale file handle]
>[2020-07-07 21:23:06.839471] D [MSGID: 0]
>[client-rpc-fops_v2.c:2641:client4_0_lookup_cbk] 0-stack-trace:
>stack-address: 0x7f0dc0034598, SSD_Storage-client-10 returned -1 error:
>Stale file handle [Stale file handle]
>[2020-07-07 21:23:06.839491] D [dict.c:1800:dict_get_int32]
>(-->/usr/lib64/glusterfs/7.5/xlator/cluster/disperse.so(+0x256ad)
>[0x7f0dd23b26ad]
>-->/usr/lib64/glusterfs/7.5/xlator/cluster/disperse.so(+0x17661)
>[0x7f0dd23a4661] -->/lib64/libglusterfs.so.0(dict_get_int32+0x107)
>[0x7f0ddb186437] ) 0-dict: key glusterfs.inodelk-count, integer type
>asked,
>has unsigned integer type [Invalid argument]
>[2020-07-07 21:23:06.839512] D [MSGID: 0]
>[client-rpc-fops_v2.c:2641:client4_0_lookup_cbk] 0-stack-trace:
>stack-address: 0x7f0dc0034598, SSD_Storage-client-7 returned 

Re: [Gluster-users] "Mismatching layouts" in glusterfs client logs after new brick addition and rebalance

2020-07-08 Thread Artem Russakovskii
I think it'd be extremely helpful if gluster had a feature to grab all the
necessary logs/debug info (maybe a few variations depending on the bug) so
that all the user would have to do is enter a simple command and have
gluster generate the whole bug report, ready to be sent to the gluster
team.

Sincerely,
Artem

--
Founder, Android Police , APK Mirror
, Illogical Robot LLC
beerpla.net | @ArtemR 


On Tue, Jul 7, 2020 at 1:47 AM Shreyansh Shah 
wrote:

> Sounds good, thank you.
>
> On Tue, Jul 7, 2020 at 2:12 PM Barak Sason Rofman 
> wrote:
>
>> Thanks Shreyansh,
>>
>> I'll look into it, however I'll likely need some help from more senior
>> team members to perform RCA.
>> I'll update once I have new insights.
>>
>> My regards,
>>
>> On Tue, Jul 7, 2020 at 11:40 AM Shreyansh Shah <
>> shreyansh.s...@alpha-grep.com> wrote:
>>
>>> Hi Barak,
>>> Thanks for looking into this and helping me out,
>>> The fix-layout was successful, and I ran a rebalance after completion of
>>> fix-layout.
>>> The rebalance status though did show failure for 3 nodes.
>>>
>>> On Tue, Jul 7, 2020 at 2:07 PM Barak Sason Rofman 
>>> wrote:
>>>
 Greetings again Shreyansh,

 I'm indeed seeing a lot of errors in the log file - still unsure about
 the RC.
 You mentioned that prior to running rebalance you ran fix-layout, was
 the fix-layout successful?
 Another question - did you wait until fix-layout was completed before
 running rebalance?

 My thanks,

 On Mon, Jul 6, 2020 at 9:33 PM Shreyansh Shah <
 shreyansh.s...@alpha-grep.com> wrote:

> Hi,
> Attaching rebalance logs
> FYI, we ran "gluster rebalance fix-layout" followed by "gluster
> rebalance" on 20200701 and today we again ran "gluster rebalance 
> fix-layout"
>
>
> PFA
>
> On Mon, Jul 6, 2020 at 11:08 PM Barak Sason Rofman <
> bsaso...@redhat.com> wrote:
>
>> I think it would be best.
>> As I can't say at this point where the problem is originating from,
>> brick logs might also be necessary (I assume I would have a better 
>> picture
>> once I have the rebalance logs).
>>
>> Cheers,
>>
>> On Mon, Jul 6, 2020 at 8:16 PM Shreyansh Shah <
>> shreyansh.s...@alpha-grep.com> wrote:
>>
>>> Hi Barak,
>>> Can provide the rebalance logs. Do you require all the brick logs
>>> (14 in total)?
>>>
>>> On Mon, Jul 6, 2020 at 10:43 PM Barak Sason Rofman <
>>> bsaso...@redhat.com> wrote:
>>>
 Greetings Shreyansh,

 Off-hand I can't come up with a reason for these failures.
 In order to start looking into this, access to the full rebalance
 logs is required (possibly brick logs as well).
 Can you provide those?

 My regards,


 On Mon, Jul 6, 2020 at 11:41 AM Shreyansh Shah <
 shreyansh.s...@alpha-grep.com> wrote:

> Hi,
> Did anyone get a chance to look into this?
>
> On Thu, Jul 2, 2020 at 8:09 PM Shreyansh Shah <
> shreyansh.s...@alpha-grep.com> wrote:
>
>> Hi All,
>>
>> *We are facing "Mismatching layouts for ,gfid = "
>> errors.*
>>
>> We have a distributed glusterfs 5.10, no replication, 2 bricks
>> (4TB each) on each node, 7 nodes in total. We added new bricks 
>> yesterday to
>> the existing setup.
>> Post that we did a rebalance fix-layout and then a rebalance
>> (which is currently still in progress). The status shows "failed" on
>> certain bricks but "in progress" for others. Adding output for 
>> gluster
>> rebalance status below.
>>
>> The glusterfs client logs are flooded with "Mismatching layouts
>> for ,gfid = "
>> The performance too seems to have degraded due to this; even
>> basic commands like `cd` and `ls` are taking more than a minute, compared
>> to sub-second times before the brick addition.
>> Apart from that we also experienced many binaries and files
>> giving stale file handle errors even though the files were present.
>>
>>
>> *gluster rebalance status :*
>>
>> Node Rebalanced-files  size   scanned  failures  skipped   status  run time in h:m:s
>> ------------------------------------------------------------------------------------
>> localhost  176  3.5GB  12790  0  8552  in progress  21:36:01
>> 10.132.0.72  8232  394.8GB  19995  2126  failed  14:50:30
>> 10.132.0.44