Re: [Gluster-users] gluster 5.3: transport endpoint gets disconnected - Assertion failed: GF_MEM_TRAILER_MAGIC

2019-02-06 Thread Nithya Balachandran
On Wed, 6 Feb 2019 at 14:34, Hu Bert  wrote:

> Hi there,
>
> just curious - from man mount.glusterfs:
>
>lru-limit=N
>  Set fuse module's limit for number of inodes kept in LRU
> list to N [default: 0]
>

Sorry, that is a bug in the man page and we will fix that. The current
default is 131072:
{
    .key = {"lru-limit"},
    .type = GF_OPTION_TYPE_INT,
    .default_value = "131072",
    .min = 0,
    .description = "makes glusterfs invalidate kernel inodes after "
                   "reaching this limit (0 means 'unlimited')",
},
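
If you do want to set it explicitly anyway, it can be passed as a mount
option (the server and volume name below are just placeholders), e.g.:

mount -t glusterfs -o lru-limit=0 server1:/volname /shared/private

or, I believe, with the equivalent /etc/fstab entry:

server1:/volname /shared/private glusterfs defaults,lru-limit=0 0 0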




>
> This seems to be the default already? Set it explicitly?
>
> Regards,
> Hubert
>
> On Wed, 6 Feb 2019 at 09:26, Nithya Balachandran wrote:
> >
> > Hi,
> >
> > The client logs indicate that the mount process has crashed.
> > Please try mounting the volume with the volume option lru-limit=0 and
> see if it still crashes.
> >
> > Thanks,
> > Nithya
> >
> > On Thu, 24 Jan 2019 at 12:47, Hu Bert  wrote:
> >>
> >> Good morning,
> >>
> >> We are currently transferring some data to a new glusterfs volume; to
> >> check the throughput of the new volume/setup while the transfer is
> >> running, I decided to create some files on one of the gluster servers
> >> with dd in a loop:
> >>
> >> while true; do dd if=/dev/urandom of=/shared/private/1G.file bs=1M
> >> count=1024; rm /shared/private/1G.file; done
> >>
> >> /shared/private is the mount point of the glusterfs volume. The dd
> >> should run for about an hour. But now it happened twice that during
> >> this loop the transport endpoint gets disconnected:
> >>
> >> dd: failed to open '/shared/private/1G.file': Transport endpoint is
> >> not connected
> >> rm: cannot remove '/shared/private/1G.file': Transport endpoint is not
> connected
> >>
> >> In the /var/log/glusterfs/shared-private.log I see:
> >>
> >> [2019-01-24 07:03:28.938745] W [MSGID: 108001]
> >> [afr-transaction.c:1062:afr_handle_quorum] 0-persistent-replicate-0:
> >> 7212652e-c437-426c-a0a9-a47f5972fffe: Failing WRITE as quorum is
> >> not met [Transport endpoint is not connected]
> >> [2019-01-24 07:03:28.939280] E [mem-pool.c:331:__gf_free]
> >> (-->/usr/lib/x86_64-linux-gnu/glusterfs/5.3/xlator/cluster/replicate.so(+0x5be8c)
> >> [0x7eff84248e8c]
> >> -->/usr/lib/x86_64-linux-gnu/glusterfs/5.3/xlator/cluster/replicate.so(+0x5be18)
> >> [0x7eff84248e18]
> >> -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(__gf_free+0xf6)
> >> [0x7eff8a9485a6] ) 0-: Assertion failed:
> >> GF_MEM_TRAILER_MAGIC == *(uint32_t *)((char *)free_ptr + header->size)
> >> [snip]
> >>
> >> The whole output can be found here: https://pastebin.com/qTMmFxx0
> >> gluster volume info here: https://pastebin.com/ENTWZ7j3
> >>
> >> After umount + mount the transport endpoint is connected again - until
> >> the next disconnect. A /core file gets generated. Maybe someone wants
> >> to have a look at this file?
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] gluster 5.3: transport endpoint gets disconnected - Assertion failed: GF_MEM_TRAILER_MAGIC

2019-02-06 Thread Hu Bert
Hi there,

just curious - from man mount.glusterfs:

   lru-limit=N
 Set fuse module's limit for number of inodes kept in LRU
list to N [default: 0]

This seems to be the default already? Set it explicitly?

Regards,
Hubert

On Wed, 6 Feb 2019 at 09:26, Nithya Balachandran wrote:
>
> Hi,
>
> The client logs indicate that the mount process has crashed.
> Please try mounting the volume with the volume option lru-limit=0 and see if 
> it still crashes.
>
> Thanks,
> Nithya
>
> On Thu, 24 Jan 2019 at 12:47, Hu Bert  wrote:
>>
>> Good morning,
>>
>> We are currently transferring some data to a new glusterfs volume; to
>> check the throughput of the new volume/setup while the transfer is
>> running, I decided to create some files on one of the gluster servers
>> with dd in a loop:
>>
>> while true; do dd if=/dev/urandom of=/shared/private/1G.file bs=1M
>> count=1024; rm /shared/private/1G.file; done
>>
>> /shared/private is the mount point of the glusterfs volume. The dd
>> should run for about an hour. But now it happened twice that during
>> this loop the transport endpoint gets disconnected:
>>
>> dd: failed to open '/shared/private/1G.file': Transport endpoint is
>> not connected
>> rm: cannot remove '/shared/private/1G.file': Transport endpoint is not 
>> connected
>>
>> In the /var/log/glusterfs/shared-private.log I see:
>>
>> [2019-01-24 07:03:28.938745] W [MSGID: 108001]
>> [afr-transaction.c:1062:afr_handle_quorum] 0-persistent-replicate-0:
>> 7212652e-c437-426c-a0a9-a47f5972fffe: Failing WRITE as quorum is
>> not met [Transport endpoint is not connected]
>> [2019-01-24 07:03:28.939280] E [mem-pool.c:331:__gf_free]
>> (-->/usr/lib/x86_64-linux-gnu/glusterfs/5.3/xlator/cluster/replicate.so(+0x5be8c)
>> [0x7eff84248e8c]
>> -->/usr/lib/x86_64-linux-gnu/glusterfs/5.3/xlator/cluster/replicate.so(+0x5be18)
>> [0x7eff84248e18]
>> -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(__gf_free+0xf6)
>> [0x7eff8a9485a6] ) 0-: Assertion failed:
>> GF_MEM_TRAILER_MAGIC == *(uint32_t *)((char *)free_ptr + header->size)
>> [snip]
>>
>> The whole output can be found here: https://pastebin.com/qTMmFxx0
>> gluster volume info here: https://pastebin.com/ENTWZ7j3
>>
>> After umount + mount the transport endpoint is connected again - until
>> the next disconnect. A /core file gets generated. Maybe someone wants
>> to have a look at this file?
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] gluster 5.3: transport endpoint gets disconnected - Assertion failed: GF_MEM_TRAILER_MAGIC

2019-02-06 Thread Nithya Balachandran
Hi,

The client logs indicate that the mount process has crashed.
Please try mounting the volume with the volume option lru-limit=0 and see
if it still crashes.

Thanks,
Nithya

On Thu, 24 Jan 2019 at 12:47, Hu Bert  wrote:

> Good morning,
>
> We are currently transferring some data to a new glusterfs volume; to
> check the throughput of the new volume/setup while the transfer is
> running, I decided to create some files on one of the gluster servers
> with dd in a loop:
>
> while true; do dd if=/dev/urandom of=/shared/private/1G.file bs=1M
> count=1024; rm /shared/private/1G.file; done
>
> /shared/private is the mount point of the glusterfs volume. The dd
> should run for about an hour. But now it happened twice that during
> this loop the transport endpoint gets disconnected:
>
> dd: failed to open '/shared/private/1G.file': Transport endpoint is
> not connected
> rm: cannot remove '/shared/private/1G.file': Transport endpoint is not
> connected
>
> In the /var/log/glusterfs/shared-private.log I see:
>
> [2019-01-24 07:03:28.938745] W [MSGID: 108001]
> [afr-transaction.c:1062:afr_handle_quorum] 0-persistent-replicate-0:
> 7212652e-c437-426c-a0a9-a47f5972fffe: Failing WRITE as quorum is
> not met [Transport endpoint is not connected]
> [2019-01-24 07:03:28.939280] E [mem-pool.c:331:__gf_free]
> (-->/usr/lib/x86_64-linux-gnu/glusterfs/5.3/xlator/cluster/replicate.so(+0x5be8c)
> [0x7eff84248e8c]
> -->/usr/lib/x86_64-linux-gnu/glusterfs/5.3/xlator/cluster/replicate.so(+0x5be18)
> [0x7eff84248e18]
> -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(__gf_free+0xf6)
> [0x7eff8a9485a6] ) 0-: Assertion failed:
> GF_MEM_TRAILER_MAGIC == *(uint32_t *)((char *)free_ptr + header->size)
> [snip]
>
> The whole output can be found here: https://pastebin.com/qTMmFxx0
> gluster volume info here: https://pastebin.com/ENTWZ7j3
>
> After umount + mount the transport endpoint is connected again - until
> the next disconnect. A /core file gets generated. Maybe someone wants
> to have a look at this file?
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] gluster 5.3: transport endpoint gets disconnected - Assertion failed: GF_MEM_TRAILER_MAGIC

2019-01-23 Thread Amar Tumballi Suryanarayan
On Thu, Jan 24, 2019 at 12:47 PM Hu Bert  wrote:

> Good morning,
>
> We are currently transferring some data to a new glusterfs volume; to
> check the throughput of the new volume/setup while the transfer is
> running, I decided to create some files on one of the gluster servers
> with dd in a loop:
>
> while true; do dd if=/dev/urandom of=/shared/private/1G.file bs=1M
> count=1024; rm /shared/private/1G.file; done
>
> /shared/private is the mount point of the glusterfs volume. The dd
> should run for about an hour. But now it happened twice that during
> this loop the transport endpoint gets disconnected:
>
> dd: failed to open '/shared/private/1G.file': Transport endpoint is
> not connected
> rm: cannot remove '/shared/private/1G.file': Transport endpoint is not
> connected
>
> In the /var/log/glusterfs/shared-private.log I see:
>
> [2019-01-24 07:03:28.938745] W [MSGID: 108001]
> [afr-transaction.c:1062:afr_handle_quorum] 0-persistent-replicate-0:
> 7212652e-c437-426c-a0a9-a47f5972fffe: Failing WRITE as quorum is
> not met [Transport endpoint is not connected]
> [2019-01-24 07:03:28.939280] E [mem-pool.c:331:__gf_free]
> (-->/usr/lib/x86_64-linux-gnu/glusterfs/5.3/xlator/cluster/replicate.so(+0x5be8c)
> [0x7eff84248e8c]
> -->/usr/lib/x86_64-linux-gnu/glusterfs/5.3/xlator/cluster/replicate.so(+0x5be18)
> [0x7eff84248e18]
> -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(__gf_free+0xf6)
> [0x7eff8a9485a6] ) 0-: Assertion failed:
> GF_MEM_TRAILER_MAGIC == *(uint32_t *)((char *)free_ptr + header->size)
> [snip]
>
> The whole output can be found here: https://pastebin.com/qTMmFxx0
> gluster volume info here: https://pastebin.com/ENTWZ7j3
>
> After umount + mount the transport endpoint is connected again - until
> the next disconnect. A /core file gets generated. Maybe someone wants
> to have a look at this file?


Hi Hu Bert,

Thanks for these logs and the report. 'Transport endpoint not connected' on a
mount can occur for two reasons; a quick way to tell the cases apart is
sketched right after the list.

1. The brick (in the case of a replica, all the bricks) holding the file is
not reachable or is down. This returns to normal once the bricks are
restarted.
2. The client process crashes/asserts. In this case, /dev/fuse is no longer
connected to a process, but the mount still holds a reference. This needs an
'umount' and mount again to recover.
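
For example, something like this should show which of the two cases you hit
(<volname> is a placeholder for your volume name):

# case 1: are all bricks up? (run on any gluster server)
gluster volume status <volname>

# case 2: is the fuse client process for the mount still alive?
ps aux | grep '[g]lusterfs'
grep /shared/private /proc/mounts

If the glusterfs client process is gone but /shared/private still shows up in
/proc/mounts, it is case 2, and only an umount + mount will recover it.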

We will look into this issue and get back to you.

Regards,
Amar




-- 
Amar Tumballi (amarts)
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] gluster 5.3: transport endpoint gets disconnected - Assertion failed: GF_MEM_TRAILER_MAGIC

2019-01-23 Thread Hu Bert
Good morning,

We are currently transferring some data to a new glusterfs volume; to
check the throughput of the new volume/setup while the transfer is
running, I decided to create some files on one of the gluster servers
with dd in a loop:

while true; do dd if=/dev/urandom of=/shared/private/1G.file bs=1M
count=1024; rm /shared/private/1G.file; done

/shared/private is the mount point of the glusterfs volume. The dd
should run for about an hour. But now it happened twice that during
this loop the transport endpoint gets disconnected:

dd: failed to open '/shared/private/1G.file': Transport endpoint is
not connected
rm: cannot remove '/shared/private/1G.file': Transport endpoint is not connected

In the /var/log/glusterfs/shared-private.log I see:

[2019-01-24 07:03:28.938745] W [MSGID: 108001]
[afr-transaction.c:1062:afr_handle_quorum] 0-persistent-replicate-0:
7212652e-c437-426c-a0a9-a47f5972fffe: Failing WRITE as quorum is
not met [Transport endpoint is not connected]
[2019-01-24 07:03:28.939280] E [mem-pool.c:331:__gf_free]
(-->/usr/lib/x86_64-linux-gnu/glusterfs/5.3/xlator/cluster/replicate.so(+0x5be8c)
[0x7eff84248e8c]
-->/usr/lib/x86_64-linux-gnu/glusterfs/5.3/xlator/cluster/replicate.so(+0x5be18)
[0x7eff84248e18]
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(__gf_free+0xf6)
[0x7eff8a9485a6] ) 0-: Assertion failed:
GF_MEM_TRAILER_MAGIC == *(uint32_t *)((char *)free_ptr + header->size)
[snip]
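
If I read the assertion correctly, each allocation carries a size header and a
32-bit magic value written just past the payload, which __gf_free re-checks on
free; roughly this pattern (a conceptual sketch only, not the actual GlusterFS
code, and the magic constant is a placeholder):

#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TRAILER_MAGIC 0xBAADF00Du   /* placeholder, not GF_MEM_TRAILER_MAGIC */

struct header { size_t size; };

static void *guarded_malloc(size_t size)
{
    /* layout: [header][payload of 'size' bytes][uint32_t trailer canary] */
    char *raw = malloc(sizeof(struct header) + size + sizeof(uint32_t));
    if (!raw)
        return NULL;
    ((struct header *)raw)->size = size;
    char *payload = raw + sizeof(struct header);
    uint32_t magic = TRAILER_MAGIC;
    memcpy(payload + size, &magic, sizeof(magic));  /* canary after payload */
    return payload;
}

static void guarded_free(void *free_ptr)
{
    struct header *header = (struct header *)free_ptr - 1;
    uint32_t magic;
    memcpy(&magic, (char *)free_ptr + header->size, sizeof(magic));
    /* same shape as the failing check: a mismatch means something wrote
       past the end of the buffer (or the bookkeeping was corrupted) */
    assert(TRAILER_MAGIC == magic);
    free(header);
}

int main(void)
{
    char *buf = guarded_malloc(16);
    memcpy(buf, "hello", 6);
    guarded_free(buf);   /* canary intact, so this free succeeds */
    return 0;
}

So, if I understand it, the assertion firing means some caller scribbled past
the end of an allocation before it was freed.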

The whole output can be found here: https://pastebin.com/qTMmFxx0
gluster volume info here: https://pastebin.com/ENTWZ7j3

After umount + mount the transport endpoint is connected again - until
the next disconnect. A /core file gets generated. Maybe someone wants
to have a look at this file?
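
If it helps, I could also pull a backtrace out of the core with gdb, e.g.
(the binary path is what I'd expect for the fuse client on Debian, so it may
need adjusting):

gdb /usr/sbin/glusterfs /core
(gdb) bt
(gdb) thread apply all bt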
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users