Re: [Gluster-users] gluster 3.12.8 fuse consume huge memory

2018-08-31 Thread huting3






If I just mount the client, the memory will not rise. But when I read and write a lot of files (billions), the client consumes huge amounts of memory. You can check it in this way.
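For example, a rough way to watch the client process and capture a statedump (a minimal sketch only; the PID is a placeholder and /var/run/gluster is the usual default dump directory, adjust if yours differs):

  # watch the fuse client's memory
  ps -C glusterfs -o pid,vsz,rss,cmd

  # ask the client for a statedump (written under /var/run/gluster by default)
  kill -USR1 <PID>

  # look for the fuse iov_base allocator reported later in this thread
  grep -A4 gf_fuse_mt_iov_base /var/run/gluster/glusterdump.<PID>.dump.*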






  










huting3
huti...@corp.netease.com

 


On 08/31/2018 23:11, Darrell Budic wrote:

I’m not seeing any leaks myself, been on 3.12.13 for about 38 hours now, still small.

You did restart that node, or at least put it into maintenance (if it’s ovirt) to be sure you restarted the glusterfs processes after updating? That’s a lot of run time unless it’s really busy, so figured I’d check.

From: huting3
Subject: Re: [Gluster-users] gluster 3.12.8 fuse consume huge memory
Date: August 30, 2018 at 10:02:31 PM CDT
To: Darrell Budic
Cc: gluster-users@gluster.org

Thanks for your reply. I also tested gluster 3.12.13 and found that the client also consumes huge memory:

PID     USER  PR  NI  VIRT     RES     SHR   S  %CPU  %MEM  TIME+     COMMAND
180095  root  20   0  4752256  4.091g  4084  S  43.5   1.6  17:54.70  glusterfs

I read and write some files on the gluster fuse client; the client consumes 4g of memory and it keeps rising.

Is it really fixed in 3.12.13?

huting3
huti...@corp.netease.com

On 08/30/2018 22:37, Darrell Budic wrote:

It’s probably https://bugzilla.redhat.com/show_bug.cgi?id=1593826, although I did not encounter it in 3.12.8, only 3.12.9 - 12.

It’s fixed in 3.12.13.

From: huting3
Subject: [Gluster-users] gluster 3.12.8 fuse consume huge memory
Date: August 30, 2018 at 2:02:01 AM CDT
To: gluster-users@gluster.org

The version of glusterfs I installed is 3.12.8, and I find its client also consumes huge memory.

I dumped the statedump file, and I found the size of a variable is extremely huge, like below:

[mount/fuse.fuse - usage-type gf_fuse_mt_iov_base memusage]
  49805 size=4250821416
  49806 num_allocs=1
  49807 max_size=4294960048
  49808 max_num_allocs=3
  49809 total_allocs=12330719

Does that mean a memory leak exists in the glusterfs client?

huting3
huti...@corp.netease.com

___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users



___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Transport endpoint is not connected : issue

2018-08-31 Thread Atin Mukherjee
Can you please pass along all the gluster log files from the server where the
“transport endpoint is not connected” error is reported? As restarting glusterd
didn’t solve this issue, I believe this isn’t a stale port problem but
something else. Also please provide the output of ‘gluster v info ’.

(@cc Ravi, Karthik)
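For reference, a minimal sketch of how that information is usually gathered (VOLNAME is a placeholder; /var/log/glusterfs is the default log location, adjust if yours differs):

  gluster volume info VOLNAME
  gluster volume status VOLNAME
  tar czf /tmp/gluster-logs-$(hostname).tar.gz /var/log/glusterfs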

On Fri, 31 Aug 2018 at 23:24, Johnson, Tim  wrote:

> Hello all,
>
>
>
> We have gluster replicate (with arbiter) volumes on which we are getting
> “Transport endpoint is not connected” errors on a rotating basis from each
> of the two file servers, and from a third host that holds the arbiter
> bricks.
>
> This happens when trying to run a heal on all the volumes on the gluster
> hosts. When I get the status of all the volumes, everything looks good.
>
> This behavior seems to be a foreshadowing of the gluster volumes becoming
> unresponsive to our VM cluster. In addition, one of the file servers has
> two processes for each of the volumes instead of one per volume.
> Eventually the affected file server drops off the list of peers.
> Restarting glusterd/glusterfsd on the affected file server does not take
> care of the issue; we have to bring down both file servers because the
> volumes are no longer seen by the VM cluster after the errors start
> occurring. I had seen bug reports about “Transport endpoint is not
> connected” on earlier versions of Gluster, but had thought it had been
> addressed.
>
> Dmesg did have some entries for “a possible syn flood on port *”, for
> which we changed the sysctl to “net.ipv4.tcp_max_syn_backlog = 2048”; that
> seemed to help with the syn flood messages but not the underlying volume
> issues.
>
> I have put the versions of all the Gluster packages installed below as
> well as the   “Heal” and “Status” commands showing the volumes are
>
>
>
>This has just started happening but cannot definitively say if this
> started occurring after an update or not.
>
>
>
>
>
> Thanks for any assistance.
>
>
>
>
>
> Running Heal  :
>
>
>
> gluster volume heal ovirt_engine info
>
> Brick 1.rrc.local:/bricks/brick0/ovirt_engine
>
> Status: Connected
>
> Number of entries: 0
>
>
>
> Brick 3.rrc.local:/bricks/brick0/ovirt_engine
>
> Status: Transport endpoint is not connected
>
> Number of entries: -
>
>
>
> Brick *3.rrc.local:/bricks/arb-brick/ovirt_engine
>
> Status: Transport endpoint is not connected
>
> Number of entries: -
>
>
>
>
>
> Running status :
>
>
>
> gluster volume status ovirt_engine
>
> Status of volume: ovirt_engine
>
> Gluster process TCP Port  RDMA Port  Online
> Pid
>
>
> --
>
> Brick*.rrc.local:/bricks/brick0/ov
>
> irt_engine  49152 0  Y
> 5521
>
> Brick fs2-tier3.rrc.local:/bricks/brick0/ov
>
> irt_engine  49152 0  Y
> 6245
>
> Brick .rrc.local:/bricks/arb-b
>
> rick/ovirt_engine   49152 0  Y
> 3526
>
> Self-heal Daemon on localhost   N/A   N/AY
> 5509
>
> Self-heal Daemon on ***.rrc.local N/A   N/AY   6218
>
> Self-heal Daemon on ***.rrc.local   N/A   N/AY   3501
>
> Self-heal Daemon on .rrc.local N/A   N/AY   3657
>
> Self-heal Daemon on *.rrc.local   N/A   N/AY   3753
>
> Self-heal Daemon on .rrc.local N/A   N/AY   17284
>
>
>
> Task Status of Volume ovirt_engine
>
>
> --
>
> There are no active volume tasks
>
>
>
>
>
>
>
>
>
> /etc/glusterd.vol.   :
>
>
>
>
>
> volume management
>
> type mgmt/glusterd
>
> option working-directory /var/lib/glusterd
>
> option transport-type socket,rdma
>
> option transport.socket.keepalive-time 10
>
> option transport.socket.keepalive-interval 2
>
> option transport.socket.read-fail-log off
>
> option ping-timeout 0
>
> option event-threads 1
>
> option rpc-auth-allow-insecure on
>
> #   option transport.address-family inet6
>
> #   option base-port 49152
>
> end-volume
>
>
>
>
>
>
>
>
>
>
>
> rpm -qa |grep gluster
>
> glusterfs-3.12.13-1.el7.x86_64
>
> glusterfs-gnfs-3.12.13-1.el7.x86_64
>
> glusterfs-api-3.12.13-1.el7.x86_64
>
> glusterfs-cli-3.12.13-1.el7.x86_64
>
> glusterfs-client-xlators-3.12.13-1.el7.x86_64
>
> glusterfs-fuse-3.12.13-1.el7.x86_64
>
> centos-release-gluster312-1.0-2.el7.centos.noarch
>
> glusterfs-rdma-3.12.13-1.el7.x86_64
>
> glusterfs-libs-3.12.13-1.el7.x86_64
>
> glusterfs-server-3.12.13-1.el7.x86_64
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users

-- 
- Atin (atinm)
___
Gluster-users mailing list

[Gluster-users] Transport endpoint is not connected : issue

2018-08-31 Thread Johnson, Tim
Hello all,

  We have gluster replicate (with arbiter) volumes on which we are getting
“Transport endpoint is not connected” errors on a rotating basis from each of
the two file servers, and from a third host that holds the arbiter bricks.
This happens when trying to run a heal on all the volumes on the gluster
hosts. When I get the status of all the volumes, everything looks good.
   This behavior seems to be a foreshadowing of the gluster volumes becoming
unresponsive to our VM cluster. In addition, one of the file servers has two
processes for each of the volumes instead of one per volume. Eventually the
affected file server drops off the list of peers. Restarting
glusterd/glusterfsd on the affected file server does not take care of the
issue; we have to bring down both file servers because the volumes are no
longer seen by the VM cluster after the errors start occurring. I had seen
bug reports about “Transport endpoint is not connected” on earlier versions
of Gluster, but had thought it had been addressed.
 Dmesg did have some entries for “a possible syn flood on port *”, for which
we changed the sysctl to “net.ipv4.tcp_max_syn_backlog = 2048”; that seemed
to help with the syn flood messages but not the underlying volume issues.
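A minimal sketch of applying and persisting that sysctl (the drop-in file name below is just an example):

  # apply immediately
  sysctl -w net.ipv4.tcp_max_syn_backlog=2048

  # persist across reboots
  echo 'net.ipv4.tcp_max_syn_backlog = 2048' > /etc/sysctl.d/90-gluster.conf
  sysctl --system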
I have put the versions of all the Gluster packages installed below as well 
as the   “Heal” and “Status” commands showing the volumes are

   This has just started happening but cannot definitively say if this 
started occurring after an update or not.


Thanks for any assistance.


Running Heal  :

gluster volume heal ovirt_engine info
Brick 1.rrc.local:/bricks/brick0/ovirt_engine
Status: Connected
Number of entries: 0

Brick 3.rrc.local:/bricks/brick0/ovirt_engine
Status: Transport endpoint is not connected
Number of entries: -

Brick *3.rrc.local:/bricks/arb-brick/ovirt_engine
Status: Transport endpoint is not connected
Number of entries: -


Running status :

gluster volume status ovirt_engine
Status of volume: ovirt_engine
Gluster process TCP Port  RDMA Port  Online  Pid
--
Brick*.rrc.local:/bricks/brick0/ov
irt_engine  49152 0  Y   5521
Brick fs2-tier3.rrc.local:/bricks/brick0/ov
irt_engine  49152 0  Y   6245
Brick .rrc.local:/bricks/arb-b
rick/ovirt_engine   49152 0  Y   3526
Self-heal Daemon on localhost               N/A       N/A        Y       5509
Self-heal Daemon on ***.rrc.local           N/A       N/A        Y       6218
Self-heal Daemon on ***.rrc.local           N/A       N/A        Y       3501
Self-heal Daemon on .rrc.local              N/A       N/A        Y       3657
Self-heal Daemon on *.rrc.local             N/A       N/A        Y       3753
Self-heal Daemon on .rrc.local              N/A       N/A        Y       17284

Task Status of Volume ovirt_engine
--
There are no active volume tasks
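A quick way to check whether the brick port reported above is actually reachable from the node that logs the error (hostname and port are just examples taken from the status output above):

  # from the node reporting "Transport endpoint is not connected"
  nc -zv fs2-tier3.rrc.local 49152

  # on the brick host itself: is the brick process still listening?
  ss -tlnp | grep 49152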




/etc/glusterd.vol.   :


volume management
type mgmt/glusterd
option working-directory /var/lib/glusterd
option transport-type socket,rdma
option transport.socket.keepalive-time 10
option transport.socket.keepalive-interval 2
option transport.socket.read-fail-log off
option ping-timeout 0
option event-threads 1
option rpc-auth-allow-insecure on
#   option transport.address-family inet6
#   option base-port 49152
end-volume





rpm -qa |grep gluster
glusterfs-3.12.13-1.el7.x86_64
glusterfs-gnfs-3.12.13-1.el7.x86_64
glusterfs-api-3.12.13-1.el7.x86_64
glusterfs-cli-3.12.13-1.el7.x86_64
glusterfs-client-xlators-3.12.13-1.el7.x86_64
glusterfs-fuse-3.12.13-1.el7.x86_64
centos-release-gluster312-1.0-2.el7.centos.noarch
glusterfs-rdma-3.12.13-1.el7.x86_64
glusterfs-libs-3.12.13-1.el7.x86_64
glusterfs-server-3.12.13-1.el7.x86_64
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Bug with hardlink limitation in 3.12.13 ?

2018-08-31 Thread Reiner Keller
Hello,

Am 31.08.2018 um 13:59 schrieb Shyam Ranganathan:
> I suspect you have hit this:
> https://bugzilla.redhat.com/show_bug.cgi?id=1602262#c5
>
> I further suspect your older setup was 3.10 based and not 3.12 based.
>
> There is an additional feature added in 3.12 that stores GFID to path
> conversion details using xattrs (see "GFID to path" in
> https://docs.gluster.org/en/latest/release-notes/3.12.0/#major-changes-and-features
> )
>
> Due to which xattr storage limit is reached/breached on ext4 based bricks.
>
> To check if you are facing similar issue to the one in the bug provided
> above, I would check if the brick logs throw up the no space error on a
> gfid2path set failure.

thanks for the hint.

From the log output (= no gfid2path errors) it seems not to be the problem,
although the old gluster volume was set up with version 3.10.x (or even
3.8.x, I think).

I wrote that I could reproduce it on new ext4 and on old xfs gluster volumes
with version 3.12.13, while it was running fine with ~3.12.8 (half a year
ago) without problems.

But I just saw that my old main volume wasn't/isn't xfs but also ext4.
Digging into the logs I could see that in January I was still running 3.10.8
/ 3.10.9 and initially switched to 3.12.9 / the 3.12 branch in April.

From the entry sizes/differences your suggestion would fit:

    https://manpages.debian.org/testing/manpages/xattr.7.en.html or
    http://man7.org/linux/man-pages/man5/attr.5.html

  In the current ext2, ext3, and ext4 filesystem implementations, the
   total bytes used by the names and values of all of a file's extended
   attributes must fit in a single filesystem block (1024, 2048 or 4096
   bytes, depending on the block size specified when the filesystem was
   created).

because I can see differences by volume setup type:

* with the ext4 "defaults" setup I got the error after 44 successful links:

/etc/mke2fs.conf:

[defaults]
    base_features =
sparse_super,large_file,filetype,resize_inode,dir_index,ext_attr
    default_mntopts = acl,user_xattr
    enable_periodic_fsck = 0
    blocksize = 4096
    inode_size = 256
    inode_ratio = 16384

[fs_types]
    ext3 = {
    features = has_journal
    }
    ext4 = {
    features =

has_journal,extent,huge_file,flex_bg,metadata_csum,64bit,dir_nlink,extra_isize
    inode_size = 256
    }
...

* with the ext4 "small" setup (with inode_size raised back to 256 when I
formatted it) I could create only 10 successful links:

    small = {
        blocksize = 1024
        inode_size = 128   # in my volume case also 256
        inode_ratio = 4096
    }

which would match the blocksize limitation - here in default ext4 fs:

# attr -l test
Attribute "gfid2path.3951a8fec4234683" has a 41 byte value for test
Attribute "gfid" has a 16 byte value for test
Attribute "afr.dirty" has a 12 byte value for test
Attribute "gfid2path.003214300fcd4d34" has a 44 byte value for test
...
Attribute "gfid2path.fe4d3e4d0bc31351" has a 44 byte value for test
# attr -l test | grep gfid2path | wc -l
46

41 + 16 + 12 + 45 * 44 = 2049 (+ 256 inode_size + ???  )  <= 4096

with 1k blocksize I got only:

# attr -l test
Attribute "gfid2path.7a3f0fa0e8f7eba3" has a 41 byte value for test
Attribute "gfid" has a 16 byte value for test
Attribute "afr.dirty" has a 12 byte value for test
Attribute "gfid2path.13e24c98a492d7f1" has a 43 byte value for test
Attribute "gfid2path.1efa5641f9785d6c" has a 43 byte value for test
Attribute "gfid2path.551dfafc5d4a7bda" has a 43 byte value for test
Attribute "gfid2path.578dc56f20801437" has a 43 byte value for test
Attribute "gfid2path.8e983883502e3c57" has a 43 byte value for test
Attribute "gfid2path.94b700e1c7f156e3" has a 43 byte value for test
Attribute "gfid2path.cbeb1108f9a34dac" has a 43 byte value for test
Attribute "gfid2path.cd6ba60f624abc2b" has a 43 byte value for test
Attribute "gfid2path.dbf95647d59cd047" has a 43 byte value for test
Attribute "gfid2path.ec6198adc227befe" has a 44 byte value for test

* 41 + 16 + 12 + 9 * 43 + 44 = 500 (+256 inode_size + ???) <= 1024

whatever the remaining unknown (and varying) overhead is used for.
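A rough way to account for the xattr bytes actually attached to a file on the brick (a sketch only: it sums the name and value lengths reported by getfattr; the per-entry headers and padding ext4 adds on top are presumably the unknown overhead above):

  f=/path/to/brick/test/test        # brick-side path, adjust
  getfattr -d -m - -e hex "$f" 2>/dev/null | \
    awk -F= '/=/ { n=$1; v=$2; sub(/^0x/, "", v);
                   total += length(n) + length(v)/2 }
             END { printf "approx. %d bytes of xattr names+values\n", total }'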


But in the log I can see only this error, which is not very helpful (tested
here on another volume with ext4 "default" settings):

[2018-08-31 13:21:11.306022] W [MSGID: 114031]
[client-rpc-fops.c:2701:client3_3_link_cbk]
0-staging-prudsys-client-0: remote operation failed: (/test/test-45
-> /test/test-46) [No space left on device]
[2018-08-31 13:21:11.306420] W [MSGID: 114031]
[client-rpc-fops.c:2701:client3_3_link_cbk]
0-staging-prudsys-client-2: remote operation failed: (/test/test-45
-> /test/test-46) [No space left on device]
  

Re: [Gluster-users] gluster 3.12.8 fuse consume huge memory

2018-08-31 Thread Darrell Budic
I’m not seeing any leaks myself, been on 3.12.13 for about 38 hours now, still 
small.

You did restart that node, or at least put it into maintenance (if it’s ovirt) 
to be sure you restarted the glusterfs processes after updating? That’s a lot 
of run time unless it’s really busy, so figured I’d check.
> From: huting3 
> Subject: Re: [Gluster-users] gluster 3.12.8 fuse consume huge memory
> Date: August 30, 2018 at 10:02:31 PM CDT
> To: Darrell Budic
> Cc: gluster-users@gluster.org
> 
> Thanks for your reply. I also tested gluster 3.12.13 and found that the
> client also consumes huge memory:
> 
> PID     USER  PR  NI  VIRT     RES     SHR   S  %CPU  %MEM  TIME+     COMMAND
> 180095  root  20   0  4752256  4.091g  4084  S  43.5   1.6  17:54.70  glusterfs
> 
> I read and write some files on the gluster fuse client; the client consumes
> 4g of memory and it keeps rising.
> 
> Is it really fixed in 3.12.13?
> 
>   
> huting3
> 
> huti...@corp.netease.com
>  
> 
> 
> On 08/30/2018 22:37,Darrell Budic 
>  wrote: 
> It’s probably https://bugzilla.redhat.com/show_bug.cgi?id=1593826 
> , although I did not 
> encounter it in 3.12.8, only 3.12.9 - 12.
> 
> It’s fixed in 3.12.13.
> 
>> From: huting3 mailto:huti...@corp.netease.com>>
>> Subject: [Gluster-users] gluster 3.12.8 fuse consume huge memory
>> Date: August 30, 2018 at 2:02:01 AM CDT
>> To: gluster-users@gluster.org 
>> 
>> The version of glusterfs I installed is 3.12.8, and I find its client also 
>> consumes huge memory.
>> 
>> 
>> 
>> I dumped the statedump file, and I found the size of a variable is extremely 
>> huge, like below:
>> 
>> 
>> 
>> [mount/fuse.fuse - usage-type gf_fuse_mt_iov_base memusage]
>> 
>>   49805 size=4250821416
>> 
>>   49806 num_allocs=1
>> 
>>   49807 max_size=4294960048
>> 
>>   49808 max_num_allocs=3
>> 
>>   49809 total_allocs=12330719
>> 
>> 
>> 
>> Does that mean a memory leak exists in the glusterfs client?
>> 
>> 
>>  
>> huting3
>> 
>> huti...@corp.netease.com
>>  
>> 
>> 
>> ___
>> Gluster-users mailing list
>> Gluster-users@gluster.org 
>> https://lists.gluster.org/mailman/listinfo/gluster-users 
>> 
> 

___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [External] Re: file metadata operations performance - gluster 4.1

2018-08-31 Thread Davide Obbi
It didn't make a difference. I will try to re-configure with a 2x3 config.

On Fri, Aug 31, 2018 at 1:48 PM Raghavendra Gowdappa 
wrote:

> another relevant option is setting cluster.lookup-optimize on.
>
> On Fri, Aug 31, 2018 at 3:22 PM, Davide Obbi 
> wrote:
>
>> #gluster vol set VOLNAME group nl-cache --> didn't know there are groups
>> of options, after this command i got set the following:
>> performance.nl-cache-timeout: 600
>> performance.nl-cache: on
>> performance.parallel-readdir: on
>> performance.io-thread-count: 64
>> network.inode-lru-limit: 20
>>
>> to note that i had network.inode-lru-limit set to max and got reduced to
>> 20
>>
>> then i added
>> performance.nl-cache-positive-entry: on
>>
>> The volume options:
>> Options Reconfigured:
>> performance.nl-cache-timeout: 600
>> performance.nl-cache: on
>> performance.nl-cache-positive-entry: on
>> performance.parallel-readdir: on
>> performance.io-thread-count: 64
>> network.inode-lru-limit: 20
>> nfs.disable: on
>> transport.address-family: inet
>> performance.readdir-ahead: on
>> features.cache-invalidation-timeout: 600
>> features.cache-invalidation: on
>> performance.md-cache-timeout: 600
>> performance.stat-prefetch: on
>> performance.cache-invalidation: on
>> performance.cache-size: 10GB
>> network.ping-timeout: 5
>> diagnostics.client-log-level: WARNING
>> diagnostics.brick-log-level: WARNING
>> features.quota: off
>> features.inode-quota: off
>> performance.quick-read: on
>>
>> untar completed in 08mins 30secs
>>
>> increasing network.inode-lru-limit to 1048576 untar completed in the same
>> time
>> I have attached the gluster profile results of the last test, with
>> network.inode-lru-limit to 1048576
>>
>> I guess the next test will be creating more bricks for the same volume to
>> have a 2x3. Since i do not see bottlenecks at the disk level and i have
>> limited hw ATM i will just carve out the bricks from LVs from the same 1
>> disk VG.
>>
>> Also i have tried to look for a complete list of options/description
>> unsuccessfully can you point at one?
>>
>> thanks
>> Davide
>>
>> On Thu, Aug 30, 2018 at 5:47 PM Poornima Gurusiddaiah <
>> pguru...@redhat.com> wrote:
>>
>>> To enable nl-cache please use group option instead of single volume set:
>>>
>>> #gluster vol set VOLNAME group nl-cache
>>>
>>> This sets few other things including time out, invalidation etc.
>>>
>>> For enabling the option Raghavendra mentioned, you ll have to execute it
>>> explicitly, as it's not part of group option yet:
>>>
>>> #gluster vol set VOLNAME performance.nl-cache-positive-entry on
>>>
>>> Also from the past experience, setting the below option has helped in
>>> performance:
>>>
>>> # gluster vol set VOLNAME network.inode-lru-limit 20
>>>
>>> Regards,
>>> Poornima
>>>
>>>
>>> On Thu, Aug 30, 2018, 8:49 PM Raghavendra Gowdappa 
>>> wrote:
>>>


 On Thu, Aug 30, 2018 at 8:38 PM, Davide Obbi 
 wrote:

> yes "performance.parallel-readdir on and 1x3 replica
>

 That's surprising. I thought performance.parallel-readdir will help
 only when distribute count is fairly high. This is something worth
 investigating further.


> On Thu, Aug 30, 2018 at 5:00 PM Raghavendra Gowdappa <
> rgowd...@redhat.com> wrote:
>
>>
>>
>> On Thu, Aug 30, 2018 at 8:08 PM, Davide Obbi > > wrote:
>>
>>> Thanks Amar,
>>>
>>> i have enabled the negative lookups cache on the volume:
>>>
>>
 I think enabling nl-cache-positive-entry might help for untarring or
 git clone into glusterfs. Its disabled by default. can you let us know the
 results?

 Option: performance.nl-cache-positive-entry
 Default Value: (null)
 Description: enable/disable storing of entries that were lookedup and
 found to be present in the volume, thus lookup on non existent file is
 served from the cache


>>> To deflate a tar archive (not compressed) of 1.3GB it takes aprox
>>> 9mins which can be considered a slight improvement from the previous 
>>> 12-15
>>> however still not fast enough compared to local disk. The tar is 
>>> present on
>>> the gluster share/volume and deflated inside the same folder structure.
>>>
>>
>> I am assuming this is with parallel-readdir enabled, right?
>>
>>
>>> Running the operation twice (without removing the already deflated
>>> files) also did not reduce the time spent.
>>>
>>> Running the operation with the tar archive on local disk made no
>>> difference
>>>
>>> What really made a huge difference while git cloning was setting
>>> "performance.parallel-readdir on". During the phase "Receiving objects" 
>>> ,
>>> as i enabled the xlator it bumped up from 3/4MBs to 27MBs
>>>
>>
>> What is the distribute count? Is it 1x3 replica?
>>
>>
>>> So in conclusion i'm trying to make the untar operation working at
>>> 

Re: [Gluster-users] Bug with hardlink limitation in 3.12.13 ?

2018-08-31 Thread Shyam Ranganathan
On 08/31/2018 07:15 AM, Reiner Keller wrote:
> Hello,
> 
> I got yesterday unexpected error "No space left on device" on my new
> gluster volume caused by too many hardlinks.
> This happened while I done "rsync --aAHXxv ..." replication from old
> gluster to new gluster servers - each running latest version 3.12.13
> (for changing volume schema from 2x2 to 3x1 with quorum and a fresh
> Debian Stretch setup instead Jessie).

I suspect you have hit this:
https://bugzilla.redhat.com/show_bug.cgi?id=1602262#c5

I further suspect your older setup was 3.10 based and not 3.12 based.

There is an additional feature added in 3.12 that stores GFID to path
conversion details using xattrs (see "GFID to path" in
https://docs.gluster.org/en/latest/release-notes/3.12.0/#major-changes-and-features
)

Due to which xattr storage limit is reached/breached on ext4 based bricks.

To check if you are facing similar issue to the one in the bug provided
above, I would check if the brick logs throw up the no space error on a
gfid2path set failure.

To get around the problem, I would suggest using xfs as the backing FS
for the brick (considering you have close to 250 odd hardlinks to a
file). I would not attempt to disable the gfid2path feature, as that is
useful in getting to the real file just given a GFID and is already part
of core on disk Gluster metadata (It can be shut off, but I would
refrain from it).
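A minimal sketch of formatting a brick with xfs, as is commonly recommended for Gluster bricks (the device path and mount point are placeholders):

  mkfs.xfs -f -i size=512 /dev/sdX1
  mount /dev/sdX1 /bricks/brick0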

> 
> When I deduplicated it around half a year ago with "rdfind" hardlinking
> was working fine (I think that was glusterfs around version 3.12.8 -
> 3.12.10 ?)
> 
> My search for documentation found only the parameter
> "storage.max-hardlinks" with default of 100 for version 4.0.
> I checked it in my gluster 3.12.13 but here the parameter is not yet
> implemented.
> 
> I tested/proofed it by running my small test on underlaying ext4
> filesystem brick directly and on gluster volume using same ext4
> filesystem of the brick:
> 
> Testline for it:
>     mkdir test; cd test; echo "hello" > test; for I in $(seq 1 100); do ln test test-$I ; done
> 
> * on ext4 fs (old brick: xfs) I could do 100 hardlinks without problems
> (from documentation I found ext has 65.000 hardlinks compiled in )
> * on actual GlusterFS (same on my old and new gluster volumes) I could
> do only up to 45 hardlinks now
> 
> But from deduplication around 6 months ago I could find e.g. a file with
> 240 hardlinks setup and there is no problem using these referenced files
> (caused by multiple languages / multiple uploads per language ,
> production/staging system cloned... ).
> 
> My actual workaround has to be using duplicated content but it would be
> great if this could be fixed in next versions ;)
> 
> (Saltstack didn't support yet successful setup of glusterfs 4.0
> peers/volumes; something in output of "gluster --xml --mode=script" call
> must be weird but I haven't seen any differences so far)
> 
> Bests
> 
> 
> Reiner
> 
> 
> 
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
> 
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [External] Re: file metadata operations performance - gluster 4.1

2018-08-31 Thread Raghavendra Gowdappa
another relevant option is setting cluster.lookup-optimize on.

On Fri, Aug 31, 2018 at 3:22 PM, Davide Obbi 
wrote:

> #gluster vol set VOLNAME group nl-cache --> didn't know there are groups
> of options, after this command i got set the following:
> performance.nl-cache-timeout: 600
> performance.nl-cache: on
> performance.parallel-readdir: on
> performance.io-thread-count: 64
> network.inode-lru-limit: 20
>
> to note that i had network.inode-lru-limit set to max and got reduced to
> 20
>
> then i added
> performance.nl-cache-positive-entry: on
>
> The volume options:
> Options Reconfigured:
> performance.nl-cache-timeout: 600
> performance.nl-cache: on
> performance.nl-cache-positive-entry: on
> performance.parallel-readdir: on
> performance.io-thread-count: 64
> network.inode-lru-limit: 20
> nfs.disable: on
> transport.address-family: inet
> performance.readdir-ahead: on
> features.cache-invalidation-timeout: 600
> features.cache-invalidation: on
> performance.md-cache-timeout: 600
> performance.stat-prefetch: on
> performance.cache-invalidation: on
> performance.cache-size: 10GB
> network.ping-timeout: 5
> diagnostics.client-log-level: WARNING
> diagnostics.brick-log-level: WARNING
> features.quota: off
> features.inode-quota: off
> performance.quick-read: on
>
> untar completed in 08mins 30secs
>
> increasing network.inode-lru-limit to 1048576 untar completed in the same
> time
> I have attached the gluster profile results of the last test, with
> network.inode-lru-limit to 1048576
>
> I guess the next test will be creating more bricks for the same volume to
> have a 2x3. Since i do not see bottlenecks at the disk level and i have
> limited hw ATM i will just carve out the bricks from LVs from the same 1
> disk VG.
>
> Also i have tried to look for a complete list of options/description
> unsuccessfully can you point at one?
>
> thanks
> Davide
>
> On Thu, Aug 30, 2018 at 5:47 PM Poornima Gurusiddaiah 
> wrote:
>
>> To enable nl-cache please use group option instead of single volume set:
>>
>> #gluster vol set VOLNAME group nl-cache
>>
>> This sets few other things including time out, invalidation etc.
>>
>> For enabling the option Raghavendra mentioned, you ll have to execute it
>> explicitly, as it's not part of group option yet:
>>
>> #gluster vol set VOLNAME performance.nl-cache-positive-entry on
>>
>> Also from the past experience, setting the below option has helped in
>> performance:
>>
>> # gluster vol set VOLNAME network.inode-lru-limit 20
>>
>> Regards,
>> Poornima
>>
>>
>> On Thu, Aug 30, 2018, 8:49 PM Raghavendra Gowdappa 
>> wrote:
>>
>>>
>>>
>>> On Thu, Aug 30, 2018 at 8:38 PM, Davide Obbi 
>>> wrote:
>>>
 yes "performance.parallel-readdir on and 1x3 replica

>>>
>>> That's surprising. I thought performance.parallel-readdir will help only
>>> when distribute count is fairly high. This is something worth investigating
>>> further.
>>>
>>>
 On Thu, Aug 30, 2018 at 5:00 PM Raghavendra Gowdappa <
 rgowd...@redhat.com> wrote:

>
>
> On Thu, Aug 30, 2018 at 8:08 PM, Davide Obbi 
> wrote:
>
>> Thanks Amar,
>>
>> i have enabled the negative lookups cache on the volume:
>>
>
>>> I think enabling nl-cache-positive-entry might help for untarring or git
>>> clone into glusterfs. Its disabled by default. can you let us know the
>>> results?
>>>
>>> Option: performance.nl-cache-positive-entry
>>> Default Value: (null)
>>> Description: enable/disable storing of entries that were lookedup and
>>> found to be present in the volume, thus lookup on non existent file is
>>> served from the cache
>>>
>>>
>> To deflate a tar archive (not compressed) of 1.3GB it takes aprox
>> 9mins which can be considered a slight improvement from the previous 
>> 12-15
>> however still not fast enough compared to local disk. The tar is present 
>> on
>> the gluster share/volume and deflated inside the same folder structure.
>>
>
> I am assuming this is with parallel-readdir enabled, right?
>
>
>> Running the operation twice (without removing the already deflated
>> files) also did not reduce the time spent.
>>
>> Running the operation with the tar archive on local disk made no
>> difference
>>
>> What really made a huge difference while git cloning was setting
>> "performance.parallel-readdir on". During the phase "Receiving objects" ,
>> as i enabled the xlator it bumped up from 3/4MBs to 27MBs
>>
>
> What is the distribute count? Is it 1x3 replica?
>
>
>> So in conclusion i'm trying to make the untar operation working at an
>> acceptable level, not expecting local disks speed but at least being 
>> within
>> the 4mins
>>
>> I have attached the profiles collected at the end of the untar
>> operations with the archive on the mount and outside
>>
>> thanks
>> Davide
>>
>>
>> On Tue, Aug 28, 2018 at 

[Gluster-users] Bug with hardlink limitation in 3.12.13 ?

2018-08-31 Thread Reiner Keller
Hello,

Yesterday I got an unexpected "No space left on device" error on my new
gluster volume, caused by too many hardlinks.
This happened while I was doing an "rsync --aAHXxv ..." replication from the
old gluster servers to the new ones - each running the latest version,
3.12.13 (to change the volume schema from 2x2 to 3x1 with quorum, and a
fresh Debian Stretch setup instead of Jessie).

When I deduplicated it around half a year ago with "rdfind", hardlinking was
working fine (I think that was glusterfs around version 3.12.8 - 3.12.10?).

My search for documentation found only the parameter "storage.max-hardlinks",
with a default of 100, for version 4.0.
I checked in my gluster 3.12.13, but there the parameter is not yet
implemented.
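(For reference, on releases that do ship it, it is a regular volume option; a sketch, with VOLNAME as a placeholder:

  gluster volume set VOLNAME storage.max-hardlinks 200
  gluster volume get VOLNAME storage.max-hardlinks

As far as I recall, 0 is documented as "unlimited".)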

I tested/verified it by running my small test directly on the underlying
ext4 brick filesystem and on a gluster volume using the same ext4 brick
filesystem:

Test line for it:
    mkdir test; cd test; echo "hello" > test; for I in $(seq 1 100); do ln test test-$I ; done

* on the ext4 fs (old brick: xfs) I could do 100 hardlinks without problems
(from the documentation I found that ext4 has a compiled-in limit of 65,000
hardlinks)
* on the actual GlusterFS volume (same on my old and new gluster volumes) I
could now do only up to 45 hardlinks

But from the deduplication around 6 months ago I could find, e.g., a file
with 240 hardlinks set up, and there is no problem using these referenced
files (caused by multiple languages / multiple uploads per language,
production/staging system cloned...).

My current workaround is to use duplicated content, but it would be great if
this could be fixed in upcoming versions ;)

(Saltstack does not yet manage to set up glusterfs 4.0 peers/volumes
successfully; something in the output of the "gluster --xml --mode=script"
call must be odd, but I haven't seen any differences so far.)

Bests


Reiner



___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [External] Re: file metadata operations performance - gluster 4.1

2018-08-31 Thread Davide Obbi
#gluster vol set VOLNAME group nl-cache --> I didn't know there were groups of
options; after this command the following got set:
performance.nl-cache-timeout: 600
performance.nl-cache: on
performance.parallel-readdir: on
performance.io-thread-count: 64
network.inode-lru-limit: 20

Note that I had network.inode-lru-limit set to the maximum and it got reduced to
20

then i added
performance.nl-cache-positive-entry: on

The volume options:
Options Reconfigured:
performance.nl-cache-timeout: 600
performance.nl-cache: on
performance.nl-cache-positive-entry: on
performance.parallel-readdir: on
performance.io-thread-count: 64
network.inode-lru-limit: 20
nfs.disable: on
transport.address-family: inet
performance.readdir-ahead: on
features.cache-invalidation-timeout: 600
features.cache-invalidation: on
performance.md-cache-timeout: 600
performance.stat-prefetch: on
performance.cache-invalidation: on
performance.cache-size: 10GB
network.ping-timeout: 5
diagnostics.client-log-level: WARNING
diagnostics.brick-log-level: WARNING
features.quota: off
features.inode-quota: off
performance.quick-read: on

untar completed in 08mins 30secs

increasing network.inode-lru-limit to 1048576 untar completed in the same
time
I have attached the gluster profile results of the last test, with
network.inode-lru-limit to 1048576

I guess the next test will be creating more bricks for the same volume to
have a 2x3. Since I do not see bottlenecks at the disk level and I have
limited hardware at the moment, I will just carve out the bricks from LVs on
the same single-disk VG.

Also, I have tried unsuccessfully to find a complete list of
options/descriptions - can you point me at one? (See the commands sketched
below.)

thanks
Davide
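Not an exhaustive reference, but two commands that usually help here (VOLNAME is a placeholder):

  # descriptions of the settable options
  gluster volume set help

  # current value of every option on a given volume
  gluster volume get VOLNAME all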

On Thu, Aug 30, 2018 at 5:47 PM Poornima Gurusiddaiah 
wrote:

> To enable nl-cache please use group option instead of single volume set:
>
> #gluster vol set VOLNAME group nl-cache
>
> This sets few other things including time out, invalidation etc.
>
> For enabling the option Raghavendra mentioned, you ll have to execute it
> explicitly, as it's not part of group option yet:
>
> #gluster vol set VOLNAME performance.nl-cache-positive-entry on
>
> Also from the past experience, setting the below option has helped in
> performance:
>
> # gluster vol set VOLNAME network.inode-lru-limit 20
>
> Regards,
> Poornima
>
>
> On Thu, Aug 30, 2018, 8:49 PM Raghavendra Gowdappa 
> wrote:
>
>>
>>
>> On Thu, Aug 30, 2018 at 8:38 PM, Davide Obbi 
>> wrote:
>>
>>> yes "performance.parallel-readdir on and 1x3 replica
>>>
>>
>> That's surprising. I thought performance.parallel-readdir will help only
>> when distribute count is fairly high. This is something worth investigating
>> further.
>>
>>
>>> On Thu, Aug 30, 2018 at 5:00 PM Raghavendra Gowdappa <
>>> rgowd...@redhat.com> wrote:
>>>


 On Thu, Aug 30, 2018 at 8:08 PM, Davide Obbi 
 wrote:

> Thanks Amar,
>
> i have enabled the negative lookups cache on the volume:
>

>> I think enabling nl-cache-positive-entry might help for untarring or git
>> clone into glusterfs. Its disabled by default. can you let us know the
>> results?
>>
>> Option: performance.nl-cache-positive-entry
>> Default Value: (null)
>> Description: enable/disable storing of entries that were lookedup and
>> found to be present in the volume, thus lookup on non existent file is
>> served from the cache
>>
>>
> To deflate a tar archive (not compressed) of 1.3GB it takes aprox
> 9mins which can be considered a slight improvement from the previous 12-15
> however still not fast enough compared to local disk. The tar is present 
> on
> the gluster share/volume and deflated inside the same folder structure.
>

 I am assuming this is with parallel-readdir enabled, right?


> Running the operation twice (without removing the already deflated
> files) also did not reduce the time spent.
>
> Running the operation with the tar archive on local disk made no
> difference
>
> What really made a huge difference while git cloning was setting
> "performance.parallel-readdir on". During the phase "Receiving objects" ,
> as i enabled the xlator it bumped up from 3/4MBs to 27MBs
>

 What is the distribute count? Is it 1x3 replica?


> So in conclusion i'm trying to make the untar operation working at an
> acceptable level, not expecting local disks speed but at least being 
> within
> the 4mins
>
> I have attached the profiles collected at the end of the untar
> operations with the archive on the mount and outside
>
> thanks
> Davide
>
>
> On Tue, Aug 28, 2018 at 8:41 AM Amar Tumballi 
> wrote:
>
>> One of the observation we had with git clone like work load was,
>> nl-cache (negative-lookup cache), helps here.
>>
>> Try 'gluster volume set $volume-name nl-cache enable'.
>>
>> Also sharing the 'profile info' during this performance observations
>> also 

Re: [Gluster-users] Was: Upgrade to 4.1.2 geo-replication does not work Now: Upgraded to 4.1.3 geo node Faulty

2018-08-31 Thread Kotresh Hiremath Ravishankar
Hi Marcus,

Could you attach the full logs? Is the same traceback happening repeatedly? It
would be helpful if you attached the corresponding mount log as well.
What rsync version are you using?

Thanks,
Kotresh HR

On Fri, Aug 31, 2018 at 12:16 PM, Marcus Pedersén 
wrote:

> Hi all,
>
> I had problems with stopping sync after upgrade to 4.1.2.
>
> I upgraded to 4.1.3 and it ran fine for one day, but now one of the master
> nodes shows faulty.
>
> Most of the sync jobs have return code 23, how do I resolve this?
>
> I see messages like:
>
> _GMaster: Sucessfully fixed all entry ops with gfid mismatch
>
> Will this resolve error code 23?
>
> There is also a python error.
>
> The python error was a selinux problem, turning off selinux made node go
> to active again.
>
> See log below.
>
>
> CentOS 7, installed through SIG Gluster (OS updated to latest at the same
> time)
>
> Master cluster: 2 x (2 + 1) distributed, replicated
>
> Client cluster: 1 x (2 + 1) replicated
>
>
> Many thanks in advance!
>
>
> Best regards
>
> Marcus Pedersén
>
>
>
> gsyncd.log from Faulty node:
>
> [2018-08-31 06:25:51.375267] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.8099 num_files=57job=3
> return_code=23
> [2018-08-31 06:25:51.465895] I [master(worker /urd-gds/gluster):1944:syncjob]
> Syncer: Sync Time Taken   duration=0.0904 num_files=3 job=3
> return_code=23
> [2018-08-31 06:25:52.562107] E [repce(worker /urd-gds/gluster):197:__call__]
> RepceClient: call failed   call=30069:139655665837888:1535696752.35
> method=entry_opserror=OSError
> [2018-08-31 06:25:52.562346] E [syncdutils(worker
> /urd-gds/gluster):332:log_raise_exception] : FAIL:
> Traceback (most recent call last):
>   File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 311, in
> main
> func(args)
>   File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 72, in
> subcmd_worker
> local.service_loop(remote)
>   File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1288,
> in service_loop
> g3.crawlwrap(oneshot=True)
>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 615, in
> crawlwrap
> self.crawl()
>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1545,
> in crawl
> self.changelogs_batch_process(changes)
>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1445,
> in changelogs_batch_process
> self.process(batch)
>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1280,
> in process
> self.process_change(change, done, retry)
>   File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1179,
> in process_change
> failures = self.slave.server.entry_ops(entries)
>   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 216, in
> __call__
> return self.ins(self.meth, *a)
>   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 198, in
> __call__
> raise res
> OSError: [Errno 13] Permission denied
> [2018-08-31 06:25:52.578367] I [repce(agent /urd-gds/gluster):80:service_loop]
> RepceServer: terminating on reaching EOF.
> [2018-08-31 06:25:53.558765] I [monitor(monitor):279:monitor] Monitor:
> worker died in startup phase brick=/urd-gds/gluster
> [2018-08-31 06:25:53.569777] I [gsyncdstatus(monitor):244:set_worker_status]
> GeorepStatus: Worker Status Change status=Faulty
> [2018-08-31 06:26:03.593161] I [monitor(monitor):158:monitor] Monitor:
> starting gsyncd worker   brick=/urd-gds/gluster  slave_node=urd-gds-geo-000
> [2018-08-31 06:26:03.636452] I [gsyncd(agent /urd-gds/gluster):297:main]
> : Using session config file   path=/var/lib/glusterd/geo-
> replication/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/gsyncd.conf
> [2018-08-31 06:26:03.636810] I [gsyncd(worker /urd-gds/gluster):297:main]
> : Using session config file  path=/var/lib/glusterd/geo-
> replication/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/gsyncd.conf
> [2018-08-31 06:26:03.637486] I [changelogagent(agent
> /urd-gds/gluster):72:__init__] ChangelogAgent: Agent listining...
> [2018-08-31 06:26:03.650330] I [resource(worker 
> /urd-gds/gluster):1377:connect_remote]
> SSH: Initializing SSH connection between master and slave...
> [2018-08-31 06:26:05.296473] I [resource(worker 
> /urd-gds/gluster):1424:connect_remote]
> SSH: SSH connection between master and slave established.
> duration=1.6457
> [2018-08-31 06:26:05.297904] I [resource(worker 
> /urd-gds/gluster):1096:connect]
> GLUSTER: Mounting gluster volume locally...
> [2018-08-31 06:26:06.396939] I [resource(worker 
> /urd-gds/gluster):1119:connect]
> GLUSTER: Mounted gluster volume duration=1.0985
> [2018-08-31 06:26:06.397691] I [subcmds(worker 
> /urd-gds/gluster):70:subcmd_worker]
> : Worker spawn successful. Acknowledging back to monitor
> [2018-08-31 06:26:16.815566] I [master(worker /urd-gds/gluster):1593:register]
> _GMaster: Working dirpath=/var/lib/misc/gluster/
> 

Re: [Gluster-users] Geo-Replication Faulty.

2018-08-31 Thread Tiemen Ruiten
I had the same issue a few weeks ago and found this thread which helped me
resolve it:
https://lists.gluster.org/pipermail/gluster-users/2018-July/034465.html
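For anyone hitting the same "libgfchangelog.so: cannot open shared object file" error quoted further down in this thread, the workaround discussed there boils down to making the versioned library resolvable; a sketch only, the paths are assumptions and depend on your distribution and packages:

  find / -name 'libgfchangelog.so*' 2>/dev/null
  ln -s /usr/lib64/libgfchangelog.so.0 /usr/lib64/libgfchangelog.so
  ldconfig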

On 28 August 2018 at 08:50, Krishna Verma  wrote:

> Hi All,
>
>
>
> I need help setting up geo-replication, as it is going faulty.
>
>
>
> I did the following :
>
>
>
>1. Setup 2 node gluster with replicated volume.
>2. Setup single node slave with gluster volume.
>3. Setup geo-replication with master and slave but its status is
>getting faulty.
>
>
>
> I have installed “glusterfs 4.1.2” on all the nodes.
>
>
>
> In logs I was getting below error:
>
>
>
> 2018-08-28 04:39:00.639724] E [syncdutils(worker
> /data/gluster/gv0):753:logerr] Popen: ssh> failure: execution of
> "/usr/local/sbin/glusterfs" failed with ENOENT (No such file or directory)
>
> My gluster binaries are in /usr/sbin, so I did:
>
>
>
> gluster volume geo-replication glusterep gluster-poc-sj::glusterep config
> gluster_command_dir /usr/sbin/
>
> gluster volume geo-replication glusterep gluster-poc-sj::glusterep config
> slave_gluster_command_dir /usr/sbin/
>
>
>
> I also created the links as below :
>
>
>
> ln -s /usr/sbin/gluster /usr/local/sbin/gluster
>
> ln -s /usr/sbin/glusterfs /usr/local/sbin/glusterfs
>
>
>
> But status is still faulty after restarted glusterd and created a session
> again.
>
>
>
> MASTER NODE  MASTER VOLMASTER BRICK SLAVE USER
> SLAVESLAVE NODESTATUSCRAWL STATUS
> LAST_SYNCED
>
> 
> 
> -
>
> gluster-poc-noidaglusterep /data/gluster/gv0root
> gluster-poc-sj::glusterepN/A   FaultyN/A N/A
>
> noi-poc-gluster  glusterep /data/gluster/gv0root
> gluster-poc-sj::glusterepN/A   FaultyN/A N/A
>
>
>
> And now I am getting errors in logs for libraries.
>
>
>
> 
>
> OSError: libgfchangelog.so: cannot open shared object file: No such file
> or directory
>
> [2018-08-28 06:46:46.667423] E [repce(worker /data/gluster/gv0):197:__call__]
> RepceClient: call failed  call=19929:140516964480832:1535438806.66
> method=init error=OSError
>
> [2018-08-28 06:46:46.667567] E [syncdutils(worker
> /data/gluster/gv0):330:log_raise_exception] : FAIL:
>
> Traceback (most recent call last):
>
>   File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 311, in
> main
>
> func(args)
>
>   File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 72, in
> subcmd_worker
>
> local.service_loop(remote)
>
>   File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1236,
> in service_loop
>
> changelog_agent.init()
>
>   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 216, in
> __call__
>
> return self.ins(self.meth, *a)
>
>   File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 198, in
> __call__
>
> raise res
>
> OSError: libgfchangelog.so: cannot open shared object file: No such file
> or directory
>
> [2018-08-28 06:46:46.678463] I [repce(agent 
> /data/gluster/gv0):80:service_loop]
> RepceServer: terminating on reaching EOF.
>
> [2018-08-28 06:46:47.662086] I [monitor(monitor):272:monitor] Monitor:
> worker died in startup phase brick=/data/gluster/gv0
>
>
>
>
>
> Any help please.
>
>
>
>
>
> /Krish
>
>
>
>
>
>
>
>
>
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users
>



-- 
Tiemen Ruiten
Systems Engineer
R Media
___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Was: Upgrade to 4.1.2 geo-replication does not work Now: Upgraded to 4.1.3 geo node Faulty

2018-08-31 Thread Marcus Pedersén
Hi all,

I had problems with stopping sync after upgrade to 4.1.2.

I upgraded to 4.1.3 and it ran fine for one day, but now one of the master 
nodes shows faulty.

Most of the sync jobs have return code 23, how do I resolve this?

I see messages like:

_GMaster: Sucessfully fixed all entry ops with gfid mismatch

Will this resolve error code 23?

There is also a Python error.

The Python error was an SELinux problem; turning off SELinux made the node go
back to active.

See log below.


CentOS 7, installed through SIG Gluster (OS updated to latest at the same time)

Master cluster: 2 x (2 + 1) distributed, replicated

Client cluster: 1 x (2 + 1) replicated


Many thanks in advance!


Best regards

Marcus Pedersén



gsyncd.log from Faulty node:

[2018-08-31 06:25:51.375267] I [master(worker /urd-gds/gluster):1944:syncjob] 
Syncer: Sync Time Taken   duration=0.8099 num_files=57job=3   return_code=23
[2018-08-31 06:25:51.465895] I [master(worker /urd-gds/gluster):1944:syncjob] 
Syncer: Sync Time Taken   duration=0.0904 num_files=3 job=3   return_code=23
[2018-08-31 06:25:52.562107] E [repce(worker /urd-gds/gluster):197:__call__] 
RepceClient: call failed   call=30069:139655665837888:1535696752.35
method=entry_opserror=OSError
[2018-08-31 06:25:52.562346] E [syncdutils(worker 
/urd-gds/gluster):332:log_raise_exception] : FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 311, in main
func(args)
  File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 72, in 
subcmd_worker
local.service_loop(remote)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1288, in 
service_loop
g3.crawlwrap(oneshot=True)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 615, in 
crawlwrap
self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1545, in crawl
self.changelogs_batch_process(changes)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1445, in 
changelogs_batch_process
self.process(batch)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1280, in 
process
self.process_change(change, done, retry)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1179, in 
process_change
failures = self.slave.server.entry_ops(entries)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 216, in 
__call__
return self.ins(self.meth, *a)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 198, in 
__call__
raise res
OSError: [Errno 13] Permission denied
[2018-08-31 06:25:52.578367] I [repce(agent /urd-gds/gluster):80:service_loop] 
RepceServer: terminating on reaching EOF.
[2018-08-31 06:25:53.558765] I [monitor(monitor):279:monitor] Monitor: worker 
died in startup phase brick=/urd-gds/gluster
[2018-08-31 06:25:53.569777] I [gsyncdstatus(monitor):244:set_worker_status] 
GeorepStatus: Worker Status Change status=Faulty
[2018-08-31 06:26:03.593161] I [monitor(monitor):158:monitor] Monitor: starting 
gsyncd worker   brick=/urd-gds/gluster  slave_node=urd-gds-geo-000
[2018-08-31 06:26:03.636452] I [gsyncd(agent /urd-gds/gluster):297:main] : 
Using session config file   
path=/var/lib/glusterd/geo-replication/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/gsyncd.conf
[2018-08-31 06:26:03.636810] I [gsyncd(worker /urd-gds/gluster):297:main] 
: Using session config file  
path=/var/lib/glusterd/geo-replication/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/gsyncd.conf
[2018-08-31 06:26:03.637486] I [changelogagent(agent 
/urd-gds/gluster):72:__init__] ChangelogAgent: Agent listining...
[2018-08-31 06:26:03.650330] I [resource(worker 
/urd-gds/gluster):1377:connect_remote] SSH: Initializing SSH connection between 
master and slave...
[2018-08-31 06:26:05.296473] I [resource(worker 
/urd-gds/gluster):1424:connect_remote] SSH: SSH connection between master and 
slave established.duration=1.6457
[2018-08-31 06:26:05.297904] I [resource(worker /urd-gds/gluster):1096:connect] 
GLUSTER: Mounting gluster volume locally...
[2018-08-31 06:26:06.396939] I [resource(worker /urd-gds/gluster):1119:connect] 
GLUSTER: Mounted gluster volume duration=1.0985
[2018-08-31 06:26:06.397691] I [subcmds(worker 
/urd-gds/gluster):70:subcmd_worker] : Worker spawn successful. 
Acknowledging back to monitor
[2018-08-31 06:26:16.815566] I [master(worker /urd-gds/gluster):1593:register] 
_GMaster: Working dir
path=/var/lib/misc/gluster/gsyncd/urd-gds-volume_urd-gds-geo-001_urd-gds-volume/urd-gds-gluster
[2018-08-31 06:26:16.816423] I [resource(worker 
/urd-gds/gluster):1282:service_loop] GLUSTER: Register time time=1535696776
[2018-08-31 06:26:16.888772] I [gsyncdstatus(worker 
/urd-gds/gluster):277:set_active] GeorepStatus: Worker Status Change
status=Active
[2018-08-31 06:26:16.892049] I [gsyncdstatus(worker 
/urd-gds/gluster):249:set_worker_crawl_status] GeorepStatus: Crawl Status 
Change  

Re: [Gluster-users] Gluter 3.12.12: performance during heal and in general

2018-08-31 Thread Hu Bert
Hi Pranith,

I just wanted to ask if you were able to get any feedback from your
colleagues :-)

Btw.: we migrated some stuff (static resources, small files) to an NFS
server that we actually wanted to replace with glusterfs. Load and CPU
usage have gone down a bit, but are still asymmetric across the 3 gluster
servers.


2018-08-28 9:24 GMT+02:00 Hu Bert :
> Hm, i noticed that in the shared.log (volume log file) on gluster11
> and gluster12 (but not on gluster13) i now see these warnings:
>
> [2018-08-28 07:18:57.224367] W [MSGID: 109011]
> [dht-layout.c:186:dht_layout_search] 0-shared-dht: no subvolume for
> hash (value) = 3054593291
> [2018-08-28 07:19:17.733625] W [MSGID: 109011]
> [dht-layout.c:186:dht_layout_search] 0-shared-dht: no subvolume for
> hash (value) = 2595205890
> [2018-08-28 07:19:27.950355] W [MSGID: 109011]
> [dht-layout.c:186:dht_layout_search] 0-shared-dht: no subvolume for
> hash (value) = 3105728076
> [2018-08-28 07:19:42.519010] W [MSGID: 109011]
> [dht-layout.c:186:dht_layout_search] 0-shared-dht: no subvolume for
> hash (value) = 3740415196
> [2018-08-28 07:19:48.194774] W [MSGID: 109011]
> [dht-layout.c:186:dht_layout_search] 0-shared-dht: no subvolume for
> hash (value) = 2922795043
> [2018-08-28 07:19:52.506135] W [MSGID: 109011]
> [dht-layout.c:186:dht_layout_search] 0-shared-dht: no subvolume for
> hash (value) = 2841655539
> [2018-08-28 07:19:55.466352] W [MSGID: 109011]
> [dht-layout.c:186:dht_layout_search] 0-shared-dht: no subvolume for
> hash (value) = 3049465001
>
> Don't know if that could be related.
>
>
> 2018-08-28 8:54 GMT+02:00 Hu Bert :
>> a little update after about 2 hours of uptime: still/again high cpu
>> usage by one brick processes. server load >30.
>>
>> gluster11: high cpu; brick /gluster/bricksdd1/; no hdd exchange so far
>> gluster12: normal cpu; brick /gluster/bricksdd1_new/; hdd change /dev/sdd
>> gluster13: high cpu; brick /gluster/bricksdd1_new/; hdd change /dev/sdd
>>
>> The process for brick bricksdd1 consumes almost all 12 cores.
>> Interestingly there are more threads for the bricksdd1 process than
>> for the other bricks. Counted with "ps huH p  | wc
>> -l"
>>
>> gluster11:
>> bricksda1 59 threads, bricksdb1 65 threads, bricksdc1 68 threads,
>> bricksdd1 85 threads
>> gluster12:
>> bricksda1 65 threads, bricksdb1 60 threads, bricksdc1 61 threads,
>> bricksdd1_new 58 threads
>> gluster13:
>> bricksda1 61 threads, bricksdb1 60 threads, bricksdc1 61 threads,
>> bricksdd1_new 82 threads
>>
>> Don't know if that could be relevant.
>>
>> 2018-08-28 7:04 GMT+02:00 Hu Bert :
>>> Good Morning,
>>>
>>> today i update + rebooted all gluster servers, kernel update to
>>> 4.9.0-8 and gluster to 3.12.13. Reboots went fine, but on one of the
>>> gluster servers (gluster13) one of the bricks did come up at the
>>> beginning but then lost connection.
>>>
>>> OK:
>>>
>>> Status of volume: shared
>>> Gluster process TCP Port  RDMA Port  Online  Pid
>>> --
>>> [...]
>>> Brick gluster11:/gluster/bricksdd1/shared 49155 0
>>> Y   2506
>>> Brick gluster12:/gluster/bricksdd1_new/shared49155 0
>>> Y   2097
>>> Brick gluster13:/gluster/bricksdd1_new/shared49155 0
>>> Y   2136
>>>
>>> Lost connection:
>>>
>>> Brick gluster11:/gluster/bricksdd1/shared  49155 0
>>>  Y   2506
>>> Brick gluster12:/gluster/bricksdd1_new/shared 49155 0
>>> Y   2097
>>> Brick gluster13:/gluster/bricksdd1_new/shared N/A   N/A
>>> N   N/A
>>>
>>> gluster volume heal shared info:
>>> Brick gluster13:/gluster/bricksdd1_new/shared
>>> Status: Transport endpoint is not connected
>>> Number of entries: -
>>>
>>> reboot was at 06:15:39; brick then worked for a short period, but then
>>> somehow disconnected.
>>>
>>> from gluster13:/var/log/glusterfs/glusterd.log:
>>>
>>> [2018-08-28 04:27:36.944608] I [MSGID: 106005]
>>> [glusterd-handler.c:6071:__glusterd_brick_rpc_notify] 0-management:
>>> Brick gluster13:/gluster/bricksdd1_new/shared has disconnected from
>>> glusterd.
>>> [2018-08-28 04:28:57.869666] I
>>> [glusterd-utils.c:6056:glusterd_brick_start] 0-management: starting a
>>> fresh brick process for brick /gluster/bricksdd1_new/shared
>>> [2018-08-28 04:35:20.732666] I [MSGID: 106143]
>>> [glusterd-pmap.c:295:pmap_registry_bind] 0-pmap: adding brick
>>> /gluster/bricksdd1_new/shared on port 49157
>>>
>>> After 'gluster volume start shared force' (then with new port 49157):
>>>
>>> Brick gluster11:/gluster/bricksdd1/shared   49155 0
>>>   Y   2506
>>> Brick gluster12:/gluster/bricksdd1_new/shared  49155 0
>>>  Y   2097
>>> Brick gluster13:/gluster/bricksdd1_new/shared  49157 0
>>>  Y   3994
>>>
>>> from /var/log/syslog:
>>>
>>> Aug 28 06:27:36 gluster13 gluster-bricksdd1_new-shared[2136]: pending 
>>> frames:
>>> Aug 28 06:27:36 gluster13 

Re: [Gluster-users] Upgrade to 4.1.2 geo-replication does not work

2018-08-31 Thread Krishna Verma
Hi Kotresh,

I have tested geo-replication over distributed volumes with a 2*2 gluster
setup.

[root@gluster-poc-noida ~]# gluster volume geo-replication glusterdist 
gluster-poc-sj::glusterdist status

MASTER NODE  MASTER VOL MASTER BRICK  SLAVE USER
SLAVE  SLAVE NODE STATUSCRAWL STATUS   
LAST_SYNCED
-
gluster-poc-noidaglusterdist/data/gluster-dist/distvolroot  
gluster-poc-sj::glusterdistgluster-poc-sj ActiveChangelog Crawl
2018-08-31 10:28:19
noi-poc-gluster  glusterdist/data/gluster-dist/distvolroot  
gluster-poc-sj::glusterdistgluster-poc-sj2ActiveHistory Crawl  
N/A
[root@gluster-poc-noida ~]#

Not at client I copied a 848MB file from local disk to master mounted volume 
and it took only 1 minute and 15 seconds. Its great….

But even after waiting for 2 hrs I was unable to see that file at the slave site.
Then I again erased the indexing by doing "gluster volume set glusterdist
indexing off" and restarted the session. Magically, I received the file at the slave
instantly after doing this.

Why do I need to do "indexing off" every time for the data to show up at the slave
site? Is there any fix/workaround for it?
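
For reference, the sequence I run each time is roughly the following (volume and
session names as in the status output above; this is only the workaround, not a
proper fix):

gluster volume geo-replication glusterdist gluster-poc-sj::glusterdist stop
gluster volume set glusterdist indexing off
gluster volume geo-replication glusterdist gluster-poc-sj::glusterdist start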

/Krishna


From: Kotresh Hiremath Ravishankar 
Sent: Friday, August 31, 2018 10:10 AM
To: Krishna Verma 
Cc: Sunny Kumar ; Gluster Users 
Subject: Re: [Gluster-users] Upgrade to 4.1.2 geo-replication does not work

On Thu, Aug 30, 2018 at 3:51 PM, Krishna Verma <kve...@cadence.com> wrote:
Hi Kotresh,

Yes, this includes the time taken to write the 1GB file to the master. geo-rep was not
stopped while the data was being copied to the master.

This way, you can't really measure how much time geo-rep took.


But now I am in trouble. My PuTTY session timed out while data was being copied to
the master and geo-replication was active. After I restarted the PuTTY session, my master
data is not syncing with the slave. Its LAST_SYNCED time is 1 hr behind the
current time.

I restarted geo-rep and also deleted and re-created the session, but its
"LAST_SYNCED" time stays the same.

Unless geo-rep is Faulty, it would be processing/syncing. You should check the
logs for any errors.
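
A rough sketch of where to look (the log locations below are the usual defaults;
the volume/session names are placeholders to adjust for your setup):

# on the master nodes: recent errors in the geo-rep worker logs
grep -iE "error|exception|faulty" /var/log/glusterfs/geo-replication/*/*.log | tail -n 50
# on the slave nodes
grep -iE "error|exception" /var/log/glusterfs/geo-replication-slaves/*/*.log | tail -n 50
# and the current worker state
gluster volume geo-replication <mastervol> <slavehost>::<slavevol> status detail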


Please help in this.

…. It's better if the gluster volume has a higher distribute count like 3*3 or 4*3 :-
Are you referring to creating a distributed volume with 3 master nodes and 3
slave nodes?

Yes, that's correct. Please do the test with this. I recommend you run the
actual workload for which you are planning to use gluster, instead of testing by
copying a 1GB file.



/krishna

From: Kotresh Hiremath Ravishankar <khire...@redhat.com>
Sent: Thursday, August 30, 2018 3:20 PM
To: Krishna Verma <kve...@cadence.com>
Cc: Sunny Kumar <sunku...@redhat.com>; Gluster Users <gluster-users@gluster.org>
Subject: Re: [Gluster-users] Upgrade to 4.1.2 geo-replication does not work

On Thu, Aug 30, 2018 at 1:52 PM, Krishna Verma <kve...@cadence.com> wrote:
Hi Kotresh,

After fixing the library link on node "noi-poc-gluster", the status of one master
node is “Active” and the other is “Passive”. Can I set up both masters as
“Active”?

Nope, since it's a replica, it's redundant to sync the same files from two nodes.
Both replicas can't be Active.


Also, when I copy a 1GB file from the gluster client to the master gluster
volume, which is geo-replicated to the slave volume, it took 35 minutes and 49
seconds. Is there any way to reduce the time taken to rsync the data?

How did you measure this time? Does this include the time taken for you to write
the 1GB file to the master?
There are two aspects to consider while measuring this.

1. Time to write 1GB to master
2. Time for geo-rep to transfer 1GB to slave.

In your case, since the setup is 1*2 and only one geo-rep worker is Active,
step 2 above equals the time for step 1 plus the network transfer time.

You can measure the time in two scenarios:
1. Geo-rep is started while the data is still being written to the master. That is
one way.
2. Or stop geo-rep until the 1GB file is written to the master, and then start
geo-rep to get the actual geo-rep time.
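
A rough way to time scenario 2 (the volume, session and mount names below are
placeholders; treat this as a sketch only):

gluster volume geo-replication <mastervol> <slavehost>::<slavevol> stop
time cp /path/to/testfile-1G /mnt/<mastervol>/     # time to write to the master
gluster volume geo-replication <mastervol> <slavehost>::<slavevol> start
# poll until LAST_SYNCED moves past the write time; that difference is the
# actual geo-rep transfer time
watch -n 30 "gluster volume geo-replication <mastervol> <slavehost>::<slavevol> status"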

To improve replication speed:
1. You can play around with rsync options depending on the kind of I/O
and configure the same for geo-rep, as it also uses rsync internally
(see the sketch after this list).
2. It's better if the gluster volume has a higher distribute count, like 3*3 or 4*3.
This helps in two ways:
   1. The files get distributed across multiple bricks on the master.
   2. Files on multiple bricks are then synced in parallel by geo-rep
(multiple Actives).
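
As a sketch of point 1 above - the exact rsync options depend on your I/O pattern
and the WAN link, so the flag and names here are examples only, not a recommendation:

# set extra rsync options for the geo-rep session (e.g. compression for a WAN link)
gluster volume geo-replication <mastervol> <slavehost>::<slavevol> config rsync-options "--compress"
# check what is currently configured
gluster volume geo-replication <mastervol> <slavehost>::<slavevol> config rsync-options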

NOTE: The gluster master server and one client are in Noida, India.
 The gluster slave server and one client are in the USA.

Our approach is to transfer data from Noida gluster client will reach 

Re: [Gluster-users] gluster connection interrupted during transfer

2018-08-31 Thread Raghavendra Gowdappa
On Fri, Aug 31, 2018 at 11:11 AM, Richard Neuboeck wrote:

> On 08/31/2018 03:50 AM, Raghavendra Gowdappa wrote:
> > +Mohit. +Milind
> >
> > @Mohit/Milind,
> >
> > Can you check logs and see whether you can find anything relevant?
>
> From glancing at the system logs, nothing out of the ordinary
> occurred. However I'll start another rsync and take a closer look.
> It will take a few days.
>
> >
> > On Thu, Aug 30, 2018 at 7:04 PM, Richard Neuboeck
> > <h...@tbi.univie.ac.at> wrote:
> >
> > Hi,
> >
> > I'm attaching a shortened version, since the whole client mount log is
> > about 5.8GB. It includes the initial mount messages and the
> > last two minutes of log entries.
> >
> > It ends very anticlimactically, without an obvious error. Is there
> > anything specific I should be looking for?
> >
> >
> > Normally I look at the logs around disconnect msgs to find out the reason.
> > But as you said, sometimes one can see just disconnect msgs without
> > any reason. That normally points to the reason for the disconnect being
> > in the network rather than a glusterfs-initiated disconnect.
>
> The rsync source is serving our homes currently so there are NFS
> connections 24/7. There don't seem to be any network related
> interruptions


Can you set diagnostics.client-log-level and diagnostics.brick-log-level to
TRACE and check the logs at both ends of the connection - client and brick? To
reduce the log size, I would suggest rotating the existing logs and starting
with fresh logs just before you begin, so that only the relevant logs are
captured. Also, can you take an strace of the client and brick processes using:

strace -o <output-file> -ff -v -p <pid>

and attach both the logs and the strace output? Let's trace through what the
syscalls on the socket return and then decide whether to inspect a tcpdump or
not. If you don't want to repeat the tests again, please capture a tcpdump too
(on both ends of the connection) and send them to us.
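
Putting the above together, the capture could look roughly like this (volume name,
PIDs, interface and peer IP are placeholders):

gluster volume set <volname> diagnostics.client-log-level TRACE
gluster volume set <volname> diagnostics.brick-log-level TRACE
# rotate the existing client/brick logs first so only the reproduction is captured
strace -o client.trace -ff -v -p <fuse-client-pid> &    # on the client machine
strace -o brick.trace -ff -v -p <brick-pid> &           # on the brick server
tcpdump -i <iface> -s 0 -w gluster.pcap host <peer-ip>  # optional, on both ends
# ... reproduce with rsync, then reset the log levels
gluster volume reset <volname> diagnostics.client-log-level
gluster volume reset <volname> diagnostics.brick-log-level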


> - a co-worker would be here faster than I could check
> the logs if the connection to home would be broken ;-)
> The three gluster machines are due to this problem reduced to only
> testing so there is nothing else running.
>
>
> >
> > Cheers
> > Richard
> >
> > On 08/30/2018 02:40 PM, Raghavendra Gowdappa wrote:
> > > Normally client logs will give a clue on why the disconnections are
> > > happening (ping-timeout, wrong port, etc.). Can you look into the client
> > > logs to figure out what's happening? If you can't find anything, can
> > > you send across the client logs?
> > >
> > > On Wed, Aug 29, 2018 at 6:11 PM, Richard Neuboeck
> > > <h...@tbi.univie.ac.at> wrote:
> > >
> > > Hi Gluster Community,
> > >
> > > I have problems with a glusterfs 'Transport endpoint not connected'
> > > connection abort during file transfers that I can replicate (all the
> > > time now) but not pinpoint as to why this is happening.
> > >
> > > The volume is set up in replica 3 mode and accessed with the fuse
> > > gluster client. Both client and server are running CentOS and the
> > > supplied 3.12.11 version of gluster.
> > > The connection abort happens at different times during rsync but
> > > occurs every time I try to sync all our files (1.1TB) to the empty
> > > volume.
> > > On the client and server side I don't find errors in the gluster log files.
> > > rsync logs the obvious transfer problem. The only log that shows
> > > anything related is the server brick log, which states that the
> > > connection is shutting down:
> > >
> > > [2018-08-18 22:40:35.502510] I [MSGID: 115036]
> > > [server.c:527:server_rpc_notify] 0-home-server: disconnecting
> > > connection from
> > > brax-110405-2018/08/16-08:36:28:575972-home-client-0-0-0
> > > [2018-08-18 22:40:35.502620] W
> > > [inodelk.c:499:pl_inodelk_log_cleanup] 0-home-server:
> > releasing lock
> > > on eaeb0398-fefd-486d-84a7-f13744d1cf10 held by
> > > {client=0x7f83ec0b3ce0, pid=110423 lk-owner=d0fd5ffb427f}
> > > [2018-08-18 22:40:35.502692] W
> > > [entrylk.c:864:pl_entrylk_log_cleanup] 0-home-server:
> > releasing lock
> > > on faa93f7b-6c46-4251-b2b2-abcd2f2613e1 held by
> > > {client=0x7f83ec0b3ce0, pid=110423 lk-owner=703dd4cc407f}
> > > [2018-08-18 22:40:35.502719] W
> > > [entrylk.c:864:pl_entrylk_log_cleanup] 0-home-server:
> > releasing lock
> > > on faa93f7b-6c46-4251-b2b2-abcd2f2613e1 held by
> > > {client=0x7f83ec0b3ce0, pid=110423 lk-owner=703dd4cc407f}
> > > [2018-08-18 22:40:35.505950] I [MSGID: 101055]
> > > [client_t.c:443:gf_client_unref] 0-home-server: Shutting down
> > > connection
> >