Re: [Gluster-users] Cannot list `.snaps/` directory

2018-07-24 Thread Riccardo Murri
...and here are the statedumps of the client, snapd and snapshot brick
processes from the 4.1 test cluster.
(File names are as output by the GlusterD processes, so it looks like
the `snapd` daemon has an off-by-one error in the statedump file name.)

Thanks,
Riccardo


glusterdump.1726.dump.1532446976.gz
Description: application/gzip


run-gluster-snaps-39f9f186c67f418a989f783c3289d166-brick3.5446.dump.1532446840.gz
Description: application/gzip


napd-glusterfs.5482.dump.1532446843.gz
Description: application/gzip

Re: [Gluster-users] Cannot list `.snaps/` directory

2018-07-24 Thread Rafi Kavungal Chundattu Parambil
Let me try to reproduce it. In the meantime, can you take a statedump of
the client process, the snapd process, and the snapshot brick process?
Please refer to the documentation [1] if you have any trouble performing
the statedump operation.

[1] : https://docs.gluster.org/en/v3/Troubleshooting/statedump/
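
(For reference, a minimal sketch of the statedump procedure, assuming the
default dump directory /var/run/gluster; VOLNAME and the PIDs are
placeholders:)

# Brick processes: trigger dumps via the CLI on one of the server nodes
gluster volume statedump VOLNAME

# Client, snapd or snapshot-brick processes: send SIGUSR1 to the
# corresponding glusterfs process; the dump is written to
# /var/run/gluster/ as glusterdump.<pid>.dump.<timestamp>
kill -USR1 <pid>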

Rafi KC

- Original Message -
From: "Riccardo Murri" 
To: "Mohammed Rafi K C" 
Cc: gluster-users@gluster.org
Sent: Tuesday, July 24, 2018 6:42:44 PM
Subject: Re: [Gluster-users] Cannot list `.snaps/` directory

Hello,

I have set up a test cluster with GlusterFS 4.1 and Ubuntu 16.04.5 and
I get the same behavior: `ls .snaps/test/` hangs indefinitely in a
getdents() system call.  I can mount and list the snapshot just fine
with `mount -t glusterfs`; it's just the USS feature that is not working.
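
(For reference, a sketch of the two access paths compared above; host name,
parent volume name and mount points are placeholders:)

# Direct mount of the activated snapshot named "test":
mount -t glusterfs server1:/snaps/test/VOLNAME /mnt/snap-test

# The same snapshot via USS, inside the regular volume mount:
ls /mnt/VOLNAME/.snaps/test/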

Is this a known bug?  Any hints on how to work around it?

Thanks,
Riccardo




Re: [Gluster-users] georeplication woes

2018-07-24 Thread Maarten van Baarsel

On 24-07-2018 06:12:25, Kotresh Hiremath Ravishankar wrote:

Hi Kotresh,


Looks like gsyncd on the slave is failing for some reason.

Please run the command below on the master:

#ssh -i /var/lib/glusterd/geo-replication/secret.pem georep@gluster-4.glstr

It should run gsyncd on the slave. If there is an error, it should be fixed.
Please share the output of the above command.


here we go:

root@gluster-3:/home/mrten# ssh -i /var/lib/glusterd/geo-replication/secret.pem georep@gluster-4.glstr

usage: gsyncd.py [-h]

{monitor-status,monitor,worker,agent,slave,status,config-check,config-get,config-set,config-reset,voluuidget,delete}
 ...
gsyncd.py: error: too few arguments
Connection to gluster-4.glstr closed.
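
(For what it's worth, the usage message above means gsyncd was in fact
invoked on the slave, just without a subcommand. A quick, hypothetical check
that the georep user's key on the slave is restricted to the gsyncd forced
command; the home directory path is an assumption:)

# On gluster-4, as root:
grep 'command=' /home/georep/.ssh/authorized_keys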


Maarten.

Re: [Gluster-users] Gluter 3.12.12: performance during heal and in general

2018-07-24 Thread Pranith Kumar Karampuri
On Mon, Jul 23, 2018 at 4:16 PM, Hu Bert  wrote:

> Well, over the weekend about 200 GB were copied, so now there are
> ~400 GB copied to the brick. That's nowhere near a speed of 10 GB per
> hour. If I copied the 1.6 TB directly, that would be done within 2
> days at most. But with the self heal this will take at least 20 days.
>
> Why is the performance that bad? No chance of speeding this up?
>

What kind of data do you have?
How many directories are in the filesystem?
On average, how many files per directory?
What is the depth of your directory hierarchy on average?
What is the average file size?

Based on this data we can see whether anything can be improved, or whether
some enhancements need to be implemented in Gluster to address this kind of
data layout.
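
(A quick, hypothetical way to collect those numbers; run it against a client
mount of the volume, GNU find assumed, and MOUNT is a placeholder path:)

MOUNT=/mnt/shared                        # placeholder
find "$MOUNT" -type d | wc -l            # number of directories
find "$MOUNT" -type f | wc -l            # number of files (files per directory = files / dirs)
find "$MOUNT" -type f -printf '%s\n' | awk '{s+=$1; n++} END {if (n) print s/n}'   # average file size in bytes
find "$MOUNT" -type d -printf '%d\n' | awk '{s+=$1; n++} END {if (n) print s/n}'   # average directory depth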

>
> 2018-07-20 9:41 GMT+02:00 Hu Bert :
> > hmm... does no one have any idea?
> >
> > Additional question: the hdd on server gluster12 was changed, so far
> > ~220 GB were copied. On the other 2 servers I see a lot of entries in
> > glustershd.log, about 312,000 and 336,000 entries there yesterday,
> > respectively; most of them (current log output) look like this:
> >
> > [2018-07-20 07:30:49.757595] I [MSGID: 108026]
> > [afr-self-heal-common.c:1724:afr_log_selfheal] 0-shared-replicate-3:
> > Completed data selfheal on 0d863a62-0dd8-401c-b699-2b642d9fd2b6.
> > sources=0 [2]  sinks=1
> > [2018-07-20 07:30:49.992398] I [MSGID: 108026]
> > [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do]
> > 0-shared-replicate-3: performing metadata selfheal on
> > 0d863a62-0dd8-401c-b699-2b642d9fd2b6
> > [2018-07-20 07:30:50.243551] I [MSGID: 108026]
> > [afr-self-heal-common.c:1724:afr_log_selfheal] 0-shared-replicate-3:
> > Completed metadata selfheal on 0d863a62-0dd8-401c-b699-2b642d9fd2b6.
> > sources=0 [2]  sinks=1
> >
> > or like this:
> >
> > [2018-07-20 07:38:41.726943] I [MSGID: 108026]
> > [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do]
> > 0-shared-replicate-3: performing metadata selfheal on
> > 9276097a-cdac-4d12-9dc6-04b1ea4458ba
> > [2018-07-20 07:38:41.855737] I [MSGID: 108026]
> > [afr-self-heal-common.c:1724:afr_log_selfheal] 0-shared-replicate-3:
> > Completed metadata selfheal on 9276097a-cdac-4d12-9dc6-04b1ea4458ba.
> > sources=[0] 2  sinks=1
> > [2018-07-20 07:38:44.755800] I [MSGID: 108026]
> > [afr-self-heal-entry.c:887:afr_selfheal_entry_do]
> > 0-shared-replicate-3: performing entry selfheal on
> > 9276097a-cdac-4d12-9dc6-04b1ea4458ba
> >
> > Is this behaviour normal? I'd expect these messages on the server with
> > the failed brick, not on the other ones.
> >
> > 2018-07-19 8:31 GMT+02:00 Hu Bert :
> >> Hi there,
> >>
> >> sent this mail yesterday, but somehow it didn't work? It wasn't archived,
> >> so please be indulgent if you receive this mail again :-)
> >>
> >> We are currently running a replicate setup and are experiencing quite
> >> poor performance. It got even worse when, within a couple of weeks,
> >> 2 bricks (disks) crashed. Here is some general information about our
> >> setup:
> >>
> >> 3 Dell PowerEdge R530 (Xeon E5-1650 v3 Hexa-Core, 64 GB DDR4, OS on
> >> separate disks); each server has 4 10TB disks -> each is a brick;
> >> replica 3 setup (see gluster volume status below). Debian stretch,
> >> kernel 4.9.0, gluster version 3.12.12. Servers and clients are
> >> connected via 10 GBit ethernet.
> >>
> >> About a month ago and 2 days ago a disk died (on different servers);
> >> the disks were replaced, brought back into the volume, and a full self
> >> heal started. But the speed for this is quite... disappointing. Each
> >> brick has ~1.6 TB of data on it (mostly the infamous small files). The
> >> full heal I started yesterday copied only ~50 GB within 24 hours (48
> >> hours: about 100 GB); at this rate it would take weeks until the self
> >> heal finishes.
> >>
> >> After the first heal (started on gluster13 about a month ago, took
> >> about 3 weeks) finished, we had terrible performance; CPU on one or
> >> two of the nodes (gluster11, gluster12) was up to 1200%, consumed by
> >> the brick process of the previously crashed brick (bricksdd1),
> >> interestingly not on the server with the failed disk, but on the other
> >> two...
> >>
> >> Well... am I doing something wrong? Some options wrongly configured?
> >> Terrible setup? Anyone got an idea? Any additional information needed?
> >>
> >>
> >> Thx in advance :-)
> >>
> >> gluster volume status
> >>
> >> Volume Name: shared
> >> Type: Distributed-Replicate
> >> Volume ID: e879d208-1d8c-4089-85f3-ef1b3aa45d36
> >> Status: Started
> >> Snapshot Count: 0
> >> Number of Bricks: 4 x 3 = 12
> >> Transport-type: tcp
> >> Bricks:
> >> Brick1: gluster11:/gluster/bricksda1/shared
> >> Brick2: gluster12:/gluster/bricksda1/shared
> >> Brick3: gluster13:/gluster/bricksda1/shared
> >> Brick4: gluster11:/gluster/bricksdb1/shared
> >> Brick5: gluster12:/gluster/bricksdb1/shared
> >> Brick6: gluster13:/gluster/bricksdb1/shared
> >> Brick7: gluster11:/gluster/bricksdc1/shared
> >> Brick8: