Here you can find 10 stack trace samples from glusterd. I waited 10
seconds between each trace.
https://www.dropbox.com/s/9f36goq5xn3p1yt/glusterd_pstack.zip?dl=0
Content of the first stack trace is here:
Thread 8 (Thread 0x7f7a8cd4e700 (LWP 43069)):
#0 0x003aa5c0f00d in nanosleep () from /lib
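A minimal way to collect such samples, assuming pstack is installed and a
single glusterd process is running (file names are illustrative):

    # Take 10 stack samples of glusterd, 10 seconds apart.
    for i in $(seq 1 10); do
        pstack "$(pidof glusterd)" > "glusterd_pstack_${i}.txt"
        sleep 10
    done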
Hi,
On Thu, Aug 24, 2017 at 2:13 AM, WK wrote:
> The default timeout for most OS versions is 30 seconds and the Gluster
> timeout is 42, so yes you can trigger an RO event.
I get a read-only mount within approximately 2 seconds after a failed IO.
> Though it is easy enough to raise, as Pavel mentioned
Unlikely. In your case only the afr.dirty xattr is set, not the
afr.volname-client-xx one.
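These AFR xattrs can be inspected directly on a brick with getfattr; a
sketch, where the brick path is a placeholder:

    # Run against the file on the brick backing store; path is an example.
    getfattr -d -m . -e hex /data/brick1/path/to/file
    # Look for trusted.afr.dirty and trusted.afr.<volname>-client-N entries.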
`gluster volume set myvolume diagnostics.client-log-level DEBUG` is right.
On 08/23/2017 10:31 PM, mabi wrote:
I just saw the following bug which was fixed in 3.8.15:
https://bugzilla.redhat.com/show_bug.cgi?id=1471613
Hi,
I have a fresh GlusterFS 3.11 installation on Ubuntu 16.04 that does not
work. I followed the instructions from
http://gluster.readthedocs.io/en/latest/Install-Guide/Configure/
Everything went well (peer connectivity, volume creation, volume
startup, GlusterFS mount). But the joy stopped
That really isn't an arbiter issue, or for that matter a Gluster issue. We
have seen that with vanilla NAS servers that had some issue or another.
Arbiter simply makes it less likely to be an issue than replica 2, but in
turn arbiter is less 'safe' than replica 3.
However, in regards to Gluster
Hi,
I have a gluster cluster running with geo-replication. The volume being
geo-replicated runs on an LVM thin pool. The thin pool overflowed, causing
a crash. I extended the thin pool LV and remounted, which brought the volume
back online and healthy. When I try to restart the geo-
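Restarting a geo-replication session would typically look like this; a
sketch, where the master volume and slave names are placeholders:

    gluster volume geo-replication mastervol slavehost::slavevol status
    gluster volume geo-replication mastervol slavehost::slavevol start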
I remember seeing errors like "Transport endpoint not connected" in
the client logs after a ping timeout, even with arbiter. Arbiter does not
prevent this.
And if you end up in a situation where the arbiter blames the only running
brick for a given file, you are doomed.
-ps
On Wed, Aug 23, 2017 at 9:26 PM,
Really? I can't see why. But I've never used arbiter, so you probably
know more about this than I do.
In any case, with replica 3, I've never had a problem.
On Wed, Aug 23, 2017 at 09:13:28PM +0200, Pavel Szalbot wrote:
> Hi, I believe it is not that simple. Even replica 2 + arbiter volume
> with defa
Hi, I believe it is not that simple. Even replica 2 + arbiter volume
with default network.ping-timeout will cause the underlying VM to
remount its filesystem as read-only (a device error will occur) unless you
tune the mount options in the VM's fstab.
-ps
On Wed, Aug 23, 2017 at 6:59 PM, wrote:
> What he is
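The fstab tuning Pavel refers to might look like the following inside the
guest VM; a sketch, where the device name, filesystem type, and timeout
value are assumptions:

    # /etc/fstab inside the VM: stay read-write on IO errors (ext4 example).
    /dev/sda1  /  ext4  defaults,errors=continue  0  1

    # Alternatively, raise the guest's SCSI command timeout above Gluster's
    # default network.ping-timeout of 42 seconds:
    echo 90 > /sys/block/sda/device/timeout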
On 8/21/2017 1:09 PM, Gionatan Danti wrote:
Hi all,
I would like to ask if, and with how much success, you are using
GlusterFS for virtual machine storage.
My plan: I want to set up a 2-node cluster, where VMs run on the nodes
themselves and can be live-migrated on demand.
I have some que
Excuse me, I did type it like your command, but in my email it got garbled.
On Wed, Aug 23, 2017 at 8:59 PM, Niels de Vos wrote:
> On Wed, Aug 23, 2017 at 08:06:18PM +0430, Tahereh Fattahi wrote:
> > Hi
> > How can I turn off readdirp?
> > I saw some solutions, but they don't work.
> >
> > I tried server
Could you provide a pstack dump of the glusterd process?
On Wed, 23 Aug 2017 at 20:22, Atin Mukherjee wrote:
> Not yet. Gaurav will be taking a look at it tomorrow.
>
> On Wed, 23 Aug 2017 at 20:14, Serkan Çoban wrote:
>
>> Hi Atin,
>>
>> Do you have time to check the logs?
>>
>> O
I just saw the following bug which was fixed in 3.8.15:
https://bugzilla.redhat.com/show_bug.cgi?id=1471613
Is it possible that the problem I described in this post is related to that bug?
> Original Message
> Subject: Re: [Gluster-users] self-heal not working
> Local Time: Aug
What he is saying is that, on a two-node volume, upgrading a node will
cause the volume to go down. That's nothing weird; you really should use
3 nodes.
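With 3 nodes, the usual safeguard against this is server-side quorum; a
sketch, where the volume name is a placeholder:

    gluster volume set myvolume cluster.server-quorum-type server
    gluster volume set all cluster.server-quorum-ratio 51%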
On Wed, Aug 23, 2017 at 06:51:55PM +0200, Gionatan Danti wrote:
> On 23-08-2017 18:14 Pavel Szalbot wrote:
> > Hi, after many VM crashes dur
On 23-08-2017 18:14 Pavel Szalbot wrote:
Hi, after many VM crashes during upgrades of Gluster, losing network
connectivity on one node, etc., I would advise running replica 2 with
arbiter.
Hi Pavel, this is bad news :(
So, in your case at least, Gluster was not stable? Something as simple
a
On Wed, Aug 23, 2017 at 08:06:18PM +0430, Tahereh Fattahi wrote:
> Hi
> How can I turn off readdirp?
> I saw some solutions, but they don't work.
>
> I tried on the server side:
> gluster volume set vol performance.force-readdirp off
> gluster volume set vol dht.force-readdirp off
> gluster volume set vol performance.
Hi, after many VM crashes during upgrades of Gluster, losing network
connectivity on one node, etc., I would advise running replica 2 with
arbiter.
I once even managed to break this setup (with arbiter) due to network
partitioning - one data node never healed and I had to restore from
backups (it wa
On Mon, Aug 21, 2017 at 10:09:20PM +0200, Gionatan Danti wrote:
> Hi all,
> I would like to ask if, and with how much success, you are using
> GlusterFS for virtual machine storage.
Hi, we have similar clusters.
>
> My plan: I want to setup a 2-node cluster, where VM runs on the nodes
> themse
Hi all,
I would like to ask if, and with how much success, you are using
GlusterFS for virtual machine storage.
My plan: I want to set up a 2-node cluster, where VMs run on the nodes
themselves and can be live-migrated on demand.
I have some questions:
- do you use GlusterFS for similar setup
[from http://blog.nixpanic.net/2017/08/last-update-for-gluster-38.html
and also available on https://planet.gluster.org/ ]
GlusterFS 3.8.15 is available, likely the last 3.8 update
The next Long-Term-Maintenance release for Gluster is around the
corner. Once GlusterFS-3.12 is available, the o
Hi Niels,
On Fri, Aug 11, 2017 at 2:33 PM, Niels de Vos wrote:
> On Fri, Aug 11, 2017 at 05:50:47PM +0530, Ravishankar N wrote:
[...]
>> To me it looks like fadvise (mm/fadvise.c) affects only the linux page cache
>> behavior and is decoupled from the filesystem itself. What this means for
>> fus
the heal info command shows perfect consistency between nodes; that's what
confused me. At the moment, the physical partitions (LVM partitions) that
Gluster is using are different sizes, but I expected to see the "least
common denominator" for the total size, and I expected to see it consistent
ac
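The checks described here presumably amount to something like the
following, where the volume name and mount point are placeholders:

    gluster volume heal myvolume info   # per-file heal state on each brick
    df -h /mnt/myvolume                 # total size reported at the mount point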
Hi
How can I turn off readdirp?
I saw some solutions, but they don't work.
I tried on the server side:
gluster volume set vol performance.force-readdirp off
gluster volume set vol dht.force-readdirp off
gluster volume set vol performance.read-ahead off
and on the client side, a FUSE mount:
mount -t glusterfs server:/vol /mnt
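A client-side alternative that is sometimes suggested is disabling readdirp
at mount time; a sketch, assuming the installed mount.glusterfs supports
the use-readdirp option:

    mount -t glusterfs -o use-readdirp=no server:/vol /mnt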
Not yet. Gaurav will be taking a look at it tomorrow.
On Wed, 23 Aug 2017 at 20:14, Serkan Çoban wrote:
> Hi Atin,
>
> Do you have time to check the logs?
>
> On Wed, Aug 23, 2017 at 10:02 AM, Serkan Çoban
> wrote:
> > Same thing happens with 3.12.rc0. This time perf top shows hanging in
> > li
Hi Atin,
Do you have time to check the logs?
On Wed, Aug 23, 2017 at 10:02 AM, Serkan Çoban wrote:
> Same thing happens with 3.12.rc0. This time perf top shows it hanging in
> libglusterfs.so, and below are the glusterd logs, which are different
> from 3.10.
> With 3.10.5, after 60-70 minutes CPU usa
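The perf sampling mentioned here would be along these lines; a sketch:

    perf top -p "$(pidof glusterd)"   # show where glusterd spends CPU time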
In your case, with 5500 bricks, the CLI fails the command, saying "Total brick
list is larger than a request. Can take (brick_count )".
RCA:
The gluster CLI, while parsing the command, takes the word count and, based on
that, compares it against the statically initialized size, as you have pointed
out in
https://gi
Same thing happens with 3.12.rc0. This time perf top shows it hanging in
libglusterfs.so, and below are the glusterd logs, which are different
from 3.10.
With 3.10.5, after 60-70 minutes CPU usage becomes normal and we see the
brick processes come online, and the system starts to answer commands like
"gluster pe