Re: [Gluster-users] [EXT] Re: [Glusterusers] State of the gluster project

2023-10-31 Thread W Kern
Well, if what you mean by 'dead project' is that there haven't been
significant improvements, then yes. Maybe, given HOW Gluster's
architecture works, there isn't a lot that can be done to re-architect it.


If you mean a dead project because Gluster is broken, then no. At least for
its initial feature set it works really well. We have never used the
more advanced features, nor did we even try GFAPI. Just vanilla
replication with FUSE.


We started with Gluster 3.x and it worked well and was easy to manage.
Recovering from a failure was a bummer, though, due to the need to heal
whole VM files, especially on the 1Gb network connections of those days.


We then migrated to 6.x and got sharding and the arbiter, both of which
made huge improvements to speed and recovery in our replication
environments.
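
For anyone who hasn't turned it on yet, sharding is just a couple of
volume options; the 64MB block size below is only an example, and
changing the size later only affects newly created files:

gluster volume set VOL features.shard on
gluster volume set VOL features.shard-block-size 64MB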


Again, we never had Gluster issues with 6.x. The problems we did see
were bad networks, drives, etc., and Gluster handled those, including the
FUSE mount keeping the images up during a hardware failure. Then it was
a matter of swapping out drives, reassigning volumes, etc., all of which
was pretty straightforward and didn't involve downtime.


We are now on 10.1 and have yet to see any issues. Speed seems a little
faster than 6.x, but that is subjective. We haven't upgraded beyond
that because we have seen people report issues with 10.2/3/4, and it
ain't broke, so we have a wait-and-see attitude.


We have used other distributed file systems and still use MooseFS for
archiving, which is quite nice and also easy to use, but, as was mentioned
with BeeGFS, it's freemium.


To get the important pieces you have to pay up. In the MFS case that
means the free version has a single point of failure in the
mfsmaster; only the enterprise version can fail over to another
mfsmaster. So it's not as resilient as Gluster, and we did lose some
files during one particularly ugly outage (totally our fault, but those
files would have survived on Gluster).


Gluster is open source and on GitHub. I hope it stays that way.

-wk


On 10/29/23 3:54 AM, Dmitry Melekhov wrote:


29.10.2023 00:07, Zakhar Kirpichenko wrote:
I don't think it's worth it for anyone. It's a dead project since 
about 9.0, if not earlier.


Well, really earlier.

The attempt to get a better Gluster as gluster2 in 4.0 failed...






Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users







[Gluster-users] mount volumes ON the arbiter?

2023-09-21 Thread W Kern

So, I am looking to solve an extremely minor argument among my co-workers.

Since the list is quiet, I figured I'd throw this out there. It's clearly
not a problem, bug, etc., so obviously ignore it if you have better things
to do.


We have lots of 2-data-host + 1-arbiter clusters around, mostly
corresponding to libvirt pools.


So we have something akin to /POOL-GL1 available on the two data nodes
using FUSE. We also mount /POOL-GL1 on the arbiter. They correspond to
the actual volume GL1.


Obviously we can be logged into the arbiter and manipulate the mounted
files just as we would on the real data nodes, and because we may host
multiple arbiters on the same kit, it can be more convenient to log in
there when moving files among clusters.


The question is:

Assuming equal hardware, network speed, and similar hard disk I/O
performance: if we are transferring a large file (say a VM image), is it
more efficient to copy it into the mounted directory on one of the real
data hosts, or do you get the same efficiency just uploading it onto the
arb node?


Obviously if you copy a large file into the mount on the arb it is not
actually being stored there; rather, it is being copied out to the two
data nodes which have the real data, and only the metadata is retained
on the arb.
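
If anyone wants to see that for themselves, a quick look at the bricks
makes it obvious (brick paths below are just examples, use whatever
'gluster volume info GL1' reports):

# on a data node the file shows its full size:
ls -lh /bricks/GL1/brick/images/bigvm.qcow2
# on the arbiter the same file exists, with its xattrs, but zero data:
ls -lh /bricks/GL1/brick/images/bigvm.qcow2
du -sh /bricks/GL1/brick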


So the question is: by uploading to the arb, are we doing extra work, and
is it more efficient to upload into one of the volumes on a data host,
where it only has to copy the data to the other data node and the
metadata to the arb?


We ran a few tests which implied the direct-to-host option was a
'little' faster, but they were on production equipment with varying
loads, so we couldn't compare CPU loads or come to a firm conclusion.
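
In case anyone wants to reproduce the comparison, it really boils down
to timing the same copy from each box onto the same fuse mount; the file
name and path here are just examples:

time cp /tmp/test-vm.img /POOL-GL1/incoming/   # run once on a data host
time cp /tmp/test-vm.img /POOL-GL1/incoming/   # run once on the arb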


It's not yet an important enough issue to build up a test bed, so we were
wondering if perhaps someone else already knows the answer based on an
understanding of the architecture, or perhaps has already done the testing?


-bill





Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] [EXT] Re: [Glusterusers] log file spewing on one node but not the

2023-07-25 Thread W Kern

Well, as I indicated a day or so later in the RESOLVED subject addition:

Unmounting the folder and running xfs_repair seemed to solve the problem.

No issues noted in dmesg. Uptime was almost 300 days.

We shall see if the problem returns.
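
In case it helps anyone searching the archives later, the repair was
essentially the standard unmount/repair/remount dance; something like the
following, where the device and brick paths are examples and the exact
service handling will depend on your setup:

systemctl stop glusterd                # stop the management daemon on that node
pkill -f 'glusterfsd.*G1'              # stop the brick process so the umount isn't busy
umount /bricks/G1
xfs_repair /dev/sdb1                   # whatever device backs that brick
mount /bricks/G1
systemctl start glusterd               # brick comes back and rejoins
gluster volume heal G1 full            # then let self-heal catch up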

-wk

On 7/25/23 1:22 PM, Strahil Nikolov wrote:

What is the uptime of the affected node ?
There is a similar error reported in 
https://access.redhat.com/solutions/5518661 which could indicate a 
possible problem in a memory area named ‘lru’ .

Have you noticed any ECC errors in dmesg/IPMI of the system ?

At least I would reboot the node and run hardware diagnostics to check 
that everything is fine.


Best Regards,
Strahil Nikolov



On Tuesday, July 25, 2023, 4:31 AM, W Kern  wrote:

we have an older 2+1 arbiter gluster cluster running 6.10 on
Ubuntu 18 LTS

It has run beautifully for years, only occasionally needing
attention as drives have died, etc.

Each peer has two volumes, G1 and G2, with a shared 'gluster' network.

Since July 1st one of the peers for one volume has been spewing
the logfile /var-lib-G1.log with the following errors.

The volume (G2) is not showing this, nor are there issues with
the other peer and the arbiter for the G1 volume.

So it's one machine with one volume that has the problem. There
have been NO issues with the volumes themselves.

It's simply a matter of the logfiles generating GBs of entries
every hour (which is how we noticed it, when we started running
out of log space).

According to Google there are mentions of this error, but it was
supposedly fixed in the 6.x series. I can find no other mentions.

I have tried restarting glusterd with no change. There don't
seem to be any hardware issues.

I am wondering if perhaps this is an XFS filesystem corruption
issue, and whether unmounting the Gluster brick, running
xfs_repair, and bringing it back would solve the issue.

Any other suggestions?

[2023-07-21 18:51:38.260507] W [inode.c:1638:inode_table_prune]

(-->/usr/lib/x86_64-linux-gnu/glusterfs/6.10/xlator/features/shard.so(+0x21b47)

[0x7fb261c13b47]
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(inode_unref+0x36)
[0x7fb26947f416]
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x3337a)
[0x7fb26947f37a] ) 0-GLB1image-shard: Empty inode lru list found but
with (-2) lru_size
[2023-07-21 18:51:38.261231] W [inode.c:1638:inode_table_prune]
(-->/usr/lib/x86_64-linux-gnu/glusterfs/6.10/xlator/mount/fuse.so(+0xba51)

[0x7fb266cdca51]
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(inode_unref+0x36)
[0x7fb26947f416]
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x3337a)
[0x7fb26947f37a] ) 0-fuse: Empty inode lru list found but with
(-2) lru_size
[2023-07-21 18:51:38.261377] W [inode.c:1638:inode_table_prune]
(-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(loc_wipe+0x12)
[0x7fb26946bd72]
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(inode_unref+0x36)
[0x7fb26947f416]
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x3337a)
[0x7fb26947f37a] ) 0-GLB1image-shard: Empty inode lru list found but
with (-2) lru_size
[2023-07-21 18:51:38.261806] W [inode.c:1638:inode_table_prune]

(-->/usr/lib/x86_64-linux-gnu/glusterfs/6.10/xlator/cluster/replicate.so(+0x5ca57)

[0x7fb26213ba57]
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(inode_unref+0x36)
[0x7fb26947f416]
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x3337a)
[0x7fb26947f37a] ) 0-GLB1image-replicate-0: Empty inode lru list
found
but with (-2) lru_size
[2023-07-21 18:51:38.261933] W [inode.c:1638:inode_table_prune]
(-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(fd_unref+0x1ef)
[0x7fb269495eaf]
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(inode_unref+0x36)
[0x7fb26947f416]
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x3337a)
[0x7fb26947f37a] ) 0-GLB1image-client-1: Empty inode lru list
found but
with (-2) lru_size
[2023-07-21 18:51:38.262684] W [inode.c:1638:inode_table_prune]

(-->/usr/lib/x86_64-linux-gnu/glusterfs/6.10/xlator/cluster/replicate.so(+0x5ca57)

[0x7fb26213ba57]
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(inode_unref+0x36)
[0x7fb26947f416]
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x3337a)
[0x7fb26947f37a] ) 0-GLB1image-replicate-0: Empty inode lru list
found
but with (-2) lru_size

-wk





Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list

[Gluster-users] log file spewing on one node, but not the others

2023-07-21 Thread W Kern

We have an older 2+1 arbiter Gluster cluster running 6.10 on Ubuntu 18 LTS.

It has run beautifully for years, only occasionally needing attention
as drives have died, etc.


Each peer has two volumes. G1 and G2 with a shared 'gluster' network.

Since July 1st one of the peers for one volume has been spewing the
logfile /var-lib-G1.log with the following errors.


The volume (G2) is not showing this, nor are there issues with the other
peer and the arbiter for the G1 volume.


So it's one machine with one volume that has the problem. There have
been NO issues with the volumes themselves.


It's simply a matter of the logfiles generating GBs of entries every
hour (which is how we noticed it, when we started running out of log space).


According to Google there are mentions of this error, but it was
supposedly fixed in the 6.x series. I can find no other mentions.


I have tried restarting glusterd with no change. There don't seem to
be any hardware issues.


I am wondering if perhaps this is an XFS filesystem corruption issue, and
whether unmounting the Gluster brick, running xfs_repair, and bringing it
back would solve the issue.


Any other suggestions?

[2023-07-21 18:51:38.260507] W [inode.c:1638:inode_table_prune] 
(-->/usr/lib/x86_64-linux-gnu/glusterfs/6.10/xlator/features/shard.so(+0x21b47) 
[0x7fb261c13b47] 
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(inode_unref+0x36) 
[0x7fb26947f416] 
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x3337a) 
[0x7fb26947f37a] ) 0-GLB1image-shard: Empty inode lru list found but 
with (-2) lru_size
[2023-07-21 18:51:38.261231] W [inode.c:1638:inode_table_prune] 
(-->/usr/lib/x86_64-linux-gnu/glusterfs/6.10/xlator/mount/fuse.so(+0xba51) 
[0x7fb266cdca51] 
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(inode_unref+0x36) 
[0x7fb26947f416] 
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x3337a) 
[0x7fb26947f37a] ) 0-fuse: Empty inode lru list found but with (-2) lru_size
[2023-07-21 18:51:38.261377] W [inode.c:1638:inode_table_prune] 
(-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(loc_wipe+0x12) 
[0x7fb26946bd72] 
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(inode_unref+0x36) 
[0x7fb26947f416] 
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x3337a) 
[0x7fb26947f37a] ) 0-GLB1image-shard: Empty inode lru list found but 
with (-2) lru_size
[2023-07-21 18:51:38.261806] W [inode.c:1638:inode_table_prune] 
(-->/usr/lib/x86_64-linux-gnu/glusterfs/6.10/xlator/cluster/replicate.so(+0x5ca57) 
[0x7fb26213ba57] 
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(inode_unref+0x36) 
[0x7fb26947f416] 
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x3337a) 
[0x7fb26947f37a] ) 0-GLB1image-replicate-0: Empty inode lru list found 
but with (-2) lru_size
[2023-07-21 18:51:38.261933] W [inode.c:1638:inode_table_prune] 
(-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(fd_unref+0x1ef) 
[0x7fb269495eaf] 
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(inode_unref+0x36) 
[0x7fb26947f416] 
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x3337a) 
[0x7fb26947f37a] ) 0-GLB1image-client-1: Empty inode lru list found but 
with (-2) lru_size
[2023-07-21 18:51:38.262684] W [inode.c:1638:inode_table_prune] 
(-->/usr/lib/x86_64-linux-gnu/glusterfs/6.10/xlator/cluster/replicate.so(+0x5ca57) 
[0x7fb26213ba57] 
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(inode_unref+0x36) 
[0x7fb26947f416] 
-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(+0x3337a) 
[0x7fb26947f37a] ) 0-GLB1image-replicate-0: Empty inode lru list found 
but with (-2) lru_size


-wk





Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] [EXT] [Glusterusers] Using glusterfs for virtual machines with qco

2023-06-01 Thread W Kern
We use qcow2 with libvirt-based KVM on many small clusters and have
found it to be extremely reliable, though maybe not the fastest; part of
that is that most of our storage is SATA SSDs in a software RAID1
config for each brick.


What problems are you running into?

You just mention 'problems'.

-wk

On 6/1/23 8:42 AM, Christian Schoepplein wrote:

Hi,

we'd like to use glusterfs for Proxmox and virtual machines with qcow2
disk images. We have a three-node glusterfs setup with one volume, and
Proxmox is attached and VMs are created, but after some time, and I think
after much I/O is going on for a VM, the data inside the virtual machine
gets corrupted. When I copy files from or to our glusterfs
directly everything is OK; I've checked the files with md5sum. So in general
our glusterfs setup seems to be OK, I think..., but with the VMs and the
self-growing qcow2 images there are problems. If I use raw images for the VMs,
tests look better, but I need to do more testing to be sure; the problem is
a bit hard to reproduce :-(.

I've also asked on a Proxmox mailing list, but got no helpful response so
far :-(. So maybe you have a hint about what might be wrong with our
setup, or what needs to be configured to use glusterfs as a storage backend
for virtual machines with self-growing disk images. Any helpful tip would
be great, because I am absolutely no glusterfs expert and also not an expert
in virtualization and what has to be done to let all the components play well
together... Thanks for your support!

Here is some info about our glusterfs setup; please let me know if you need
more. We are using Ubuntu 22.04 as the operating system:

root@gluster1:~# gluster --version
glusterfs 10.1
Repository revision: git://git.gluster.org/glusterfs.git
Copyright (c) 2006-2016 Red Hat, Inc. 
GlusterFS comes with ABSOLUTELY NO WARRANTY.
It is licensed to you under your choice of the GNU Lesser
General Public License, version 3 or any later version (LGPLv3
or later), or the GNU General Public License, version 2 (GPLv2),
in all cases as published by the Free Software Foundation.
root@gluster1:~#

root@gluster1:~# gluster v status gfs_vms

Status of volume: gfs_vms
Gluster process TCP Port  RDMA Port  Online  Pid
--
Brick gluster1.linova.de:/glusterfs/sde1enc
/brick  58448 0  Y   1062218
Brick gluster2.linova.de:/glusterfs/sdc1enc
/brick  50254 0  Y   20596
Brick gluster3.linova.de:/glusterfs/sdc1enc
/brick  52840 0  Y   1627513
Brick gluster1.linova.de:/glusterfs/sdf1enc
/brick  49832 0  Y   1062227
Brick gluster2.linova.de:/glusterfs/sdd1enc
/brick  56095 0  Y   20612
Brick gluster3.linova.de:/glusterfs/sdd1enc
/brick  51252 0  Y   1627521
Brick gluster1.linova.de:/glusterfs/sdg1enc
/brick  54991 0  Y   1062230
Brick gluster2.linova.de:/glusterfs/sde1enc
/brick  60812 0  Y   20628
Brick gluster3.linova.de:/glusterfs/sde1enc
/brick  59254 0  Y   1627522
Self-heal Daemon on localhost   N/A   N/AY   1062249
Bitrot Daemon on localhost  N/A   N/AY   3591335
Scrubber Daemon on localhostN/A   N/AY   3591346
Self-heal Daemon on gluster2.linova.de  N/A   N/AY   20645
Bitrot Daemon on gluster2.linova.de N/A   N/AY   987517
Scrubber Daemon on gluster2.linova.de   N/A   N/AY   987588
Self-heal Daemon on gluster3.linova.de  N/A   N/AY   1627568
Bitrot Daemon on gluster3.linova.de N/A   N/AY   1627543
Scrubber Daemon on gluster3.linova.de   N/A   N/AY   1627554
  
Task Status of Volume gfs_vms

--
There are no active volume tasks
  
root@gluster1:~#


root@gluster1:~# gluster v status gfs_vms detail

Status of volume: gfs_vms
--
Brick: Brick gluster1.linova.de:/glusterfs/sde1enc/brick
TCP Port : 58448
RDMA Port: 0
Online   : Y
Pid  : 1062218
File System  : xfs
Device   : /dev/mapper/sde1enc
Mount Options: rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,noquota
Inode Size   : 512
Disk Space Free  : 3.6TB
Total Disk Space : 3.6TB
Inode Count  : 390700096
Free Inodes  

Re: [Gluster-users] thin arbiter vs standard arbiter

2018-08-01 Thread W Kern



On 8/1/18 11:04 AM, Amar Tumballi wrote:
This recently added document talks about some of the technicalities of 
the feature:


https://docs.gluster.org/en/latest/Administrator%20Guide/Thin-Arbiter-Volumes/

Please go through and see if it answers your questions.

-Amar




Well, yes, that does answer some of them. By skipping a lot more of the
arbiter traffic, there may be some noticeable performance benefits,
especially on an older 1G network.

At least until you have to deal with a failure situation.

Though the "would you use it on a VM, either now or when the code is 
more seasoned?" question is still there.


I'm willing to try it out on some non-critical VMs (cloud-native stuff, 
where I always spawn from a golden image), but if it is not ready for 
production, then I don't want to bother with it at the moment.
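
For reference, if I'm reading the linked doc right, creating one of these
is a normal replica-2 create plus a thin-arbiter clause; the host names
and brick paths below are placeholders:

gluster volume create testvol replica 2 thin-arbiter 1 \
    gluster1:/bricks/brick1 gluster2:/bricks/brick1 tiebreaker:/bricks/ta

The third path only stores the replica-id file, which is presumably why
the tie-breaker node can be small and on a slow link.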


-wk



On Wed, Aug 1, 2018 at 11:09 PM, wkmail wrote:


I see mentions of thin arbiter in the 4.x notes and I am intrigued.

As I understand it, the thin arbiter volume is

a) receives its data on an async basis (thus it can be on a slower
link). Thus gluster isn't waiting around to verify if it actually
got the data.

b) is only consulted in situations where Gluster needs that third
vote, otherwise it is not consulted.

c) Performance should therefore be better because Gluster is only
seriously talking to 2 nodes instead of 3 nodes (as in normal
arbiter or rep 3)

Am I correct?

If so, is thin arbiter ready for production or at least use on
non-critical workloads?

How safe is it for VMs images (and/or VMs with sharding)?

How much faster is thin arbiter setup over a normal arbiter given
that the normal data only really sees the metadata?

In a degraded situation (i.e. loss of one real node), would having
a thin arbiter on a slow link be problematic until everything is
healed and returned to normal?

Sincerely,

-wk

___
Gluster-users mailing list
Gluster-users@gluster.org 
https://lists.gluster.org/mailman/listinfo/gluster-users






--
Amar Tumballi (amarts)



___
Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Volume hacked

2017-08-06 Thread W Kern



On 8/6/2017 4:57 PM, lemonni...@ulrar.net wrote:


Gluster already uses a VLAN; the problem is that there is no easy way
that I know of to tell gluster not to listen on an interface, and I
can't not have a public IP on the server. I really wish there was a
simple "listen only on this IP/interface" option for this


What about this?

transport.socket.bind-address

I know there were some BZs on it with earlier Gluster versions, so I assume
it's still there now.
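
If memory serves, it goes into /etc/glusterfs/glusterd.vol on each node
rather than being a volume option, something like the snippet below
(the IP is an example; keep whatever other options are already there),
followed by a glusterd restart. Given the old BZs I'd test it on a lab
box first:

volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option transport.socket.bind-address 10.0.0.11
end-volume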

-bill


___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] How to remove dead peer, osrry urgent again :(

2017-06-11 Thread W Kern



On 6/10/2017 5:54 PM, Lindsay Mathieson wrote:

On 11/06/2017 10:46 AM, WK wrote:
I thought you had removed vna as defective and then ADDED in vnh as 
the replacement?


Why is vna still there? 


Because I *can't* remove it. It died and was unable to be brought up. The
gluster peer detach command only works with live servers - a severe
problem, IMHO.



Wow, yes, that is problematic.

I wonder if replace-brick would have handled that.
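
i.e. something along the lines of the following (volume name and brick
paths are placeholders), which as far as I recall only supports the
'commit force' form these days:

gluster volume replace-brick VOL vna:/gluster/brick vnh:/gluster/brick commit force
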
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


[Gluster-users] Recovering from Arb/Quorum Write Locks

2017-05-28 Thread W Kern

So I have a testbed composed of a simple 2+1 (replica 3 with ARB) setup.

gluster1, gluster2 and gluster-arb (with shards)

My testing involves some libvirt VMs running continuous write fops on a
localhost FUSE mount on gluster1.


Works great when all the pieces are there. Once I figured out the shard 
tuning, I was really happy with the speed, even with the older kit I was 
using for the testbed. Sharding is a huge win.


So for Failure testing I found the following:

If you take down the ARB, the VMs continue to run perfectly and when the 
ARB returns it catches up.


However, if you take down Gluster2 (with the ARB still being up) you 
often (but not always) get a write lock on one or more of the VMs, until 
Gluster2 recovers and heals.


Per the Docs, this Write Lock is evidently EXPECTED behavior with an 
Arbiter to avoid a Split-Brain.


As I understand it, if the Arb thinks that it knows about (and agrees
with) data that exists on Gluster2 (now down) that should be written to
Gluster1, it will write-lock the volume, because the ARB itself doesn't
have that data, and going forward is problematic until Gluster2's data
is back in the cluster and can bring the volume back into proper sync.


OK, that is the reality of using a Rep2 + ARB versus a true Rep3
environment. You get split-brain protection but not much increase in HA
over old-school Replica 2.


So I have some questions:

a) In the event that Gluster2 has died and we have entered this write-lock
phase, how does one go forward if the Gluster2 outage can't be
immediately (or remotely) resolved?


At that point I have some hung VMs and annoyed users.

The current quorum settings are:

# gluster volume get VOL all | grep 'quorum'
cluster.quorum-type                     auto
cluster.quorum-count                    2
cluster.server-quorum-type              server
cluster.server-quorum-ratio             0
cluster.quorum-reads                    no

Do I simply kill the quorum, and the VMs will continue where they
left off?


gluster volume set VOL cluster.server-quorum-type none
gluster volume set VOL cluster.quorum-type none

If I do so, should I also kill the ARB (before or after), or leave it up?

Or should I switch to quorum-type fixed with a quorum count of 1?
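
For completeness, the 'fixed' variant I mean would be something like:

gluster volume set VOL cluster.quorum-type fixed
gluster volume set VOL cluster.quorum-count 1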

b) If I WANT to take down Gluster2 for maintenance, how do I prevent the
quorum write-lock from occurring?


I suppose I could fiddle with the quorum settings as above, but I'd like 
to be able to PAUSE/FLUSH/FSYNC the Volume before taking down Gluster2, 
then unpause and let the volume continue with Gluster1 and the ARB 
providing some sort of protection and to help when Gluster2 is returned 
to the cluster.


c) Does any of the above behaviour change when I switch to GFAPI?

Sincerely

-bill





___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


Re: [Gluster-users] ganesha.nfsd: `NTIRPC_1.4.3' not found

2017-05-21 Thread W Kern

Here is the relevant advisory for RHEL, which just came in:

https://access.redhat.com/errata/RHSA-2017:1263


-bill



On 5/21/2017 10:23 PM, Jiffin Tony Thottan wrote:

Forwarding mail to ganesha list

Adding Kaleb as well who usually build nfs-ganesha packages


On 21/05/17 07:52, W Kern wrote:
I got bit by that during a maintenance session on a production NFS 
server.  I upgraded and got the same message.


libntirpc 1.4.4 is a security upgrade due to a DOS possibility with 
1.4.3 or earlier


but the nfs-ganesha package is still looking for 1.4.3

Unfortunately the maintainers removed the older libntirpc 1.4.3 
package but didn't update the nfs-ganesha deb to accept 1.4.4


I was in a hurry so I ended up digging up an older 1.4.3 Trusty deb 
package (I'm on Xenial) and installed that manually.


That seemed to work fine. NFS-Ganesha sees 1.4.3 and is fine with it.

When the nfs-ganesha package is fixed, I'll put back in the proper
1.4.4 package


-wk


On 5/20/2017 1:33 AM, Bernhard Dübi wrote:

Hi,

is this list also dealing with nfs-ganesha problems?

I just ran a dist-upgrade on my Ubuntu 16.04 machine and now
nfs-ganesha doesn't start anymore

May 20 10:00:15 chastcvtprd03 bash[5720]: /usr/bin/ganesha.nfsd:
/lib/x86_64-linux-gnu/libntirpc.so.1.4: version `NTIRPC_1.4.3' not
found (required by /usr/bin/ganesha.nfsd)

Any hints?


Here some info about my system:

# uname -a
Linux hostname 4.4.0-78-generic #99-Ubuntu SMP Thu Apr 27 15:29:09 UTC
2017 x86_64 x86_64 x86_64 GNU/Linux

# cat /etc/os-release
NAME="Ubuntu"
VERSION="16.04.2 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.2 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial


/etc/apt/sources.list.d# head *.list
==> gluster-ubuntu-glusterfs-3_8-xenial.list <==
deb http://ppa.launchpad.net/gluster/glusterfs-3.8/ubuntu xenial main
# deb-src http://ppa.launchpad.net/gluster/glusterfs-3.8/ubuntu 
xenial main


==> gluster-ubuntu-libntirpc-xenial.list <==
deb http://ppa.launchpad.net/gluster/libntirpc/ubuntu xenial main
# deb-src http://ppa.launchpad.net/gluster/libntirpc/ubuntu xenial main

==> gluster-ubuntu-nfs-ganesha-xenial.list <==
deb http://ppa.launchpad.net/gluster/nfs-ganesha/ubuntu xenial main
# deb-src http://ppa.launchpad.net/gluster/nfs-ganesha/ubuntu xenial 
main



# dpkg -l | grep -E 'gluster|ganesha|libntirpc'
ii  glusterfs-common        3.8.12-ubuntu1~xenial1  amd64  GlusterFS common libraries and translator modules
ii  libntirpc1:amd64        1.4.4-ubuntu1~xenial1   amd64  new transport-independent RPC library
ii  nfs-ganesha             2.4.5-ubuntu1~xenial1   amd64  nfs-ganesha is a NFS server in User Space
ii  nfs-ganesha-fsal:amd64  2.4.5-ubuntu1~xenial1   amd64  nfs-ganesha fsal libraries


Best Regards
Bernhard
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users




___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users




___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] URGENT - Cheat on quorum

2017-05-21 Thread W Kern
So I am experimenting with shards using a couple of VMs and decided to test
his scenario (i.e. only one node available on a simple 2-node + 1-arbiter
replicated/sharded volume, using 3.10.1 on CentOS 7.3).


I set up a VM testbed, verified that everything including the sharding
works, and then shut down nodes 2 and 3 (the arbiter).


As expected I got a quorum error on the mount.

So I tried

gluster volume set VOL cluster.quorum-type none

from the remaining 'working' node1 and it simply responds with

"volume set: failed: Quorum not met. Volume operation not allowed"

how do you FORCE gluster to ignore the quorum in such a situation?

I tried stopping the volume and even rebooting node1 and still get the
error (and of course the volume won't start for the same reason).


-WK


On 5/18/2017 7:41 AM, Ravishankar N wrote:

On 05/18/2017 07:18 PM, lemonni...@ulrar.net wrote:

Hi,


We are having huge hardware issues (oh joy ..) with RAID cards.
On a replica 3 volume, we have 2 nodes down. Can we somehow tell
gluster that it's quorum is 1, to get some amount of service back
while we try to fix the other nodes or install new ones ?
If you know what you are getting into, then `gluster v set  
cluster.quorum-type none` should give you the desired result, i.e. 
allow write access to the volume.

Thanks


___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users




 


___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users


___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] ganesha.nfsd: `NTIRPC_1.4.3' not found

2017-05-20 Thread W Kern
I got bit by that during a maintenance session on a production NFS 
server.  I upgraded and got the same message.


libntirpc 1.4.4 is a security upgrade due to a DOS possibility with 
1.4.3 or earlier


but the nfs-ganesha package is still looking for 1.4.3

Unfortunately the maintainers removed the older libntirpc 1.4.3 package 
but didn't update the nfs-ganesha deb to accept 1.4.4


I was in a hurry so I ended up digging up an older 1.4.3 Trusty deb 
package (I'm on Xenial) and installed that manually.


That seemed to work fine. NFS-Ganesha sees 1.4.3 and is fine with it.
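
For anyone hitting the same thing, the manual install is just a dpkg -i
of the old deb, plus (probably wise) a hold so apt doesn't pull 1.4.4
back in; the exact .deb file name below is a guess, yours will differ:

dpkg -i libntirpc1_1.4.3-0ubuntu1_amd64.deb
apt-mark hold libntirpc1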

When the nfs-ganesha package is fixed, I'll put back in the proper 1.4.4
package


-wk


On 5/20/2017 1:33 AM, Bernhard Dübi wrote:

Hi,

is this list also dealing with nfs-ganesha problems?

I just ran a dist-upgrade on my Ubuntu 16.04 machine and now
nfs-ganesha doesn't start anymore

May 20 10:00:15 chastcvtprd03 bash[5720]: /usr/bin/ganesha.nfsd:
/lib/x86_64-linux-gnu/libntirpc.so.1.4: version `NTIRPC_1.4.3' not
found (required by /usr/bin/ganesha.nfsd)

Any hints?


Here some info about my system:

# uname -a
Linux hostname 4.4.0-78-generic #99-Ubuntu SMP Thu Apr 27 15:29:09 UTC
2017 x86_64 x86_64 x86_64 GNU/Linux

# cat /etc/os-release
NAME="Ubuntu"
VERSION="16.04.2 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.2 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial


/etc/apt/sources.list.d# head *.list
==> gluster-ubuntu-glusterfs-3_8-xenial.list <==
deb http://ppa.launchpad.net/gluster/glusterfs-3.8/ubuntu xenial main
# deb-src http://ppa.launchpad.net/gluster/glusterfs-3.8/ubuntu xenial main

==> gluster-ubuntu-libntirpc-xenial.list <==
deb http://ppa.launchpad.net/gluster/libntirpc/ubuntu xenial main
# deb-src http://ppa.launchpad.net/gluster/libntirpc/ubuntu xenial main

==> gluster-ubuntu-nfs-ganesha-xenial.list <==
deb http://ppa.launchpad.net/gluster/nfs-ganesha/ubuntu xenial main
# deb-src http://ppa.launchpad.net/gluster/nfs-ganesha/ubuntu xenial main


# dpkg -l | grep -E 'gluster|ganesha|libntirpc'
ii  glusterfs-common        3.8.12-ubuntu1~xenial1  amd64  GlusterFS common libraries and translator modules
ii  libntirpc1:amd64        1.4.4-ubuntu1~xenial1   amd64  new transport-independent RPC library
ii  nfs-ganesha             2.4.5-ubuntu1~xenial1   amd64  nfs-ganesha is a NFS server in User Space
ii  nfs-ganesha-fsal:amd64  2.4.5-ubuntu1~xenial1   amd64  nfs-ganesha fsal libraries


Best Regards
Bernhard
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users




___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users