Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-08 Thread WK
I've always wondered what the scenarios for these situations are (aside from the doc description of nodes coming up and down). Aren't Gluster writes atomic for all nodes? I seem to recall Jeff Darcy stating that years ago. So a clean shutdown for maintenance shouldn't be a problem at all. If

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-08 Thread WK
Pavel. Is there a difference between the native client (fuse) and libgfapi in regards to the crashing/read-only behaviour? We use Rep2 + Arb and can shut down a node cleanly, without issue on our VMs. We do it all the time for upgrades and maintenance. However we are still on the native client as
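For readers unfamiliar with the "Rep2 + Arb" layout mentioned above, a minimal sketch of such a volume follows; host and brick paths are hypothetical, not taken from the thread:

    # Replica 2 data bricks plus an arbiter brick (metadata only, provides quorum):
    gluster volume create vmvol replica 3 arbiter 1 \
        node1:/bricks/vmvol node2:/bricks/vmvol node3:/bricks/vmvol-arbiter
    gluster volume start vmvol

    # Before cleanly shutting a node down for maintenance, check for pending heals:
    gluster volume heal vmvol info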

Re: [Gluster-users] Slow performance of gluster volume

2017-09-08 Thread Abi Askushi
The following changes resolved the perf issue: added to /etc/glusterfs/glusterd.vol the line "option rpc-auth-allow-insecure on" and restarted glusterd, then set the volume option: gluster volume set vms server.allow-insecure on. I am now reaching the max network bandwidth and the performance of VMs is quite
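Condensed, the fix described above amounts to the following (volume name "vms" is taken from the post; apply the glusterd.vol change on every server):

    # /etc/glusterfs/glusterd.vol -- add inside the "volume management" block:
    #     option rpc-auth-allow-insecure on
    systemctl restart glusterd

    # then allow insecure client ports on the volume itself:
    gluster volume set vms server.allow-insecure on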

Re: [Gluster-users] Can I use 3.7.11 server with 3.10.5 client?

2017-09-08 Thread Shyam Ranganathan
On 09/08/2017 01:32 PM, Serkan Çoban wrote: Any suggestions? On Thu, Sep 7, 2017 at 4:35 PM, Serkan Çoban wrote: Hi, Is it safe to use 3.10.5 client with 3.7.11 server with read-only data move operation? The normal upgrade, and hence tested, procedure is older

Re: [Gluster-users] Can I use 3.7.11 server with 3.10.5 client?

2017-09-08 Thread Serkan Çoban
Any suggestions? On Thu, Sep 7, 2017 at 4:35 PM, Serkan Çoban wrote: > Hi, > > Is it safe to use 3.10.5 client with 3.7.11 server with read-only data > move operation? > Client will have 3.10.5 glusterfs-client packages. It will mount one > volume from 3.7.11 cluster and
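A read-only fuse mount from the newer client would look roughly like this (server and volume names are placeholders); whether the 3.10.5-client-on-3.7.11-server combination is supported at all is the open question in this thread:

    # On the 3.10.5 client host, mount the 3.7.11 volume read-only:
    mount -t glusterfs -o ro server1.example.com:/srcvol /mnt/srcvol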

Re: [Gluster-users] Redis db permission issue while running GitLab in Kubernetes with Gluster

2017-09-08 Thread Gaurav Chhabra
You were right, John. After you mentioned the file names, I checked the listing again and yes, the uid 1000 does belong to the 'git' user present on the GitLab container. Actually the long listing I mentioned in my first mail had all contents mapped from GitLab, Redis and PostgreSQL in one

Re: [Gluster-users] Redis db permission issue while running GitLab in Kubernetes with Gluster

2017-09-08 Thread John Strunk
Getting this answer back on the list in case anyone else is trying to share storage. Thanks for the docs pointer, Tanner. -John On Thu, Sep 7, 2017 at 6:50 PM, Tanner Bruce wrote: > You can set a security context on your pod to set the guid as needed: >
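The suggestion above maps to the pod-level securityContext fields; a minimal hypothetical sketch follows (image, claim name and the 1000/1000 ids are assumptions based on the uid mentioned earlier in the thread, not values from the docs pointer):

    kubectl apply -f - <<'EOF'
    apiVersion: v1
    kind: Pod
    metadata:
      name: gitlab-example
    spec:
      securityContext:
        runAsUser: 1000   # uid of the 'git' user inside the GitLab container
        fsGroup: 1000     # group ownership applied to mounted volumes
      containers:
      - name: gitlab
        image: gitlab/gitlab-ce:latest        # hypothetical image tag
        volumeMounts:
        - name: data
          mountPath: /var/opt/gitlab
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: gitlab-data              # hypothetical Gluster-backed PVC
    EOF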

Re: [Gluster-users] Announcing GlusterFS release 3.12.0 (Long Term Maintenance)

2017-09-08 Thread Niels de Vos
On Wed, Sep 06, 2017 at 05:45:05PM -0400, Shyam Ranganathan wrote: > On 09/05/2017 02:07 PM, Serkan Çoban wrote: > > For rpm packages you can use [1], just installed without any problems. > > It is taking time for packages to land in the CentOS Storage SIG repo... > > Thank you for reporting this. The

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-08 Thread Pavel Szalbot
Well, I really do not like its non-deterministic behaviour. However, a server crash has never occurred in my production environment - only upgrades and reboots ;-) -ps On Fri, Sep 8, 2017 at 2:13 PM, Gandalf Corvotempesta wrote: > 2017-09-08 14:11

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-08 Thread Pavel Szalbot
Btw after a few more seconds in the SIGTERM scenario, the VM kind of revived and seems to be fine... And after a few more restarts of the fio job, I got an I/O error. -ps On Fri, Sep 8, 2017 at 2:11 PM, Pavel Szalbot wrote: > Gandalf, SIGKILL (killall -9 glusterfsd) did not stop I/O after

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-08 Thread Gandalf Corvotempesta
2017-09-08 14:11 GMT+02:00 Pavel Szalbot : > Gandalf, SIGKILL (killall -9 glusterfsd) did not stop I/O after a few > minutes. SIGTERM on the other hand causes a crash, but this time it is > not a read-only remount, but around 10 IOPS tops and 2 IOPS on average. > -ps So, seems

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-08 Thread Pavel Szalbot
Gandalf, SIGKILL (killall -9 glusterfsd) did not stop I/O after a few minutes. SIGTERM on the other hand causes a crash, but this time it is not a read-only remount, but around 10 IOPS tops and 2 IOPS on average. -ps On Fri, Sep 8, 2017 at 1:56 PM, Diego Remolina wrote: > I
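For clarity, the two brick-kill scenarios being compared in this thread are, run on one Gluster node of a test cluster:

    # SIGKILL: brick processes die instantly, with no chance to notify clients
    killall -9 glusterfsd

    # SIGTERM (killall default): brick processes get a graceful shutdown path
    killall glusterfsd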

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-08 Thread Diego Remolina
I currently only have a Windows 2012 R2 server VM in testing on top of the gluster storage, so I will have to take some time to provision a couple Linux VMs with both ext4 and XFS to see what happens on those. The Windows server VM is OK with killall glusterfsd, but when the 42 second timeout

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-08 Thread Pavel Szalbot
I added a firewall rule to block all traffic from the Gluster VLAN on one of the nodes. Approximately 3 minutes in and no crash so far. Errors about the missing node in the qemu instance log are present, but this is normal. -ps On Fri, Sep 8, 2017 at 1:53 PM, Gandalf Corvotempesta
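A rule of that kind might look like the following; the subnet is a placeholder for the Gluster VLAN and iptables is an assumption, since the post does not say which firewall tool was used:

    # Drop all traffic to/from the Gluster VLAN subnet on this node:
    iptables -I INPUT -s 10.0.1.0/24 -j DROP
    iptables -I OUTPUT -d 10.0.1.0/24 -j DROP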

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-08 Thread Gandalf Corvotempesta
2017-09-08 13:44 GMT+02:00 Pavel Szalbot : > I did not test SIGKILL because I suppose if a graceful exit is bad, SIGKILL > will be as well. This assumption might be wrong. So I will test it. It would > be interesting to see the client work in case of a crash (SIGKILL) and not

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-08 Thread Pavel Szalbot
On Sep 8, 2017 13:36, "Gandalf Corvotempesta" < gandalf.corvotempe...@gmail.com> wrote: 2017-09-08 13:21 GMT+02:00 Pavel Szalbot : > Gandalf, isn't possible server hard-crash too much? I mean if reboot > reliably kills the VM, there is no doubt network crash or poweroff >

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-08 Thread Gandalf Corvotempesta
2017-09-08 13:21 GMT+02:00 Pavel Szalbot : > Gandalf, isn't possible server hard-crash too much? I mean if reboot > reliably kills the VM, there is no doubt network crash or poweroff > will as well. IIUP, the only way to keep I/O running is to gracefully exiting

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-08 Thread Pavel Szalbot
So even the killall scenario eventually kills the VM (I/O errors). Gandalf, isn't possible server hard-crash too much? I mean if reboot reliably kills the VM, there is no doubt network crash or poweroff will as well. I am tempted to test this setup on DigitalOcean to eliminate the possibility of my

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-08 Thread Gandalf Corvotempesta
2017-09-08 13:07 GMT+02:00 Pavel Szalbot : > OK, so killall seems to be OK after several attempts, i.e. IOPS do not stop > on the VM. Reboot caused I/O errors maybe 20 seconds after issuing the > command. I will check the server's console during reboot to see if the VM >

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-08 Thread Pavel Szalbot
OK, so killall seems to be OK after several attempts, i.e. IOPS do not stop on the VM. Reboot caused I/O errors maybe 20 seconds after issuing the command. I will check the server's console during reboot to see if the VM errors appear just after the power cycle and will try to crash the VM after

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-08 Thread Diego Remolina
I would prefer the behavior were different from what it is, i.e. I/O stopping. The argument I heard for the long 42 second timeout was that MTBF on a server was high, and that the client reconnection operation was *costly*. Those were arguments to *not* change the ping timeout value down from 42

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-08 Thread Pavel Szalbot
Btw now I am experiencing "Transport endpoint disconnects" because of the 1s ping-timeout even though the nodes are up. This sucks. The network is not overloaded at all, the switches are used only by the gluster network, and the network consists only of three gluster nodes, one VM hypervisor and a Cinder controller

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-08 Thread Pavel Szalbot
On Fri, Sep 8, 2017 at 12:48 PM, Gandalf Corvotempesta wrote: > I think this should be considered a bug > If you have a server crash, the glusterfsd process obviously doesn't exit > properly and thus this could lead to an I/O stop? I agree with you completely on this.

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-08 Thread Pavel Szalbot
On Fri, Sep 8, 2017 at 12:43 PM, Diego Remolina wrote: > This is exactly the problem, > > Systemctl stop glusterd does *not* kill the brick processes. Yes, I know. > On CentOS with gluster 3.10.x there is also a service, meant to only stop > glusterfsd (brick processes). I
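In command form, the distinction being discussed is roughly the following; the separate brick-stopping unit is the CentOS packaging detail mentioned above, and its presence and exact name should be verified on your install:

    # Stops only the management daemon; glusterfsd brick processes keep serving I/O:
    systemctl stop glusterd

    # On the CentOS gluster 3.10.x packages described above, this unit stops the bricks:
    systemctl stop glusterfsd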

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-08 Thread Pavel Szalbot
On Fri, Sep 8, 2017 at 12:38 PM, Diego Remolina wrote: > If your VMs use ext4 also check this: > > https://joejulian.name/blog/keeping-your-vms-from-going-read-only-when-encountering-a-ping-timeout-in-glusterfs/ I know about this post, but as I pointed out - ping-timeout does

[Gluster-users] pausing scrub crashed scrub daemon on nodes

2017-09-08 Thread Amudhan P
Hi, I am using glusterfs 3.10.1 with 30 nodes each with 36 bricks and 10 nodes each with 16 bricks in a single cluster. By default I have paused the scrub process to have it run manually. For the first time, I was trying to run scrub-on-demand and it was running fine, but after some time, I decided
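For reference, the pause/on-demand scrub operations mentioned are driven through the bitrot command set, roughly as follows (the volume name is a placeholder):

    gluster volume bitrot <volname> scrub pause      # park the scrubber
    gluster volume bitrot <volname> scrub ondemand   # kick off a manual scrub run
    gluster volume bitrot <volname> scrub resume     # let scheduled scrubbing continue
    gluster volume bitrot <volname> scrub status     # check progress and errors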

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-08 Thread Pavel Szalbot
Hi Diego, indeed glusterfsd processes are running and that is the reason I do a server reboot instead of systemctl stop glusterd. Is killall different from reboot in the way glusterfsd processes are terminated on CentOS (init 1?)? However I will try this and let you know. -ps On Fri, Sep 8, 2017 at

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-08 Thread Pavel Szalbot
This is the qemu log of the instance: [2017-09-08 09:31:48.381077] C [rpc-clnt-ping.c:160:rpc_clnt_ping_timer_expired] 0-gv_openstack_1-client-1: server 10.0.1.202:49152 has not responded in the last 1 seconds, disconnecting. [2017-09-08 09:31:48.382411] E [rpc-clnt.c:365:saved_frames_unwind] (-->

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-08 Thread Pavel Szalbot
On Fri, Sep 8, 2017 at 11:42 AM, wrote: > Oh, you really don't want to go below 30s, I was told. > I'm using 30 seconds for the timeout, and indeed when a node goes down > the VMs freeze for 30 seconds, but I've never seen them go read-only for > that. > > I _only_ use virtio

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-08 Thread lemonnierk
Oh, you really don't want to go below 30s, I was told. I'm using 30 seconds for the timeout, and indeed when a node goes down the VMs freeze for 30 seconds, but I've never seen them go read-only for that. I _only_ use virtio though, maybe it's that. What are you using? On Fri, Sep 08, 2017 at
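The timeout being discussed is the volume option network.ping-timeout; setting it to the 30 s recommended above looks like this (the volume name is a placeholder):

    gluster volume set <volname> network.ping-timeout 30
    gluster volume get <volname> network.ping-timeout   # verify (default is 42)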

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-08 Thread Pavel Szalbot
Back to replica 3 w/o arbiter. Two fio jobs running (direct=1 and direct=0), rebooting one node... and the VM dmesg looks like: [ 483.862664] blk_update_request: I/O error, dev vda, sector 23125016 [ 483.898034] blk_update_request: I/O error, dev vda, sector 2161832 [ 483.901103]
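The two fio jobs described (one O_DIRECT, one buffered) can be reproduced inside the guest with something along these lines; file names, size, block size and runtime are assumptions, only direct=1/direct=0 comes from the post:

    fio --name=direct --filename=/root/fio.direct --size=1G \
        --rw=randwrite --bs=4k --direct=1 --time_based --runtime=300
    fio --name=buffered --filename=/root/fio.buffered --size=1G \
        --rw=randwrite --bs=4k --direct=0 --time_based --runtime=300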

Re: [Gluster-users] ovirt 4.1 hosted engine hyper converged on glusterfs 3.8.10 : "engine" storage domain alway complain about "unsynced" elements

2017-09-08 Thread yayo (j)
2017-07-19 11:22 GMT+02:00 yayo (j) : > running "gluster volume heal engine" doesn't solve the problem... > > Some extra info: > > We have recently changed the gluster setup from 2 (fully replicated) + 1 > arbiter to a 3-node fully replicated cluster, but I don't know if this is the problem...

Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-08 Thread Pavel Szalbot
FYI I set up replica 3 (no arbiter this time), did the same thing - rebooted one node during lots of file I/O on the VM and the I/O stopped. As I mentioned either here or in another thread, this behavior is caused by the high default of network.ping-timeout. My main problem used to be that setting it too low