Re: [Gluster-users] VMs blocked for more than 120 seconds

2019-05-13 Thread lemonnierk
On Mon, May 13, 2019 at 08:47:45AM +0200, Martin Toth wrote:
> Hi all,

Hi

> 
> I am running replica 3 on SSDs with 10G networking, everything works OK but 
> VMs stored in Gluster volume occasionally freeze with “Task XY blocked for 
> more than 120 seconds”.
> The only solution is to power off (hard) the VM and then boot it up again. I am unable 
> to SSH and also to log in with the console; it’s probably stuck on some disk 
> operation. No error/warning logs or messages are stored in the VM’s logs.
> 

As far as I know this should be unrelated; I get this during heals
without any freezes. I think it just means the storage is slow.

> KVM/Libvirt (qemu) using libgfapi and a fuse mount to access VM disks on the replica 
> volume. Can someone advise how to debug this problem or what can cause these 
> issues? 
> It’s really annoying, I’ve tried to google everything but nothing came up. 
> I’ve tried changing the virtio-scsi-pci disk driver to virtio-blk-pci, but it’s 
> not related.
> 

Any chance your gluster goes read-only? Have you checked your gluster
logs to see if maybe the nodes lose each other sometimes?
/var/log/glusterfs

For libgfapi access you'd have its log on qemu's standard output,
which might contain the actual error at the time of the freeze.
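
If it helps, here's roughly what I'd look at first (just a sketch; "myvol"
is a placeholder for your volume name):

  # look for errors, disconnects or read-only messages around the freeze time
  grep -iE "\] E \[| disconnect|read-only" /var/log/glusterfs/*.log

  # check that all bricks and the self-heal daemon are up, and whether heals are pending
  gluster volume status myvol
  gluster volume heal myvol info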

Re: [Gluster-users] Settings for VM hosting

2019-04-19 Thread lemonnierk
On Fri, Apr 19, 2019 at 06:47:49AM +0530, Krutika Dhananjay wrote:
> Looks good mostly.
> You can also turn on performance.stat-prefetch, and also set

Ah the corruption bug has been fixed, I missed that. Great !

> client.event-threads and server.event-threads to 4.

I didn't realize that would also apply to libgfapi ?
Good to know, thanks.

> And if your bricks are on ssds, then you could also enable
> performance.client-io-threads.

I'm surprised by that, the doc says "This feature is not recommended for
distributed, replicated or distributed-replicated volumes."
Since this volume is just a replica 3, shouldn't this stay off ?
The disks are all nvme, which I assume would count as ssd.

> And if your bricks and hypervisors are on same set of machines
> (hyperconverged),
> then you can turn off cluster.choose-local and see if it helps read
> performance.

Thanks, we'll give those a try !
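
For reference, here's roughly how I plan to apply those (just a sketch, using
our volume name "glusterfs"; adjust to taste):

  gluster volume set glusterfs performance.stat-prefetch on
  gluster volume set glusterfs client.event-threads 4
  gluster volume set glusterfs server.event-threads 4
  # assuming it's indeed fine on a replica 3 with ssd/nvme bricks
  gluster volume set glusterfs performance.client-io-threads on
  # only relevant if hyperconverged (bricks and hypervisors on the same machines)
  gluster volume set glusterfs cluster.choose-local off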


Re: [Gluster-users] Settings for VM hosting

2019-04-18 Thread lemonnierk
On Thu, Apr 18, 2019 at 03:13:25PM +0200, Martin Toth wrote:
> Hi,
> 
> I am curious about your setup and settings also. I have exactly same setup 
> and use case.
> 
> - why do you use sharding on replica 3? Do you have various sizes of 
> bricks (disks) per node?
>

Back in the 3.7 era there was a bug locking files during heal, so
without sharding the whole disk image was locked for ~30 minutes (depending on
the disk's size of course), which was bringing the whole service down
during heals.

We started using shards back then because only the shard being healed
gets locked instead of the whole file. I believe the bug has been fixed
since, but we've kept sharding just in case.

As a bonus, combined with the full heal algorithm (just re-transmit the shard
instead of trying to figure out what's changed) heal times are much, much
faster with very little CPU usage, so really there's no reason not to, imho;
sharding is great.
It might be different if you have big dedicated servers for gluster and
nothing else to do with your CPU, I don't know, but for us sharding is a
big gain during heals, which unfortunately are very common on OVH's shaky
vRacks :(
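
For reference, on our volumes that boils down to (the same options you can see
in any of our volume infos, nothing exotic):

  # enable sharding before putting any VM images on the volume,
  # toggling it on a volume with existing data is asking for trouble
  gluster volume set glusterfs features.shard on
  gluster volume set glusterfs features.shard-block-size 64MB

  # re-transmit whole shards during heal instead of computing diffs
  gluster volume set glusterfs cluster.data-self-heal-algorithm full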


[Gluster-users] Settings for VM hosting

2019-04-18 Thread lemonnierk
Hi,

We've been using the same settings, found in an old email here, since
v3.7 of gluster for our VM hosting volumes. They've been working fine
but since we've just installed a v6 for testing I figured there might
be new settings I should be aware of.

So for access through libgfapi (qemu), for VM hard drives, are these options
still optimal and recommended?

Volume Name: glusterfs
Type: Replicate
Volume ID: b28347ff-2c27-44e0-bc7d-c1c017df7cd1
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: ips1adm.X:/mnt/glusterfs/brick
Brick2: ips2adm.X:/mnt/glusterfs/brick
Brick3: ips3adm.X:/mnt/glusterfs/brick
Options Reconfigured:
performance.readdir-ahead: on
cluster.quorum-type: auto
cluster.server-quorum-type: server
network.remote-dio: enable
cluster.eager-lock: enable
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
features.shard: on
features.shard-block-size: 64MB
cluster.data-self-heal-algorithm: full
network.ping-timeout: 30
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off

Thanks !
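
In case it's useful to someone doing the same comparison: on the v6 test
volume I was planning to just dump everything and diff it against the old
cluster, something like this (assuming "volume get ... all" is available on
your builds):

  gluster volume get glusterfs all > v6-options.txt
  # same command on the old cluster, then diff the two files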


Re: [Gluster-users] Announcing Glusterfs release 3.12.13 (Long Term Maintenance)

2018-08-27 Thread lemonnierk
Hi,

Seems like you linked the 3.12.12 changelog instead of the 3.12.13 one.
Does it fix the memory leak problem ?

Thanks

On Mon, Aug 27, 2018 at 11:10:21AM +0530, Jiffin Tony Thottan wrote:
> The Gluster community is pleased to announce the release of Gluster 
> 3.12.13 (packages available at [1,2,3]).
> 
> Release notes for the release can be found at [4].
> 
> Thanks,
> Gluster community
> 
> 
> [1] https://download.gluster.org/pub/gluster/glusterfs/3.12/3.12.13/
> [2] https://launchpad.net/~gluster/+archive/ubuntu/glusterfs-3.12
> [3] https://build.opensuse.org/project/subprojects/home:glusterfs
> [4] Release notes: 
> https://gluster.readthedocs.io/en/latest/release-notes/3.12.12/
> 

> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users


-- 
PGP Fingerprint : 0x624E42C734DAC346


[Gluster-users] Disconnected peers after reboot

2018-08-20 Thread lemonnierk
Hi,

To add to the problematic memory leak, I've been seeing another strange
behavior on the 3.12 servers. When I reboot a node, it seems like often
(but not always) the other nodes mark it as disconnected and won't
accept it back until I restart them.

Sometimes I need to restart the glusterd on other nodes, sometimes on
the node I rebooted too, but not always.
I'm also seeing that after a network outage of course, I have bricks
staying down because quorum isn't met on some nodes until I restart
their glusterd.

3.7 didn't have that problem at all, so it must be a new bug. It's very
problematic because we end up with VMs locked, or doing I/O errors after
simple node reboots, making upgrades impossible to perform without the
clients noticing everything went down. Sometimes we don't even notice right
away that a VM is getting I/O errors; it takes a while for that to show on some of them.
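
For what it's worth, the dance after every reboot ends up being something
like this (a sketch, assuming systemd-based nodes):

  # on any node, check whether the rebooted peer is seen again
  gluster peer status
  gluster volume status

  # on whichever node still shows the peer as Disconnected (sometimes the
  # rebooted one itself), restart the management daemon
  systemctl restart glusterd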

-- 
PGP Fingerprint : 0x624E42C734DAC346


Re: [Gluster-users] Gluster release 3.12.13 (Long Term Maintenance) Canceled for 10th of August, 2018

2018-08-14 Thread lemonnierk
Hi,

That's actually pretty bad, we've all been waiting for the memory leak
patch for a while now, an extra month is a bit of a nightmare for us.

Is there no way to get 3.12.12 with that patch sooner, at least? I'm
getting a bit tired of rebooting virtual machines by hand every day to
avoid the OOM killer.

On Tue, Aug 14, 2018 at 04:12:28PM +0530, Jiffin Tony Thottan wrote:
> Hi,
> 
> Currently master branch is lock for fixing failures in the regression 
> test suite [1].
> 
> As a result we are not releasing the next minor update for the 3.12 branch,
> 
> which falls on the 10th of every month.
> 
> The next 3.12 update would be around the 10th of September, 2018.
> 
> Apologies for the delay to inform above details.
> 
> [1] 
> https://lists.gluster.org/pipermail/gluster-devel/2018-August/055160.html
> 
> Regards,
> 
> Jiffin
> 
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users

-- 
PGP Fingerprint : 0x624E42C734DAC346


Re: [Gluster-users] gluster 3.12 memory leak

2018-08-07 Thread lemonnierk
Hi,

Any chance that was what's leaking for the libgfapi users too ?
I assume the next release you mention will be 3.12.13, is that correct ?

On Tue, Aug 07, 2018 at 11:33:58AM +0530, Hari Gowtham wrote:
> Hi,
> 
> The reason for memory leak was found. The patch (
> https://review.gluster.org/#/c/20437/ ) will fix the leak.
> Should be made available with the next release. You can keep an eye on it.
> For more info refer the above mentioned bug.
> 
> Regards,
> Hari.
> On Fri, Aug 3, 2018 at 7:36 PM Alex K  wrote:
> >
> > Thank you Hari.
> > Hope we get a fix soon to put us out of our misery :)
> >
> > Alex
> >
> > On Fri, Aug 3, 2018 at 4:58 PM, Hari Gowtham  wrote:
> >>
> >> Hi,
> >>
> >> It is a known issue.
> >> This bug will give more insight on the memory leak.
> >> https://bugzilla.redhat.com/show_bug.cgi?id=1593826
> >> On Fri, Aug 3, 2018 at 6:15 PM Alex K  wrote:
> >> >
> >> > Hi,
> >> >
> >> > I was using 3.8.12-1 up to 3.8.15-2. I did not have issue with these 
> >> > versions.
> >> > I still have systems running with those with no such memory leaks.
> >> >
> >> > Thanx,
> >> > Alex
> >> >
> >> >
> >> > On Fri, Aug 3, 2018 at 3:13 PM, Nithya Balachandran 
> >> >  wrote:
> >> >>
> >> >> Hi,
> >> >>
> >> >> What version of gluster were you using before you  upgraded?
> >> >>
> >> >> Regards,
> >> >> Nithya
> >> >>
> >> >> On 3 August 2018 at 16:56, Alex K  wrote:
> >> >>>
> >> >>> Hi all,
> >> >>>
> >> >>> I am using gluster 3.12.9-1 on ovirt 4.1.9 and I have observed 
> >> >>> consistent high memory use which at some point renders the hosts 
> >> >>> unresponsive. This behavior is observed also while using 3.12.11-1 
> >> >>> with ovirt 4.2.5. I did not have this issue prior to upgrading gluster.
> >> >>>
> >> >>> I have seen a relevant bug reporting memory leaks of gluster and it 
> >> >>> seems that this is the case for my trouble. To temporarily resolve the 
> >> >>> high memory issue, I put hosts in maintenance then activate them back 
> >> >>> again. This indicates that the memory leak is caused from the gluster 
> >> >>> client. Ovirt is using fuse mounts.
> >> >>>
> >> >>> Is there any bug fix available for this?
> >> >>> This issue is hitting us hard with several production installations.
> >> >>>
> >> >>> Thanx,
> >> >>> Alex
> >> >>>
> >> >>> ___
> >> >>> Gluster-users mailing list
> >> >>> Gluster-users@gluster.org
> >> >>> https://lists.gluster.org/mailman/listinfo/gluster-users
> >> >>
> >> >>
> >> >
> >> > ___
> >> > Gluster-users mailing list
> >> > Gluster-users@gluster.org
> >> > https://lists.gluster.org/mailman/listinfo/gluster-users
> >>
> >>
> >>
> >> --
> >> Regards,
> >> Hari Gowtham.
> >
> >
> 
> 
> -- 
> Regards,
> Hari Gowtham.
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> https://lists.gluster.org/mailman/listinfo/gluster-users

-- 
PGP Fingerprint : 0x624E42C734DAC346


[Gluster-users] Memory leak with the libgfapi in 3.12 ?

2018-08-01 Thread lemonnierk
Hey,

Is there by any chance a known bug about a memory leak in libgfapi
in the latest 3.12 releases?
I've migrated a lot of virtual machines from an old proxmox cluster to a
new one, with a newer gluster (3.12.10) and ever since the virtual
machines have been eating more and more RAM all the time, without ever
stopping. I have 8 GB machines occupying 40 GB of RAM, which they
weren't doing on the old cluster.

It could be a proxmox problem, maybe a leak in their qemu, but since
no one seems to be reporting that problem I wonder if maybe the newer
gluster might have a leak, I believe libgfapi isn't used much.
I tried looking at the bug tracker but I don't see anything obvious, the
only leak I found seems to be for distributed volumes, but we only use
replica mode.

Is anyone aware of a way to know if libgfapi is responsible or not?
Does it have any kind of reporting I could enable? Worst case I could
always boot a VM through the fuse mount instead of libgfapi, but that's
not ideal, it'd take a while to confirm.
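
For now the best I've come up with is just graphing the resident memory of
the qemu processes over time, something like this from cron (assuming the
usual qemu-system-x86_64 process name):

  date >> /var/log/qemu-rss.log
  ps -C qemu-system-x86_64 -o pid=,rss=,etime= >> /var/log/qemu-rss.log

If the RSS only ever grows, something is leaking, it just doesn't say
whether it's qemu itself or libgfapi inside it.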

-- 
PGP Fingerprint : 0x624E42C734DAC346



Re: [Gluster-users] adding third brick to replica volume and suddenly files need healing

2018-06-11 Thread lemonnierk
Hi,

That's normal, healing is how gluster syncs the files to the new brick.
And yes, the heal shows up on the source bricks, not on the destination, which is
a bit weird but that's just how it is :)
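
If you want to keep an eye on the progress, the heal counters are enough
(a sketch; the volume name is just an example):

  # entries still pending heal, per brick
  gluster volume heal myvol statistics heal-count

  # or the full list, which will be long with 100K+ entries
  gluster volume heal myvol info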

On Mon, Jun 11, 2018 at 10:25:08AM +0100, lejeczek wrote:
> hi guys
> 
> I had a two-replica volume, added a third brick and now I see a hundred 
> thousand files to heal, interestingly though only on the two bricks that 
> already constituted the volume.
> The volume prior to expansion was, according to gluster, okay, and when I 
> added the third brick it immediately started increasing the count of files in 
> need of healing.
> 
> Is that normal behavior? (and if not, how do I troubleshoot it? $ heal 
> full does not seem to do anything, unless it just takes time for over 100K 
> files on fairly slow storage?)
> 
> I do heal info and I see a lot of:
> (a long list of entries, stripped by the mail archive)
> ...
> 
> many thanks, L
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users

-- 
PGP Fingerprint : 0x624E42C734DAC346


[Gluster-users] Current bug for VM hosting with 3.12 ?

2018-06-11 Thread lemonnierk
Hi,

Given the numerous problems we've had with setting up gluster for VM
hosting at the start, we've been staying with 3.7.15, which was the
first version to work properly.

However the repo for 3.7.15 is now down, so we've decided to give
3.12.9 a try. Unfortunately, a few days ago, one of our nodes rebooted
and after a quick heal one of the VMs wasn't in a great state. Didn't
think much of it, but right now I'm seeing other VMs throwing I/O errors,
just like with the versions of gluster < 3.7.15, which were causing
corruption of disk images.

Are there any known bugs with 3.12.9? Any new settings we should have
enabled but might have missed?

Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
network.ping-timeout: 30
cluster.data-self-heal-algorithm: full
features.shard-block-size: 64MB
features.shard: on
performance.stat-prefetch: off
performance.read-ahead: off
performance.quick-read: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.server-quorum-type: server
cluster.quorum-type: auto
performance.readdir-ahead: on

I haven't had the courage to reboot the VM yet, guess I'll go do that.

-- 
PGP Fingerprint : 0x624E42C734DAC346


Re: [Gluster-users] @devel - Why no inotify?

2018-05-03 Thread lemonnierk
Hey,

I thought about it a while back, haven't actually done it but I assume
using inotify on the brick should work, at least in replica volumes
(disperse probably wouldn't, you wouldn't get all events or you'd need
to make sure your inotify runs on every brick). Then from there you
could notify your clients, not ideal, but that should work.

I agree that adding support for inotify directly into gluster would be
great, but I'm not sure gluster has any mechanics for notifying clients
of changes since most of the logic is in the client, as I understand it.
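
If you want to try the brick-side idea, a very rough sketch (assuming
inotify-tools is installed and your brick lives under /mnt/glusterfs/brick,
both just examples):

  # watch the brick recursively and print one line per event;
  # .glusterfs is gluster's internal metadata, so exclude it
  inotifywait -m -r --exclude '/\.glusterfs/' \
      -e create -e modify -e delete -e moved_to \
      --timefmt '%F %T' --format '%T %w%f %e' \
      /mnt/glusterfs/brick

From there you'd still have to ship the events to your clients yourself.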

On Thu, May 03, 2018 at 04:33:30PM +0100, lejeczek wrote:
> hi guys
> 
> will we have gluster with inotify? some point / never?
> 
> thanks, L.
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users

-- 
PGP Fingerprint : 0x624E42C734DAC346


Re: [Gluster-users] Exact purpose of network.ping-timeout

2017-12-29 Thread lemonnierk
On Fri, Dec 29, 2017 at 03:19:36PM +1100, Sam McLeod wrote:
> Sure, if you never restart / autoscale anything and if your use case isn't 
> bothered with up to 42 seconds of downtime, for us - 42 seconds is a really 
> long time for something like a patient management system to refuse file 
> attachments from being uploaded etc...
> 

It won't refuse anything for 42 seconds, it'll just take 42 seconds +
whatever time the upload would take to complete.
Might be as bad to you, I don't know, but it shouldn't refuse.



Re: [Gluster-users] Exact purpose of network.ping-timeout

2017-12-28 Thread lemonnierk
I/O is frozen, so you don't get errors, just a delay when accessing.
It's completely transparent, and for VM disks at least even 40 seconds is
fine: not long enough for a web server to time out, the visitor just
thinks the site was slow for a minute.

It really hasn't been that bad here, but I guess it all depends on what
the files are.

On Thu, Dec 28, 2017 at 12:57:21PM +1100, Sam McLeod wrote:
> 10 seconds is a very long time for files to go away for applications used at 
> any scale, it is however what I've set our failover time to after being 
> shocked by the default of 42 seconds.
> 
> --
> Sam McLeod
> https://smcleod.net
> https://twitter.com/s_mcleod
> 
> > On 27 Dec 2017, at 10:17 pm, Omar Kohl  wrote:
> > 
> > Hi,
> > 
> >> If you set it to 10 seconds, and a node goes down, you'll see a 10 seconds 
> >> freez in all I/O for the volume.
> > 
> > Exactly! ONLY 10 seconds instead of the default 42 seconds :-)
> > 
> > As I said before the problem with the 42 seconds is that a Windows Samba 
> > Client will disconnect (and therefore interrupt any read/write operation) 
> > after waiting for about 25 seconds. So 42 seconds is too high. In this case 
> > it would therefore make more sense to reduce the ping-timeout, right?
> > 
> > Has anyone done any performance measurements on what the implications of a 
> > low ping-timeout are? What are the costs of "triggering heals all the time"?
> > 
> > On a related note I found the 
> > extras/hook-scripts/start/post/S29CTDBsetup.sh script that mounts a CTDB 
> > (Samba) share and explicitly sets the ping-timeout to 10 seconds. There is 
> > a comment saying: "Make sure ping-timeout is not default for CTDB volume". 
> > Unfortunately there is no explanation in the script, in the commit or in 
> > the Gerrit review history (https://review.gluster.org/#/c/7569/, 
> > https://review.gluster.org/#/c/8007/) for WHY you make sure ping-timeout is 
> > not default. Can anyone tell me the reason?
> > 
> > Kind regards,
> > Omar
> > 
> 

> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users




Re: [Gluster-users] Exact purpose of network.ping-timeout

2017-12-28 Thread lemonnierk
Can't tell you, I only use gluster for VM disks.
The heal will hammer performance pretty badly, but that really depends on
what you do, so I'd say test it a bunch and use whatever works best.

I think they advise a high value to make sure you don't have two
nodes marked down in close succession, which could either cause a
split-brain or make your volume read-only for a while, depending on your
config and number of nodes.

On Wed, Dec 27, 2017 at 11:17:01AM +, Omar Kohl wrote:
> Hi,
> 
> > If you set it to 10 seconds, and a node goes down, you'll see a 10 seconds 
> > freez in all I/O for the volume.
> 
> Exactly! ONLY 10 seconds instead of the default 42 seconds :-)
> 
> As I said before the problem with the 42 seconds is that a Windows Samba 
> Client will disconnect (and therefore interrupt any read/write operation) 
> after waiting for about 25 seconds. So 42 seconds is too high. In this case 
> it would therefore make more sense to reduce the ping-timeout, right?
> 
> Has anyone done any performance measurements on what the implications of a 
> low ping-timeout are? What are the costs of "triggering heals all the time"?
> 
> On a related note I found the extras/hook-scripts/start/post/S29CTDBsetup.sh 
> script that mounts a CTDB (Samba) share and explicitly sets the ping-timeout 
> to 10 seconds. There is a comment saying: "Make sure ping-timeout is not 
> default for CTDB volume". Unfortunately there is no explanation in the 
> script, in the commit or in the Gerrit review history 
> (https://review.gluster.org/#/c/7569/, https://review.gluster.org/#/c/8007/) 
> for WHY you make sure ping-timeout is not default. Can anyone tell me the 
> reason?
> 
> Kind regards,
> Omar
> 
> -Ursprüngliche Nachricht-
> Von: gluster-users-boun...@gluster.org 
> [mailto:gluster-users-boun...@gluster.org] Im Auftrag von lemonni...@ulrar.net
> Gesendet: Dienstag, 26. Dezember 2017 22:05
> An: gluster-users@gluster.org
> Betreff: Re: [Gluster-users] Exact purpose of network.ping-timeout
> 
> Hi,
> 
> It's just the delay for which a node can stop responding before being marked 
> as down.
> Basically that's how long a node can go down before a heal becomes necessary 
> to bring it back.
> 
> If you set it to 10 seconds, and a node goes down, you'll see a 10 seconds 
> freez in all I/O for the volume. That's why you don't want it too high 
> (having a 2 minutes freez on I/O for example would be pretty bad, depending 
> on what you host), but you don't want it too low either (to avoid triggering 
> heals all the time).
> 
> You can configure it because it depends on what you host. You might be okay 
> with a few minutes freez to avoid a heal, or you might not care about heals 
> at all and prefer a very low value to avoid feezes.
> The default value should work pretty well for most things though
> 
> On Tue, Dec 26, 2017 at 01:11:48PM +, Omar Kohl wrote:
> > Hi,
> > 
> > I have a question regarding the "ping-timeout" option. I have been 
> > researching its purpose for a few days and it is not completely clear to 
> > me. Especially that it is apparently strongly encouraged by the Gluster 
> > community not to change or at least decrease this value!
> > 
> > Assuming that I set ping-timeout to 10 seconds (instead of the default 42) 
> > this would mean that if I have a network outage of 11 seconds then Gluster 
> > internally would have to re-allocate some resources that it freed after the 
> > 10 seconds, correct? But apart from that there are no negative 
> > implications, are there? For instance if I'm copying files during the 
> > network outage then those files will continue copying after those 11 
> > seconds.
> > 
> > This means that the only purpose of ping-timeout is to save those extra 
> > resources that are used by "short" network outages. Is that correct?
> > 
> > If I am confident that my network will not have many 11 second outages and 
> > if they do occur I am willing to incur those extra costs due to resource 
> > allocation is there any reason not to set ping-timeout to 10 seconds?
> > 
> > The problem I have with a long ping-timeout is that the Windows Samba 
> > Client disconnects after 25 seconds. So if one of the nodes of a Gluster 
> > cluster shuts down ungracefully then the Samba Client disconnects and the 
> > file that was being copied is incomplete on the server. These "costs" seem 
> > to be much higher than the potential costs of those Gluster resource 
> > re-allocations. But it is hard to estimate because there is not clear 
> > documentation what exactly those Gluster costs are.
> > 
> > In general I would be very interested in a comprehensive explanation of 
> > ping-timeout and the up- and downsides of setting high or low values for it.
> > 
> > Kinds regards,
> > Omar
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > http://lists.gluster.org/mailman/listinfo/gluster-users
> 

Re: [Gluster-users] Exact purpose of network.ping-timeout

2017-12-26 Thread lemonnierk
Hi,

It's just the delay for which a node can stop responding before being
marked as down.
Basically that's how long a node can go down before a heal becomes
necessary to bring it back.

If you set it to 10 seconds, and a node goes down, you'll see a 10
second freeze in all I/O for the volume. That's why you don't want it
too high (having a 2 minute freeze on I/O for example would be
pretty bad, depending on what you host), but you don't want it too
low either (to avoid triggering heals all the time).

You can configure it because it depends on what you host. You might be
okay with a few minutes' freeze to avoid a heal, or you might not care
about heals at all and prefer a very low value to avoid freezes.
The default value should work pretty well for most things though.
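
If you want to play with it, it's just a volume option (a sketch, "myvol"
being whatever your volume is called; volume get needs a reasonably recent
release):

  # current value, 42 seconds by default
  gluster volume get myvol network.ping-timeout

  # lower it, e.g. to 10 seconds
  gluster volume set myvol network.ping-timeout 10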

On Tue, Dec 26, 2017 at 01:11:48PM +, Omar Kohl wrote:
> Hi,
> 
> I have a question regarding the "ping-timeout" option. I have been 
> researching its purpose for a few days and it is not completely clear to me. 
> Especially that it is apparently strongly encouraged by the Gluster community 
> not to change or at least decrease this value!
> 
> Assuming that I set ping-timeout to 10 seconds (instead of the default 42) 
> this would mean that if I have a network outage of 11 seconds then Gluster 
> internally would have to re-allocate some resources that it freed after the 
> 10 seconds, correct? But apart from that there are no negative implications, 
> are there? For instance if I'm copying files during the network outage then 
> those files will continue copying after those 11 seconds.
> 
> This means that the only purpose of ping-timeout is to save those extra 
> resources that are used by "short" network outages. Is that correct?
> 
> If I am confident that my network will not have many 11 second outages and if 
> they do occur I am willing to incur those extra costs due to resource 
> allocation is there any reason not to set ping-timeout to 10 seconds?
> 
> The problem I have with a long ping-timeout is that the Windows Samba Client 
> disconnects after 25 seconds. So if one of the nodes of a Gluster cluster 
> shuts down ungracefully then the Samba Client disconnects and the file that 
> was being copied is incomplete on the server. These "costs" seem to be much 
> higher than the potential costs of those Gluster resource re-allocations. But 
> it is hard to estimate because there is not clear documentation what exactly 
> those Gluster costs are.
> 
> In general I would be very interested in a comprehensive explanation of 
> ping-timeout and the up- and downsides of setting high or low values for it.
> 
> Kinds regards,
> Omar
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users



Re: [Gluster-users] Adding a slack for communication?

2017-11-09 Thread lemonnierk
> 
> and for chat I've found that if IRC + a good web frontend for history/search 
> isn't enough using either Mattermost (https://about.mattermost.com/ 
> ) or Rocket Chat (https://rocket.chat/ 
> ) has been very successful.
> 

+1 for Rocket.Chat, we've switched to that when the team started asking
about slack (and I just never want to hear about that for us) and
everyone is very happy with it.



Re: [Gluster-users] create volume in two different Data Centers

2017-10-24 Thread lemonnierk
Hi,

You can, but unless the two datacenters are very close, it'll be slow as
hell. I tried it myself and even a 10ms ping between the bricks is
horrible.

On Tue, Oct 24, 2017 at 01:42:49PM +0330, atris adam wrote:
> Hi
> 
> I have two data centers, each of them have 3 servers. This two data centers
> can see each other over the internet.
> I want to create a distributed glusterfs volume with these 6 servers, but I
> have only one valid ip in each data center. Is it possible to create a
> glusterfs volume?Can anyone guide me?
> 
> thx alot

> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users




Re: [Gluster-users] data corruption - any update?

2017-10-11 Thread lemonnierk
> corruption happens only in this cases:
> 
> - volume with shard enabled
> AND
> - rebalance operation
> 

I believe so

> So, what if I have to replace a failed brick/disk? Will this trigger
> a rebalance and then corruption?
> 
> A rebalance is only needed when you have to expand a volume, i.e. by
> adding more bricks?

That's correct, replacing a brick shouldn't cause corruption, I've done
it a few times without any problems. As long as you don't expand the
cluster, you are fine.

Basically you can add or remove replicas all you want, but you can't add
new replica sets.
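
So for a dead disk the safe path is a plain replace-brick, roughly this
(hostnames and paths are made up):

  # swap the dead brick for an empty one, self-heal then copies the data back
  gluster volume replace-brick myvol \
      server2:/bricks/old server2:/bricks/new \
      commit force

  # and watch the heal finish
  gluster volume heal myvol info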



Re: [Gluster-users] Peer isolation while healing

2017-10-09 Thread lemonnierk
On Mon, Oct 09, 2017 at 03:29:41PM +0200, ML wrote:
> The server's load was huge during the healing (cpu at 100%), and the 
> disk latency increased a lot.

Depending on the file sizes, you might want to consider changing the
heal algorithm. It might be better to just re-download the whole file /
shard than to try and heal it, assuming you don't have big files. That
would free up the CPU.



Re: [Gluster-users] Adding bricks to an existing installation.

2017-09-25 Thread lemonnierk
Do you have sharding enabled? If yes, don't do it.
If not, I'll let someone who knows better answer you :)

On Mon, Sep 25, 2017 at 02:27:13PM -0400, Ludwig Gamache wrote:
> All,
> 
> We currently have a Gluster installation which is made of 2 servers. Each
> server has 10 drives on ZFS. And I have a gluster mirror between these 2.
> 
> The current config looks like:
> SERVER A-BRICK 1 replicated to SERVER B-BRICK 1
> 
> I now need to add more space and a third server. Before I do the changes, I
> want to know if this is a supported config. By adding a third server, I
> simply want to distribute the load. I don't want to add extra redundancy.
> 
> In the end, I want to have the following done:
> Add a peer to the cluster
> Add 2 bricks to the cluster (one on server A and one on SERVER C) to the
> existing volume
> Add 2 bricks to the cluster (one on server B and one on SERVER C) to the
> existing volume
> After that, I need to rebalance all the data between the bricks...
> 
> Is this config supported? Is there something I should be careful before I
> do this? SHould I do a rebalancing before I add the 3 set of disks?
> 
> Regards,
> 
> 
> Ludwig

> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users




Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-09 Thread lemonnierk
Mh, not so sure really, we're using libgfapi and it's been working perfectly
fine. And trust me, there have been A LOT of crashes, reboots and
kills of nodes.

Maybe it's a version thing? A new bug in the newer gluster releases that
doesn't affect our 3.7.15.

On Sat, Sep 09, 2017 at 10:19:24AM -0700, WK wrote:
> Well, that makes me feel better.
> 
> I've seen all these stories here and on Ovirt recently about VMs going 
> read-only, even on fairly simply layouts.
> 
> Each time, I've responded that we just don't see those issues.
> 
> I guess the fact that we were lazy about switching to gfapi turns out to 
> be a potential explanation 
> 
> -wk
> 
> 
> 
> 
> 
> 
> On 9/9/2017 6:49 AM, Pavel Szalbot wrote:
> > Yes, this is my observation so far.
> >
> > On Sep 9, 2017 13:32, "Gionatan Danti"  > > wrote:
> >
> >
> > So, to recap:
> > - with gfapi, your VMs crashes/mount read-only with a single node
> > failure;
> > - with gpapi also, fio seems to have no problems;
> > - with native FUSE client, both VMs and fio have no problems at all.
> >
> 

> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users




Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-08 Thread lemonnierk
Oh, you really don't want to go below 30s, I was told.
I'm using 30 seconds for the timeout, and indeed when a node goes down
the VMs freeze for 30 seconds, but I've never seen them go read-only because of
that.

I _only_ use virtio though, maybe it's that. What are you using?


On Fri, Sep 08, 2017 at 11:41:13AM +0200, Pavel Szalbot wrote:
> Back to replica 3 w/o arbiter. Two fio jobs running (direct=1 and
> direct=0), rebooting one node... and VM dmesg looks like:
> 
> [  483.862664] blk_update_request: I/O error, dev vda, sector 23125016
> [  483.898034] blk_update_request: I/O error, dev vda, sector 2161832
> [  483.901103] blk_update_request: I/O error, dev vda, sector 2161832
> [  483.904045] Aborting journal on device vda1-8.
> [  483.906959] blk_update_request: I/O error, dev vda, sector 2099200
> [  483.908306] blk_update_request: I/O error, dev vda, sector 2099200
> [  483.909585] Buffer I/O error on dev vda1, logical block 262144,
> lost sync page write
> [  483.911121] blk_update_request: I/O error, dev vda, sector 2048
> [  483.912192] blk_update_request: I/O error, dev vda, sector 2048
> [  483.913221] Buffer I/O error on dev vda1, logical block 0, lost
> sync page write
> [  483.914546] EXT4-fs error (device vda1):
> ext4_journal_check_start:56: Detected aborted journal
> [  483.916230] EXT4-fs (vda1): Remounting filesystem read-only
> [  483.917231] EXT4-fs (vda1): previous I/O error to superblock detected
> [  483.917353] JBD2: Error -5 detected when updating journal
> superblock for vda1-8.
> [  483.921106] blk_update_request: I/O error, dev vda, sector 2048
> [  483.922147] blk_update_request: I/O error, dev vda, sector 2048
> [  483.923107] Buffer I/O error on dev vda1, logical block 0, lost
> sync page write
> 
> Root fs is read-only even with 1s ping-timeout...
> 
> I really hope I have been idiot for almost a year now and someone
> shows what am I doing completely wrong because I dream about joining
> the hordes of fellow colleagues who store multiple VMs in gluster and
> never had a problem with it. I also suspect the CentOS libvirt version
> to be the cause.
> 
> -ps
> 
> 
> On Fri, Sep 8, 2017 at 10:50 AM, Pavel Szalbot  
> wrote:
> > FYI I set up replica 3 (no arbiter this time), did the same thing -
> > rebooted one node during lots of file IO on VM and IO stopped.
> >
> > As I mentioned either here or in another thread, this behavior is
> > caused by high default of network.ping-timeout. My main problem used
> > to be that setting it to low values like 3s or even 2s did not prevent
> > the FS to be mounted as read-only in the past (at least with arbiter)
> > and docs describe reconnect as very costly. If I set ping-timeout to
> > 1s disaster of read-only mount is now prevented.
> >
> > However I find it very strange because in the past I actually did end
> > up with read-only filesystem despite of the low ping-timeout.
> >
> > With replica 3 after node reboot iftop shows data flowing only to the
> > one of remaining two nodes and there is no entry in heal info for the
> > volume. Explanation would be very much appreciated ;-)
> >
> > Few minutes later I reverted back to replica 3 with arbiter (group
> > virt, ping-timeout 1). All nodes are up. During first fio run, VM
> > disconnected my ssh session, so I reconnected and saw ext4 problems in
> > dmesg. I deleted the VM and started a new one. Glustershd.log fills
> > with metadata heal shortly after fio job starts, but this time system
> > is stable.
> > Rebooting one of the nodes does not cause any problem (watching heal
> > log, i/o on vm).
> >
> > So I decided to put more stress on VMs disk - I added second job with
> > direct=1 and started it (now both are running) while one gluster node
> > is still booting. What happened? One fio job reports "Bus error" and
> > VM segfaults when trying to run dmesg...
> >
> > Is this gfapi related? Is this bug in arbiter?
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users



Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-06 Thread lemonnierk
Mh, I never had to do that and I never had that problem. Is that an
arbiter specific thing ? With replica 3 it just works.

On Wed, Sep 06, 2017 at 03:59:14PM -0400, Alastair Neil wrote:
> you need to set
> 
> cluster.server-quorum-ratio 51%
> 
> On 6 September 2017 at 10:12, Pavel Szalbot  wrote:
> 
> > Hi all,
> >
> > I have promised to do some testing and I finally find some time and
> > infrastructure.
> >
> > So I have 3 servers with Gluster 3.10.5 on CentOS 7. I created
> > replicated volume with arbiter (2+1) and VM on KVM (via Openstack)
> > with disk accessible through gfapi. Volume group is set to virt
> > (gluster volume set gv_openstack_1 virt). VM runs current (all
> > packages updated) Ubuntu Xenial.
> >
> > I set up following fio job:
> >
> > [job1]
> > ioengine=libaio
> > size=1g
> > loops=16
> > bs=512k
> > direct=1
> > filename=/tmp/fio.data2
> >
> > When I run fio fio.job and reboot one of the data nodes, IO statistics
> > reported by fio drop to 0KB/0KB and 0 IOPS. After a while, root
> > filesystem gets remounted as read-only.
> >
> > If you care about infrastructure, setup details etc., do not hesitate to
> > ask.
> >
> > Gluster info on volume:
> >
> > Volume Name: gv_openstack_1
> > Type: Replicate
> > Volume ID: 2425ae63-3765-4b5e-915b-e132e0d3fff1
> > Status: Started
> > Snapshot Count: 0
> > Number of Bricks: 1 x (2 + 1) = 3
> > Transport-type: tcp
> > Bricks:
> > Brick1: gfs-2.san:/export/gfs/gv_1
> > Brick2: gfs-3.san:/export/gfs/gv_1
> > Brick3: docker3.san:/export/gfs/gv_1 (arbiter)
> > Options Reconfigured:
> > nfs.disable: on
> > transport.address-family: inet
> > performance.quick-read: off
> > performance.read-ahead: off
> > performance.io-cache: off
> > performance.stat-prefetch: off
> > performance.low-prio-threads: 32
> > network.remote-dio: enable
> > cluster.eager-lock: enable
> > cluster.quorum-type: auto
> > cluster.server-quorum-type: server
> > cluster.data-self-heal-algorithm: full
> > cluster.locking-scheme: granular
> > cluster.shd-max-threads: 8
> > cluster.shd-wait-qlength: 1
> > features.shard: on
> > user.cifs: off
> >
> > Partial KVM XML dump:
> >
> > (libvirt disk XML stripped by the mail archive; it pointed at
> > gv_openstack_1/volume-77ebfd13-6a92-4f38-b036-e9e55d752e1e via gfapi)
> >
> > Networking is LACP on data nodes, stack of Juniper EX4550's (10Gbps
> > SFP+), separate VLAN for Gluster traffic, SSD only on Gluster all
> > nodes (including arbiter).
> >
> > I would really love to know what am I doing wrong, because this is my
> > experience with Gluster for a long time a and a reason I would not
> > recommend it as VM storage backend in production environment where you
> > cannot start/stop VMs on your own (e.g. providing private clouds for
> > customers).
> > -ps
> >
> >
> > On Sun, Sep 3, 2017 at 10:21 PM, Gionatan Danti 
> > wrote:
> > > Il 30-08-2017 17:07 Ivan Rossi ha scritto:
> > >>
> > >> There has ben a bug associated to sharding that led to VM corruption
> > >> that has been around for a long time (difficult to reproduce I
> > >> understood). I have not seen reports on that for some time after the
> > >> last fix, so hopefully now VM hosting is stable.
> > >
> > >
> > > Mmmm... this is precisely the kind of bug that scares me... data
> > corruption
> > > :|
> > > Any more information on what causes it and how to resolve? Even if in
> > newer
> > > Gluster releases it is a solved bug, knowledge on how to treat it would
> > be
> > > valuable.
> > >
> > >
> > > Thanks.
> > >
> > > --
> > > Danti Gionatan
> > > Supporto Tecnico
> > > Assyoma S.r.l. - www.assyoma.it
> > > email: g.da...@assyoma.it - i...@assyoma.it
> > > GPG public key ID: FF5F32A8
> > > ___
> > > Gluster-users mailing list
> > > Gluster-users@gluster.org
> > > http://lists.gluster.org/mailman/listinfo/gluster-users
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > http://lists.gluster.org/mailman/listinfo/gluster-users
> >

> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users




Re: [Gluster-users] GlusterFS as virtual machine storage

2017-09-03 Thread lemonnierk
On Sun, Sep 03, 2017 at 10:21:33PM +0200, Gionatan Danti wrote:
> Il 30-08-2017 17:07 Ivan Rossi ha scritto:
> > There has ben a bug associated to sharding that led to VM corruption
> > that has been around for a long time (difficult to reproduce I
> > understood). I have not seen reports on that for some time after the
> > last fix, so hopefully now VM hosting is stable.
> 
> Mmmm... this is precisely the kind of bug that scares me... data 
> corruption :|
> Any more information on what causes it and how to resolve? Even if in 
> newer Gluster releases it is a solved bug, knowledge on how to treat it 
> would be valuable.
> 

I don't have a solution; instead of growing my volumes I just create new
ones. Couldn't tell you if it's solved in recent releases, never had the
courage to try it out :)
It's a bit hard to trigger too, so having it work once might not be
enough.



Re: [Gluster-users] GlusterFS as virtual machine storage

2017-08-30 Thread lemonnierk
Solved as of 3.7.12. The only bug left is when adding new bricks to
create a new replica set; not sure where we are now on that bug, but
that's not a common operation (well, at least for me).

On Wed, Aug 30, 2017 at 05:07:44PM +0200, Ivan Rossi wrote:
> There has ben a bug associated to sharding that led to VM corruption that
> has been around for a long time (difficult to reproduce I understood). I
> have not seen reports on that for some time after the last fix, so
> hopefully now VM hosting is stable.
> 
> 2017-08-30 3:57 GMT+02:00 Everton Brogliatto :
> 
> > Ciao Gionatan,
> >
> > I run Gluster 3.10.x (Replica 3 arbiter or 2 + 1 arbiter) to provide
> > storage for oVirt 4.x and I have had no major issues so far.
> > I have done online upgrades a couple of times, power losses, maintenance,
> > etc with no issues. Overall, it is very resilient.
> >
> > Important thing to keep in mind is your network, I run the Gluster nodes
> > on a redundant network using bonding mode 1 and I have performed
> > maintenance on my switches, bringing one of them off-line at a time without
> > causing problems in my Gluster setup or in my running VMs.
> > Gluster recommendation is to enable jumbo frames across the
> > subnet/servers/switches you use for Gluster operations. Your switches must
> > support MTU 9000 + 208 at least.
> >
> > There were two occasions where I purposely caused a split brain situation
> > and I was able to heal the files manually.
> >
> > Volume performance tuning can make a significant difference in Gluster. As
> > others have mentioned previously, sharding is recommended when running VMs
> > as it will split big files in smaller pieces, making it easier for the
> > healing to occur.
> > When you enable sharding, the default sharding block size is 4MB which
> > will significantly reduce your writing speeds. oVirt recommends the shard
> > block size to be 512MB.
> > The volume options you are looking here are:
> > features.shard on
> > features.shard-block-size 512MB
> >
> > I had an experimental setup in replica 2 using an older version of Gluster
> > few years ago and it was unstable, corrupt data and crashed many times. Do
> > not use replica 2. As others have already said, minimum is replica 2+1
> > arbiter.
> >
> > If you have any questions that I perhaps can help with, drop me an email.
> >
> >
> > Regards,
> > Everton Brogliatto
> >
> >
> > On Sat, Aug 26, 2017 at 1:40 PM, Gionatan Danti 
> > wrote:
> >
> >> Il 26-08-2017 07:38 Gionatan Danti ha scritto:
> >>
> >>> I'll surely give a look at the documentation. I have the "bad" habit
> >>> of not putting into production anything I know how to repair/cope
> >>> with.
> >>>
> >>> Thanks.
> >>>
> >>
> >> Mmmm, this should read as:
> >>
> >> "I have the "bad" habit of not putting into production anything I do NOT
> >> know how to repair/cope with"
> >>
> >> Really :D
> >>
> >>
> >> Thanks.
> >>
> >> --
> >> Danti Gionatan
> >> Supporto Tecnico
> >> Assyoma S.r.l. - www.assyoma.it
> >> email: g.da...@assyoma.it - i...@assyoma.it
> >> GPG public key ID: FF5F32A8
> >> ___
> >> Gluster-users mailing list
> >> Gluster-users@gluster.org
> >> http://lists.gluster.org/mailman/listinfo/gluster-users
> >>
> >
> >
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > http://lists.gluster.org/mailman/listinfo/gluster-users
> >

> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users




Re: [Gluster-users] GlusterFS as virtual machine storage

2017-08-25 Thread lemonnierk
> 
> This surprise me: I found DRBD quite simple to use, albeit I mostly use 
> active/passive setup in production (with manual failover)
> 

I think you are talking about DRBD 8, which is indeed very easy. DRBD 9
on the other hand, which is the one that compares to gluster (more or
less), is a whole other story. Never managed to make it work correctly
either



Re: [Gluster-users] GlusterFS as virtual machine storage

2017-08-25 Thread lemonnierk
> This is true even if I manage locking at application level (via virlock 
> or sanlock)?

Yes. Gluster has its own quorum; you can disable it but that's just a
recipe for disaster.

> Also, on a two-node setup it is *guaranteed* for updates to one node to 
> put offline the whole volume?

I think so, but I never took the chance so who knows.

> On the other hand, a 3-way setup (or 2+arbiter) if free from all these 
> problems?
> 

Free from a lot of problems, but apparently not as good as a replica 3
volume. I can't comment on arbiter, I only have replica 3 clusters. I
can tell you that my colleagues setting up 2-node clusters have _a lot_
of problems.



Re: [Gluster-users] GlusterFS as virtual machine storage

2017-08-23 Thread lemonnierk
Really ? I can't see why. But I've never used arbiter so you probably
know more about this than I do.

In any case, with replica 3, never had a problem.

On Wed, Aug 23, 2017 at 09:13:28PM +0200, Pavel Szalbot wrote:
> Hi, I believe it is not that simple. Even replica 2 + arbiter volume
> with default network.ping-timeout will cause the underlying VM to
> remount filesystem as read-only (device error will occur) unless you
> tune mount options in VM's fstab.
> -ps
> 
> 
> On Wed, Aug 23, 2017 at 6:59 PM,   wrote:
> > What he is saying is that, on a two node volume, upgrading a node will
> > cause the volume to go down. That's nothing weird, you really should use
> > 3 nodes.
> >
> > On Wed, Aug 23, 2017 at 06:51:55PM +0200, Gionatan Danti wrote:
> >> Il 23-08-2017 18:14 Pavel Szalbot ha scritto:
> >> > Hi, after many VM crashes during upgrades of Gluster, losing network
> >> > connectivity on one node etc. I would advise running replica 2 with
> >> > arbiter.
> >>
> >> Hi Pavel, this is bad news :(
> >> So, in your case at least, Gluster was not stable? Something as simple
> >> as an update would let it crash?
> >>
> >> > I once even managed to break this setup (with arbiter) due to network
> >> > partitioning - one data node never healed and I had to restore from
> >> > backups (it was easier and kind of non-production). Be extremely
> >> > careful and plan for failure.
> >>
> >> I would use VM locking via sanlock or virtlock, so a split brain should
> >> not cause simultaneous changes on both replicas. I am more concerned
> >> about volume heal time: what will happen if the standby node
> >> crashes/reboots? Will *all* data be re-synced from the master, or only
> >> changed bit will be re-synced? As stated above, I would like to avoid
> >> using sharding...
> >>
> >> Thanks.
> >>
> >>
> >> --
> >> Danti Gionatan
> >> Supporto Tecnico
> >> Assyoma S.r.l. - www.assyoma.it
> >> email: g.da...@assyoma.it - i...@assyoma.it
> >> GPG public key ID: FF5F32A8
> >> ___
> >> Gluster-users mailing list
> >> Gluster-users@gluster.org
> >> http://lists.gluster.org/mailman/listinfo/gluster-users
> >
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > http://lists.gluster.org/mailman/listinfo/gluster-users



Re: [Gluster-users] GlusterFS as virtual machine storage

2017-08-23 Thread lemonnierk
What he is saying is that, on a two node volume, upgrading a node will
cause the volume to go down. That's nothing weird, you really should use
3 nodes.

On Wed, Aug 23, 2017 at 06:51:55PM +0200, Gionatan Danti wrote:
> Il 23-08-2017 18:14 Pavel Szalbot ha scritto:
> > Hi, after many VM crashes during upgrades of Gluster, losing network
> > connectivity on one node etc. I would advise running replica 2 with
> > arbiter.
> 
> Hi Pavel, this is bad news :(
> So, in your case at least, Gluster was not stable? Something as simple 
> as an update would let it crash?
> 
> > I once even managed to break this setup (with arbiter) due to network
> > partitioning - one data node never healed and I had to restore from
> > backups (it was easier and kind of non-production). Be extremely
> > careful and plan for failure.
> 
> I would use VM locking via sanlock or virtlock, so a split brain should 
> not cause simultaneous changes on both replicas. I am more concerned 
> about volume heal time: what will happen if the standby node 
> crashes/reboots? Will *all* data be re-synced from the master, or only 
> changed bit will be re-synced? As stated above, I would like to avoid 
> using sharding...
> 
> Thanks.
> 
> 
> -- 
> Danti Gionatan
> Supporto Tecnico
> Assyoma S.r.l. - www.assyoma.it
> email: g.da...@assyoma.it - i...@assyoma.it
> GPG public key ID: FF5F32A8
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users



Re: [Gluster-users] GlusterFS as virtual machine storage

2017-08-23 Thread lemonnierk
On Mon, Aug 21, 2017 at 10:09:20PM +0200, Gionatan Danti wrote:
> Hi all,
> I would like to ask if, and with how much success, you are using 
> GlusterFS for virtual machine storage.

Hi, we have similar clusters.

> 
> My plan: I want to setup a 2-node cluster, where VM runs on the nodes 
> themselves and can be live-migrated on demand.
> 

Use 3 nodes for gluster, or at the very least 2 + 1 arbiter. You really
really don't want a 2 node setup, you'll get split brains.

> I have some questions:
> - do you use GlusterFS for similar setup?

Yes, but always 3 nodes.

> - if so, how do you feel about it?

Works pretty well now; we've had problems, but 3.7.15 works great. We are
currently testing 3.8, which seems to work fine too. I imagine the newer ones
do as well.

> - if a node crashes/reboots, how the system re-syncs? Will the VM files 
> be fully resynchronized, or the live node keeps some sort of write 
> bitmap to resynchronize changed/written chunks only? (note: I know about 
> sharding, but I would like to avoid it);

You really should use sharding. I'm not sure exactly what happens without it,
but I know it made the VMs unusable during heals (they froze), basically.
Sharding solved that; with it, things work well even during heals. Only the
modified shards get retransferred, so the amount to re-sync is pretty low.

> - finally, how much stable is the system?
> 

Haven't had any problems with gluster itself since we updated to >
3.7.11. It just works, even on a pretty bad network.
Performance isn't amazing, but I imagine you know that: replication
has a cost.

> Thanks.
> 
> -- 
> Danti Gionatan
> Supporto Tecnico
> Assyoma S.r.l. - www.assyoma.it
> email: g.da...@assyoma.it - i...@assyoma.it
> GPG public key ID: FF5F32A8
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users



Re: [Gluster-users] How are bricks healed in Debian Jessie 3.11

2017-08-08 Thread lemonnierk
> Healing of contents works at the entire file level at the moment. For VM 
> image use cases, it is advised to enable sharding by virtue of which 
> heals would be restricted to only the shards that were modified when the 
> brick was down.

We even changed the heal algorithm to "full", since it seems better to just
re-download a small shard than to try to heal it. At least on 3.7 it
works better that way.



Re: [Gluster-users] Mailing list question

2017-08-08 Thread lemonnierk
Hi,

If you haven't subscribed to the mailing-list, indeed you won't get it.
I'd say just "craft" a reply by using the same subject and put Re: in
front of it for your reply.

Next time I'd advise in subscribing before posting, even if to
unsubscribe a few days later when the problem is solved :)

On Tue, Aug 08, 2017 at 10:26:58AM +0300, Ilan Schwarts wrote:
> Hi all,
> 
> How can I answer my question or delete the thread ?
> When I sent a question to gluster-users@gluster.org I didnt get the
> mail (probably by-design), so I cannot reply it with a solution.
> 
> I see it in the
> archive:http://lists.gluster.org/pipermail/gluster-users/2017-August/032008.html
> 
> But i cannot answer/delete it..
> 
> Thanks
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users


signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Volume hacked

2017-08-07 Thread lemonnierk
> It really depends on the application if locks are used. Most (Linux)
> applications will use advisory locks. This means that locking is only
> effective when all participating applications use and honour the locks.
> If one application uses (advisory) locks, and an other application now,
> well, then all bets are off.
> 
> It is also possible to delete files that are in active use. The contens
> will still be served by the filesystem, but there is no accessible
> filename anymore. If the VMs using those files are still running, there
> might be a way to create a new filename for the data. If the VMs have
> been stopped, and the file-descriptior has been closed, the data will be
> gone :-/
>

Oh, the data was gone long before I stopped the VM: every binary was
throwing I/O errors when accessed, and only whatever was already in RAM
(ssh ..) when the disk got deleted was still working.

I'm a bit surprised they could be deleted, but I imagine qemu through
libgfapi doesn't really access the file as a whole, maybe just the part
it needs when it needs it. In any case the gluster logs clearly show
file descriptor errors from 08:47 UTC, which seems to match our first
monitoring alerts. I assume that's when the deletion happened.

Now I just need to figure out what they used to access the volume; I
hope it's just NFS, since that's the only thing I can think of.


signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Volume hacked

2017-08-07 Thread lemonnierk
On Mon, Aug 07, 2017 at 10:40:08AM +0200, Arman Khalatyan wrote:
> Interesting problem...
> Did you considered an insider job?( comes to mind http://verelox.com
>  recent troubles)

I would be really, really surprised; only 5 or 6 of us have access, and as
far as I know no one has a problem with the company.
The last person to leave did so last year, and we revoked everything (I
hope). And I can't think of a reason they'd leave the website of a
Hungarian company in there; we contacted them and they think it's one
of their ex-employees trying to cause them problems.
I think we were just unlucky, but I'd really love to confirm how they
did it.

> 
> On Mon, Aug 7, 2017 at 3:30 AM, W Kern  wrote:
> 
> >
> >
> > On 8/6/2017 4:57 PM, lemonni...@ulrar.net wrote:
> >
> >
> > Gluster already uses a vlan, the problem is that there is no easy way
> > that I know of to tell gluster not to listen on an interface, and I
> > can't not have a public IP on the server. I really wish ther was a
> > simple "listen only on this IP/interface" option for this
> >
> >
> > What about this?
> >
> > transport.socket.bind-address
> >
> > I know the were some BZs on it with earlier Gluster Versions, so I assume 
> > its still there now.
> >
> > -bill
> >
> >
> >
> >
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > http://lists.gluster.org/mailman/listinfo/gluster-users
> >

> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users



signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Volume hacked

2017-08-06 Thread lemonnierk
> You should add VLANS, and/or overlay networks and/or Mac Address 
> filtering/locking/security which raises the bar quite a bit for hackers. 
> Perhaps your provider can help you with that.
> 

Gluster already uses a vlan; the problem is that there is no easy way
that I know of to tell gluster not to listen on an interface, and I
can't avoid having a public IP on the server. I really wish there was a
simple "listen only on this IP/interface" option for this.

> Then there is the Gluster Auth stuff, which is cert based as I recall. 
> Unfortunately, I don't have any experience with it as we have relied on 
> unique seperate physical networks for our clusters.
> Hackers (and us) can't even get to our Gluster boxes except via IP/KVM 
> or the client itself.
> 

Well, I've never used it, but I never thought I needed it since the vlan
gluster uses is private, so outside users can't reach it. I didn't realise
NFS gives access to the volume through any one node, since we don't use it.

> 
> Well if you aren't using it, then turn NFS off. I think NFS is turned 
> off by default in the new versions anyway in favor of NFS-Ganesha.

Yeah, we are still on 3.7 for now, I haven't taken the time to test
newer versions yet. Since 3.7.15 does everything we need pretty well,
I haven't really felt the need to.
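
For what it's worth, turning the built-in NFS server off is just a per-volume
option, something like this (volume name made up):

gluster volume set myvol nfs.disable on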

> 
> But the original question remains, did they get into just the Gluster 
> boxes or are they in the Client already?
> 
> Unless they rooted the boxes and cleaned the logs, there should be some 
> traces of activity in the various system and gluster logs. The various 
> root kit checker programs may find something (chkrootkit)
> 

Well, it's one and the same: gluster is installed on the proxmox servers,
so the VMs are just using localhost as their disk storage. So either they
got into the volume itself (NFS or some other way I haven't thought of),
or they got root on the hypervisors, but in that case why f*ck with
the volume and nothing else?
Since everything else looks okay, I think they just had access to the
volume, and the only way I can think of is NFS. But I don't see anything
really suspicious in nfs.log, it looks to me like only normal glusterd
restart logs.

I'll be sure to scan for rootkits tomorrow just in case, but I assume
they would have re-wiped everything if they still had access.
Googling the link they left I found a forum where some guy got his hard
drive wiped in a similar manner on his router a few days ago, it looks
like someone having fun wiping unsecured NAS .. What a great way to
spend your free time :(



signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Volume hacked

2017-08-06 Thread lemonnierk
On Sun, Aug 06, 2017 at 01:01:56PM -0700, wk wrote:
> I'm not sure what you mean by saying "NFS is available by anyone"?
> 
> Are your gluster nodes physically isolated on their own network/switch?

Nope, impossible to do for us

> 
> In other words can an outsider access them directly without having to 
> compromise a NFS client machine first?
> 

Yes, but we don't have any NFS clients, only libgfapi.
I added a bunch of iptables rules to prevent that from happening, if they
did use NFS, which I am unsure of. If they used something else to access
the volume though, who knows .. It hasn't been re-hacked since, so
that's a good sign.
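
For the record the rules are nothing fancy, roughly this kind of thing (the
10.0.0.0/24 vlan is made up, and the brick port range depends on the version,
so double check yours with gluster volume status):

# allow the private vlan on the gluster ports, drop everyone else
iptables -A INPUT -s 10.0.0.0/24 -p tcp --dport 24007:24008 -j ACCEPT
iptables -A INPUT -s 10.0.0.0/24 -p tcp --dport 49152:49251 -j ACCEPT
iptables -A INPUT -p tcp --dport 24007:24008 -j DROP
iptables -A INPUT -p tcp --dport 49152:49251 -j DROP
# and since we don't use NFS at all, drop it and the portmapper entirely
iptables -A INPUT -p tcp --dport 2049 -j DROP
iptables -A INPUT -p tcp --dport 111 -j DROP
iptables -A INPUT -p udp --dport 111 -j DROP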

> -bill
> 
> 
> On 8/6/2017 7:57 AM, lemonni...@ulrar.net wrote:
> > Hi,
> >
> > This morning one of our cluster was hacked, all the VM disks were
> > deleted and a file README.txt was left with inside just
> > "http://virtualisan.net/contactus.php :D"
> >
> > I don't speak the language but with google translete it looks like it's
> > just a webdev company or something like that, a bit surprised ..
> > In any case, we'd really like to know how that happened.
> >
> > I realised NFS is accessible by anyone (sigh), is there a way to check
> > if that is what they used ? I tried reading the nfs.log but it's not
> > really clear if someone used it or not. What do I need to look for in
> > there to see if someone mounted the volume ?
> > There are stuff in the log on one of the bricks (only one),
> > and as we aren't using NFS for that volume that in itself seems
> > suspicious.
> >
> > Thanks
> >
> >
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > http://lists.gluster.org/mailman/listinfo/gluster-users
> 

> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users



signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Volume hacked

2017-08-06 Thread lemonnierk
Thinking about it, is it even normal that they managed to delete the VM disks?
Shouldn't they have gotten "file in use" errors ? Or does libgfapi not
lock the files it accesses ?


On Sun, Aug 06, 2017 at 03:57:06PM +0100, lemonni...@ulrar.net wrote:
> Hi,
> 
> This morning one of our cluster was hacked, all the VM disks were
> deleted and a file README.txt was left with inside just
> "http://virtualisan.net/contactus.php :D"
> 
> I don't speak the language but with google translete it looks like it's
> just a webdev company or something like that, a bit surprised ..
> In any case, we'd really like to know how that happened.
> 
> I realised NFS is accessible by anyone (sigh), is there a way to check
> if that is what they used ? I tried reading the nfs.log but it's not
> really clear if someone used it or not. What do I need to look for in
> there to see if someone mounted the volume ?
> There are stuff in the log on one of the bricks (only one), 
> and as we aren't using NFS for that volume that in itself seems
> suspicious.
> 
> Thanks



> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users



signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] Volume hacked

2017-08-06 Thread lemonnierk
Hi,

This morning one of our clusters was hacked: all the VM disks were
deleted and a file README.txt was left behind containing just
"http://virtualisan.net/contactus.php :D"

I don't speak the language, but with Google Translate it looks like it's
just a webdev company or something like that, a bit surprising ..
In any case, we'd really like to know how that happened.

I realised NFS is accessible by anyone (sigh), is there a way to check
if that is what they used ? I tried reading the nfs.log but it's not
really clear whether someone used it or not. What do I need to look for in
there to see if someone mounted the volume ?
There is stuff in that log on one of the bricks (only one),
and as we aren't using NFS for that volume, that in itself seems
suspicious.

Thanks


signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Gluster native mount is really slow compared to nfs

2017-07-11 Thread lemonnierk
Hi,

We've been doing that for some clients; basically it works fine if you
configure your OPCache very aggressively. Increase the RAM available to it,
disable any form of OPCache revalidation from disk, and it'll work
great, 'cause your app won't touch gluster.
Then whenever you make a change to the PHP code, just restart PHP to force
it to reload the sources from gluster.
For example :

zend_extension = opcache.so

[opcache]
opcache.enable = 1
opcache.enable_cli = 1
opcache.memory_consumption = 1024
opcache.max_accelerated_files = 8
opcache.revalidate_freq = 300
opcache.validate_timestamps = 1
opcache.interned_strings_buffer = 32
opcache.fast_shutdown = 1

With that config it works well. It needs some getting used to though, since
you'll need to restart PHP to see any change to the sources applied.

If you use something with an on-disk cache (Prestashop, Magento, Typo3 ..),
do think of storing that in a redis or something, never on gluster, that'd
kill performance even more. I've seen a gain of ~10 seconds just by moving
the cache from gluster to redis for Magento, for example.


On Tue, Jul 11, 2017 at 11:01:52AM +0200, Jo Goossens wrote:
> Hello,
> 
>  
>  
> We tried tons of settings to get a php app running on a native gluster mount:
> 
>  
> e.g.: 192.168.140.41:/www /var/www glusterfs 
> defaults,_netdev,backup-volfile-servers=192.168.140.42:192.168.140.43,direct-io-mode=disable
>  0 0
> 
>  
> I tried some mount variants in order to speed up things without luck.
> 
>  
>  
> After that I tried nfs (native gluster nfs 3 and ganesha nfs 4), it was a 
> crazy performance difference.
> 
>  
> e.g.: 192.168.140.41:/www /var/www nfs4 defaults,_netdev 0 0
> 
>  
> I tried a test like this to confirm the slowness:
> 
>  
> ./smallfile_cli.py  --top /var/www/test --host-set 192.168.140.41 --threads 8 
> --files 5000 --file-size 64 --record-size 64
>  This test finished in around 1.5 seconds with NFS and in more than 250 
> seconds without nfs (can't remember exact numbers, but I reproduced it 
> several times for both).
>  With the native gluster mount the php app had loading times of over 10 
> seconds, with the nfs mount the php app loaded around 1 second maximum and 
> even less. (reproduced several times)
>   I tried all kind of performance settings and variants of this but not 
> helped , the difference stayed huge, here are some of the settings played 
> with in random order:
> 
>  
> gluster volume set www features.cache-invalidation on
> gluster volume set www features.cache-invalidation-timeout 600
> gluster volume set www performance.stat-prefetch on
> gluster volume set www performance.cache-samba-metadata on
> gluster volume set www performance.cache-invalidation on
> gluster volume set www performance.md-cache-timeout 600
> gluster volume set www network.inode-lru-limit 25
>  gluster volume set www performance.cache-refresh-timeout 60
> gluster volume set www performance.read-ahead disable
> gluster volume set www performance.readdir-ahead on
> gluster volume set www performance.parallel-readdir on
> gluster volume set www performance.write-behind-window-size 4MB
> gluster volume set www performance.io-thread-count 64
>  gluster volume set www performance.client-io-threads on
>  gluster volume set www performance.cache-size 1GB
> gluster volume set www performance.quick-read on
> gluster volume set www performance.flush-behind on
> gluster volume set www performance.write-behind on
> gluster volume set www nfs.disable on
>  gluster volume set www client.event-threads 3
> gluster volume set www server.event-threads 3
>   
>  
> The NFS ha adds a lot of complexity which we wouldn't need at all in our 
> setup, could you please explain what is going on here? Is NFS the only 
> solution to get acceptable performance? Did I miss one crucial settting 
> perhaps?
> 
>  
> We're really desperate, thanks a lot for your help!
> 
>  
>  
> PS: We tried with gluster 3.11 and 3.8 on Debian, both had terrible 
> performance when not used with nfs.
> 
>  
>  
> 
> 
> Kind regards
> 
> Jo Goossens
> 
>  
>  
>  

> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users



signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Urgent :) Procedure for replacing Gluster Node on 3.8.12

2017-06-09 Thread lemonnierk
> Must admit this sort of process - replacing bricks and/or node is *very*
> stressful with gluster. That sick feeling in the stomach - will I have to
> restore everything from backups?
> 
> Shouldn't be this way.

I know exactly what you mean.
Last weekend I replaced a server (it was working fine, though); I did it
in the middle of the night, very stressed. It ended up going perfectly fine
and no one noticed anything, but I had that exact sick feeling in the stomach.

Well, we haven't had any problems lately, so I guess it'll go away after a
while; I just have to get used to seeing it work fine, I suppose. That's kind
of why I haven't dared to upgrade from 3.7 yet, I think.


signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Urgent :) Procedure for replacing Gluster Node on 3.8.12

2017-06-09 Thread lemonnierk
> And a big thanks (*not*) to the smart reporting which showed no issues at
> all.

Heh, on that, did you think to take a look at the Media_Wearout indicator ?
I recently learned that existed, and it explained A LOT.


signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Urgent :) Procedure for replacing Gluster Node on 3.8.12

2017-06-09 Thread lemonnierk
> I'm thinking the following:
> 
> gluster volume remove-brick datastore4 replica 2
> vna.proxmox.softlog:/tank/vmdata/datastore4 force
> 
> gluster volume add-brick datastore4 replica 3
> vnd.proxmox.softlog:/tank/vmdata/datastore4

I think that should work perfectly fine yes, either that
or directly use replace-brick ?
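
From memory the replace-brick route would be a one-liner, something like this
(then just keep an eye on the heal):

gluster volume replace-brick datastore4 \
    vna.proxmox.softlog:/tank/vmdata/datastore4 \
    vnd.proxmox.softlog:/tank/vmdata/datastore4 \
    commit force
gluster volume heal datastore4 info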


signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] NFS-Ganesha packages for debian aren't installing

2017-06-07 Thread lemonnierk
Well the NFS-Ganesha packages don't seem to install init.d files, no.
For reference, a locate ganesha gives that after installation of the .deb :

/lib/systemd/system/nfs-ganesha-config.service
/lib/systemd/system/nfs-ganesha-config.service-in.cmake
/lib/systemd/system/nfs-ganesha-lock.service
/lib/systemd/system/nfs-ganesha.service

I did indeed find a valid init.d script in the deb repo; after creating
a few directories by hand it started fine. But I wonder why it's not in
the .deb packages ?

As for gluster, yes they do come with init.d files and that's been working
very well, that's why I was surprised not to find them for NFS Ganesha.

On Wed, Jun 07, 2017 at 06:29:20PM +0530, Kaleb S. KEITHLEY wrote:
> On 06/07/2017 06:03 PM, Niels de Vos wrote:
> > On Wed, Jun 07, 2017 at 11:59:14AM +0100, lemonni...@ulrar.net wrote:
> >> Although looking at it I see .service files for systemd but nothing for 
> >> SysV.
> >> Is there no support for SysV ? Guess I'll have to write that myself
> > 
> > The packaging for packages provided by the Gluster Community (not in the
> > standard Debian repos) is maintained here:
> >https://github.com/gluster/glusterfs-debian (check the branches)
> > 
> > I'm pretty sure that patches to enable systemd support are welcome.
> > NFS-Ganesha has systemd enabled for Fedora and CentOS, so the bits
> > should be there somewhere.
> 
> I'm a little confused. lemonnierk says he sees .service files and is 
> looking for SYSV files, which I interpret as meaning init.d files.
> 
> The Debian and Ubuntu packages are installing an init.d file. If you 
> want an example .service file for GlusterFS and/or Ganesha you can get 
> them, as Niels notes, from the Fedora, CentOS, and perhaps the SuSE rpms.
> 
> But even the official Debian packages, e.g. for Gluster, haven't 
> switched to systemd .service files yet (last I looked) and I have been 
> following that lead.
> 
> But if someone wants to send a diff or a pull request I'll be happy to 
> take it.
> 
> > 
> > HTH,
> > Niels
> > 
> > 
> >>
> >> On Wed, Jun 07, 2017 at 11:36:05AM +0100, lemonni...@ulrar.net wrote:
> >>> Wait, ignore that.
> >>> I added the stretch repo .. I think I got mind flooded by the broken link 
> >>> for the key before that,
> >>> sorry about the noise.
> >>>
> >>> On Wed, Jun 07, 2017 at 11:31:22AM +0100, lemonni...@ulrar.net wrote:
> >>>> Hi,
> >>>>
> >>>> I finally have the opportunity to give NFS-Ganesha a try, so I followed 
> >>>> that :
> >>>> https://download.gluster.org/pub/gluster/nfs-ganesha/2.4.5/Debian/
> >>>>
> >>>> But when I try to install it, I get this :
> >>>>
> >>>> The following packages have unmet dependencies:
> >>>>   nfs-ganesha : Depends: libntirpc1 (>= 1.4.3) but it is not going to be 
> >>>> installed
> >>>>   nfs-ganesha-fsal : Depends: libdbus-1-3 (>= 1.9.14) but 
> >>>> 1.8.22-0+deb8u1 is to be installed
> >>>>  Depends: libntirpc1 but it is not going to be 
> >>>> installed
> >>>>  E: Unable to correct problems, you have held broken 
> >>>> packages.
> >>>>
> >>>> Clearly something must be missing, any idea what ?
> >>>> I'm afraid to hear "you need the backports enabled", I'm asking hoping 
> >>>> the packages are just
> >>>> broken right now and will be fixed :)
> >>>>  
> >>>
> >>>
> >>>
> >>>> ___
> >>>> Gluster-users mailing list
> >>>> Gluster-users@gluster.org
> >>>> http://lists.gluster.org/mailman/listinfo/gluster-users
> >>>
> >>
> >>
> >>
> >>> ___
> >>> Gluster-users mailing list
> >>> Gluster-users@gluster.org
> >>> http://lists.gluster.org/mailman/listinfo/gluster-users
> >>
> > 
> > 
> > 
> >> ___
> >> Gluster-users mailing list
> >> Gluster-users@gluster.org
> >> http://lists.gluster.org/mailman/listinfo/gluster-users
> > 
> > 
> > 
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > http://lists.gluster.org/mailman/listinfo/gluster-users
> > 
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users


signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] NFS-Ganesha packages for debian aren't installing

2017-06-07 Thread lemonnierk
Hi,

I finally have the opportunity to give NFS-Ganesha a try, so I followed that :
https://download.gluster.org/pub/gluster/nfs-ganesha/2.4.5/Debian/

But when I try to install it, I get this :

The following packages have unmet dependencies:
 nfs-ganesha : Depends: libntirpc1 (>= 1.4.3) but it is not going to be installed
 nfs-ganesha-fsal : Depends: libdbus-1-3 (>= 1.9.14) but 1.8.22-0+deb8u1 is to be installed
                    Depends: libntirpc1 but it is not going to be installed
E: Unable to correct problems, you have held broken packages.

Clearly something must be missing, any idea what ?
I'm afraid the answer is "you need the backports enabled"; I'm asking in the
hope that the packages are just broken right now and will be fixed :)
  


signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] 3.7.13 - Safe to replace brick ?

2017-05-24 Thread lemonnierk
Hi,

Does anyone know if the corruption bugs we've had for a while in add-brick
only happen when adding new bricks, or does replace-brick corrupt shards
too ?

I have a 3.7.13 volume with a brick I'd like to move to another
server, and I'll do a backup and the move at night just in case,
but I'd rather know if it risks corrupting the disk or not.

I believe the rebalance bug is a separate issue, it won't be a problem
here as it's only a replica 3 with no distribute at all.

Thanks !


signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] URGENT - Cheat on quorum

2017-05-22 Thread lemonnierk
> 
> Great, that worked.  ie  gluster volume set VOL 
> cluster.server-quorum-type none
> 
> Although I did get an error  of "Volume set: failed: Commit failed on 
> localhost, please check the log files for more details"
> 
> but then I noticed that volume immediately came back up and I was able 
> to mount the single remaining node and access those files.
> 
> So you need to do both settings in my scenario.
> 

Thanks for testing that, it's good to know that the solution actually works !
We don't use arbiters so I guess we're okay on that side though, but if it
ever explodes again I'm glad to know there is a way to quickly get back some
service, even if that means re-creating a new volume later on.
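
For the wiki, the whole cheat then boils down to something like this (VOL is
your volume name, and don't forget to put the options back once the other
nodes are up and healed):

gluster volume set VOL cluster.quorum-type none
gluster volume set VOL cluster.server-quorum-type none
# later, once everything is back and healthy, restore the defaults
gluster volume reset VOL cluster.quorum-type
gluster volume reset VOL cluster.server-quorum-type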


signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] URGENT - Cheat on quorum

2017-05-18 Thread lemonnierk
> If you know what you are getting into, then `gluster v set  
> cluster.quorum-type none` should give you the desired result, i.e. allow 
> write access to the volume.

Thanks a lot ! We won't be needing it now, but I'll write that in the wiki
just in case.

We realised that the problem was the CacheCade SSDs, they all died together
on every node. That makes sense though, they were being used in exactly the
same way through gluster.
We disabled those for now, and once the heal finishes we'll change them one by
one.

signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

[Gluster-users] URGENT - Cheat on quorum

2017-05-18 Thread lemonnierk
Hi,


We are having huge hardware issues (oh joy ..) with RAID cards.
On a replica 3 volume, we have 2 nodes down. Can we somehow tell
gluster that it's quorum is 1, to get some amount of service back
while we try to fix the other nodes or install new ones ?

Thanks


signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Question about healing and adding a new brick

2017-05-12 Thread lemonnierk
> I have the scenario to expand a single gluster server with no replica to a
> replica of 2 by adding a new server.

No sharding, right ?

> 
> Since I have many TB's of data, can I use the first gluster server while
> the data is being replicated to the second new brick or should I wait for
> it to finish?

You can use it on both during the heal, performance won't be good though.
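
For reference it would be something like this (made-up names); the heal then
copies the data over in the background:

gluster peer probe server2
gluster volume add-brick myvol replica 2 server2:/data/brick
gluster volume heal myvol full   # kick off the full heal
gluster volume heal myvol info   # check progress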


signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] small files optimizations

2017-05-10 Thread lemonnierk
On Wed, May 10, 2017 at 09:14:59AM +0200, Gandalf Corvotempesta wrote:
> Yes much clearer but I think this makes some trouble like space available
> shown by gluster. Or not?

Not really. You'll just see "used space" on your volumes that you won't be
able to track down. Keep in mind that the used space reported for each volume
will be the same (gluster reports the used space of the brick's filesystem).


signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] [Gluster-devel] Don't allow data loss via add-brick (was Re: Add single server)

2017-05-03 Thread lemonnierk
> Fix is up @ https://review.gluster.org/#/c/17160/ . The only thing which
> we'd need to decide (and are debating on) is that should we bypass this
> validation with rebalance start force or not. What do others think?

I think that if you are specifying force, that's your business. Maybe still
show a warning and ask for a yes / no confirmation after, in case some people
use force "by default" ?


signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Add single server

2017-05-02 Thread lemonnierk
> Don't bother with another bug. We have raised
> https://github.com/gluster/glusterfs/issues/169 for the issue in mail
> thread.

If I'm not mistaken that's about the possibility of adding bricks
without adding a full replica set at once, that's a different subject.

We were talking about adding a warning in the cli when trying to add
bricks on a sharded volume, which currently corrupts VM disks.


signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Add single server

2017-04-30 Thread lemonnierk
> So I was a little but luck. If I has all the hardware part, probably i
> would be firesd after causing data loss by using a software marked as stable

Yes, we lost our data last year to this bug, and it wasn't a test cluster.
We still hear about it from our clients to this day.

> Is known that this feature is causing data loss and there is no evidence or
> no warning in official docs.
> 

I was (I believe) the first one to run into the bug; it happens, and I knew it
was a risk when installing gluster.
But since then I haven't seen any warnings anywhere except here, so I agree
with you that it should be mentioned in big bold letters on the site.

Might even be worth adding a warning directly in the cli when trying to
add bricks if sharding is enabled, to make sure no one destroys a
whole cluster because of a known bug.


> Il 30 apr 2017 12:14 AM,  ha scritto:
> 
> > I have to agree though, you keep acting like a customer.
> > If you don't like what the developers focus on, you are free to
> > try and offer a bounty to motivate someone to look at what you want,
> > or even better : go and buy a license for one of gluster's commercial
> > alternatives.
> >
> >
> > On Sat, Apr 29, 2017 at 11:43:54PM +0200, Gandalf Corvotempesta wrote:
> > > I'm pretty sure that I'll be able to sleep well even after your block.
> > >
> > > Il 29 apr 2017 11:28 PM, "Joe Julian"  ha scritto:
> > >
> > > > No, you proposed a wish. A feature needs described behavior, certainly
> > a
> > > > lot more than "it should just know what I want it to do".
> > > >
> > > > I'm done. You can continue to feel entitled here on the mailing list.
> > I'll
> > > > just set my filters to bitbucket anything from you.
> > > >
> > > > On 04/29/2017 01:00 PM, Gandalf Corvotempesta wrote:
> > > >
> > > > I repeat: I've just proposed a feature
> > > > I'm not a C developer and I don't know gluster internals, so I can't
> > > > provide details
> > > >
> > > > I've just asked if simplifying the add brick process is something that
> > > > developers are interested to add
> > > >
> > > > Il 29 apr 2017 9:34 PM, "Joe Julian"  ha
> > scritto:
> > > >
> > > >> What I said publicly in another email ... but not to call out my
> > > >> perception of your behavior publicly if also like to say:
> > > >>
> > > >> Acting adversarial doesn't make anybody want to help, especially not
> > me
> > > >> and I'm the user community's biggest proponent.
> > > >>
> > > >> On April 29, 2017 11:08:45 AM PDT, Gandalf Corvotempesta <
> > > >> gandalf.corvotempe...@gmail.com> wrote:
> > > >>>
> > > >>> Mine was a suggestion
> > > >>> Fell free to ignore was gluster users has to say and still keep going
> > > >>> though your way
> > > >>>
> > > >>> Usually, open source project tends to follow users suggestions
> > > >>>
> > > >>> Il 29 apr 2017 5:32 PM, "Joe Julian"  ha
> > scritto:
> > > >>>
> > >  Since this is an open source community project, not a company
> > product,
> > >  feature requests like these are welcome, but would be more welcome
> > with
> > >  either code or at least a well described method. Broad asks like
> > these are
> > >  of little value, imho.
> > > 
> > > 
> > >  On 04/29/2017 07:12 AM, Gandalf Corvotempesta wrote:
> > > 
> > > > Anyway, the proposed workaround:
> > > > https://joejulian.name/blog/how-to-expand-glusterfs-replicat
> > > > ed-clusters-by-one-server/
> > > > won't work with just a single volume made up of 2 replicated
> > bricks.
> > > > If I have a replica 2 volume with server1:brick1 and
> > server2:brick1,
> > > > how can I add server3:brick1 ?
> > > > I don't have any bricks to "replace"
> > > >
> > > > This is something i would like to see implemented in gluster.
> > > >
> > > > 2017-04-29 16:08 GMT+02:00 Gandalf Corvotempesta
> > > > :
> > > >
> > > >> 2017-04-24 10:21 GMT+02:00 Pranith Kumar Karampuri <
> > > >> pkara...@redhat.com>:
> > > >>
> > > >>> Are you suggesting this process to be easier through commands,
> > > >>> rather than
> > > >>> for administrators to figure out how to place the data?
> > > >>>
> > > >>> [1] http://lists.gluster.org/pipermail/gluster-users/2016-July/0
> > > >>> 27431.html
> > > >>>
> > > >> Admin should always have the ability to choose where to place
> > data,
> > > >> but something
> > > >> easier should be added, like in any other SDS.
> > > >>
> > > >> Something like:
> > > >>
> > > >> gluster volume add-brick gv0 new_brick
> > > >>
> > > >> if gv0 is a replicated volume, the add-brick should automatically
> > add
> > > >> the new brick and rebalance data automatically, still keeping the
> > > >> required redundancy level
> > > >>
> > > >> In case admin would like to set a custom placement for data, it
> > 

Re: [Gluster-users] Add single server

2017-04-29 Thread lemonnierk
I have to agree though, you keep acting like a customer.
If you don't like what the developers focus on, you are free to
try and offer a bounty to motivate someone to look at what you want,
or even better : go and buy a license for one of gluster's commercial
alternatives.


On Sat, Apr 29, 2017 at 11:43:54PM +0200, Gandalf Corvotempesta wrote:
> I'm pretty sure that I'll be able to sleep well even after your block.
> 
> Il 29 apr 2017 11:28 PM, "Joe Julian"  ha scritto:
> 
> > No, you proposed a wish. A feature needs described behavior, certainly a
> > lot more than "it should just know what I want it to do".
> >
> > I'm done. You can continue to feel entitled here on the mailing list. I'll
> > just set my filters to bitbucket anything from you.
> >
> > On 04/29/2017 01:00 PM, Gandalf Corvotempesta wrote:
> >
> > I repeat: I've just proposed a feature
> > I'm not a C developer and I don't know gluster internals, so I can't
> > provide details
> >
> > I've just asked if simplifying the add brick process is something that
> > developers are interested to add
> >
> > Il 29 apr 2017 9:34 PM, "Joe Julian"  ha scritto:
> >
> >> What I said publicly in another email ... but not to call out my
> >> perception of your behavior publicly if also like to say:
> >>
> >> Acting adversarial doesn't make anybody want to help, especially not me
> >> and I'm the user community's biggest proponent.
> >>
> >> On April 29, 2017 11:08:45 AM PDT, Gandalf Corvotempesta <
> >> gandalf.corvotempe...@gmail.com> wrote:
> >>>
> >>> Mine was a suggestion
> >>> Fell free to ignore was gluster users has to say and still keep going
> >>> though your way
> >>>
> >>> Usually, open source project tends to follow users suggestions
> >>>
> >>> Il 29 apr 2017 5:32 PM, "Joe Julian"  ha scritto:
> >>>
>  Since this is an open source community project, not a company product,
>  feature requests like these are welcome, but would be more welcome with
>  either code or at least a well described method. Broad asks like these 
>  are
>  of little value, imho.
> 
> 
>  On 04/29/2017 07:12 AM, Gandalf Corvotempesta wrote:
> 
> > Anyway, the proposed workaround:
> > https://joejulian.name/blog/how-to-expand-glusterfs-replicat
> > ed-clusters-by-one-server/
> > won't work with just a single volume made up of 2 replicated bricks.
> > If I have a replica 2 volume with server1:brick1 and server2:brick1,
> > how can I add server3:brick1 ?
> > I don't have any bricks to "replace"
> >
> > This is something i would like to see implemented in gluster.
> >
> > 2017-04-29 16:08 GMT+02:00 Gandalf Corvotempesta
> > :
> >
> >> 2017-04-24 10:21 GMT+02:00 Pranith Kumar Karampuri <
> >> pkara...@redhat.com>:
> >>
> >>> Are you suggesting this process to be easier through commands,
> >>> rather than
> >>> for administrators to figure out how to place the data?
> >>>
> >>> [1] http://lists.gluster.org/pipermail/gluster-users/2016-July/0
> >>> 27431.html
> >>>
> >> Admin should always have the ability to choose where to place data,
> >> but something
> >> easier should be added, like in any other SDS.
> >>
> >> Something like:
> >>
> >> gluster volume add-brick gv0 new_brick
> >>
> >> if gv0 is a replicated volume, the add-brick should automatically add
> >> the new brick and rebalance data automatically, still keeping the
> >> required redundancy level
> >>
> >> In case admin would like to set a custom placement for data, it should
> >> specify a "force" argument or something similiar.
> >>
> >> tl;dr: as default, gluster should preserve data redundancy allowing
> >> users to add single bricks without having to think how to place data.
> >> This will make gluster way easier to manage and much less error prone,
> >> thus increasing the resiliency of the whole gluster.
> >> after all , if you have a replicated volume, is obvious that you want
> >> your data to be replicated and gluster should manage this on it's own.
> >>
> >> Is this something are you planning or considering for further
> >> implementation?
> >> I know that lack of metadata server (this is a HUGE advantage for
> >> gluster) means less flexibility, but as there is a manual workaround
> >> for adding
> >> single bricks, gluster should be able to handle this automatically.
> >>
> > ___
> > Gluster-users mailing list
> > Gluster-users@gluster.org
> > http://lists.gluster.org/mailman/listinfo/gluster-users
> >
> 
>  ___
>  Gluster-users mailing list
>  Gluster-users@gluster.org
>  http://lists.gluster.org/mailman/listinfo/gluster-users
> 
> 

Re: [Gluster-users] Cluster management

2017-04-25 Thread lemonnierk
> 1) can I add a fourth server, with one brick, increasing the total
> available space? If yes, how?

No

> 
> 2) can I increase replica count from 3 to 4 ?

Yes

> 
> 3) can I decrease replica count from 3 to 2 ?

Yes

> 
> 4) can I move from a replicated volume to a distributed replicated volume ?

Yes, that's what decreasing the replica count would do
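
For 2) and 3), from memory the commands look something like this (brick paths
made up); for 1) there is no single-brick equivalent, you'd have to add a full
replica set at once:

# replica 3 -> replica 4, adding a fourth copy
gluster volume add-brick myvol replica 4 server4:/data/brick
# replica 3 -> replica 2, dropping one copy
gluster volume remove-brick myvol replica 2 server3:/data/brick force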


signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] web folder on glusterfs

2017-04-22 Thread lemonnierk
Hi,

I can't talk about performance on newer versions, but as of gluster
3.7 / 3.8, performance for small files (like a website) is pretty bad.
It does work well though, as long as you configure OPCache to keep
everything in memory (bump the cache size and disable stat).

As for storing a disk image on gluster, it would perform a lot better, yes,
but that's not doable for a website, as you need all of your webservers
to access the files at the same time. For that you'd have to use OCFS2
instead of ext4, and that defeats the purpose of using glusterfs in the
first place I think.

You could consider setting up your 3 servers as hypervisors (we are using
proxmox in that configuration) and use gluster as your shared storage,
then set up a VM on that cluster: that way you would have a highly
available web server. That works very well, but it might be overkill
for just one website.

I'd go with what you are thinking of; it's simple to set up and with
aggressive OPCache settings it works fairly well! You really don't
want to forget disabling stat though. You'll have to restart PHP-FPM
every time you make a change to a PHP file, but it's worth it.
One last thing: use a redis / memcache / whatever to store your
application cache. I don't know Joomla, but if it's anything like
Prestashop / Magento / Wordpress, storing the cache on glusterfs will
kill performance even more than forgetting to disable stat in OPCache.
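
Concretely, "disable stat" means something like this in the OPCache config
(the values are just a starting point):

opcache.memory_consumption = 512   ; bump the cache size
opcache.validate_timestamps = 0    ; never stat() the PHP files on gluster
; with timestamps off, remember to restart PHP-FPM after every deploy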

On Tue, Apr 11, 2017 at 11:00:01PM +0800, Umarzuki Mochlis wrote:
> Hi,
> 
> I'm planning to install glusterfs on 3 nodes of Joomla 3 with nginx &
> php-fpm on buntu 16.04.
> 
> /var/www will be used as storage volume on every node.
> 
> Each node has a secondary network interface only used for mariadb
> cluster and glusterfs. 1G interface and 1G switch.
> 
> Any pros/cons of this kind of setup?
> 
> Would it be better to have a LUN shared among all nodes?
> 
> I was told that to reduce a very slow "ls" command on volume with
> clustered filsystem, it is better that I created a raw image with
> diskdump, format it as ext4 and mount it. Is this the proper way for
> web folders?
> 
> Thanks for any input and suggestions.
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users


signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] current Version

2017-04-19 Thread lemonnierk
Hi,

Look there : 
https://download.gluster.org/pub/gluster/glusterfs/3.8/3.8.11/Debian/jessie/

On Wed, Apr 19, 2017 at 06:54:37AM +0200, Mario Roeber wrote:
> Hello Everyone, 
> 
> my Name is Mario and i’use glusterFS now around 1 Year at Home with some 
> raspberry and USB HD. I’have the question where i’can found for Debian/Jessy 
> the current 8.11 install *deb.
> Maybe someone know. 
> 
> thanks for help.  
> 
> 
> Mario Roeber
> er...@port-x.de
> 
> Sie möchten mit mir Verschluesselt eMails austauschen? Hier mein 
> oeffendlicher Schlüssel.  
> 
> 


>  
> 

> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users



signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users

Re: [Gluster-users] Sharding/Local Gluster Volumes

2017-04-06 Thread lemonnierk
Hi,

Hi,

If you want replica 3, you must have a multiple of 3 bricks.
So no, you can't use 5 bricks for a replica 3 volume; that's one of the
things gluster can't do, unfortunately.

On Thu, Apr 06, 2017 at 01:09:32PM +0200, Holger Rojahn wrote:
> Hi,
> 
> i ask a question several Days ago ...
> in short: Is it Possible to have 5 Bricks with Sharding enabled and 
> replica count 3 to ensure that all files are on 3 of 5 bricks ?
> Manual says i can add any count of bricks to shared volumes but when i 
> test it with replica count 2 i can use 4 bricks but not the 5th because 
> it says must be 2 bricks to add...
> 
> any idea ?
> 
> greets from dinslaken (germany)
> holger ak icebear
> 
> ___
> Gluster-users mailing list
> Gluster-users@gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users


signature.asc
Description: Digital signature
___
Gluster-users mailing list
Gluster-users@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-users