> Kevin, did you solve this issue? Any updates?
Oh yeah, we discussed it on IRC and it's apparently a known bug;
it's fixed in the next version. I tested a patched version and it
does seem to work, so I've been waiting for 3.7.12 since then to
do some proper testing and confirm that it's been fixed.
2016-05-27 13:56 GMT+02:00 Kevin Lemonnier :
> Yes, I did configure it to do a daily scrub when I reinstalled last time,
> when I was wondering if maybe it was hardware. Doesn't seem like it detected
> anything.
Kevin, did you solve this issue? Any updates?
On 27/05/2016 9:56 PM, Kevin Lemonnier wrote:
Yes, I did configure it to do a daily scrub when I reinstalled last time,
when I was wondering if maybe it was hardware. Doesn't seem like it detected
anything.
I was wondering if the scrub was interfering with things
--
Lindsay Mathieson
>Just a thought - do you have bitrot detection enabled? (I don't)
Yes, I did configure it to do a daily scrub when I reinstalled last time,
when I was wondering if maybe it was hardware. Doesn't seem like it detected
anything.
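For anyone wanting to check the same thing: assuming a volume named gv0 (the name here is just an example), the bitrot options and scrub results can be managed with something like:

```shell
# Enable bitrot detection on the volume (gv0 is an example name)
gluster volume bitrot gv0 enable

# Scrub once a day
gluster volume bitrot gv0 scrub-frequency daily

# See what the last scrub actually found (errors, skipped files, duration)
gluster volume bitrot gv0 scrub status
```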
--
Kevin Lemonnier
PGP Fingerprint : 89A5 2283 04A0 E6E9 0111
On 26/05/2016 1:58 AM, Kevin Lemonnier wrote:
There, re-created the VM from scratch, and still got the same errors.
Just a thought - do you have bitrot detection enabled? (I don't)
--
Lindsay Mathieson
_______________________________________________
Gluster-users mailing list
There, re-created the VM from scratch, and still got the same errors.
Attached are the logs, I created the VM on node 50, worked fine. I tried
to reboot it and start my import again, still worked fine. I powered off the
VM, then started it again on node 2, rebooted it a bunch and just got the…
Just did that, below is the output.
Didn't seem to move after the boot, and no new lines when the I/O errors
appeared.
Also, as mentioned, I tried moving the disk onto NFS and had the exact
same errors, so it doesn't look like it's a libgfapi problem.
I should probably re-create the VM, maybe
On 25/05/2016 5:58 PM, Kevin Lemonnier wrote:
I use XFS, I read that was recommended. What are you using?
Since yours seems to work, I'm not opposed to changing!
ZFS
- RAID10 (4 * WD Red 3TB)
- 8GB ram dedicated to ZFS
- SSD for log and cache (10GB and 100GB partitions respectively)
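A layout like the one above could be built roughly as follows; the pool and device names are illustrative, not Lindsay's actual ones:

```shell
# RAID10: two mirrored pairs, striped by the pool
zpool create tank mirror /dev/sda /dev/sdb mirror /dev/sdc /dev/sdd

# SSD partitions for the intent log (ZIL) and read cache (L2ARC)
zpool add tank log /dev/sde1     # ~10GB partition
zpool add tank cache /dev/sde2   # ~100GB partition

# Cap the ARC at 8GB on Linux (value in bytes, applied at module load)
echo "options zfs zfs_arc_max=8589934592" >> /etc/modprobe.d/zfs.conf
```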
Also, it seems Lindsay knows a way to get the gluster client logs when
using proxmox and libgfapi. Would it be possible for you to get that
sorted with Lindsay's help before you next recreate this issue, and
share the glusterfs client logs from all the nodes when you do hit it?
It is
Hi,
Not that I know of, no. It doesn't look like the bricks have trouble
communicating, but is there a simple way to check that in glusterFS,
some sort of brick uptime? Who knows, maybe the bricks are flickering
and I don't notice, that's entirely possible.
As mentioned, the problem occurs on…
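For what it's worth, the closest thing to a brick uptime check is probably `gluster volume status`, which lists whether each brick process is online along with its PID, so a flapping brick would show up as offline or with a freshly restarted PID (the volume name below is an example):

```shell
# Online (Y/N), TCP port and PID for every brick of the volume
gluster volume status gv0

# Pending heal entries are another hint that a brick dropped out at some point
gluster volume heal gv0 info
```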
Hi Kevin,
If you actually ran into a 'read-only filesystem' issue, then it could
possibly be because of a bug in AFR that Pranith recently fixed.
To confirm that is indeed the case, could you tell me whether you saw
the pause after a (single) brick was down while IO was going on?
-Krutika
> What's the underlying filesystem under the bricks?
I use XFS, I read that was recommended. What are you using?
Since yours seems to work, I'm not opposed to changing!
--
Kevin Lemonnier
PGP Fingerprint : 89A5 2283 04A0 E6E9 0111
Nope, not solved!
Looks like directsync just delays the problem, this morning the VM had
thrown a bunch of I/O errors again. Tried writethrough and it seems to
behave exactly like cache=none, the errors appear in a few minutes.
Trying again with directsync and no errors for now, so it looks like…
So the VMs were configured with cache set to none; I just tried with
cache=directsync and it seems to be fixing the issue. Still need to run
more tests, but I did a couple already with that option and no I/O errors.
Never had to do this before, is it known? Found the clue in some old mail
from this list…
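For reference, the cache mode is a per-disk option in Proxmox, mapping to QEMU's -drive cache= setting; the VM ID, storage and disk names below are examples only:

```shell
# Proxmox: switch an existing virtio disk of VM 100 to directsync
qm set 100 --virtio0 gluster-store:vm-100-disk-1,cache=directsync

# Roughly the equivalent plain-QEMU invocation over libgfapi
qemu-system-x86_64 -drive file=gluster://ipvr2/gv0/vm.qcow2,cache=directsync ...
```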
Hi,
Some news on this.
I actually don't need to trigger a heal to get corruption, so the problem
is not the healing. Live migrating the VM seems to trigger corruption every
time, and even without that, just doing a database import, rebooting, then
doing another import seems to corrupt as well.
Hi,
I didn't specify it, but I use "localhost" to add the storage in proxmox.
My thinking is that every proxmox node is also a glusterFS node, so that
should work fine. I don't want to use the "normal" way of setting a regular
address in there, because you can't change it afterwards in proxmox…
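The resulting entry in /etc/pve/storage.cfg would look something like this (storage ID and volume name are made up for the example):

```
glusterfs: gv0-store
        server localhost
        volume gv0
        content images
```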
*David Gossage*
*Carousel Checks Inc. | System Administrator*
*Office* 708.613.2284
On Thu, May 19, 2016 at 7:25 PM, Kevin Lemonnier wrote:
> The I/O errors are happening after, not during the heal.
> As described, I just rebooted a node, waited for the heal to finish,
>
The I/O errors are happening after, not during the heal.
As described, I just rebooted a node, waited for the heal to finish,
rebooted another, waited for the heal to finish then rebooted the third.
From that point, the VM just has a lot of I/O errors showing whenever I
use the disk a lot
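A simple way to be sure a heal really has finished before rebooting the next node is to poll heal info until no brick reports pending entries; gv0 is an example volume name:

```shell
# Block until self-heal reports zero pending entries on every brick
while gluster volume heal gv0 info | grep -q "Number of entries: [1-9]"; do
    sleep 30
done
echo "heal finished, safe to reboot the next node"
```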
I am slightly confused: you say you have image file corruption, but then
you say qemu-img check reports no corruption. If what you mean is that
you see I/O errors during a heal, this is likely due to I/O starvation,
a well-known issue.
There is work happening to…
On 19/05/2016 12:17 AM, Lindsay Mathieson wrote:
One thought - since the VM's are active while the brick is
removed/re-added, could it be the shards that are written while the
brick is added that are the reverse healing shards?
I tested by:
- removing brick 3
- erasing brick 3
- closing…
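The remove/erase/re-add cycle above looks roughly like this on a replica 3 volume; the volume name, host and brick paths are examples:

```shell
# Drop brick 3 from the replica set
gluster volume remove-brick gv0 replica 2 node3:/data/brick force

# Wipe it so it comes back empty
rm -rf /data/brick && mkdir -p /data/brick

# Re-add it and kick off a full heal
gluster volume add-brick gv0 replica 3 node3:/data/brick
gluster volume heal gv0 full
```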
On 18/05/2016 11:41 PM, Krutika Dhananjay wrote:
I will try to recreate this issue tomorrow on my machines with the
steps that Lindsay provided in this thread. I will let you know the
result soon after that.
Thanks Krutika, I've been trying to get the shard stats you wanted, but
by the time…
Some additional details if it helps: there is no cache on the disk,
it's virtio with iothread=1. The file is qcow, and using qemu-img check
it says it's not corrupted, but when the VM is running I get I/O errors.
As you can see in the config, performance.stat-prefetch is off, but being
on a debian…
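The check mentioned is just the following; the image path is an example, and it should be run while the VM is stopped to get a reliable result:

```shell
# Offline consistency check of the qcow2 metadata
qemu-img check /var/lib/vz/images/100/vm-100-disk-1.qcow2
```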
Hi,
I will try to recreate this issue tomorrow on my machines with the steps
that Lindsay provided in this thread. I will let you know the result soon
after that.
-Krutika
On Wednesday, May 18, 2016, Kevin Lemonnier wrote:
Hi,
Some news on this.
Over the weekend the RAID card of the node ipvr2 died, and I thought
that maybe that was the problem all along. The RAID card was changed
and yesterday I reinstalled everything.
Same problem just now.
My test is simple: using the website hosted on the VMs all the time, I…
As discussed, the missing ipvr50 log file.
On Thu, May 12, 2016 at 04:24:14PM +0200, Kevin Lemonnier wrote:
> As requested on IRC, here are the logs on the 3 nodes.
>
> On Thu, May 12, 2016 at 04:03:02PM +0200, Kevin Lemonnier wrote:
> > Hi,
> >
> > I had a problem some time ago with 3.7.6 and
On 13/05/2016 12:03 AM, Kevin Lemonnier wrote:
I just tried to refresh the database by importing the production one on
the two MySQL VMs, and both of them started doing I/O errors.
Sorry, I don't quite understand what you did - you migrated 1 or 2 VMs
onto the test gluster volume?
--
As requested on IRC, here are the logs on the 3 nodes.
On Thu, May 12, 2016 at 04:03:02PM +0200, Kevin Lemonnier wrote:
> Hi,
>
> I had a problem some time ago with 3.7.6 and freezing during heals,
> and multiple persons advised to use 3.7.11 instead. Indeed, with that
> version the freeze…
Hi,
I had a problem some time ago with 3.7.6 and freezing during heals,
and multiple people advised using 3.7.11 instead. Indeed, with that
version the freeze problem is fixed, it works like a dream! You can
almost not tell that a node is down or healing, everything keeps working
except for a…