Hi Martin,
Glad it worked! And yes, 3.7.6 is really old! :)
So the issue is occurring when the vm flushes outstanding data to disk. And this
is taking > 120s because there's a lot of buffered writes to flush, possibly followed
by an fsync too which needs to sync them to disk (volume profile would
Hi Krutika,
> Also, gluster version please?
I am running old 3.7.6. (Yes I know I should upgrade asap)
I first applied "network.remote-dio off"; behaviour did not change, VMs
got stuck again after some time.
Then I set "performance.strict-o-direct on" and the problem completely
OK. In that case, can you check if the following two changes help:
# gluster volume set $VOL network.remote-dio off
# gluster volume set $VOL performance.strict-o-direct on
preferably one option changed at a time, its impact tested and then the
next change applied and tested.
Also, gluster
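The one-change-at-a-time procedure suggested above could be scripted roughly as follows. This is only a sketch: `apply_option` and the `GLUSTER` override are hypothetical names introduced for illustration, and the only gluster subcommands assumed are `volume set` and `volume info`.

```shell
#!/bin/sh
# GLUSTER can be overridden (e.g. GLUSTER=echo) for a dry run.
# This override is an assumption of the sketch, not a gluster feature.
GLUSTER="${GLUSTER:-gluster}"

# Hypothetical helper: set one volume option, then print the volume info
# so the reconfigured option can be confirmed before the next test run.
apply_option() {
    vol="$1"; key="$2"; value="$3"
    $GLUSTER volume set "$vol" "$key" "$value" || return 1
    $GLUSTER volume info "$vol"
}

# Usage (run the VM workload and watch for the hang between the two calls):
#   apply_option "$VOL" network.remote-dio off
#   apply_option "$VOL" performance.strict-o-direct on
```

Keeping each change in its own call makes it easier to tell which option actually affected the hang.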
what is the context from dmesg ?
On Mon, May 13, 2019 at 7:33 AM Andrey Volodin
wrote:
> as per
> https://helpful.knobs-dials.com/index.php/INFO:_task_blocked_for_more_than_120_seconds.
> ,
> the informational warning could be suppressed with :
>
> "echo 0 >
as per
https://helpful.knobs-dials.com/index.php/INFO:_task_blocked_for_more_than_120_seconds.
,
the informational warning could be suppressed with :
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
Moreover, as per their website : "*This message is not an error*.
It is an indication that a
Cache in qemu is none. That should be correct. This is the full command:
/usr/bin/qemu-system-x86_64 -name one-312 -S -machine
pc-i440fx-xenial,accel=kvm,usb=off -m 4096 -realtime mlock=off -smp
4,sockets=4,cores=1,threads=1 -uuid e95a774e-a594-4e98-b141-9f30a3f848c1
-no-user-config -nodefaults
Also, what's the caching policy that qemu is using on the affected vms?
Is it cache=none? Or something else? You can get this information in the
command line of qemu-kvm process corresponding to your vm in the ps output.
-Krutika
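One possible way to pull the caching policy out of a qemu command line is sketched below. `extract_cache_modes` is a hypothetical helper name; it simply looks for `cache=` settings in whatever command-line string it is given and assumes the options are comma- or space-separated, as in a typical `-drive` argument.

```shell
#!/bin/sh
# Hypothetical helper: print the value of every cache= setting found in a
# qemu command line passed as a single string (one value per line).
extract_cache_modes() {
    printf '%s\n' "$1" | tr ' ,' '\n\n' | sed -n 's/^cache=//p'
}

# Usage against running VMs (assumes a Linux /proc and pgrep):
#   for pid in $(pgrep -f qemu-system); do
#       extract_cache_modes "$(tr '\0' ' ' < "/proc/$pid/cmdline")"
#   done
```

No output for a VM means no explicit `cache=` setting was found on its command line.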
On Mon, May 13, 2019 at 12:49 PM Krutika Dhananjay
wrote:
> What
What version of gluster are you using?
Also, can you capture and share volume-profile output for a run where you
manage to recreate this issue?
https://docs.gluster.org/en/v3/Administrator%20Guide/Monitoring%20Workload/#running-glusterfs-volume-profile-command
Let me know if you have any
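For reference, capturing the volume-profile output requested above might look roughly like this. It is a sketch only: `profile_start`, `profile_dump` and the `GLUSTER` override are hypothetical names, and the output file path is whatever you choose; the gluster subcommands assumed are `volume profile <VOL> start`, `info` and `stop`.

```shell
#!/bin/sh
# GLUSTER can be overridden (e.g. GLUSTER=echo) for a dry run.
# This override is an assumption of the sketch, not a gluster feature.
GLUSTER="${GLUSTER:-gluster}"

# Start collecting profile statistics on the volume.
profile_start() {
    $GLUSTER volume profile "$1" start
}

# After the hang has been reproduced: save the statistics to a file,
# then stop profiling.
profile_dump() {
    $GLUSTER volume profile "$1" info > "$2" && $GLUSTER volume profile "$1" stop
}

# Usage:
#   profile_start "$VOL"
#   ... reproduce the "blocked for more than 120 seconds" hang ...
#   profile_dump "$VOL" /tmp/profile.txt
```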
Hi,
there is no healing operation, no peer disconnects, no read-only filesystem.
Yes, storage is slow and unavailable for 120 seconds, but why? It's SSD with
10G, performance is good.
> you'd have it's log on qemu's standard output,
If you mean /var/log/libvirt/qemu/vm.log there is nothing. I
On Mon, May 13, 2019 at 08:47:45AM +0200, Martin Toth wrote:
> Hi all,
Hi
>
> I am running replica 3 on SSDs with 10G networking, everything works OK but
> VMs stored in Gluster volume occasionally freeze with “Task XY blocked for
> more than 120 seconds”.
> Only solution is to poweroff