On 08.10.2015 at 16:15, Digimer wrote:
On 08/10/15 07:50 AM, J. Echter wrote:
Hi,

I have a strange issue on CentOS 6.5.

If I install a new VM on node1, it works well.

If I install a new VM on node2, it gets stuck.

The same happens if I run dd if=/dev/zero of=/dev/DATEN/vm-test on node2.

On node1 it works:

dd if=/dev/zero of=vm-test
writing to 'vm-test': No space left on device
83886081+0 records in
83886080+0 records out
42949672960 bytes (43 GB) copied, 2338.15 s, 18.4 MB/s


dmesg shows the following (while dd'ing on node2):

INFO: task flush-253:18:9820 blocked for more than 120 seconds.
       Not tainted 2.6.32-573.7.1.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
<snip>
Any hint on fixing that?
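
For anyone hitting the same message: the task name in the dmesg line above has the form flush-MAJOR:MINOR, so flush-253:18 is the writeback thread for block device 253:18. A minimal sketch for pulling those reports out of the kernel log follows. The function name hung_tasks is made up for illustration, and the dmsetup step assumes the stuck device is a device-mapper volume, which the thread does not confirm.

```shell
# Hypothetical helper: list hung-task reports found in kernel log text on stdin.
# Prints one line per report: "<task name> <pid> <timeout in seconds>".
hung_tasks() {
    sed -n 's/.*INFO: task \(.*\):\([0-9][0-9]*\) blocked for more than \([0-9][0-9]*\) seconds.*/\1 \2 \3/p'
}

# Example with the line quoted above:
echo 'INFO: task flush-253:18:9820 blocked for more than 120 seconds.' | hung_tasks
# prints: flush-253:18 9820 120

# On a live node (assumption: run as root, dmsetup installed):
#   dmesg | hung_tasks
#   dmsetup info -c    # compare the Maj and Min columns with 253:18
```

Matching the major:minor pair against the dmsetup listing should show which volume the blocked flusher belongs to.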
Every time I've seen this, it was because dlm was blocked. The most
common cause of DLM blocking is a failed fence call. Do you have fencing
configured *and* tested?

If I were to guess, given the rather limited information you shared
about your setup, the live migration consumed the network bandwidth,
choking out corosync traffic which caused the peer to be declared lost,
called a fence which failed and left locking hung (which is by design;
better to hang than risk corruption).
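
One way to check the chain described above (corosync traffic, then fencing, then DLM) while the dd is hung is sketched here. The helper names run_if_present and check_cluster_locking are made up for illustration. fence_tool applies to cman-based clusters, which is only an assumption about this setup, so each tool is skipped if it is not installed.

```shell
# Hypothetical helper: run a command if it is installed, otherwise say so.
run_if_present() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "== $*"
        "$@" || echo "($1 exited with status $?)"
    else
        echo "== $1: not installed on this node"
    fi
}

check_cluster_locking() {
    run_if_present corosync-cfgtool -s   # ring status: is cluster traffic getting through?
    run_if_present fence_tool ls         # fence domain state (cman-based clusters)
    run_if_present dlm_tool ls           # DLM lockspaces and their current state
}

check_cluster_locking
```

Run it on both nodes while the write is stuck. A fence domain or lockspace that stays in a waiting state would point at the failed-fence scenario; a clean result on both nodes suggests looking elsewhere.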

Hi,

fencing is configured and works.

I re-checked it by typing

echo c > /proc/sysrq-trigger

into node2 console.

The machine is fenced and comes back up. But the problem persists.

_______________________________________________
Users mailing list: [email protected]
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
