----- Original Message -----
> From: "Boyan Tabakov" <bl...@alslayer.net>
> To: "Nir Soffer" <nsof...@redhat.com>
> Cc: users@ovirt.org
> Sent: Tuesday, February 25, 2014 11:53:45 AM
> Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on
> some nodes
>
> Hello,
>
> On 22.2.2014, 22:19, Nir Soffer wrote:
> > ----- Original Message -----
> >> From: "Boyan Tabakov" <bl...@alslayer.net>
> >> To: "Nir Soffer" <nsof...@redhat.com>
> >> Cc: users@ovirt.org
> >> Sent: Wednesday, February 19, 2014 7:18:36 PM
> >> Subject: Re: [Users] SD Disk's Logical Volume not visible/activated
> >> on some nodes
> >>
> >> Hello,
> >>
> >> On 19.2.2014, 17:09, Nir Soffer wrote:
> >>> ----- Original Message -----
> >>>> From: "Boyan Tabakov" <bl...@alslayer.net>
> >>>> To: users@ovirt.org
> >>>> Sent: Tuesday, February 18, 2014 3:34:49 PM
> >>>> Subject: [Users] SD Disk's Logical Volume not visible/activated on
> >>>> some nodes
> >>>
> >>>> Consequently, when creating/booting a VM with the said disk
> >>>> attached, the VM fails to start on host2, because host2 can't see
> >>>> the LV. Similarly, if the VM is started on host1, it fails to
> >>>> migrate to host2. An extract from the host2 log is at the end. The
> >>>> LV in question is 6b35673e-7062-4716-a6c8-d5bf72fe3280.
> >>>>
> >>>> As far as I could quickly track in the vdsm code, there is only a
> >>>> call to lvs, and not to lvscan or lvchange, so the host2 LVM
> >>>> doesn't fully refresh.

lvs should see any change on the shared storage.

> >>>> The only workaround so far has been to restart VDSM on host2,
> >>>> which makes it refresh all LVM data properly.

When vdsm starts, it calls multipath -r, which ensures that we see all
physical volumes.

> >>>>
> >>>> When is host2 supposed to pick up any newly created LVs in the SD
> >>>> VG? Any suggestions where the problem might be?
> >>>
> >>> When you create a new lv on the shared storage, the new lv should be
> >>> visible on the other host. Let's start by verifying that you do see
> >>> the new lv after a disk was created.
> >>>
> >>> Try this:
> >>>
> >>> 1. Create a new disk, and check the disk uuid in the engine ui
> >>> 2. On another machine, run this command:
> >>>
> >>>     lvs -o vg_name,lv_name,tags
> >>>
> >>> You can identify the new lv using the tags, which should contain the
> >>> new disk uuid.
> >>>
> >>> If you don't see the new lv from the other host, please provide
> >>> /var/log/messages and /var/log/sanlock.log.
> >>
> >> Just tried that. The disk is not visible on the non-SPM node.
> >
> > This means that storage is not accessible from this host.
>
> Generally, the storage seems to be accessible. For example, if I
> restart vdsmd, all volumes get picked up correctly (they become visible
> in lvs output and VMs can be started with them).

Let's repeat this test, but now, if you do not see the new lv, please run:

    multipath -r

and report the results.
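Something like the script below would run the whole check in one go (a
rough sketch, not vdsm code; DISK_UUID is a placeholder for the disk uuid
you noted in the engine ui):

    # Manual check: is the lv backing the new disk visible on this host?
    DISK_UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx   # placeholder

    # the new lv should carry a tag containing the disk uuid
    if lvs --noheadings -o vg_name,lv_name,tags | grep -q "$DISK_UUID"; then
        echo "lv for disk $DISK_UUID is visible on this host"
    else
        # not visible - rescan multipath maps and look again
        multipath -r
        lvs --noheadings -o vg_name,lv_name,tags | grep "$DISK_UUID"
    fi

If the lv shows up only after the multipath -r, that would point at the
multipath layer rather than at the lvm metadata itself.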
> >> Here's the full sanlock.log for that host:
> ...
> >> 0x7fc37c0008c0:0x7fc37c0008d0:0x7fc391f5f000 ioto 10 to_count 1
> >> 2014-02-06 05:24:10+0200 563065 [31453]: s1 delta_renew read rv -202
> >> offset 0 /dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/ids
> >
> > Sanlock cannot write to the ids lockspace.
>
> Which line shows that sanlock can't write? The messages are not very
> "human readable".

The one above my comment, at 2014-02-06 05:24:10+0200.

I suggest raising the sanlock log level to get more detailed output in
the sanlock log file. Edit /etc/sysconfig/sanlock and add:

    # -L 7: use debug level logging to sanlock log file
    SANLOCKOPTS="$SANLOCKOPTS -L 7"

> >>
> >> Last entry is from yesterday, while I just created a new disk.
> >
> > What was the status of this host in the engine from 2014-02-06
> > 05:24:10+0200 to 2014-02-18 14:22:16?
> >
> > vdsm.log and engine.log for this time frame will make it more clear.
>
> Host was up and running. The vdsm and engine logs are quite large, as
> we were running some VM migrations between the hosts. Any pointers at
> what to look for? For example, I noticed many entries in engine.log
> like this:

It will be hard to make any progress without the logs.

>
> One warning that I keep seeing in vdsm logs on both nodes is this:
>
> Thread-1617881::WARNING::2014-02-24
> 16:57:50,627::sp::1553::Storage.StoragePool::(getInfo) VG
> 3307f6fa-dd58-43db-ab23-b1fb299006c7's metadata size exceeded
> critical size: mdasize=134217728 mdafree=0

Can you share the output of the command below?

    lvs -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name

I suggest that you open a bug and attach engine.log, /var/log/messages,
vdsm.log and sanlock.log there. Please also give detailed info on the
host OS, vdsm version, etc.

Nir
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users