----- Original Message -----
> From: "Boyan Tabakov" <bl...@alslayer.net>
> To: "Nir Soffer" <nsof...@redhat.com>
> Cc: users@ovirt.org
> Sent: Wednesday, March 5, 2014 3:38:25 PM
> Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on some nodes
>
> Hello Nir,
>
> On Wed Mar 5 14:37:17 2014, Nir Soffer wrote:
> > ----- Original Message -----
> >> From: "Boyan Tabakov" <bl...@alslayer.net>
> >> To: "Nir Soffer" <nsof...@redhat.com>
> >> Cc: users@ovirt.org
> >> Sent: Tuesday, March 4, 2014 3:53:24 PM
> >> Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on some nodes
> >>
> >> On Tue Mar 4 14:46:33 2014, Nir Soffer wrote:
> >>> ----- Original Message -----
> >>>> From: "Nir Soffer" <nsof...@redhat.com>
> >>>> To: "Boyan Tabakov" <bl...@alslayer.net>
> >>>> Cc: users@ovirt.org, "Zdenek Kabelac" <zkabe...@redhat.com>
> >>>> Sent: Monday, March 3, 2014 9:39:47 PM
> >>>> Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on some nodes
> >>>>
> >>>> Hi Zdenek, can you look into this strange incident?
> >>>>
> >>>> When a user creates a disk on one host (creating a new lv), the lv
> >>>> is not seen on another host in the cluster.
> >>>>
> >>>> Calling multipath -r causes the new lv to appear on the other host.
> >>>>
> >>>> Finally, lvs tells us that vg_mda_free is zero - maybe unrelated,
> >>>> but unusual.
> >>>>
> >>>> ----- Original Message -----
> >>>>> From: "Boyan Tabakov" <bl...@alslayer.net>
> >>>>> To: "Nir Soffer" <nsof...@redhat.com>
> >>>>> Cc: users@ovirt.org
> >>>>> Sent: Monday, March 3, 2014 9:51:05 AM
> >>>>> Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on some nodes
> >>>>>>>>>>> Consequently, when creating/booting a VM with the said disk
> >>>>>>>>>>> attached, the VM fails to start on host2, because host2 can't
> >>>>>>>>>>> see the LV. Similarly, if the VM is started on host1, it
> >>>>>>>>>>> fails to migrate to host2. An extract from the host2 log is
> >>>>>>>>>>> at the end. The LV in question is
> >>>>>>>>>>> 6b35673e-7062-4716-a6c8-d5bf72fe3280.
> >>>>>>>>>>>
> >>>>>>>>>>> As far as I could quickly track through the vdsm code, there
> >>>>>>>>>>> is only a call to lvs and not to lvscan or lvchange, so the
> >>>>>>>>>>> host2 LVM doesn't fully refresh.
> >>>>>>
> >>>>>> lvs should see any change on the shared storage.
> >>>>>>
> >>>>>>>>>>> The only workaround so far has been to restart VDSM on host2,
> >>>>>>>>>>> which makes it refresh all LVM data properly.
> >>>>>>
> >>>>>> When vdsm starts, it calls multipath -r, which ensures that we see
> >>>>>> all physical volumes.
> >>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> When is host2 supposed to pick up any newly created LVs in
> >>>>>>>>>>> the SD VG? Any suggestions where the problem might be?
> >>>>>>>>>>
> >>>>>>>>>> When you create a new lv on the shared storage, the new lv
> >>>>>>>>>> should be visible on the other host. Let's start by verifying
> >>>>>>>>>> that you do see the new lv after a disk was created.
> >>>>>>>>>>
> >>>>>>>>>> Try this:
> >>>>>>>>>>
> >>>>>>>>>> 1. Create a new disk, and check the disk uuid in the engine ui
> >>>>>>>>>> 2. On another machine, run this command:
> >>>>>>>>>>
> >>>>>>>>>>     lvs -o vg_name,lv_name,tags
> >>>>>>>>>>
> >>>>>>>>>> You can identify the new lv using tags, which should contain
> >>>>>>>>>> the new disk uuid.
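A minimal shell sketch of the check suggested above, assuming the disk
uuid reported by the engine UI is placed in DISK_UUID (the value below is
the example LV from this thread; substitute your own):

    # Disk uuid as shown in the engine UI (example value from this thread)
    DISK_UUID=6b35673e-7062-4716-a6c8-d5bf72fe3280

    # List every LV with its VG name and tags. vdsm tags image LVs with
    # IU_<image-uuid>, MD_<metadata-slot> and PU_<parent-uuid>, so the
    # new disk's uuid should appear as the LV name or inside an IU_ tag.
    lvs -o vg_name,lv_name,tags | grep "$DISK_UUID"

    # If nothing is printed, rescan the multipath maps (as suggested
    # later in this thread) and check again.
    multipath -r
    lvs -o vg_name,lv_name,tags | grep "$DISK_UUID"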
> >>>>>>>>>>
> >>>>>>>>>> If you don't see the new lv from the other host, please
> >>>>>>>>>> provide /var/log/messages and /var/log/sanlock.log.
> >>>>>>>>>
> >>>>>>>>> Just tried that. The disk is not visible on the non-SPM node.
> >>>>>>>>
> >>>>>>>> This means that storage is not accessible from this host.
> >>>>>>>
> >>>>>>> Generally, the storage seems accessible. For example, if I
> >>>>>>> restart vdsmd, all volumes get picked up correctly (they become
> >>>>>>> visible in the lvs output and VMs can be started with them).
> >>>>>>
> >>>>>> Let's repeat this test, but now, if you do not see the new lv,
> >>>>>> please run:
> >>>>>>
> >>>>>>     multipath -r
> >>>>>>
> >>>>>> And report the results.
> >>>>>>
> >>>>>
> >>>>> Running multipath -r helped and the disk was properly picked up by
> >>>>> the second host.
> >>>>>
> >>>>> Is running multipath -r safe while the host is not in maintenance
> >>>>> mode?
> >>>>
> >>>> It should be safe; vdsm uses it in some cases.
> >>>>
> >>>>> If yes, as a temporary workaround I can patch vdsmd to run
> >>>>> multipath -r when e.g. monitoring the storage domain.
> >>>>
> >>>> I suggested running multipath as a debugging aid; normally this is
> >>>> not needed.
> >>>>
> >>>> You should see the lv on the shared storage without running
> >>>> multipath.
> >>>>
> >>>> Zdenek, can you explain this?
> >>>>
> >>>>>>> One warning that I keep seeing in vdsm logs on both nodes is
> >>>>>>> this:
> >>>>>>>
> >>>>>>> Thread-1617881::WARNING::2014-02-24
> >>>>>>> 16:57:50,627::sp::1553::Storage.StoragePool::(getInfo) VG
> >>>>>>> 3307f6fa-dd58-43db-ab23-b1fb299006c7's metadata size exceeded
> >>>>>>> critical size: mdasize=134217728 mdafree=0
> >>>>>>
> >>>>>> Can you share the output of the command below?
> >>>>>>
> >>>>>>     lvs -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name
> >>>>>
> >>>>> Here's the output for both hosts.
> >>>>>
> >>>>> host1:
> >>>>> [root@host1 ~]# lvs -o uuid,name,attr,size,vg_free,vg_extent_size,vg_extent_count,vg_free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count
> >>>>>   LV UUID                                 LV                                    Attr       LSize  VFree    Ext      #Ext  Free  LV Tags                                                                               VMdaSize  VMdaFree  #LV  #PV
> >>>>>   jGEpVm-oPW8-XyxI-l2yi-YF4X-qteQ-dm8SqL  3d362bf2-20f4-438d-9ba9-486bd2e8cedf  -wi-ao---  2.00g  114.62g  128.00m  1596  917   IU_0227da98-34b2-4b0c-b083-d42e7b760036,MD_5,PU_f4231952-76c5-4764-9c8b-ac73492ac465  128.00m   0         13   2
> >>>>
> >>>> This looks wrong - your vg_mda_free is zero - as vdsm complains.
>
> > Patch http://gerrit.ovirt.org/25408 should solve this issue.
> >
> > It may also solve the other issue with the missing lv - I could not
> > reproduce it yet.
> >
> > Can you try to apply this patch and report the results?
> >
> > Thanks,
> > Nir
>
> This patch helped, indeed! I tried it on the non-SPM node (as that's
> the node that I can currently easily put in maintenance) and the node
> started picking up newly created volumes correctly. I also set
> use_lvmetad to 0 in the main lvm.conf, because without it manually
> running e.g. lvs was still using the metadata daemon.
>
> I can't confirm yet that this helps with the metadata volume warning,
> as that warning appears only on the SPM. I'll be able to put the SPM
> node in maintenance soon and will report later.
>
> This issue on Fedora makes me think - is Fedora still a fully
> supported platform?
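For anyone hitting the same symptom, the lvmetad change described in the
reply above amounts to roughly the following; a sketch, assuming a
systemd-based host such as Fedora where the daemon ships as lvm2-lvmetad
(unit names may differ between releases):

    # In /etc/lvm/lvm.conf, "global" section: make lvs/vgs/pvs read
    # metadata from disk instead of the lvmetad cache, which can go
    # stale when another host changes a shared VG:
    #
    #     use_lvmetad = 0
    #
    # Stop the daemon and its socket so the tools stop consulting it:
    systemctl stop lvm2-lvmetad.socket lvm2-lvmetad.service

    # Verify that an LV created on another host is now visible:
    lvs -o vg_name,lv_name,tags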
It is supported, but probably not tested properly.

Nir

_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users