Re: [Users] SD Disk's Logical Volume not visible/activated on some nodes

2014-03-13 Thread Nir Soffer
- Original Message -
 From: John Taylor jtt77...@gmail.com
 To: users@ovirt.org
 Sent: Tuesday, March 4, 2014 4:56:32 PM
 Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on some   
 nodes
 
 I want to jump in here and say I'm seeing the same thing.
 ovirt 3.3.2  on f19
 hosts  are vdsm 4.13.3-3.fc19
 
 I'm using an iSCSI storage domain (Fujitsu ETERNUS) with 4 hosts. I've
 known about the vg_mda_free warning (I asked on linux-lvm with no
 response:
 https://www.redhat.com/archives/linux-lvm/2014-February/msg00033.html
 ) but until now I hadn't verified the problem with not seeing the LVs.
 My test was to create a standalone disk on the iSCSI SD. The LV only
 shows up on the SPM host where it was created; none of the other 3 hosts
 show it. Running multipath -r on a non-SPM host causes it to show up.

This patch should solve your issue:
http://gerrit.ovirt.org/25408

Please report if it does.

Thanks,
Nir
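
For anyone who wants to test that change locally, a hedged sketch of fetching it from Gerrit (the anonymous clone URL and the patchset number are assumptions - check the change page for the exact ref):

  git clone git://gerrit.ovirt.org/vdsm     # assumed anonymous clone URL
  cd vdsm
  git fetch origin refs/changes/08/25408/1  # patchset 1 assumed
  git checkout FETCH_HEAD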


Re: [Users] SD Disk's Logical Volume not visible/activated on some nodes

2014-03-08 Thread John Taylor
I want to jump in here and say I'm seeing the same thing.
ovirt 3.3.2  on f19
hosts  are vdsm 4.13.3-3.fc19

I'm using an iSCSI storage domain (Fujitsu ETERNUS) with 4 hosts. I've
known about the vg_mda_free warning (I asked on linux-lvm with no
response: https://www.redhat.com/archives/linux-lvm/2014-February/msg00033.html
) but until now I hadn't verified the problem with not seeing the LVs.
My test was to create a standalone disk on the iSCSI SD. The LV only
shows up on the SPM host where it was created; none of the other 3 hosts
show it. Running multipath -r on a non-SPM host causes it to show up.
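
A minimal shell sketch of that check, assuming the IU_ tag prefix seen in the lvs output quoted later in this thread, with a placeholder disk UUID (substitute the one shown in the engine UI):

  # Disk (image) UUID from the engine UI - placeholder value
  DISK_UUID=0227da98-34b2-4b0c-b083-d42e7b760036
  # Look for the LV by its image tag on a non-SPM host
  lvs --noheadings -o vg_name,lv_name,tags | grep "IU_${DISK_UUID}"
  # If nothing shows up, rescan multipath maps and check again
  multipath -r
  lvs --noheadings -o vg_name,lv_name,tags | grep "IU_${DISK_UUID}"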

same lvm version
# lvm version
  LVM version: 2.02.98(2) (2012-10-15)
  Library version: 1.02.77 (2012-10-15)
  Driver version:  4.26.0

-John




 This looks wrong - your vg_mda_free is zero - as vdsm complains.

 Zdenek, how can we debug this further?

 I see same issue in Fedora 19.

 Can you share with us the output of:

 cat /etc/redhat-release
 uname -a
 lvm version

 Nir

$ cat /etc/redhat-release
Fedora release 19 (Schrödinger's Cat)
$ uname -a
Linux blizzard.mgmt.futurice.com 3.12.6-200.fc19.x86_64.debug #1 SMP
Mon Dec 23 16:24:32 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
$ lvm version
  LVM version: 2.02.98(2) (2012-10-15)
  Library version: 1.02.77 (2012-10-15)
  Driver version:  4.26.0


Re: [Users] SD Disk's Logical Volume not visible/activated on some nodes

2014-03-05 Thread Nir Soffer
- Original Message -
 From: Boyan Tabakov bl...@alslayer.net
 To: Nir Soffer nsof...@redhat.com
 Cc: users@ovirt.org
 Sent: Tuesday, March 4, 2014 3:53:24 PM
 Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on some 
 nodes
 
 On Tue Mar  4 14:46:33 2014, Nir Soffer wrote:
  - Original Message -
  From: Nir Soffer nsof...@redhat.com
  To: Boyan Tabakov bl...@alslayer.net
  Cc: users@ovirt.org, Zdenek Kabelac zkabe...@redhat.com
  Sent: Monday, March 3, 2014 9:39:47 PM
  Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on
  some nodes
 
  Hi Zdenek, can you look into this strange incident?
 
  When user creates a disk on one host (create a new lv), the lv is not seen
  on another host in the cluster.
 
  Calling multipath -r causes the new lv to appear on the other host.
 
  Finally, lvs tells us that vg_mda_free is zero - maybe unrelated, but
  unusual.
 
  - Original Message -
  From: Boyan Tabakov bl...@alslayer.net
  To: Nir Soffer nsof...@redhat.com
  Cc: users@ovirt.org
  Sent: Monday, March 3, 2014 9:51:05 AM
  Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on
  some
  nodes
  Consequently, when creating/booting
  a VM with the said disk attached, the VM fails to start on host2,
  because host2 can't see the LV. Similarly, if the VM is started on
  host1, it fails to migrate to host2. Extract from host2 log is in
  the
  end. The LV in question is 6b35673e-7062-4716-a6c8-d5bf72fe3280.
 
  As far as I could track quickly the vdsm code, there is only call
  to
  lvs
  and not to lvscan or lvchange so the host2 LVM doesn't fully
  refresh.
 
  lvs should see any change on the shared storage.
 
  The only workaround so far has been to restart VDSM on host2, which
  makes it refresh all LVM data properly.
 
  When vdsm starts, it calls multipath -r, which ensures that we see all
  physical volumes.
 
 
  When is host2 supposed to pick up any newly created LVs in the SD
  VG?
  Any suggestions where the problem might be?
 
  When you create a new lv on the shared storage, the new lv should be
  visible on the other host. Lets start by verifying that you do see
  the new lv after a disk was created.
 
  Try this:
 
  1. Create a new disk, and check the disk uuid in the engine ui
  2. On another machine, run this command:
 
  lvs -o vg_name,lv_name,tags
 
  You can identify the new lv using tags, which should contain the new
  disk
  uuid.
 
  If you don't see the new lv from the other host, please provide
  /var/log/messages
  and /var/log/sanlock.log.
 
  Just tried that. The disk is not visible on the non-SPM node.
 
  This means that storage is not accessible from this host.
 
  Generally, the storage seems accessible ok. For example, if I restart
  the vdsmd, all volumes get picked up correctly (become visible in lvs
  output and VMs can be started with them).
 
  Let's repeat this test, but now, if you do not see the new lv, please
  run:
 
  multipath -r
 
  And report the results.
 
 
  Running multipath -r helped and the disk was properly picked up by the
  second host.
 
  Is running multipath -r safe while host is not in maintenance mode?
 
  It should be safe, vdsm uses it in some cases.
 
  If yes, as a temporary workaround I can patch vdsmd to run multipath -r
  when e.g. monitoring the storage domain.
 
  I suggested running multipath as a debugging aid; normally this is not
  needed.
 
  You should see lv on the shared storage without running multipath.
 
  Zdenek, can you explain this?
 
  One warning that I keep seeing in vdsm logs on both nodes is this:
 
  Thread-1617881::WARNING::2014-02-24
  16:57:50,627::sp::1553::Storage.StoragePool::(getInfo) VG
  3307f6fa-dd58-43db-ab23-b1fb299006c7's metadata size exceeded
   critical size: mdasize=134217728 mdafree=0
 
  Can you share the output of the command below?
 
  lvs -o
  
  uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name
 
  Here's the output for both hosts.
 
  host1:
  [root@host1 ~]# lvs -o
  uuid,name,attr,size,vg_free,vg_extent_size,vg_extent_count,vg_free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count
LV UUIDLV
Attr  LSize   VFree   Ext #Ext  Free  LV Tags
 
  VMdaSize  VMdaFree  #LV #PV
jGEpVm-oPW8-XyxI-l2yi-YF4X-qteQ-dm8SqL
  3d362bf2-20f4-438d-9ba9-486bd2e8cedf -wi-ao---   2.00g 114.62g 128.00m
  1596   917
  IU_0227da98-34b2-4b0c-b083-d42e7b760036,MD_5,PU_f4231952-76c5-4764-9c8b-ac73492ac465
 128.00m0   13   2
 
  This looks wrong - your vg_mda_free is zero - as vdsm complains.

Patch http://gerrit.ovirt.org/25408 should solve this issue.

It may also solve the other issue with the missing lv - I could
not reproduce it yet.

Can you try to apply this patch and report the results?

Thanks,
Nir

Re: [Users] SD Disk's Logical Volume not visible/activated on some nodes

2014-03-05 Thread Boyan Tabakov
Hello Nir,

On Wed Mar  5 14:37:17 2014, Nir Soffer wrote:
 - Original Message -
 From: Boyan Tabakov bl...@alslayer.net
 To: Nir Soffer nsof...@redhat.com
 Cc: users@ovirt.org
 Sent: Tuesday, March 4, 2014 3:53:24 PM
 Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on some 
 nodes

 On Tue Mar  4 14:46:33 2014, Nir Soffer wrote:
 - Original Message -
 From: Nir Soffer nsof...@redhat.com
 To: Boyan Tabakov bl...@alslayer.net
 Cc: users@ovirt.org, Zdenek Kabelac zkabe...@redhat.com
 Sent: Monday, March 3, 2014 9:39:47 PM
 Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on
 some nodes

 Hi Zdenek, can you look into this strange incident?

 When user creates a disk on one host (create a new lv), the lv is not seen
 on another host in the cluster.

 Calling multipath -r causes the new lv to appear on the other host.

 Finally, lvs tells us that vg_mda_free is zero - maybe unrelated, but
 unusual.

 - Original Message -
 From: Boyan Tabakov bl...@alslayer.net
 To: Nir Soffer nsof...@redhat.com
 Cc: users@ovirt.org
 Sent: Monday, March 3, 2014 9:51:05 AM
 Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on
 some
 nodes
 Consequently, when creating/booting
 a VM with the said disk attached, the VM fails to start on host2,
 because host2 can't see the LV. Similarly, if the VM is started on
 host1, it fails to migrate to host2. Extract from host2 log is in
 the
 end. The LV in question is 6b35673e-7062-4716-a6c8-d5bf72fe3280.

 As far as I could track quickly the vdsm code, there is only call
 to
 lvs
 and not to lvscan or lvchange so the host2 LVM doesn't fully
 refresh.

 lvs should see any change on the shared storage.

 The only workaround so far has been to restart VDSM on host2, which
 makes it refresh all LVM data properly.

 When vdsm starts, it calls multipath -r, which ensures that we see all
 physical volumes.


 When is host2 supposed to pick up any newly created LVs in the SD
 VG?
 Any suggestions where the problem might be?

 When you create a new lv on the shared storage, the new lv should be
 visible on the other host. Lets start by verifying that you do see
 the new lv after a disk was created.

 Try this:

 1. Create a new disk, and check the disk uuid in the engine ui
 2. On another machine, run this command:

 lvs -o vg_name,lv_name,tags

 You can identify the new lv using tags, which should contain the new
 disk
 uuid.

 If you don't see the new lv from the other host, please provide
 /var/log/messages
 and /var/log/sanlock.log.

 Just tried that. The disk is not visible on the non-SPM node.

 This means that storage is not accessible from this host.

 Generally, the storage seems accessible ok. For example, if I restart
 the vdsmd, all volumes get picked up correctly (become visible in lvs
 output and VMs can be started with them).

 Let's repeat this test, but now, if you do not see the new lv, please
 run:

 multipath -r

 And report the results.


 Running multipath -r helped and the disk was properly picked up by the
 second host.

 Is running multipath -r safe while host is not in maintenance mode?

 It should be safe, vdsm uses it in some cases.

 If yes, as a temporary workaround I can patch vdsmd to run multipath -r
 when e.g. monitoring the storage domain.

 I suggested running multipath as a debugging aid; normally this is not
 needed.

 You should see lv on the shared storage without running multipath.

 Zdenek, can you explain this?

 One warning that I keep seeing in vdsm logs on both nodes is this:

 Thread-1617881::WARNING::2014-02-24
 16:57:50,627::sp::1553::Storage.StoragePool::(getInfo) VG
 3307f6fa-dd58-43db-ab23-b1fb299006c7's metadata size exceeded
  critical size: mdasize=134217728 mdafree=0

 Can you share the output of the command below?

 lvs -o
 
 uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name

 Here's the output for both hosts.

 host1:
 [root@host1 ~]# lvs -o
 uuid,name,attr,size,vg_free,vg_extent_size,vg_extent_count,vg_free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count
   LV UUIDLV
   Attr  LSize   VFree   Ext #Ext  Free  LV Tags

 VMdaSize  VMdaFree  #LV #PV
   jGEpVm-oPW8-XyxI-l2yi-YF4X-qteQ-dm8SqL
 3d362bf2-20f4-438d-9ba9-486bd2e8cedf -wi-ao---   2.00g 114.62g 128.00m
 1596   917
 IU_0227da98-34b2-4b0c-b083-d42e7b760036,MD_5,PU_f4231952-76c5-4764-9c8b-ac73492ac465
128.00m0   13   2

 This looks wrong - your vg_mda_free is zero - as vdsm complains.

 Patch http://gerrit.ovirt.org/25408 should solve this issue.

 It may also solve the other issue with the missing lv - I could
 not reproduce it yet.

 Can you try to apply this patch and report the results?

 Thanks,
 Nir

This patch helped, indeed! I tried it on the non-SPM node (as that's 
the node that I can currently easily put in maintenance) and the node 
started picking up newly created LVs.

Re: [Users] SD Disk's Logical Volume not visible/activated on some nodes

2014-03-05 Thread Nir Soffer
- Original Message -
 From: Boyan Tabakov bl...@alslayer.net
 To: Nir Soffer nsof...@redhat.com
 Cc: users@ovirt.org
 Sent: Wednesday, March 5, 2014 3:38:25 PM
 Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on some 
 nodes
 
 Hello Nir,
 
 On Wed Mar  5 14:37:17 2014, Nir Soffer wrote:
  - Original Message -
  From: Boyan Tabakov bl...@alslayer.net
  To: Nir Soffer nsof...@redhat.com
  Cc: users@ovirt.org
  Sent: Tuesday, March 4, 2014 3:53:24 PM
  Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on
  some nodes
 
  On Tue Mar  4 14:46:33 2014, Nir Soffer wrote:
  - Original Message -
  From: Nir Soffer nsof...@redhat.com
  To: Boyan Tabakov bl...@alslayer.net
  Cc: users@ovirt.org, Zdenek Kabelac zkabe...@redhat.com
  Sent: Monday, March 3, 2014 9:39:47 PM
  Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on
  some nodes
 
  Hi Zdenek, can you look into this strange incident?
 
  When user creates a disk on one host (create a new lv), the lv is not
  seen
  on another host in the cluster.
 
  Calling multipath -r causes the new lv to appear on the other host.
 
  Finally, lvs tells us that vg_mda_free is zero - maybe unrelated, but
  unusual.
 
  - Original Message -
  From: Boyan Tabakov bl...@alslayer.net
  To: Nir Soffer nsof...@redhat.com
  Cc: users@ovirt.org
  Sent: Monday, March 3, 2014 9:51:05 AM
  Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on
  some
  nodes
  Consequently, when creating/booting
  a VM with the said disk attached, the VM fails to start on host2,
  because host2 can't see the LV. Similarly, if the VM is started
  on
  host1, it fails to migrate to host2. Extract from host2 log is in
  the
  end. The LV in question is 6b35673e-7062-4716-a6c8-d5bf72fe3280.
 
  As far as I could track quickly the vdsm code, there is only call
  to
  lvs
  and not to lvscan or lvchange so the host2 LVM doesn't fully
  refresh.
 
  lvs should see any change on the shared storage.
 
  The only workaround so far has been to restart VDSM on host2,
  which
  makes it refresh all LVM data properly.
 
  When vdsm starts, it calls multipath -r, which ensures that we see all
  physical volumes.
 
 
  When is host2 supposed to pick up any newly created LVs in the SD
  VG?
  Any suggestions where the problem might be?
 
  When you create a new lv on the shared storage, the new lv should
  be
  visible on the other host. Lets start by verifying that you do see
  the new lv after a disk was created.
 
  Try this:
 
  1. Create a new disk, and check the disk uuid in the engine ui
  2. On another machine, run this command:
 
  lvs -o vg_name,lv_name,tags
 
  You can identify the new lv using tags, which should contain the
  new
  disk
  uuid.
 
  If you don't see the new lv from the other host, please provide
  /var/log/messages
  and /var/log/sanlock.log.
 
  Just tried that. The disk is not visible on the non-SPM node.
 
  This means that storage is not accessible from this host.
 
  Generally, the storage seems accessible ok. For example, if I restart
  the vdsmd, all volumes get picked up correctly (become visible in lvs
  output and VMs can be started with them).
 
  Let's repeat this test, but now, if you do not see the new lv, please
  run:
 
  multipath -r
 
  And report the results.
 
 
  Running multipath -r helped and the disk was properly picked up by the
  second host.
 
  Is running multipath -r safe while host is not in maintenance mode?
 
  It should be safe, vdsm uses it in some cases.
 
  If yes, as a temporary workaround I can patch vdsmd to run multipath -r
  when e.g. monitoring the storage domain.
 
  I suggested running multipath as a debugging aid; normally this is not
  needed.
 
  You should see lv on the shared storage without running multipath.
 
  Zdenek, can you explain this?
 
  One warning that I keep seeing in vdsm logs on both nodes is this:
 
  Thread-1617881::WARNING::2014-02-24
  16:57:50,627::sp::1553::Storage.StoragePool::(getInfo) VG
  3307f6fa-dd58-43db-ab23-b1fb299006c7's metadata size exceeded
   critical size: mdasize=134217728 mdafree=0
 
  Can you share the output of the command below?
 
  lvs -o
  
  uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name
 
  Here's the output for both hosts.
 
  host1:
  [root@host1 ~]# lvs -o
  uuid,name,attr,size,vg_free,vg_extent_size,vg_extent_count,vg_free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count
LV UUIDLV
Attr  LSize   VFree   Ext #Ext  Free  LV Tags
 
  VMdaSize  VMdaFree  #LV #PV
jGEpVm-oPW8-XyxI-l2yi-YF4X-qteQ-dm8SqL
  3d362bf2-20f4-438d-9ba9-486bd2e8cedf -wi-ao---   2.00g 114.62g 128.00m
  1596   917
  IU_0227da98-34b2-4b0c-b083-d42e7b760036,MD_5,PU_f4231952-76c5-4764-9c8b-ac73492ac465
 128.00m0   13   2
 
  This looks wrong - your vg_mda_free is zero - as vdsm complains.

Re: [Users] SD Disk's Logical Volume not visible/activated on some nodes

2014-03-05 Thread Boyan Tabakov
On 5.3.2014, 16:01, Nir Soffer wrote:
 - Original Message -
 From: Boyan Tabakov bl...@alslayer.net
 To: Nir Soffer nsof...@redhat.com
 Cc: users@ovirt.org
 Sent: Wednesday, March 5, 2014 3:38:25 PM
 Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on some 
 nodes

 Hello Nir,

 On Wed Mar  5 14:37:17 2014, Nir Soffer wrote:
 - Original Message -
 From: Boyan Tabakov bl...@alslayer.net
 To: Nir Soffer nsof...@redhat.com
 Cc: users@ovirt.org
 Sent: Tuesday, March 4, 2014 3:53:24 PM
 Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on
 some nodes

 On Tue Mar  4 14:46:33 2014, Nir Soffer wrote:
 - Original Message -
 From: Nir Soffer nsof...@redhat.com
 To: Boyan Tabakov bl...@alslayer.net
 Cc: users@ovirt.org, Zdenek Kabelac zkabe...@redhat.com
 Sent: Monday, March 3, 2014 9:39:47 PM
 Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on
 some nodes

 Hi Zdenek, can you look into this strange incident?

 When user creates a disk on one host (create a new lv), the lv is not
 seen
 on another host in the cluster.

 Calling multipath -r causes the new lv to appear on the other host.

 Finally, lvs tells us that vg_mda_free is zero - maybe unrelated, but
 unusual.

 - Original Message -
 From: Boyan Tabakov bl...@alslayer.net
 To: Nir Soffer nsof...@redhat.com
 Cc: users@ovirt.org
 Sent: Monday, March 3, 2014 9:51:05 AM
 Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on
 some
 nodes
 Consequently, when creating/booting
 a VM with the said disk attached, the VM fails to start on host2,
 because host2 can't see the LV. Similarly, if the VM is started
 on
 host1, it fails to migrate to host2. Extract from host2 log is in
 the
 end. The LV in question is 6b35673e-7062-4716-a6c8-d5bf72fe3280.

 As far as I could track quickly the vdsm code, there is only call
 to
 lvs
 and not to lvscan or lvchange so the host2 LVM doesn't fully
 refresh.

 lvs should see any change on the shared storage.

 The only workaround so far has been to restart VDSM on host2,
 which
 makes it refresh all LVM data properly.

 When vdsm starts, it calls multipath -r, which ensures that we see all
 physical volumes.


 When is host2 supposed to pick up any newly created LVs in the SD
 VG?
 Any suggestions where the problem might be?

 When you create a new lv on the shared storage, the new lv should
 be
 visible on the other host. Lets start by verifying that you do see
 the new lv after a disk was created.

 Try this:

 1. Create a new disk, and check the disk uuid in the engine ui
 2. On another machine, run this command:

 lvs -o vg_name,lv_name,tags

 You can identify the new lv using tags, which should contain the
 new
 disk
 uuid.

 If you don't see the new lv from the other host, please provide
 /var/log/messages
 and /var/log/sanlock.log.

 Just tried that. The disk is not visible on the non-SPM node.

 This means that storage is not accessible from this host.

 Generally, the storage seems accessible ok. For example, if I restart
 the vdsmd, all volumes get picked up correctly (become visible in lvs
 output and VMs can be started with them).

 Let's repeat this test, but now, if you do not see the new lv, please
 run:

 multipath -r

 And report the results.


 Running multipath -r helped and the disk was properly picked up by the
 second host.

 Is running multipath -r safe while host is not in maintenance mode?

 It should be safe, vdsm uses it in some cases.

 If yes, as a temporary workaround I can patch vdsmd to run multipath -r
 when e.g. monitoring the storage domain.

 I suggested running multipath as a debugging aid; normally this is not
 needed.

 You should see lv on the shared storage without running multipath.

 Zdenek, can you explain this?

 One warning that I keep seeing in vdsm logs on both nodes is this:

 Thread-1617881::WARNING::2014-02-24
 16:57:50,627::sp::1553::Storage.StoragePool::(getInfo) VG
 3307f6fa-dd58-43db-ab23-b1fb299006c7's metadata size exceeded
  critical size: mdasize=134217728 mdafree=0

 Can you share the output of the command below?

 lvs -o
 
 uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name

 Here's the output for both hosts.

 host1:
 [root@host1 ~]# lvs -o
 uuid,name,attr,size,vg_free,vg_extent_size,vg_extent_count,vg_free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count
   LV UUIDLV
   Attr  LSize   VFree   Ext #Ext  Free  LV Tags

 VMdaSize  VMdaFree  #LV #PV
   jGEpVm-oPW8-XyxI-l2yi-YF4X-qteQ-dm8SqL
 3d362bf2-20f4-438d-9ba9-486bd2e8cedf -wi-ao---   2.00g 114.62g 128.00m
 1596   917
 IU_0227da98-34b2-4b0c-b083-d42e7b760036,MD_5,PU_f4231952-76c5-4764-9c8b-ac73492ac465
128.00m0   13   2

 This looks wrong - your vg_mda_free is zero - as vdsm complains.

 Patch http://gerrit.ovirt.org/25408 should solve this issue.

 It may also solve the other issue with the missing lv - I could not reproduce it yet.

Re: [Users] SD Disk's Logical Volume not visible/activated on some nodes

2014-03-04 Thread Nir Soffer
- Original Message -
 From: Nir Soffer nsof...@redhat.com
 To: Boyan Tabakov bl...@alslayer.net
 Cc: users@ovirt.org, Zdenek Kabelac zkabe...@redhat.com
 Sent: Monday, March 3, 2014 9:39:47 PM
 Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on some 
 nodes
 
 Hi Zdenek, can you look into this strange incident?
 
 When user creates a disk on one host (create a new lv), the lv is not seen
 on another host in the cluster.
 
 Calling multipath -r causes the new lv to appear on the other host.
 
 Finally, lvs tells us that vg_mda_free is zero - maybe unrelated, but unusual.
 
 - Original Message -
  From: Boyan Tabakov bl...@alslayer.net
  To: Nir Soffer nsof...@redhat.com
  Cc: users@ovirt.org
  Sent: Monday, March 3, 2014 9:51:05 AM
  Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on some
  nodes
   Consequently, when creating/booting
   a VM with the said disk attached, the VM fails to start on host2,
   because host2 can't see the LV. Similarly, if the VM is started on
   host1, it fails to migrate to host2. Extract from host2 log is in
   the
   end. The LV in question is 6b35673e-7062-4716-a6c8-d5bf72fe3280.
  
   As far as I could track quickly the vdsm code, there is only call to
   lvs
   and not to lvscan or lvchange so the host2 LVM doesn't fully
   refresh.
   
   lvs should see any change on the shared storage.
   
   The only workaround so far has been to restart VDSM on host2, which
   makes it refresh all LVM data properly.
   
   When vdsm starts, it calls multipath -r, which ensures that we see all
   physical volumes.
   
  
   When is host2 supposed to pick up any newly created LVs in the SD
   VG?
   Any suggestions where the problem might be?
  
   When you create a new lv on the shared storage, the new lv should be
   visible on the other host. Lets start by verifying that you do see
   the new lv after a disk was created.
  
   Try this:
  
   1. Create a new disk, and check the disk uuid in the engine ui
   2. On another machine, run this command:
  
   lvs -o vg_name,lv_name,tags
  
   You can identify the new lv using tags, which should contain the new
   disk
   uuid.
  
   If you don't see the new lv from the other host, please provide
   /var/log/messages
   and /var/log/sanlock.log.
  
   Just tried that. The disk is not visible on the non-SPM node.
  
   This means that storage is not accessible from this host.
  
   Generally, the storage seems accessible ok. For example, if I restart
   the vdsmd, all volumes get picked up correctly (become visible in lvs
   output and VMs can be started with them).
   
   Let's repeat this test, but now, if you do not see the new lv, please
   run:
   
   multipath -r
   
   And report the results.
   
  
  Running multipath -r helped and the disk was properly picked up by the
  second host.
  
  Is running multipath -r safe while host is not in maintenance mode?
 
 It should be safe, vdsm uses it in some cases.
 
  If yes, as a temporary workaround I can patch vdsmd to run multipath -r
  when e.g. monitoring the storage domain.
 
 I suggested running multipath as a debugging aid; normally this is not needed.
 
 You should see lv on the shared storage without running multipath.
 
 Zdenek, can you explain this?
 
   One warning that I keep seeing in vdsm logs on both nodes is this:
  
   Thread-1617881::WARNING::2014-02-24
   16:57:50,627::sp::1553::Storage.StoragePool::(getInfo) VG
   3307f6fa-dd58-43db-ab23-b1fb299006c7's metadata size exceeded
critical size: mdasize=134217728 mdafree=0
   
    Can you share the output of the command below?
   
   lvs -o
   
   uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name
  
  Here's the output for both hosts.
  
  host1:
  [root@host1 ~]# lvs -o
  uuid,name,attr,size,vg_free,vg_extent_size,vg_extent_count,vg_free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count
LV UUIDLV
Attr  LSize   VFree   Ext #Ext  Free  LV Tags
  
  VMdaSize  VMdaFree  #LV #PV
jGEpVm-oPW8-XyxI-l2yi-YF4X-qteQ-dm8SqL
  3d362bf2-20f4-438d-9ba9-486bd2e8cedf -wi-ao---   2.00g 114.62g 128.00m
  1596   917
  IU_0227da98-34b2-4b0c-b083-d42e7b760036,MD_5,PU_f4231952-76c5-4764-9c8b-ac73492ac465
 128.00m0   13   2
 
 This looks wrong - your vg_mda_free is zero - as vdsm complains.
 
 Zdenek, how can we debug this further?

I see same issue in Fedora 19.

Can you share with us the output of:

cat /etc/redhat-release
uname -a
lvm version

Nir


Re: [Users] SD Disk's Logical Volume not visible/activated on some nodes

2014-03-04 Thread Boyan Tabakov
On Tue Mar  4 14:46:33 2014, Nir Soffer wrote:
 - Original Message -
 From: Nir Soffer nsof...@redhat.com
 To: Boyan Tabakov bl...@alslayer.net
 Cc: users@ovirt.org, Zdenek Kabelac zkabe...@redhat.com
 Sent: Monday, March 3, 2014 9:39:47 PM
 Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on some 
 nodes

 Hi Zdenek, can you look into this strange incident?

 When user creates a disk on one host (create a new lv), the lv is not seen
 on another host in the cluster.

 Calling multipath -r causes the new lv to appear on the other host.

 Finally, lvs tells us that vg_mda_free is zero - maybe unrelated, but unusual.

 - Original Message -
 From: Boyan Tabakov bl...@alslayer.net
 To: Nir Soffer nsof...@redhat.com
 Cc: users@ovirt.org
 Sent: Monday, March 3, 2014 9:51:05 AM
 Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on some
 nodes
 Consequently, when creating/booting
 a VM with the said disk attached, the VM fails to start on host2,
 because host2 can't see the LV. Similarly, if the VM is started on
 host1, it fails to migrate to host2. Extract from host2 log is in
 the
 end. The LV in question is 6b35673e-7062-4716-a6c8-d5bf72fe3280.

 As far as I could track quickly the vdsm code, there is only call to
 lvs
 and not to lvscan or lvchange so the host2 LVM doesn't fully
 refresh.

 lvs should see any change on the shared storage.

 The only workaround so far has been to restart VDSM on host2, which
 makes it refresh all LVM data properly.

 When vdsm starts, it calls multipath -r, which ensures that we see all
 physical volumes.


 When is host2 supposed to pick up any newly created LVs in the SD
 VG?
 Any suggestions where the problem might be?

 When you create a new lv on the shared storage, the new lv should be
 visible on the other host. Lets start by verifying that you do see
 the new lv after a disk was created.

 Try this:

 1. Create a new disk, and check the disk uuid in the engine ui
 2. On another machine, run this command:

 lvs -o vg_name,lv_name,tags

 You can identify the new lv using tags, which should contain the new
 disk
 uuid.

 If you don't see the new lv from the other host, please provide
 /var/log/messages
 and /var/log/sanlock.log.

 Just tried that. The disk is not visible on the non-SPM node.

 This means that storage is not accessible from this host.

 Generally, the storage seems accessible ok. For example, if I restart
 the vdsmd, all volumes get picked up correctly (become visible in lvs
 output and VMs can be started with them).

 Let's repeat this test, but now, if you do not see the new lv, please
 run:

 multipath -r

 And report the results.


 Running multipath -r helped and the disk was properly picked up by the
 second host.

 Is running multipath -r safe while host is not in maintenance mode?

 It should be safe, vdsm uses it in some cases.

 If yes, as a temporary workaround I can patch vdsmd to run multipath -r
 when e.g. monitoring the storage domain.

 I suggested running multipath as a debugging aid; normally this is not needed.

 You should see lv on the shared storage without running multipath.

 Zdenek, can you explain this?

 One warning that I keep seeing in vdsm logs on both nodes is this:

 Thread-1617881::WARNING::2014-02-24
 16:57:50,627::sp::1553::Storage.StoragePool::(getInfo) VG
 3307f6fa-dd58-43db-ab23-b1fb299006c7's metadata size exceeded
  critical size: mdasize=134217728 mdafree=0

 Can you share the output of the command below?

 lvs -o
 
 uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name

 Here's the output for both hosts.

 host1:
 [root@host1 ~]# lvs -o
 uuid,name,attr,size,vg_free,vg_extent_size,vg_extent_count,vg_free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count
   LV UUIDLV
   Attr  LSize   VFree   Ext #Ext  Free  LV Tags

 VMdaSize  VMdaFree  #LV #PV
   jGEpVm-oPW8-XyxI-l2yi-YF4X-qteQ-dm8SqL
 3d362bf2-20f4-438d-9ba9-486bd2e8cedf -wi-ao---   2.00g 114.62g 128.00m
 1596   917
 IU_0227da98-34b2-4b0c-b083-d42e7b760036,MD_5,PU_f4231952-76c5-4764-9c8b-ac73492ac465
128.00m0   13   2

 This looks wrong - your vg_mda_free is zero - as vdsm complains.

 Zdenek, how can we debug this further?

 I see same issue in Fedora 19.

 Can you share with us the output of:

 cat /etc/redhat-release
 uname -a
 lvm version

 Nir

$ cat /etc/redhat-release
Fedora release 19 (Schrödinger’s Cat)
$ uname -a
Linux blizzard.mgmt.futurice.com 3.12.6-200.fc19.x86_64.debug #1 SMP 
Mon Dec 23 16:24:32 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
$ lvm version
  LVM version: 2.02.98(2) (2012-10-15)
  Library version: 1.02.77 (2012-10-15)
  Driver version:  4.26.0





Re: [Users] SD Disk's Logical Volume not visible/activated on some nodes

2014-03-03 Thread Nir Soffer
Hi Zdenek, can you look into this strange incident?

When user creates a disk on one host (create a new lv), the lv is not seen
on another host in the cluster.

Calling multipath -r causes the new lv to appear on the other host.

Finally, lvs tells us that vg_mda_free is zero - maybe unrelated, but unusual.

- Original Message -
 From: Boyan Tabakov bl...@alslayer.net
 To: Nir Soffer nsof...@redhat.com
 Cc: users@ovirt.org
 Sent: Monday, March 3, 2014 9:51:05 AM
 Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on some 
 nodes
  Consequently, when creating/booting
  a VM with the said disk attached, the VM fails to start on host2,
  because host2 can't see the LV. Similarly, if the VM is started on
  host1, it fails to migrate to host2. Extract from host2 log is in the
  end. The LV in question is 6b35673e-7062-4716-a6c8-d5bf72fe3280.
 
  As far as I could track quickly the vdsm code, there is only call to
  lvs
  and not to lvscan or lvchange so the host2 LVM doesn't fully refresh.
  
  lvs should see any change on the shared storage.
  
  The only workaround so far has been to restart VDSM on host2, which
  makes it refresh all LVM data properly.
  
  When vdsm starts, it calls multipath -r, which ensures that we see all
  physical volumes.
  
 
  When is host2 supposed to pick up any newly created LVs in the SD VG?
  Any suggestions where the problem might be?
 
  When you create a new lv on the shared storage, the new lv should be
  visible on the other host. Lets start by verifying that you do see
  the new lv after a disk was created.
 
  Try this:
 
  1. Create a new disk, and check the disk uuid in the engine ui
  2. On another machine, run this command:
 
  lvs -o vg_name,lv_name,tags
 
  You can identify the new lv using tags, which should contain the new
  disk
  uuid.
 
  If you don't see the new lv from the other host, please provide
  /var/log/messages
  and /var/log/sanlock.log.
 
  Just tried that. The disk is not visible on the non-SPM node.
 
  This means that storage is not accessible from this host.
 
  Generally, the storage seems accessible ok. For example, if I restart
  the vdsmd, all volumes get picked up correctly (become visible in lvs
  output and VMs can be started with them).
  
  Let's repeat this test, but now, if you do not see the new lv, please
  run:
  
  multipath -r
  
  And report the results.
  
 
 Running multipath -r helped and the disk was properly picked up by the
 second host.
 
 Is running multipath -r safe while host is not in maintenance mode?

It should be safe, vdsm uses it in some cases.

 If yes, as a temporary workaround I can patch vdsmd to run multipath -r
 when e.g. monitoring the storage domain.

I suggested running multipath as a debugging aid; normally this is not needed.

You should see lv on the shared storage without running multipath.

Zdenek, can you explain this?

  One warning that I keep seeing in vdsm logs on both nodes is this:
 
  Thread-1617881::WARNING::2014-02-24
  16:57:50,627::sp::1553::Storage.StoragePool::(getInfo) VG
  3307f6fa-dd58-43db-ab23-b1fb299006c7's metadata size exceeded
   critical size: mdasize=134217728 mdafree=0
  
  Can you share the output of the command below?
  
  lvs -o
  
  uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name
 
 Here's the output for both hosts.
 
 host1:
 [root@host1 ~]# lvs -o
 uuid,name,attr,size,vg_free,vg_extent_size,vg_extent_count,vg_free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count
   LV UUIDLV
   Attr  LSize   VFree   Ext #Ext  Free  LV Tags
 
 VMdaSize  VMdaFree  #LV #PV
   jGEpVm-oPW8-XyxI-l2yi-YF4X-qteQ-dm8SqL
 3d362bf2-20f4-438d-9ba9-486bd2e8cedf -wi-ao---   2.00g 114.62g 128.00m
 1596   917
 IU_0227da98-34b2-4b0c-b083-d42e7b760036,MD_5,PU_f4231952-76c5-4764-9c8b-ac73492ac465
128.00m0   13   2

This looks wrong - your vg_mda_free is zero - as vdsm complains.

Zdenek, how can we debug this further?

Nir


Re: [Users] SD Disk's Logical Volume not visible/activated on some nodes

2014-03-02 Thread Boyan Tabakov
On 28.2.2014, 20:05, Nir Soffer wrote:
 - Original Message -
 From: Boyan Tabakov bl...@alslayer.net
 To: Nir Soffer nsof...@redhat.com
 Cc: users@ovirt.org
 Sent: Tuesday, February 25, 2014 11:53:45 AM
 Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on some 
 nodes

 Hello,

 On 22.2.2014, 22:19, Nir Soffer wrote:
 - Original Message -
 From: Boyan Tabakov bl...@alslayer.net
 To: Nir Soffer nsof...@redhat.com
 Cc: users@ovirt.org
 Sent: Wednesday, February 19, 2014 7:18:36 PM
 Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on
 some nodes

 Hello,

 On 19.2.2014, 17:09, Nir Soffer wrote:
 - Original Message -
 From: Boyan Tabakov bl...@alslayer.net
 To: users@ovirt.org
 Sent: Tuesday, February 18, 2014 3:34:49 PM
 Subject: [Users] SD Disk's Logical Volume not visible/activated on some
 nodes

 Consequently, when creating/booting
 a VM with the said disk attached, the VM fails to start on host2,
 because host2 can't see the LV. Similarly, if the VM is started on
 host1, it fails to migrate to host2. Extract from host2 log is in the
 end. The LV in question is 6b35673e-7062-4716-a6c8-d5bf72fe3280.

 As far as I could track quickly the vdsm code, there is only call to lvs
 and not to lvscan or lvchange so the host2 LVM doesn't fully refresh.
 
 lvs should see any change on the shared storage.
 
 The only workaround so far has been to restart VDSM on host2, which
 makes it refresh all LVM data properly.
 
 When vdsm starts, it calls multipath -r, which ensures that we see all
 physical volumes.
 

 When is host2 supposed to pick up any newly created LVs in the SD VG?
 Any suggestions where the problem might be?

 When you create a new lv on the shared storage, the new lv should be
 visible on the other host. Lets start by verifying that you do see
 the new lv after a disk was created.

 Try this:

 1. Create a new disk, and check the disk uuid in the engine ui
 2. On another machine, run this command:

 lvs -o vg_name,lv_name,tags

 You can identify the new lv using tags, which should contain the new disk
 uuid.

 If you don't see the new lv from the other host, please provide
 /var/log/messages
 and /var/log/sanlock.log.

 Just tried that. The disk is not visible on the non-SPM node.

 This means that storage is not accessible from this host.

 Generally, the storage seems accessible ok. For example, if I restart
 the vdsmd, all volumes get picked up correctly (become visible in lvs
 output and VMs can be started with them).
 
 Let's repeat this test, but now, if you do not see the new lv, please
 run:
 
 multipath -r
 
 And report the results.
 

Running multipath -r helped and the disk was properly picked up by the
second host.

Is running multipath -r safe while host is not in maintenance mode? If
yes, as a temporary workaround I can patch vdsmd to run multipath -r
when e.g. monitoring the storage domain.
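
A cron-based sketch of that temporary workaround (not the vdsm patch itself; the path and interval are arbitrary, and as noted elsewhere in the thread this should normally not be needed):

  # Rescan multipath maps every 5 minutes until a proper fix is in place
  echo '*/5 * * * * root /usr/sbin/multipath -r' > /etc/cron.d/multipath-rescan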

 Here's the full
 sanlock.log for that host:
 ...
 0x7fc37c0008c0:0x7fc37c0008d0:0x7fc391f5f000 ioto 10 to_count 1
 2014-02-06 05:24:10+0200 563065 [31453]: s1 delta_renew read rv -202
 offset 0 /dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/ids

 Sanlock cannot write to the ids lockspace

 Which line shows that sanlock can't write? The messages are not very
 human readable.
 
 The one above my comment at 2014-02-06 05:24:10+0200
 
 I suggest to set sanlock debug level on the sanlock log to get more detailed 
 output.
 
 Edit /etc/sysconfig/sanlock and add:
 
 # -L 7: use debug level logging to sanlock log file
 SANLOCKOPTS="$SANLOCKOPTS -L 7"
 

 Last entry is from yesterday, while I just created a new disk.

 What was the status of this host in the engine from 2014-02-06
 05:24:10+0200 to 2014-02-18 14:22:16?

 vdsm.log and engine.log for this time frame will make it more clear.

 Host was up and running. The vdsm and engine logs are quite large, as we
 were running some VM migrations between the hosts. Any pointers at what
 to look for? For example, I noticed many entries in engine.log like this:
 
 It will be hard to make any progress without the logs.
 

 One warning that I keep seeing in vdsm logs on both nodes is this:

 Thread-1617881::WARNING::2014-02-24
 16:57:50,627::sp::1553::Storage.StoragePool::(getInfo) VG
 3307f6fa-dd58-43db-ab23-b1fb299006c7's metadata size exceeded
  critical size: mdasize=134217728 mdafree=0
 
 Can you share the output of the command below?
 
 lvs -o 
 uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name

Here's the output for both hosts.

host1:
[root@host1 ~]# lvs -o
uuid,name,attr,size,vg_free,vg_extent_size,vg_extent_count,vg_free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count
  LV UUIDLV
  Attr  LSize   VFree   Ext #Ext  Free  LV Tags

VMdaSize  VMdaFree  #LV #PV
  jGEpVm-oPW8-XyxI-l2yi-YF4X-qteQ-dm8SqL
3d362bf2-20f4-438d-9ba9-486bd2e8cedf -wi-ao---   2.00g 114.62g 128.00m
1596   917

Re: [Users] SD Disk's Logical Volume not visible/activated on some nodes

2014-02-28 Thread Nir Soffer
- Original Message -
 From: Boyan Tabakov bl...@alslayer.net
 To: Nir Soffer nsof...@redhat.com
 Cc: users@ovirt.org
 Sent: Tuesday, February 25, 2014 11:53:45 AM
 Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on some 
 nodes
 
 Hello,
 
 On 22.2.2014, 22:19, Nir Soffer wrote:
  - Original Message -
  From: Boyan Tabakov bl...@alslayer.net
  To: Nir Soffer nsof...@redhat.com
  Cc: users@ovirt.org
  Sent: Wednesday, February 19, 2014 7:18:36 PM
  Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on
  some nodes
 
  Hello,
 
  On 19.2.2014, 17:09, Nir Soffer wrote:
  - Original Message -
  From: Boyan Tabakov bl...@alslayer.net
  To: users@ovirt.org
  Sent: Tuesday, February 18, 2014 3:34:49 PM
  Subject: [Users] SD Disk's Logical Volume not visible/activated on some
  nodes
 
  Consequently, when creating/booting
  a VM with the said disk attached, the VM fails to start on host2,
  because host2 can't see the LV. Similarly, if the VM is started on
  host1, it fails to migrate to host2. Extract from host2 log is in the
  end. The LV in question is 6b35673e-7062-4716-a6c8-d5bf72fe3280.
 
  As far as I could track quickly the vdsm code, there is only call to lvs
  and not to lvscan or lvchange so the host2 LVM doesn't fully refresh.

lvs should see any change on the shared storage.

  The only workaround so far has been to restart VDSM on host2, which
  makes it refresh all LVM data properly.

When vdsm starts, it calls multipath -r, which ensures that we see all physical
volumes.

 
  When is host2 supposed to pick up any newly created LVs in the SD VG?
  Any suggestions where the problem might be?
 
  When you create a new lv on the shared storage, the new lv should be
  visible on the other host. Lets start by verifying that you do see
  the new lv after a disk was created.
 
  Try this:
 
  1. Create a new disk, and check the disk uuid in the engine ui
  2. On another machine, run this command:
 
  lvs -o vg_name,lv_name,tags
 
  You can identify the new lv using tags, which should contain the new disk
  uuid.
 
  If you don't see the new lv from the other host, please provide
  /var/log/messages
  and /var/log/sanlock.log.
 
  Just tried that. The disk is not visible on the non-SPM node.
  
  This means that storage is not accessible from this host.
 
 Generally, the storage seems accessible ok. For example, if I restart
 the vdsmd, all volumes get picked up correctly (become visible in lvs
 output and VMs can be started with them).

Let's repeat this test, but now, if you do not see the new lv, please
run:

multipath -r

And report the results.

  Here's the full
  sanlock.log for that host:
 ...
  0x7fc37c0008c0:0x7fc37c0008d0:0x7fc391f5f000 ioto 10 to_count 1
  2014-02-06 05:24:10+0200 563065 [31453]: s1 delta_renew read rv -202
  offset 0 /dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/ids
  
  Sanlock cannot write to the ids lockspace
 
 Which line shows that sanlock can't write? The messages are not very
 human readable.

The one above my comment at 2014-02-06 05:24:10+0200

I suggest to set sanlock debug level on the sanlock log to get more detailed 
output.

Edit /etc/sysconfig/sanlock and add:

# -L 7: use debug level logging to sanlock log file
SANLOCKOPTS="$SANLOCKOPTS -L 7"
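
A sketch of applying that setting (the quoting matters so the file still sources cleanly; the restart step is an assumption and is best done with the host in maintenance):

  echo 'SANLOCKOPTS="$SANLOCKOPTS -L 7"' >> /etc/sysconfig/sanlock
  systemctl restart sanlock   # assumed needed for the new level to take effect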

 
  Last entry is from yesterday, while I just created a new disk.
  
  What was the status of this host in the engine from 2014-02-06
  05:24:10+0200 to 2014-02-18 14:22:16?
  
  vdsm.log and engine.log for this time frame will make it more clear.
 
 Host was up and running. The vdsm and engine logs are quite large, as we
 were running some VM migrations between the hosts. Any pointers at what
 to look for? For example, I noticed many entries in engine.log like this:

It will be hard to make any progress without the logs.

 
 One warning that I keep seeing in vdsm logs on both nodes is this:
 
 Thread-1617881::WARNING::2014-02-24
 16:57:50,627::sp::1553::Storage.StoragePool::(getInfo) VG
 3307f6fa-dd58-43db-ab23-b1fb299006c7's metadata size exceeded
  critical size: mdasize=134217728 mdafree=0

Can you share the output of the command below?

lvs -o 
uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name
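
For a quick look at just the metadata numbers from that warning, the same fields can be read per VG with vgs, e.g. (VG name taken from the warning above):

  vgs -o vg_name,vg_mda_size,vg_mda_free --units b 3307f6fa-dd58-43db-ab23-b1fb299006c7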
 
I suggest that you open a bug and attach engine.log, /var/log/messages,
vdsm.log and sanlock.log there.

Please also give detailed info on the host OS, vdsm version, etc.
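
One way to gather the host-side logs in one go, assuming the usual locations (vdsm.log under /var/log/vdsm/; engine.log lives on the engine machine, typically under /var/log/ovirt-engine/):

  tar czf sd-lv-issue-logs.tar.gz \
      /var/log/messages /var/log/sanlock.log /var/log/vdsm/vdsm.log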

Nir


Re: [Users] SD Disk's Logical Volume not visible/activated on some nodes

2014-02-25 Thread Boyan Tabakov
Hello,

On 22.2.2014, 22:19, Nir Soffer wrote:
 - Original Message -
 From: Boyan Tabakov bl...@alslayer.net
 To: Nir Soffer nsof...@redhat.com
 Cc: users@ovirt.org
 Sent: Wednesday, February 19, 2014 7:18:36 PM
 Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on some 
 nodes

 Hello,

 On 19.2.2014, 17:09, Nir Soffer wrote:
 - Original Message -
 From: Boyan Tabakov bl...@alslayer.net
 To: users@ovirt.org
 Sent: Tuesday, February 18, 2014 3:34:49 PM
 Subject: [Users] SD Disk's Logical Volume not visible/activated on some
 nodes

 Hello,

 I have ovirt 3.3 installed on FC 19 hosts with vdsm 4.13.3-2.fc19.

 Which version of ovirt 3.3 is this? (3.3.2? 3.3.3?)

 ovirt-engine is 3.3.2-1.fc19

 One of the hosts (host1) is engine + node + SPM and the other host2 is
 just a node. I have an iSCSI storage domain configured and accessible
 from both nodes.

 When creating a new disk in the SD, the underlying logical volume gets
 properly created (seen in vgdisplay output on host1), but doesn't seem
 to be automatically picked by host2.

 How do you know it is not seen on host2?

 It's not present in the output of vgdisplay -v nor vgs.


 Consequently, when creating/booting
 a VM with the said disk attached, the VM fails to start on host2,
 because host2 can't see the LV. Similarly, if the VM is started on
 host1, it fails to migrate to host2. Extract from host2 log is in the
 end. The LV in question is 6b35673e-7062-4716-a6c8-d5bf72fe3280.

 As far as I could track quickly the vdsm code, there is only call to lvs
 and not to lvscan or lvchange so the host2 LVM doesn't fully refresh.
 The only workaround so far has been to restart VDSM on host2, which
 makes it refresh all LVM data properly.

 When is host2 supposed to pick up any newly created LVs in the SD VG?
 Any suggestions where the problem might be?

 When you create a new lv on the shared storage, the new lv should be
 visible on the other host. Lets start by verifying that you do see
 the new lv after a disk was created.

 Try this:

 1. Create a new disk, and check the disk uuid in the engine ui
 2. On another machine, run this command:

 lvs -o vg_name,lv_name,tags

 You can identify the new lv using tags, which should contain the new disk
 uuid.

 If you don't see the new lv from the other host, please provide
 /var/log/messages
 and /var/log/sanlock.log.

 Just tried that. The disk is not visible on the non-SPM node.
 
 This means that storage is not accessible from this host.

Generally, the storage seems accessible ok. For example, if I restart
the vdsmd, all volumes get picked up correctly (become visible in lvs
output and VMs can be started with them).

 

 On the SPM node (where the LV is visible in vgs output):

 Feb 19 19:10:43 host1 vdsm root WARNING File:
 /rhev/data-center/61f15cc0-8bba-482d-8a81-cd636a581b58/3307f6fa-dd58-43db-ab23-b1fb299006c7/images/4d15543c-4c45-4c23-bbe3-f10b9084472a/3e0ce8cb-3740-49d7-908e-d025875ac9a2
 already removed
 Feb 19 19:10:45 host1 multipathd: dm-65: remove map (uevent)
 Feb 19 19:10:45 host1 multipathd: dm-65: devmap not registered, can't remove
 Feb 19 19:10:45 host1 multipathd: dm-65: remove map (uevent)
 Feb 19 19:10:54 host1 kernel: [1652684.864746] dd: sending ioctl
 80306d02 to a partition!
 Feb 19 19:10:54 host1 kernel: [1652684.963931] dd: sending ioctl
 80306d02 to a partition!

 No recent entries in sanlock.log on the SPM node.

 On the non-SPM node (the one that doesn't show the LV in vgs output),
 there are no relevant entries in /var/log/messages.
 
 Strange - sanlock errors are logged to /var/log/messages. It would be helpful 
 if
 you attach this log - we may find something in it.

No entries appear in /var/log/messages, other than the quoted above
(Sorry, I didn't clarify that that was from /var/log/messages on host1).

 
 Here's the full
 sanlock.log for that host:

 2014-01-30 16:28:09+0200 1324 [2335]: sanlock daemon started 2.8 host
 18bd0a27-c280-4007-98f2-d2e7e73cd8b5.xenon.futu
 2014-01-30 16:59:51+0200 5 [609]: sanlock daemon started 2.8 host
 4a7627e2-296a-4e48-a7e2-f6bcecac07ab.xenon.futu
 2014-01-31 09:51:43+0200 60717 [614]: s1 lockspace
 3307f6fa-dd58-43db-ab23-b1fb299006c7:2:/dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/ids:0
 2014-01-31 16:03:51+0200 83045 [613]: s1:r1 resource
 3307f6fa-dd58-43db-ab23-b1fb299006c7:SDM:/dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/leases:1048576
 for 8,16,30268
 2014-01-31 16:18:01+0200 83896 [614]: s1:r2 resource
 3307f6fa-dd58-43db-ab23-b1fb299006c7:SDM:/dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/leases:1048576
 for 8,16,30268
 2014-02-06 05:24:10+0200 563065 [31453]: 3307f6fa aio timeout 0
 0x7fc37c0008c0:0x7fc37c0008d0:0x7fc391f5f000 ioto 10 to_count 1
 2014-02-06 05:24:10+0200 563065 [31453]: s1 delta_renew read rv -202
 offset 0 /dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/ids
 
 Sanlock cannot write to the ids lockspace

Which line shows that sanlock can't write? The messages are not very
human readable.
 
 2014

Re: [Users] SD Disk's Logical Volume not visible/activated on some nodes

2014-02-22 Thread Nir Soffer
- Original Message -
 From: Boyan Tabakov bl...@alslayer.net
 To: Nir Soffer nsof...@redhat.com
 Cc: users@ovirt.org
 Sent: Wednesday, February 19, 2014 7:18:36 PM
 Subject: Re: [Users] SD Disk's Logical Volume not visible/activated on some 
 nodes
 
 Hello,
 
 On 19.2.2014, 17:09, Nir Soffer wrote:
  - Original Message -
  From: Boyan Tabakov bl...@alslayer.net
  To: users@ovirt.org
  Sent: Tuesday, February 18, 2014 3:34:49 PM
  Subject: [Users] SD Disk's Logical Volume not visible/activated on some
  nodes
 
  Hello,
 
  I have ovirt 3.3 installed on FC 19 hosts with vdsm 4.13.3-2.fc19.
  
  Which version of ovirt 3.3 is this? (3.3.2? 3.3.3?)
 
 ovirt-engine is 3.3.2-1.fc19
 
  One of the hosts (host1) is engine + node + SPM and the other host2 is
  just a node. I have an iSCSI storage domain configured and accessible
  from both nodes.
 
  When creating a new disk in the SD, the underlying logical volume gets
  properly created (seen in vgdisplay output on host1), but doesn't seem
  to be automatically picked by host2.
  
  How do you know it is not seen on host2?
 
 It's not present in the output of vgdisplay -v nor vgs.
 
  
  Consequently, when creating/booting
  a VM with the said disk attached, the VM fails to start on host2,
  because host2 can't see the LV. Similarly, if the VM is started on
  host1, it fails to migrate to host2. Extract from host2 log is in the
  end. The LV in question is 6b35673e-7062-4716-a6c8-d5bf72fe3280.
 
  As far as I could track quickly the vdsm code, there is only call to lvs
  and not to lvscan or lvchange so the host2 LVM doesn't fully refresh.
  The only workaround so far has been to restart VDSM on host2, which
  makes it refresh all LVM data properly.
 
  When is host2 supposed to pick up any newly created LVs in the SD VG?
  Any suggestions where the problem might be?
  
  When you create a new lv on the shared storage, the new lv should be
  visible on the other host. Lets start by verifying that you do see
  the new lv after a disk was created.
  
  Try this:
  
  1. Create a new disk, and check the disk uuid in the engine ui
  2. On another machine, run this command:
  
  lvs -o vg_name,lv_name,tags
  
  You can identify the new lv using tags, which should contain the new disk
  uuid.
  
  If you don't see the new lv from the other host, please provide
  /var/log/messages
  and /var/log/sanlock.log.
 
 Just tried that. The disk is not visible on the non-SPM node.

This means that storage is not accessible from this host.

 
 On the SPM node (where the LV is visible in vgs output):
 
 Feb 19 19:10:43 host1 vdsm root WARNING File:
 /rhev/data-center/61f15cc0-8bba-482d-8a81-cd636a581b58/3307f6fa-dd58-43db-ab23-b1fb299006c7/images/4d15543c-4c45-4c23-bbe3-f10b9084472a/3e0ce8cb-3740-49d7-908e-d025875ac9a2
 already removed
 Feb 19 19:10:45 host1 multipathd: dm-65: remove map (uevent)
 Feb 19 19:10:45 host1 multipathd: dm-65: devmap not registered, can't remove
 Feb 19 19:10:45 host1 multipathd: dm-65: remove map (uevent)
 Feb 19 19:10:54 host1 kernel: [1652684.864746] dd: sending ioctl
 80306d02 to a partition!
 Feb 19 19:10:54 host1 kernel: [1652684.963931] dd: sending ioctl
 80306d02 to a partition!
 
 No recent entries in sanlock.log on the SPM node.
 
 On the non-SPM node (the one that doesn't show the LV in vgs output),
 there are no relevant entries in /var/log/messages.

Strange - sanlock errors are logged to /var/log/messages. It would be helpful if
you attach this log - we may find something in it.

 Here's the full
 sanlock.log for that host:
 
 2014-01-30 16:28:09+0200 1324 [2335]: sanlock daemon started 2.8 host
 18bd0a27-c280-4007-98f2-d2e7e73cd8b5.xenon.futu
 2014-01-30 16:59:51+0200 5 [609]: sanlock daemon started 2.8 host
 4a7627e2-296a-4e48-a7e2-f6bcecac07ab.xenon.futu
 2014-01-31 09:51:43+0200 60717 [614]: s1 lockspace
 3307f6fa-dd58-43db-ab23-b1fb299006c7:2:/dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/ids:0
 2014-01-31 16:03:51+0200 83045 [613]: s1:r1 resource
 3307f6fa-dd58-43db-ab23-b1fb299006c7:SDM:/dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/leases:1048576
 for 8,16,30268
 2014-01-31 16:18:01+0200 83896 [614]: s1:r2 resource
 3307f6fa-dd58-43db-ab23-b1fb299006c7:SDM:/dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/leases:1048576
 for 8,16,30268
 2014-02-06 05:24:10+0200 563065 [31453]: 3307f6fa aio timeout 0
 0x7fc37c0008c0:0x7fc37c0008d0:0x7fc391f5f000 ioto 10 to_count 1
 2014-02-06 05:24:10+0200 563065 [31453]: s1 delta_renew read rv -202
 offset 0 /dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/ids

Sanlock cannot write to the ids lockspace

 2014-02-06 05:24:10+0200 563065 [31453]: s1 renewal error -202
 delta_length 10 last_success 563034
 2014-02-06 05:24:21+0200 563076 [31453]: 3307f6fa aio timeout 0
 0x7fc37c000910:0x7fc37c000920:0x7fc391d5c000 ioto 10 to_count 2
 2014-02-06 05:24:21+0200 563076 [31453]: s1 delta_renew read rv -202
 offset 0 /dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/ids
 2014-02-06 05:24:21+0200 563076 [31453]: s1

Re: [Users] SD Disk's Logical Volume not visible/activated on some nodes

2014-02-19 Thread Boyan Tabakov
Hello,

On 19.2.2014, 17:09, Nir Soffer wrote:
 - Original Message -
 From: Boyan Tabakov bl...@alslayer.net
 To: users@ovirt.org
 Sent: Tuesday, February 18, 2014 3:34:49 PM
 Subject: [Users] SD Disk's Logical Volume not visible/activated on some nodes

 Hello,

 I have ovirt 3.3 installed on FC 19 hosts with vdsm 4.13.3-2.fc19.
 
 Which version of ovirt 3.3 is this? (3.3.2? 3.3.3?)

ovirt-engine is 3.3.2-1.fc19

 One of the hosts (host1) is engine + node + SPM and the other host2 is
 just a node. I have an iSCSI storage domain configured and accessible
 from both nodes.

 When creating a new disk in the SD, the underlying logical volume gets
 properly created (seen in vgdisplay output on host1), but doesn't seem
 to be automatically picked by host2.
 
 How do you know it is not seen on host2?

It's not present in the output of vgdisplay -v nor vgs.

 
 Consequently, when creating/booting
 a VM with the said disk attached, the VM fails to start on host2,
 because host2 can't see the LV. Similarly, if the VM is started on
 host1, it fails to migrate to host2. Extract from host2 log is in the
 end. The LV in question is 6b35673e-7062-4716-a6c8-d5bf72fe3280.

 As far as I could track quickly the vdsm code, there is only call to lvs
 and not to lvscan or lvchange so the host2 LVM doesn't fully refresh.
 The only workaround so far has been to restart VDSM on host2, which
 makes it refresh all LVM data properly.

 When is host2 supposed to pick up any newly created LVs in the SD VG?
 Any suggestions where the problem might be?
 
 When you create a new lv on the shared storage, the new lv should be
 visible on the other host. Lets start by verifying that you do see
 the new lv after a disk was created. 
 
 Try this:
 
 1. Create a new disk, and check the disk uuid in the engine ui
 2. On another machine, run this command:
 
 lvs -o vg_name,lv_name,tags
 
 You can identify the new lv using tags, which should contain the new disk 
 uuid.
 
 If you don't see the new lv from the other host, please provide 
 /var/log/messages
 and /var/log/sanlock.log.

Just tried that. The disk is not visible on the non-SPM node.

On the SPM node (where the LV is visible in vgs output):

Feb 19 19:10:43 host1 vdsm root WARNING File:
/rhev/data-center/61f15cc0-8bba-482d-8a81-cd636a581b58/3307f6fa-dd58-43db-ab23-b1fb299006c7/images/4d15543c-4c45-4c23-bbe3-f10b9084472a/3e0ce8cb-3740-49d7-908e-d025875ac9a2
already removed
Feb 19 19:10:45 host1 multipathd: dm-65: remove map (uevent)
Feb 19 19:10:45 host1 multipathd: dm-65: devmap not registered, can't remove
Feb 19 19:10:45 host1 multipathd: dm-65: remove map (uevent)
Feb 19 19:10:54 host1 kernel: [1652684.864746] dd: sending ioctl
80306d02 to a partition!
Feb 19 19:10:54 host1 kernel: [1652684.963931] dd: sending ioctl
80306d02 to a partition!

No recent entries in sanlock.log on the SPM node.

On the non-SPM node (the one that doesn't show the LV in vgs output),
there are no relevant entries in /var/log/messages. Here's the full
sanlock.log for that host:

2014-01-30 16:28:09+0200 1324 [2335]: sanlock daemon started 2.8 host
18bd0a27-c280-4007-98f2-d2e7e73cd8b5.xenon.futu
2014-01-30 16:59:51+0200 5 [609]: sanlock daemon started 2.8 host
4a7627e2-296a-4e48-a7e2-f6bcecac07ab.xenon.futu
2014-01-31 09:51:43+0200 60717 [614]: s1 lockspace
3307f6fa-dd58-43db-ab23-b1fb299006c7:2:/dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/ids:0
2014-01-31 16:03:51+0200 83045 [613]: s1:r1 resource
3307f6fa-dd58-43db-ab23-b1fb299006c7:SDM:/dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/leases:1048576
for 8,16,30268
2014-01-31 16:18:01+0200 83896 [614]: s1:r2 resource
3307f6fa-dd58-43db-ab23-b1fb299006c7:SDM:/dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/leases:1048576
for 8,16,30268
2014-02-06 05:24:10+0200 563065 [31453]: 3307f6fa aio timeout 0
0x7fc37c0008c0:0x7fc37c0008d0:0x7fc391f5f000 ioto 10 to_count 1
2014-02-06 05:24:10+0200 563065 [31453]: s1 delta_renew read rv -202
offset 0 /dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/ids
2014-02-06 05:24:10+0200 563065 [31453]: s1 renewal error -202
delta_length 10 last_success 563034
2014-02-06 05:24:21+0200 563076 [31453]: 3307f6fa aio timeout 0
0x7fc37c000910:0x7fc37c000920:0x7fc391d5c000 ioto 10 to_count 2
2014-02-06 05:24:21+0200 563076 [31453]: s1 delta_renew read rv -202
offset 0 /dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/ids
2014-02-06 05:24:21+0200 563076 [31453]: s1 renewal error -202
delta_length 11 last_success 563034
2014-02-06 05:24:32+0200 563087 [31453]: 3307f6fa aio timeout 0
0x7fc37c000960:0x7fc37c000970:0x7fc391c5a000 ioto 10 to_count 3
2014-02-06 05:24:32+0200 563087 [31453]: s1 delta_renew read rv -202
offset 0 /dev/3307f6fa-dd58-43db-ab23-b1fb299006c7/ids
2014-02-06 05:24:32+0200 563087 [31453]: s1 renewal error -202
delta_length 11 last_success 563034
2014-02-06 05:24:40+0200 563094 [609]: s1 check_our_lease warning 60
last_success 563034
2014-02-06 05:24:41+0200 563095 [609]: s1 check_our_lease warning 61
last_success 563034
2014-02-06 05:24:42