Re: [ovirt-users] New user intro some questions
- Original Message - From: Sandro Bonazzola sbona...@redhat.com To: Sven Kieske s.kie...@mittwald.de, users@ovirt.org, Federico Simoncelli fsimo...@redhat.com Sent: Tuesday, February 10, 2015 4:56:05 PM Subject: Re: [ovirt-users] New user intro some questions Il 10/02/2015 16:49, Sven Kieske ha scritto: On 10/02/15 03:02, Jason Brooks wrote: The meaning of support is important here -- support from whom? It's true that there's no gluster+virt SKU of the RHEV downstream project. All configurations of ovirt proper are self-supported, or community- supported, and what we choose to support is up to us individuals. However, gluster + virt on the same nodes does work -- even w/ management through the engine. I do use gluster on my virt nodes, but I don't manage them w/ the engine, because, afaik, there isn't a way to have gluster and virt on separate networks this way, so I just manage gluster from the gluster cli. It's true, oVirt is happiest w/ separate machines for everything, and a rock-solid san of some sort, etc., but that's not the only way, and as you point out, hardware isn't free. Well you might be interested in these upcoming features: http://www.ovirt.org/Features/Self_Hosted_Engine_Hyper_Converged_Gluster_Support sadly the slides from this talk are not online: https://fosdem.org/2015/schedule/event/hyperconvergence/ maybe brian(cc'ed) can put them somewhere? Or Federico (CCed) :-) The slides of my sessions are available at: http://www.ovirt.org/images/6/6c/2015-ovirt-glusterfs-hyperconvergence.pdf http://www.ovirt.org/images/9/97/2015-docker-ovirt-iaas.pdf -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Reclaim/Trim/issue_discards : how to inform my LUNs about disk space waste?
Hi Nicolas, you can find more information on this at: https://bugzilla.redhat.com/show_bug.cgi?id=981626 First of all an important note (that was already mentioned): vdsm is not using lvm.conf, so whatever change you make there it won't affect vdsm behavior. Anyway long story short, enabling issue_discards in lvm would lead to lvm commands starvation when the lv that you're removing is large and the granularity is small. The correct solution is to use blkdiscard on the lv and I happened to submit a patch series for that yesterday: http://gerrit.ovirt.org/#/q/status:open+project:vdsm+branch:master+topic:block-discard,n,z (very much experimental) The approach is to begin with issuing blkdiscard when wipe after delete is selected on the disk. That is because blkdiscard in the majority of the cases will wipe the lv data and I know that someone in the past has been brave enough to try and recover data from a mistakenly removed lv that wasn't post-zeroed. Anyway extending the support to non post-zero is trivial and it's just a matter of agreement and expectations. With regard to the legitimate question of why both post-zero and block discard, the answer is that after discussing it with storage array experts it seems that blkdiscard has no contract in guaranteeing that the data will be blanked out and it could later on show up even on a completely different LUN of the same storage. -- Federico - Original Message - From: Nicolas Ecarnot nico...@ecarnot.net To: Federico Simoncelli fsimo...@redhat.com Cc: users users@ovirt.org Sent: Thursday, November 27, 2014 10:43:06 AM Subject: Re: [ovirt-users] Reclaim/Trim/issue_discards : how to inform my LUNs about disk space waste? Le 22/07/2014 14:23, Federico Simoncelli a écrit : - Original Message - From: Nicolas Ecarnot nico...@ecarnot.net To: users users@ovirt.org Sent: Thursday, July 3, 2014 10:54:57 AM Subject: [ovirt-users] Reclaim/Trim/issue_discards : how to inform my LUNs about disk space waste? Hi, In my hosts, I see that /etc/lvm/lvm.conf shows : issue_discards = 0 Can I enable it to 1 ? Thank you? You can change it but it's not going to affect the lvm behavior in VDSM since we don't use the host lvm config file. Frederico, May you describe a little more how it's done, explain the principle, or point us to a place where we can learn more about how LVM is used in oVirt, amongst the manager and the hosts. Thank you. This will be probably addressed as part of bz1017284 as we're considering to extend discard also to vdsm images (and not direct luns only). https://bugzilla.redhat.com/show_bug.cgi?id=1017284 ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
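For reference, the manual equivalent of what the patch series does is roughly the following (device names are hypothetical placeholders; vdsm drives this internally on its own LVs, so this is only an illustration, not something to run on a live storage domain):

# blkdiscard /dev/<vg-name>/<lv-name>
# blkdiscard --offset 0 --length 1073741824 /dev/<vg-name>/<lv-name>    # discard only the first 1 GiB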
Re: [ovirt-users] Hosted engine: sending ioctl 5401 to a partition!
- Original Message -
From: Chris Adams c...@cmadams.net
To: users@ovirt.org
Sent: Friday, November 21, 2014 10:28:28 PM
Subject: [ovirt-users] Hosted engine: sending ioctl 5401 to a partition!

> I have set up oVirt with hosted engine, on an iSCSI volume. On both nodes,
> the kernel logs the following about every 10 seconds:
>
> Nov 21 15:27:49 node8 kernel: ovirt-ha-broker: sending ioctl 5401 to a partition!
>
> Is this a known bug, something that I need to address, etc.?

Is this on centos or fedora? We may have to do some testing to identify where that's coming from. Feel free to ping me: fsimonce (#ovirt on OFTC) so we can check what's going on.

-- Federico

___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
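If someone wants to chase this before pinging on IRC, one way to see which call triggers the warning is to trace the broker's ioctls (a rough sketch; 5401 is most likely the TCGETS terminal ioctl, 0x5401, being issued against a block device):

# strace -f -e trace=ioctl -p $(pgrep -of ovirt-ha-broker)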
Re: [ovirt-users] Status libgfapi support in oVirt
- Original Message - From: noc n...@nieuwland.nl To: users@ovirt.org Sent: Friday, November 21, 2014 10:01:30 AM Subject: Re: [ovirt-users] Status libgfapi support in oVirt On 21-11-2014 9:47, noc wrote: The VM doesn't start but that can be caused by my L2 virt OR that the host name = .. port=0 ... is wrong. Shouldn't there be a port in the 24007 or 49152 range? Sorry, forgot to install vdsm-gluster. Starting the VM on an el6 host now works and still the same line with the port=0 so that doesn't seem to matter. OK, back to setting up a el6 host except when you generate a el7 version too which would be awesome 8-) . I updated the packages (rebasing on a newer master) and I provided an el7 build as well: https://fsimonce.fedorapeople.org/vdsm-libgfapi/ These rpms are less tested than the previous ones but the rebase was straight forward. -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
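As a quick sanity check (not from the thread, just a suggestion): whether a running VM really got a libgfapi disk can be seen from the generated libvirt XML on the host; the VM name below is a placeholder:

# virsh -r dumpxml <vm-name> | grep -B2 -A4 "protocol='gluster'"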
Re: [ovirt-users] Status libgfapi support in oVirt
- Original Message - From: noc n...@nieuwland.nl To: users@ovirt.org Sent: Thursday, November 20, 2014 8:46:01 AM Subject: Re: [ovirt-users] Status libgfapi support in oVirt On 19-11-2014 23:44, Darrell Budic wrote: Is there an el7 build of this available too? That would be nice too. Forgot that I updated my test env to el7 to see if that helped. Can test on F20 @home tonight if needed. Gonna try something else first and will let you all know how that went. I've prepared a fedora 20 build as well: https://fsimonce.fedorapeople.org/vdsm-libgfapi/fc20/ -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Status libgfapi support in oVirt
- Original Message - From: Joop jvdw...@xs4all.nl To: users@ovirt.org Sent: Monday, November 17, 2014 9:39:36 AM Subject: [ovirt-users] Status libgfapi support in oVirt I have been trying to use libgfapi glusterfs support in oVirt but can't get it to work. After talks on IRC it seems I should apply a patch (http://gerrit.ovirt.org/33768) to enable libgf BUT I can't get it to work. Systems used: - hosts Centos7 or Fedora20 (so upto date qemu/libvirt/oVirt(3.5)) - glusterfs-3.6.1 - vdsm-4.16.0-524.gitbc618a4.el7.x86_64 (snapshot master 14-nov) - vdsm-4.16.7-1.gitdb83943.el7.x86_64 (official ovirt-3.5 vdsm, seems newer than master snapshot?? ) Just adding the patch to vdsm-4.16.7-1.gitdb83943.el7.x86_64 doesn't work, vdsm doesn't start anymore due to an error in virt/vm.py. Q1: what is de exact status of libgf and oVirt. Q2: how do I test that patch? Rebasing and applying patches could be tricky sometimes and if you got an error in virt/vm.py it is most likely because the patch didn't apply cleanly. I prepared a build (el6) here: https://fsimonce.fedorapeople.org/vdsm-libgfapi/ In case you want to try it on fedora you just need to get the source rpm here: https://fsimonce.fedorapeople.org/vdsm-libgfapi/source/ and rebuild it on fedora. Let me know if you have any problem. -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
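A rough sketch of the rebuild-on-Fedora step (the exact src.rpm file name has to be taken from the 'source' directory listing; the name below is a placeholder):

# yum install rpm-build yum-utils
# curl -O https://fsimonce.fedorapeople.org/vdsm-libgfapi/source/vdsm-<version>.src.rpm
# yum-builddep vdsm-<version>.src.rpm
# rpmbuild --rebuild vdsm-<version>.src.rpm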
Re: [ovirt-users] Status libgfapi support in oVirt
- Original Message - From: noc n...@nieuwland.nl To: users@ovirt.org Sent: Wednesday, November 19, 2014 9:36:28 AM Subject: Re: [ovirt-users] Status libgfapi support in oVirt On 18-11-2014 20:57, Christopher Young wrote: I'm replying to 'up' this as well as I'm most interested in this. I actually thought this was implemented and working too. On Mon, Nov 17, 2014 at 10:01 AM, Daniel Helgenberger daniel.helgenber...@m-box.de wrote: Hello Joop, thanks for raising the issue as it is one of the things I assumed are already implemented and working. Sadly I cannot provide any answer ... On 17.11.2014 09:39, Joop wrote: I have been trying to use libgfapi glusterfs support in oVirt but can't get it to work. After talks on IRC it seems I should apply a patch ( http://gerrit.ovirt.org/33768 ) to enable libgf BUT I can't get it to work. Systems used: - hosts Centos7 or Fedora20 (so upto date qemu/libvirt/oVirt(3.5)) - glusterfs-3.6.1 - vdsm-4.16.0-524.gitbc618a4.el7.x86_64 (snapshot master 14-nov) - vdsm-4.16.7-1.gitdb83943.el7.x86_64 (official ovirt-3.5 vdsm, seems newer than master snapshot?? ) Just adding the patch to vdsm-4.16.7-1.gitdb83943.el7.x86_64 doesn't work, vdsm doesn't start anymore due to an error in virt/vm.py. Q1: what is de exact status of libgf and oVirt. Q2: how do I test that patch? I experimented a little more and found that if I create a VM in oVirt on a glusterfs storage domain and start it, it won't use libgfapi, BUT if I use virsh on the host where the VM runs and then add a disk the libgfapi way the VM will see the disk and can use it. So the underlying infra is capable of using libgf but oVirt isn't using it. Thats where the patch comes in I think but I can't get it to work. Correct. oVirt up until now didn't use libgfapi because of missing features (e.g. live snapshot). It seems that now all those gaps have been fixed and we're trying to re-enable libgfapi. I just mentioned that I uploaded an el6 build here: https://fsimonce.fedorapeople.org/vdsm-libgfapi/ and sources here (to rebuild on fedora): https://fsimonce.fedorapeople.org/vdsm-libgfapi/source/ Let me know if the most of you are using fedora and I'll make a build on fedora as well. Please let me know how it goes. Thanks, -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] [QE] oVirt 3.5.1 status
- Original Message - From: Sven Kieske svenkie...@gmail.com To: users@ovirt.org Sent: Thursday, October 23, 2014 6:42:11 PM Subject: Re: [ovirt-users] [QE] oVirt 3.5.1 status -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hi, please consider: https://bugzilla.redhat.com/show_bug.cgi?id=1156115 as a backport to 3.5.1 I don't know if you plan to release a new vdsm version though and I also don't know if this patch is already matured enough, if I can test or help make this patch better, please let me know, as I'm very interested in getting it into this release. Hi Sven, first of all we need to merge the patch in master and yes, as you suggested having some help would speed up the process. The patch affects the move/copy of images with (one or more) snapshots. I already briefly tested the patch with qemu-img from rhel 6 so we need to cover other platforms (fedora and centos) and test: - cold/live move of disks from one storage to another (both nfs/iscsi and cold move from nfs to iscsi and backward) - export vms (with snapshots) to export domain and re-import it (both nfs and iscsi) Once the patch is in master and we have a feedback on how stable it is we may consider it for backporting (maybe 3.5.2 or 3.5.3 if ever). Thanks, -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] oVirt 3.2 - iSCSI offload (broadcom - bnx2i)
- Original Message -
From: Ricardo Esteves ricardo.m.este...@gmail.com
To: Federico Simoncelli fsimo...@redhat.com
Cc: users@ovirt.org
Sent: Wednesday, October 8, 2014 1:32:51 AM
Subject: Re: [ovirt-users] oVirt 3.2 - iSCSI offload (broadcom - bnx2i)

> Hi, here it goes:
>
> ethtool -i eth3
> driver: bnx2
> version: 2.2.4g
> firmware-version: bc 5.2.3
> bus-info: :06:00.1
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: yes
> supports-register-dump: yes
> supports-priv-flags: no

Thanks, can you also add the output of:

# lspci -nn

I'd have expected the driver to be bnx2i (bnx2 is a regular ethernet driver, no offloading). Can you also check if you have the bnx2i driver loaded?

# lsmod | grep bnx2

and eventually if there's any bnx2 related message in /var/log/messages. Check also the adapter bios (at boot time) if by any chance you have to enable the offloading there first (check the specific manual too).

-- Federico

___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
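For the last two checks, something along these lines should be enough (stock locations assumed):

# grep -i bnx2 /var/log/messages | tail -50
# ls /sys/class/iscsi_transport        # bnx2i should show up here once the offload driver is registered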
Re: [ovirt-users] oVirt 3.2 - iSCSI offload (broadcom - bnx2i)
- Original Message -
From: Ricardo Esteves ricardo.m.este...@gmail.com
To: Federico Simoncelli fsimo...@redhat.com
Cc: users@ovirt.org
Sent: Tuesday, October 7, 2014 8:44:19 PM
Subject: Re: [ovirt-users] oVirt 3.2 - iSCSI offload (broadcom - bnx2i)

> cat /var/lib/iscsi/ifaces/eth3
> # BEGIN RECORD 6.2.0-873.10.el6
> iface.iscsi_ifacename = eth3
> iface.transport_name = tcp
> iface.vlan_id = 0
> iface.vlan_priority = 0
> iface.iface_num = 0
> iface.mtu = 0
> iface.port = 0
> # END RECORD
>
> Is there anyway to tell ovirt to use bnx2i instead of tcp?

Hi Ricardo, can you paste the output of:

# ethtool -i eth3

Thanks,
-- Federico

___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
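For reference only: outside of oVirt, open-iscsi binds a session to the offload engine through a bnx2i iface roughly as below (MAC address taken from later in this thread; oVirt 3.2 manages the ifaces itself and will not pick this up, which is what the rest of the thread is about):

# iscsiadm -m iface -I bnx2i.d8:d3:85:67:e3:bb --op new
# iscsiadm -m iface -I bnx2i.d8:d3:85:67:e3:bb --op update -n iface.transport_name -v bnx2i
# iscsiadm -m iface -I bnx2i.d8:d3:85:67:e3:bb --op update -n iface.hwaddress -v d8:d3:85:67:e3:bb
# iscsiadm -m discovery -t st -p 192.168.12.2:3260 -I bnx2i.d8:d3:85:67:e3:bb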
Re: [ovirt-users] [ovirt-devel] Building vdsm within Fedora
- Original Message - From: Dan Kenigsberg dan...@redhat.com To: Sandro Bonazzola sbona...@redhat.com Cc: crobi...@redhat.com, users users@ovirt.org, de...@ovirt.org Sent: Thursday, September 25, 2014 3:06:01 PM Subject: Re: [ovirt-devel] [ovirt-users] Building vdsm within Fedora On Wed, Sep 24, 2014 at 10:57:21AM +0200, Sandro Bonazzola wrote: Il 24/09/2014 09:44, Sven Kieske ha scritto: On 24/09/14 09:13, Federico Simoncelli wrote: You probably missed the first part we were using qemu-kvm/qemu-img in the spec file. In that case you won't fail in any requirement. Basically the question is: was there any problem on centos6 before committing http://gerrit.ovirt.org/31214 ? Federico: as we checked a few minutes ago, it seems there's no problem in requiring qemu-kvm/qemu-img in the spec file. Only issue is that if non rhev version is installed a manual yum update is required for moving to the rhevm version. Right. Without the patch, RPM does not enforce qemu-kvm-rhev. So our code has to check for qemu-kvm-rhev functionality, instead of knowing that it is there. Furthermore, we had several reports of users finding themselves without qemu-kvm-rhev on their node, and not understanding why they do not have live merge. Live merge? The biggest problem with live merge is libvirt not qemu. Anyway the qemu-kvm/qemu-kvm-rhev problem is relevant only for centos and centos has a specific way to address these special needs: http://www.centos.org/variants/ A CentOS variant is a special edition of CentOS Linux that starts with the core distribution, then replaces or supplements a specific subset of packages. This may include replacing everything down to the kernel, networking, and other subsystems. I think the plan was to have our own centos variant (shipping qemu-kvm-rhev). I remember Doron participated to the centos meetings but I don't remember the outcome. -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Building vdsm within Fedora
- Original Message - From: Dan Kenigsberg dan...@redhat.com To: Sandro Bonazzola sbona...@redhat.com, de...@ovirt.org, fsimo...@redhat.com, dougsl...@redhat.com Cc: Sven Kieske s.kie...@mittwald.de, users users@ovirt.org Sent: Tuesday, September 23, 2014 11:21:18 PM Subject: Building vdsm within Fedora Since Vdsm was open-sourced, it was built and deployed via Fedora. Recently [http://gerrit.ovirt.org/31214] vdsm introduced a spec-file dependency onf qemu-kvm-rhev, and considered to backport it to the ovirt-3.4 brach. Requiring qemu-kvm-rhev, which is not part of Fedora's EPEL6 branch, violates Fedora's standards. So basically we have two options: 1. Revert the qemu-kvm-rhev dependency. 2. Drop vdsm from EPEL6 (or completely from Fedora); ship Vdsm only within the oVirt repositories. A third option would be to have one rpm, with qemu-kvm-rhev, shipped in ovirt, and another without it - shipped in Fedora. I find this overly complex and confusing. I think that until now (centos6) we were using qemu-kvm/qemu-img in the spec file and then the ovirt repository was distributing qemu-*-rhev from: http://resources.ovirt.org/pub/ovirt-3.4-snapshot/rpm/el6/x86_64/ It this not possible with centos7? Any problem with that? I find being in fedora a way to keep the spec file and the rpm updated and as clean as possible. -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] [ovirt-devel] Building vdsm within Fedora
- Original Message - From: Sven Kieske s.kie...@mittwald.de To: de...@ovirt.org, users users@ovirt.org Sent: Wednesday, September 24, 2014 9:44:17 AM Subject: Re: [ovirt-devel] Building vdsm within Fedora On 24/09/14 09:13, Federico Simoncelli wrote: You probably missed the first part we were using qemu-kvm/qemu-img in the spec file. In that case you won't fail in any requirement. Basically the question is: was there any problem on centos6 before committing http://gerrit.ovirt.org/31214 ? Of course there was a problem, please follow the link in this very commit to the according bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1127763 In short: you can not use live snapshots without this updated spec file. And it's a PITA to install this package by hand, you must track it's versions yourself etc pp. you basically lose all the stuff a proper spec file gives you. As soon as you have the ovirt repository installed there shouldn't be any reason for you to have any of these problems. Sandro, is there any reason why the rpm available here: http://resources.ovirt.org/pub/ovirt-3.4/rpm/el6/x86_64/ are not published here? http://resources.ovirt.org/releases/3.4/rpm/el6/x86_64/ Is there any additional repository (that provides qemu-*-rhev) that we are missing from the ovirt.repo file? -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] [ovirt-devel] Building vdsm within Fedora
- Original Message - From: Sandro Bonazzola sbona...@redhat.com To: Federico Simoncelli fsimo...@redhat.com Cc: de...@ovirt.org, users users@ovirt.org, Sven Kieske s.kie...@mittwald.de Sent: Wednesday, September 24, 2014 11:01:35 AM Subject: Re: [ovirt-devel] Building vdsm within Fedora Il 24/09/2014 10:35, Federico Simoncelli ha scritto: - Original Message - From: Sven Kieske s.kie...@mittwald.de To: de...@ovirt.org, users users@ovirt.org Sent: Wednesday, September 24, 2014 9:44:17 AM Subject: Re: [ovirt-devel] Building vdsm within Fedora On 24/09/14 09:13, Federico Simoncelli wrote: You probably missed the first part we were using qemu-kvm/qemu-img in the spec file. In that case you won't fail in any requirement. Basically the question is: was there any problem on centos6 before committing http://gerrit.ovirt.org/31214 ? Of course there was a problem, please follow the link in this very commit to the according bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1127763 In short: you can not use live snapshots without this updated spec file. And it's a PITA to install this package by hand, you must track it's versions yourself etc pp. you basically lose all the stuff a proper spec file gives you. As soon as you have the ovirt repository installed there shouldn't be any reason for you to have any of these problems. Sandro, is there any reason why the rpm available here: http://resources.ovirt.org/pub/ovirt-3.4/rpm/el6/x86_64/ are not published here? http://resources.ovirt.org/releases/3.4/rpm/el6/x86_64/ this second link points to the previous layout, abandoned since we moved from /releases to /pub. /releases is still around for historical purpose, I think we should consider to drop it at some point avoinding confusion or renaming it to something that make it clear that it shouldn't be used anymore. Sven can you let us know if you still have any problem using: http://resources.ovirt.org/pub/yum-repo/ovirt-release34.rpm (which should contain the correct ovirt.repo) Thanks, -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
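A quick way to verify the repository setup described above (package URL as given; the qemu-kvm-rhev query simply confirms the oVirt el6 repo is the one being used):

# yum install http://resources.ovirt.org/pub/yum-repo/ovirt-release34.rpm
# yum repolist enabled | grep -i ovirt
# yum info qemu-kvm-rhev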
Re: [ovirt-users] Problem Refreshing/Using ovirt-image-repository / 3.5 RC2
- Original Message - From: Oved Ourfali ov...@redhat.com To: j...@internetx.com, Federico Simoncelli fsimo...@redhat.com Cc: users@ovirt.org, Allon Mureinik amure...@redhat.com Sent: Tuesday, September 23, 2014 9:56:28 AM Subject: Re: [ovirt-users] Problem Refreshing/Using ovirt-image-repository / 3.5 RC2 - Original Message - From: InterNetX - Juergen Gotteswinter j...@internetx.com To: users@ovirt.org Sent: Tuesday, September 23, 2014 10:41:41 AM Subject: Re: [ovirt-users] Problem Refreshing/Using ovirt-image-repository / 3.5 RC2 Am 23.09.2014 um 09:32 schrieb Oved Ourfali: - Original Message - From: InterNetX - Juergen Gotteswinter j...@internetx.com To: users@ovirt.org Sent: Tuesday, September 23, 2014 10:29:07 AM Subject: [ovirt-users] Problem Refreshing/Using ovirt-image-repository / 3.5 RC2 Hi, when trying to refresh the ovirt glance repository i get a 500 Error Message Operation Canceled Error while executing action: A Request to the Server failed with the following Status Code: 500 engine.log says: 2014-09-23 09:23:08,960 INFO [org.ovirt.engine.core.bll.provider.TestProviderConnectivityCommand] (ajp--127.0.0.1-8702-10) [7fffb4bd] Running command: TestProviderConnectivityCommand internal: false. Entities affected : ID: aaa0----123456789aaa Type: SystemAction group CREATE_STORAGE_POOL with role type ADMIN 2014-09-23 09:23:08,975 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp--127.0.0.1-8702-10) [7fffb4bd] Correlation ID: 7fffb4bd, Call Stack: null, Custom Event ID: -1, Message: Unrecognized audit log type has been used. 2014-09-23 09:23:20,173 INFO [org.ovirt.engine.core.bll.aaa.LogoutUserCommand] (ajp--127.0.0.1-8702-11) [712895c3] Running command: LogoutUserCommand internal: false. 2014-09-23 09:23:20,184 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp--127.0.0.1-8702-11) [712895c3] Correlation ID: 712895c3, Call Stack: null, Custom Event ID: -1, Message: User admin logged out. 2014-09-23 09:23:20,262 INFO [org.ovirt.engine.core.bll.aaa.LoginAdminUserCommand] (ajp--127.0.0.1-8702-6) Running command: LoginAdminUserCommand internal: false. All these message are good... no error here. Can you attach the full engine log? imho there is nothing else related to this :/ i attached the log starting from today. except firing up a test vm nothing else happened yet (and several tries refreshing the image repo) I don't see a refresh attempt in the log, but i'm not familiar enough with that. Federico - can you have a look? I don't see any reference to glance or error 500 in the logs. My impression is that the error 500 is between the ui and the engine... have you tried to force-refresh the ovirt webadmin page? You can try and use the rest-api to check if the listing is working there. -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
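One way to test the listing through the REST API, as suggested above (the URL path and credentials are assumptions for a 3.5 engine, adjust to the local setup):

# curl -k -u admin@internal:PASSWORD https://ENGINE_FQDN/ovirt-engine/api/openstackimageproviders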
[ovirt-users] Fedora 21 Test Day
FYI, in a couple of days it will be the Fedora 21 virtualization test day: http://fedoramagazine.org/5tftw-2014-09-02/ https://fedoraproject.org/wiki/Test_Day:2014-09-25_Virtualization it's a good opportunity for us to check for regressions with vdsm (qemu, libvirt, virt-tools), get more attention from the fedora community, have quicker feedback on issues and get them fixed before release. -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Can I use qcow2?
- Original Message -
From: Itamar Heim ih...@redhat.com
To: Demeter Tibor tdeme...@itsmart.hu
Cc: users@ovirt.org, Allon Mureinik amure...@redhat.com, Federico Simoncelli fsimo...@redhat.com
Sent: Wednesday, September 3, 2014 12:50:30 PM
Subject: Re: [ovirt-users] Can I use qcow2?

> On 09/03/2014 11:14 AM, Demeter Tibor wrote:
> > On shared glusterfs.
>
> Allon/Federico - I remember on NFS, qcow2 isn't used by default, since raw
> is sparse by default. (but i don't remember if it won't work, or just not
> enabled by default). can one create a qcow2 disk for a VM with gluster
> storage?

Yes, through the REST API. From the ovirt-shell you could run:

$ add disk \
    --vm-identifier vm_name \
    --provisioned_size size_in_bytes \
    --interface virtio \
    --name disk_name \
    --format cow \
    --sparse true \
    --storage_domains-storage_domain storage_domain.name=gluster_domain_name

-- Federico

___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] feature review - ReportGuestDisksLogicalDeviceName
- Original Message - From: Dan Kenigsberg dan...@redhat.com To: Liron Aravot lara...@redhat.com Cc: users@ovirt.org, de...@ovirt.org, smizr...@redhat.com, fsimo...@redhat.com, Michal Skrivanek mskri...@redhat.com, Vinzenz Feenstra vfeen...@redhat.com, Allon Mureinik amure...@redhat.com Sent: Monday, September 1, 2014 11:23:45 PM Subject: Re: feature review - ReportGuestDisksLogicalDeviceName On Sun, Aug 31, 2014 at 07:20:04AM -0400, Liron Aravot wrote: Feel free to review the the following feature. http://www.ovirt.org/Features/ReportGuestDisksLogicalDeviceName Thanks for posting this feature page. Two things worry me about this feature. The first is timing. It is not reasonable to suggest an API change, and expect it to get to ovirt-3.5.0. We are two late anyway. The other one is the suggested API. You suggest placing volatile and optional infomation in getVMList. It won't be the first time that we have it (guestIPs, guestFQDN, clientIP, and displayIP are there) but it's foreign to the notion of conf reported by getVMList() - the set of parameters needed to recreate the VM. At first sight this seems something belonging to getVmStats (which is reporting already other guest agent information). -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
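To see the difference between the two verbs being discussed, one can query a host directly with vdsClient; a sketch with a placeholder VM UUID:

# vdsClient -s 0 list table            # the conf-oriented VM listing (getVMList)
# vdsClient -s 0 getVmStats <vm-uuid>  # runtime and guest-agent data (getVmStats)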
Re: [ovirt-users] Reply: Error after changing IP of Node (FQDN is still the same)
What's the version of the vdsm and sanlock packages? Can you please share the logs on the host side? We need vdsm.log and sanlock.log containing the relevant errors (Cannot acquire host id). Thanks, -- Federico - Original Message - From: ml ml mliebher...@googlemail.com To: d...@redhat.com Cc: users@ovirt.org Users@ovirt.org Sent: Sunday, August 3, 2014 8:57:18 PM Subject: Re: [ovirt-users]答复: Error after changing IP of Node (FQDN is still the same) ok, i now removed the nodes and added them again. Same FQDN. I still get Start SPM Task failed - result: cleanSuccess, message: VDSGenericException: VDSErrorException: Failed to HSMGetTaskStatusVDS, error = Cannot acquire host id, code = 661 I also got some error. Now whats the deal with that host id? Can somone please point me to the way how to debug this instead of pressing some remove and add buttons? Does someone really know how ovirt works under the hood? ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
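To collect what is being asked for, something like this should do (stock log locations assumed):

# rpm -q vdsm sanlock
# grep -i "acquire host id" /var/log/vdsm/vdsm.log | tail -20
# tail -100 /var/log/sanlock.log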
Re: [ovirt-users] SPM in oVirt 3.6
- Original Message - From: Nir Soffer nsof...@redhat.com To: Daniel Helgenberger daniel.helgenber...@m-box.de Cc: users@ovirt.org, Federico Simoncelli fsimo...@redhat.com Sent: Monday, July 28, 2014 6:43:30 PM Subject: Re: [ovirt-users] SPM in oVirt 3.6 - Original Message - From: Daniel Helgenberger daniel.helgenber...@m-box.de To: users@ovirt.org Sent: Friday, July 25, 2014 7:51:33 PM Subject: [ovirt-users] SPM in oVirt 3.6 just out of pure curiosity: In a BZ [1] Allon mentions SPM will go away in ovirt 3.6. This seems like a major change for me. I assume this will replace sanlock as well? What will SPM be replaced with? No, sanlock is not going anywhere. The change is that we will not have an SPM node, but any node that need to make meta data changes, will take a lock using sanlock while it make the changes. Federico: can you describe in more details how it is going to work? Most of the information can be found on the feature page: http://www.ovirt.org/Features/Decommission_Master_Domain_and_SPM -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Reclaim/Trim/issue_discards : how to inform my LUNs about disk space waste?
- Original Message - From: Nicolas Ecarnot nico...@ecarnot.net To: users users@ovirt.org Sent: Thursday, July 3, 2014 10:54:57 AM Subject: [ovirt-users] Reclaim/Trim/issue_discards : how to inform my LUNs about disk space waste? Hi, In my hosts, I see that /etc/lvm/lvm.conf shows : issue_discards = 0 Can I enable it to 1 ? Thank you? You can change it but it's not going to affect the lvm behavior in VDSM since we don't use the host lvm config file. This will be probably addressed as part of bz1017284 as we're considering to extend discard also to vdsm images (and not direct luns only). https://bugzilla.redhat.com/show_bug.cgi?id=1017284 -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Can HA Agent control NFS Mount?
- Original Message - From: Bob Doolittle b...@doolittle.us.com To: Doron Fediuck dfedi...@redhat.com, Andrew Lau and...@andrewklau.com Cc: users users@ovirt.org, Federico Simoncelli fsimo...@redhat.com Sent: Saturday, June 14, 2014 1:29:54 AM Subject: Re: [ovirt-users] Can HA Agent control NFS Mount? But there may be more going on. Even if I stop vdsmd, the HA services, and libvirtd, and sleep 60 seconds, I still see a lock held on the Engine VM storage: daemon 6f3af037-d05e-4ad8-a53c-61627e0c2464.xion2.smar p -1 helper p -1 listener p -1 status s 003510e8-966a-47e6-a5eb-3b5c8a6070a9:1:/rhev/data-center/mnt/xion2.smartcity.net\:_export_VM__NewDataDomain/003510e8-966a-47e6-a5eb-3b5c8a6070a9/dom_md/ids:0 s hosted-engine:1:/rhev/data-center/mnt/xion2\:_export_vm_he1/18eeab54-e482-497f-b096-11f8a43f94f4/ha_agent/hosted-engine.lockspace:0 This output shows that the lockspaces are still acquired. When you put hosted-engine in maintenance they must be released. One by directly using rem_lockspace (since it's the hosted-engine one) and the other one by stopMonitoringDomain. I quickly looked at the ovirt-hosted-engine* projects and I haven't found anything related to that. -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
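For illustration only (this is what the hosted-engine tooling should end up doing, not a recommended manual procedure): the two releases mentioned above would map to something like the following, with the lockspace string and the storage domain UUID taken from the sanlock status output shown earlier in the thread:

# sanlock client status
# sanlock client rem_lockspace -s hosted-engine:1:/rhev/data-center/mnt/xion2\:_export_vm_he1/18eeab54-e482-497f-b096-11f8a43f94f4/ha_agent/hosted-engine.lockspace:0
# vdsClient -s 0 stopMonitoringDomain 003510e8-966a-47e6-a5eb-3b5c8a6070a9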
Re: [ovirt-users] sanlock + gluster recovery -- RFE
- Original Message - From: Ted Miller tmil...@hcjb.org To: users users@ovirt.org Sent: Tuesday, May 20, 2014 11:31:42 PM Subject: [ovirt-users] sanlock + gluster recovery -- RFE As you are aware, there is an ongoing split-brain problem with running sanlock on replicated gluster storage. Personally, I believe that this is the 5th time that I have been bitten by this sanlock+gluster problem. I believe that the following are true (if not, my entire request is probably off base). * ovirt uses sanlock in such a way that when the sanlock storage is on a replicated gluster file system, very small storage disruptions can result in a gluster split-brain on the sanlock space Although this is possible (at the moment) we are working hard to avoid it. The hardest part here is to ensure that the gluster volume is properly configured. The suggested configuration for a volume to be used with ovirt is: Volume Name: (...) Type: Replicate Volume ID: (...) Status: Started Number of Bricks: 1 x 3 = 3 Transport-type: tcp Bricks: (...three bricks...) Options Reconfigured: network.ping-timeout: 10 cluster.quorum-type: auto The two options ping-timeout and quorum-type are really important. You would also need a build where this bug is fixed in order to avoid any chance of a split-brain: https://bugzilla.redhat.com/show_bug.cgi?id=1066996 How did I get into this mess? ... What I would like to see in ovirt to help me (and others like me). Alternates listed in order from most desirable (automatic) to least desirable (set of commands to type, with lots of variables to figure out). The real solution is to avoid the split-brain altogether. At the moment it seems that using the suggested configurations and the bug fix we shouldn't hit a split-brain. 1. automagic recovery 2. recovery subcommand 3. script 4. commands I think that the commands to resolve a split-brain should be documented. I just started a page here: http://www.ovirt.org/Gluster_Storage_Domain_Reference Could you add your documentation there? Thanks! -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
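The two options called out above can be applied to an existing volume with (volume name is a placeholder):

# gluster volume set <VOLNAME> network.ping-timeout 10
# gluster volume set <VOLNAME> cluster.quorum-type auto
# gluster volume info <VOLNAME>        # verify under "Options Reconfigured"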
Re: [ovirt-users] sanlock + gluster recovery -- RFE
- Original Message - From: Giuseppe Ragusa giuseppe.rag...@hotmail.com To: fsimo...@redhat.com Cc: users@ovirt.org Sent: Wednesday, May 21, 2014 5:15:30 PM Subject: sanlock + gluster recovery -- RFE Hi, - Original Message - From: Ted Miller tmiller at hcjb.org To: users users at ovirt.org Sent: Tuesday, May 20, 2014 11:31:42 PM Subject: [ovirt-users] sanlock + gluster recovery -- RFE As you are aware, there is an ongoing split-brain problem with running sanlock on replicated gluster storage. Personally, I believe that this is the 5th time that I have been bitten by this sanlock+gluster problem. I believe that the following are true (if not, my entire request is probably off base). * ovirt uses sanlock in such a way that when the sanlock storage is on a replicated gluster file system, very small storage disruptions can result in a gluster split-brain on the sanlock space Although this is possible (at the moment) we are working hard to avoid it. The hardest part here is to ensure that the gluster volume is properly configured. The suggested configuration for a volume to be used with ovirt is: Volume Name: (...) Type: Replicate Volume ID: (...) Status: Started Number of Bricks: 1 x 3 = 3 Transport-type: tcp Bricks: (...three bricks...) Options Reconfigured: network.ping-timeout: 10 cluster.quorum-type: auto The two options ping-timeout and quorum-type are really important. You would also need a build where this bug is fixed in order to avoid any chance of a split-brain: https://bugzilla.redhat.com/show_bug.cgi?id=1066996 It seems that the aforementioned bug is peculiar to 3-bricks setups. I understand that a 3-bricks setup can allow proper quorum formation without resorting to first-configured-brick-has-more-weight convention used with only 2 bricks and quorum auto (which makes one node special, so not properly any-single-fault tolerant). Correct. But, since we are on ovirt-users, is there a similar suggested configuration for a 2-hosts setup oVirt+GlusterFS with oVirt-side power management properly configured and tested-working? I mean a configuration where any host can go south and oVirt (through the other one) fences it (forcibly powering it off with confirmation from IPMI or similar) then restarts HA-marked vms that were running there, all the while keeping the underlying GlusterFS-based storage domains responsive and readable/writeable (maybe apart from a lapse between detected other-node unresposiveness and confirmed fencing)? We already had a discussion with gluster asking if it was possible to add fencing to the replica 2 quorum/consistency mechanism. The idea is that as soon as you can't replicate a write you have to freeze all IO until either the connection is re-established or you know that the other host has been killed. Adding Vijay. -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] oVirt 3.2 - iSCSI offload (broadcom - bnx2i)
- Original Message - From: Ricardo Esteves ricardo.m.este...@gmail.com To: Federico Simoncelli fsimo...@redhat.com Cc: users@ovirt.org Sent: Wednesday, May 14, 2014 1:45:53 AM Subject: RE: [ovirt-users] oVirt 3.2 - iSCSI offload (broadcom - bnx2i) In attachment follows the defaults and the modified versions of the nodes files. If selecting the relevant host in Hosts you see the bnx2i interface in the Network Interfaces subtab, then you can try to: 1. create a new network in the Network tab, VM network checkbox should be disabled 2. select the relevant host in Hosts and use Setup Host Networks in the Network Interfaces subtab 3. configure the bnx2i interface and assign it to the new network you just created 4. in iSCSI Multipathing subtab of tab Data Center add a new entry where you bind the iscsi connection to the new network you created Ping me on IRC if you need more help. My nick is fsimonce on #ovirt -- Federico -Original Message- From: Federico Simoncelli [mailto:fsimo...@redhat.com] Sent: terça-feira, 13 de Maio de 2014 08:58 To: Ricardo Esteves Cc: users@ovirt.org Subject: Re: [ovirt-users] oVirt 3.2 - iSCSI offload (broadcom - bnx2i) - Original Message - From: Ricardo Esteves ricardo.m.este...@gmail.com To: users@ovirt.org Sent: Friday, April 11, 2014 1:07:31 AM Subject: [ovirt-users] oVirt 3.2 - iSCSI offload (broadcom - bnx2i) Hi, I've put my host on maintenance, then i configured iscsi offload for my broadcom cards changing target's file's (192.168.12.2,3260 and 192.168.12.4,3260) in my node iqn.1986-03.com.hp:storage.msa2324i.1226151a6 to use interface bnx2i.d8:d3:85:67:e3:bb, but after activating the host, configurations are back to default. Can you share the changes you made? Thanks. ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] oVirt 3.2 - iSCSI offload (broadcom - bnx2i)
- Original Message - From: Federico Simoncelli fsimo...@redhat.com To: Ricardo Esteves ricardo.m.este...@gmail.com Cc: users@ovirt.org Sent: Wednesday, May 14, 2014 3:47:58 PM Subject: Re: [ovirt-users] oVirt 3.2 - iSCSI offload (broadcom - bnx2i) - Original Message - From: Ricardo Esteves ricardo.m.este...@gmail.com To: Federico Simoncelli fsimo...@redhat.com Cc: users@ovirt.org Sent: Wednesday, May 14, 2014 1:45:53 AM Subject: RE: [ovirt-users] oVirt 3.2 - iSCSI offload (broadcom - bnx2i) In attachment follows the defaults and the modified versions of the nodes files. If selecting the relevant host in Hosts you see the bnx2i interface in the Network Interfaces subtab, then you can try to: Sorry I just noticed that you mentioned in the subject that you're using oVirt 3.2. What I suggested is available only since oVirt 3.4. -- Federico 1. create a new network in the Network tab, VM network checkbox should be disabled 2. select the relevant host in Hosts and use Setup Host Networks in the Network Interfaces subtab 3. configure the bnx2i interface and assign it to the new network you just created 4. in iSCSI Multipathing subtab of tab Data Center add a new entry where you bind the iscsi connection to the new network you created Ping me on IRC if you need more help. My nick is fsimonce on #ovirt -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] oVirt 3.2 - iSCSI offload (broadcom - bnx2i)
- Original Message - From: Ricardo Esteves ricardo.m.este...@gmail.com To: users@ovirt.org Sent: Friday, April 11, 2014 1:07:31 AM Subject: [ovirt-users] oVirt 3.2 - iSCSI offload (broadcom - bnx2i) Hi, I've put my host on maintenance, then i configured iscsi offload for my broadcom cards changing target's file's (192.168.12.2,3260 and 192.168.12.4,3260) in my node iqn.1986-03.com.hp:storage.msa2324i.1226151a6 to use interface bnx2i.d8:d3:85:67:e3:bb, but after activating the host, configurations are back to default. Can you share the changes you made? Thanks. -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Snapshot removal
- Original Message - From: Dafna Ron d...@redhat.com To: users@ovirt.org Sent: Thursday, April 17, 2014 12:52:04 PM Subject: Re: [ovirt-users] Snapshot removal NFS is wipe_after_delete=true always. so delete of snapshot will merge the data to upper level image + zero in on the data which is why this is taking a long time. I double checked this and I was very much surprised by this finding! We need a bz right away, Dafna can file it? Thanks! -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Sanlock log entry
- Original Message -
From: Maurice James mja...@media-node.com
To: users@ovirt.org
Sent: Saturday, April 12, 2014 3:06:37 PM
Subject: [ovirt-users] Sanlock log entry

> I'm seeing this about every 20 seconds in my sanlock.log. What does it mean?
>
> s1:r1694 resource a033c2ac-0d01-490c-9552-99ca53d6a64a:SDM:/rhev/data-center/mnt/ashtivh02.suprtekstic.com:_var_lib_exports_storage/a033c2ac-0d01-490c-9552-99ca53d6a64a/dom_md/leases:1048576 for 3,14,3960

Can you provide more info? Anything weird in the VDSM logs? (errors/warnings) Anything in the engine logs? That message may be related to acquiring the SPM role but it shouldn't happen so often.

-- Federico

___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [ovirt-users] Error creating Disks
- Original Message - From: Maurice James mja...@media-node.com To: d...@redhat.com Cc: users@ovirt.org Sent: Tuesday, April 15, 2014 6:54:11 PM Subject: Re: [ovirt-users] Error creating Disks Logs are attached. Live Migration failed In the engine logs I see: 2014-04-15 12:51:07,420 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand] (org.ovirt.thread.pool-6-thread-42) START, SnapshotVDSCommand(HostName = vhost3, HostId = bc9c25e6-714e-4eac-8af0-860ac76fd195, vmId=ba49605b-fb7e-4a70-a380-6286d3903e50), log id: e953f9a ... 2014-04-15 12:51:07,496 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand] (org.ovirt.thread.pool-6-thread-42) Command SnapshotVDSCommand(HostName = vhost3, HostId = bc9c25e6-714e-4eac-8af0-860ac76fd195, vmId=ba49605b-fb7e-4a70-a380-6286d3903e50) execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to SnapshotVDS, error = Snapshot failed, code = 48 but I can't find anything related to that in the vdsm log. Are you sure you attached the correct vdsm log? Are the hosts running centos or fedora? -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] oVirt 3.5 planning
- Original Message - From: Itamar Heim ih...@redhat.com To: Federico Alberto Sayd fs...@uncu.edu.ar, users@ovirt.org, Federico Simoncelli fsimo...@redhat.com Sent: Thursday, March 20, 2014 6:47:14 PM Subject: Re: [Users] oVirt 3.5 planning On 03/20/2014 07:30 PM, Federico Alberto Sayd wrote: 3 - Another question, when you convert a VM to template I see that it is created with preallocated disk even if the original VM had thinly provisioned disk. Is there no way to make the template with the same type of disk (thinly provisioned)?? on NFS - it doesn't matter. on block storage - i don't remember why. maybe federico remembers. It was for performance reasons since on block it would result in a qcow2 image. We wanted the access to the template to be as fast as possible. The same question was brought up few months ago when we discussed importing images from glance as templates. Anyway in that case we decided (for simplicity) to allow the import as template also for qcow2 images. -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] GSoC 14 Idea Discussion - virt-sparsify integration
- Original Message - From: Utkarsh Singh utkarshs...@gmail.com To: users@ovirt.org Cc: fsimo...@redhat.com Sent: Friday, March 7, 2014 6:16:46 PM Subject: GSoC 14 Idea Discussion - virt-sparsify integration Hello, I am Utkarsh, a 4th year undergrad from IIT Delhi and a GSoC-14 aspirant. I have been lately involved in an ongoing project Baadal Cloud Computing Platform in my institute, which has got me interested in oVirt for a potential GSoC project. I was going through the virt-sparsify integration project idea. I have gone through the architecture documentation on the oVirt website. As far as I understand, the virt-sparsify integration needs to be done on the VDSM daemon, and it's control is either going to be completely independent of ovirt-engine (for example running it once every 24 hours), or it's something that is centrally controlled by the ovirt-engine through XML/RPC calls. The details are not specified in the project ideas page. I would like to ask - The request to sparsifying the image is controlled by ovirt-engine. The user will pick one (or eventually more) disk(s) that are not in use (vm down) and he'll request to sparsify it/them. 1. What would be the proposed ideal implementation? (Central-Control or Independent-Control) Central-Control 2. Is virt-sparsify usage going to be automated or administrator-triggered, or a combination of both? administrator-triggered There are some aspects of the idea, which I would like to discuss before I start working on a proposal. It's not necessary that an automated usage of virt-sparsify is limited to any simple idea. Architecture documentation states that ovirt-engine has features like Monitoring that would allow administrators (and possibly users) to be aware of vm-guest performance as well as vm-host performance. I am not very sure about how this data is collected, Is it done through MoM, or Is this directly done by VDSM, or is someone else doing this (for hosts). It would be great if someone can explain that to me. This information about vm-guest usage and vm-host health can help in determining how virt-sparsify is to be used. The vm/hosts statistics are gathered and provided by VDSM. Anyway I would leave this part out at the moment. The virt-sparsify command is a long running task and in the current architecture it can be only an SPM task. There is some ongoing work to remove the pool and the SPM (making virt-sparsify operable by any host) but I wouldn't block on that. I am also not very clear about the Shared Storage component in the architecture. Does oVirt make any assumptions about the Shared Storage. For example, the performance difference between running virt-sparsify on NFS as compared to running it (if possible) directly on storage hardware. If the Storage solution is necessarily a NAS instance, then virt-sparsify on NFS mount is the only option. The storage connections are already managed by vdsm and the volume chains are maintained transparently in /rhev/data-center/... There are few differences between image files on NFS/Gluster and images stored on LVs but with regard to how to reach them it is transparent (similar path). Right now, I am in the process of setting up oVirt on my system, and getting more familiar with the architecture. Regarding my experience. I am acquainted with both Java and Python. I have little experience with JBoss, but I have worked on some other Web Application Servers like web2py and Play Framework. 
My involvement in Baadal Platform has got me acquainted with libvirt/QEMU, the details of which I have mentioned below (if anyone is interested). Depending on the amount of time that you can dedicate to this project it seems that you could tackle both the vdsm and ovirt-engine parts. -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
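For context, the tool the project would wrap is virt-sparsify from libguestfs-tools; it copies the image, so it needs the VM to be down and enough scratch/destination space. Basic usage is:

$ virt-sparsify /path/to/indisk.img /path/to/outdisk.img
$ virt-sparsify --convert qcow2 indisk.raw outdisk.qcow2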
Re: [Users] Volume Group does not exist. Blame device-mapper ?
- Original Message - From: Nicolas Ecarnot nico...@ecarnot.net To: Federico Simoncelli fsimo...@redhat.com Cc: users users@ovirt.org Sent: Monday, February 17, 2014 10:14:56 AM Subject: Re: [Users] Volume Group does not exist. Blame device-mapper ? Le 14/02/2014 15:39, Federico Simoncelli a écrit : Hi Nicolas, are you still able to reproduce this issue? Are you using fedora or centos? If providing the logs is problematic for you could you try to ping me on irc (fsimonce on #ovirt OFTC) so that we can work on the issue together? Thanks, Hi Frederico, Since I haven't changed anything related to the SAN or the network, I'm pretty sure I'll be able to reproduce the bug. We are using CentOS. I can provide the logs, no issue. This week, our oVirt setup will be strongly used, so this is not the better time to play with it. I'm very thankful you took the time to answer, but may I delay my answer about this bug to next week? Ok, no problem. Feel free to contact me on IRC when you start testing. -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] Volume Group does not exist. Blame device-mapper ?
Hi Nicolas, are you still able to reproduce this issue? Are you using fedora or centos? If providing the logs is problematic for you could you try to ping me on irc (fsimonce on #ovirt OFTC) so that we can work on the issue together? Thanks, -- Federico - Original Message - From: Nicolas Ecarnot nico...@ecarnot.net To: users users@ovirt.org Sent: Monday, January 20, 2014 11:06:21 AM Subject: [Users] Volume Group does not exist. Blame device-mapper ? Hi, oVirt 3.3, no big issue since the recent snapshot joke, but all in all running fine. All my VM are stored in a iSCSI SAN. The VM usually are using only one or two disks (1: system, 2: data) and it is OK. Friday, I created a new LUN. Inside a VM, I linked to it via iscsiadm and successfully login to the Lun (session, automatic attach on boot, read, write) : nice. Then after detaching it and shuting down the MV, and for the first time, I tried to make use of the feature direct attach to attach the disk directly from oVirt, login the session via oVirt. I connected nice and I saw the disk appear in my VM as /dev/sda or whatever. I was able to mount it, read and write. Then disaster stoke all this : many nodes suddenly began to become unresponsive, quickly migrating their VM to the remaining nodes. Hopefully, the migrations ran fine and I lost no VM nor downtime, but I had to reboot every concerned node (other actions failed). In the failing nodes, /var/log/messages showed the log you can read in the end of this message. I first get device-mapper warnings, then the host unable to collaborate with the logical volumes. The 3 volumes are the three main storage domains, perfectly up and running where I store my oVirt VMs. My reflexions : - I'm not sure device-mapper is to blame. I frequently see device mapper complaining and nothing is getting worse (not oVirt specifically) - I have not change my network settings for months (bonding, linking...) The only new factor is the usage of direct attach LUN. - This morning I was able to reproduce the bug, just by trying again this attachement, and booting the VM. No mounting of the LUN, just VM booting, waiting, and this is enough to crash oVirt. - when the disaster happens, usually, amongst the nodes, only three nodes gets stroke, the only one that run VMs. Obviously, after migration, different nodes are hosting the VMs, and those new nodes are the one that then get stroke. This is quite reproductible. And frightening. ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[Users] Google Summer of Code 2014
Hi everyone, I started a wiki page to list ideas for Google Summer of Code 2014: http://www.ovirt.org/Summer_of_Code The deadline for the submission is really soon (14th of Feb) but please feel free to try and add any idea that you may have. For more information about Google Summer of Code please refer to: https://developers.google.com/open-source/soc/ If you can't edit the wiki page please follow up to this thread with your proposals. Thanks, -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] Data Center stuck between Non Responsive and Contending
- Original Message - From: Itamar Heim ih...@redhat.com To: Ted Miller tmil...@hcjb.org, users@ovirt.org, Federico Simoncelli fsimo...@redhat.com Cc: Allon Mureinik amure...@redhat.com Sent: Sunday, January 26, 2014 11:17:04 PM Subject: Re: [Users] Data Center stuck between Non Responsive and Contending On 01/27/2014 12:00 AM, Ted Miller wrote: On 1/26/2014 4:00 PM, Itamar Heim wrote: On 01/26/2014 10:51 PM, Ted Miller wrote: On 1/26/2014 3:10 PM, Itamar Heim wrote: On 01/26/2014 10:08 PM, Ted Miller wrote: is this gluster storage (guessing sunce you mentioned a 'volume') yes (mentioned under setup above) does it have a quorum? Volume Name: VM2 Type: Replicate Volume ID: 7bea8d3b-ec2a-4939-8da8-a82e6bda841e Status: Started Number of Bricks: 1 x 3 = 3 Transport-type: tcp Bricks: Brick1: 10.41.65.2:/bricks/01/VM2 Brick2: 10.41.65.4:/bricks/01/VM2 Brick3: 10.41.65.4:/bricks/101/VM2 Options Reconfigured: cluster.server-quorum-type: server storage.owner-gid: 36 storage.owner-uid: 36 auth.allow: * user.cifs: off nfs.disa (there were reports of split brain on the domain metadata before when no quorum exist for gluster) after full heal: [root@office4a ~]$ gluster volume heal VM2 info Gathering Heal info on volume VM2 has been successful Brick 10.41.65.2:/bricks/01/VM2 Number of entries: 0 Brick 10.41.65.4:/bricks/01/VM2 Number of entries: 0 Brick 10.41.65.4:/bricks/101/VM2 Number of entries: 0 [root@office4a ~]$ gluster volume heal VM2 info split-brain Gathering Heal info on volume VM2 has been successful Brick 10.41.65.2:/bricks/01/VM2 Number of entries: 0 Brick 10.41.65.4:/bricks/01/VM2 Number of entries: 0 Brick 10.41.65.4:/bricks/101/VM2 Number of entries: 0 noticed this in host /var/log/messages (while looking for something else). Loop seems to repeat over and over. 
Jan 26 15:35:52 office4a sanlock[3763]: 2014-01-26 15:35:52-0500 14678 [30419]: read_sectors delta_leader offset 512 rv -90 /rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids Jan 26 15:35:53 office4a sanlock[3763]: 2014-01-26 15:35:53-0500 14679 [3771]: s1997 add_lockspace fail result -90 Jan 26 15:35:58 office4a vdsm TaskManager.Task ERROR Task=`89885661-88eb-4ea3-8793-00438735e4ab`::Unexpected error#012Traceback (most recent call last):#012 File /usr/share/vdsm/storage/task.py, line 857, in _run#012 return fn(*args, **kargs)#012 File /usr/share/vdsm/logUtils.py, line 45, in wrapper#012res = f(*args, **kwargs)#012 File /usr/share/vdsm/storage/hsm.py, line 2111, in getAllTasksStatuses#012allTasksStatus = sp.getAllTasksStatuses()#012 File /usr/share/vdsm/storage/securable.py, line 66, in wrapper#012 raise SecureError()#012SecureError Jan 26 15:35:59 office4a sanlock[3763]: 2014-01-26 15:35:59-0500 14686 [30495]: read_sectors delta_leader offset 512 rv -90 /rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids Jan 26 15:36:00 office4a sanlock[3763]: 2014-01-26 15:36:00-0500 14687 [3772]: s1998 add_lockspace fail result -90 Jan 26 15:36:00 office4a vdsm TaskManager.Task ERROR Task=`8db9ff1a-2894-407a-915a-279f6a7eb205`::Unexpected error#012Traceback (most recent call last):#012 File /usr/share/vdsm/storage/task.py, line 857, in _run#012 return fn(*args, **kargs)#012 File /usr/share/vdsm/storage/task.py, line 318, in run#012return self.cmd(*self.argslist, **self.argsdict)#012 File /usr/share/vdsm/storage/sp.py, line 273, in startSpm#012 self.masterDomain.acquireHostId(self.id)#012 File /usr/share/vdsm/storage/sd.py, line 458, in acquireHostId#012 self._clusterLock.acquireHostId(hostId, async)#012 File /usr/share/vdsm/storage/clusterlock.py, line 189, in acquireHostId#012raise se.AcquireHostIdFailure(self._sdUUID, e)#012AcquireHostIdFailure: Cannot acquire host id: ('0322a407-2b16-40dc-ac67-13d387c6eb4c', SanlockException(90, 'Sanlock lockspace add failure', 'Message too long')) fede - thoughts on above? (vojtech reported something similar, but it sorted out for him after some retries) Something truncated the ids file, as also reported by: [root@office4a ~]$ ls /rhev/data-center/mnt/glusterSD/10.41.65.2\:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ -l total 1029 -rw-rw 1 vdsm kvm 0 Jan 22 00:44 ids -rw-rw 1 vdsm kvm 0 Jan 16 18:50 inbox -rw-rw 1 vdsm kvm 2097152 Jan 21 18:20 leases -rw-r--r-- 1 vdsm kvm 491 Jan 21 18:20 metadata -rw-rw 1 vdsm kvm 0 Jan 16 18:50 outbox In the past I saw that happening because of a glusterfs bug: https://bugzilla.redhat.com/show_bug.cgi?id=862975 Anyway in general it seems that glusterfs is not always able to reconcile the ids file (as it's written by all the hosts at the same time). Maybe someone from gluster can identify easily what happened. Meanwhile if you
[Users] oVirt 3.4 test day - Template Versions
Feature tested: http://www.ovirt.org/Features/Template_Versions
- create a new vm vm1 and make a template template1 from it
- create a new vm vm2 based on template1 and make some changes
- upgrade to 3.4
- create a new template template1.1 from vm2
- create a new vm vm3 from template1 (clone) - content ok
- create a new vm vm4 from template1.1 (thin) - content ok
- create a new vm vm5 from template1 last (thin) - content ok (same as 1.1)
- try to remove template1 (failed as template1.1 is still present)
- try to remove template1.1 (failed as vm5 is still present)
- create a new vm vm6 and make a template blank1.1 as new version of the blank template (succeeded)
- create a vm pool vmpool1 with the latest template from template1
- create a vm pool vmpool2 with the template1.1 (last) template from template1
- start vmpool1 and vmpool2 and verify that the content is the same
- create a new template template1.2
- start vmpool1 and verify that the content is the same as latest (template1.2)
- start vmpool2 and verify that the content is the same as template1.1
Suggestions:
- the template blank is special, I am not sure if allowing versioning may be confusing (for example, it is not even editable)
- as far as I can see the Sub Version Name is not editable anymore (after picking it)
-- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] Data Center stuck between Non Responsive and Contending
- Original Message - From: Ted Miller tmil...@hcjb.org To: Federico Simoncelli fsimo...@redhat.com, Itamar Heim ih...@redhat.com Cc: users@ovirt.org Sent: Monday, January 27, 2014 7:16:14 PM Subject: Re: [Users] Data Center stuck between Non Responsive and Contending On 1/27/2014 3:47 AM, Federico Simoncelli wrote: Maybe someone from gluster can identify easily what happened. Meanwhile if you just want to repair your data-center you could try with: $ cd /rhev/data-center/mnt/glusterSD/10.41.65.2\:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ $ touch ids $ sanlock direct init -s 0322a407-2b16-40dc-ac67-13d387c6eb4c:0:ids:1048576 Federico, I won't be able to do anything to the ovirt setup for another 5 hours or so (it is a trial system I am working on at home, I am at work), but I will try your repair script and report back. In bugzilla 862975 they suggested turning off write-behind caching and eager locking on the gluster volume to avoid/reduce the problems that come from many different computers all writing to the same file(s) on a very frequent basis. If I interpret the comment in the bug correctly, it did seem to help in that situation. My situation is a little different. My gluster setup is replicate only, replica 3 (though there are only two hosts). I was not stress-testing it, I was just using it, trying to figure out how I can import some old VMWare VMs without an ESXi server to run them on. Have you done anything similar to what is described here in comment 21? https://bugzilla.redhat.com/show_bug.cgi?id=859589#c21 When did you realize that you weren't able to use the data-center anymore? Can you describe exactly what you did and what happened, for example: 1. I created the data center (up and running) 2. I tried to import some VMs from VMWare 3. During the import (or after it) the data-center went in the contending state ... Did something special happened? I don't know, power loss, split-brain? For example also an excessive load on one of the servers could have triggered a timeout somewhere (forcing the data-center to go back in the contending state). Could you check if any host was fenced? (Forcibly rebooted) I am guessing that what makes cluster storage have the (Master) designation is that this is the one that actually contains the sanlocks? If so, would it make sense to set up a gluster volume to be (Master), but not use it for VM storage, just for storing the sanlock info? Separate gluster volume(s) could then have the VMs on it(them), and would not need the optimizations turned off. Any domain must be able to become the master at any time. Without a master the data center is unusable (at the present time), that's why we migrate (or reconstruct) it on another domain when necessary. -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
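A minimal follow-up sketch for verifying the repaired lockspace, reusing the paths quoted above; the ownership fix-up assumes the usual vdsm uid/gid of 36 (as in the gluster volume options earlier in this thread) and is only needed if the repair commands were run as root:

$ cd /rhev/data-center/mnt/glusterSD/10.41.65.2\:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/
$ ls -l ids                  # should be back to a 1 MiB file after the init (it was 0 bytes before)
$ chown 36:36 ids            # restore vdsm:kvm ownership if touch/init were run as root
$ sanlock direct dump ids    # hosts reappear here once they re-acquire the lockspace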
Re: [Users] oVirt 3.4 test day - Template Versions
- Original Message - From: Omer Frenkel ofren...@redhat.com To: Federico Simoncelli fsimo...@redhat.com Cc: oVirt Users List users@ovirt.org, Itamar Heim ih...@redhat.com Sent: Monday, January 27, 2014 4:31:56 PM Subject: Re: oVirt 3.4 test day - Template Versions Thanks for the feedback! much appreciated. - Original Message - From: Federico Simoncelli fsimo...@redhat.com To: oVirt Users List users@ovirt.org Cc: Omer Frenkel ofren...@redhat.com, Itamar Heim ih...@redhat.com Sent: Monday, January 27, 2014 5:12:38 PM Subject: oVirt 3.4 test day - Template Versions Feature tested: http://www.ovirt.org/Features/Template_Versions - create a new vm vm1 and make a template template1 from it - create a new vm vm2 based on template1 and make some changes - upgrade to 3.4 - create a new template template1.1 from vm2 - create a new vm vm3 from template1 (clone) - content ok - create a new vm vm4 from template1.1 (thin) - content ok - create a new vm vm5 from template1 last (thin) - content ok (same as 1.1) - try to remove template1 (failed as template1.1 is still present) - try to remove template1.1 (failed as vm5 is still present) - create a new vm vm6 and make a template blank1.1 as new version of the blank template (succeeded) - create a vm pool vmpool1 with the latest template from template1 - create a vm pool vmpool2 with the template1.1 (last) template from template1 - start vmpool1 and vmpool2 and verify that the content is the same - create a new template template1.2 - start vmpool1 and verify that the content is the same as latest (template1.2) - start vmpool2 and verify that the content is the same as template1.1 Suggestions: - the template blank is special, I am not sure if allowing versioning may be confusing (for example is not even editable) right, i also thought about this, and my thought was not to block the user from doing this, but if it was confusing we better block it. - as far as I can see the Sub Version Name is not editable anymore (after picking it) thanks, i see its missing in the UI, do you care to open a bug on that? https://bugzilla.redhat.com/show_bug.cgi?id=1058501 -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] wiki site out of date?
- Original Message - From: Sven Kieske s.kie...@mittwald.de To: Users@ovirt.org List Users@ovirt.org Sent: Monday, January 13, 2014 12:07:20 PM Subject: [Users] wiki site out of date? Hi, is this feature page up to date? http://www.ovirt.org/Features/Online_Virtual_Drive_Resize specifically the point: QEMU-GA support for notifying the guest and updating the size of the visible disk: To be integrated That actually meant triggering the partition/lvm/filesystem resize automatically (in the guest), so that you don't have to do it manually. because we used online drive resize and it does show up in ubuntu based vm, without qemu-ga installed? -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
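For reference, these are the manual in-guest steps that the qemu-ga integration was meant to automate. This is only a sketch, assuming a guest whose root filesystem sits on LVM on /dev/vda2 with a vg0/root logical volume; adjust device names to the real layout, and use xfs_growfs instead of resize2fs for XFS:

# inside the guest, after the virtual disk has been extended in oVirt
growpart /dev/vda 2                      # grow the partition (from cloud-utils-growpart)
pvresize /dev/vda2                       # make LVM see the new space
lvextend -l +100%FREE /dev/vg0/root      # extend the logical volume into the free space
resize2fs /dev/vg0/root                  # grow an ext3/ext4 filesystem online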
Re: [Users] Broken Snapshots
- Original Message - From: Maurice James midnightst...@msn.com To: d...@redhat.com, Leonid Natapov lnata...@redhat.com Cc: eduardo Warszawski ewars...@redhat.com, Federico Simoncelli fsimo...@redhat.com, Liron Aravot lara...@redhat.com, users@ovirt.org Sent: Thursday, January 2, 2014 2:04:06 PM Subject: RE: [Users] Broken Snapshots When I get home from work. I will attempt to delete the snapshot again then send the vdsm.log to you guys. Thanks Date: Thu, 2 Jan 2014 12:58:47 + From: d...@redhat.com To: lnata...@redhat.com CC: midnightst...@msn.com; ewars...@redhat.com; fsimo...@redhat.com; lara...@redhat.com; users@ovirt.org Subject: Re: [Users] Broken Snapshots Leo, there are two issues here: 1. I want to know what happened in Maurice's environment in the first place (why is the snapshot broken). 2. help with a workaround so that Maurice can delete the broken snapshots and continue working. At least for me this is hard to track, can we open a bug and attach the logs there? Are you using fedora or centos? On centos there's a known issue: https://bugzilla.redhat.com/show_bug.cgi?id=1009100 -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] Broken Snapshots
- Original Message - From: Leonid Natapov lnata...@redhat.com To: Federico Simoncelli fsimo...@redhat.com Cc: Maurice James midnightst...@msn.com, d...@redhat.com, eduardo Warszawski ewars...@redhat.com, Liron Aravot lara...@redhat.com, users@ovirt.org Sent: Thursday, January 2, 2014 2:09:13 PM Subject: Re: [Users] Broken Snapshots Fede,there is an open BZ about that. See if it's the same https://bugzilla.redhat.com/show_bug.cgi?id=996945 That's the second part (deleting the broken snapshot), I'm trying to understand why the snapshot failed. -- Federico - Original Message - From: Federico Simoncelli fsimo...@redhat.com To: Maurice James midnightst...@msn.com Cc: d...@redhat.com, Leonid Natapov lnata...@redhat.com, eduardo Warszawski ewars...@redhat.com, Liron Aravot lara...@redhat.com, users@ovirt.org Sent: Thursday, January 2, 2014 3:07:53 PM Subject: Re: [Users] Broken Snapshots - Original Message - From: Maurice James midnightst...@msn.com To: d...@redhat.com, Leonid Natapov lnata...@redhat.com Cc: eduardo Warszawski ewars...@redhat.com, Federico Simoncelli fsimo...@redhat.com, Liron Aravot lara...@redhat.com, users@ovirt.org Sent: Thursday, January 2, 2014 2:04:06 PM Subject: RE: [Users] Broken Snapshots When I get home from work. I will attempt to delete the snapshot again then send the vdsm.log to you guys. Thanks Date: Thu, 2 Jan 2014 12:58:47 + From: d...@redhat.com To: lnata...@redhat.com CC: midnightst...@msn.com; ewars...@redhat.com; fsimo...@redhat.com; lara...@redhat.com; users@ovirt.org Subject: Re: [Users] Broken Snapshots Leo, there are two issues here: 1. I want to know what happened in Maurice's environment in the first place (why is the snapshot broken). 2. help with a workaround so that Maurice can delete the broken snapshots and continue working. At least for me this is hard to track, can we open a bug and attach the logs there? Are you using fedora or centos? On centos there's a known issue: https://bugzilla.redhat.com/show_bug.cgi?id=1009100 -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] AcquireHostId problem
- Original Message - From: Pascal Jakobi pascal.jak...@gmail.com To: Federico Simoncelli fsimo...@redhat.com Cc: users@ovirt.org Sent: Saturday, December 21, 2013 12:37:51 PM Subject: Re: [Users] AcquireHostId problem Nope. Nothing more in logs. My guess is that the timeout problem generates the error. However, in reality if you run mount, you have the target partitions mounted If you still have a problem and something is not working there's an error somewhere and we only have to find it. Look in the engine logs and in the vdsm logs for any error (not only Traceback but also ERROR). Try to describe with more details what you're trying to do, what you expect to happen and what is happening instead. Therefore, I guess the problem is to understand why dev/watchdog0 failed to set timeout The watchdog provided by your laptop is not working properly or it's not able to set the timeout we need. You inserted the softdog module and wdmd is now up and running as you reported with: 2013/12/21 Federico Simoncelli fsimo...@redhat.com - Original Message - From: Pascal Jakobi pascal.jak...@gmail.com To: Federico Simoncelli fsimo...@redhat.com, users@ovirt.org Sent: Friday, December 20, 2013 11:54:21 PM Subject: Re: [Users] AcquireHostId problem Dec 20 23:43:59 lab2 kernel: [183033.639261] softdog: Software Watchdog Timer: 0.08 initialized. soft_noboot=0 soft_margin=60 sec soft_panic=0 (nowayout=0) Dec 20 23:44:11 lab2 systemd[1]: Starting Watchdog Multiplexing Daemon... Dec 20 23:44:11 lab2 wdmd[25072]: wdmd started S0 H1 G179 Dec 20 23:44:11 lab2 systemd-wdmd[25066]: Starting wdmd: [ OK ] Dec 20 23:44:11 lab2 wdmd[25072]: /dev/watchdog0 failed to set timeout Dec 20 23:44:11 lab2 wdmd[25072]: /dev/watchdog0 disarmed Dec 20 23:44:11 lab2 wdmd[25072]: /dev/watchdog1 armed with fire_timeout 60 Dec 20 23:44:11 lab2 systemd[1]: Started Watchdog Multiplexing Daemon. So as far as I can see this part is working. -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] AcquireHostId problem
- Original Message - From: Pascal Jakobi pascal.jak...@gmail.com To: users@ovirt.org Sent: Monday, December 23, 2013 4:33:23 PM Subject: Re: [Users] AcquireHostId problem Ok. What I am doing is just adding a new NFS domain that fails : Failed to add Storage Domain DataLab2. (User: admin@internal) And I thought that the /dev/watchdog0 failed to set timeout msg was signalling an error. No, that's just the attempt to use the laptop watchdog, but then it falls back to the softdog one: Dec 20 23:44:11 lab2 wdmd[25072]: /dev/watchdog1 armed with fire_timeout 60 Dec 20 23:44:11 lab2 systemd[1]: Started Watchdog Multiplexing Daemon. -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] AcquireHostId problem
- Original Message - From: Pascal Jakobi pascal.jak...@gmail.com To: users@ovirt.org Sent: Monday, December 23, 2013 5:34:55 PM Subject: Re: [Users] AcquireHostId problem Here is the message I get on the console : Error while executing action Attach Storage Domain: AcquireHostIdFailure The software seems to go pretty far : it reaches the locked state before failing. In engine.log 2013-12-23 16:56:49,497 ERROR [org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand] (ajp--127.0.0.1-8702-2) Command org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand throw Vdc Bll exception. With error message VdcBLLException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to CreateStoragePoolVDS, error = Cannot acquire host id: ('8c626a3f-5846-434e-83d8-6238e1ff9e03', SanlockException(-203, 'Sanlock lockspace add failure', 'Sanlock exception')) (Failed with error AcquireHostIdFailure and code 661) Can this help? What do the vdsm logs show? Is this the same host where wdmd was up and running or another one? If you restarted your laptop and you didn't persist the module loading (following the instruction in one of my previous emails) you'll end up with the same problem every time. -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
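The module persistence referred to above is not spelled out in this part of the thread; a minimal sketch of one way to do it on Fedora with systemd follows (the file name softdog.conf is an arbitrary choice):

echo softdog > /etc/modules-load.d/softdog.conf   # load the module automatically at every boot
modprobe softdog                                  # load it now without rebooting
systemctl restart wdmd sanlock                    # let wdmd and sanlock pick up the watchdog device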
Re: [Users] AcquireHostId problem
- Original Message - From: Pascal Jakobi pascal.jak...@gmail.com To: Federico Simoncelli fsimo...@redhat.com Cc: users@ovirt.org, David Teigland teigl...@redhat.com Sent: Friday, December 20, 2013 7:19:52 AM Subject: Re: [Users] AcquireHostId problem Here you go ! I am running F19 on a Lenovo S30. Thxs Thanks, can you open a bug on this issue? (Attach also the files to the bug). I suppose it will be later split into different ones, one for the failing watchdog device and maybe an RFE to wdmd to automatically load the softdog if there are no usable watchdog devices. -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] AcquireHostId problem
- Original Message - From: Pascal Jakobi pascal.jak...@gmail.com To: Federico Simoncelli fsimo...@redhat.com, users@ovirt.org Sent: Friday, December 20, 2013 11:54:21 PM Subject: Re: [Users] AcquireHostId problem Dec 20 23:43:59 lab2 kernel: [183033.639261] softdog: Software Watchdog Timer: 0.08 initialized. soft_noboot=0 soft_margin=60 sec soft_panic=0 (nowayout=0) Dec 20 23:44:11 lab2 systemd[1]: Starting Watchdog Multiplexing Daemon... Dec 20 23:44:11 lab2 wdmd[25072]: wdmd started S0 H1 G179 Dec 20 23:44:11 lab2 systemd-wdmd[25066]: Starting wdmd: [ OK ] Dec 20 23:44:11 lab2 wdmd[25072]: /dev/watchdog0 failed to set timeout Dec 20 23:44:11 lab2 wdmd[25072]: /dev/watchdog0 disarmed Dec 20 23:44:11 lab2 wdmd[25072]: /dev/watchdog1 armed with fire_timeout 60 Dec 20 23:44:11 lab2 systemd[1]: Started Watchdog Multiplexing Daemon. Dec 20 23:45:33 lab2 rpc.mountd[2819]: authenticated mount request from 192.168.1.41:994 for /home/vdsm/data (/home/vdsm/data) Dec 20 23:45:39 lab2 rpc.mountd[2819]: authenticated mount request from 192.168.1.41:954 for /home/vdsm/data (/home/vdsm/data) Seems to work a bit. However I still get unable to attach storage when creating a domain It is probably a different error now. Anything interesting in vdsm.log? -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] AcquireHostId problem
- Original Message - From: Pascal Jakobi pascal.jak...@gmail.com To: users@ovirt.org Sent: Wednesday, December 18, 2013 9:32:00 PM Subject: Re: [Users] AcquireHostId problem sanlock.log : 2013-12-18 21:23:32+0100 1900 [867]: s1 lockspace b2d69b22-a8b8-466c-bf1f-b6e565228238:250:/rhev/data-center/mnt/lab2.home:_home_vdsm_data/b2d69b22-a8b8-466c-bf1f-b6e565228238/dom_md/ids:0 2013-12-18 21:23:52+0100 1920 [4238]: s1 wdmd_connect failed -111 2013-12-18 21:23:52+0100 1920 [4238]: s1 create_watchdog failed -1 2013-12-18 21:23:53+0100 1921 [867]: s1 add_lockspace fail result -203 Hi Pascal, is wdmd up and running? # ps aux | grep wdmd root 1650 0.0 0.2 13552 3320 ?SLs 03:49 0:00 wdmd -G sanlock -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] AcquireHostId problem
- Original Message - From: Pascal Jakobi pascal.jak...@gmail.com To: Federico Simoncelli fsimo...@redhat.com Cc: users@ovirt.org Sent: Thursday, December 19, 2013 4:05:07 PM Subject: Re: [Users] AcquireHostId problem Federico One may suspect wdmd isn't running as wdmd_connect failed (see sanlock.log). I have run a ps command - no wdmd Any idea why wdmd_connect might fail? Are you using fedora? What is the version of sanlock? Do you see any information about wdmd in /var/log/messages? Is the wdmd service started? # service wdmd status # systemctl status wdmd.service Thanks, -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] Cinder Integration
- Original Message - From: Itamar Heim ih...@redhat.com To: Udaya Kiran P ukiran...@yahoo.in, users@ovirt.org, Oved Ourfalli oourf...@redhat.com, Federico Simoncelli fsimo...@redhat.com Sent: Tuesday, November 19, 2013 12:42:05 PM Subject: Re: [Users] Cinder Integration On 11/19/2013 06:00 AM, Udaya Kiran P wrote: Hi All, I want to consume the oVirt Storage Domains in OpenStack Cinder. Is this driver available or are there any resources pointing on how this can be done? Federico - was that your sample driver or Oved's? It was Oved's. I have the feeling that I did some research in that area as well but then I moved to glance. -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] Urgent: Export NFS Migration issue oVirt 3.0 - 3.2.1
- Original Message - From: Sven Knohsalla s.knohsa...@netbiscuits.com To: users@ovirt.org Sent: Friday, November 8, 2013 10:32:32 AM Subject: Re: [Users] Urgent: Export NFS Migration issue oVirt 3.0 - 3.2.1 Hi, I could eliminate this issue to our oVirt 3.0 instance, as the pool_uuid SHA checksum in metadata on NFS Export wasn't cleared properly from engine 3.0. (/NFSmountpoint/storage-pool-id/dom_md/metadata) Hi Sven, can you send the original metadata and the relevant vdsm logs? If I read your engine logs correctly we need the vdsm logs from the host deovn-a01 (vds id 66b546c2-ae62-11e1-b734-5254005cbe44) around the same time this was issued: 2013-11-08 08:12:49,075 INFO [org.ovirt.engine.core.bll.storage.ConnectStorageToVdsCommand] (pool-5-thread-39) Running command: ConnectStorageToVdsCommand internal: true. Entities affected : ID: aaa0----123456789aaa Type: System 2013-11-08 08:12:49,079 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (pool-5-thread-39) START, ConnectStorageServerVDSCommand(vdsId = 66b546c2-ae62-11e1-b734-5254005cbe44, storagePoolId = ----, storageType = NFS, connectionList = [{ id: 2a84acc3-1700-45c4-bbf7-a3305b338f83, connection: 172.16.101.95:/ovirtmig02 };]), log id: 2482b112 2013-11-08 08:12:52,092 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand] (pool-5-thread-50) FINISH, ConnectStorageServerVDSCommand, return: {2a84acc3-1700-45c4-bbf7-a3305b338f83=451}, log id: 7dcfb51f 2013-11-08 08:12:52,099 ERROR [org.ovirt.engine.core.bll.storage.NFSStorageHelper] (pool-5-thread-50) The connection with details 172.16.101.95:/ovirtmig02 failed because of error code 451 and error message is: error storage server connection Thanks, -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
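To see whether the exported domain still carries a stale pool reference, a hedged sketch of inspecting the metadata file mentioned above; the POOL_UUID and _SHA_CKSUM field names are assumptions based on the usual storage domain metadata layout and may differ between versions:

cat /NFSmountpoint/storage-pool-id/dom_md/metadata
grep -E '^(POOL_UUID|_SHA_CKSUM)=' /NFSmountpoint/storage-pool-id/dom_md/metadata   # a non-empty POOL_UUID here would mean the domain was never detached cleanly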
Re: [Users] Live storage migration snapshot removal (fails)
- Original Message - From: Dan Kenigsberg dan...@redhat.com To: Sander Grendelman san...@grendelman.com Cc: users@ovirt.org, fsimo...@redhat.com, ykap...@redhat.com Sent: Friday, November 8, 2013 4:06:53 PM Subject: Re: [Users] Live storage migration snapshot removal (fails) On Fri, Nov 08, 2013 at 02:20:39PM +0100, Sander Grendelman wrote: snip 9d4e8a43-4851-42ff-a684-f3d802527cf7/c512267d-ebba-4907-a782-fec9b6c95116 52178458-1764-4317-b85b-71843054aae9::WARNING::2013-11-08 14:02:53,772::image::1164::Storage.Image::(merge) Auto shrink after merge failed Traceback (most recent call last): File /usr/share/vdsm/storage/image.py, line 1162, in merge srcVol.shrinkToOptimalSize() File /usr/share/vdsm/storage/blockVolume.py, line 320, in shrinkToOptimalSize qemuImg.FORMAT.QCOW2) File /usr/lib64/python2.6/site-packages/vdsm/qemuImg.py, line 109, in check raise QImgError(rc, out, err, unable to parse qemu-img check output) QImgError: ecode=0, stdout=['No errors were found on the image.'], stderr=[], message=unable to parse qemu-img check output I'm not sure that it's the only problem in this flow, but there's a clear bug in lib/vdsm/qemuImg.py's check() function: it fails to parse the output of qemu-img. Would you open a bug on that? I found no open one. I remember that this was discussed and the agreement was that if the offset is not reported by qemu-img we should have used the old method to calculate the new volume size. We'll probably need to verify it. Sander can you open a bug on this? Thanks, -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
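The value that check() in lib/vdsm/qemuImg.py fails to find is the image end offset; a hedged sketch of reproducing the situation by hand (the volume path is a placeholder, the exact pattern vdsm matches may differ, and whether the offset line is printed at all depends on the qemu-img version, which is exactly what the parser trips over):

qemu-img check -f qcow2 /dev/<vg_uuid>/<lv_uuid>
# on some qemu builds the output ends with a line such as:
#   Image end offset: 1638400
qemu-img check -f qcow2 /dev/<vg_uuid>/<lv_uuid> | grep 'Image end offset'   # empty output reproduces the parse failure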
Re: [Users] oVirt Weekly Meeting Minutes -- 2013-10-09
- Original Message - From: Dan Kenigsberg dan...@redhat.com To: Mike Burns mbu...@redhat.com Cc: bo...@ovirt.org, users users@ovirt.org Sent: Wednesday, October 9, 2013 5:45:23 PM Subject: Re: [Users] oVirt Weekly Meeting Minutes -- 2013-10-09 On Wed, Oct 09, 2013 at 11:15:41AM -0400, Mike Burns wrote: Minutes: http://ovirt.org/meetings/ovirt/2013/ovirt.2013-10-09-14.06.html Minutes (text): http://ovirt.org/meetings/ovirt/2013/ovirt.2013-10-09-14.06.txt Log: http://ovirt.org/meetings/ovirt/2013/ovirt.2013-10-09-14.06.log.html = #ovirt: oVirt Weekly sync = Meeting started by mburns at 14:06:41 UTC. The full logs are available at http://ovirt.org/meetings/ovirt/2013/ovirt.2013-10-09-14.06.log.html . Meeting summary --- * agenda and roll call (mburns, 14:07:00) * 3.3 updates (mburns, 14:07:17) * 3.4 planning (mburns, 14:07:24) * conferences and workshops (mburns, 14:07:31) * infra update (mburns, 14:07:34) * 3.3 updates (mburns, 14:08:42) * 3.3.0.1 vdsm packages are posted to updates-testing (mburns, 14:09:04) * LINK: https://bugzilla.redhat.com/show_bug.cgi?id=1009100 (sbonazzo, 14:10:33) * 2 open bugs blocking 3.3.0.1 (mburns, 14:29:35) * 1 is deferred due to qemu-kvm feature set in el6 (mburns, 14:29:49) * other is allowed versions for vdsm (mburns, 14:30:01) * vdsm version bug will be backported to 3.3.0.1 today (mburns, 14:30:13) * ACTION: sbonazzo to build engine 3.3.0.1 tomorrow (mburns, 14:30:22) * ACTION: mburns to post 3.3.0.1 to ovirt.org tomorrow (mburns, 14:30:32) * expected release: next week (mburns, 14:30:46) * ACTION: danken and sbonazzo to provide release notes for 3.3.0.1 (mburns, 14:37:56) A vdsm bug (BZ#1007980) made it impossible to migrate or re-run a VM with a glusterfs-backed virtual disk if the VM was originally started with an empty cdrom. If you have encountered this bug, you would have to manually find the affected VMs with psql -U engine -d engine -c select distinct vm_name from vm_static, vm_device where vm_guid=vm_id and device='cdrom' and address ilike '%pci%'; and remove their junk cdrom address with psql -U engine -d engine -c update vm_device set address='' where device='cdrom' and address ilike '%pci%'; We are currently building vdsm-4.12.1-4 that is carrying a new critical fix related to drive resize (extension). Release notes: A vdsm bug introduced in a specific case of disk resize (raw on nfs) accidentally wipes the content of the virtual disk. The issue was masked on the master (and ovirt-3.3) branch by an unrelated change that happened to fix the problem leaving only vdsm-4.12 affected. It is of critical importance to update all your machines to the new vdsm release. -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] VM wont restart after some NFS snapshot restore.
Hi Usman, can you paste somewhere the content of the meta files? $ cat 039a8482-c267-4051-b1e6-1c1dee49b3d7.meta 8d48505d-846d-49a7-8b50-d972ee051145.meta could you also provide the absolute path to those files? (in the vdsm host) Thanks, -- Federico - Original Message - From: Usman Aslam us...@linkshift.com To: users@ovirt.org Sent: Thursday, October 3, 2013 4:29:43 AM Subject: [Users] VM wont restart after some NFS snapshot restore. I have some VM's that live on NFS share. Basically, I had to revert the VM disk to a backup from a few days ago. So I powered the VM down, copied over the following files 039a8482-c267-4051-b1e6-1c1dee49b3d7 039a8482-c267-4051-b1e6-1c1dee49b3d7.lease 039a8482-c267-4051-b1e6-1c1dee49b3d7.meta 8d48505d-846d-49a7-8b50-d972ee051145 8d48505d-846d-49a7-8b50-d972ee051145.lease 8d48505d-846d-49a7-8b50-d972ee051145.meta and now when I try to power the VM, it complains 2013-Oct-02, 22:02:38 Failed to run VM zabbix-prod-01 (User: admin@internal). 2013-Oct-02, 22:02:38 Failed to run VM zabbix-prod-01 on Host tss-tusk-ovirt-01-ovirtmgmt.tusk.tufts.edu . 2013-Oct-02, 22:02:38 VM zabbix-prod-01 is down. Exit message: 'truesize'. Any ideas on how I could resolve this? Perhaps a better way of approaching the restore on a filesystem level? I see the following the vsdm.log Thread-7843::ERROR::2013-10-02 22:02:37,548::vm::716::vm.Vm::(_startUnderlyingVm) vmId=`8e8764ad-6b4c-48d8-9a19-fa5cf77208ef`::The vm start process failed Traceback (most recent call last): File /usr/share/vdsm/vm.py, line 678, in _startUnderlyingVm self._run() File /usr/share/vdsm/libvirtvm.py, line 1467, in _run devices = self.buildConfDevices() File /usr/share/vdsm/vm.py, line 515, in buildConfDevices self._normalizeVdsmImg(drv) File /usr/share/vdsm/vm.py, line 408, in _normalizeVdsmImg drv['truesize'] = res['truesize'] KeyError: 'truesize' Thread-7843::DEBUG::2013-10-02 22:02:37,553::vm::1065::vm.Vm::(setDownStatus) vmId=`8e8764ad-6b4c-48d8-9a19-fa5cf77208ef`::Changed state to Down: 'truesize' Any help would be really nice, thanks! -- Usman ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] VM wont restart after some NFS snapshot restore.
Usman, the only thing that comes to my mind is something related to: http://gerrit.ovirt.org/13529 which means that in some way the restored volumes are either inaccessible (permissions?) or their metadata is corrupted (but it doesn't seem so). There is probably another Traceback in the logs that should give us more information. Could you post somewhere the entire vdsm log? Thanks. -- Federico - Original Message - From: Usman Aslam us...@linkshift.com To: Federico Simoncelli fsimo...@redhat.com Cc: users@ovirt.org Sent: Friday, October 4, 2013 4:49:08 PM Subject: Re: [Users] VM wont restart after some NFS snapshot restore. Federico, The files reside on this mount on the hypervisor /rhev/data-center/mnt/xyz-02.tufts.edu: _vol_tusk__vm_tusk__vm/fa3279ec-2912-45ac-b7bc-9fe89151ed99/images/79ccd989-3033-4e6a-80da-ba210c94225a and are symlinked as described below [root@xyz-02 430cd986-6488-403b-8d46-29abbc3eba38]# pwd /rhev/data-center/430cd986-6488-403b-8d46-29abbc3eba38 [root@xyz-02 430cd986-6488-403b-8d46-29abbc3eba38]# ll total 12 lrwxrwxrwx 1 vdsm kvm 120 Oct 3 12:35 ee2ae498-6e45-448d-8f91-0efca377dcf6 - /rhev/data-center/mnt/xyz-02.tufts.edu: _vol_tusk__iso_tusk__iso/ee2ae498-6e45-448d-8f91-0efca377dcf6 lrwxrwxrwx 1 vdsm kvm 118 Oct 3 12:35 fa3279ec-2912-45ac-b7bc-9fe89151ed99 - /rhev/data-center/mnt/xyz-02.tufts.edu: _vol_tusk__vm_tusk__vm/fa3279ec-2912-45ac-b7bc-9fe89151ed99 lrwxrwxrwx 1 vdsm kvm 118 Oct 3 12:35 mastersd - /rhev/data-center/mnt/xyz-02.tufts.edu: _vol_tusk__vm_tusk__vm/fa3279ec-2912-45ac-b7bc-9fe89151ed99 I did a diff and the contents of the the *Original *meta file (that works and VM starts but have bad file system) and the *Backup *meta file (the files being restored from nfs snapshot) *are exactly the same*. Contents are listed blow. Also the files sizes for all six related files are exactly the same. [root@xyz-02 images]# cat 79ccd989-3033-4e6a-80da-ba210c94225a/039a8482-c267-4051-b1e6-1c1dee49b3d7.meta DOMAIN=fa3279ec-2912-45ac-b7bc-9fe89151ed99 VOLTYPE=SHARED CTIME=1368457020 FORMAT=RAW IMAGE=59b6a429-bd11-40c6-a218-78df840725c6 DISKTYPE=2 PUUID=---- LEGALITY=LEGAL MTIME=1368457020 POOL_UUID= DESCRIPTION=Active VM TYPE=SPARSE SIZE=104857600 EOF [root@tss-tusk-ovirt-02 images]# cat 79ccd989-3033-4e6a-80da-ba210c94225a/8d48505d-846d-49a7-8b50-d972ee051145.meta DOMAIN=fa3279ec-2912-45ac-b7bc-9fe89151ed99 CTIME=1370303194 FORMAT=COW DISKTYPE=2 LEGALITY=LEGAL SIZE=104857600 VOLTYPE=LEAF DESCRIPTION= IMAGE=79ccd989-3033-4e6a-80da-ba210c94225a PUUID=039a8482-c267-4051-b1e6-1c1dee49b3d7 MTIME=1370303194 POOL_UUID= TYPE=SPARSE EOF Any help would be greatly appreciated! Thanks, Usman On Fri, Oct 4, 2013 at 9:50 AM, Federico Simoncelli fsimo...@redhat.comwrote: Hi Usman, can you paste somewhere the content of the meta files? $ cat 039a8482-c267-4051-b1e6-1c1dee49b3d7.meta 8d48505d-846d-49a7-8b50-d972ee051145.meta could you also provide the absolute path to those files? (in the vdsm host) Thanks, -- Federico - Original Message - From: Usman Aslam us...@linkshift.com To: users@ovirt.org Sent: Thursday, October 3, 2013 4:29:43 AM Subject: [Users] VM wont restart after some NFS snapshot restore. I have some VM's that live on NFS share. Basically, I had to revert the VM disk to a backup from a few days ago. 
So I powered the VM down, copied over the following files 039a8482-c267-4051-b1e6-1c1dee49b3d7 039a8482-c267-4051-b1e6-1c1dee49b3d7.lease 039a8482-c267-4051-b1e6-1c1dee49b3d7.meta 8d48505d-846d-49a7-8b50-d972ee051145 8d48505d-846d-49a7-8b50-d972ee051145.lease 8d48505d-846d-49a7-8b50-d972ee051145.meta and now when I try to power the VM, it complains 2013-Oct-02, 22:02:38 Failed to run VM zabbix-prod-01 (User: admin@internal). 2013-Oct-02, 22:02:38 Failed to run VM zabbix-prod-01 on Host tss-tusk-ovirt-01-ovirtmgmt.tusk.tufts.edu . 2013-Oct-02, 22:02:38 VM zabbix-prod-01 is down. Exit message: 'truesize'. Any ideas on how I could resolve this? Perhaps a better way of approaching the restore on a filesystem level? I see the following the vsdm.log Thread-7843::ERROR::2013-10-02 22:02:37,548::vm::716::vm.Vm::(_startUnderlyingVm) vmId=`8e8764ad-6b4c-48d8-9a19-fa5cf77208ef`::The vm start process failed Traceback (most recent call last): File /usr/share/vdsm/vm.py, line 678, in _startUnderlyingVm self._run() File /usr/share/vdsm/libvirtvm.py, line 1467, in _run devices = self.buildConfDevices() File /usr/share/vdsm/vm.py, line 515, in buildConfDevices self._normalizeVdsmImg(drv) File /usr/share/vdsm/vm.py, line 408, in _normalizeVdsmImg drv['truesize'] = res['truesize'] KeyError: 'truesize' Thread-7843::DEBUG::2013-10-02 22:02:37,553::vm::1065::vm.Vm::(setDownStatus) vmId=`8e8764ad-6b4c
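On the permissions theory above, a small hedged sketch for checking and restoring ownership on the restored files; uid/gid 36 corresponds to vdsm:kvm on oVirt hosts, and the NFS mount part of the path is abbreviated from the thread:

cd /rhev/data-center/mnt/<nfs_mount>/fa3279ec-2912-45ac-b7bc-9fe89151ed99/images/79ccd989-3033-4e6a-80da-ba210c94225a
ls -ln                      # every file should be owned by uid/gid 36:36 (vdsm:kvm)
chown 36:36 ./*             # only needed if the copy was made as root and ownership was lost
su -s /bin/sh vdsm -c 'dd if=039a8482-c267-4051-b1e6-1c1dee49b3d7 of=/dev/null bs=1M count=1'   # quick read check as the vdsm user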
Re: [Users] Resizing disks destroys contents
Hi Martijn, can you post somewhere the relevant vdsm logs (from the spm host)? Thanks, -- Federico - Original Message - From: Martijn Grendelman martijn.grendel...@isaac.nl To: users@ovirt.org Sent: Tuesday, October 1, 2013 5:01:39 PM Subject: [Users] Resizing disks destroys contents Hi, I just tried out another feature of oVirt and again, I am shocked by the results. I did the following: - create new VM based on an earlier created template, with 20 GB disk - Run the VM - boots fine - Shut down the VM - Via Disks - Edit - Extend size by(GB) add 20 GB to the disk - Run the VM Result: no bootable device. Linux installation gone. Just to be sure, I booted the VM with a gparted live iso, and gparted reports the entire 40 GB as unallocated space. Where's my data? What's wrong with my oVirt installation? What am I doing wrong? Regards, Martijn. ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] vdsm live migration errors in latest master
- Original Message - From: Dan Kenigsberg dan...@redhat.com To: Federico Simoncelli fsimo...@redhat.com Cc: Dead Horse deadhorseconsult...@gmail.com, users users@ovirt.org, vdsm-de...@fedorahosted.org, aba...@redhat.com Sent: Thursday, September 26, 2013 1:38:15 AM Subject: Re: [Users] vdsm live migration errors in latest master On Tue, Sep 24, 2013 at 12:04:14PM -0400, Federico Simoncelli wrote: - Original Message - From: Dan Kenigsberg dan...@redhat.com To: Dead Horse deadhorseconsult...@gmail.com Cc: users@ovirt.org users@ovirt.org, vdsm-de...@fedorahosted.org, fsimo...@redhat.com, aba...@redhat.com Sent: Tuesday, September 24, 2013 11:44:48 AM Subject: Re: [Users] vdsm live migration errors in latest master On Mon, Sep 23, 2013 at 04:05:34PM -0500, Dead Horse wrote: Seeing failed live migrations and these errors in the vdsm logs with latest VDSM/Engine master. Hosts are EL6.4 Thanks for posting this report. The log is from the source of migration, right? Could you trace the history of the hosts of this VM? Could it be that it was started on an older version of vdsm (say ovirt-3.3.0) and then (due to migration or vdsm upgrade) got into a host with a much newer vdsm? Would you share the vmCreate (or vmMigrationCreate) line for this Vm in your log? I smells like an unintended regression of http://gerrit.ovirt.org/17714 vm: extend shared property to support locking solving it may not be trivial, as we should not call _normalizeDriveSharedAttribute() automatically on migration destination, as it may well still be apart of a 3.3 clusterLevel. Also, migration from vdsm with extended shared property, to an ovirt 3.3 vdsm is going to explode (in a different way), since the destination does not expect the extended values. Federico, do we have a choice but to revert that patch, and use something like shared3 property instead? I filed a bug at: https://bugzilla.redhat.com/show_bug.cgi?id=1011608 A possible fix could be: http://gerrit.ovirt.org/#/c/19509 Beyond this, we must make sure that on Engine side, the extended shared values would be used only for clusterLevel 3.4 and above. Are the extended shared values already used by Engine? Yes. That's the idea. Actually to be fair, the second case you mentioned (migrating from extended shared property to old vdsm) it wouldn't have been possible I suppose (the issue here is that Dead Horse has one or more hosts running on master instead of 3.3). The extended shared property would have appeared only in 3.4 and to allow the migration you would have had to upgrade all the nodes. But anyway since we were also talking about a new 3.3.1 barnch I just went ahead and covered all cases. -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] vdsm live migration errors in latest master
- Original Message - From: Dan Kenigsberg dan...@redhat.com To: Federico Simoncelli fsimo...@redhat.com Cc: Dead Horse deadhorseconsult...@gmail.com, users users@ovirt.org, vdsm-de...@fedorahosted.org, aba...@redhat.com Sent: Thursday, September 26, 2013 2:09:15 PM Subject: Re: [Users] vdsm live migration errors in latest master On Thu, Sep 26, 2013 at 05:35:46AM -0400, Federico Simoncelli wrote: - Original Message - From: Dan Kenigsberg dan...@redhat.com To: Federico Simoncelli fsimo...@redhat.com Cc: Dead Horse deadhorseconsult...@gmail.com, users users@ovirt.org, vdsm-de...@fedorahosted.org, aba...@redhat.com Sent: Thursday, September 26, 2013 1:38:15 AM Subject: Re: [Users] vdsm live migration errors in latest master On Tue, Sep 24, 2013 at 12:04:14PM -0400, Federico Simoncelli wrote: - Original Message - From: Dan Kenigsberg dan...@redhat.com To: Dead Horse deadhorseconsult...@gmail.com Cc: users@ovirt.org users@ovirt.org, vdsm-de...@fedorahosted.org, fsimo...@redhat.com, aba...@redhat.com Sent: Tuesday, September 24, 2013 11:44:48 AM Subject: Re: [Users] vdsm live migration errors in latest master On Mon, Sep 23, 2013 at 04:05:34PM -0500, Dead Horse wrote: Seeing failed live migrations and these errors in the vdsm logs with latest VDSM/Engine master. Hosts are EL6.4 Thanks for posting this report. The log is from the source of migration, right? Could you trace the history of the hosts of this VM? Could it be that it was started on an older version of vdsm (say ovirt-3.3.0) and then (due to migration or vdsm upgrade) got into a host with a much newer vdsm? Would you share the vmCreate (or vmMigrationCreate) line for this Vm in your log? I smells like an unintended regression of http://gerrit.ovirt.org/17714 vm: extend shared property to support locking solving it may not be trivial, as we should not call _normalizeDriveSharedAttribute() automatically on migration destination, as it may well still be apart of a 3.3 clusterLevel. Also, migration from vdsm with extended shared property, to an ovirt 3.3 vdsm is going to explode (in a different way), since the destination does not expect the extended values. Federico, do we have a choice but to revert that patch, and use something like shared3 property instead? I filed a bug at: https://bugzilla.redhat.com/show_bug.cgi?id=1011608 A possible fix could be: http://gerrit.ovirt.org/#/c/19509 Beyond this, we must make sure that on Engine side, the extended shared values would be used only for clusterLevel 3.4 and above. Are the extended shared values already used by Engine? Yes. That's the idea. Actually to be fair, the second case you mentioned (migrating from extended shared property to old vdsm) it wouldn't have been possible I suppose (the issue here is that Dead Horse has one or more hosts running on master instead of 3.3). The extended shared property would have appeared only in 3.4 and to allow the migration you would have had to upgrade all the nodes. But anyway since we were also talking about a new 3.3.1 barnch I just went ahead and covered all cases. I do not see how the 3.3.1 branch is relevant to the discussion, as its Vdsm is NOT going to support clusterLevel 3.4. That is what I was referring to. If 3.3.1 was 3.3.0 + backported patches then we just wouldn't backport the extended shared attributes patch and that's it. 
But from what I understood 3.3.1 will be rebased on master (where instead we have the extended shared attributes) and that is why we have to cover both migration direction cases (instead of just the simple getattr one). Pardon my slowness, but would you confirm that this feature is to be used only on clusterLevel 3.4 and above? If so, I'm +2ing your patch. Yes, the extended attributes will be used in the hosted engine and cluster level 3.4. But what the engine does is not relevant to +2ing correct vdsm patches. -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] vdsm live migration errors in latest master
- Original Message - From: Dan Kenigsberg dan...@redhat.com To: Dead Horse deadhorseconsult...@gmail.com Cc: users@ovirt.org users@ovirt.org, vdsm-de...@fedorahosted.org, fsimo...@redhat.com, aba...@redhat.com Sent: Tuesday, September 24, 2013 11:44:48 AM Subject: Re: [Users] vdsm live migration errors in latest master On Mon, Sep 23, 2013 at 04:05:34PM -0500, Dead Horse wrote: Seeing failed live migrations and these errors in the vdsm logs with latest VDSM/Engine master. Hosts are EL6.4 Thanks for posting this report. The log is from the source of migration, right? Could you trace the history of the hosts of this VM? Could it be that it was started on an older version of vdsm (say ovirt-3.3.0) and then (due to migration or vdsm upgrade) got into a host with a much newer vdsm? Would you share the vmCreate (or vmMigrationCreate) line for this Vm in your log? I smells like an unintended regression of http://gerrit.ovirt.org/17714 vm: extend shared property to support locking solving it may not be trivial, as we should not call _normalizeDriveSharedAttribute() automatically on migration destination, as it may well still be apart of a 3.3 clusterLevel. Also, migration from vdsm with extended shared property, to an ovirt 3.3 vdsm is going to explode (in a different way), since the destination does not expect the extended values. Federico, do we have a choice but to revert that patch, and use something like shared3 property instead? I filed a bug at: https://bugzilla.redhat.com/show_bug.cgi?id=1011608 A possible fix could be: http://gerrit.ovirt.org/#/c/19509 -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] oVirt 3.3.0-4/F19 - Extending VM disk gives correct size but appears to wipe the drive contents
Hi Chris, can you post the vdsm logs (spm host) somewhere? Thanks. -- Federico - Original Message - From: Chris SULLIVAN (WGK) chris.sulli...@woodgroupkenny.com To: users@ovirt.org Sent: Monday, September 23, 2013 9:08:26 PM Subject: [Users] oVirt 3.3.0-4/F19 - Extending VM disk gives correct size but appears to wipe the drive contents Hi, I had a number of Windows VMs running in oVirt 3.3 that required their preallocated OS disks to be extended. Each OS disk had a single partition taking up the entire drive. As per http://www.ovirt.org/Features/Online_Virtual_Drive_Resize I shut down all the VMs, extended each OS disk by 10GB (total 25GB) via the web interface, then clicked OK. The tasks appeared to complete successfully and each of the OS disks had the expected real size on the Gluster storage volume. On startup however none of the VMs would recognize their OS disk as being a bootable device. Checking one of the OS disks via TestDisk (both quick and deep scans) revealed no partitions and the error ‘Partition sector doesn’t have the endmark 0xAA55’. It appears that each OS disk was wiped as part of the extension process although I’m really hoping that this isn’t the case! Are there any other approaches I could use to attempt to recover the OS disk data or at least verify whether the original disk partitions are recoverable? ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] Unable to attach to storage domain (Ovirt 3.2)
Hi Dan, it looks like one of the domains is missing: 6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50 Is there any target missing? (disconnected or somehow faulty or unreachable) -- Federico - Original Message - From: Dan Ferris dfer...@prometheusresearch.com To: users@ovirt.org Sent: Friday, September 20, 2013 4:01:06 AM Subject: [Users] Unable to attach to storage domain (Ovirt 3.2) Hi, This is my first post to the list. I am happy to say that we have been using Ovirt for 6 months with a few bumps, but it's mostly been ok. Until tonight that is... I had to do a maintenance that required rebooting both of our Hypervisor nodes. Both of them run Fedora Core 18 and have been happy for months. After rebooting them tonight, they will not attach to the storage. If it matters, the storage is a server running LIO with a Fibre Channel target. Vdsm log: Thread-22::DEBUG::2013-09-19 21:57:09,392::misc::84::Storage.Misc.excCmd::(lambda) '/usr/bin/dd iflag=direct if=/dev/b358e46b-635b-4c0e-8e73-0a494602e21d/metadata bs=4096 count=1' (cwd None) Thread-22::DEBUG::2013-09-19 21:57:09,400::misc::84::Storage.Misc.excCmd::(lambda) SUCCESS: err = '1+0 records in\n1+0 records out\n4096 bytes (4.1 kB) copied, 0.000547161 s, 7.5 MB/s\n'; rc = 0 Thread-23::DEBUG::2013-09-19 21:57:16,587::lvm::368::OperationMutex::(_reloadvgs) Operation 'lvm reload operation' got the operation mutex Thread-23::DEBUG::2013-09-19 21:57:16,587::misc::84::Storage.Misc.excCmd::(lambda) u'/usr/bin/sudo -n /sbin/lvm vgs --config devices { preferred_names = [\\^/dev/mapper/\\] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [ \\a%360014055193f840cb3743f9befef7aa3%\\, \\r%.*%\\ ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 } backup { retain_min = 50 retain_days = 0 } --noheadings --units b --nosuffix --separator | -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free 6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50' (cwd None) Thread-23::DEBUG::2013-09-19 21:57:16,643::misc::84::Storage.Misc.excCmd::(lambda) FAILED: err = ' Volume group 6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50 not found\n'; rc = 5 Thread-23::WARNING::2013-09-19 21:57:16,649::lvm::373::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] [' Volume group 6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50 not found'] Thread-23::DEBUG::2013-09-19 21:57:16,649::lvm::397::OperationMutex::(_reloadvgs) Operation 'lvm reload operation' released the operation mutex Thread-23::ERROR::2013-09-19 21:57:16,650::domainMonitor::208::Storage.DomainMonitorThread::(_monitorDomain) Error while collecting domain 6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50 monitoring information Traceback (most recent call last): File /usr/share/vdsm/storage/domainMonitor.py, line 182, in _monitorDomain self.domain = sdCache.produce(self.sdUUID) File /usr/share/vdsm/storage/sdc.py, line 97, in produce domain.getRealDomain() File /usr/share/vdsm/storage/sdc.py, line 52, in getRealDomain return self._cache._realProduce(self._sdUUID) File /usr/share/vdsm/storage/sdc.py, line 121, in _realProduce domain = self._findDomain(sdUUID) File /usr/share/vdsm/storage/sdc.py, line 152, in _findDomain raise se.StorageDomainDoesNotExist(sdUUID) StorageDomainDoesNotExist: Storage domain does not exist: (u'6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50',) vgs output (Note that I don't know what the device (Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z is) : [root@node01 vdsm]# vgs Couldn't find device with uuid Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z. 
VG                                    #PV #LV #SN Attr   VSize   VFree
b358e46b-635b-4c0e-8e73-0a494602e21d    1  39   0 wz--n-   8.19t  5.88t
build                                   2   2   0 wz-pn- 299.75g 16.00m
fedora                                  1   3   0 wz--n- 557.88g      0
lvs output:
[root@node01 vdsm]# lvs
Couldn't find device with uuid Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z.
LV                                   VG                                   Attr  LSize   Pool Origin Data% Move Log Copy% Convert
0b8cca47-313f-48da-84f2-154810790d5a b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a  40.00g
0f6f7572-8797-4d84-831b-87dbc4e1aa48 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a 100.00g
19a1473f-c375-411f-9a02-c6054b9a28d2 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a  50.00g
221144dc-51dc-46ae-9399-c0b8e030f38a b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a  40.00g
2386932f-5f68-46e1-99a4-e96c944ac21b b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a  40.00g
3e027010-931b-43d6-9c9f-eeeabbdcd47a b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a   2.00g
4257ccc2-94d5-4d71-b21a-c188acbf7ca1 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a 200.00g
4979b2a4-04aa-46a1-be0d-f10be0a1f587 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a 100.00g
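A hedged sketch of the kind of checks that would show whether the LUN backing the missing volume group dropped off after the reboot; the SCSI host number is a placeholder, the missing PV uuid is the one reported above, and the storage here is FC (LIO target) so multipath is the first place to look:

multipath -ll                                        # any failed or faulty paths on the FC multipath devices?
pvs -o pv_name,pv_uuid,vg_name                       # is a PV with uuid Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z visible at all?
echo "- - -" > /sys/class/scsi_host/host<N>/scan     # rescan the HBA if the LUN is simply not seen yet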
Re: [Users] NFS troubleshooting page
- Original Message - From: Itamar Heim ih...@redhat.com To: Markus Stockhausen stockhau...@collogia.de Cc: Ayal Baron aba...@redhat.com, Federico Simoncelli fsimo...@redhat.com, users@ovirt.org, Allon Mureinik amure...@redhat.com Sent: Monday, September 9, 2013 3:45:56 PM Subject: Re: AW: [Users] NFS troubleshooting page On 09/09/2013 04:42 PM, Markus Stockhausen wrote: ayal/federico - thoughts on how we can make things better? warn? etc? Got my account and already modified the wiki. Thanks for the help. Markus thanks markus. my question to ayal/federico is on how to fix the original issue to at least warn about the issue. Good question. I don't have an answer yet, but I'll look into it. -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
[Users] oVirt - Glance Integration Deep Dive Session
The following is a new meeting request:
Subject: oVirt - Glance Integration Deep Dive Session
Organizer: Federico Simoncelli fsimo...@redhat.com
Invitees: users@ovirt.org; engine-de...@ovirt.org
Time: Tuesday, July 30, 2013, 3:00:00 PM - 4:00:00 PM GMT +01:00 Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

Hi everyone, on Tuesday at 3pm (CEST) I will be presenting the recent work done in integrating OpenStack Glance into oVirt 3.3.
The presentation will include both a high level overview (usage in webadmin) and a deep dive about the low level implementation details.
When: Tue 30 Jul 2013 15:00 - 16:00 (CEST)
Where: https://sas.elluminate.com/m.jnlp?sid=819password=M.9E565882E4EA0288E3479F3D2141BD
Bridge: 8425973915#
Phone numbers: http://www.ovirt.org/Intercall
-- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] VM crashes and doesn't recover
- Original Message - From: Yuval M yuva...@gmail.com To: Dan Kenigsberg dan...@redhat.com Cc: users@ovirt.org, Nezer Zaidenberg nzaidenb...@mac.com Sent: Friday, March 29, 2013 2:19:43 PM Subject: Re: [Users] VM crashes and doesn't recover Any ideas on what can cause that storage crash? could it be related to using an SSD? What the logs say is that the IO on the storage domain is failing (both the oop timeouts and the sanlock log) and this triggers the VDSM restart. On Sun, Mar 24, 2013 at 09:50:02PM +0200, Yuval M wrote: I am running vdsm from packages as my interest is in developing for the I noticed that when the storage domain crashes I can't even do df -h (hangs) This is also consistent with the unreachable domain. The dmesg log that you attached doesn't contain timestamps so it's hard to correlate with the rest. If you want you can try to reproduce the issue and resubmit the logs: /var/log/vdsm/vdsm.log /var/log/sanlock.log /var/log/messages (Maybe stating also the exact time when the issue begins to appear) In the logs I noticed that you're using only one NFS domain, and I think that the SSD (on the storage side) shouldn't be a problem. When you experience such a failure are you able to read/write from/to the SSD on the machine that is serving the share? (If it's the same machine, check it using the real path where it's mounted, not the nfs share) -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
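A small hedged sketch of the read/write check suggested above, run on the machine serving the share against the real backing directory; the path and sizes are placeholders, and oflag=direct bypasses the page cache so a failing SSD shows up more readily:

dd if=/dev/zero of=/path/to/export/__iotest bs=1M count=100 oflag=direct conv=fsync   # write test on the backing filesystem
dd if=/path/to/export/__iotest of=/dev/null bs=1M iflag=direct                        # read the same data back
rm /path/to/export/__iotest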
Re: [Users] Failing to attach NFS data storage domain (Ovirt 3.2)
- Original Message - From: Limor Gavish lgav...@gmail.com To: Eli Mesika emes...@redhat.com Cc: Yuval M yuva...@gmail.com, users@ovirt.org, Nezer Zaidenberg nzaidenb...@mac.com Sent: Wednesday, March 20, 2013 11:47:49 AM Subject: Re: [Users] Failing to attach NFS data storage domain (Ovirt 3.2) Thank you very much for your reply. I attached the vdsm.log Hi Limor, can you please inspect the status of the NFS mount? # mkdir /mnt/tmp # mount -t nfs your_nfs_share /mnt/tmp # cd /mnt/tmp/1902354b-4c39-4707-ac6c-3637aaf1943b/dom_md And please report the output of: # ls -l # sanlock direct dump ids Can you also include more vdsm logs? More specifically the ones where the NFS domain has been created? (createStorageDomain with sdUUID='1902354b-4c39-4707-ac6c-3637aaf1943b') Thanks, -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] Failing to attach NFS data storage domain (Ovirt 3.2)
- Original Message - From: Limor Gavish lgav...@gmail.com To: Federico Simoncelli fsimo...@redhat.com Cc: Yuval M yuva...@gmail.com, users@ovirt.org, Nezer Zaidenberg nzaidenb...@mac.com, Eli Mesika emes...@redhat.com, Maor Lipchuk mlipc...@redhat.com Sent: Wednesday, March 20, 2013 9:02:35 PM Subject: Re: [Users] Failing to attach NFS data storage domain (Ovirt 3.2) Thank you very much for your response. Attached VDSM logs as you requested (The VDSM logs where the NFS domain was created were missing so we had to recreate the NFS domain, therefore the sdUUID has changed). Here is the rest of the commands you asked: [root@bufferoverflow wil]# mount -t nfs bufferoverflow:/home/BO_Ovirt_Storage /mnt/tmp [root@bufferoverflow wil]# cd /mnt/tmp/1083422e-a5db-41b6-b667-b9ef1ef244f0/dom_md/ [root@bufferoverflow dom_md]# ls -l total 2052 -rw-rw 1 vdsm kvm 1048576 Mar 20 21:46 ids -rw-rw 1 vdsm kvm 0 Mar 20 21:45 inbox -rw-rw 1 vdsm kvm 2097152 Mar 20 21:45 leases -rw-r--r-- 1 vdsm kvm 311 Mar 20 21:45 metadata -rw-rw 1 vdsm kvm 0 Mar 20 21:45 outbox [root@bufferoverflow dom_md]# sanlock direct dump ids Sorry I should have mentioned that if you use root_squash for your nfs share you have to switch to the vdsm user: (root)# su -s /bin/sh vdsm (vdsm)$ cd /mnt/tmp/sduuid/dom_md/ (vdsm)$ sanlock direct dump ids (and now you should be able to see the output) If the output is still empty then used hexdump -C to inspect it (and eventually post it here compressed). Another important thing that you should check is: # ps fax | grep sanlock If the output doesn't look like the following: 1966 ?SLs0:00 wdmd -G sanlock 2036 ?SLsl 0:00 sanlock daemon -U sanlock -G sanlock 2037 ?S 0:00 \_ sanlock daemon -U sanlock -G sanlock Then I suggest you to update sanlock to the latest build: http://koji.fedoraproject.org/koji/buildinfo?buildID=377815 (sanlock-2.6-7.fc18) And eventually if after rebooting the problem persists, please post also the sanlock log (/var/log/sanlock.log) Please note, the VDSM is running as a system service (it was installed from a package) while ovirt-engine was built from sources and thus is not running as root. Is this an issue? It shouldn't be. -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] oVirt 3.2 on CentOS with Gluster 3.3
- Original Message - From: Dan Kenigsberg dan...@redhat.com To: Balamurugan Arumugam barum...@redhat.com, Federico Simoncelli fsimo...@redhat.com, Mike Burns mbu...@redhat.com Cc: Rob Zwissler r...@zwissler.org, users@ovirt.org, a...@ovirt.org, Aravinda VK avish...@redhat.com, Ayal Baron aba...@redhat.com Sent: Wednesday, March 13, 2013 9:03:39 PM Subject: Re: [Users] oVirt 3.2 on CentOS with Gluster 3.3 On Mon, Mar 11, 2013 at 12:34:51PM +0200, Dan Kenigsberg wrote: On Mon, Mar 11, 2013 at 06:09:56AM -0400, Balamurugan Arumugam wrote: Rob, It seems that a bug in vdsm code is hiding the real issue. Could you do a sed -i s/ParseError/ElementTree.ParseError /usr/share/vdsm/gluster/cli.py restart vdsmd, and retry? Bala, would you send a patch fixing the ParseError issue (and adding a Ok, both issues have fixes which are in the ovirt-3.2 git branch. I believe this deserves a respin of vdsm, as having an undeclated requirement is impolite. Federico, Mike, would you take care for that? Since we're at it... I have the feeling that this might be important enough to be backported to 3.2 too: http://gerrit.ovirt.org/#/c/12178/ -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] F18 iSCSI/FC and latest systemd/udev
- Original Message - From: Jeff Bailey bai...@cs.kent.edu To: users@ovirt.org Sent: Saturday, February 16, 2013 5:17:49 AM Subject: [Users] F18 iSCSI/FC and latest systemd/udev While not an actual problem with oVirt, the latest systemd/udev packages for F18 (197-1) break permissions on LVM volumes and stop vdsm/qemu/etc from accessing them. I just downgraded them and everything seems OK but I thought I'd let people know (easier to just avoid rather than repair :) ). There's a bugzilla for it from a week or two ago but since 3.2 came out I figured a lot more people might be installing it on new F18 installations with all the updates and running into problems. Please test and give karma to: https://admin.fedoraproject.org/updates/FEDORA-2013-1775/vdsm-4.10.3-7.fc18 which requires the correct systemd package. If we reach 3 points of karma the package will be released. Thanks, -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] 3.2 beta and f18 host on dell R815 problem
Hi Gianluca, can you post/attach/provide the output of cpuid? # cpuid In case it's not installed it's provided by the rpm: cpuid-20120601-2.fc18.x86_64 Thanks, -- Federico - Original Message - From: Gianluca Cecchi gianluca.cec...@gmail.com To: users users@ovirt.org Sent: Thursday, January 31, 2013 12:24:41 PM Subject: [Users] 3.2 beta and f18 host on dell R815 problem during install of the server I get this Host installation failed. Fix installation issues and try to Re-Install In deploy log 2013-01-31 12:17:30 DEBUG otopi.plugins.ovirt_host_deploy.vdsm.hardware hardware._isVirtualizationEnabled:144 virtualization support GenuineIntel (cpu: False, bios: True) 2013-01-31 12:17:30 DEBUG otopi.context context._executeMethod:127 method exception Traceback (most recent call last): File /tmp/ovirt-SfEARpd3h4/pythonlib/otopi/context.py, line 117, in _executeMethod method['method']() File /tmp/ovirt-SfEARpd3h4/otopi-plugins/ovirt-host-deploy/vdsm/hardware.py, line 170, in _validate_virtualization _('Hardware does not support virtualization') RuntimeError: Hardware does not support virtualization 2013-01-31 12:17:30 ERROR otopi.context context._executeMethod:136 Failed to execute stage 'Setup validation': Hardware does not support virtualization note the GenuineIntel above... ?? But actually it is AMD [root@f18ovn03 ~]# lsmod|grep kvm kvm_amd 59623 0 kvm 431794 1 kvm_amd cat /proc/cpuinfo ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
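The odd part of the log is the GenuineIntel vendor string reported on an AMD box. As a rough sketch of how such a probe typically works (this is not the ovirt-host-deploy code; it only looks at /proc/cpuinfo, while the real check also consults the BIOS, hence the "cpu: False, bios: True" in the log):

    # Rough sketch of a CPU virtualization probe, not the actual
    # ovirt-host-deploy code: pick the vendor, then look for vmx (Intel)
    # or svm (AMD) in the cpu flags.
    def cpu_virt_support(cpuinfo="/proc/cpuinfo"):
        vendor, flags = None, set()
        with open(cpuinfo) as f:
            for line in f:
                if line.startswith("vendor_id"):
                    vendor = line.split(":", 1)[1].strip()
                elif line.startswith("flags"):
                    flags.update(line.split(":", 1)[1].split())
        if vendor == "GenuineIntel":
            return "vmx" in flags
        if vendor == "AuthenticAMD":
            return "svm" in flags
        return False

    print("virtualization flag present: %s" % cpu_virt_support())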
Re: [Users] could not add local storage domain
Hi Jorick and Cristian, the error you posted here looks like an issue we recently fixed (vdsm on nfs with kernel 3.6). Anyway it's quite difficult to make a comprehensive list of things to report and tests to execute. For this particular issue (and not as a general rule) I suggest you contact me on IRC (fsimonce on #ovirt OFTC) so that we can sort out the issue together. We will report our findings back to the ML so that it will be helpful for everyone else. -- Federico - Original Message - From: Jorick Astrego jor...@netbulae.eu To: users@ovirt.org Sent: Wednesday, November 14, 2012 1:45:21 PM Subject: Re: [Users] could not add local storage domain - I'm not the original submitter of this issue, but I have exactly the same problem with the latest nightly all-in-one installation. We don't use public key auth for sshd on this machine so that's not the problem. This is what I see in the vdsm.log: [...] Thread-17:: INFO::2012-11-14 12:46:14,129::logUtils::37::dispatcher::(wrapper) Run and protect: connectStorageServer(domType=4, spUUID='----', conList=[{'connection': '/data', 'iqn': '', 'portal': '', 'user': '', 'password': '**', 'id': '----', 'port': ''}], options=None) Thread-17::ERROR::2012-11-14 12:46:14,212::hsm::2057::Storage.HSM::(connectStorageServer) Could not connect to storageServer Traceback (most recent call last): File /usr/share/vdsm/storage/hsm.py, line 2054, in connectStorageServer conObj.connect() File /usr/share/vdsm/storage/storageServer.py, line 462, in connect if not self.checkTarget(): File /usr/share/vdsm/storage/storageServer.py, line 449, in checkTarget fileSD.validateDirAccess(self._path)) File /usr/share/vdsm/storage/fileSD.py, line 51, in validateDirAccess getProcPool().fileUtils.validateAccess(dirPath) File /usr/share/vdsm/storage/remoteFileHandler.py, line 274, in callCrabRPCFunction *args, **kwargs) File /usr/share/vdsm/storage/remoteFileHandler.py, line 180, in callCrabRPCFunction rawLength = self._recvAll(LENGTH_STRUCT_LENGTH, timeout) File /usr/share/vdsm/storage/remoteFileHandler.py, line 149, in _recvAll timeLeft): File /usr/lib64/python2.7/contextlib.py, line 84, in helper return GeneratorContextManager(func(*args, **kwds)) File /usr/share/vdsm/storage/remoteFileHandler.py, line 136, in _poll raise Timeout() Timeout ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
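Before taking a problem like this to IRC it can help to confirm, outside of vdsm, that the vdsm user can actually reach the export path. A hedged standalone check (the /data path is taken from the log above; adjust it for your setup):

    import subprocess

    # Hedged check, not part of vdsm: run the access tests as the vdsm user,
    # the same way the earlier message suggests with "su -s /bin/sh vdsm".
    def vdsm_can_access(path):
        for cmd in ("ls -ld %s" % path,
                    "test -r %s -a -w %s -a -x %s" % (path, path, path)):
            rc = subprocess.call(["su", "-s", "/bin/sh", "vdsm", "-c", cmd])
            if rc != 0:
                return False
        return True

    print(vdsm_can_access("/data"))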
Re: [Users] SELinux policy issue with oVirt/sanlock
- Original Message - From: Haim Ateya hat...@redhat.com To: Brian Vetter bjvet...@gmail.com Cc: users@ovirt.org, seli...@lists.fedoraproject.org Sent: Wednesday, October 24, 2012 7:03:39 PM Subject: Re: [Users] SELinux policy issue with oVirt/sanlock - Original Message - From: Brian Vetter bjvet...@gmail.com To: Haim Ateya hat...@redhat.com Cc: users@ovirt.org, seli...@lists.fedoraproject.org Sent: Wednesday, October 24, 2012 6:24:31 PM Subject: Re: [Users] SELinux policy issue with oVirt/sanlock I removed lock_manager=sanlock from the settings file, restarted the daemons, and all works fine right now. I'm guessing that means there is no locking of the VMs (the default?). that's right, I'm glad it works for you, but it's just a workaround since we expect this configuration to work; it would be much appreciated if you could open a bug on that issue so we can track and resolve it when possible. please attach all required logs such as: vdsm.log, libvirtd.log, qemu.log (under /var/log/libvirt/qemu/), audit.log, sanlock.log and /var/log/messages. What's the bug number? To clarify/recap: - the lock_manager=sanlock configuration is correct (and it shouldn't be removed) - you should run setenforce 0 (with lock_manager=sanlock) and try to start a VM; all the avc errors that you find in /var/log/messages and in /var/log/audit/audit.log should be used to open a selinux policy bug -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
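A small hedged helper for gathering the material mentioned in the last bullet: after running setenforce 0 and reproducing the VM start, it pulls the AVC denials out of audit.log so they can be attached to the selinux-policy bug (the log path is the Fedora default; the keyword list is only a guess):

    # Hedged helper, not an official tool: collect AVC denial lines that
    # mention sanlock/libvirt/qemu from the audit log.
    AUDIT_LOG = "/var/log/audit/audit.log"

    def collect_avc_denials(keywords=("sanlock", "virtd", "qemu")):
        denials = []
        with open(AUDIT_LOG) as f:
            for line in f:
                if "avc" in line and "denied" in line \
                        and any(k in line for k in keywords):
                    denials.append(line.rstrip())
        return denials

    if __name__ == "__main__":
        print("\n".join(collect_avc_denials()))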
Re: [Users] Error creating the first storage domain (NFS)
- Original Message - From: Vered Volansky ve...@redhat.com To: Brian Vetter bjvet...@gmail.com Cc: users@ovirt.org Sent: Tuesday, October 23, 2012 11:38:42 AM Subject: Re: [Users] Error creating the first storage domain (NFS) Hi Brian, We'll need your engine host (full) logs at the very least to look into the problem. Can you try it with nfs3 and tell us if it works? Note, more comments in the email body. Regards, Vered Hi Brian, we also need the sanlock logs (/var/log/sanlock.log). Adding David to the thread as he might be able to help us debugging your problem (sanlock-2.4-2.fc17). -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] Error creating the first storage domain (NFS)
Hi Brian, I hate progressing by guesses but could you try to disable selinux: # setenforce 0 If that works you could go on, re-enable it and try something more specific: # setenforce 1 # setsebool sanlock_use_nfs on I have the feeling that the vdsm patch setting the sanlock_use_nfs sebool flag didn't make it to fedora 17 yet. -- Federico - Original Message - From: Brian Vetter bjvet...@gmail.com To: Federico Simoncelli fsimo...@redhat.com Cc: Vered Volansky ve...@redhat.com, users@ovirt.org, David Teigland teigl...@redhat.com Sent: Tuesday, October 23, 2012 6:10:36 PM Subject: Re: [Users] Error creating the first storage domain (NFS) Ok. Here are four log files: engine.log from my ovirt engine server. vdsm.log from my host sanlock.log from my host messages from my host The errors occur around the 20:17:57 time frame. You might see other errors from either previous attempts or for the time after when I tried to attach the storage domain. It looks like everything starts with an error -13 in sanlock. If the -13 maps to 13/EPERM in errno.h, then it is likely to be some kind of permission or other access error. I saw things that were related to the nfs directories not being owned by vdsm:kvm, but that is not the case here. I did see a note online about some issues with sanlock and F17 (which I am running), but those bugs were related to sanlock crashing. Brian ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
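To check the boolean Federico mentions without guessing, getsebool/setsebool can be wrapped in a few lines; a hedged sketch (the -P flag only makes the change persistent across reboots):

    import subprocess

    # Hedged sketch: check whether sanlock_use_nfs is already on and enable
    # it persistently if it is not, instead of leaving selinux permissive.
    def bool_is_on(name):
        # getsebool prints e.g. "sanlock_use_nfs --> off"
        out = subprocess.Popen(["getsebool", name],
                               stdout=subprocess.PIPE).communicate()[0]
        return out.strip().endswith("on")

    if not bool_is_on("sanlock_use_nfs"):
        subprocess.check_call(["setsebool", "-P", "sanlock_use_nfs", "on"])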
Re: [Users] Error creating the first storage domain (NFS)
- Original Message - From: Brian Vetter bjvet...@gmail.com To: Federico Simoncelli fsimo...@redhat.com Cc: Vered Volansky ve...@redhat.com, users@ovirt.org, David Teigland teigl...@redhat.com Sent: Wednesday, October 24, 2012 4:54:11 AM Subject: Re: [Users] Error creating the first storage domain (NFS) That was the problem. I checked the sanlock_use_nfs boolean and it was off. I set it and then created and attached the storage and it all works. Thanks for testing. Do you have a way of verifying a scratch build? http://koji.fedoraproject.org/koji/taskinfo?taskID=4620480 This should fix your problem (on a brand new installation). -- Federico On Oct 23, 2012, at 8:55 PM, Federico Simoncelli wrote: Hi Brian, I hate progressing by guesses but could you try to disable selinux: # setenforce 0 If that works you could go on, re-enable it and try something more specific: # setenforce 1 # setsebool sanlock_use_nfs on I have the feeling that the vdsm patch setting the sanlock_use_nfs sebool flag didn't make it to fedora 17 yet. -- Federico - Original Message - From: Brian Vetter bjvet...@gmail.com To: Federico Simoncelli fsimo...@redhat.com Cc: Vered Volansky ve...@redhat.com, users@ovirt.org, David Teigland teigl...@redhat.com Sent: Tuesday, October 23, 2012 6:10:36 PM Subject: Re: [Users] Error creating the first storage domain (NFS) Ok. Here are four log files: engine.log from my ovirt engine server. vdsm.log from my host sanlock.log from my host messages from my host The errors occur around the 20:17:57 time frame. You might see other errors from either previous attempts or for the time after when I tried to attach the storage domain. It looks like everything starts with an error -13 in sanlock. If the -13 maps to 13/EPERM in errno.h, then it is likely to be some kind of permission or other access error. I saw things that were related to the nfs directories not being owned by vdsm:kvm, but that is not the case here. I did see a note online about some issues with sanlock and F17 (which I am running), but those bugs were related to sanlock crashing. Brian ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] Error creating the first storage domain (NFS)
- Original Message - From: Brian Vetter bjvet...@gmail.com To: Federico Simoncelli fsimo...@redhat.com Cc: Vered Volansky ve...@redhat.com, users@ovirt.org, David Teigland teigl...@redhat.com Sent: Wednesday, October 24, 2012 5:48:21 AM Subject: Re: [Users] Error creating the first storage domain (NFS) Ugh. Spoke a little too soon. While I got past my problem creating a storage domain, I ran into a new sanlock issue. When trying to run a VM (the first one so I can create a template), I get an error in the admin UI: VM DCC4.0 is down. Exit message: Failed to acquire lock: Permission denied. On a lark, I turned off selinux enforcement and tried it again. It worked just fine. So what selinux option do I need to enable to get it to work? The only other sanlock specific settings I saw are: sanlock_use_fusefs --> off sanlock_use_nfs --> on sanlock_use_samba --> off Do I turn these all on or is there some other setting I need to enable? No, for nfs you just need sanlock_use_nfs. I'd say that if you could verify the scratch build that I prepared at: http://koji.fedoraproject.org/koji/taskinfo?taskID=4620480 (up until starting a vm), then all the new selinux errors/messages that you see in the audit log (/var/log/audit/audit.log) are issues that should be reported to the selinux-policy package. -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] Can't start a VM - sanlock permission denied
- Original Message - From: Dan Kenigsberg dan...@redhat.com To: Mike Burns mbu...@redhat.com Cc: Federico Simoncelli fsimo...@redhat.com, users@ovirt.org Sent: Monday, October 15, 2012 11:02:45 AM Subject: Re: [Users] Can't start a VM - sanlock permission denied On Sun, Oct 14, 2012 at 09:53:51PM -0400, Mike Burns wrote: On Sun, 2012-10-14 at 19:11 -0400, Federico Simoncelli wrote: - Original Message - From: Alexandre Santos santosa...@gmail.com To: Dan Kenigsberg dan...@redhat.com Cc: Haim Ateya hat...@redhat.com, users@ovirt.org, Federico Simoncelli fsimo...@redhat.com Sent: Sunday, October 14, 2012 7:23:36 PM Subject: Re: [Users] Can't start a VM - sanlock permission denied 2012/10/13 Dan Kenigsberg dan...@redhat.com On Sat, Oct 13, 2012 at 11:25:37AM +0100, Alexandre Santos wrote: Hi, after getting to the oVirt Node console (F2) I figured out that selinux wasn't allowing the sanlock, so I entered the setsebool virt_use_sanlock 1 and the problem is fixed. Which version of vdsm is installed on your node? and which selinux-policy? sanlock should work out-of-the-box. vdsm-4.10.0-10.fc17 on /etc/sysconfig/selinux SELINUX=enforcing SELINUXTYPE=targeted As far as I understand the selinux policies for the ovirt-node are set by recipe/common-post.ks (in the ovirt-node repo): semanage boolean -m -S targeted -F /dev/stdin << \EOF_semanage allow_execstack=0 virt_use_nfs=1 EOF_semanage We should update it with what vdsm is currently setting: virt_use_sanlock=1 sanlock_use_nfs=1 Shouldn't vdsm be setting these if they're needed? It should - I'd like to know which vdsm version it was, and why this was skipped. The version was 4.10.0-10.fc17 and what I thought (but I didn't test it last night) is that the ovirt-node was overriding what we were setting. Anyway this is not the case. I can certainly set the values, but IMO, if vdsm needs it, vdsm should set it. virt_use_nfs=1 made it into the node. Maybe there was a good reason for it that applies to virt_use_sanlock as well. (I really hate to persist the policy files, and dislike the idea of setting virt_use_sanlock every time vdsmd starts - it's slow). We set them when we install vdsm (not when the service starts) so they should be good to go in the iso. It might be a glitch during the vdsm package installation, it could be something like semanage taking the boolean from the host where the iso is built rather than the root where the package is installed. Do we have the iso build logs? -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] Can't start a VM - sanlock permission denied
- Original Message - From: Alexandre Santos santosa...@gmail.com To: Dan Kenigsberg dan...@redhat.com Cc: Haim Ateya hat...@redhat.com, users@ovirt.org, Federico Simoncelli fsimo...@redhat.com Sent: Sunday, October 14, 2012 7:23:36 PM Subject: Re: [Users] Can't start a VM - sanlock permission denied 2012/10/13 Dan Kenigsberg dan...@redhat.com On Sat, Oct 13, 2012 at 11:25:37AM +0100, Alexandre Santos wrote: Hi, after getting to the oVirt Node console (F2) I figured out that selinux wasn't allowing the sanlock, so I entered the setsebool virt_use_sanlock 1 and the problem is fixed. Which version of vdsm is installed on your node? and which selinux-policy? sanlock should work out-of-the-box. vdsm-4.10.0-10.fc17 on /etc/sysconfig/selinux SELINUX=enforcing SELINUXTYPE=targeted As far as I understand the selinux policies for the ovirt-node are set by recipe/common-post.ks (in the ovirt-node repo): semanage boolean -m -S targeted -F /dev/stdin << \EOF_semanage allow_execstack=0 virt_use_nfs=1 EOF_semanage We should update it with what vdsm is currently setting: virt_use_sanlock=1 sanlock_use_nfs=1 -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
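The same boolean update can be expressed compactly if someone wants to verify what ends up persisted in the image; a hedged sketch that mirrors the recipe's single semanage transaction (the boolean list is just the one discussed above):

    import subprocess

    # Hedged sketch of persisting the booleans discussed above in one
    # semanage transaction against the targeted policy, the same way the
    # node recipe feeds them on stdin.
    BOOLEANS = {
        "allow_execstack": "0",
        "virt_use_nfs": "1",
        "virt_use_sanlock": "1",
        "sanlock_use_nfs": "1",
    }

    def persist_booleans(booleans):
        data = "".join("%s=%s\n" % item for item in booleans.items())
        p = subprocess.Popen(["semanage", "boolean", "-m", "-S", "targeted",
                              "-F", "/dev/stdin"], stdin=subprocess.PIPE)
        p.communicate(data)
        if p.returncode != 0:
            raise RuntimeError("semanage failed with rc=%s" % p.returncode)

    persist_booleans(BOOLEANS)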
Re: [Users] Fwd: Re: oVirt live snapshot problem
- Original Message - From: Haim Ateya hat...@redhat.com To: Neil nwilson...@gmail.com, Federico Simoncelli fsimo...@redhat.com, Kiril Nesenko ki...@redhat.com Cc: users@ovirt.org Sent: Wednesday, June 13, 2012 7:00:34 AM Subject: Re: [Users] Fwd: Re: oVirt live snapshot problem Federico\Kiril, is this problem known to you ? I can't say without the logs. Please look at the relevant log parts that I quote in my bug comment and check if they are the same: https://bugzilla.redhat.com/show_bug.cgi?id=829645#c3 -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
Re: [Users] image ownership
- Original Message - From: Jacob Wyatt jwy...@ggc.edu To: users@ovirt.org Sent: Wednesday, May 9, 2012 8:08:54 PM Subject: [Users] image ownership Greetings all, I've set up a new oVirt installation and it's behaving strangely with regard to virtual machine image files on the NFS storage. Whenever I shut down a machine it's changing the owner of the image to root:root (0:0) instead of vdsm:kvm (36:36). After that it can't start or do anything with that image again until I manually change the ownership back. Everything works fine again until I shut the machine down. I assume this is some mistake I've made in installation. I did not have this problem in the test environment, but I'm stumped as to what went wrong. -Jacob Hi Jacob, could you check the dynamic_ownership in /etc/libvirt/qemu.conf: # grep dynamic_ownership /etc/libvirt/qemu.conf #dynamic_ownership = 1 dynamic_ownership=0 # by vdsm Thanks, -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
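A quick way to script the check Federico asks for across several hosts (a hedged sketch; it only parses the file the message points at and ignores commented-out lines):

    # Hedged sketch: report whether qemu.conf carries the setting vdsm
    # expects (dynamic_ownership=0), skipping comments.
    def dynamic_ownership_disabled(conf="/etc/libvirt/qemu.conf"):
        with open(conf) as f:
            for line in f:
                line = line.split("#", 1)[0].strip()
                if line.startswith("dynamic_ownership"):
                    return line.split("=", 1)[1].strip() == "0"
        return False   # setting absent: libvirt's default applies

    print(dynamic_ownership_disabled())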
Re: [Users] error in multipath.py
Could you check if this fixes your problem? http://gerrit.ovirt.org/3863 Thanks, -- Federico - Original Message - From: ov...@qip.ru To: users@ovirt.org Sent: Tuesday, April 24, 2012 8:54:58 AM Subject: [Users] error in multipath.py I tried the latest releases of vdsm from jenkins.ovirt.org, but found that they didn't work. Vdsmd cycles and corrupted my multipath.conf. The error is in /usr/share/vdsm/storage/multipath.py; this is the diff with my .py: # diff multipath.py.bad multipath.py 88,89c88,89 < first = mpathconf[0] < second = mpathconf[1] --- > first = mpathconf.split('\n', 1)[0] > second = mpathconf.split('\n', 1)[1] ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
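The two versions in the diff differ in whether the multipath.conf content is handled as one string or as lines; a tiny hedged illustration (the conf content here is invented for the example):

    # Illustration of the bug fixed by the diff above: indexing a whole-file
    # string returns single characters, while split('\n', 1) returns the
    # first line and the remainder.
    mpathconf = "# EXAMPLE HEADER\ndefaults {\n    polling_interval 5\n}\n"

    broken_first = mpathconf[0]                  # '#'  (one character)
    broken_second = mpathconf[1]                 # ' '  (one character)
    fixed_first = mpathconf.split('\n', 1)[0]    # '# EXAMPLE HEADER'
    fixed_second = mpathconf.split('\n', 1)[1]   # everything after it

    print("%r vs %r" % (broken_first, fixed_first))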
Re: [Users] Can't start vm
- Original Message - From: kumar shantanu k.shantanu2...@gmail.com To: users users@ovirt.org Sent: Monday, March 12, 2012 9:16:34 AM Subject: [Users] Can't start vm Hi all, I created a host from the ovirt manager but when trying to run it, it fails with the error: == vdsm.log == Thread-180651::DEBUG::2012-03-12 13:42:12,482::vm::577::vm.Vm::(_startUnderlyingVm) vmId=`c13c4c09-f696-47e1-b8cd-8d499242e151`::_ongoingCreations released Thread-180651::ERROR::2012-03-12 13:42:12,482::vm::601::vm.Vm::(_startUnderlyingVm) vmId=`c13c4c09-f696-47e1-b8cd-8d499242e151`::The vm start process failed Traceback (most recent call last): File /usr/share/vdsm/vm.py, line 567, in _startUnderlyingVm self._run() File /usr/share/vdsm/libvirtvm.py, line 1306, in _run self._connection.createXML(domxml, flags), File /usr/share/vdsm/libvirtconnection.py, line 82, in wrapper ret = f(*args, **kwargs) File /usr/lib64/python2.6/site-packages/libvirt.py, line 2087, in createXML if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self) libvirtError: internal error Process exited while reading console log output: Supported machines are: pc RHEL 6.2.0 PC (alias of rhel6.2.0) Python version running is [root@ovirt ~]# python -V Python 2.7 Can anyone please suggest? Hi Kumar, when the engine starts a VM it also specifies a machine type. The machine types supported by a host depend on the system (RHEL/Fedora) and you can get the list with: # vdsClient 0 getVdsCaps | grep emulatedMachines emulatedMachines = ['pc-1.1', 'pc', 'pc-1.0', 'pc-0.15', ... Once you have discovered the types supported by your hosts you can configure the engine with the correct value: http://www.ovirt.org/wiki/Engine_Node_Integration psql -U postgres engine -c "update vdc_options set option_value='pc-0.14' where option_name='EmulatedMachine' and version='3.0';" I assume that you ran the command above but your VDSM hosts are rhel6, so you would need to use the rhel6.2.0 value instead. I believe that the value pc is an alias that works both for RHEL and Fedora and it might be handy for testing, but in general I really discourage its use because it would allow a mixed cluster of RHEL and Fedora hosts which could be problematic in case of live migrations. -- Federico ___ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users
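If you want to check the supported machine types on each host before deciding which value to put into vdc_options, the vdsClient call above can be wrapped in a few lines; a hedged sketch (parsing vdsClient's human-readable output like this is fragile and only meant for a quick look):

    import subprocess

    # Hedged sketch: print the emulatedMachines line that getVdsCaps reports
    # for the local host ("0" addresses localhost, as in the command above).
    def emulated_machines(host="0"):
        out = subprocess.Popen(["vdsClient", host, "getVdsCaps"],
                               stdout=subprocess.PIPE).communicate()[0]
        for line in out.splitlines():
            if "emulatedMachines" in line:
                return line.split("=", 1)[1].strip()
        return None

    print(emulated_machines())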