Re: [Users] OVirt 3.3.2 Snapshot Pane empty in Firefox

2014-01-27 Thread Markus Stockhausen
> Could indeed be related to the session timeout; some issues
> regarding it have been addressed in 3.4. The default timeout
> is 30 minutes, defined by UserSessionTimeOutInterval in the
> vdc_options table. Can you consistently reproduce the issue
> on timeout? Which version of Firefox are you using?

Opened BZ 1058618 for that.

Markus

This e-mail may contain confidential and/or privileged information. If you
are not the intended recipient (or have received this e-mail in error)
please notify the sender immediately and destroy this e-mail. Any
unauthorized copying, disclosure or distribution of the material in this
e-mail is strictly forbidden.

e-mails sent over the internet may have been written under a wrong name or
been manipulated. That is why this message sent as an e-mail is not a
legally binding declaration of intention.

Collogia
Unternehmensberatung AG
Ubierring 11
D-50678 Köln

executive board:
Kadir Akin
Dr. Michael Höhnerbach

President of the supervisory board:
Hans Kristian Langva

Registry office: district court Cologne
Register number: HRB 52 497


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Data Center stuck between "Non Responsive" and "Contending"

2014-01-27 Thread Ted Miller

Federico, thank you for your help so far.  Lots more information below.

On 1/27/2014 4:46 PM, Federico Simoncelli wrote:

- Original Message -

From: "Ted Miller" 

On 1/27/2014 3:47 AM, Federico Simoncelli wrote:

Maybe someone from gluster can identify easily what happened. Meanwhile if
you just want to repair your data-center you could try with:

   $ cd /rhev/data-center/mnt/glusterSD/10.41.65.2\:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/
   $ touch ids
   $ sanlock direct init -s 0322a407-2b16-40dc-ac67-13d387c6eb4c:0:ids:1048576


I tried your suggestion, and it helped, but it was not enough.

   [root@office4a ~]$ cd /rhev/data-center/mnt/glusterSD/10.41.65.2\:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/
   [root@office4a dom_md]$ touch ids
   [root@office4a dom_md]$ sanlock direct init -s 0322a407-2b16-40dc-ac67-13d387c6eb4c:0:ids:1048576
   init done 0
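
For reference, the argument passed to `-s` and the lockspace strings that sanlock logs appear to share the layout `lockspace_name:host_id:path:offset` (the field order given in the sanlock man page). The following is a minimal Python sketch, not part of sanlock; the helper name `parse_lockspace` is hypothetical. It splits such a string while tolerating colons inside the path, as in `10.41.65.2:VM2`.

```python
from typing import NamedTuple


class Lockspace(NamedTuple):
    name: str
    host_id: int
    path: str
    offset: int


def parse_lockspace(spec: str) -> Lockspace:
    """Split 'name:host_id:path:offset'; the path itself may contain ':'."""
    # The first two fields are taken from the left and the offset from the
    # right, so any colons in between stay part of the path.
    name, host_id, rest = spec.split(":", 2)
    path, offset = rest.rsplit(":", 1)
    return Lockspace(name, int(host_id), path, int(offset))


# Argument used with "sanlock direct init -s" in this thread.
init_arg = parse_lockspace("0322a407-2b16-40dc-ac67-13d387c6eb4c:0:ids:1048576")

# Lockspace string as it appears in the sanlock.log lines of this thread.
logged = parse_lockspace(
    "0322a407-2b16-40dc-ac67-13d387c6eb4c:2:"
    "/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/"
    "0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids:0"
)

print(init_arg)
print(logged)
```

Splitting from both ends is a deliberate choice here: a plain `split(":")` would cut the gluster mount path at the colon between the server address and the volume name.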

Let me explain a little.

When the problem originally happened, the sanlock.log started having -223 
error messages.  10 seconds later the log switched from -223 messages to -90 
messages.  Running your little script changed the error from -90 back to -223.


I hope you can send me another script that will get rid of the -223 messages.

Here is the sanlock.log as I ran your script:

   2014-01-27 19:40:41-0500 39281 [3803]: s13 lockspace 0322a407-2b16-40dc-ac67-13d387c6eb4c:2:/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids:0
   2014-01-27 19:40:41-0500 39281 [22751]: 0322a407 aio collect 0 0x7f54240008c0:0x7f54240008d0:0x7f5424101000 result 0:0 match len 512
   2014-01-27 19:40:41-0500 39281 [22751]: read_sectors delta_leader offset 512 rv -90 /rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
   2014-01-27 19:40:42-0500 39282 [3803]: s13 add_lockspace fail result -90
   2014-01-27 19:40:47-0500 39287 [3803]: s14 lockspace 0322a407-2b16-40dc-ac67-13d387c6eb4c:2:/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids:0
   2014-01-27 19:40:47-0500 39287 [22795]: 0322a407 aio collect 0 0x7f54240008c0:0x7f54240008d0:0x7f5424101000 result 0:0 match len 512
   2014-01-27 19:40:47-0500 39287 [22795]: read_sectors delta_leader offset 512 rv -90 /rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
   2014-01-27 19:40:48-0500 39288 [3803]: s14 add_lockspace fail result -90
   2014-01-27 19:40:56-0500 39296 [3802]: s15 lockspace 0322a407-2b16-40dc-ac67-13d387c6eb4c:2:/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids:0
   2014-01-27 19:40:56-0500 39296 [22866]: verify_leader 2 wrong magic 0 /rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
   2014-01-27 19:40:56-0500 39296 [22866]: leader1 delta_acquire_begin error -223 lockspace 0322a407-2b16-40dc-ac67-13d387c6eb4c host_id 2
   2014-01-27 19:40:56-0500 39296 [22866]: leader2 path /rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids offset 0
   2014-01-27 19:40:56-0500 39296 [22866]: leader3 m 0 v 0 ss 0 nh 0 mh 0 oi 0 og 0 lv 0
   2014-01-27 19:40:56-0500 39296 [22866]: leader4 sn  rn  ts 0 cs 0
   2014-01-27 19:40:57-0500 39297 [3802]: s15 add_lockspace fail result -223
   2014-01-27 19:40:57-0500 39297 [3802]: s16 lockspace 0322a407-2b16-40dc-ac67-13d387c6eb4c:2:/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids:0
   2014-01-27 19:40:57-0500 39297 [22870]: verify_leader 2 wrong magic 0 /rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
   2014-01-27 19:40:57-0500 39297 [22870]: leader1 delta_acquire_begin error -223 lockspace 0322a407-2b16-40dc-ac67-13d387c6eb4c host_id 2
   2014-01-27 19:40:57-0500 39297 [22870]: leader2 path /rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids offset 0
   2014-01-27 19:40:57-0500 39297 [22870]: leader3 m 0 v 0 ss 0 nh 0 mh 0 oi 0 og 0 lv 0
   2014-01-27 19:40:57-0500 39297 [22870]: leader4 sn  rn  ts 0 cs 0
   2014-01-27 19:40:58-0500 39298 [3802]: s16 add_lockspace fail result -223
   2014-01-27 19:41:07-0500 39307 [3802]: s17 lockspace 0322a407-2b16-40dc-ac67-13d387c6eb4c:2:/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids:0
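
To see at a glance when the result code changes, the `add_lockspace fail result` lines can be tallied. The following is a small Python sketch using lines copied from the excerpt above; the function name `failures` is made up for illustration, and on a real host the text would presumably be read from the sanlock log file (its location is an assumption, it is not named in this thread).

```python
import re
from collections import Counter

# Sample lines copied from the sanlock.log excerpt in this message.
SAMPLE = """\
2014-01-27 19:40:42-0500 39282 [3803]: s13 add_lockspace fail result -90
2014-01-27 19:40:48-0500 39288 [3803]: s14 add_lockspace fail result -90
2014-01-27 19:40:56-0500 39296 [22866]: leader1 delta_acquire_begin error -223 lockspace 0322a407-2b16-40dc-ac67-13d387c6eb4c host_id 2
2014-01-27 19:40:57-0500 39297 [3802]: s15 add_lockspace fail result -223
2014-01-27 19:40:58-0500 39298 [3802]: s16 add_lockspace fail result -223
"""

FAIL_RE = re.compile(
    r"^(\S+ \S+) \d+ \[\d+\]: (s\d+) add_lockspace fail result (-?\d+)$"
)


def failures(log_text):
    """Return (timestamp, slot, result) for each add_lockspace failure line."""
    found = []
    for line in log_text.splitlines():
        match = FAIL_RE.match(line.strip())
        if match:
            found.append((match.group(1), match.group(2), int(match.group(3))))
    return found


events = failures(SAMPLE)
counts = Counter(result for _, _, result in events)

# Remember the first timestamp at which each result code shows up.
first_seen = {}
for timestamp, _, result in events:
    first_seen.setdefault(result, timestamp)

print(counts)
print(first_seen)
```

In this excerpt the switch from -90 to -223 coincides with the first `verify_leader 2 wrong magic 0` line, which is all the sketch is meant to make visible; it does not interpret the codes.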

Unfortunately, I think the error looks about the same to vdsm, because 
/var/log/messages shows the same two lines in the calling scripts on the 
callback lists (66 & 425, if I remember right).


When I get up in the morning, I will be looking for another magic potion from 
your pen. :)



Federico,

I won't be able to do anything to the ovirt setup for another 5 hours or so
(it is a trial system I am working on  at home, I am at work), but I will try
your repair script and report back.

In bugzilla 86297

Re: [Users] Subject: Outage :: Mailman, downloads :: 2014-01-27 23:00 UTC

2014-01-27 Thread Karsten Wade
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 01/27/2014 03:44 PM, Karsten Wade wrote:
> Outage complete, no problems.
> 
> I boosted the disk space to 100 MB and kept about 10 MB in case we 
> want to have another image or something -- better to have a buffer 
> than not.

s/MB/GB/g

> 
> One thing that is different is that in the last week we've added
> the reverse DNS including IPv6 for this host. I see that ovirtbot
> joined #ovirt with a different hostmask than it left with:
> 
> ovirtbot [~supy...@linode01.ovirt.org] has quit [Ping timeout: 480 
> seconds] 15:18 -!- ovirtbot
> [~supybot@2600:3c01::f03c:91ff:fe93:4b0d] has joined #ovirt
> 
> On 01/27/2014 02:10 PM, Karsten Wade wrote:
>> There is an outage of Mailman (lists.ovirt.org) and downloads 
>> (resources.ovirt.org) for about 15 minutes.
> 
>> The outage will occur in about one hour from now, at
>> 2014-01-27 23:00 UTC. To view in your local time:
> 
>> date -d '2014-01-27  23:00 UTC'
> 
>> == Details ==
> 
>> We have a no-cost upgrade for RAM and disk space available.
> 
>> Since this host has run out of disk space a few times recently, 
>> which has affected Mailman services, it seems like a good idea
>> to grab the extra space immediately.
> 
>> == Affected services ==
> 
>> * Mailman (lists.ovirt.org)
>> * Downloads (resources.ovirt.org) + yum repos
>> * Some redirects.
> 
>> == Not-affected services ==
> 
>> * www.ovirt.org
>> * jenkins.ovirt.org
>> * gerrit.ovirt.org
>> * etc.
> 
>> == Future plans ==
> 
>> This host is due to be de-provisioned, when possible.
> 
> 

- -- 
Karsten 'quaid' Wade.^\CentOS Engineering Manager
http://TheOpenSourceWay.org\  http://community.redhat.com
@quaid (identi.ca/twitter/IRC)  \v' gpg: AD0E0C41
-BEGIN PGP SIGNATURE-
Version: GnuPG v1
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iEYEARECAAYFAlLnVZgACgkQ2ZIOBq0ODEGaiQCgnWWW00enlFtz2fZc5Bsi0Zq2
5SoAoN27GsPDrDOcjDwC4461MqOz73OZ
=Nkwu
-END PGP SIGNATURE-


Re: [Users] two node ovirt cluster with HA

2014-01-27 Thread Jaison peter
Thank you all for your valuable feedback.

Can you please specify some of the supported fencing devices in oVirt?


On Mon, Jan 27, 2014 at 9:10 PM, Eli Mesika  wrote:

>
>
> - Original Message -
> > From: "Tareq Alayan" 
> > To: "Andrew Lau" , "Eli Mesika" <
> emes...@redhat.com>
> > Cc: d...@redhat.com, "Karli Sjöberg" ,
> users@ovirt.org
> > Sent: Monday, January 27, 2014 2:59:02 PM
> > Subject: Re: [Users] two node ovirt cluster with HA
> >
> > Adding Eli.
>
> I just want to summarize the requirement as I understand it:
>
> In the case where a Host that is running HA VMs and has PM configured is
> turned off manually:
>
> 1) The non-responsive treatment should be modified to check the Host status
> via the PM agent
> 2) If the Host is off, HA VMs will attempt to run on another host ASAP
> 3) The host status should be set to DOWN
> 4) No attempt to restart vdsm (soft fencing) or restart the host (hard
> fencing) will be made
>
> Is the above correct? If so, an RFE on that can be opened.
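
The four points summarized above can be read as a small decision function. The following is only a Python sketch of the proposed behaviour, not engine code: every name in it (`PowerStatus`, `plan_for_non_responsive_host`, the action strings) is hypothetical, and the fallback branch assumes that the existing treatment consists of the soft and hard fencing steps named in point 4.

```python
from enum import Enum


class PowerStatus(Enum):
    ON = "on"
    OFF = "off"
    UNREACHABLE = "unreachable"  # the PM device itself did not answer


def plan_for_non_responsive_host(pm_status, has_ha_vms):
    """Return the ordered actions for a non-responsive host (hypothetical names)."""
    if pm_status is PowerStatus.OFF:
        # Points 2-4: the PM agent confirms the host is off, so HA VMs are
        # restarted elsewhere, the host is marked DOWN and no fencing is tried.
        actions = []
        if has_ha_vms:
            actions.append("restart_ha_vms_on_another_host")
        actions.append("set_host_status_down")
        return actions
    # Host is on, or its state cannot be confirmed: keep the existing
    # treatment (assumed here to be soft fencing followed by hard fencing).
    return ["soft_fence_restart_vdsm", "hard_fence_restart_host"]


print(plan_for_non_responsive_host(PowerStatus.OFF, has_ha_vms=True))
print(plan_for_non_responsive_host(PowerStatus.UNREACHABLE, has_ha_vms=True))
```

The unreachable case is deliberately not treated as "off": without confirmation from the PM device there is no certainty that the host has released its disk locks.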
>
> >
> >
> > On 01/27/2014 02:50 PM, Andrew Lau wrote:
> > > Hi,
> > >
> > > > I think he was asking what happens if the power management device
> > > > reports that the host is powered off. The VMs should then be brought
> > > > back up, since being off would essentially be the same as running a
> > > > power cycle/reboot?
> > > >
> > > > Another example I'm seeing is what happens if the whole host loses
> > > > power and its power management device then becomes unavailable (i.e.
> > > > not reachable): then you're stuck in a case that requires manual
> > > > intervention.
> > > >
> > > > I would be interested to potentially see something like a timeout on
> > > > those problematic VMs (e.g. if nothing was read or written after x
> > > > amount of time), after which you could consider the host offline? I
> > > > guess that adds a lot of risk, though..
> > >
> > >
> > > > On Mon, Jan 27, 2014 at 11:43 PM, Tareq Alayan wrote:
> > >
> > > Hi,
> > >
> > > > Power management makes use of special *dedicated* hardware in
> > > > order to restart hosts independently of the host OS. The engine
> > > > connects to a power management device using a *dedicated* network
> > > > IP address.
> > > > The engine is capable of rebooting hosts that have entered a
> > > > non-operational or non-responsive state.
> > > > The abilities provided by all power management devices are: check
> > > > status, start, stop and recycle (restart)...
> > >
> > > In the case of non-responsive host: all of the VMs that are
> > > currently running on that host can also become non-responsive.
> > > However, the non-responsive host keeps locking the VM hard disk
> > > for all VMs it is running. Attempting to start a VM on a different
> > > host and assign the second host write privileges for the virtual
> > > machine hard disk image can cause data corruption.
> > > Rebooting allows the engine to assume that the lock on a VM hard
> > > disk image has been released.
> > > The engine can know for sure that the problematic host has been
> > > rebooted via the power management device and then it can start a
> > > VM from the problematic host on another host without risking data
> > > corruption.
> > > Important note: A virtual machine that has been marked
> > > highly-available can not be safely started on a different host
> > > without the certainty that doing so will not cause data corruption.
> > >
> > > N-joy,
> > >
> > > --Tareq
> > >
> > >
> > >
> > >
> > > On 01/27/2014 02:05 PM, Dafna Ron wrote:
> > >
> > > I am adding Tareq for the Power Management implementation.
> > >
> > > Dafna
> > >
> > >
> > > On 01/27/2014 11:48 AM, Karli Sjöberg wrote:
> > >
> > > On Mon, 2014-01-27 at 11:11 +, Dafna Ron wrote:
> > >
> > > Powering off the host will never trigger vm migration.
> > > As far as engine is concerned it just lost connection
> > > to the host, but
> > > has no way of telling if the host is down or if a
> > > router is down.
> > >
> > > > Can't it at least check with power management if the Host
> > > > status is down first?
> > > >
> > > > I mean, if the network is down there will be no response
> > > > from either PM or Host. But if PM is up and can tell you that
> > > > the Host is down, sounds rather clear cut to me...
> > > >
> > > > Seems to me the VMs would be restarted sooner if the flow
> > > > was altered to first check with PM if it's a network or Host
> > > > issue, and if Host issue, immediately restart VMs on another
> > > > Host, instead of waiting for a potentially problematic Host to
> > > > boot up eventually.
> > >
> > > /K
> > >
> > > s

[Users] VM install failures on a stateless node

2014-01-27 Thread David Li
Hi,

I have been trying to install my first VM on a stateless node. So far I have
failed twice, with the node ending up in the "Non-responsive" state. I had to
reboot to recover, and it took a while to reconfigure everything since the
node is stateless.

I can still get into the node via the console, so it's not dead, but the
ovirtmgmt interface seems to be dead. The other iSCSI interface is running OK.


Can anyone recommend ways to debug this problem?



Thanks.

David



[Users] oVirt 3.4 - testing days report [iproute2 configurator]

2014-01-27 Thread Douglas Schilling Landgraf

Hi,

During the tests I have faced the below bug:

vdsm-4.14.1-2 unable to restart on reboot after a network is defined on 
ovirt-node

https://bugzilla.redhat.com/show_bug.cgi?id=1057657

Additionally (not related to the iproute2 tests), I have faced:

[RFE] report BOOTPROTO and BONDING_OPTS independent of netdevice.cfg
https://bugzilla.redhat.com/show_bug.cgi?id=987813
(I worked around it by manually creating ifcfg-em1 and ifcfg-ovirtmgmt)

firefox seg faults when using the Admin Portal on RHEL 6.5
https://bugzilla.redhat.com/show_bug.cgi?id=1044010
(Updating to firefox-24.2.0-6.el6_5.x86_64 resolved the problem.)


Test data for iproute2:


- Setup Node -> put it in maintenance
- Changed the vdsm.conf on node to:

[vars]
ssl = true
net_configurator = iproute2
net_persistence = unified

[addresses]
management_port = 54321

- Restart vdsm/supervdsm
- Host is UP again, no problems


- DataCenter -> Logical Network -> New
  - Name: net25 -> [x] Enable Vlan tagging  [  ] VM Network

- Since I have just one NIC on the host, I added a dummy interface:
# ip link add name dummy_interface type dummy

- Put the host in maintenance and brought it UP again so that it recognizes
the new interface


- Host -> Network -> Setup Host Networks
  -> drag/drop net25 to dummy_interface
  -> [x] save network interface

On the host, vdsClient -s 0 getVdsCaps shows [net25]

* Rebooted to check whether the new net25 is persistent.
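
Before restarting vdsm it can be useful to confirm that the edited file parses and carries the intended values. The following is a minimal Python sketch using the standard `configparser` module on the fragment from the test notes above, inlined as a string. On a host the file would presumably be read from /etc/vdsm/vdsm.conf instead, which is an assumption since the notes do not name the path.

```python
import configparser

# The vdsm.conf fragment from the test notes, inlined so the sketch is
# self-contained.
VDSM_CONF = """\
[vars]
ssl = true
net_configurator = iproute2
net_persistence = unified

[addresses]
management_port = 54321
"""

config = configparser.ConfigParser()
config.read_string(VDSM_CONF)

settings = {
    "ssl": config.getboolean("vars", "ssl"),
    "net_configurator": config.get("vars", "net_configurator"),
    "net_persistence": config.get("vars", "net_persistence"),
    "management_port": config.getint("addresses", "management_port"),
}
print(settings)
```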


--
Cheers
Douglas


Re: [Users] Subject: Outage :: Mailman, downloads :: 2014-01-27 23:00 UTC

2014-01-27 Thread Karsten Wade
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Outage complete, no problems.

I boosted the disk space to 100 MB and kept about 10 MB in case we
want to have another image or something -- better to have a buffer
than not.

One thing that is different is that in the last week we've added the
reverse DNS including IPv6 for this host. I see that ovirtbot joined
#ovirt with a different hostmask than it left with:

ovirtbot [~supy...@linode01.ovirt.org] has quit [Ping timeout: 480
seconds]
15:18 -!- ovirtbot [~supybot@2600:3c01::f03c:91ff:fe93:4b0d] has
joined #ovirt

On 01/27/2014 02:10 PM, Karsten Wade wrote:
> There is an outage of Mailman (lists.ovirt.org) and downloads 
> (resources.ovirt.org) for about 15 minutes.
> 
> The outage will occur in about one hour from now, at 2014-01-27 
> 23:00 UTC. To view in your local time:
> 
> date -d '2014-01-27  23:00 UTC'
> 
> == Details ==
> 
> We have a no-cost upgrade for RAM and disk space available.
> 
> Since this host has run out of disk space a few times recently,
> which has affected Mailman services, it seems like a good idea to
> grab the extra space immediately.
> 
> == Affected services ==
> 
> * Mailman (lists.ovirt.org)
> * Downloads (resources.ovirt.org) + yum repos
> * Some redirects.
> 
> == Not-affected services ==
> 
> * www.ovirt.org
> * jenkins.ovirt.org
> * gerrit.ovirt.org
> * etc.
> 
> == Future plans ==
> 
> This host is due to be de-provisioned, when possible.
> 
> 

- -- 
Karsten 'quaid' Wade.^\CentOS Engineering Manager
http://TheOpenSourceWay.org\  http://community.redhat.com
@quaid (identi.ca/twitter/IRC)  \v' gpg: AD0E0C41
-BEGIN PGP SIGNATURE-
Version: GnuPG v1
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iEYEARECAAYFAlLm71QACgkQ2ZIOBq0ODEHNFQCgyDHUjbOVZJH3jRRfqKEZAnvT
AeEAniO35vEm8tyekIuljKHGe4F6noAu
=PriU
-END PGP SIGNATURE-


[Users] Setup Networks: Unexpected exception

2014-01-27 Thread Frank Wall
Hi,

I'm still testing 3.4 and am unable to save a node's
network configuration in webadmin:

Error while executing action Setup Networks: Unexpected exception

My configuration on the ovirt node:
- manually added net2 bridge, attached to eth1

My configuration in ovirt-engine webadmin:
- added new network net2
- noticed that ovirt failed to find this network on node
- tried to add net2 to node with "Setup Host Networks"

Error in engine.log [1].
Error in vdsm.log [2].

I think it could be related to BZ 1054195:
https://bugzilla.redhat.com/show_bug.cgi?id=1054195 ([NetworkLabels] Attaching 
two labeled networks to a cluster result in failure of the latter)

I'm not sure, because I only wanted to add *one* new network.
Please note that this is a self-hosted engine setup. Just in
case this makes a difference...

ovirt-engine:
ovirt-engine-3.4.0-0.5.beta1.el6.noarch

ovirt node:
vdsm-4.14.1-17.gitcf59a55.el6.x86_64
ovirt-hosted-engine-setup-1.2.0-0.0.master.20140117.gitfaf77a5.el6.noarch


Thanks
- Frank

[1]
2014-01-27 23:41:08,813 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SetupNetworksVDSCommand] (ajp--127.0.0.1-8702-10) [1783e132] START, SetupNetworksVDSCommand(HostName = mgt03rn.example.com, HostId = a11f5383-b8f7-4bed-b1f3-3c7c46ecbe7a, force=false, checkConnectivity=true, conectivityTimeout=120,
networks=[net2 {id=db98fa95-e922-4060-8d39-f9ac0cb2f16f, description=Jumphost Network, comment=null, subnet=null, gateway=null, type=null, vlanId=null, stp=false, dataCenterId=0002-0002-0002-0002-0002, mtu=0, vmNetwork=true, cluster=NetworkCluster {id={clusterId=null, networkId=null}, status=OPERATIONAL, display=false, required=true, migration=false}, providedBy=null, label=null, qosId=null}],
bonds=[],
interfaces=[bond001 {id=c5b50ccf-5b74-4737-b7cd-980c9c8acf51, 
vdsId=a11f5383-b8f7-4bed-b1f3-3c7c46ecbe7a, name=bond001, 
macAddress=2c:44:fd:82:f5:5f, networkName=null, bondOptions=mode=802.3ad, 
bootProtocol=STATIC_IP, address=10.0.0.103, subnet=255.255.255.0, gateway=null, 
mtu=1500, bridged=false, type=0, networkImplementationDetails=null},
eth3 {id=7aaf1ac1-944a-4fe6-9d22-7dc41c6e275c, 
vdsId=a11f5383-b8f7-4bed-b1f3-3c7c46ecbe7a, name=eth3, 
macAddress=2C:44:FD:82:F5:5F, networkName=null, bondName=bond001, 
bootProtocol=NONE, address=, subnet=, gateway=null, mtu=1500, bridged=false, 
speed=1000, type=0, networkImplementationDetails=null},
eth4 {id=0c23834d-97ae-462a-9701-e89b3dc6a83a, 
vdsId=a11f5383-b8f7-4bed-b1f3-3c7c46ecbe7a, name=eth4, 
macAddress=D8:9D:67:22:B6:4C, networkName=null, bondName=bond001, 
bootProtocol=NONE, address=, subnet=, gateway=null, mtu=1500, bridged=false, 
speed=1000, type=0, networkImplementationDetails=null},
eth1 {id=54cb3cf6-c4bd-4907-bf28-9020022965d5, 
vdsId=a11f5383-b8f7-4bed-b1f3-3c7c46ecbe7a, name=eth1, 
macAddress=2c:44:fd:82:f5:5d, networkName=net2, bondName=null, 
bootProtocol=NONE, address=, subnet=, gateway=null, mtu=0, bridged=true, 
speed=1000, type=0, networkImplementationDetails=null},
eth2 {id=a53c448f-8061-460f-9c24-3081a2376de7, 
vdsId=a11f5383-b8f7-4bed-b1f3-3c7c46ecbe7a, name=eth2, 
macAddress=2c:44:fd:82:f5:5e, networkName=null, bondName=null, 
bootProtocol=NONE, address=, subnet=, gateway=null, mtu=1500, bridged=false, 
speed=1000, type=0, networkImplementationDetails=null},
eth5 {id=e9f15827-bb15-41d9-8ccc-49d812cde8a6, 
vdsId=a11f5383-b8f7-4bed-b1f3-3c7c46ecbe7a, name=eth5, 
macAddress=d8:9d:67:22:b6:4d, networkName=null, bondName=null, 
bootProtocol=DHCP, address=, subnet=, gateway=null, mtu=1500, bridged=false, 
speed=0, type=0, networkImplementationDetails=null},
eth0 {id=b4aea8bc-bdde-4e1e-a206-46ee853220c0, 
vdsId=a11f5383-b8f7-4bed-b1f3-3c7c46ecbe7a, name=eth0, 
macAddress=2c:44:fd:82:f5:5c, networkName=ovirtmgmt, bondName=null, 
bootProtocol=STATIC_IP, address=10.0.0.103, subnet=255.255.0.0, 
gateway=10.0.0.1, mtu=1500, bridged=true, speed=1000, type=2, 
networkImplementationDetails={inSync=true, managed=true}}],
removedNetworks=[],
removedBonds=[]), log id: 78e823a3
2014-01-27 23:41:08,817 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.SetupNetworksVDSCommand] (ajp--127.0.0.1-8702-10) [1783e132] FINISH, SetupNetworksVDSCommand, log id: 78e823a3
2014-01-27 23:41:09,323 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SetupNetworksVDSCommand] (ajp--127.0.0.1-8702-10) [1783e132] Failed in SetupNetworksVDS method
2014-01-27 23:41:09,323 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SetupNetworksVDSCommand] (ajp--127.0.0.1-8702-10) [1783e132] org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to SetupNetworksVDS, error = Unexpected exception, code = 16
2014-01-27 23:41:09,324 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SetupNetworksVDSCommand] (ajp--127.0.0.1-8702-10) [1783e132] Command SetupNetworksVDSCommand(Hos
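
One detail visible in the interface list of the START entry is that bond001 and eth0 both report address=10.0.0.103, with different subnet masks. Whether this has anything to do with the exception is not established by the logs shown here. The following is a small Python sketch for spotting such overlaps; the excerpt is abbreviated from the log (only name, bootProtocol, address and subnet are kept) and the helper names are hypothetical.

```python
import ipaddress
import re
from collections import defaultdict

# Abbreviated from the SetupNetworksVDSCommand log entry: only the name,
# bootProtocol, address and subnet fields of four interfaces are kept.
EXCERPT = """\
bond001 {name=bond001, bootProtocol=STATIC_IP, address=10.0.0.103, subnet=255.255.255.0}
eth1 {name=eth1, bootProtocol=NONE, address=, subnet=}
eth5 {name=eth5, bootProtocol=DHCP, address=, subnet=}
eth0 {name=eth0, bootProtocol=STATIC_IP, address=10.0.0.103, subnet=255.255.0.0}
"""

ENTRY_RE = re.compile(r"name=(\w+),.*?address=([\d.]*), subnet=([\d.]*)")


def addresses_by_interface(text):
    """Map interface name -> IPv4Interface for entries that carry an address."""
    result = {}
    for name, address, subnet in ENTRY_RE.findall(text):
        if address and subnet:
            result[name] = ipaddress.ip_interface(f"{address}/{subnet}")
    return result


def shared_addresses(interfaces):
    """Return {ip: [names]} for every IP configured on more than one interface."""
    owners = defaultdict(list)
    for name, iface in interfaces.items():
        owners[str(iface.ip)].append(name)
    return {ip: names for ip, names in owners.items() if len(names) > 1}


interfaces = addresses_by_interface(EXCERPT)
shared = shared_addresses(interfaces)
networks = {name: str(iface.network) for name, iface in interfaces.items()}

print(shared)
print(networks)
```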

Re: [Users] oVirt 3.4 test day - Template Versions

2014-01-27 Thread Federico Simoncelli
- Original Message -
> From: "Omer Frenkel" 
> To: "Federico Simoncelli" 
> Cc: "oVirt Users List" , "Itamar Heim" 
> Sent: Monday, January 27, 2014 4:31:56 PM
> Subject: Re: oVirt 3.4 test day - Template Versions
> 
> Thanks for the feedback! much appreciated.
> 
> - Original Message -
> > From: "Federico Simoncelli" 
> > To: "oVirt Users List" 
> > Cc: "Omer Frenkel" , "Itamar Heim" 
> > Sent: Monday, January 27, 2014 5:12:38 PM
> > Subject: oVirt 3.4 test day - Template Versions
> > 
> > Feature tested:
> > 
> > http://www.ovirt.org/Features/Template_Versions
> > 
> > - create a new vm vm1 and make a template template1 from it
> > - create a new vm vm2 based on template1 and make some changes
> > - upgrade to 3.4
> > - create a new template template1.1 from vm2
> > - create a new vm vm3 from template1 (clone) - content ok
> > - create a new vm vm4 from template1.1 (thin) - content ok
> > - create a new vm vm5 from template1 last (thin) - content ok (same as 1.1)
> > - try to remove template1 (failed as template1.1 is still present)
> > - try to remove template1.1 (failed as vm5 is still present)
> > - create a new vm vm6 and make a template blank1.1 as new version of the
> >   blank template (succeeded)
> > - create a vm pool vmpool1 with the "latest" template from template1
> > - create a vm pool vmpool2 with the "template1.1" (last) template from
> > template1
> > - start vmpool1 and vmpool2 and verify that the content is the same
> > - create a new template template1.2
> > - start vmpool1 and verify that the content is the same as latest
> > (template1.2)
> > - start vmpool2 and verify that the content is the same as template1.1
> > 
> > Suggestions:
> > 
> > - the template blank is special, I am not sure if allowing versioning may
> >   be confusing (for example is not even editable)
> 
> right, I also thought about this, and my thought was not to block the user
> from doing this,
> but if it is confusing we had better block it.
> 
> > - as far as I can see the "Sub Version Name" is not editable anymore (after
> >   picking it)
> 
> thanks, I see it's missing in the UI, do you care to open a bug on that?

https://bugzilla.redhat.com/show_bug.cgi?id=1058501 
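
The pool behaviour checked in the quoted test steps (a "latest" pool follows new versions, a pinned pool does not) can be illustrated with a toy model. The following is a Python sketch only; the classes `Template` and `VmPool` are hypothetical and do not mirror the engine's actual data model.

```python
class Template:
    """A base template plus its ordered sub-versions (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.versions = [name]  # the base template counts as the first version

    def add_version(self, version_name):
        self.versions.append(version_name)

    @property
    def latest(self):
        return self.versions[-1]


class VmPool:
    """A pool that either follows 'latest' or stays pinned to one version."""

    def __init__(self, template, pinned_version=None):
        self.template = template
        self.pinned_version = pinned_version

    def version_at_start(self):
        # Resolved when the pool's VMs start, not when the pool is created.
        return self.pinned_version or self.template.latest


template1 = Template("template1")
template1.add_version("template1.1")

vmpool1 = VmPool(template1)                                # follows "latest"
vmpool2 = VmPool(template1, pinned_version="template1.1")  # pinned

before = (vmpool1.version_at_start(), vmpool2.version_at_start())
template1.add_version("template1.2")
after = (vmpool1.version_at_start(), vmpool2.version_at_start())

print(before)
print(after)
```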

-- 
Federico


Re: [Users] Hosted-engine runtime issues (3.4 BETA)

2014-01-27 Thread Yedidyah Bar David
- Original Message -
> From: "Frank Wall" 
> To: "Yedidyah Bar David" 
> Cc: "Itamar Heim" , users@ovirt.org, "Doron Fediuck" 
> , "Oved Ourfalli"
> , "Sandro Bonazzola" 
> Sent: Tuesday, January 28, 2014 12:02:49 AM
> Subject: Re: [Users] Hosted-engine runtime issues (3.4 BETA)
> 
> On Sun, Jan 26, 2014 at 04:12:05PM -0500, Yedidyah Bar David wrote:
> > Please report about any results/issues. If it fails, you can try first,
> > after
> > removing from the engine,
> > # yum remove libvirt vdsm
> > # rm -rf /etc/vdsm /etc/libvirt
> > 
> > (or, of course, just reinstall if you do not care).
> 
> I reinstalled the host and ran `engine-setup --deploy` again.
> This time adding the second host worked OK. Though I had to
> work around bugs BZ 1055153 and BZ 1055059 - I know you are already
> aware of them :-)
> https://bugzilla.redhat.com/show_bug.cgi?id=1055153 (vdsmd not starting on
> first run since vdsm logs are not included in rpm)
> https://bugzilla.redhat.com/show_bug.cgi?id=1055059 (The --vm-start function
> does not call the createvm command but --vm-start-paused does)
> 
> So adding a second host basically works when using a fresh install.

Very well. Thanks for the report!
-- 
Didi


[Users] Subject: Outage :: Mailman, downloads :: 2014-01-27 23:00 UTC

2014-01-27 Thread Karsten Wade
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

There is an outage of Mailman (lists.ovirt.org) and downloads
(resources.ovirt.org) for about 15 minutes.

The outage will occur in about one hour from now, at 2014-01-27
23:00 UTC. To view in your local time:

date -d '2014-01-27  23:00 UTC'
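
Where GNU date is not available, the same conversion can be done with Python's standard library. This is a minimal sketch; the fixed UTC-05:00 offset is chosen only as an example and is not part of the announcement.

```python
from datetime import datetime, timedelta, timezone

outage_utc = datetime(2014, 1, 27, 23, 0, tzinfo=timezone.utc)

# Same instant in whatever time zone this machine is configured for.
outage_local = outage_utc.astimezone()

# Same instant at a fixed UTC-05:00 offset, chosen only as an example.
outage_minus5 = outage_utc.astimezone(timezone(timedelta(hours=-5)))

print(outage_local.isoformat())
print(outage_minus5.isoformat())
```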

== Details ==

We have a no-cost upgrade for RAM and disk space available.

Since this host has run out of disk space a few times recently, which
has affected Mailman services, it seems like a good idea to grab the
extra space immediately.

== Affected services ==

* Mailman (lists.ovirt.org)
* Downloads (resources.ovirt.org) + yum repos
* Some redirects.

== Not-affected services ==

* www.ovirt.org
* jenkins.ovirt.org
* gerrit.ovirt.org
* etc.

== Future plans ==

This host is due to be de-provisioned, when possible.

- -- 
Karsten 'quaid' Wade.^\CentOS Engineering Manager
http://TheOpenSourceWay.org\  http://community.redhat.com
@quaid (identi.ca/twitter/IRC)  \v' gpg: AD0E0C41
-BEGIN PGP SIGNATURE-
Version: GnuPG v1
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iEYEARECAAYFAlLm2UAACgkQ2ZIOBq0ODEHoxQCcD7hA9E+dEGtoyB38Gh0ZmncS
BNIAnipN9fwZmDxxPPYn8DDJjSfqzWFJ
=W6Ga
-END PGP SIGNATURE-


Re: [Users] Hosted-engine runtime issues (3.4 BETA)

2014-01-27 Thread Frank Wall
On Sun, Jan 26, 2014 at 04:12:05PM -0500, Yedidyah Bar David wrote:
> Please report about any results/issues. If it fails, you can try first, after
> removing from the engine,
> # yum remove libvirt vdsm
> # rm -rf /etc/vdsm /etc/libvirt
> 
> (or, of course, just reinstall if you do not care).

I reinstalled the host and ran `engine-setup --deploy` again.
This time adding the second host worked OK. Though I had to
work around bugs BZ 1055153 and BZ 1055059 - I know you are already
aware of them :-)
https://bugzilla.redhat.com/show_bug.cgi?id=1055153 (vdsmd not starting on 
first run since vdsm logs are not included in rpm)
https://bugzilla.redhat.com/show_bug.cgi?id=1055059 (The --vm-start function 
does not call the createvm command but --vm-start-paused does)

So adding a second host basically works when using a fresh install.


Regards
- Frank


Re: [Users] Data Center stuck between "Non Responsive" and "Contending"

2014-01-27 Thread Federico Simoncelli
- Original Message -
> From: "Ted Miller" 
> To: "Federico Simoncelli" , "Itamar Heim" 
> 
> Cc: users@ovirt.org
> Sent: Monday, January 27, 2014 7:16:14 PM
> Subject: Re: [Users] Data Center stuck between "Non Responsive" and 
> "Contending"
> 
> 
> On 1/27/2014 3:47 AM, Federico Simoncelli wrote:
> > Maybe someone from gluster can identify easily what happened. Meanwhile if
> > you just want to repair your data-center you could try with:
> >
> >   $ cd
> >   
> > /rhev/data-center/mnt/glusterSD/10.41.65.2\:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/
> >   $ touch ids
> >   $ sanlock direct init -s
> >   0322a407-2b16-40dc-ac67-13d387c6eb4c:0:ids:1048576
> Federico,
> 
> I won't be able to do anything to the ovirt setup for another 5 hours or so
> (it is a trial system I am working on  at home, I am at work), but I will try
> your repair script and report back.
> 
> In bugzilla 862975 they suggested turning off write-behind caching and "eager
> locking" on the gluster volume to avoid/reduce the problems that come from
> many different computers all writing to the same file(s) on a very frequent
> basis.  If I interpret the comment in the bug correctly, it did seem to help
> in that situation.  My situation is a little different.  My gluster setup is
> replicate only, replica 3 (though there are only two hosts).  I was not
> stress-testing it, I was just using it, trying to figure out how I can import
> some old VMWare VMs without an ESXi server to run them on.

Have you done anything similar to what is described here in comment 21?

https://bugzilla.redhat.com/show_bug.cgi?id=859589#c21

When did you realize that you weren't able to use the data-center anymore?
Can you describe exactly what you did and what happened, for example:

1. I created the data center (up and running)
2. I tried to import some VMs from VMWare
3. During the import (or after it) the data-center went into the contending state
...

Did something special happen? I don't know, power loss, split-brain?
For example, an excessive load on one of the servers could also have triggered
a timeout somewhere (forcing the data-center to go back into the contending
state).

Could you check if any host was fenced (forcibly rebooted)?

> I am guessing that what makes cluster storage have the (Master) designation
> is that this is the one that actually contains the sanlocks?  If so, would it
> make sense to set up a gluster volume to be (Master), but not use it for VM
> storage, just for storing the sanlock info?  Separate gluster volume(s) could
> then have the VMs on it(them), and would not need the optimizations turned
> off.

Any domain must be able to become the master at any time. Without a master
the data center is unusable (at the present time); that's why we migrate (or
reconstruct) it to another domain when necessary.

-- 
Federico
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Spice-proxy questions

2014-01-27 Thread David Li
Do I need to generate and install an X.509 key pair for the squid proxy?  How
can I find out whether the key pair has already been created?


- Original Message -
> From: Gianluca Cecchi 
> To: David Li 
> Cc: "users@ovirt.org" ; "dj...@redhat.com" 
> Sent: Friday, January 24, 2014 2:25 PM
> Subject: Re: [Users] Spice-proxy questions
> 
> On Fri, Jan 24, 2014 at 8:45 PM, David Li  wrote:
>>  David
>> 
>>  I set up the squid proxy on the same machine as ovirt-engine. I have this in squid.conf:
>> 
>> 
>> 
>>  ---
>>  acl localhost src 10.10.2.143/32 # for the machine running the browser
>> 
>> 
>>  #safe ports
>>  acl SSL_ports port 443
>>  acl Safe_ports port 80          # http
>>  acl Safe_ports port 21          # ftp
>>  acl Safe_ports port 443         # https
>>  acl Safe_ports port 70          # gopher
>>  acl Safe_ports port 210         # wais
>>  acl Safe_ports port 1025-65535  # unregistered ports <-- will this allow connections to spice port range (5900-6144 IIRC)???
>>  acl Safe_ports port 280         # http-mgmt
>>  acl Safe_ports port 488         # gss-http
>>  acl Safe_ports port 591         # filemaker
>>  acl Safe_ports port 777         # multiling http
>> 
>> 
>> 
>>  # Squid normally listens to port 3128
>>  http_port 3128
>> 
>>  # Deny requests to certain unsafe ports
>>  http_access deny !Safe_ports
>> 
>>  -
>> 
>>  and set my SpiceProxyDefault=http://10.10.2.143:3128
>> 
>> 
>> 
>>  So far, this is still not working. The Spice popup window still fails to connect to the graphics server and html5 browser window remains blank.
>>  Are there any log files that can be used to debug this?
>> 
>>  Thanks.
>> 
>> 
> 
> There is something I don't understand or that you are doing incorrectly.
> 
> From what you write it seems that:
> 
> - your engine has ip 10.10.2.143
> 
> - From which ip do you run your browser?
> 
> - Can this ip connect to engine on port 3128? Perhaps your engine
> setup already configured iptables (or firewalld) and it is blocking
> you?
> You can easily verify at runtime by putting this line on engine:
> 
> iptables -I INPUT -s xxx.yyy.www.zzz -j ACCEPT
> where xxx.yyy.www.zzz is the ip of the client from where you run the browser
> so that you put this accept rule on top of INPUT chain and retry to
> connect to VM console
> 
> - Which ip have the hosts where VMs are running?
> - Is engine (so your proxy in your configuration) capable to reach ip
> of your hosts on spice ports (5900-..)?
> 
> Also see my previous thread here:
> http://lists.ovirt.org/pipermail/users/2013-December/018554.html
> 
> and the useful answers.
> 
> I cannot test your config, because I have no control on my network and
> network admins only allow 80 and 443 so that they are already taken by
> engine itself and I can't test putting the proxy on engine itself...
> 
> HIH anyway,
> Gianluca
> 
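On the question in the quoted squid.conf about whether `acl Safe_ports port 1025-65535` covers the SPICE range, here is a minimal Python sketch. It assumes only the ACL lines shown above and the 5900-6144 range that David quotes from memory; the helper names are made up for illustration, and it models the port ACL alone, not the rest of Squid's access rules.

```python
# Port specs copied from the "acl Safe_ports port ..." lines quoted above.
SAFE_PORTS = ["80", "21", "443", "70", "210", "1025-65535", "280", "488", "591", "777"]


def parse_port_spec(spec):
    """Turn '443' or '1025-65535' into an inclusive (low, high) tuple."""
    low, _, high = spec.partition("-")
    return int(low), int(high or low)


def port_allowed(port, specs=SAFE_PORTS):
    """Return True if the port falls inside any of the listed ranges."""
    return any(low <= port <= high for low, high in map(parse_port_spec, specs))


def range_allowed(first, last, specs=SAFE_PORTS):
    """Return True only if every port in first..last is allowed."""
    return all(port_allowed(port, specs) for port in range(first, last + 1))


if __name__ == "__main__":
    # 5900-6144 is the SPICE range as recalled in the thread (an assumption).
    print(range_allowed(5900, 6144))
```

If the configuration also carries the stock `http_access deny CONNECT !SSL_ports` rule (it is not shown in the excerpt, so this is a guess), CONNECT requests to ports other than 443 would still be refused even though Safe_ports matches.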


Re: [Users] Data Center stuck between "Non Responsive" and "Contending"

2014-01-27 Thread Ted Miller


On 1/27/2014 3:47 AM, Federico Simoncelli wrote:

- Original Message -

From: "Itamar Heim" 
To: "Ted Miller" , users@ovirt.org, "Federico Simoncelli" 

Cc: "Allon Mureinik" 
Sent: Sunday, January 26, 2014 11:17:04 PM
Subject: Re: [Users] Data Center stuck between "Non Responsive" and "Contending"

On 01/27/2014 12:00 AM, Ted Miller wrote:

On 1/26/2014 4:00 PM, Itamar Heim wrote:

On 01/26/2014 10:51 PM, Ted Miller wrote:

On 1/26/2014 3:10 PM, Itamar Heim wrote:

On 01/26/2014 10:08 PM, Ted Miller wrote:
is this gluster storage (guessing since you mentioned a 'volume')

yes (mentioned under "setup" above)

does it have a quorum?

Volume Name: VM2
Type: Replicate
Volume ID: 7bea8d3b-ec2a-4939-8da8-a82e6bda841e
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.41.65.2:/bricks/01/VM2
Brick2: 10.41.65.4:/bricks/01/VM2
Brick3: 10.41.65.4:/bricks/101/VM2
Options Reconfigured:
cluster.server-quorum-type: server
storage.owner-gid: 36
storage.owner-uid: 36
auth.allow: *
user.cifs: off
nfs.disa

(there were reports of split brain on the domain metadata before when
no quorum exists for gluster)

after full heal:

[root@office4a ~]$ gluster volume heal VM2 info
Gathering Heal info on volume VM2 has been successful

Brick 10.41.65.2:/bricks/01/VM2
Number of entries: 0

Brick 10.41.65.4:/bricks/01/VM2
Number of entries: 0

Brick 10.41.65.4:/bricks/101/VM2
Number of entries: 0
[root@office4a ~]$ gluster volume heal VM2 info split-brain
Gathering Heal info on volume VM2 has been successful

Brick 10.41.65.2:/bricks/01/VM2
Number of entries: 0

Brick 10.41.65.4:/bricks/01/VM2
Number of entries: 0

Brick 10.41.65.4:/bricks/101/VM2
Number of entries: 0
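For anyone who wants to check heal output like the above from a script, a small Python sketch follows. It assumes the exact "Brick ..." / "Number of entries: N" layout pasted in this thread (other gluster versions may print it differently), and the function name is hypothetical.

```python
SAMPLE = """\
Gathering Heal info on volume VM2 has been successful

Brick 10.41.65.2:/bricks/01/VM2
Number of entries: 0

Brick 10.41.65.4:/bricks/01/VM2
Number of entries: 0

Brick 10.41.65.4:/bricks/101/VM2
Number of entries: 0
"""


def heal_entries(output):
    """Map each brick to the entry count printed on the line after it."""
    counts = {}
    brick = None
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("Brick "):
            brick = line[len("Brick "):]
        elif line.startswith("Number of entries:") and brick is not None:
            counts[brick] = int(line.split(":", 1)[1])
            brick = None
    return counts


if __name__ == "__main__":
    counts = heal_entries(SAMPLE)
    # A volume looks clean when every brick reports zero pending entries.
    print(all(n == 0 for n in counts.values()), counts)
```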

noticed this in host /var/log/messages (while looking for something else).  
Loop seems to repeat over and over.

Jan 26 15:35:52 office4a sanlock[3763]: 2014-01-26 15:35:52-0500 14678 [30419]: 
read_sectors delta_leader offset 512 rv -90 
/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids


Jan 26 15:35:53 office4a sanlock[3763]: 2014-01-26 15:35:53-0500 14679 [3771]: 
s1997 add_lockspace fail result -90
Jan 26 15:35:58 office4a vdsm TaskManager.Task ERROR Task=`89885661-88eb-4ea3-8793-00438735e4ab`::Unexpected 
error#012Traceback (most recent call last):#012  File "/usr/share/vdsm/storage/task.py", line 857, in 
_run#012 return fn(*args, **kargs)#012  File "/usr/share/vdsm/logUtils.py", line 45, in wrapper#012res = 
f(*args, **kwargs)#012  File "/usr/share/vdsm/storage/hsm.py", line 2111, in getAllTasksStatuses#012
allTasksStatus = sp.getAllTasksStatuses()#012 File "/usr/share/vdsm/storage/securable.py", line 66, in 
wrapper#012
raise SecureError()#012SecureError
Jan 26 15:35:59 office4a sanlock[3763]: 2014-01-26 15:35:59-0500 14686 [30495]: 
read_sectors delta_leader offset 512 rv -90 
/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids


Jan 26 15:36:00 office4a sanlock[3763]: 2014-01-26 15:36:00-0500 14687 [3772]: 
s1998 add_lockspace fail result -90
Jan 26 15:36:00 office4a vdsm TaskManager.Task ERROR Task=`8db9ff1a-2894-407a-915a-279f6a7eb205`::Unexpected error#012Traceback 
(most recent call last):#012  File "/usr/share/vdsm/storage/task.py", line 857, in _run#012 return fn(*args, 
**kargs)#012  File "/usr/share/vdsm/storage/task.py", line 318, in run#012return self.cmd(*self.argslist, 
**self.argsdict)#012 File "/usr/share/vdsm/storage/sp.py", line 273, in startSpm#012 
self.masterDomain.acquireHostId(self.id)#012  File "/usr/share/vdsm/storage/sd.py", line 458, in acquireHostId#012 
self._clusterLock.acquireHostId(hostId, async)#012  File "/usr/share/vdsm/storage/clusterlock.py", line 189, in 
acquireHostId#012raise se.AcquireHostIdFailure(self._sdUUID, e)#012AcquireHostIdFailure: Cannot acquire host id: 
('0322a407-2b16-40dc-ac67-13d387c6eb4c', SanlockException(90, 'Sanlock lockspace add failure', 'Message too long'))
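The `rv -90` and `result -90` values in the sanlock lines appear to be negated errno codes, which would match the `SanlockException(90, ..., 'Message too long')` in the traceback (on Linux, errno 90 is EMSGSIZE). A minimal Python sketch for decoding such lines, assuming that convention; the regular expression and function names are illustrative only:

```python
import errno
import os
import re

# Matches "rv -90" or "result -90" as seen in the sanlock lines above.
RV_PATTERN = re.compile(r"\b(?:rv|result) (-?\d+)")


def extract_rv(log_line):
    """Return the integer after 'rv' or 'result' in a log line, or None."""
    match = RV_PATTERN.search(log_line)
    return int(match.group(1)) if match else None


def describe_rv(rv):
    """Treat the value as -errno and look up its symbolic name and message."""
    code = abs(rv)
    name = errno.errorcode.get(code, "unknown")
    return f"{name}: {os.strerror(code)}"


if __name__ == "__main__":
    line = "s1997 add_lockspace fail result -90"
    rv = extract_rv(line)
    # The text printed here depends on the platform's errno table.
    print(rv, describe_rv(rv))
```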

fede - thoughts on above?
(vojtech reported something similar, but it sorted out for him after
some retries)

Something truncated the ids file, as also reported by:


[root@office4a ~]$ ls /rhev/data-center/mnt/glusterSD/10.41.65.2\:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ -l
total 1029
-rw-rw---- 1 vdsm kvm       0 Jan 22 00:44 ids
-rw-rw---- 1 vdsm kvm       0 Jan 16 18:50 inbox
-rw-rw---- 1 vdsm kvm 2097152 Jan 21 18:20 leases
-rw-r--r-- 1 vdsm kvm     491 Jan 21 18:20 metadata
-rw-rw---- 1 vdsm kvm       0 Jan 16 18:50 outbox

In the past I saw that happening because of a glusterfs bug:

https://bugzilla.redhat.com/show_bug.cgi?id=862975

Anyway in general it seems that glusterfs is not always able to reconcile
the ids file (as it's written by all the hosts at the same time).
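A zero-byte `ids` file like the one in the listing can be spotted with a short Python sketch. This is only an illustration: the function name is hypothetical, and it checks just `ids` and `leases`, leaving out `inbox` and `outbox` because the thread does not flag their size as a problem.

```python
import os
import tempfile


def truncated_files(dom_md_dir, names=("ids", "leases")):
    """Return the entries of `names` that are missing or zero bytes long."""
    bad = []
    for name in names:
        path = os.path.join(dom_md_dir, name)
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            bad.append(name)
    return bad


if __name__ == "__main__":
    # Stand-in directory that mimics the listing: empty ids, populated leases.
    with tempfile.TemporaryDirectory() as demo:
        open(os.path.join(demo, "ids"), "w").close()
        with open(os.path.join(demo, "leases"), "wb") as fh:
            fh.write(b"\0" * 2097152)
        print(truncated_files(demo))
```

On a real host the argument would be the `dom_md` directory of the storage domain, as in the `ls` command above.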

Maybe someone from gluster can identify easily what happened. Meanwhile if
you just want to repair your data-center you could try with:

  $ cd /rhev/data-center/mnt/glusterSD/10.41.65.2\:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/
  $ t

Re: [Users] Multi-Host Network Configuration

2014-01-27 Thread Dan Kenigsberg
On Mon, Jan 27, 2014 at 04:23:47PM +0100, Piotr Kliczewski wrote:
> Hi,
> 
> I played with the multi-host network configuration and used two boxes to
> test it: one el6 (vdsm) and one f19 (engine, vdsm). During the test I noticed
> that vdsm on f19 hadn't joined the cluster (known issue). I modified the
> vlan and MTU. Both boxes were modified, but I then noticed that I had
> accidentally modified the ovirtmgmt network and lost connectivity. The el6
> box recovered, whereas the f19 box didn't. I think it was because vdsm was
> local to the engine.

That may be so. To make sure, I'd love to see your vdsm.log and
supervdsm.log of the time of the modification.

> I spent some time trying
> to recover the network configuration.

Thanks for testing this feature.


Re: [Users] Cluster compatibility

2014-01-27 Thread Dan Kenigsberg
On Sun, Jan 26, 2014 at 01:38:50PM +0200, Itamar Heim wrote:
> On 01/23/2014 11:54 AM, Sandro Bonazzola wrote:
> >Il 23/01/2014 10:49, Piotr Kliczewski ha scritto:
> >>I wanted to install two hosts one on f19 and the second on el6. I
> >>created additional cluster for el6.
> >>Host installation for el6 worked well and it joined the cluster
> >>without any issues. Whereas host in f19 was successfully deployed
> >>but it failed to join the cluster due to:
> >>
> >>Host fedora is compatible with versions (3.0,3.1,3.2,3.3) and cannot
> >>join Cluster Default which is set to version 3.4.
> >
> >known issue, on F19 please enable fedora-virt-preview repo, update 
> >"libvirt*" and retry
> >
> 
> why would a newer libvirt be required to get 3.4 compat mode?

http://gerrit.ovirt.org/#/c/23628/4/vdsm/caps.py

"""
VIR_MIGRATE_ABORT_ON_ERROR not found in libvirt,
support for clusterLevel >= 3.4 is disabled.
For Fedora 19 users, please consider upgrading
libvirt from the virt-preview repository
"""
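The caps.py message quoted above suggests that support for cluster level 3.4 is gated on whether the libvirt Python binding exposes `VIR_MIGRATE_ABORT_ON_ERROR`. A rough Python sketch of that kind of feature probe follows. It is not the actual vdsm code; the function names and the base list of levels (taken from the error message earlier in the thread) are assumptions for illustration.

```python
from types import SimpleNamespace


def supports_cluster_level_34(libvirt_module):
    """Return True if the binding exposes VIR_MIGRATE_ABORT_ON_ERROR."""
    return hasattr(libvirt_module, "VIR_MIGRATE_ABORT_ON_ERROR")


def supported_levels(libvirt_module, base=("3.0", "3.1", "3.2", "3.3")):
    """Append 3.4 to the base levels only when the constant is present."""
    levels = list(base)
    if supports_cluster_level_34(libvirt_module):
        levels.append("3.4")
    return levels


if __name__ == "__main__":
    try:
        import libvirt  # the real binding, if it is installed
    except ImportError:
        libvirt = SimpleNamespace()  # stand-in without the constant
    print(supported_levels(libvirt))
```

An older libvirt binding would report only 3.0 to 3.3 here, which would explain why the F19 host could not join a 3.4 cluster until libvirt was updated.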


[Users] Storage unresponsive after sanlock

2014-01-27 Thread Trey Dockendorf
I set up my first oVirt instance since 3.0 a few days ago and it went
very well, and I left the single host cluster running with 1 VM over
the weekend.  Today I came back and the primary data storage is marked
as unresponsive.  The logs are full of entries [1] that look very
similar to a knowledge base article on RHEL's website [2].

This setup is using NFS over RDMA and so far the ib interfaces report
no errors (via `ibcheckerrs -v  1`).  Based on a doc on ovirt
site [3] it seems this could be due to response problems.  The storage
system is a new purchase and not yet in production so if there's any
advice on how to track down the cause that would be very helpful.
Please let me know what additional information would be helpful as
it's been about a year since I've been active in the oVirt community.

Thanks
- Trey

[1]: http://pastebin.com/yRpSLKxJ

[2]: https://access.redhat.com/site/solutions/400463

[3]: http://www.ovirt.org/SANLock


Re: [Users] Centos 6.5 and bonding: "A slave interface is not properly configured"

2014-01-27 Thread Federico Alberto Sayd

On 27/01/14 13:02, Moti Asayag wrote:

Hi Federico,

- Original Message -

From: "Federico Alberto Sayd" 
To: users@ovirt.org
Sent: Thursday, January 23, 2014 4:56:22 PM
Subject: Re: [Users] Centos 6.5 and bonding: "A slave interface is not properly 
configured"

On 22/01/14 21:31, Dan Kenigsberg wrote:

On Wed, Jan 22, 2014 at 01:57:35PM -0300, Federico Alberto Sayd wrote:

On 22/01/14 12:13, Dan Kenigsberg wrote:

On Wed, Jan 22, 2014 at 07:43:52AM +0000, Karli Sjöberg wrote:

On Tue, 2014-01-21 at 21:22 -0300, Federico Sayd wrote:

Hello:

I am having problems with bonding

I have installed Centos 6.5 in order to use it as host. I configured
eth0 with the vlan of the management network (Vlan 70). Then I
registered the host to the engine (3.3.2-1-el6) and the engine
installed oVirt in the host without problem.  Ovirtmgmt was created
automatically and bridged with eth0.70.

Now I need to bond a second network interface (eth1) with eth0. But
when I try to bond the nics, I get the next error:

"A slave interface is not properly configured. Please verify slaves do
not contain any of the following properties: network name, boot
protocol, IP address, netmask, gateway or vlan-ID notation (as part of
interface's name or explicitly)"

Federico, where exactly do you get this error? Would you attach the
setupNetwork log from supervdsmd.log?

I get the error in the setup-network dialog in ovirt-engine.

Today I solved the issue by copying the network config of another host
(same hardware), and it worked.

The supervdsm.log with the lines logged yesterday:

http://pastebin.com/kpXrRd2w

It would be nice if the error could be more explicit, i.e. naming
the ifcfg-* files that conflict.

I do not understand the error yet... I believe that in the text you have
quoted, Engine complains that an interface has not joined a bond. But
Engine's command to Vdsm

MainProcess|Thread-15::DEBUG::2014-01-21
13:13:21,166::supervdsmServer::95::SuperVdsm.ServerCallback::(wrapper)
call setupNetworks with ({'ovirtmgmt': {'nic': 'eth0', 'vlan': '70',
'ipaddr': '192.168.1.101', 'netmask': '255.255.255.0', 'STP': 'no',
'bridged': 'true'}}, {}, {'connectivityCheck': 'true',
'connectivityTimeout': 120}

contains no reference to a bond device, and seems to have succeeded.

One notable problem is that the network definitions lack a 'gateway'
parameter, which is very important for ovirtmgmt.

Would you share your vdsm.log, too? The output of getCapabilities before
and after setupNetworks may shed some light on the circumstances.

Regards,
Dan.


Exactly, Engine doesn't want to create the bond because the
configuration of a nic has "unacceptable" parameters.

But, specifically what parameters? Which interface? Could the error be
more explicit?

Does Engine complain about the contents of ifcfg-*, the actual network config,
or both? In any case, I restarted the network service after editing the ifcfg-* files.

I got the error when I tried to bond the interfaces. The ovirtmgmt was
created by oVirt and bridged to eth0.70 when the host was installed via
oVirt Engine. After, I tried to bond the two interfaces: eth0, (with
ovirtmgmt attached to it) and eth1 (without config), then the gui showed
the error about ifcfg-* parameters.

I guess that the text that you quoted corresponds to the creation of
the ovirtmgmt network at install time. I don't find in supervdsm.log any
references to the bond creation.

vdsm.log: http://pastebin.com/AGSMBnkN


I took a closer look at the vdsm.log file and noticed that 'getCapabilities'
reports the following for the 'eth1' interface:

nics': {'eth1': {'netmask': '', 'addr': '', 'hwaddr': 'e4:1f:13:1a:5b:da', 
'cfg': {'UUID': '3d63cd78-57e5-4f26-81c4-8a342a342ef4', 'NM_CONTROLLED': 'yes', 
'HWADDR': 'E4:1F:13:1A:5B:DA', 'BOOTPROTO': 'dhcp', 'DEVICE': 'eth1', 'TYPE': 
'Ethernet', 'ONBOOT': 'no'}, 'ipv6addrs': ['fe80::e61f:13ff:fe1a:5bda/64'], 
'speed': 1000, 'mtu': '1500'},

This interface is configured with boot-protocol 'dhcp' and cannot serve as a
slave. In addition, it is marked as managed by NetworkManager, which I'm not
sure is advisable.

This somehow differs from the ifcfg-eth1 content you posted, which didn't
specify any such value for that device:

ifcfg-eth1:
DEVICE=eth1
TYPE=Ethernet
ONBOOT=yes

Adding to this file:
NM_CONTROLLED=no
BOOTPROTO=none

and restarting the network service and vdsm would report this information to
the engine, which would then allow such a nic to be used as a slave when
constructing a bond.

No bond creation appears in the [super]vdsm.log because this action was
blocked on the ovirt-engine side and never sent to vdsm.

Regards,
Moti


Thanks



I restarted the network service with the ifcfg-* config that I previously
posted, but I didn't restart the vdsm service.


Regards

Federico

[Users] oVirt 3.4 test day - results

2014-01-27 Thread Alexander Wels
Hi, I tested the following items during the test day and here are my results:

1. "reboot VM" functionality

The related feature page is: http://www.ovirt.org/Features/Guest_Reboot
The feature page mentions a policy selection checkbox which I was unable to
find in the web admin UI at all. I checked the patches that implement the
feature and did not see the check box implementation. The patches did show me
that all I needed to use the feature was to install the guest agent on the
guest. So for my test I installed a fedora guest, and I installed the guest
agent on the guest. About a minute after starting the guest, the reboot button
was enabled and pressing it started the reboot sequence on the guest.

I had a console open on the guest and it informed me that the admin had
started the reboot process and the guest would be rebooted in a minute. I did
not find a way to change the time it took for the reboot to happen.

I did the same test with the REST api, with the same result. The reboot was
scheduled for a minute after I issued the command. I did not find a way to
change the time with the REST api either. I am guessing that is a future
feature.

2. Fix Control-Alt-Delete functionality in console options

I had trouble getting spice to work in my test setup, but no issues with VNC,
so I tested VNC. I checked the VM console options to make sure that 'Map
ctrl-alt-del shortcut to ctrl+alt+end' was checked. Then I connected to a
running VM with VNC. I pressed ctrl+alt+end, expecting it to issue a
ctrl-alt-del to the guest. Nothing happened. I pressed ctrl-alt-del and it
properly issued ctrl-alt-del to the guest. I made sure there was no issue with
my client by using the menu to issue a ctrl-alt-del to the guest, which also
resulted in the proper action on the guest. I opened a bug for this:
https://bugzilla.redhat.com/show_bug.cgi?id=1057763 [1]

I did this test on my Fedora machine, and the description mentions that
certain OSes capture the ctrl-alt-del before sending it to the guest. Fedora is
not one of those OSes, so maybe my test was not valid?

3. Show name of the template in General tab for a VM if the VM is 
deployed from template via clone allocation.

This is a very straightforward test. I created a template from a VM and named
the template. Then I created a VM from that template using clone allocation. I
verified that the name of the template is now properly shown in the VM general
sub tab. Works as expected.

Overall I had issues getting engine installed due to the shmmax issue reported
in other threads, and then I had a really hard time adding new hosts from a
blank fedora minimum install. I was successful in one out of three attempts,
which I feel was probably a yum repository issue, as I was getting conflicting
python-cpopen issues causing VDSM to not start.

Thanks,
Alexander


[1] https://bugzilla.redhat.com/show_bug.cgi?id=1057763


Re: [Users] Centos 6.5 and bonding: "A slave interface is not properly configured"

2014-01-27 Thread Moti Asayag
Hi Federico,

- Original Message -
> From: "Federico Alberto Sayd" 
> To: users@ovirt.org
> Sent: Thursday, January 23, 2014 4:56:22 PM
> Subject: Re: [Users] Centos 6.5 and bonding: "A slave interface is not 
> properly configured"
> 
> On 22/01/14 21:31, Dan Kenigsberg wrote:
> > On Wed, Jan 22, 2014 at 01:57:35PM -0300, Federico Alberto Sayd wrote:
> >> On 22/01/14 12:13, Dan Kenigsberg wrote:
> >>> On Wed, Jan 22, 2014 at 07:43:52AM +0000, Karli Sjöberg wrote:
>  On Tue, 2014-01-21 at 21:22 -0300, Federico Sayd wrote:
> > Hello:
> >
> > I am having problems with bonding
> >
> > I have installed Centos 6.5 in order to use it as host. I configured
> > eth0 with the vlan of the management network (Vlan 70). Then I
> > registered the host to the engine (3.3.2-1-el6) and the engine
> > installed oVirt in the host without problem.  Ovirtmgmt was created
> > automatically and bridged with eth0.70.
> >
> > Now I need to bond a second network interface (eth1) with eth0. But
> > when I try to bond the nics, I get the next error:
> >
> > "A slave interface is not properly configured. Please verify slaves do
> > not contain any of the following properties: network name, boot
> > protocol, IP address, netmask, gateway or vlan-ID notation (as part of
> > interface's name or explicitly)"
> >>> Federico, where exactly do you get this error? Would you attach the
> >>> setupNetwork log from supervdsmd.log?
> >> I get the error in the setup-network dialog in ovirt-engine.
> >>
> >> Today I solved the issue by copying the network config of another host
> >> (same hardware), and it worked.
> >>
> >> The supervdsm.log with the lines logged yesterday:
> >>
> >> http://pastebin.com/kpXrRd2w
> >>
> >> It would be nice if the error could be more explicit, i.e. naming
> >> the ifcfg-* files that conflict.
> > I do not understand the error yet... I believe that in the text you have
> > quoted, Engine complains that an interface has not joined a bond. But
> > Engine's command to Vdsm
> >
> > MainProcess|Thread-15::DEBUG::2014-01-21
> > 13:13:21,166::supervdsmServer::95::SuperVdsm.ServerCallback::(wrapper)
> > call setupNetworks with ({'ovirtmgmt': {'nic': 'eth0', 'vlan': '70',
> > 'ipaddr': '192.168.1.101', 'netmask': '255.255.255.0', 'STP': 'no',
> > 'bridged': 'true'}}, {}, {'connectivityCheck': 'true',
> > 'connectivityTimeout': 120}
> >
> > contains no reference to a bond device, and seems to have succeeded.
> >
> > One notable problem is that the network definitions lack a 'gateway'
> > parameter, which is very important for ovirtmgmt.
> >
> > Would you share your vdsm.log, too? The output of getCapabilities before
> > and after setupNetworks may shed some light on the circumstances.
> >
> > Regards,
> > Dan.
> >
> 
> Exactly, Engine doesn't want to create the bond because the
> configuration of a nic has "unacceptable" parameters.
> 
> But, specifically what parameters? Which interface? Could the error be
> more explicit?
> 
> Does Engine complain about the contents of ifcfg-*, the actual network config,
> or both? In any case, I restarted the network service after editing the ifcfg-* files.
> 
> I got the error when I tried to bond the interfaces. The ovirtmgmt was
> created by oVirt and bridged to eth0.70 when the host was installed via
> oVirt Engine. After, I tried to bond the two interfaces: eth0, (with
> ovirtmgmt attached to it) and eth1 (without config), then the gui showed
> the error about ifcfg-* parameters.
> 
> I guess that the text that you quoted corresponds to the creation of
> the ovirtmgmt network at install time. I don't find in supervdsm.log any
> references to the bond creation.
> 
> vdsm.log: http://pastebin.com/AGSMBnkN
> 

I took a closer look at the vdsm.log file and noticed that 'getCapabilities'
reports the following for the 'eth1' interface:

nics': {'eth1': {'netmask': '', 'addr': '', 'hwaddr': 'e4:1f:13:1a:5b:da', 
'cfg': {'UUID': '3d63cd78-57e5-4f26-81c4-8a342a342ef4', 'NM_CONTROLLED': 'yes', 
'HWADDR': 'E4:1F:13:1A:5B:DA', 'BOOTPROTO': 'dhcp', 'DEVICE': 'eth1', 'TYPE': 
'Ethernet', 'ONBOOT': 'no'}, 'ipv6addrs': ['fe80::e61f:13ff:fe1a:5bda/64'], 
'speed': 1000, 'mtu': '1500'},

This interface is configured with boot-protocol 'dhcp' and cannot serve as a
slave. In addition, it is marked as managed by NetworkManager, which I'm not
sure is advisable.

This somehow differs from the ifcfg-eth1 content you posted, which didn't
specify any such value for that device:

ifcfg-eth1:
DEVICE=eth1
TYPE=Ethernet
ONBOOT=yes

Adding to this file:
NM_CONTROLLED=no
BOOTPROTO=none

and restarting the network service and vdsm would report this information to
the engine, which would then allow such a nic to be used as a slave when
constructing a bond.
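To make the check concrete, here is a small Python sketch that inspects a getCapabilities-style nic entry for the properties named in the error message. It is only a guess at the kind of validation involved, not the engine's actual logic; the function name and the trimmed dictionary are illustrative.

```python
# Trimmed copy of the eth1 entry from the getCapabilities output above.
ETH1 = {
    "netmask": "",
    "addr": "",
    "cfg": {"NM_CONTROLLED": "yes", "BOOTPROTO": "dhcp", "DEVICE": "eth1", "ONBOOT": "no"},
}


def slave_blockers(nic):
    """List the reasons this nic report would not look like a clean bond slave."""
    cfg = nic.get("cfg", {})
    reasons = []
    if cfg.get("BOOTPROTO", "none").lower() not in ("none", ""):
        reasons.append("BOOTPROTO=" + cfg["BOOTPROTO"])
    if nic.get("addr"):
        reasons.append("IP address " + nic["addr"])
    if nic.get("netmask"):
        reasons.append("netmask " + nic["netmask"])
    # Not listed in the engine error; flagged only because of the remark above.
    if cfg.get("NM_CONTROLLED", "no").lower() == "yes":
        reasons.append("NM_CONTROLLED=yes (advisory)")
    return reasons


if __name__ == "__main__":
    print(slave_blockers(ETH1))
    # Same nic after adding NM_CONTROLLED=no and BOOTPROTO=none to ifcfg-eth1.
    fixed = dict(ETH1, cfg=dict(ETH1["cfg"], NM_CONTROLLED="no", BOOTPROTO="none"))
    print(slave_blockers(fixed))
```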

No bond creation appears in the [super]vdsm.log because this action was
blocked on the ovirt-engine side and never sent to vdsm.

Regards,
Moti

> Thanks
> 

[Users] ovirt test day: HA VM Reservation feature test summary

2014-01-27 Thread Moti Asayag
Hi All,

In the latest ovirt-test-day I tested the HA VM resource reservation
feature [1] according to the basic scenarios described in [2].

The new feature notifies the admin via an event log about the cluster's
inability to preserve resources for HA VMs. I've reported 2 bugs based
on the behavior: the cluster check doesn't consider the state of the
cluster's hosts when it calculates the resources [3], and there is a minor
issue with the translation of the audit log into a message [4].

[1] http://www.ovirt.org/Features/HA_VM_reservation
[2] http://www.ovirt.org/OVirt_3.4_TestDay#SLA
[3] Bug 1057579 -HA Vm reservation check ignores host status
https://bugzilla.redhat.com/show_bug.cgi?id=1057579
[4] Bug 1057584 -HA Vm reservation event log is not well resolved
https://bugzilla.redhat.com/show_bug.cgi?id=1057584

Thanks,
Moti


Re: [Users] networking: basic vlan help

2014-01-27 Thread Lior Vernia


On 26/01/14 15:40, Mike Kolesnik wrote:
> - Original Message -
>> On 01/23/2014 08:34 PM, Juan Pablo Lorier wrote:
>>> Hi Itamar,
>>>
>>> I don't know if I get your post right, but to me, it seems that if so
>>> many users hit the same rock, it should mean that this should be
>>> documented somewhere visible and in my opinion, push on getting bug
>>> 1049476  solved asap.
>>> Regards,
>>
>> 1. yes, too many issues on this one, hinting we should provide better
>> text explaining this in the UI.
>>
>> 2. the bug you referenced[1]
>> Bug 1049476 - [RFE] Mix untagged and tagged Logical Networks on the same NIC
>>
>> is actually supported, as long as the untagged logical network is not a
>> VM network (so VMs associated with it would not be able to see/create
>> other logical networks traffic).
>>
>> 3. considering how prevalent this is, maybe we should allow doing this,
>> even for VM networks, with a big red warning, rather than block it,
>> which seems to be failing everyone.
> 
> Besides that it's technically not possible in the way we currently use the 
> Linux Bridge [1],
> I'm not sure what's to gain from representing a single "flat" network with 
> multiple representations.
> 
> Seems to me like there may be a couple different points here:
> * ovirtmgmt is VM network by default - should be configurable on setup and/or 
> DC creation.
>   If it's such a prevalent issue, we should consider a default of non VM 
> network (users can create a flat network and use it quite easily anyway, if 
> they want).

From a UX point of view I don't think this would be desirable. I think
it's convenient for a new user to be able to use just the one default
network for everything (including connection to VMs).

> * if people want to represent different L3 networks on the same L2 network, 
> it is worthwhile to design a proper solution
> 
> Either way, I wouldn't push for allowing multiple bridged networks on the 
> same physical interface (or bond).
> 
> [1] and also not allowed in OpenStack Neutron IIUC.
> 
>>
>> cc-ing some more folks for their thoughts.
>>
>>
>> [1] in the future, please use number-name format so not everyone would
>> have to open it to understand
>>


[Users] [ATTN] Future oVirt Weekly Meeting 2014-02-05

2014-01-27 Thread Doron Fediuck
Hi All,
next week many of us will be traveling for FOSDEM, cfgmgmtcmp, infra.next
and other ovirt related conferences, as mentioned in previous sync meeting.

As a result many will not be available, and it seems appropriate to skip
next week's meeting. So I'd like to suggest we cancel this instance.

Feedback is more than welcome,
Doron


Re: [Users] Centos 6.5 and bonding: "A slave interface is not properly configured"

2014-01-27 Thread Lior Vernia


On 25/01/14 01:24, Moti Asayag wrote:
> 
> 
> - Original Message -
>> From: "Federico Alberto Sayd" 
>> To: users@ovirt.org
>> Sent: Friday, January 24, 2014 4:06:11 PM
>> Subject: Re: [Users] Centos 6.5 and bonding: "A slave interface is not 
>> properly configured"
>>
>> On 24/01/14 08:31, Moti Asayag wrote:
>>>
>>> - Original Message -
>>>> From: "Federico Alberto Sayd"
>>>> To: users@ovirt.org
>>>> Sent: Thursday, January 23, 2014 3:57:53 PM
>>>> Subject: Re: [Users] Centos 6.5 and bonding: "A slave interface is not
>>>> properly configured"
>>>>
>>>> On 23/01/14 07:02, Moti Asayag wrote:
>>>>> - Original Message -
>>>>>> From: "Federico Sayd"
>>>>>> To: users@ovirt.org
>>>>>> Sent: Wednesday, January 22, 2014 2:22:01 AM
>>>>>> Subject: [Users] Centos 6.5 and bonding: "A slave interface is not
>>>>>> properly configured"
>>>>>>
>>>>>> Hello:
>>>>>>
>>>>>> I am having problems with bonding
>>>>>>
>>>>>> I have installed Centos 6.5 in order to use it as host. I configured
>>>>>> eth0
>>>>>> with the vlan of the management network (Vlan 70). Then I registered the
>>>>>> host to the engine (3.3.2-1-el6) and the engine installed oVirt in the
>>>>>> host
>>>>>> without problem. Ovirtmgmt was created automatically and bridged with
>>>>>> eth0.70.
>>>>>>
>>>>>> Now I need to bond a second network interface (eth1) with eth0. But when
>>>>>> I
>>>>>> try to bond the nics, I get the next error:
>>>>>>
>>>>> Could you describe how you've created the bond ? via webadmin setup
>>>>> networks
>>>>> dialog or via api ?
>>>>>
>>>>>
>>>> Via webadmin Setup Network (Web GUI)
>>> Does the setup dialog presents the new configuration when you create the
>>> bond ?
>>> Meaning, does it draw the following ?
>>> eth0 --
>>> |--bond0 --- ovirtmgmt (vlan 70)
>>> eth1 --
>>>
>>> If it does, this is simply bug in the UI which should have better construct
>>> the parameters to the setup networks.
>>>
>>> Could you open a bug for it ?
>>>
>>> You may use the setup networks api (via rest-client or using the sdk) to
>>> specify
>>> the target configuration.
>>>

>>>
>> No, the bond is never displayed in the web UI; the error prevents the
>> bond creation. I already solved the problem, but I think it would be
>> good if the UI error were more explicit and included info about the
>> interface or interfaces with problematic configurations.
>>
>> I don't know if supervdsm.log or vdsm.log make reference to the
>> interfaces that have unacceptable configurations.
>>
> 
> The error indicates that the parameters sent from the UI to that engine
> weren't constructed properly. In this case the engine failed the action
> and it never reached the host.
> 
> Lior, could you verify this scenario and see which fields were left with
> improper values on the UI side ?
> 

Apologies for the delay in responding. I tried running a similar
scenario via the GUI on my deployment and it worked fine (and this part
of the code shouldn't have changed recently). So I suspect the problem
isn't with the Setup Networks GUI.

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] two node ovirt cluster with HA

2014-01-27 Thread Eli Mesika


- Original Message -
> From: "Tareq Alayan" 
> To: "Andrew Lau" , "Eli Mesika" 
> Cc: d...@redhat.com, "Karli Sjöberg" , users@ovirt.org
> Sent: Monday, January 27, 2014 2:59:02 PM
> Subject: Re: [Users] two node ovirt cluster with HA
> 
> Adding Eli.

I just want to summarize the requirement as I understand it:

In the case that a Host that is running HA VMs and has PM configured is turned
off manually:

1) The non-responsive treatment should be modified to check Host status via PM 
agent 
2) If Host is off , HA VMs will attempt to run on another host ASAP
3) The host status should be set to DOWN
4) No attempt to restart vdsm (soft fencing) or restart the host (hard fencing) 
will be done 

Is the above correct? If so, an RFE can be opened for it.
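
To make the four points above concrete, here is a rough sketch in Python. It is not engine code: `PowerStatus`, `handle_non_responsive_host`, and the action strings are invented for illustration, and the branches for a powered-on or unreachable host are assumptions drawn from the rest of this thread rather than part of the requirement.

```python
from enum import Enum


class PowerStatus(Enum):
    """What the PM agent reports for the host (hypothetical names)."""
    ON = "on"
    OFF = "off"
    UNREACHABLE = "unreachable"  # the PM device itself did not answer


def handle_non_responsive_host(pm_status, ha_vms):
    """Return the actions to take for a host that stopped responding."""
    if pm_status is PowerStatus.OFF:
        # Points 2-4: the host is confirmed off, so HA VMs can be started
        # elsewhere and no soft or hard fencing is attempted.
        actions = [f"restart {vm} on another host" for vm in ha_vms]
        actions.append("set host status to DOWN")
        return actions
    if pm_status is PowerStatus.ON:
        # Assumption: a powered-on but silent host gets the usual fencing.
        return ["soft fencing (restart vdsm)", "hard fencing (restart host)"]
    # Assumption: with the PM agent unreachable the host state is unknown,
    # so starting the HA VMs elsewhere would risk data corruption.
    return ["wait for manual intervention"]
```

For example, `handle_non_responsive_host(PowerStatus.OFF, ["vm1"])` yields a restart action for `vm1` followed by the DOWN status change, with no fencing step.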

> 
> 
> On 01/27/2014 02:50 PM, Andrew Lau wrote:
> > Hi,
> >
> > I think he was asking what if the power management device reported
> > that the host was powered off. Then VMs should be brought back up as
> > being off would essentially be the same as running a power cycle/reboot?
> >
> > Another example I'm seeing is what happens if the whole host loses
> > power and its power management device then becomes unavailable (i.e.
> > not reachable) then you're stuck in the case where it requires manual
> > intervention.
> >
> > I would be interested to potentially see something like a timeout on
> > those problematic VMs (e.g. if nothing was read or written after x amount
> > of time) then you could consider the host as offline? I guess then
> > that adds a lot of risk..
> >
> >
> > On Mon, Jan 27, 2014 at 11:43 PM, Tareq Alayan wrote:
> >
> > Hi,
> >
> > Power management makes use of special *dedicated* hardware in
> > order to restart hosts independently of host OS. The engine
> > connects to a power management device using a *dedicated* network
> > IP address.
> > The engine is capable of rebooting hosts that have entered a
> > non-operational or non-responsive state.
> > The abilities provided by all power management devices are: check
> > status, start, stop and recycle (restart)...
> >
> > In the case of non-responsive host: all of the VMs that are
> > currently running on that host can also become non-responsive.
> > However, the non-responsive host keeps locking the VM hard disk
> > for all VMs it is running. Attempting to start a VM on a different
> > host and assign the second host write privileges for the virtual
> > machine hard disk image can cause data corruption.
> > Rebooting allows the engine to assume that the lock on a VM hard
> > disk image has been released.
> > The engine can know for sure that the problematic host has been
> > rebooted via the power management device and then it can start a
> > VM from the problematic host on another host without risking data
> > corruption.
> > Important note: A virtual machine that has been marked
> > highly-available can not be safely started on a different host
> > without the certainty that doing so will not cause data corruption.
> >
> > N-joy,
> >
> > --Tareq
> >
> >
> >
> >
> > On 01/27/2014 02:05 PM, Dafna Ron wrote:
> >
> > I am adding Tareq for the Power Management implementation.
> >
> > Dafna
> >
> >
> > On 01/27/2014 11:48 AM, Karli Sjöberg wrote:
> >
> > On Mon, 2014-01-27 at 11:11 +, Dafna Ron wrote:
> >
> > Powering off the host will never trigger vm migration.
> > As far as engine is concerned it just lost connection
> > to the host, but
> > has no way of telling if the host is down or if a
> > router is down.
> >
> > Can´t it at least check with power management if the Host
> > status is down
> > first?
> >
> > I mean, if the network is down there will be no response
> > from either PM
> > or Host. But if PM is up and can tell you that the Host is
> > down, sounds
> > rather clear cut to me...
> >
> > Seems to me the VM's would be restarted sooner if the flow
> > was altered
> > to first check with PM if it´s a network or Host issue,
> > and if Host
> > issue, immediately restart VM's on another Host, instead
> > of waiting for
> > a potentially problematic Host to boot up eventually.
> >
> > /K
> >
> > since vm's can continue running on the host even if
> > engine has no access
> > to it, starting the vm's on the second host can cause
> > split brain and
> > data corruption.
> >
> > The way that the engine knows what's going on is by
> > sending health check
> > queries to the vdsm.
> > Power 

Re: [Users] networking: basic vlan help

2014-01-27 Thread Juan Pablo Lorier
Hi Mike,

Although making ovirtmgmt a non-VM network by default would be nice, it
wouldn't be enough, because it still wouldn't allow mixed traffic on
other interfaces. The way I see it, the fix should be to add this
ability to oVirt. I can't judge what security restrictions a big
corporation may need, but as a small company I'm willing to take the
risk of an unlikely security breach in exchange for being able to use
untagged and tagged VLANs on the same NIC.
Regards,

On 26/01/14 11:40, Mike Kolesnik wrote:
> - Original Message -
>> On 01/23/2014 08:34 PM, Juan Pablo Lorier wrote:
>>> Hi Itamar,
>>>
>>> I don't know if I get your post right, but to me, it seems that if so
>>> many users hit the same rock, it should mean that this should be
>>> documented somewhere visible and in my opinion, push on getting bug
>>> 1049476  solved asap.
>>> Regards,
>> 1. yes, too many issues on this one, hinting we should provide better
>> text explaining this in the UI.
>>
>> 2. the bug you referenced[1]
>> Bug 1049476 - [RFE] Mix untagged and tagged Logical Networks on the same NIC
>>
>> is actually supported, as long as the untagged logical network is not a
>> VM network (so VMs associated with it would not be able to see/create
>> other logical networks traffic).
>>
>> 3. considering how prevalent this is, maybe we should allow doing this,
>> even for VM networks, with a big red warning, rather than block it,
>> which seems to be failing everyone.
> Besides that it's technically not possible in the way we currently use the 
> Linux Bridge [1],
> I'm not sure what's to gain from representing a single "flat" network with 
> multiple representations.
>
> Seems to me like there may be a couple different points here:
> * ovirtmgmt is VM network by default - should be configurable on setup and/or 
> DC creation.
>   If it's such a prevalent issue, we should consider a default of non VM 
> network (users can create a flat network and use it quite easily anyway, if 
> they want).
> * if people want to represent different L3 networks on the same L2 network, 
> it is worthwhile to design a proper solution
>
> Either way, I wouldn't push for allowing multiple bridged networks on the 
> same physical interface (or bond).
>
> [1] and also not allowed in OpenStack Neutron IIUC.
>
>> cc-ing some more folks for their thoughts.
>>
>>
>> [1] in the future, please use number-name format so not everyone would
>> have to open it to understand
>>


Re: [Users] oVirt 3.4 test day - Template Versions

2014-01-27 Thread Omer Frenkel
Thanks for the feedback! much appreciated.

- Original Message -
> From: "Federico Simoncelli" 
> To: "oVirt Users List" 
> Cc: "Omer Frenkel" , "Itamar Heim" 
> Sent: Monday, January 27, 2014 5:12:38 PM
> Subject: oVirt 3.4 test day - Template Versions
> 
> Feature tested:
> 
> http://www.ovirt.org/Features/Template_Versions
> 
> - create a new vm vm1 and make a template template1 from it
> - create a new vm vm2 based on template1 and make some changes
> - upgrade to 3.4
> - create a new template template1.1 from vm2
> - create a new vm vm3 from template1 (clone) - content ok
> - create a new vm vm4 from template1.1 (thin) - content ok
> - create a new vm vm5 from template1 last (thin) - content ok (same as 1.1)
> - try to remove template1 (failed as template1.1 is still present)
> - try to remove template1.1 (failed as vm5 is still present)
> - create a new vm vm6 and make a template blank1.1 as new version of the
>   blank template (succeeded)
> - create a vm pool vmpool1 with the "latest" template from template1
> - create a vm pool vmpool2 with the "template1.1" (last) template from
> template1
> - start vmpool1 and vmpool2 and verify that the content is the same
> - create a new template template1.2
> - start vmpool1 and verify that the content is the same as latest
> (template1.2)
> - start vmpool2 and verify that the content is the same as template1.1
> 
> Suggestions:
> 
> - the template blank is special, I am not sure if allowing versioning may
>   be confusing (for example is not even editable)

Right, I also thought about this. My thinking was not to block the user from
doing this, but if it is confusing we had better block it.

> - as far as I can see the "Sub Version Name" is not editable anymore (after
>   picking it)

Thanks, I see it's missing in the UI. Would you mind opening a bug for that?

> 
> --
> Federico
> 



[Users] Multi-Host Network Configuration

2014-01-27 Thread Piotr Kliczewski
Hi,

I played with the multi-host network configuration and used two boxes to
test it: one el6 (vdsm) and one f19 (engine, vdsm). During the test I
noticed that vdsm on f19 hadn't joined the cluster (known issue). I
modified the VLAN and MTU. Both boxes were modified, but I then noticed
that I had accidentally modified the ovirtmgmt network and lost
connectivity. The el6 box recovered whereas the f19 box didn't. I think
that was because its vdsm was local to the engine. I spent some time
trying to recover the network configuration.

Thanks,
Piotr


[Users] oVirt 3.4 test day - Template Versions

2014-01-27 Thread Federico Simoncelli
Feature tested:

http://www.ovirt.org/Features/Template_Versions

- create a new vm vm1 and make a template template1 from it
- create a new vm vm2 based on template1 and make some changes
- upgrade to 3.4
- create a new template template1.1 from vm2
- create a new vm vm3 from template1 (clone) - content ok
- create a new vm vm4 from template1.1 (thin) - content ok
- create a new vm vm5 from template1 last (thin) - content ok (same as 1.1)
- try to remove template1 (failed as template1.1 is still present)
- try to remove template1.1 (failed as vm5 is still present)
- create a new vm vm6 and make a template blank1.1 as new version of the
  blank template (succeeded)
- create a vm pool vmpool1 with the "latest" template from template1
- create a vm pool vmpool2 with the "template1.1" (last) template from template1
- start vmpool1 and vmpool2 and verify that the content is the same
- create a new template template1.2
- start vmpool1 and verify that the content is the same as latest (template1.2)
- start vmpool2 and verify that the content is the same as template1.1
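
A minimal sketch of the pool behaviour observed in the last few steps, assuming a pool either follows "latest" or stays pinned to one named version. The function `resolve_pool_template` and its arguments are hypothetical and only model the results seen in this test; they are not oVirt code.

```python
def resolve_pool_template(versions, pinned=None):
    """Pick the template version a pool VM starts from.

    versions: version names in creation order, oldest first.
    pinned:   a specific version name, or None to follow "latest".
    """
    if pinned is None:
        return versions[-1]  # "latest" always follows the newest version
    if pinned not in versions:
        raise ValueError(f"unknown template version: {pinned}")
    return pinned


versions = ["template1", "template1.1"]
# vmpool1 follows "latest", vmpool2 is pinned to template1.1.
before = (resolve_pool_template(versions),
          resolve_pool_template(versions, "template1.1"))

versions.append("template1.2")  # a new version is created
after = (resolve_pool_template(versions),
         resolve_pool_template(versions, "template1.1"))
```

Here `before` holds the same version for both pools, while in `after` only the "latest" pool has moved on to template1.2.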

Suggestions:

- the template blank is special, I am not sure if allowing versioning may
  be confusing (for example is not even editable)
- as far as I can see the "Sub Version Name" is not editable anymore (after
  picking it)

-- 
Federico


Re: [Users] OVirt 3.3.2 Snapshot Pane empty in Firefox

2014-01-27 Thread Daniel Erez


- Original Message -
> From: "Markus Stockhausen" 
> To: "Itamar Heim" , "ovirt-users" , 
> "Daniel Erez" 
> Cc: "Allon Mureinik" 
> Sent: Monday, January 27, 2014 1:33:38 PM
> Subject: AW: [Users] OVirt 3.3.2 Snapshot Pane empty in Firefox
> 
> > From: Itamar Heim [ih...@redhat.com]
> > Sent: Sunday, 26 January 2014 11:52
> > To: Markus Stockhausen; ovirt-users; Daniel Erez
> > Cc: Allon Mureinik
> > Subject: Re: [Users] OVirt 3.3.2 Snapshot Pane empty in Firefox
> > 
> > On 01/23/2014 12:49 PM, Markus Stockhausen wrote:
> > > Hello,
> > >
> > > I had a mysterious behaviour in the web interface twice this week.
> > > The Snapshot list of a VM remained empty in Firefox although
> > > I know that snapshots exist.
> > >
> > > The "Create Snapshot" button was still working and issued the
> > > right commands. Nevertheless the list remained empty.
> > >
> > > Opening an IE session proved that everything was ok. The
> > > pane was populated with the list of snapshots. After restarting
> > > Firefox everything was fine again.
> > >
> > > Has anybody experienced similar issues and if yes is there
> > > already an open BZ for that?
> > >
> > > Markus
> > >
> > >
> > >
> > 
> > derez?
> >
> 
> Got the same behaviour again this morning. But now I have
> hopefully a good idea. I unlocked my screen and the last OVirt
> session in the (Firefox) browser has flipped back to the login
> screen.
> 
> I relogged in and directly jumped to the snapshot pane. It was
> empty and the create button worked as usual. Nevertheless
> the autorefresh did not work. The pane simply stayed empty
> even after creating three snapshots in a row. Btw. the log
> history showed all the actions I was doing.
> 
> To be sure that is the reason I left a Ovirt admin browser window
> open. Now for over 2 hours without a switchback to the login
> page.
> 
> Any idea how I can force that behaviour? A normal cycle of
> logout/login does not produce the error.

This could indeed be related to the session timeout; some issues
regarding it have been addressed in 3.4. The default timeout
is 30 minutes, defined by UserSessionTimeOutInterval in the
vdc_options table. Can you consistently reproduce the issue
on timeout? Which version of Firefox are you using?

> 
> Markus
> 
> 


Re: [Users] Centos 6.5: mom spam

2014-01-27 Thread Itamar Heim

On 01/27/2014 04:39 PM, Sven Kieske wrote:

Hi,

What about older but still supported oVirt versions?
I ask on behalf of all the 3.3.x people; we could use the bugfixes
from mom, too.


the host side packages are all supposed to be backward compatible, so
once this is fixed for the latest vdsm/mom/etc., you should be able to use
this with ovirt 3.2 engine as well, etc.?




On 24.01.2014 11:44, Dan Kenigsberg wrote:

On Thu, Jan 23, 2014 at 04:23:48PM -0500, Adam Litke wrote:

On 23/01/14 23:02 +0200, Itamar Heim wrote:

On 01/23/2014 08:04 PM, Adam Litke wrote:

On 23/01/14 10:43 -0300, Federico Alberto Sayd wrote:

On 22/01/14 21:32, Dan Kenigsberg wrote:

On Wed, Jan 22, 2014 at 01:57:35PM -0300, Federico Alberto Sayd wrote:

On 22/01/14 12:13, Dan Kenigsberg wrote:




The supervdsm.log with the lines logged yesterday:

http://pastebin.com/kpXrRd2w


On an unrelated matter: this log includes

MainProcess|PolicyEngine::DEBUG::2014-01-21
13:13:41,187::supervdsmServer::95::SuperVdsm.ServerCallback::(wrapper) call
ksmTune with ({},) {}
MainProcess|PolicyEngine::DEBUG::2014-01-21
13:13:41,188::supervdsmServer::102::SuperVdsm.ServerCallback::(wrapper)
return ksmTune with None
MainProcess|PolicyEngine::DEBUG::2014-01-21
13:13:51,220::supervdsmServer::95::SuperVdsm.ServerCallback::(wrapper) call
ksmTune with ({},) {}
MainProcess|PolicyEngine::DEBUG::2014-01-21
13:13:51,220::supervdsmServer::102::SuperVdsm.ServerCallback::(wrapper)
return ksmTune with None

spam every 10 seconds. Federico, which version of mom do you have
installed? I
thought we have solved a similar issue in the past.
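
To confirm the interval on a larger log, a short script along these lines could help. It is only a sketch: the function name `ksmtune_intervals` is made up, and the sample assumes that each wrapped entry quoted above is a single line in the real supervdsm.log.

```python
import re
from datetime import datetime

# The four entries quoted above, unwrapped to one line each (assumption).
LOG = """\
MainProcess|PolicyEngine::DEBUG::2014-01-21 13:13:41,187::supervdsmServer::95::SuperVdsm.ServerCallback::(wrapper) call ksmTune with ({},) {}
MainProcess|PolicyEngine::DEBUG::2014-01-21 13:13:41,188::supervdsmServer::102::SuperVdsm.ServerCallback::(wrapper) return ksmTune with None
MainProcess|PolicyEngine::DEBUG::2014-01-21 13:13:51,220::supervdsmServer::95::SuperVdsm.ServerCallback::(wrapper) call ksmTune with ({},) {}
MainProcess|PolicyEngine::DEBUG::2014-01-21 13:13:51,220::supervdsmServer::102::SuperVdsm.ServerCallback::(wrapper) return ksmTune with None
"""

# Capture the timestamp of every "call ksmTune" entry; "return" lines are skipped.
CALL = re.compile(
    r"::DEBUG::(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::.*\bcall ksmTune\b"
)


def ksmtune_intervals(text):
    """Return the gaps, in seconds, between consecutive ksmTune calls."""
    stamps = []
    for line in text.splitlines():
        match = CALL.search(line)
        if match:
            stamps.append(
                datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
            )
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


intervals = ksmtune_intervals(LOG)
```

For the sample above the single gap comes out at roughly ten seconds, matching the "spam every 10 seconds" observation.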



mom-0.3.2-6.el6.noarch


This is the upstream version of mom which (sadly) is far too old and
is missing a bunch of features.  For Fedora, we are building master
and releasing versions like mom-0.3.2-20140120.gitfd877c5.fc20 [1]
which have lots of ovirt-specific fixes applied.

The fix is to have someone rebuild the mom RPM for el6 since the last
refresh happened on August 13 (!).  For now, please try to upgrade to
this build of master [2].


any reason we don't build it via jenkins to resources.ovirt.org
like other rpms?


Yes, because (up until now) the mom build system was not compatible
with other oVirt projects.  That was fixed this week [1] so now we can
release proper RPMs into the oVirt repositories.  Next I want to do a
MOM version bump and synchronize the official Fedora packages with
what we have in oVirt.

Any opinion on whether we should adopt the current MOM master (build
system rewrite) this late in the 3.4 release process?  If so, I will
work with Sandro to make sure that rc1 has it updated.  Otherwise, we
can do that for 3.4.1.


Currently 3.4 beta includes
http://resources.ovirt.org/releases/beta/rpm/EL/6/noarch/mom-0.3.2-3.el6.noarch.rpm
which is sorely outdated, as you say. We should fix this as soon as
possible. Branching one week too late is a small price to pay, imo.


Re: [Users] Centos 6.5: mom spam

2014-01-27 Thread Sven Kieske
Hi,

What about older but still supported oVirt versions?
I ask on behalf of all the 3.3.x people; we could use the bugfixes
from mom, too.

On 24.01.2014 11:44, Dan Kenigsberg wrote:
> On Thu, Jan 23, 2014 at 04:23:48PM -0500, Adam Litke wrote:
>> On 23/01/14 23:02 +0200, Itamar Heim wrote:
>>> On 01/23/2014 08:04 PM, Adam Litke wrote:
>>>> On 23/01/14 10:43 -0300, Federico Alberto Sayd wrote:
>>>>> On 22/01/14 21:32, Dan Kenigsberg wrote:
>>>>>> On Wed, Jan 22, 2014 at 01:57:35PM -0300, Federico Alberto Sayd wrote:
>>>>>>> On 22/01/14 12:13, Dan Kenigsberg wrote:
>>>>>>
>>>>>>
>>>>>>> The supervdsm.log with the lines logged yesterday:
>>>>>>>
>>>>>>> http://pastebin.com/kpXrRd2w
>>>>>>>
>>>>>> On an unrelated matter: this log includes
>>>>>>
>>>>>> MainProcess|PolicyEngine::DEBUG::2014-01-21
>>>>>> 13:13:41,187::supervdsmServer::95::SuperVdsm.ServerCallback::(wrapper)
>>>>>> call
>>>>>> ksmTune with ({},) {}
>>>>>> MainProcess|PolicyEngine::DEBUG::2014-01-21
>>>>>> 13:13:41,188::supervdsmServer::102::SuperVdsm.ServerCallback::(wrapper)
>>>>>> return ksmTune with None
>>>>>> MainProcess|PolicyEngine::DEBUG::2014-01-21
>>>>>> 13:13:51,220::supervdsmServer::95::SuperVdsm.ServerCallback::(wrapper)
>>>>>> call
>>>>>> ksmTune with ({},) {}
>>>>>> MainProcess|PolicyEngine::DEBUG::2014-01-21
>>>>>> 13:13:51,220::supervdsmServer::102::SuperVdsm.ServerCallback::(wrapper)
>>>>>> return ksmTune with None
>>>>>>
>>>>>> spam every 10 seconds. Federico, which version of mom do you have
>>>>>> installed? I
>>>>>> thought we have solved a similar issue in the past.
>>>>>>
>>>>>>
>>>>> mom-0.3.2-6.el6.noarch
>>>>
>>>> This is the upstream version of mom which (sadly) is far too old and
>>>> is missing a bunch of features.  For Fedora, we are building master
>>>> and releasing versions like mom-0.3.2-20140120.gitfd877c5.fc20 [1]
>>>> which have lots of ovirt-specific fixes applied.
>>>>
>>>> The fix is to have someone rebuild the mom RPM for el6 since the last
>>>> refresh happened on August 13 (!).  For now, please try to upgrade to
>>>> this build of master [2].
>>>
>>> any reason we don't build it via jenkins to resources.ovirt.org
>>> like other rpms?
>>
>> Yes, because (up until now) the mom build system was not compatible
>> with other oVirt projects.  That was fixed this week [1] so now we can
>> release proper RPMs into the oVirt repositories.  Next I want to do a
>> MOM version bump and synchronize the official Fedora packages with
>> what we have in oVirt.
>>
>> Any opinion on whether we should adopt the current MOM master (build
>> system rewrite) this late in the 3.4 release process?  If so, I will
>> work with Sandro to make sure that rc1 has it updated.  Otherwise, we
>> can do that for 3.4.1.
> 
> Currently 3.4 beta includes
> http://resources.ovirt.org/releases/beta/rpm/EL/6/noarch/mom-0.3.2-3.el6.noarch.rpm
> which is sorely outdated, as you say. We should fix this as soon as
> possible. Branching one week too late is a small price to pay, imo.

-- 
Mit freundlichen Grüßen / Regards

Sven Kieske

Systemadministrator
Mittwald CM Service GmbH & Co. KG
Königsberger Straße 6
32339 Espelkamp
T: +49-5772-293-100
F: +49-5772-293-333
https://www.mittwald.de
Geschäftsführer: Robert Meyer
St.Nr.: 331/5721/1033, USt-IdNr.: DE814773217, HRA 6640, AG Bad Oeynhausen
Komplementärin: Robert Meyer Verwaltungs GmbH, HRB 13260, AG Bad Oeynhausen


Re: [Users] Reboot causes poweroff of VM 3.4 Beta

2014-01-27 Thread Michal Skrivanek

On Jan 27, 2014, at 14:55 , Jonathan Archer  wrote:

> On 27/01/2014 13:15, Michal Skrivanek wrote:
> 
>> On Jan 27, 2014, at 12:21 , Jonathan Archer  wrote:
>>> On 27/01/2014 11:03, Michal Skrivanek wrote:
>>>> On Jan 27, 2014, at 11:59 , Jonathan Archer  wrote:
>>>>> On 27/01/2014 10:56, Michal Skrivanek wrote:
>>>>>> On Jan 27, 2014, at 11:46 , Jonathan Archer  wrote:
>>>>>>> On 25/01/2014 20:02, Roy Golan wrote:
>>>>>>>> Please attach engine.log and vdsm.log
>>>>>>>> On Jan 25, 2014 5:59 PM, Jon Archer < j...@rosslug.org.uk> wrote:
>>>>>>>>> Hi,
>>>>>>>>> Seem to be suffering an issue in 3.4 where if a vm is rebooted it
>>>>>>>>> actually shuts down, this occurs for all guests regardless of OS
>>>>>>>>> installed within. Anyone seen this?
>>>>>>>>> Jon
>>>>>>> Attached is a sample of the engine.log and vdsm.log during a "reboot"
>>>>>>> the VM as previously stated shutdown rather than reboot,. There wasn't
>>>>>>> anything in the server.log around the time, the previous entry was at
>>>>>>> least 30 mins before.
>>>>>> Hi, your log doesn't contain the relevant part, there's no command
>>>>>> logged, other than "Message: VM wcsmail01 is down. Exit message: User
>>>>>> shut down" which means the guest was shut down from inside of the OS
>>>>>> how did you trigger the reboot/shutdown? Thanks, michal
>>>>>>> Thanks
>>>>> The reboot was triggered using the reboot command at the console.
>>>> ok, makes sense there's nothing in the log
>>>>> I also have a windows 7 guest, which shuts down when the reboot option is
>>>>> selected.
>>>> what doesn't make sense is the behavior:) It should simply reboot, there's
>>>> nothing oVirt is doing in this case…could it be your OS is configured(or
>>>> has decided) to shutdown instead?
>>>>> Jon
>>> Hi, I'd be pointed towards the guest OS if it wasn't for 2 things: 1) this 
>>> happens to all guests of both windows and Linux flavours 2) the guests are 
>>> just plain vanilla installs with nothing special.
>> that is weird. Any special/non-default setting?
>> do you have libvirt/qemu logs? VDSM doesn't say much other than a clean 
>> user-initiated shutdown happened.
>> is ACPI enabled in the guest (unless you manually changed config it always 
>> is)
>> any logs from the guest?
>> 
>> Thanks,
>> michal
>> 
>>> Jon
> Yes it does seem to be weird, nothing special about my setup. Simple gluster 
> setup converted from all-in-one.
> 
> As I mentioned the guests are vanilla with regard to configs.
> 
> Just had a look in the libvirt and qemu logs, nothing in the libvirt, and
> nothing exciting in the qemu log
> 
> 2014-01-27 13:37:16.842+: starting up
> LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin 
> QEMU_AUDIO_DRV=spice /usr/libexec/qemu-kvm -name wcssrv01 -S -M rhel6.5.0 
> -cpu Conroe,+vmx -enable-kvm -m 1024 -realtime mlock=off -smp 
> 1,sockets=1,cores=1,threads=1 -uuid 350e285d-8c3a-494b-b484-27784caae976 
> -smbios type=1,manufacturer=oVirt,product=oVirt 
> Node,version=6-5.el6.centos.11.2,serial=439CF517-D52F-11DF-BBDA-C0970C0278AC,uuid=350e285d-8c3a-494b-b484-27784caae976
>  -nodefconfig -nodefaults -chardev 
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/wcssrv01.monitor,server,nowait
>  -mon chardev=charmonitor,id=monitor,mode=control -rtc 
> base=2014-01-27T13:37:16,driftfix=slew -no-shutdown -device 
> piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device 
> virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x4 -device 
> virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive 
> file=/rhev/data-center/mnt/cauldron.arclab:_ISO__DOMAIN/abf4ff41-eb0a-4580-9ad0-bae53d56c6b0/images/----/CentOS-6.5-x86_64-minimal.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw,serial=
>  -device 
> ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=2 
> -drive 
> file=/rhev/data-center/mnt/glusterSD/cauldron.arclab:_data/c074a6e3-87be-445e-890f-601c215ba320/images/24cf5dba-6b4a-430c-9a15-cfdebe0c1f09/9e47f51f-5bc5-47a0-8f98-c8a14626c393,if=none,id=drive-virtio-disk0,format=raw,serial=24cf5dba-6b4a-430c-9a15-cfdebe0c1f09,cache=none,werror=stop,rerror=stop,aio=threads
>  -device 
> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
>  -netdev tap,fd=31,id=hostnet0,vhost=on,vhostfd=32 -device 
> virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:2d:98:0e,bus=pci.0,addr=0x3,bootindex=3
>  -chardev 
> socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/350e285d-8c3a-494b-b484-27784caae976.com.redhat.rhevm.vdsm,server,nowait
>  -device 
> virtserialport,bus=virtio-serial0.0,nr=1,chardev=

Re: [Users] Reboot causes poweroff of VM 3.4 Beta

2014-01-27 Thread Jonathan Archer
 

On 27/01/2014 13:15, Michal Skrivanek wrote: 

> On Jan 27, 2014, at 12:21 , Jonathan Archer  wrote:
>> On 27/01/2014 11:03, Michal Skrivanek wrote:
>>> On Jan 27, 2014, at 11:59 , Jonathan Archer  wrote:
>>>> On 27/01/2014 10:56, Michal Skrivanek wrote:
>>>>> On Jan 27, 2014, at 11:46 , Jonathan Archer  wrote:
>>>>>> On 25/01/2014 20:02, Roy Golan wrote:
>>>>>>> Please attach engine.log and vdsm.log
>>>>>>> On Jan 25, 2014 5:59 PM, Jon Archer < j...@rosslug.org.uk> wrote:
>>>>>>>> Hi,
>>>>>>>> Seem to be suffering an issue in 3.4 where if a vm is rebooted it
>>>>>>>> actually shuts down, this occurs for all guests regardless of OS
>>>>>>>> installed within. Anyone seen this?
>>>>>>>> Jon
>>>>>> Attached is a sample of the engine.log and vdsm.log during a "reboot"
>>>>>> the VM as previously stated shutdown rather than reboot,. There wasn't
>>>>>> anything in the server.log around the time, the previous entry was at
>>>>>> least 30 mins before.
>>>>> Hi, your log doesn't contain the relevant part, there's no command
>>>>> logged, other than "Message: VM wcsmail01 is down. Exit message: User
>>>>> shut down" which means the guest was shut down from inside of the OS
>>>>> how did you trigger the reboot/shutdown? Thanks, michal
>>>>>> Thanks
>>>> The reboot was triggered using the reboot command at the console.
>>> ok, makes sense there's nothing in the log
>>>> I also have a windows 7 guest, which shuts down when the reboot option is
>>>> selected.
>>> what doesn't make sense is the behavior:) It should simply reboot,
>>> there's nothing oVirt is doing in this case…could it be your OS is
>>> configured(or has decided) to shutdown instead?
>>>> Jon
>> Hi, I'd be pointed towards the guest OS if it wasn't for 2 things: 1)
>> this happens to all guests of both windows and Linux flavours 2) the
>> guests are just plain vanilla installs with nothing special.
> that is weird. Any special/non-default setting?
> do you have libvirt/qemu logs? VDSM doesn't say much other than a clean
> user-initiated shutdown happened.
> is ACPI enabled in the guest (unless you manually changed config it
> always is)
> any logs from the guest?
>
> Thanks,
> michal
>
>> Jon

Yes it does seem to be weird, nothing special about my setup. Simple
gluster setup converted from all-in-one. 

As I mentioned the guests are vanilla with regard to configs. 

Just had a look in the libvirt and qemu logs, nothing in the libvirt, and
nothing exciting in the qemu log 

2014-01-27 13:37:16.842+: starting up
LC_ALL=C
PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin
QEMU_AUDIO_DRV=spice /usr/libexec/qemu-kvm -name wcssrv01 -S -M
rhel6.5.0 -cpu Conroe,+vmx -enable-kvm -m 1024 -realtime mlock=off -smp
1,sockets=1,cores=1,threads=1 -uuid 350e285d-8c3a-494b-b484-27784caae976
-smbios type=1,manufacturer=oVirt,product=oVirt
Node,version=6-5.el6.centos.11.2,serial=439CF517-D52F-11DF-BBDA-C0970C0278AC,uuid=350e285d-8c3a-494b-b484-27784caae976
-nodefconfig -nodefaults -chardev
socket,id=charmonitor,path=/var/lib/libvirt/qemu/wcssrv01.monitor,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control -rtc
base=2014-01-27T13:37:16,driftfix=slew -no-shutdown -device
piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device
virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x4 -device
virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive
file=/rhev/data-center/mnt/cauldron.arclab:_ISO__DOMAIN/abf4ff41-eb0a-4580-9ad0-bae53d56c6b0/images/----/CentOS-6.5-x86_64-minimal.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw,serial=
-device
ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=2
-drive
file=/rhev/data-center/mnt/glusterSD/cauldron.arclab:_data/c074a6e3-87be-445e-890f-601c215ba320/images/24cf5dba-6b4a-430c-9a15-cfdebe0c1f09/9e47f51f-5bc5-47a0-8f98-c8a14626c393,if=none,id=drive-virtio-disk0,format=raw,serial=24cf5dba-6b4a-430c-9a15-cfdebe0c1f09,cache=none,werror=stop,rerror=stop,aio=threads
-device
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
-netdev tap,fd=31,id=hostnet0,vhost=on,vhostfd=32 -device
virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:2d:98:0e,bus=pci.0,addr=0x3,bootindex=3
-chardev
socket,id=charchannel0,path=/var/lib/libvirt/qemu/channels/350e285d-8c3a-494b-b484-27784caae976.com.redhat.rhevm.vdsm,server,nowait
-device
virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm
-chardev
socket,id=charchannel1,path=/var/lib/libvirt/qemu/channels/350e285d-8c3a-494b-b484-27784caae976.org.qemu.guest_agent.0,server,nowait
-device
virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.0
-chardev spicevmc,id=charchannel2,name=vdagent -device
virtserialport,bus=virt

Re: [Users] Issues starting hosted engine VM

2014-01-27 Thread Andrew Lau
I've opened a BZ https://bugzilla.redhat.com/show_bug.cgi?id=1058300

Itamar, in BZ 1055461 is that last comment directed to me? I don't seem to
have the option anywhere to set Target Release

Andrew.

On Fri, Jan 24, 2014 at 8:06 AM, Andrew Lau  wrote:

> Sorry, I must've overlooked this email - I'll try reproduce and open a new
> bz.
>
>
> On Fri, Jan 24, 2014 at 4:13 AM, Doron Fediuck wrote:
>
>>
>>
>> - Original Message -
>> > From: "Itamar Heim" 
>> > To: "Andrew Lau" 
>> > Cc: "users" 
>> > Sent: Monday, January 20, 2014 2:33:08 PM
>> > Subject: Re: [Users] Issues starting hosted engine VM
>> >
>> > On 01/20/2014 02:27 PM, Andrew Lau wrote:
>> > > On Mon, Jan 20, 2014 at 11:19 PM, Itamar Heim > > > >wrote:
>> > >
>> > > On 01/20/2014 01:19 PM, Andrew Lau wrote:
>> > >
>> > > Hi,
>> > >
>> > > That bug seems to be private :(
>> > >
>> > > I'm interested also to hear about this feature, as with 3.3.2
>> I
>> > > had my
>> > > gluster vms go into paused state quite a few times and they
>> > > actually
>> > > couldn't be resumed at all, they needed to be forced off and
>> > > back on.
>> > >
>> > >
>> > > did the storage domain go back to up and they remained down?
>> > >
>> > > Yup, the storage domain went down and when it came back up the VMs
>> > > remained paused.
>> >
>> > please open a bug with repro steps in that case and attach logs. thanks
>> >
>> > >
>> > >
>> > > On Mon, Jan 20, 2014 at 10:13 PM, Dafna Ron wrote:
>> > >
>> > >  interesting... :) so this is now configurable...
>> > >  what happens if qemu fails to start the vm (this happens sometimes -
>> > >  mostly on file type storage). do we have a re-try or a specific
>> > >  error telling the user that the activation failed and manual
>> > >  intervention is required?
>> > >
>> >
>>
>> Andrew,
>> did you manage to open a bug for the resume issue?
>>
>
>


Re: [Users] Power saving cluster policy (3.4 test day)

2014-01-27 Thread Karli Sjöberg
On Mon, 2014-01-27 at 08:09 -0500, Doron Fediuck wrote:
> 
> - Original Message -
> > From: "Lior Vernia" 
> > To: users@ovirt.org
> > Sent: Monday, January 27, 2014 1:45:10 PM
> > Subject: [Users] Power saving cluster policy (3.4 test day)
> > 
> > Hello,
> > 
> > Just wanted to drop a line here about this feature I tested last week,
> > which I thought was cool. I only tested it with two hosts and a single
> > VM, using one of the hosts as a reserve, but it was quite satisfying to
> > see the hosts being powered off and on as the VM was being run and stopped.
> > 
> > This could be interesting to look at for those of you running a
> > moderately-sized deployment, if often the load on your hosts is low.
> > 
> > Yours, Lior.
> 
> Thanks for the feedback, Lior.
> I'd love to get feedback from others who will give it a try, as this
> was requested on various occasions.

Yes, I for one am very interested in testing this feature, hadn't
realized it had been implemented already :)

Will report back as soon as I have had a chance to test.

/K

> 
> Doron



-- 

Best regards

---
Karli Sjöberg
Swedish University of Agricultural Sciences Box 7079 (Visiting Address
Kronåsvägen 8)
S-750 07 Uppsala, Sweden
Phone:  +46-(0)18-67 15 66
karli.sjob...@slu.se


Re: [Users] Reboot causes poweroff of VM 3.4 Beta

2014-01-27 Thread Michal Skrivanek

On Jan 27, 2014, at 12:21 , Jonathan Archer  wrote:

> On 27/01/2014 11:03, Michal Skrivanek wrote:
> 
>> On Jan 27, 2014, at 11:59 , Jonathan Archer  wrote:
>>> On 27/01/2014 10:56, Michal Skrivanek wrote:
>>>> On Jan 27, 2014, at 11:46 , Jonathan Archer wrote:
>>>>> On 25/01/2014 20:02, Roy Golan wrote:
>>>>>> Please attach engine.log and vdsm.log
>>>>>> On Jan 25, 2014 5:59 PM, Jon Archer < j...@rosslug.org.uk> wrote:
>>>>>>> Hi,
>>>>>>> Seem to be suffering an issue in 3.4 where if a vm is rebooted it
>>>>>>> actually shuts down, this occurs for all guests regardless of OS
>>>>>>> installed within. Anyone seen this?
>>>>>>> Jon
>>>>> Attached is a sample of the engine.log and vdsm.log during a "reboot" the
>>>>> VM as previously stated shutdown rather than reboot. There wasn't
>>>>> anything in the server.log around the time, the previous entry was at
>>>>> least 30 mins before.
>>>> Hi, your log doesn't contain the relevant part, there's no command logged,
>>>> other than "Message: VM wcsmail01 is down. Exit message: User shut down"
>>>> which means the guest was shut down from inside of the OS
>>>> how did you trigger the reboot/shutdown?
>>>> Thanks, michal
>>>>> Thanks
>>> The reboot was triggered using the reboot command at the console.
>> ok, makes sense there's nothing in the log
>>> I also have a windows 7 guest, which shuts down when the reboot option is 
>>> selected.
>> what doesn't make sense is the behavior:) It should simply reboot, there's 
>> nothing oVirt is doing in this case…could it be your OS is configured(or has 
>> decided) to shutdown instead?
>>> Jon
> Hi,
> 
> I'd be pointed towards the guest OS if it wasn't for 2 things:
> 
> 1) this happens to all guests of both windows and Linux flavours
> 
> 2) the guests are just plain vanilla installs with nothing special.

that is weird. Any special/non-default setting?
do you have libvirt/qemu logs? VDSM doesn't say much other than a clean 
user-initiated shutdown happened.
is ACPI enabled in the guest (unless you manually changed config it always is)
any logs from the guest?

Thanks,
michal

> 
> Jon
> 
>  



Re: [Users] Power saving cluster policy (3.4 test day)

2014-01-27 Thread Doron Fediuck


- Original Message -
> From: "Lior Vernia" 
> To: users@ovirt.org
> Sent: Monday, January 27, 2014 1:45:10 PM
> Subject: [Users] Power saving cluster policy (3.4 test day)
> 
> Hello,
> 
> Just wanted to drop a line here about this feature I tested last week,
> which I thought was cool. I only tested it with two hosts and a single
> VM, using one of the hosts as a reserve, but it was quite satisfying to
> see the hosts being powered off and on as the VM was being run and stopped.
> 
> This could be interesting to look at for those of you running a
> moderately-sized deployment, if often the load on your hosts is low.
> 
> Yours, Lior.

Thanks for the feedback, Lior.
I'd love to get feedback from others who will give it a try, as this
was requested on various occasions.

Doron


Re: [Users] changing hostname in ovirt

2014-01-27 Thread Alon Bar-Lev


- Original Message -
> From: "Sven Kieske" 
> To: d...@redhat.com
> Cc: "Alon Bar-Lev" , "Users@ovirt.org List" 
> 
> Sent: Monday, January 27, 2014 2:52:03 PM
> Subject: Re: [Users] changing hostname in ovirt
> 
> well, that's not what I want, because I'm talking about a local
> storage DC. I just want to change the hosts address which ovirt
> uses to connect to the host.
> 
> This isn't possible without changing the certificates (re-deploy)?

You can, it just has to be done in a lot of places.

Generate a certificate request with an empty subject using the
/etc/pki/vdsm/keys/vdsmkey.pem key:

# openssl req -new -key /etc/pki/vdsm/keys/vdsmkey.pem -subj "/"

Use the engine utility /usr/share/ovirt-engine/bin/pki-enroll-request.sh to
enroll a new certificate with the new host name:

# cat > /etc/pki/ovirt-engine/requests/xxx.req

# /usr/share/ovirt-engine/bin/pki-enroll-request.sh --name=xxx --subject="/CN=/O=/C=xxx"
# cat /etc/pki/ovirt-engine/certs/xxx.cer

copy the certificate into:
/etc/pki/vdsm/certs/vdsmcert.pem
/etc/pki/vdsm/libvirt-spice/server-cert.pem
/etc/pki/libvirt/clientcert.pem
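
The copy step above could be scripted roughly as follows. This is only a sketch, not part of any oVirt tooling: the function name `install_certificate` and its `root` parameter (added so the copy can be tried against a scratch directory first) are assumptions made for illustration; the three destination paths are the ones listed above.

```python
import shutil
from pathlib import Path

# Destination paths from the list above, kept relative so that they can be
# rebased onto a scratch directory via the (hypothetical) `root` parameter.
TARGETS = [
    "etc/pki/vdsm/certs/vdsmcert.pem",
    "etc/pki/vdsm/libvirt-spice/server-cert.pem",
    "etc/pki/libvirt/clientcert.pem",
]


def install_certificate(cert_file, root="/"):
    """Copy cert_file to every target path under root and return the paths written."""
    written = []
    for rel in TARGETS:
        dest = Path(root) / rel
        # On a real host these directories already exist; creating them only
        # matters when testing against an empty scratch directory.
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(cert_file, dest)
        written.append(dest)
    return written
```

Running it with `root` pointing at a temporary directory first shows exactly which files would be overwritten before touching a real host.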

> 
> On 27.01.2014 13:27, Dafna Ron wrote:
> > well, if you can re-deploy the hosts that would change the certificates
> > as well (create new ones with the new hostname).
> 
> --
> Mit freundlichen Grüßen / Regards
> 
> Sven Kieske
> 
> Systemadministrator
> Mittwald CM Service GmbH & Co. KG
> Königsberger Straße 6
> 32339 Espelkamp
> T: +49-5772-293-100
> F: +49-5772-293-333
> https://www.mittwald.de
> Geschäftsführer: Robert Meyer
> St.Nr.: 331/5721/1033, USt-IdNr.: DE814773217, HRA 6640, AG Bad Oeynhausen
> Komplementärin: Robert Meyer Verwaltungs GmbH, HRB 13260, AG Bad Oeynhausen


Re: [Users] two node ovirt cluster with HA

2014-01-27 Thread Dafna Ron

Andrew,
Once this discussion is finished, if what you would like done is not in
the current implementation, can you please open a bug/feature request for
it?


Thanks,

Dafna

On 01/27/2014 12:59 PM, Tareq Alayan wrote:

Adding Eli.


On 01/27/2014 02:50 PM, Andrew Lau wrote:

Hi,

I think he was asking what if the power management device reported 
that the host was powered off. Then VMs should be brought back up as 
being off would essentially be the same as running a power cycle/reboot?


Another example I'm seeing is what happens if the whole host loses 
power and it's power management device then becomes unavailable (ie. 
not reachable) then you're stuck in the case where it requires manual 
intervention.


I would be interested to potentially see something like a timeout on 
those problematic VMs (eg. if nothing was read or write after x 
amount of time) then you could consider the host as offline? I guess 
then that adds a lot of risk..



On Mon, Jan 27, 2014 at 11:43 PM, Tareq Alayan wrote:


Hi,

Power management makes use of special *dedicated* hardware in
order to restart hosts independently of the host OS. The engine
connects to a power management device using a *dedicated*
network IP address.
The engine is capable of rebooting hosts that have entered a
non-operational or non-responsive state.
The abilities provided by all power management devices are: check
status, start, stop and recycle (restart)...
In the case of non-responsive host: all of the VMs that are
currently running on that host can also become non-responsive.
However, the non-responsive host keeps locking the VM hard disk
for all VMs it is running. Attempting to start a VM on a
different host and assign the second host write privileges for
the virtual machine hard disk image can cause data corruption.
Rebooting allows the engine to assume that the lock on a VM hard
disk image has been released.
The engine can know for sure that the problematic host has been
rebooted via the power management device and then it can start a
VM from the problematic host on another host without risking data
corruption.
Important note: A virtual machine that has been marked
highly-available can not be safely started on a different host
without the certainty that doing so will not cause data corruption.

N-joy,

--Tareq




On 01/27/2014 02:05 PM, Dafna Ron wrote:

I am adding Tareq for the Power Management implementation.

Dafna


On 01/27/2014 11:48 AM, Karli Sjöberg wrote:

On Mon, 2014-01-27 at 11:11 +, Dafna Ron wrote:

Powering off the host will never trigger vm migration.
As far as engine is concerned it just lost connection to the host, but
has no way of telling if the host is down or if a router is down.

Can't it at least check with power management if the Host status is down
first?

I mean, if the network is down there will be no response from either PM
or Host. But if PM is up and can tell you that the Host is down, sounds
rather clear cut to me...

Seems to me the VM's would be restarted sooner if the flow was altered
to first check with PM if it's a network or Host issue, and if Host
issue, immediately restart VM's on another Host, instead of waiting for
a potentially problematic Host to boot up eventually.

/K

since vm's can continue running on the host even if engine has no access
to it, starting the vm's on the second host can cause split brain and
data corruption.

The way that the engine knows what's going on is by sending health check
queries to the vdsm.
Power management will try to reboot a host when the health checks to
vdsm will not be answered.
So... if engine gets no reply and has no way of rebooting the host, the
host status will be changed to Non-Responsive and the vm's will be
unknown because engine has no way of knowing what's happening with the
vm's.
Since reboot of the host will kill the vm's running on it - this will
never cause any vm migration but... along with the High-Availability vm
feature, you will be able to have some of the vm's re-started on the
second host after the host reboot (and that is only if Power Management
was confirmed as successful).

   

Re: [Users] two node ovirt cluster with HA

2014-01-27 Thread Karli Sjöberg
On Mon, 2014-01-27 at 14:43 +0200, Tareq Alayan wrote:
> Hi,
> 
> Power management makes use of special *dedicated* hardware in order to 
> restart hosts independently of host OS. The engine connects to a power 
> management devices using a *dedicated* network IP address.
> The engine is capable of rebooting hosts that have entered a 
> non-operational or non-responsive state,
> The abilities provided by all power management devices are: check 
> status, start, stop and recycle (restart)...
> 
> In the case of non-responsive host: all of the VMs that are currently 
> running on that host can also become non-responsive. However, the 
> non-responsive host keeps locking the VM hard disk for all VMs it is 
> running. Attempting to start a VM on a different host and assign the 
> second host write privileges for the virtual machine hard disk image can 
> cause data corruption.

Exactly! If Engine were to first check with the power management that
the problematic/non-responsive Host is indeed down, there would be no
risk of data corruption.

That's why I suggested a change in the HA flow to first check if the
Host is indeed down, and if so, just start the VM's on another Host.
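
To make the suggested flow concrete, here is a minimal sketch of the decision being discussed. It models the proposal in this thread, not what the engine actually implements; the names `PowerStatus` and `next_action` and the returned action strings are made up for illustration.

```python
from enum import Enum


class PowerStatus(Enum):
    """What the power management device reports about the host (hypothetical model)."""
    ON = "on"
    OFF = "off"
    UNREACHABLE = "unreachable"


def next_action(host_responds, pm_status):
    """Pick the next step for HA VMs on a host, following the flow proposed above."""
    if host_responds:
        # Health checks are answered, nothing to do.
        return "none"
    if pm_status is PowerStatus.OFF:
        # PM confirms the host is powered off, so it cannot still be writing
        # to the VM disks: restart the VMs elsewhere right away.
        return "restart_vms_elsewhere"
    if pm_status is PowerStatus.ON:
        # Host may still be running VMs: it has to be fenced (rebooted via PM)
        # before the VMs can be started on another host.
        return "fence_then_restart"
    # Neither host nor PM device answers: the engine cannot tell a network
    # failure from a dead host, so a human has to confirm.
    return "manual_intervention"
```

For example, `next_action(False, PowerStatus.OFF)` yields `"restart_vms_elsewhere"`, which is the shortcut being asked for, while an unreachable PM device still ends in manual intervention.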

/K

> Rebooting allows the engine to assume that the lock on a VM hard disk 
> image has been released.
> The engine can know for sure that the problematic host has been rebooted 
> via the power management device and then it can start a VM from the 
> problematic host on another host without risking data corruption.
> Important note: A virtual machine that has been marked highly-available 
> can not be safely started on a different host without the certainty that 
> doing so will not cause data corruption.
> 
> N-joy,
> 
> --Tareq
> 
> 
> 
> On 01/27/2014 02:05 PM, Dafna Ron wrote:
> > I am adding Tareq for the Power Management implementation.
> >
> > Dafna
> >
> >
> > On 01/27/2014 11:48 AM, Karli Sjöberg wrote:
> >> On Mon, 2014-01-27 at 11:11 +, Dafna Ron wrote:
> >>> Powering off the host will never trigger vm migration.
> >>> As far as engine is concerned it just lost connection to the host, but
> >>> has no way of telling if the host is down or if a router is down.
> >> Can´t it at least check with power management if the Host status is down
> >> first?
> >>
> >> I mean, if the network is down there will be no response from either PM
> >> or Host. But if PM is up and can tell you that the Host is down, sounds
> >> rather clear cut to me...
> >>
> >> Seems to me the VM's would be restarted sooner if the flow was altered
> >> to first check with PM if it´s a network or Host issue, and if Host
> >> issue, immediately restart VM's on another Host, instead of waiting for
> >> a potentially problematic Host to boot up eventually.
> >>
> >> /K
> >>
> >>> since vm's can continue running on the host even if engine has no 
> >>> access
> >>> to it, starting the vm's on the second host can cause split brain and
> >>> data corruption.
> >>>
> >>> The way that the engine knows what's going on is by sending heath check
> >>> queries to the vdsm.
> >>> Power management will try to reboot a host when the health checks to
> >>> vdsm will not be answered.
> >>> So... if engine gets no reply and has no way of rebooting the host, the
> >>> host status will be changed to Non-Responsive and the vm's will be
> >>> unknown because engine has no way of knowing what's happening with the
> >>> vm's.
> >>> Since reboot of the host will kill the vm's running on it - this will
> >>> never cause any vm migration but... along with the High-Availability vm
> >>> feature, you will be able to have some of the vm's re-started on the
> >>> second host after the host reboot (and that is only if Power Management
> >>> was confirmed as successful).
> >>>
> >>> VM migration is only triggered when:
> >>> 1. Cluster configuration states that the vm should be migrated in case
> >>> of failure
> >>> 2. Engine has access to the host - so the failure is on the storage 
> >>> side
> >>> and not the host side.
> >>> 3. the vms are not actively writing (although there might be a new RFE
> >>> for it).
> >>>
> >>> hope this clears things up
> >>>
> >>> Dafna
> >>>
> >>>
> >>>
> >>> On 01/27/2014 10:11 AM, Andrew Lau wrote:
> >>>> Hi,
> >>>>
> >>>> Have you got power management enabled?
> >>>>
> >>>> That's the fencing feature required for the engine to ensure that the
> >>>> host is actually offline. It won't resume any other VMs to prevent
> >>>> potential VM corruption (eg. VM running on multiple hosts).
> >>>>
> >>>> Andrew.
> >>>>
> >>>> On Jan 27, 2014 5:12 PM, "Jaison peter" wrote:
> >>>>
> >>>>  Hi all ,
> >>>>
> >>>>  I was setting a two node ovirt cluster with ovirt engine on
> >>>>  seperate node . I completed the configuration and tested VM  live
> >>>>  migrations with out any issues . Then for checking cluster HA I
> >>>>  powered down one host and expected vms running on that host to be
> >>>>  migrated to 

Re: [Users] changing hostname in ovirt

2014-01-27 Thread Sven Kieske
well, that's not what I want, because I'm talking about a local
storage DC. I just want to change the hosts address which ovirt
uses to connect to the host.

This isn't possible without changing the certificates (re-deploy)?

On 27.01.2014 13:27, Dafna Ron wrote:
> well, if you can re-deploy the hosts that would change the certificates
> as well (create new ones with the new hostname).

-- 
Mit freundlichen Grüßen / Regards

Sven Kieske

Systemadministrator
Mittwald CM Service GmbH & Co. KG
Königsberger Straße 6
32339 Espelkamp
T: +49-5772-293-100
F: +49-5772-293-333
https://www.mittwald.de
Geschäftsführer: Robert Meyer
St.Nr.: 331/5721/1033, USt-IdNr.: DE814773217, HRA 6640, AG Bad Oeynhausen
Komplementärin: Robert Meyer Verwaltungs GmbH, HRB 13260, AG Bad Oeynhausen


Re: [Users] two node ovirt cluster with HA

2014-01-27 Thread Andrew Lau
Hi,

I think he was asking what if the power management device reported that the
host was powered off. Then VMs should be brought back up as being off would
essentially be the same as running a power cycle/reboot?

Another example I'm seeing is what happens if the whole host loses power
and its power management device then becomes unavailable (ie. not
reachable): then you're stuck in the case where it requires manual
intervention.

I would be interested to potentially see something like a timeout on those
problematic VMs (eg. if nothing was read or written after x amount of time)
then you could consider the host as offline? I guess then that adds a lot
of risk..


On Mon, Jan 27, 2014 at 11:43 PM, Tareq Alayan  wrote:

> Hi,
>
> Power management makes use of special *dedicated* hardware in order to
> restart hosts independently of host OS. The engine connects to a power
> management devices using a *dedicated* network IP address.
> The engine is capable of rebooting hosts that have entered a
> non-operational or non-responsive state,
> The abilities provided by all power management devices are: check status,
> start, stop and recycle (restart)...
>
> In the case of non-responsive host: all of the VMs that are currently
> running on that host can also become non-responsive. However, the
> non-responsive host keeps locking the VM hard disk for all VMs it is
> running. Attempting to start a VM on a different host and assign the second
> host write privileges for the virtual machine hard disk image can cause
> data corruption.
> Rebooting allows the engine to assume that the lock on a VM hard disk
> image has been released.
> The engine can know for sure that the problematic host has been rebooted
> via the power management device and then it can start a VM from the
> problematic host on another host without risking data corruption.
> Important note: A virtual machine that has been marked highly-available
> can not be safely started on a different host without the certainty that
> doing so will not cause data corruption.
>
> N-joy,
>
> --Tareq
>
>
>
>
> On 01/27/2014 02:05 PM, Dafna Ron wrote:
>
>> I am adding Tareq for the Power Management implementation.
>>
>> Dafna
>>
>>
>> On 01/27/2014 11:48 AM, Karli Sjöberg wrote:
>>
>>> On Mon, 2014-01-27 at 11:11 +, Dafna Ron wrote:
>>>
>>>> Powering off the host will never trigger vm migration.
>>>> As far as engine is concerned it just lost connection to the host, but
>>>> has no way of telling if the host is down or if a router is down.

>>> Can´t it at least check with power management if the Host status is down
>>> first?
>>>
>>> I mean, if the network is down there will be no response from either PM
>>> or Host. But if PM is up and can tell you that the Host is down, sounds
>>> rather clear cut to me...
>>>
>>> Seems to me the VM's would be restarted sooner if the flow was altered
>>> to first check with PM if it´s a network or Host issue, and if Host
>>> issue, immediately restart VM's on another Host, instead of waiting for
>>> a potentially problematic Host to boot up eventually.
>>>
>>> /K
>>>
>>>> since vm's can continue running on the host even if engine has no access
>>>> to it, starting the vm's on the second host can cause split brain and
>>>> data corruption.
>>>>
>>>> The way that the engine knows what's going on is by sending health check
>>>> queries to the vdsm.
>>>> Power management will try to reboot a host when the health checks to
>>>> vdsm will not be answered.
>>>> So... if engine gets no reply and has no way of rebooting the host, the
>>>> host status will be changed to Non-Responsive and the vm's will be
>>>> unknown because engine has no way of knowing what's happening with the
>>>> vm's.
>>>> Since reboot of the host will kill the vm's running on it - this will
>>>> never cause any vm migration but... along with the High-Availability vm
>>>> feature, you will be able to have some of the vm's re-started on the
>>>> second host after the host reboot (and that is only if Power Management
>>>> was confirmed as successful).
>>>>
>>>> VM migration is only triggered when:
>>>> 1. Cluster configuration states that the vm should be migrated in case
>>>> of failure
>>>> 2. Engine has access to the host - so the failure is on the storage side
>>>> and not the host side.
>>>> 3. the vms are not actively writing (although there might be a new RFE
>>>> for it).
>>>>
>>>> hope this clears things up
>>>>
>>>> Dafna
>>>>
>>>>
>>>>
>>>> On 01/27/2014 10:11 AM, Andrew Lau wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Have you got power management enabled?
>>>>>
>>>>> That's the fencing feature required for the engine to ensure that the
>>>>> host is actually offline. It won't resume any other VMs to prevent
>>>>> potential VM corruption (eg. VM running on multiple hosts).
>>>>>
>>>>> Andrew.
>>>>>
>>>>> On Jan 27, 2014 5:12 PM, "Jaison peter" wrote:
>>>>>
>>>>>  Hi all ,
>>>>>
>>>>>  I was setting a two node ovirt cluster 

Re: [Users] changing hostname in ovirt

2014-01-27 Thread Dafna Ron

On 01/27/2014 12:23 PM, Sven Kieske wrote:

Hi,

so you can change the <address> part
of the host via api?

I can see the update description in the rsdl
it seems this should work, but will this have
any effect on the certificates Dafna mentioned?


well, if you can re-deploy the hosts that would change the certificates 
as well (create new ones with the new hostname).






On 27.01.2014 12:26, Alon Bar-Lev wrote:


- Original Message -

From: "Dafna Ron" 
To: "Sven Kieske" 
Cc: "Users@ovirt.org List" 
Sent: Monday, January 27, 2014 1:13:14 PM
Subject: Re: [Users] changing hostname in ovirt

you cannot change a host name after it has been installed because of
certificates being installed during host installation.

This should not be an issue, as you can:
1. go to maintenance.
2. modify host name via API
3. re-deploy host.


On 01/27/2014 10:42 AM, Sven Kieske wrote:

Hi,

maybe I should write an RFE BZ for this
but there might be a technical limitation, I don't know.

What I want:

I have a Host in ovirt which is in status "down" and/or
"maintenance".

I want to change the hostname/ip ovirt uses to connect to
this host.

2. RFEs:

1. This is not possible via webadmin, the address field
is grayed out (there seem to be conflicting design patterns
for the webadmin, other buttons which don't work don't get
grayed out for some reason, instead they throw errors..)

2. make it possible to change the hostname/ip via API

from what I've seen so far I have to hack the database
to make this change happen in 3.3.2

Is there any technical reason why this is not possible?
ovirt should be happy with the UUIDs for the host and
should not bother about the hostname.

Is there a way to alter the hostname/ip which I don't know?

Could someone point me to the right table and how to alter it
without breaking it?

Thank you!





--
Dafna Ron







--
Dafna Ron


Re: [Users] changing hostname in ovirt

2014-01-27 Thread Sven Kieske
Hi,

so you can change the <address> part
of the host via api?

I can see the update description in the rsdl
it seems this should work, but will this have
any effect on the certificates Dafna mentioned?

On 27.01.2014 12:26, Alon Bar-Lev wrote:
> 
> 
> - Original Message -
>> From: "Dafna Ron" 
>> To: "Sven Kieske" 
>> Cc: "Users@ovirt.org List" 
>> Sent: Monday, January 27, 2014 1:13:14 PM
>> Subject: Re: [Users] changing hostname in ovirt
>>
>> you cannot change a host name after it has been installed because of
>> certificates being installed during host installation.
> 
> This should not be an issue, as you can:
> 1. go to maintenance.
> 2. modify host name via API
> 3. re-deploy host.
> 
>>
>> On 01/27/2014 10:42 AM, Sven Kieske wrote:
>>> Hi,
>>>
>>> maybe I should write an RFE BZ for this
>>> but there might be a technical limitation, I don't know.
>>>
>>> What I want:
>>>
>>> I have a Host in ovirt which is in status "down" and/or
>>> "maintenance".
>>>
>>> I want to change the hostname/ip ovirt uses to connect to
>>> this host.
>>>
>>> 2. RFEs:
>>>
>>> 1. This is not possible via webadmin, the address field
>>> is grayed out (there seem to be conflicting design patterns
>>> for the webadmin, other buttons which don't work don't get
>>> grayed out for some reason, instead they throw errors..)
>>>
>>> 2. make it possible to change the hostname/ip via API
>>>
>>> from what I've seen so far I have to hack the database
>>> to make this change happen in 3.3.2
>>>
>>> Is there any technical reason why this is not possible?
>>> ovirt should be happy with the UUIDs for the host and
>>> should not bother about the hostname.
>>>
>>> Is there a way to alter the hostname/ip which I don't know?
>>>
>>> Could someone point me to the right table and how to alter it
>>> without breaking it?
>>>
>>> Thank you!
>>>
>>>
>>
>>
>>
>> --
>> Dafna Ron
>>
> 
> 
> 

-- 
Mit freundlichen Grüßen / Regards

Sven Kieske

Systemadministrator
Mittwald CM Service GmbH & Co. KG
Königsberger Straße 6
32339 Espelkamp
T: +49-5772-293-100
F: +49-5772-293-333
https://www.mittwald.de
Geschäftsführer: Robert Meyer
St.Nr.: 331/5721/1033, USt-IdNr.: DE814773217, HRA 6640, AG Bad Oeynhausen
Komplementärin: Robert Meyer Verwaltungs GmbH, HRB 13260, AG Bad Oeynhausen


Re: [Users] two node ovirt cluster with HA

2014-01-27 Thread Dafna Ron

I am adding Tareq for the Power Management implementation.

Dafna


On 01/27/2014 11:48 AM, Karli Sjöberg wrote:

On Mon, 2014-01-27 at 11:11 +, Dafna Ron wrote:

Powering off the host will never trigger vm migration.
As far as engine is concerned it just lost connection to the host, but
has no way of telling if the host is down or if a router is down.

Can´t it at least check with power management if the Host status is down
first?

I mean, if the network is down there will be no response from either PM
or Host. But if PM is up and can tell you that the Host is down, sounds
rather clear cut to me...

Seems to me the VM's would be restarted sooner if the flow was altered
to first check with PM if it´s a network or Host issue, and if Host
issue, immediately restart VM's on another Host, instead of waiting for
a potentially problematic Host to boot up eventually.

/K


since vm's can continue running on the host even if engine has no access
to it, starting the vm's on the second host can cause split brain and
data corruption.

The way that the engine knows what's going on is by sending health check
queries to the vdsm.
Power management will try to reboot a host when the health checks to
vdsm are not answered.
So... if engine gets no reply and has no way of rebooting the host, the
host status will be changed to Non-Responsive and the vm's will be
unknown because engine has no way of knowing what's happening with the
vm's.
Since reboot of the host will kill the vm's running on it - this will
never cause any vm migration but... along with the High-Availability vm
feature, you will be able to have some of the vm's re-started on the
second host after the host reboot (and that is only if Power Management
was confirmed as successful).

VM migration is only triggered when:
1. Cluster configuration states that the vm should be migrated in case
of failure
2. Engine has access to the host - so the failure is on the storage side
and not the host side.
3. the vms are not actively writing (although there might be a new RFE
for it).

hope this clears things up

Dafna



On 01/27/2014 10:11 AM, Andrew Lau wrote:

Hi,

Have you got power management enabled?

That's the fencing feature required for the engine to ensure that the
host is actually offline. It won't resume any other VMs to prevent
potential VM corruption (eg. VM running on multiple hosts).

Andrew.

On Jan 27, 2014 5:12 PM, "Jaison peter" <urotr...@gmail.com> wrote:

 Hi all ,

 I was setting up a two node ovirt cluster with the ovirt engine on a
 separate node. I completed the configuration and tested VM live
 migrations without any issues. Then, to check cluster HA, I
 powered down one host and expected the vms running on that host to be
 migrated to the other one. But nothing happened: the Engine detected
 the host as unreachable and marked it as non-operational, and the vms
 that ran on that host went to 'unknown state'. Is it not possible to set
 up a fully HA ovirt cluster with two nodes? Or is it a problem with my
 configuration? Please advise.

 Thanks & Regards

 Alex






--
Dafna Ron






--
Dafna Ron


Re: [Users] sanlock leases on VM disks

2014-01-27 Thread Itamar Heim

On 01/27/2014 01:06 PM, José Luis Sanz Boixader wrote:

On 01/17/2014 11:43 PM, Itamar Heim wrote:

On 01/10/2014 08:44 PM, José Luis Sanz Boixader wrote:

I have an oVirt testing setup with 3 hosts running for a few weeks:
CentOS 6.4, oVirt 3.3.1, VDSM 4.13.0, iSCSI based storage domain.

I have just realized that sanlock has no leases on VM disks, so nothing
prevents vdsm/libvirt from starting a VM on two different hosts,
corrupting disk data. I know that something has to go wrong on oVirt
engine to do it, but I've manually forced some errors ("Setting Host
state to Non-Operational", "VM  is not responding") for a "Highly
available" VM and oVirt engine started that VM on another host. oVirt
engine was not aware, but the VM was running on two hosts.

I think this is a job for libvirt/sanlock/wdmd, but libvirt is not
receiving "lease" tags for disks when creating domains. I think it
should.
What's left in my config? What am I doing wrong?

Thanks



we started introducing sanlock carefully, first to SPM nodes. in 3.4
to hosted ovirt-engine node, and looking to add it to VMs/disks as
well going forward.

I don't remember if we have a config option to enable this, but  you
can make this work via a custom hook at this point at vm/disk level,
and we would love feedback on this.

Thanks,
Itamar



Looking into vdsm code, I've found that there's already code for sanlock
on VM disks, but it has been disabled by default
[https://bugzilla.redhat.com/show_bug.cgi?format=multiple&id=838802].
I guess it was disabled because you can't hot attach/detach disks to a VM
while it is running. But I prefer to enable it, as data protection is critical
in a SAN environment.


It wasn't enabled yet because testing is not finished and some corner cases
remain. Help in this area is appreciated.
Ayal/Federico can share the current gaps/status (assuming I remember
correctly).




To enable it, you need to include this in /etc/vdsm/vdsm.conf at every
host in your setup.
...
[irs]
use_volume_leases = true
...
and after that, you'll need to restart vdsmd. Ensure that
/etc/libvirt/qemu.conf says lock_manager="sanlock".
VMs that were already running will not be modified, and thus not
protected. Restart them to get sanlock leases on those disks.

To confirm that it is properly running, connect to a host and type:
# sanlock client status
and you'll get some output like this, listing VMs running on that host
and its disk leases:

daemon dc7e06a0-18bb-4f68-9ea6-883dda883ef2.server
p -1 helper
p -1 listener
p 10629 Vicent
p -1 status
s
e9a91ad7-2bd3-4c98-a171-88324bc87a09:2:/dev/e9a91ad7-2bd3-4c98-a171-88324bc87a09/ids:0
r
e9a91ad7-2bd3-4c98-a171-88324bc87a09:1c7005ad-3d33-4c8f-9c99-2eef7be865f3:/dev/e9a91ad7-2bd3-4c98-a171-88324bc87a09/leases:123731968:6
p 10629
r
e9a91ad7-2bd3-4c98-a171-88324bc87a09:978b3630-f491-45a3-9826-e9ab6a744e72:/dev/e9a91ad7-2bd3-4c98-a171-88324bc87a09/leases:127926272:6
p 10629

This shows that domain "Vicent" with process ID 10629 holds leases on two
disks.
No other host will be able to run that VM, even if the engine makes an error,
because it could not acquire the leases for those disks.

Migration of VM also works, and the destination host gets and acquires
the disk leases.
All this has been tested with oVirt 3.3.1 release.



___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] two node ovirt cluster with HA

2014-01-27 Thread Karli Sjöberg
On Mon, 2014-01-27 at 11:11 +, Dafna Ron wrote:
> Powering off the host will never trigger vm migration.
> As far as engine is concerned it just lost connection to the host, but 
> has no way of telling if the host is down or if a router is down.

Can't it at least check with power management if the Host status is down
first?

I mean, if the network is down there will be no response from either PM
or Host. But if PM is up and can tell you that the Host is down, sounds
rather clear cut to me...

Seems to me the VMs would be restarted sooner if the flow was altered
to first check with PM if it's a network or Host issue, and if it's a Host
issue, immediately restart the VMs on another Host, instead of waiting for
a potentially problematic Host to boot up eventually.

/K

> since vm's can continue running on the host even if engine has no access 
> to it, starting the vm's on the second host can cause split brain and 
> data corruption.
> 
> The way that the engine knows what's going on is by sending health check 
> queries to vdsm.
> Power management will try to reboot a host when the health checks to 
> vdsm are not answered.
> So... if engine gets no reply and has no way of rebooting the host, the 
> host status will be changed to Non-Responsive and the vm's will be 
> unknown because engine has no way of knowing what's happening with the 
> vm's.
> Since reboot of the host will kill the vm's running on it - this will 
> never cause any vm migration but... along with the High-Availability vm 
> feature, you will be able to have some of the vm's re-started on the 
> second host after the host reboot (and that is only if Power Management 
> was confirmed as successful).
> 
> VM migration is only triggered when:
> 1. Cluster configuration states that the vm should be migrated in case 
> of failure
> 2. Engine has access to the host - so the failure is on the storage side 
> and not the host side.
> 3. the vms are not actively writing (although there might be a new RFE 
> for it).
> 
> hope this clears things up
> 
> Dafna
> 
> 
> 
> On 01/27/2014 10:11 AM, Andrew Lau wrote:
> >
> > Hi,
> >
> > Have you got power management enabled?
> >
> > That's the fencing feature required for the engine to ensure that the 
> > host is actually offline. It won't resume any other VMs to prevent 
> > potential VM corruption (eg. VM running on multiple hosts).
> >
> > Andrew.
> >
> > On Jan 27, 2014 5:12 PM, "Jaison peter" wrote:
> >
> > Hi all ,
> >
> > I was setting up a two-node oVirt cluster with the oVirt engine on a
> > separate node. I completed the configuration and tested VM live
> > migrations without any issues. Then, to check cluster HA, I
> > powered down one host and expected the VMs running on that host to be
> > migrated to the other one. But nothing happened: the engine detected
> > the host as unreachable and marked it as non-operational, and the VMs
> > that ran on that host went to 'unknown state'. Is it not possible to
> > set up a fully HA oVirt cluster with two nodes? Or is that a problem
> > with my configuration? Please advise.
> >
> > Thanks & Regards
> >
> > Alex
> >
> 
> 
> -- 
> Dafna Ron



-- 

Med Vänliga Hälsningar

---
Karli Sjöberg
Swedish University of Agricultural Sciences Box 7079 (Visiting Address
Kronåsvägen 8)
S-750 07 Uppsala, Sweden
Phone:  +46-(0)18-67 15 66
karli.sjob...@slu.se
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[Users] Power saving cluster policy (3.4 test day)

2014-01-27 Thread Lior Vernia
Hello,

Just wanted to drop a line here about this feature I tested last week,
which I thought was cool. I only tested it with two hosts and a single
VM, using one of the hosts as a reserve, but it was quite satisfying to
see the hosts being powered off and on as the VM was being run and stopped.

This could be interesting to look at for those of you running a
moderately-sized deployment, if often the load on your hosts is low.

Yours, Lior.
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] OVirt 3.3.2 Snapshot Pane empty in Firefox

2014-01-27 Thread Markus Stockhausen
> From: Itamar Heim [ih...@redhat.com]
> Sent: Sunday, January 26, 2014 11:52
> To: Markus Stockhausen; ovirt-users; Daniel Erez
> Cc: Allon Mureinik
> Subject: Re: [Users] OVirt 3.3.2 Snapshot Pane empty in Firefox
> 
> On 01/23/2014 12:49 PM, Markus Stockhausen wrote:
> > Hello,
> >
> > I saw mysterious behaviour in the web interface twice this week.
> > The snapshot list of a VM remained empty in Firefox although
> > I know that snapshots exist.
> >
> > The "Create Snapshot" button was still working and issued the
> > right commands. Nevertheless the list remained empty.
> >
> > Opening an IE session proved that everything was ok. The
> > pane was populated with the list of snapshots. After restarting
> > Firefox everything was fine again.
> >
> > Has anybody experienced similar issues and, if yes, is there
> > already an open BZ for that?
> >
> > Markus
> >
> >
> >
> 
> derez?
>

Got the same behaviour again this morning, and now I hopefully have
a good idea of the cause. When I unlocked my screen, the last oVirt
session in the (Firefox) browser had flipped back to the login
screen.

I logged in again and jumped directly to the snapshot pane. It was
empty, and the create button worked as usual. Nevertheless,
the auto-refresh did not work. The pane simply stayed empty
even after creating three snapshots in a row. By the way, the log
history showed all the actions I was doing.

To be sure that this is the reason, I left an oVirt admin browser window
open. It has now been more than 2 hours without a switch back to the login
page.

Any idea how I can force that behaviour? A normal cycle of
logout/login does not produce the error.

Markus




___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] changing hostname in ovirt

2014-01-27 Thread Alon Bar-Lev


- Original Message -
> From: "Dafna Ron" 
> To: "Sven Kieske" 
> Cc: "Users@ovirt.org List" 
> Sent: Monday, January 27, 2014 1:13:14 PM
> Subject: Re: [Users] changing hostname in ovirt
> 
> you cannot change a host name after it has been installed because of
> certificates being installed during host installation.

This should not be an issue, as you can:
1. go to maintenance.
2. modify host name via API
3. re-deploy host.
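
The three steps above could be scripted against the engine's REST API. The sketch below only builds the requests and sends nothing. The paths and XML bodies are assumptions based on the 3.x REST API layout (`/api/hosts/{id}` with `deactivate` and `install` actions), and the host ID, address and password are hypothetical placeholders; please check them against your engine's API documentation before use.

```python
from xml.sax.saxutils import escape


def rename_host_requests(host_id, new_address, root_password):
    """Return (method, path, body) triples for the three steps.

    Assumed, not verified against a live engine: the hosts resource
    lives under /api/hosts/<id> and accepts these XML payloads.
    """
    base = "/api/hosts/%s" % host_id
    return [
        # 1. go to maintenance
        ("POST", base + "/deactivate", "<action/>"),
        # 2. modify the address the engine uses to reach the host
        ("PUT", base,
         "<host><address>%s</address></host>" % escape(new_address)),
        # 3. re-deploy (reinstall) the host so certificates are reissued
        ("POST", base + "/install",
         "<action><root_password>%s</root_password></action>"
         % escape(root_password)),
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for method, path, body in rename_host_requests(
            "HOST-UUID", "node2.example.com", "secret"):
        print(method, path, body)
```

Each triple can then be sent with any HTTP client using the engine's admin credentials.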

> 
> On 01/27/2014 10:42 AM, Sven Kieske wrote:
> > Hi,
> >
> > maybe I should write an RFE BZ for this
> > but there might be a technical limitation, I don't know.
> >
> > What I want:
> >
> > I have a Host in ovirt which is in status "down" and/or
> > "maintenance".
> >
> > I want to change the hostname/ip ovirt uses to connect to
> > this host.
> >
> > 2. RFEs:
> >
> > 1. This is not possible via webadmin, the address field
> > is grayed out (there seem to be conflicting design patterns
> > for the webadmin, other buttons which don't work don't get
> > grayed out for some reason, instead they throw errors..)
> >
> > 2. make it possible to change the hostname/ip via API
> >
> > from what I've seen so far I have to hack the database
> > to make this change happen in 3.3.2
> >
> > Is there any technical reason why this is not possible?
> > ovirt should be happy with the UUIDs for the host and
> > should not bother about the hostname.
> >
> > Is there a way to alter the hostname/ip which I don't know?
> >
> > Could someone point me to the right table and how to alter it
> > without breaking it?
> >
> > Thank you!
> >
> >
> 
> 
> 
> --
> Dafna Ron
> 
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Reboot causes poweroff of VM 3.4 Beta

2014-01-27 Thread Jonathan Archer
 

On 27/01/2014 11:03, Michal Skrivanek wrote: 

> On Jan 27, 2014, at 11:59 , Jonathan Archer wrote:
>> On 27/01/2014 10:56, Michal Skrivanek wrote:
>>> On Jan 27, 2014, at 11:46 , Jonathan Archer wrote:
>>>> On 25/01/2014 20:02, Roy Golan wrote:
>>>>> Please attach engine.log and vdsm.log
>>>>>
>>>>> On Jan 25, 2014 5:59 PM, Jon Archer < j...@rosslug.org.uk> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Seem to be suffering an issue in 3.4 where if a vm is rebooted it
>>>>>> actually shuts down, this occurs for all guests regardless of OS
>>>>>> installed within.
>>>>>>
>>>>>> Anyone seen this?
>>>>>>
>>>>>> Jon
>>>>
>>>> Attached is a sample of the engine.log and vdsm.log during a "reboot"
>>>> the VM as previously stated shutdown rather than reboot. There wasn't
>>>> anything in the server.log around the time, the previous entry was at
>>>> least 30 mins before.
>>>
>>> Hi,
>>>
>>> your log doesn't contain the relevant part, there's no command logged,
>>> other than
>>> "Message: VM wcsmail01 is down. Exit message: User shut down"
>>> which means the guest was shut down from inside of the OS
>>>
>>> how did you trigger the reboot/shutdown?
>>>
>>> Thanks,
>>> michal
>>>
>>>> Thanks
>>
>> The reboot was triggered using the reboot command at the console.
>
> ok, makes sense there's nothing in the log
>
>> I also have a windows 7 guest, which shuts down when the reboot option is
>> selected.
>
> what doesn't make sense is the behavior:) It should simply reboot,
> there's nothing oVirt is doing in this case…could it be your OS is
> configured(or has decided) to shutdown instead?
>
>> Jon

Hi, 

I'd be pointed towards the guest OS if it wasn't for 2 things: 

1) this happens to all guests of both windows and Linux flavours 

2) the guests are just plain vanilla installs with nothing special. 

Jon 
 

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] changing hostname in ovirt

2014-01-27 Thread Dafna Ron
you cannot change a host name after it has been installed because of 
certificates being installed during host installation.


On 01/27/2014 10:42 AM, Sven Kieske wrote:

Hi,

maybe I should write an RFE BZ for this
but there might be a technical limitation, I don't know.

What I want:

I have a Host in ovirt which is in status "down" and/or
"maintenance".

I want to change the hostname/ip ovirt uses to connect to
this host.

2. RFEs:

1. This is not possible via webadmin, the address field
is grayed out (there seem to be conflicting design patterns
for the webadmin, other buttons which don't work don't get
grayed out for some reason, instead they throw errors..)

2. make it possible to change the hostname/ip via API

from what I've seen so far I have to hack the database
to make this change happen in 3.3.2

Is there any technical reason why this is not possible?
ovirt should be happy with the UUIDs for the host and
should not bother about the hostname.

Is there a way to alter the hostname/ip which I don't know?

Could someone point me to the right table and how to alter it
without breaking it?

Thank you!






--
Dafna Ron
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] two node ovirt cluster with HA

2014-01-27 Thread Dafna Ron

Powering off the host will never trigger vm migration.
As far as engine is concerned it just lost connection to the host, but 
has no way of telling if the host is down or if a router is down.
since vm's can continue running on the host even if engine has no access 
to it, starting the vm's on the second host can cause split brain and 
data corruption.


The way that the engine knows what's going on is by sending health check 
queries to vdsm.
Power management will try to reboot a host when the health checks to 
vdsm are not answered.
So... if engine gets no reply and has no way of rebooting the host, the 
host status will be changed to Non-Responsive and the vm's will be 
unknown because engine has no way of knowing what's happening with the 
vm's.
Since reboot of the host will kill the vm's running on it - this will 
never cause any vm migration but... along with the High-Availability vm 
feature, you will be able to have some of the vm's re-started on the 
second host after the host reboot (and that is only if Power Management 
was confirmed as successful).


VM migration is only triggered when:
1. Cluster configuration states that the vm should be migrated in case 
of failure
2. Engine has access to the host - so the failure is on the storage side 
and not the host side.
3. the vms are not actively writing (although there might be a new RFE 
for it).
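
The behaviour described above can be summarised as a small decision function. This is only an illustrative model of the explanation in this message, not engine code; the function name, parameters and returned strings are hypothetical.

```python
def engine_reaction(host_reachable, power_mgmt_configured,
                    fence_confirmed, vm_is_highly_available):
    """Model what happens to a VM once its host may have failed.

    Illustrative only: the real engine has more states and timeouts.
    """
    if host_reachable:
        # Engine can still talk to vdsm, so migration remains an option.
        return "host up: migration possible per cluster policy"
    if not power_mgmt_configured:
        # Engine cannot tell a dead host from a dead router.
        return "host Non-Responsive, VM unknown: no restart"
    if not fence_confirmed:
        # Fencing was attempted but not confirmed as successful.
        return "fence unconfirmed, VM unknown: no restart"
    if vm_is_highly_available:
        # Only now is it safe to start the VM elsewhere.
        return "host fenced: HA VM restarted on another host"
    return "host fenced: VM stays down"


if __name__ == "__main__":
    # The case from this thread: host powered off, no power management.
    print(engine_reaction(False, False, False, True))
```

In this model a VM is only ever restarted elsewhere after a confirmed fence, which is what guards against the split brain mentioned earlier in this message.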


hope this clears things up

Dafna



On 01/27/2014 10:11 AM, Andrew Lau wrote:


Hi,

Have you got power management enabled?

That's the fencing feature required for the engine to ensure that the 
host is actually offline. It won't resume any other VMs to prevent 
potential VM corruption (eg. VM running on multiple hosts).


Andrew.

On Jan 27, 2014 5:12 PM, "Jaison peter" wrote:


Hi all ,

I was setting up a two-node oVirt cluster with the oVirt engine on a
separate node. I completed the configuration and tested VM live
migrations without any issues. Then, to check cluster HA, I
powered down one host and expected the VMs running on that host to be
migrated to the other one. But nothing happened: the engine detected
the host as unreachable and marked it as non-operational, and the VMs
that ran on that host went to 'unknown state'. Is it not possible to
set up a fully HA oVirt cluster with two nodes? Or is that a problem
with my configuration? Please advise.

Thanks & Regards

Alex




--
Dafna Ron
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] changing hostname in ovirt

2014-01-27 Thread Alon Bar-Lev



- Original Message -
> From: "Sven Kieske" 
> To: "Users@ovirt.org List" 
> Sent: Monday, January 27, 2014 12:42:51 PM
> Subject: [Users] changing hostname in ovirt
> 
> Hi,
> 
> maybe I should write an RFE BZ for this
> but there might be a technical limitation, I don't know.
> 
> What I want:
> 
> I have a Host in ovirt which is in status "down" and/or
> "maintenance".
> 
> I want to change the hostname/ip ovirt uses to connect to
> this host.
> 
> 2. RFEs:
> 
> 1. This is not possible via webadmin, the address field
> is grayed out (there seem to be conflicting design patterns
> for the webadmin, other buttons which don't work don't get
> grayed out for some reason, instead they throw errors..)
> 
> 2. make it possible to change the hostname/ip via API

It should be supported, yaniv?
Apart from the unique ID, all fields should be editable via API, right?

> 
> from what I've seen so far I have to hack the database
> to make this change happen in 3.3.2
> 
> Is there any technical reason why this is not possible?
> ovirt should be happy with the UUIDs for the host and
> should not bother about the hostname.
> 
> Is there a way to alter the hostname/ip which I don't know?
> 
> Could someone point me to the right table and how to alter it
> without breaking it?
> 
> Thank you!
> 
> 
> --
> Mit freundlichen Grüßen / Regards
> 
> Sven Kieske
> 
> Systemadministrator
> Mittwald CM Service GmbH & Co. KG
> Königsberger Straße 6
> 32339 Espelkamp
> T: +49-5772-293-100
> F: +49-5772-293-333
> https://www.mittwald.de
> Geschäftsführer: Robert Meyer
> St.Nr.: 331/5721/1033, USt-IdNr.: DE814773217, HRA 6640, AG Bad Oeynhausen
> Komplementärin: Robert Meyer Verwaltungs GmbH, HRB 13260, AG Bad Oeynhausen
> 
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] sanlock leases on VM disks

2014-01-27 Thread José Luis Sanz Boixader
On 01/17/2014 11:43 PM, Itamar Heim wrote:
> On 01/10/2014 08:44 PM, José Luis Sanz Boixader wrote:
>> I have an oVirt testing setup with 3 hosts running for a few weeks:
>> CentOS 6.4, oVirt 3.3.1, VDSM 4.13.0, iSCSI based storage domain.
>>
>> I have just realized that sanlock has no leases on VM disks, so nothing
>> prevents vdsm/libvirt from starting a VM on two different hosts,
>> corrupting disk data. I know that something has to go wrong on oVirt
>> engine to do it, but I've manually forced some errors ("Setting Host
>> state to Non-Operational", "VM  is not responding") for a "Highly
>> available" VM and oVirt engine started that VM on another host. oVirt
>> engine was not aware, but the VM was running on two hosts.
>>
>> I think this is a job for libvirt/sanlock/wdmd, but libvirt is not
>> receiving "lease" tags for disks when creating domains. I think it
>> should.
>> What's left in my config? What am I doing wrong?
>>
>> Thanks
>>
>
> we started introducing sanlock carefully, first to SPM nodes. in 3.4
> to hosted ovirt-engine node, and looking to add it to VMs/disks as
> well going forward.
>
> I don't remember if we have a config option to enable this, but  you
> can make this work via a custom hook at this point at vm/disk level,
> and we would love feedback on this.
>
> Thanks,
>Itamar
>

Looking into vdsm code, I've found that there's already code for sanlock
on VM disks, but it has been disabled by default
[https://bugzilla.redhat.com/show_bug.cgi?format=multiple&id=838802].
I guess it was disabled because you can't hot attach/detach disks to a VM
while it is running. But I prefer to enable it, as data protection is critical
in a SAN environment.

To enable it, you need to include this in /etc/vdsm/vdsm.conf at every
host in your setup.
...
[irs]
use_volume_leases = true
...
and after that, you'll need to restart vdsmd. Ensure that
/etc/libvirt/qemu.conf says lock_manager="sanlock".
VMs that were already running will not be modified, and thus not
protected. Restart them to get sanlock leases on those disks.
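
A quick way to verify both settings on a host is sketched below. It uses Python's standard `configparser` and a regular expression rather than any vdsm or libvirt code, and the function names are hypothetical; it assumes the files follow the layout shown above.

```python
import configparser
import re


def volume_leases_enabled(vdsm_conf_text):
    """True if [irs] use_volume_leases is set to a true value."""
    parser = configparser.ConfigParser()
    parser.read_string(vdsm_conf_text)
    # fallback covers both a missing [irs] section and a missing option
    return parser.getboolean("irs", "use_volume_leases", fallback=False)


def qemu_lock_manager(qemu_conf_text):
    """Return the active lock_manager value, or None if it is not set."""
    for line in qemu_conf_text.splitlines():
        # commented-out lines start with '#', so they do not match
        match = re.match(r'\s*lock_manager\s*=\s*"([^"]*)"', line)
        if match:
            return match.group(1)
    return None


if __name__ == "__main__":
    with open("/etc/vdsm/vdsm.conf") as f:
        print("volume leases:", volume_leases_enabled(f.read()))
    with open("/etc/libvirt/qemu.conf") as f:
        print("lock manager:", qemu_lock_manager(f.read()))
```

Running it on every host before restarting vdsmd helps catch a host that was missed.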

To confirm that it is properly running, connect to a host and type:
# sanlock client status
and you'll get some output like this, listing VMs running on that host
and its disk leases:

daemon dc7e06a0-18bb-4f68-9ea6-883dda883ef2.server
p -1 helper
p -1 listener
p 10629 Vicent
p -1 status
s
e9a91ad7-2bd3-4c98-a171-88324bc87a09:2:/dev/e9a91ad7-2bd3-4c98-a171-88324bc87a09/ids:0
r
e9a91ad7-2bd3-4c98-a171-88324bc87a09:1c7005ad-3d33-4c8f-9c99-2eef7be865f3:/dev/e9a91ad7-2bd3-4c98-a171-88324bc87a09/leases:123731968:6
p 10629
r
e9a91ad7-2bd3-4c98-a171-88324bc87a09:978b3630-f491-45a3-9826-e9ab6a744e72:/dev/e9a91ad7-2bd3-4c98-a171-88324bc87a09/leases:127926272:6
p 10629

This shows that domain "Vicent" with process ID 10629 holds leases on two
disks.
No other host will be able to run that VM, even if the engine makes an error,
because it could not acquire the leases for those disks.

Migration of VM also works, and the destination host gets and acquires
the disk leases.
All this has been tested with oVirt 3.3.1 release.
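
To check many hosts, the status listing can be summarised per VM. The parser below is a minimal sketch that assumes the layout shown above (a `p PID NAME` line per process, and each `r` resource followed by the owning `p PID`, either wrapped onto separate lines as in this mail or on a single line); the function name is hypothetical and the exact output format may differ between sanlock versions.

```python
def parse_sanlock_status(text):
    """Map each process name to the list of resource leases it holds."""
    names = {}      # pid -> process (VM) name
    leases = {}     # pid -> list of resource strings
    pending = None  # resource waiting for its owner "p PID" line
    lines = [l.strip() for l in text.splitlines() if l.strip()]
    i = 0
    while i < len(lines):
        parts = lines[i].split()
        kind = parts[0]
        # mail wrapping puts the value of "s"/"r" on the following line
        if kind in ("s", "r") and len(parts) == 1 and i + 1 < len(lines):
            i += 1
            parts.append(lines[i])
        if kind == "r" and len(parts) > 1:
            if len(parts) >= 4 and parts[2] == "p":
                # single-line form: r RESOURCE p PID
                leases.setdefault(int(parts[3]), []).append(parts[1])
            else:
                pending = parts[1]
        elif kind == "p" and len(parts) >= 2:
            pid = int(parts[1])
            if pending is not None:
                leases.setdefault(pid, []).append(pending)
                pending = None
            elif len(parts) >= 3:
                names[pid] = parts[2]
        i += 1
    return {names.get(pid, "?"): res for pid, res in leases.items()}
```

Feeding it the output of `sanlock client status` from the listing above should yield a single entry for "Vicent" with two resources.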
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Reboot causes poweroff of VM 3.4 Beta

2014-01-27 Thread Michal Skrivanek

On Jan 27, 2014, at 11:59 , Jonathan Archer  wrote:

> On 27/01/2014 10:56, Michal Skrivanek wrote:
> 
>> On Jan 27, 2014, at 11:46 , Jonathan Archer  wrote:
>>> On 25/01/2014 20:02, Roy Golan wrote:
>>>> Please attach engine.log and vdsm.log
>>>>
>>>> On Jan 25, 2014 5:59 PM, Jon Archer < j...@rosslug.org.uk> wrote:
>>>>> Hi,
>>>>>
>>>>> Seem to be suffering an issue in 3.4 where if a vm is rebooted it
>>>>> actually shuts down, this occurs for all guests regardless of OS
>>>>> installed within.
>>>>>
>>>>> Anyone seen this?
>>>>>
>>>>> Jon
>>> Attached is a sample of the engine.log and vdsm.log during a "reboot" the 
>>> VM as previously stated shutdown rather than reboot,. There wasn't anything 
>>> in the server.log around the time, the previous entry was at least 30 mins 
>>> before.
>> Hi,
>> 
>> your log doesn't contain the relevant part, there's no command logged, other 
>> than
>> "Message: VM wcsmail01 is down. Exit message: User shut down"
>> which means the guest was shut down from inside of the OS
>> 
>> how did you trigger the reboot/shutdown?
>> 
>> Thanks,
>> michal
>> 
>>> Thanks 
> The reboot was triggered using the reboot command at the console.

ok, makes sense there's nothing in the log

> 
> 
> I also have a windows 7 guest, which shuts down when the reboot option is 
> selected.

what doesn't make sense is the behavior:) It should simply reboot, there's 
nothing oVirt is doing in this case…could it be your OS is configured(or has 
decided) to shutdown instead?
 

> 
>  
> Jon
> 
>  
>  

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Reboot causes poweroff of VM 3.4 Beta

2014-01-27 Thread Jonathan Archer
 

On 27/01/2014 10:56, Michal Skrivanek wrote: 

> On Jan 27, 2014, at 11:46 , Jonathan Archer wrote:
>> On 25/01/2014 20:02, Roy Golan wrote:
>>> Please attach engine.log and vdsm.log
>>>
>>> On Jan 25, 2014 5:59 PM, Jon Archer < j...@rosslug.org.uk> wrote:
>>>> Hi,
>>>>
>>>> Seem to be suffering an issue in 3.4 where if a vm is rebooted it
>>>> actually shuts down, this occurs for all guests regardless of OS
>>>> installed within.
>>>>
>>>> Anyone seen this?
>>>>
>>>> Jon
>>
>> Attached is a sample of the engine.log and vdsm.log during a "reboot"
>> the VM as previously stated shutdown rather than reboot. There wasn't
>> anything in the server.log around the time, the previous entry was at
>> least 30 mins before.
>
> Hi,
>
> your log doesn't contain the relevant part, there's no command logged,
> other than
> "Message: VM wcsmail01 is down. Exit message: User shut down"
> which means the guest was shut down from inside of the OS
>
> how did you trigger the reboot/shutdown?
>
> Thanks,
> michal
>
>> Thanks

The reboot was triggered using the reboot command at the console.

I also have a windows 7 guest, which shuts down when the reboot option
is selected. 

Jon 

 

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Reboot causes poweroff of VM 3.4 Beta

2014-01-27 Thread Michal Skrivanek

On Jan 27, 2014, at 11:46 , Jonathan Archer  wrote:

> On 25/01/2014 20:02, Roy Golan wrote:
> 
>> Please attach engine.log and vdsm.log
>> 
>> On Jan 25, 2014 5:59 PM, Jon Archer < j...@rosslug.org.uk> wrote:
>>> Hi,
>>>
>>> Seem to be suffering an issue in 3.4 where if a vm is rebooted it
>>> actually shuts down, this occurs for all guests regardless of OS
>>> installed within.
>>>
>>> Anyone seen this?
>>>
>>> Jon
> Attached is a sample of the engine.log and vdsm.log during a "reboot" the VM 
> as previously stated shutdown rather than reboot,.
> 
> There wasn't anything in the server.log around the time, the previous entry 
> was at least 30 mins before.
Hi,

your log doesn't contain the relevant part, there's no command logged, other 
than
"Message: VM wcsmail01 is down. Exit message: User shut down"
which means the guest was shut down from inside of the OS

how did you trigger the reboot/shutdown?

Thanks,
michal

> 
> 
> Thanks
> 
>  

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[Users] changing hostname in ovirt

2014-01-27 Thread Sven Kieske
Hi,

maybe I should write an RFE BZ for this
but there might be a technical limitation, I don't know.

What I want:

I have a Host in ovirt which is in status "down" and/or
"maintenance".

I want to change the hostname/ip ovirt uses to connect to
this host.

Two RFEs:

1. This is not possible via webadmin, the address field
is grayed out (there seem to be conflicting design patterns
for the webadmin, other buttons which don't work don't get
grayed out for some reason, instead they throw errors..)

2. make it possible to change the hostname/ip via API

from what I've seen so far I have to hack the database
to make this change happen in 3.3.2

Is there any technical reason why this is not possible?
ovirt should be happy with the UUIDs for the host and
should not bother about the hostname.

Is there a way to alter the hostname/ip which I don't know?

Could someone point me to the right table and how to alter it
without breaking it?

Thank you!


-- 
Mit freundlichen Grüßen / Regards

Sven Kieske

Systemadministrator
Mittwald CM Service GmbH & Co. KG
Königsberger Straße 6
32339 Espelkamp
T: +49-5772-293-100
F: +49-5772-293-333
https://www.mittwald.de
Geschäftsführer: Robert Meyer
St.Nr.: 331/5721/1033, USt-IdNr.: DE814773217, HRA 6640, AG Bad Oeynhausen
Komplementärin: Robert Meyer Verwaltungs GmbH, HRB 13260, AG Bad Oeynhausen
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] two node ovirt cluster with HA

2014-01-27 Thread Andrew Lau
Hi,

Have you got power management enabled?

That's the fencing feature required for the engine to ensure that the host
is actually offline. It won't resume any other VMs to prevent potential VM
corruption (eg. VM running on multiple hosts).

Andrew.
On Jan 27, 2014 5:12 PM, "Jaison peter"  wrote:

> Hi all ,
>
> I was setting up a two-node oVirt cluster with the oVirt engine on a
> separate node. I completed the configuration and tested VM live migrations
> without any issues. Then, to check cluster HA, I powered down one host and
> expected the VMs running on that host to be migrated to the other one. But
> nothing happened: the engine detected the host as unreachable and marked it
> as non-operational, and the VMs that ran on that host went to 'unknown
> state'. Is it not possible to set up a fully HA oVirt cluster with two
> nodes? Or is that a problem with my configuration? Please advise.
>
> Thanks & Regards
>
> Alex
>
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Data Center stuck between "Non Responsive" and "Contending"

2014-01-27 Thread Dafna Ron

I'm adding Vijay to see if he can help here.

Dafna


On 01/27/2014 08:47 AM, Federico Simoncelli wrote:

- Original Message -

From: "Itamar Heim" 
To: "Ted Miller" , users@ovirt.org, "Federico Simoncelli" 

Cc: "Allon Mureinik" 
Sent: Sunday, January 26, 2014 11:17:04 PM
Subject: Re: [Users] Data Center stuck between "Non Responsive" and "Contending"

On 01/27/2014 12:00 AM, Ted Miller wrote:

On 1/26/2014 4:00 PM, Itamar Heim wrote:

On 01/26/2014 10:51 PM, Ted Miller wrote:

On 1/26/2014 3:10 PM, Itamar Heim wrote:

On 01/26/2014 10:08 PM, Ted Miller wrote:
is this gluster storage (guessing since you mentioned a 'volume')

yes (mentioned under "setup" above)

does it have a quorum?

Volume Name: VM2
Type: Replicate
Volume ID: 7bea8d3b-ec2a-4939-8da8-a82e6bda841e
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.41.65.2:/bricks/01/VM2
Brick2: 10.41.65.4:/bricks/01/VM2
Brick3: 10.41.65.4:/bricks/101/VM2
Options Reconfigured:
cluster.server-quorum-type: server
storage.owner-gid: 36
storage.owner-uid: 36
auth.allow: *
user.cifs: off
nfs.disa

(there were reports of split brain on the domain metadata before when
no quorum exist for gluster)

after full heal:

[root@office4a ~]$ gluster volume heal VM2 info
Gathering Heal info on volume VM2 has been successful

Brick 10.41.65.2:/bricks/01/VM2
Number of entries: 0

Brick 10.41.65.4:/bricks/01/VM2
Number of entries: 0

Brick 10.41.65.4:/bricks/101/VM2
Number of entries: 0
[root@office4a ~]$ gluster volume heal VM2 info split-brain
Gathering Heal info on volume VM2 has been successful

Brick 10.41.65.2:/bricks/01/VM2
Number of entries: 0

Brick 10.41.65.4:/bricks/01/VM2
Number of entries: 0

Brick 10.41.65.4:/bricks/101/VM2
Number of entries: 0
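As a rough illustration of how the heal output above can be checked automatically, here is a minimal Python sketch. It assumes the output keeps the "Brick ..." / "Number of entries: N" layout quoted in this thread (other gluster versions may print it differently), and the function names are hypothetical helpers, not part of gluster or vdsm.

```python
# Sample text copied from the `gluster volume heal VM2 info` output in this thread.
SAMPLE = """\
Gathering Heal info on volume VM2 has been successful

Brick 10.41.65.2:/bricks/01/VM2
Number of entries: 0

Brick 10.41.65.4:/bricks/01/VM2
Number of entries: 0

Brick 10.41.65.4:/bricks/101/VM2
Number of entries: 0
"""


def parse_heal_info(text):
    """Map each brick to its reported entry count (hypothetical helper)."""
    counts = {}
    brick = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Brick "):
            # Remember which brick the next "Number of entries" line belongs to.
            brick = line[len("Brick "):]
        elif line.startswith("Number of entries:") and brick is not None:
            counts[brick] = int(line.split(":", 1)[1])
            brick = None
    return counts


def bricks_needing_heal(counts):
    """Return the bricks that still report pending entries."""
    return [brick for brick, n in counts.items() if n > 0]


if __name__ == "__main__":
    counts = parse_heal_info(SAMPLE)
    print(counts)
    print("pending:", bricks_needing_heal(counts))
```

An empty "pending" list only means gluster reports nothing left to heal; as the rest of the thread shows, it does not by itself prove that every file has the expected content.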

noticed this in host /var/log/messages (while looking for something
else).  Loop seems to repeat over and over.

Jan 26 15:35:52 office4a sanlock[3763]: 2014-01-26 15:35:52-0500 14678
[30419]: read_sectors delta_leader offset 512 rv -90
/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids


Jan 26 15:35:53 office4a sanlock[3763]: 2014-01-26 15:35:53-0500 14679
[3771]: s1997 add_lockspace fail result -90
Jan 26 15:35:58 office4a vdsm TaskManager.Task ERROR
Task=`89885661-88eb-4ea3-8793-00438735e4ab`::Unexpected
error#012Traceback
(most recent call last):#012  File "/usr/share/vdsm/storage/task.py",
line
857, in _run#012 return fn(*args, **kargs)#012  File
"/usr/share/vdsm/logUtils.py", line 45, in wrapper#012res = f(*args,
**kwargs)#012  File "/usr/share/vdsm/storage/hsm.py", line 2111, in
getAllTasksStatuses#012allTasksStatus = sp.getAllTasksStatuses()#012
File "/usr/share/vdsm/storage/securable.py", line 66, in wrapper#012
raise
SecureError()#012SecureError
Jan 26 15:35:59 office4a sanlock[3763]: 2014-01-26 15:35:59-0500 14686
[30495]: read_sectors delta_leader offset 512 rv -90
/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids


Jan 26 15:36:00 office4a sanlock[3763]: 2014-01-26 15:36:00-0500 14687
[3772]: s1998 add_lockspace fail result -90
Jan 26 15:36:00 office4a vdsm TaskManager.Task ERROR
Task=`8db9ff1a-2894-407a-915a-279f6a7eb205`::Unexpected
error#012Traceback
(most recent call last):#012  File "/usr/share/vdsm/storage/task.py",
line
857, in _run#012 return fn(*args, **kargs)#012  File
"/usr/share/vdsm/storage/task.py", line 318, in run#012return
self.cmd(*self.argslist, **self.argsdict)#012 File
"/usr/share/vdsm/storage/sp.py", line 273, in startSpm#012
self.masterDomain.acquireHostId(self.id)#012  File
"/usr/share/vdsm/storage/sd.py", line 458, in acquireHostId#012
self._clusterLock.acquireHostId(hostId, async)#012  File
"/usr/share/vdsm/storage/clusterlock.py", line 189, in
acquireHostId#012raise se.AcquireHostIdFailure(self._sdUUID,
e)#012AcquireHostIdFailure: Cannot acquire host id:
('0322a407-2b16-40dc-ac67-13d387c6eb4c', SanlockException(90, 'Sanlock
lockspace add failure', 'Message too long'))
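The "rv -90" and "result -90" values in the log above appear to be negated errno codes, which matches the 'Message too long' text in the SanlockException. A minimal Python sketch for decoding such a value follows; describe_rv is a hypothetical helper, and the numeric value of each errno name depends on the platform (the log comes from a Linux host).

```python
import errno
import os


def describe_rv(rv):
    """Translate a negated errno value (such as -90) into (name, message).

    Hypothetical helper: assumes the caller passes a value that follows the
    common "negative errno" convention.
    """
    code = abs(rv)
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # -90 is the value reported by sanlock in the log lines above.
    print(describe_rv(-90))
```

Run on the affected host, this gives the symbolic name to search for in the sanlock and vdsm sources or documentation.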

fede - thoughts on above?
(vojtech reported something similar, but it sorted out for him after
some retries)

Something truncated the ids file, as also reported by:


[root@office4a ~]$ ls /rhev/data-center/mnt/glusterSD/10.41.65.2\:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ -l
total 1029
-rw-rw 1 vdsm kvm 0 Jan 22 00:44 ids
-rw-rw 1 vdsm kvm 0 Jan 16 18:50 inbox
-rw-rw 1 vdsm kvm 2097152 Jan 21 18:20 leases
-rw-r--r-- 1 vdsm kvm 491 Jan 21 18:20 metadata
-rw-rw 1 vdsm kvm 0 Jan 16 18:50 outbox
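The listing shows the ids file with a size of 0 bytes, which is what points to truncation. A minimal Python sketch of the same check is below; empty_files is a hypothetical helper, and the default of checking only "ids" is an assumption taken from this thread (inbox and outbox are also 0 bytes in the listing and are not flagged as a problem here).

```python
import os
import tempfile


def empty_files(directory, names=("ids",)):
    """Return the entries of `names` that exist in `directory` with size 0."""
    found = []
    for name in names:
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.path.getsize(path) == 0:
            found.append(name)
    return found


if __name__ == "__main__":
    # Self-contained demo on a scratch directory; on a real host, pass the
    # dom_md directory of the storage domain instead.
    with tempfile.TemporaryDirectory() as scratch:
        open(os.path.join(scratch, "ids"), "w").close()   # simulated truncated file
        with open(os.path.join(scratch, "leases"), "wb") as handle:
            handle.write(b"\0" * 1024)                     # non-empty file
        print(empty_files(scratch, ("ids", "leases")))
```

The check is read-only, so it can be run while the hosts are still contending for the lockspace.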

In the past I saw that happening because of a glusterfs bug:

https://bugzilla.redhat.com/show_bug.cgi?id=862975

Anyway in general it seems that glusterfs is not always able to reconcile
the ids file (as it's written by all the hosts at the same time).

Maybe someone from gluster can identify easily what happened. Meanwhile if
you just want to repair your data-center you could try with:

  $ cd /rhev/data-center/mnt/glusterSD/10.41.65.2\:VM2/0322a407-2b16-

[Users] oVirt 3.4 test day: iscsi multipath

2014-01-27 Thread Francesco Romani

Hi everyone,

During the last oVirt 3.4 test day I played a bit with iSCSI multipath:

http://www.ovirt.org/Feature/iSCSI-Multipath#Configure_iSCSI_Multipathing

It took me longer than expected to set things up properly but, once done,
everything looked fine at the config level (both as seen from the engine and
in the actual OS configuration) and worked well in my tests, including
physical disconnections mid-I/O. Well done! :)

-- 
Francesco Romani
RedHat Engineering Virtualization R & D
IRC: fromani


Re: [Users] Data Center stuck between "Non Responsive" and "Contending"

2014-01-27 Thread Federico Simoncelli
- Original Message -
> From: "Itamar Heim" 
> To: "Ted Miller" , users@ovirt.org, "Federico Simoncelli" 
> 
> Cc: "Allon Mureinik" 
> Sent: Sunday, January 26, 2014 11:17:04 PM
> Subject: Re: [Users] Data Center stuck between "Non Responsive" and 
> "Contending"
> 
> On 01/27/2014 12:00 AM, Ted Miller wrote:
> >
> > On 1/26/2014 4:00 PM, Itamar Heim wrote:
> >> On 01/26/2014 10:51 PM, Ted Miller wrote:
> >>>
> >>> On 1/26/2014 3:10 PM, Itamar Heim wrote:
>  On 01/26/2014 10:08 PM, Ted Miller wrote:
>  is this gluster storage (guessing since you mentioned a 'volume')?
> >>> yes (mentioned under "setup" above)
>  does it have a quorum?
> >>> Volume Name: VM2
> >>> Type: Replicate
> >>> Volume ID: 7bea8d3b-ec2a-4939-8da8-a82e6bda841e
> >>> Status: Started
> >>> Number of Bricks: 1 x 3 = 3
> >>> Transport-type: tcp
> >>> Bricks:
> >>> Brick1: 10.41.65.2:/bricks/01/VM2
> >>> Brick2: 10.41.65.4:/bricks/01/VM2
> >>> Brick3: 10.41.65.4:/bricks/101/VM2
> >>> Options Reconfigured:
> >>> cluster.server-quorum-type: server
> >>> storage.owner-gid: 36
> >>> storage.owner-uid: 36
> >>> auth.allow: *
> >>> user.cifs: off
> >>> nfs.disa
>  (there were earlier reports of split brain on the domain metadata when
>  no quorum existed for gluster)
> >>> after full heal:
> >>>
> >>> [root@office4a ~]$ gluster volume heal VM2 info
> >>> Gathering Heal info on volume VM2 has been successful
> >>>
> >>> Brick 10.41.65.2:/bricks/01/VM2
> >>> Number of entries: 0
> >>>
> >>> Brick 10.41.65.4:/bricks/01/VM2
> >>> Number of entries: 0
> >>>
> >>> Brick 10.41.65.4:/bricks/101/VM2
> >>> Number of entries: 0
> >>> [root@office4a ~]$ gluster volume heal VM2 info split-brain
> >>> Gathering Heal info on volume VM2 has been successful
> >>>
> >>> Brick 10.41.65.2:/bricks/01/VM2
> >>> Number of entries: 0
> >>>
> >>> Brick 10.41.65.4:/bricks/01/VM2
> >>> Number of entries: 0
> >>>
> >>> Brick 10.41.65.4:/bricks/101/VM2
> >>> Number of entries: 0
> >>>
> >>> noticed this in host /var/log/messages (while looking for something
> >>> else).  Loop seems to repeat over and over.
> >>>
> >>> Jan 26 15:35:52 office4a sanlock[3763]: 2014-01-26 15:35:52-0500 14678
> >>> [30419]: read_sectors delta_leader offset 512 rv -90
> >>> /rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
> >>>
> >>>
> >>> Jan 26 15:35:53 office4a sanlock[3763]: 2014-01-26 15:35:53-0500 14679
> >>> [3771]: s1997 add_lockspace fail result -90
> >>> Jan 26 15:35:58 office4a vdsm TaskManager.Task ERROR
> >>> Task=`89885661-88eb-4ea3-8793-00438735e4ab`::Unexpected
> >>> error#012Traceback
> >>> (most recent call last):#012  File "/usr/share/vdsm/storage/task.py",
> >>> line
> >>> 857, in _run#012 return fn(*args, **kargs)#012  File
> >>> "/usr/share/vdsm/logUtils.py", line 45, in wrapper#012res = f(*args,
> >>> **kwargs)#012  File "/usr/share/vdsm/storage/hsm.py", line 2111, in
> >>> getAllTasksStatuses#012allTasksStatus = sp.getAllTasksStatuses()#012
> >>> File "/usr/share/vdsm/storage/securable.py", line 66, in wrapper#012
> >>> raise
> >>> SecureError()#012SecureError
> >>> Jan 26 15:35:59 office4a sanlock[3763]: 2014-01-26 15:35:59-0500 14686
> >>> [30495]: read_sectors delta_leader offset 512 rv -90
> >>> /rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
> >>>
> >>>
> >>> Jan 26 15:36:00 office4a sanlock[3763]: 2014-01-26 15:36:00-0500 14687
> >>> [3772]: s1998 add_lockspace fail result -90
> >>> Jan 26 15:36:00 office4a vdsm TaskManager.Task ERROR
> >>> Task=`8db9ff1a-2894-407a-915a-279f6a7eb205`::Unexpected
> >>> error#012Traceback
> >>> (most recent call last):#012  File "/usr/share/vdsm/storage/task.py",
> >>> line
> >>> 857, in _run#012 return fn(*args, **kargs)#012  File
> >>> "/usr/share/vdsm/storage/task.py", line 318, in run#012return
> >>> self.cmd(*self.argslist, **self.argsdict)#012 File
> >>> "/usr/share/vdsm/storage/sp.py", line 273, in startSpm#012
> >>> self.masterDomain.acquireHostId(self.id)#012  File
> >>> "/usr/share/vdsm/storage/sd.py", line 458, in acquireHostId#012
> >>> self._clusterLock.acquireHostId(hostId, async)#012  File
> >>> "/usr/share/vdsm/storage/clusterlock.py", line 189, in
> >>> acquireHostId#012raise se.AcquireHostIdFailure(self._sdUUID,
> >>> e)#012AcquireHostIdFailure: Cannot acquire host id:
> >>> ('0322a407-2b16-40dc-ac67-13d387c6eb4c', SanlockException(90, 'Sanlock
> >>> lockspace add failure', 'Message too long'))
> 
> fede - thoughts on above?
> (vojtech reported something similar, but it sorted out for him after
> some retries)

Something truncated the ids file, as also reported by:

> [root@office4a ~]$ ls /rhev/data-center/mnt/glusterSD/10.41.65.2\:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ -l
> total 1029
> -rw-rw 1 vdsm kvm 0 Jan 22 00:44 ids
> -rw-rw 1 vdsm kvm 0 Jan 16 18:50 inbox
> -rw-rw 1 vdsm kvm 2097152 Jan 21 18:20 leases
> -rw-r--r-- 1 vdsm kvm 491 Jan 21 18:20 metadata
>

Re: [Users] Hosted-Engine startup problem

2014-01-27 Thread Sandro Bonazzola
On 24/01/2014 09:11, Sebastian Classen wrote:
> Hi,
> 
> we installed the oVirt beta with hosted-engine. After setup was complete, the
> engine VM rebooted and never came up again. It looks like the host is unable
> to find the VM. As requested on IRC, I attached the relevant logs.
> 
> Please CC me, as I'm not subscribed.

This looks like Bug 1055059 ("The --vm-start function does not call the createvm
command but --vm-start-paused does").

The suggested workaround is:
# hosted-engine --vm-start-paused
# virsh resume HostedEngine

We're still investigating.


> 
> Greets
>   Sebastian.
> 


-- 
Sandro Bonazzola
Better technology. Faster innovation. Powered by community collaboration.
See how it works at redhat.com


[Users] multiple storage domains

2014-01-27 Thread Michal Skrivanek
Hey,
I was trying out multiple storage domains during the 3.4 test day and,
even though it took me ages to set the storage up correctly, the feature itself
seems to be working just fine. It is much easier to select what kind of storage I
want to use: just a Shared vs. Local storage pool. A VM could use volumes from
different domains and I didn't have to worry about the right type (other than
local vs. shared, obviously :)

Thanks,
michal