Re: [ovirt-users] hosted-engine setup Gluster fails to execute

2016-05-01 Thread Sahina Bose
You will need to provide the hosted-engine setup log to see which 
gluster command failed to execute.


On 04/30/2016 10:10 PM, Langley, Robert wrote:


I’m attempting to host the engine within a GlusterFS Replica 3 storage 
volume.


During setup, after entering the server and volume, I’m receiving the 
message that ‘/sbin/gluster’ failed to execute.


Reviewing the gluster cmd log, it looks as though /sbin/gluster does 
execute.


I can successfully mount the volume on the host outside of the 
hosted-engine setup.


Any assistance would be appreciated.



___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users




Re: [ovirt-users] iSCSI Data Domain Down

2016-05-01 Thread Clint Boggio
Thank you so much Arman. With use of that command, I was able to restore 
service.

I really appreciate the help

> On May 1, 2016, at 2:58 PM, Arman Khalatyan  wrote:
> 
> Hi, before starting targetcli you should remove all LVM auto-imported 
> volumes:
> dmsetup remove_all
> Then restart your targetcli.
> On 01.05.2016 at 1:51 PM, "Clint Boggio" wrote:
>> Greetings oVirt Family;
>> 
>> Due to catastrophic power failure, my datacenter lost power. I am using a 
>> CentOS7 server to provide ISCSI services to my OVirt platform.
>> 
>> When the power came back on, and the iscsi server booted back up, the 
>> filters in lvm.conf were faulty and LVM assumed control over the LVM's that 
>> OVirt uses as the disks for the VMs. This tanked target.service because it 
>> claims "device already in use" and my datacenter is down.
>> 
>> I've tried several filter combinations in lvm.conf to no avail, and in my 
>> search I've found no documentation on how to make LVM "forget" about the 
>> volumes that it had assumed and release them.
>> 
>> Do any of you know of a procedure to make lvm forget about and release the 
>> volumes on /dev/sda ?
>> 
>> OVirt 3.6.5 on CentOS 7
>> 4 Hypervisor nodes CentOS7
>> 1 Dedicated engine CentOS7
>> 1 iscsi SAN CentOS 7 exporting 10TB block device from a Dell Perc RAID 
>> controller /dev/sda with targetcli.
>> 1 NFS server for ISO and Export Domains 5TB
>> 
>> I'm out of ideas and any help would be greatly appreciated.
>> 
>> I'm currently using dd to recover the VM disk drives over to the NFS server 
>> in case this cannot be recovered.


Re: [ovirt-users] Can't remove snapshot

2016-05-01 Thread Marcelo Leandro
Hello,
I have a problem deleting a snapshot.
Output of the vm-disk-info.py script:

Warning: volume 023110fa-7d24-46ec-ada8-d617d7c2adaf is in chain but illegal
Volumes:
a09bfb5d-3922-406d-b4e0-daafad96ffec

After running the md5sum command, I found that the volume that changes is
the base:
a09bfb5d-3922-406d-b4e0-daafad96ffec

The disk 023110fa-7d24-46ec-ada8-d617d7c2adaf does not change.

Thanks.



2016-03-18 16:50 GMT-03:00 Greg Padgett :

> On 03/18/2016 03:10 PM, Nir Soffer wrote:
>
>> On Fri, Mar 18, 2016 at 7:55 PM, Nathanaël Blanchet 
>> wrote:
>>
>>> Hello,
>>>
>>> I can create snapshot when no one exists but I'm not able to remove it
>>> after.
>>>
>>
>> Do you try to remove it when the vm is running?
>>
>>> It concerns many of my vms, and when stopping them, they can't boot
>>> anymore
>>> because of the illegal status of the disks, this leads me in a critical
>>> situation
>>>
>>> VM fedora23 is down with error. Exit message: Unable to get volume size
>>> for
>>> domain 5ef8572c-0ab5-4491-994a-e4c30230a525 volume
>>> e5969faa-97ea-41df-809b-cc62161ab1bc
>>>
>>> Since I didn't initiate any live merge, am I affected by this bug
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1306741?
>>> I'm running 3.6.2, will upgrading to 3.6.3 solve this issue?
>>>
>>
>> If you tried to remove a snapshot while the vm is running you did
>> initiate live merge, and this bug may affect you.
>>
>> Adding Greg for adding more info about this.
>>
>>
> Hi Nathanaël,
>
> From the logs you pasted below, showing RemoveSnapshotSingleDiskCommand
> (not ..SingleDiskLiveCommand), it looks like a non-live snapshot.  In that
> case, bug 1306741 would not affect you.
>
> To dig deeper, we'd need to know the root cause of why the image could not
> be deleted.  You should be able to find some clues in your engine log above
> the snippet you pasted below, or perhaps something in the vdsm log will
> reveal the reason.
>
> Thanks,
> Greg
>
>
>
>>> 2016-03-18 18:26:57,652 ERROR
>>> [org.ovirt.engine.core.bll.RemoveSnapshotCommand]
>>> (org.ovirt.thread.pool-8-thread-39) [a1e222d] Ending command
>>> 'org.ovirt.engine.core.bll.RemoveSnapshotCommand' with failure.
>>> 2016-03-18 18:26:57,663 ERROR
>>> [org.ovirt.engine.core.bll.RemoveSnapshotCommand]
>>> (org.ovirt.thread.pool-8-thread-39) [a1e222d] Could not delete image
>>> '46e9ecc8-e168-4f4d-926c-e769f5df1f2c' from snapshot
>>> '88fcf167-4302-405e-825f-ad7e0e9f6564'
>>> 2016-03-18 18:26:57,678 WARN
>>> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
>>> (org.ovirt.thread.pool-8-thread-39) [a1e222d] Correlation ID: a1e222d,
>>> Job
>>> ID: 00d3e364-7e47-4022-82ff-f772cd79d4a1, Call Stack: null, Custom Event
>>> ID:
>>> -1, Message: Due to partial snapshot removal, Snapshot 'test' of VM
>>> 'fedora23' now contains only the following disks: 'fedora23_Disk1'.
>>> 2016-03-18 18:26:57,695 ERROR
>>> [org.ovirt.engine.core.bll.RemoveSnapshotSingleDiskCommand]
>>> (org.ovirt.thread.pool-8-thread-39) [724e99fd] Ending command
>>> 'org.ovirt.engine.core.bll.RemoveSnapshotSingleDiskCommand' with failure.
>>> 2016-03-18 18:26:57,708 ERROR
>>> [org.ovirt.engine.core.dal.dbbroker.auditloghandlin
>>>
>>> Thank you for your help.
>>>
>>>
>>> On 23/02/2016 19:51, Greg Padgett wrote:
>>>

 On 02/22/2016 07:10 AM, Marcelo Leandro wrote:

>
> Hello,
>
> Will the snapshot bug be fixed in oVirt 3.6.3?
>
> thanks.
>
>
 Hi Marcelo,

 Yes, the bug below (bug 1301709) is now targeted to 3.6.3.

 Thanks,
 Greg

 2016-02-18 11:34 GMT-03:00 Adam Litke :
>
>>
>> On 18/02/16 10:37 +0100, Rik Theys wrote:
>>
>>>
>>>
>>> Hi,
>>>
>>> On 02/17/2016 05:29 PM, Adam Litke wrote:
>>>


 On 17/02/16 11:14 -0500, Greg Padgett wrote:

>
>
> On 02/17/2016 03:42 AM, Rik Theys wrote:
>
>>
>>
>> Hi,
>>
>> On 02/16/2016 10:52 PM, Greg Padgett wrote:
>>
>>>
>>>
>>> On 02/16/2016 08:50 AM, Rik Theys wrote:
>>>


From the above I conclude that the disk with id that ends
 with

>>>
>>>
>>> Similar to what I wrote to Marcelo above in the thread, I'd
>>> recommend
>>> running the "VM disk info gathering tool" attached to [1].  It's
>>> the
>>> best way to ensure the merge was completed and determine which
>>> image
>>> is
>>> the "bad" one that is no longer in use by any volume chains.
>>>
>>
>>
>>
>> I've run the disk info gathering tool and this outputs (for the
>> affected
>> VM):
>>
>> VM lena
>>   Disk b2390535-744f-4c02-bdc8-5a897226554b
>> 

Re: [ovirt-users] iSCSI Data Domain Down

2016-05-01 Thread Arman Khalatyan
Hi, before starting targetcli you should remove all LVM auto-imported
volumes:
dmsetup remove_all
Then restart your targetcli.
On 01.05.2016 at 1:51 PM, "Clint Boggio" wrote:

> Greetings oVirt Family;
>
> Due to catastrophic power failure, my datacenter lost power. I am using a
> CentOS7 server to provide ISCSI services to my OVirt platform.
>
> When the power came back on, and the iscsi server booted back up, the
> filters in lvm.conf were faulty and LVM assumed control over the LVM's that
> OVirt uses as the disks for the VMs. This tanked target.service because it
> claims "device already in use" and my datacenter is down.
>
> I've tried several filter combinations in lvm.conf to no avail, and in my
> search I've found no documentation on how to make LVM "forget" about the
> volumes that it had assumed and release them.
>
> Do any of you know of a procedure to make lvm forget about and release the
> volumes on /dev/sda ?
>
> OVirt 3.6.5 on CentOS 7
> 4 Hypervisor nodes CentOS7
> 1 Dedicated engine CentOS7
> 1 iscsi SAN CentOS 7 exporting 10TB block device from a Dell Perc RAID
> controller /dev/sda with targetcli.
> 1 NFS server for ISO and Export Domains 5TB
>
> I'm out of ideas and any help would be greatly appreciated.
>
> I'm currently using dd to recover the VM disk drives over to the NFS
> server in case this cannot be recovered.


Re: [ovirt-users] hosted-engine --deploy errors out with code "29" -- "no link present"

2016-05-01 Thread Edward Haas
On Fri, Apr 29, 2016 at 9:20 AM, Sandro Bonazzola 
wrote:

>
>
> On Thu, Apr 28, 2016 at 11:06 PM, Beckman, Daniel <
> daniel.beck...@ingramcontent.com> wrote:
>
>> Hello,
>>
>>
>>
>> I’m trying to setup oVirt for the first time using hosted engine. This is
>> on a Dell PowerEdge R720 (512GB RAM), with 2 10G interfaces (connected to
>> regular access ports on the switch, DHCP), and using external iSCSI
>> storage. This is on CentOS 7.2 (latest) with the 4.5 kernel from EPEL.
>> Here’s the main error I’m getting at the end of setup:
>>
>>
>>
>> RuntimeError: Failed to setup networks {'ovirtmgmt': {'nic': 'p1p1',
>> 'bootproto': 'dhcp', 'blockingdhcp': True, 'defaultRoute': True}}. Error
>> code: "29" message: "Determining IP information for ovirtmgmt... failed; no
>> link present.  Check cable?"
>>
>
>
> This message comes from vdsm, can you please attach vdsm log?
>
>
>
>
>>
>>
>> Here is what that interface ‘p1p1’ looks like:
>>
>>
>>
>> [root@labvmhostt01 ovirt-hosted-engine-setup]# cat
>> /etc/sysconfig/network-scripts/ifcfg-p1p1
>>
>> # Generated by dracut initrd
>>
>> DEVICE="p1p1"
>>
>> ONBOOT=yes
>>
>> UUID="9d2666a5-9b72-4f9e-b4e9-4bfb6ad9b263"
>>
>> IPV6INIT=no
>>
>> BOOTPROTO=dhcp
>>
>> DEFROUTE=yes
>>
>> HWADDR="a0:36:9f:33:39:e8"
>>
>> TYPE=Ethernet
>>
>> NAME="p1p1"
>>
>> PERSISTENT_DHCLIENT=1
>>
>> NM_CONTROLLED=no
>>
>> LINKDELAY=10
>>
>>
>>
>> Note that I had added ‘linkdelay=10’ because that interface takes a while
>> to come up. Without it, an ‘ifup p1p1’ will generate that same error about
>> “no link present. Check cable?”. It works after a second ‘ifup p1p1’. With
>> the linkdelay option it works right away. I wonder if that’s related.  From
>> /var/log/messages:
>>
>
You are correct, it is related.
VDSM takes over the ifcfg configuration of the interfaces it manages and
writes its own settings (overwriting the existing config after backing it
up).
At the moment, we do not support the LINKDELAY option.

As a workaround, you could use a VDSM before_ifcfg_write hook to add this
parameter for the specific nic.
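
A minimal sketch, in Python, of the text change such a hook would have to make. How VDSM passes the file path and content to a before_ifcfg_write hook, and how it reads the result back, is not shown here and should be taken from the VDSM hook documentation. The helper name add_linkdelay, its parameters, and the 10 second default (copied from the ifcfg-p1p1 quoted in this thread) are illustrative assumptions.

```python
import os


def add_linkdelay(ifcfg_path, ifcfg_text, nic="p1p1", seconds=10):
    """Return ifcfg_text with LINKDELAY set, but only for the given nic.

    Hypothetical helper: the hook plumbing that supplies ifcfg_path and
    ifcfg_text is assumed, not taken from VDSM.
    """
    # Leave every other interface's configuration untouched.
    if os.path.basename(ifcfg_path) != "ifcfg-" + nic:
        return ifcfg_text

    # Drop any existing LINKDELAY line so the setting is not duplicated.
    kept = [
        line
        for line in ifcfg_text.splitlines()
        if not line.strip().upper().startswith("LINKDELAY=")
    ]
    kept.append("LINKDELAY=%d" % seconds)
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    sample = 'DEVICE="p1p1"\nONBOOT=yes\nBOOTPROTO=dhcp\n'
    print(add_linkdelay("/etc/sysconfig/network-scripts/ifcfg-p1p1", sample))
```

Restricting the change to one file name keeps the other interfaces that VDSM manages exactly as VDSM wrote them.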

Please open a bug on this so we can track it.

Thanks,
Edy.



>>
>> Apr 28 15:24:51 localhost dhclient[5976]: dhclient.c:2680: Failed to bind
>> fallback interface to ovirtmgmt: No such device
>>
>> Apr 28 15:25:01 localhost dhclient[5976]: DHCPREQUEST on ovirtmgmt to
>> 10.50.3.2 port 67 (xid=0x6d98d072)
>>
>> Apr 28 15:25:01 localhost dhclient[5976]: dhclient.c:2680: Failed to bind
>> fallback interface to ovirtmgmt: No such device
>>
>> Apr 28 15:25:06 localhost systemd: Started /usr/sbin/ifup ovirtmgmt.
>>
>> Apr 28 15:25:06 localhost systemd: Starting /usr/sbin/ifup ovirtmgmt.
>>
>> Apr 28 15:25:06 localhost kernel: IPv6: ADDRCONF(NETDEV_UP): ovirtmgmt:
>> link is not ready
>>
>> Apr 28 15:25:12 localhost kernel: ovirtmgmt: port 1(p1p1) entered
>> disabled state
>>
>> Apr 28 15:25:48 localhost journal: vdsm vds ERROR Determining IP
>> information for ovirtmgmt... failed; no link present.  Check
>> cable?#012Traceback (most recent call last):#012  File
>> "/usr/share/vdsm/API.py", line 1648, in _rollback#012yield
>> rollbackCtx#012  File "/usr/share/vdsm/API.py", line 1500, in
>> setupNetworks#012supervdsm.getProxy().setupNetworks(networks, bondings,
>> options)#012  File "/usr/share/vdsm/supervdsm.py", line 50, in
>> __call__#012return callMethod()#012  File
>> "/usr/share/vdsm/supervdsm.py", line 48, in <lambda>#012**kwargs)#012
>> File "<string>", line 2, in setupNetworks#012  File
>> "/usr/lib64/python2.7/multiprocessing/managers.py", line 773, in
>> _callmethod#012raise convert_to_error(kind,
>> result)#012ConfigNetworkError: (29, 'Determining IP information for
>> ovirtmgmt... failed; no link present.  Check cable?')
>>
>>
>>
>> I’m attaching the setup log file. The physical interface p1p1 is indeed
>> stable once up. Any help would be appreciated!
>>
>>
>>
>> Thanks,
>>
>> Daniel
>>
>
>
> --
> Sandro Bonazzola
> Better technology. Faster innovation. Powered by community collaboration.
> See how it works at redhat.com
>


[ovirt-users] node unresponsive after reboot

2016-05-01 Thread Cam Mac
Hi,

I have a two node + engine ovirt setup, and I was having problems
doing a live migration between nodes. I looked in the vdsm logs and
noticed selinux errors, so I checked the selinux config, and both the
ovirt-engine host and one of the nodes had selinux disabled. So I
thought I would enable it on these two hosts, as it is officially
supported anyway. I started with the node, and put it into maintenance
mode, which interestingly, migrated the VMs off to the other node
without issue. After modifying the selinux config, I then rebooted
that node, which came back up. I then tried to activate the node but
it fails and marks it as unresponsive.

--8<--

2016-04-28 16:34:31,326 INFO [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [29acb18b] Connecting to kvm-ldn-02/172.16.23.12
2016-04-28 16:34:31,327 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-32) [ac322cb] Command 'GetCapabilitiesVDSCommand(HostName = kvm-ldn-02, VdsIdAndVdsVDSCommandParametersBase:{runAsync='true', hostId='b12c0b80-d64d-42fd-8a55-94f92b9ca3aa', vds='Host[kvm-ldn-02,b12c0b80-d64d-42fd-8a55-94f92b9ca3aa]'})' execution failed: org.ovirt.vdsm.jsonrpc.client.ClientConnectionException: Connection failed
2016-04-28 16:34:31,327 ERROR [org.ovirt.engine.core.vdsbroker.HostMonitoring] (DefaultQuartzScheduler_Worker-32) [ac322cb] Failure to refresh Vds runtime info: org.ovirt.vdsm.jsonrpc.client.ClientConnectionException: Connection failed
2016-04-28 16:34:31,327 ERROR [org.ovirt.engine.core.vdsbroker.HostMonitoring] (DefaultQuartzScheduler_Worker-32) [ac322cb] Exception: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: org.ovirt.vdsm.jsonrpc.client.ClientConnectionException: Connection failed
    at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.createNetworkException(VdsBrokerCommand.java:157) [vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.executeVDSCommand(VdsBrokerCommand.java:120) [vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.VDSCommandBase.executeCommand(VDSCommandBase.java:65) [vdsbroker.jar:]
    at org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:33) [dal.jar:]
    at org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:467) [vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.VdsManager.refreshCapabilities(VdsManager.java:652) [vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.HostMonitoring.refreshVdsRunTimeInfo(HostMonitoring.java:119) [vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.HostMonitoring.refresh(HostMonitoring.java:84) [vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.VdsManager.onTimer(VdsManager.java:227) [vdsbroker.jar:]
    at sun.reflect.GeneratedMethodAccessor120.invoke(Unknown Source) [:1.8.0_71]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_71]
    at java.lang.reflect.Method.invoke(Method.java:497) [rt.jar:1.8.0_71]
    at org.ovirt.engine.core.utils.timer.JobWrapper.invokeMethod(JobWrapper.java:81) [scheduler.jar:]
    at org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:52) [scheduler.jar:]
    at org.quartz.core.JobRunShell.run(JobRunShell.java:213) [quartz.jar:]
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:557) [quartz.jar:]
Caused by: org.ovirt.vdsm.jsonrpc.client.ClientConnectionException: Connection failed
    at org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient.connect(ReactorClient.java:157) [vdsm-jsonrpc-java-client.jar:]
    at org.ovirt.vdsm.jsonrpc.client.JsonRpcClient.getClient(JsonRpcClient.java:114) [vdsm-jsonrpc-java-client.jar:]
    at org.ovirt.vdsm.jsonrpc.client.JsonRpcClient.call(JsonRpcClient.java:73) [vdsm-jsonrpc-java-client.jar:]
    at org.ovirt.engine.core.vdsbroker.jsonrpc.FutureMap.<init>(FutureMap.java:68) [vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.jsonrpc.JsonRpcVdsServer.getCapabilities(JsonRpcVdsServer.java:268) [vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand.executeVdsBrokerCommand(GetCapabilitiesVDSCommand.java:15) [vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.executeVDSCommand(VdsBrokerCommand.java:110) [vdsbroker.jar:]
    ... 14 more

--8<--

Any ideas?

Thanks for any help,

Cam


Re: [ovirt-users] VMs becoming non-responsive sporadically

2016-05-01 Thread nicolas

On 2016-05-01 14:01, Nir Soffer wrote:
On Sun, May 1, 2016 at 3:31 PM,   wrote:
On 2016-04-30 23:22, Nir Soffer wrote:
On Sun, May 1, 2016 at 12:48 AM,   wrote:
On 2016-04-30 22:37, Nir Soffer wrote:
On Sat, Apr 30, 2016 at 10:28 PM, Nir Soffer  wrote:
On Sat, Apr 30, 2016 at 7:16 PM,   wrote:
On 2016-04-30 16:55, Nir Soffer wrote:
On Sat, Apr 30, 2016 at 11:33 AM, Nicolás  wrote:

Hi Nir,

On 29/04/16 at 22:34, Nir Soffer wrote:
On Fri, Apr 29, 2016 at 9:17 PM,   wrote:

Hi,

We're running oVirt 3.6.5.3-1 and lately we're experiencing some issues with
some VMs being paused because they're marked as non-responsive. Mostly,
after a few seconds they recover, but we want to debug precisely this
problem so we can fix it consistently.

Our scenario is the following:

~495 VMs, of which ~120 are constantly up
3 datastores, all of them iSCSI-based:
   * ds1: 2T, currently has 276 disks
   * ds2: 2T, currently has 179 disks
   * ds3: 500G, currently has 65 disks
7 hosts: All have mostly the same hardware. CPU and memory are currently
very lightly used (< 10%).

ds1 and ds2 are physically the same backend which exports two 2TB volumes.
ds3 is a different storage backend where we're currently migrating some
disks from ds1 and ds2.

What is the storage backend behind ds1 and 2?

The storage backend for ds1 and ds2 is the iSCSI-based HP LeftHand P4000 G2.

Usually, when VMs become unresponsive, the whole host where they run gets
unresponsive too, so that gives a hint about the problem, my bet is the
culprit is somewhere on the host side and not on the VMs side.

Probably the vm became unresponsive because connection to the host was lost.

I forgot to mention that less commonly we have situations where the host
doesn't get unresponsive but the VMs on it do and they don't become
responsive ever again, so we have to forcibly power them off and start them
on a different host. But in this case the connection with the host doesn't
ever get lost (so basically the host is Up, but any VM run on them is
unresponsive).

When that happens, the host itself gets non-responsive and only recoverable
after reboot, since it's unable to reconnect.

Piotr, can you check engine log and explain why host is not reconnected?

I must say this is not specific to this oVirt version, when we were using
v.3.6.4 the same happened, and it's also worth mentioning we've not done any
configuration changes and everything had been working quite well for a long
time.

We were monitoring our ds1 and ds2 physical backend to see performance and
we suspect we've run out of IOPS since we're reaching the maximum specified
by the manufacturer, probably at certain times the host cannot perform a
storage operation within some time limit and it marks VMs as unresponsive.
That's why we've set up ds3 and we're migrating ds1 and ds2 to ds3. When we
run out of space on ds3 we'll create more smaller volumes to keep migrating.

On the host side, when this happens, we've run repoplot on the vdsm log and
I'm attaching the result. Clearly there's a *huge* LVM response time (~30
secs.).

Indeed the log shows very slow vgck and vgs commands - these are called
every 5 minutes for checking the vg health and refreshing vdsm lvm cache.

1. starting vgck

Thread-96::DEBUG::2016-04-29 13:17:48,682::lvm::290::Storage.Misc.excCmd::(cmd)
/usr/bin/taskset --cpu-list 0-23 /usr/bin/sudo -n /usr/sbin/lvm vgck --config '
devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1
write_cache_state=0 disable_after_error_count=3 filter = [
'\''a|/dev/mapper/36000eb3a4f1acbc20043|'\'', '\''r|.*|'\'' ] }  global {
locking_type=1 prioritise_write_locks=1  wait_for_locks=1  use_lvmetad=0 }
backup { retain_min = 50  retain_days = 0 } '
5de4a000-a9c4-489c-8eee-10368647c413 (cwd None)

2. vgck ends after 55 seconds

Thread-96::DEBUG::2016-04-29 13:18:43,173::lvm::290::Storage.Misc.excCmd::(cmd)
SUCCESS: <err> = ' WARNING: lvmetad is running but disabled. Restart lvmetad
before enabling it!\n'; <rc> = 0

3. starting vgs

Thread-96::DEBUG::2016-04-29 13:17:11,963::lvm::290::Storage.Misc.excCmd::(cmd)
/usr/bin/taskset --cpu-list 0-23 /usr/bin/sudo -n /usr/sbin/lvm vgs --config '
devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1
write_cache_state=0 disable_after_error_count=3 filter = [
'\''a|/dev/mapper/36000eb3a4f1acbc20043|/dev/mapper/36000eb3a4f1acbc200b9|/dev/mapper/360014056f0dc8930d744f83af8ddc709|/dev/mapper/WDC_WD5003ABYZ-011FA0_WD-WMAYP0J73DU6|'\'',
'\''r|.*|'\'' ] }  global { locking_type=1  prioritise_write_locks=1
wait_for_locks=1 use_lvmetad=0 }  backup {  retain_min = 50  retain_days = 0 } '
--noheadings --units b --nosuffix --separator '|'
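
The elapsed time between the vgck start and end entries quoted above can be worked out from the vdsm log timestamps. The following is a rough Python sketch, assuming each entry carries a `::YYYY-MM-DD HH:MM:SS,mmm::` field as in the excerpts; the names log_time and elapsed_seconds are illustrative, and the two sample entries are shortened copies of the quoted lines.

```python
import re
from datetime import datetime

# Matches the "::2016-04-29 13:17:48,682::" part of a vdsm log entry.
TIMESTAMP = re.compile(r"::(\d{4}-\d{2}-\d{2})\s+(\d{2}:\d{2}:\d{2},\d{3})::")


def log_time(entry):
    """Extract the timestamp of one vdsm log entry as a datetime."""
    match = TIMESTAMP.search(entry)
    if match is None:
        raise ValueError("no vdsm timestamp in entry: %r" % entry)
    return datetime.strptime(" ".join(match.groups()), "%Y-%m-%d %H:%M:%S,%f")


def elapsed_seconds(start_entry, end_entry):
    """Seconds between two log entries (negative if they are out of order)."""
    return (log_time(end_entry) - log_time(start_entry)).total_seconds()


if __name__ == "__main__":
    # Shortened copies of the vgck start and SUCCESS entries.
    started = ("Thread-96::DEBUG::2016-04-29 13:17:48,682::lvm::290::"
               "Storage.Misc.excCmd::(cmd) /usr/bin/taskset")
    finished = ("Thread-96::DEBUG::2016-04-29 13:18:43,173::lvm::290::"
                "Storage.Misc.excCmd::(cmd) SUCCESS:")
    print("vgck took %.3f seconds" % elapsed_seconds(started, finished))
```

Pairing start and end entries by thread name (Thread-96 here) would be the natural next step when scanning a whole log, but that pairing logic is left out of this sketch.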

Re: [ovirt-users] VMs becoming non-responsive sporadically

2016-05-01 Thread Nir Soffer
On Sun, May 1, 2016 at 3:31 PM,   wrote:
> On 2016-04-30 23:22, Nir Soffer wrote:
>>
>> On Sun, May 1, 2016 at 12:48 AM,   wrote:
>>>
>>> On 2016-04-30 22:37, Nir Soffer wrote:


 On Sat, Apr 30, 2016 at 10:28 PM, Nir Soffer  wrote:
>
>
> On Sat, Apr 30, 2016 at 7:16 PM,   wrote:
>>
>>
>> On 2016-04-30 16:55, Nir Soffer wrote:
>>>
>>>
>>>
>>> On Sat, Apr 30, 2016 at 11:33 AM, Nicolás  wrote:



 Hi Nir,

 On 29/04/16 at 22:34, Nir Soffer wrote:
>
>
>
>
> On Fri, Apr 29, 2016 at 9:17 PM,   wrote:
>>
>>
>>
>>
>> Hi,
>>
>> We're running oVirt 3.6.5.3-1 and lately we're experiencing some
>> issues
>> with
>> some VMs being paused because they're marked as non-responsive.
>> Mostly,
>> after a few seconds they recover, but we want to debug precisely
>> this
>> problem so we can fix it consistently.
>>
>> Our scenario is the following:
>>
>> ~495 VMs, of which ~120 are constantly up
>> 3 datastores, all of them iSCSI-based:
>>* ds1: 2T, currently has 276 disks
>>* ds2: 2T, currently has 179 disks
>>* ds3: 500G, currently has 65 disks
>> 7 hosts: All have mostly the same hardware. CPU and memory are
>> currently
>> very lowly used (< 10%).
>>
>>ds1 and ds2 are physically the same backend which exports two
>> 2TB
>> volumes.
>> ds3 is a different storage backend where we're currently migrating
>> some
>> disks from ds1 and ds2.
>
>
>
>
> What is the storage backend behind ds1 and 2?





 The storage backend for ds1 and ds2 is the iSCSI-based HP LeftHand
 P4000
 G2.

>> Usually, when VMs become unresponsive, the whole host where they
>> run
>> gets
>> unresponsive too, so that gives a hint about the problem, my bet
>> is
>> the
>> culprit is somewhere on the host side and not on the VMs side.
>
>
>
>
> Probably the vm became unresponsive because connection to the host
> was
> lost.





 I forgot to mention that less commonly we have situations where the
 host
 doesn't get unresponsive but the VMs on it do and they don't become
 responsive ever again, so we have to forcibly power them off and
 start
 them
 on a different host. But in this case the connection with the host
 doesn't
 ever get lost (so basically the host is Up, but any VM run on them
 is
 unresponsive).


>> When that
>> happens, the host itself gets non-responsive and only recoverable
>> after
>> reboot, since it's unable to reconnect.
>
>
>
>
> Piotr, can you check engine log and explain why host is not
> reconnected?
>
>> I must say this is not specific to
>> this oVirt version, when we were using v.3.6.4 the same happened,
>> and
>> it's
>> also worthy mentioning we've not done any configuration changes
>> and
>> everything had been working quite well for a long time.
>>
>> We were monitoring our ds1 and ds2 physical backend to see
>> performance
>> and
>> we suspect we've run out of IOPS since we're reaching the maximum
>> specified
>> by the manufacturer, probably at certain times the host cannot
>> perform
>> a
>> storage operation within some time limit and it marks VMs as
>> unresponsive.
>> That's why we've set up ds3 and we're migrating ds1 and ds2 to
>> ds3.
>> When
>> we
>> run out of space on ds3 we'll create more smaller volumes to keep
>> migrating.
>>
>> On the host side, when this happens, we've run repoplot on the
>> vdsm
>> log
>> and
>> I'm attaching the result. Clearly there's a *huge* LVM response
>> time
>> (~30
>> secs.).
>
>
>
>
> Indeed the log shows very slow vgck and vgs commands - these are
> called
> every
> 5 minutes for checking the vg health and refreshing vdsm lvm cache.
>
> 1. starting vgck
>
> Thread-96::DEBUG::2016-04-29
> 13:17:48,682::lvm::290::Storage.Misc.excCmd::(cmd) 

Re: [ovirt-users] [hosted-engine] engine VM doesn't respawn when its host was killed (poweroff)

2016-05-01 Thread Yedidyah Bar David
It's very hard to understand your flow when time moves backwards.

Please try again from a clean state. Make sure all hosts have same clock.
Then document the exact time you do stuff - starting/stopping a host,
checking status, etc.
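
One way to spot where timestamps run backwards in a single collected log is sketched below in Python. It assumes each entry carries a `::YYYY-MM-DD HH:MM:SS,mmm::` field, as the agent log excerpts in this message do; the function name backwards_jumps is illustrative, and the sample entries are shortened.

```python
import re
from datetime import datetime

# Matches the "::2016-04-25 15:32:41,370::" part of an agent log entry.
TIMESTAMP = re.compile(r"::(\d{4}-\d{2}-\d{2})\s+(\d{2}:\d{2}:\d{2},\d{3})::")


def backwards_jumps(lines):
    """Return (line_number, previous, current) for every entry whose
    timestamp is older than the timestamped entry before it."""
    jumps = []
    previous = None
    for number, line in enumerate(lines, start=1):
        match = TIMESTAMP.search(line)
        if match is None:
            continue  # continuation lines carry no timestamp
        current = datetime.strptime(
            " ".join(match.groups()), "%Y-%m-%d %H:%M:%S,%f"
        )
        if previous is not None and current < previous:
            jumps.append((number, previous, current))
        previous = current
    return jumps


if __name__ == "__main__":
    sample = [
        "MainThread::INFO::2016-04-25 15:32:41,370::states::488::(consume)",
        "Engine down and local host has best score (3400)",
        "MainThread::INFO::2016-04-25 15:29:53,218::states::488::(consume)",
    ]
    for number, before, now in backwards_jumps(sample):
        print("line %d: %s comes after %s" % (number, now, before))
```

This only checks ordering within one file; comparing clocks across hosts still needs the hosts themselves to be synchronized.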

Some things to check from your logs:

in agent.host01.log:

MainThread::INFO::2016-04-25
15:32:41,370::states::488::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
Engine down and local host has best score (3400), attempting to start
engine VM
...
MainThread::INFO::2016-04-25
15:32:44,276::hosted_engine::1147::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm)
Engine VM started on localhost
...
MainThread::INFO::2016-04-25
15:32:58,478::states::672::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score)
Score is 0 due to unexpected vm shutdown at Mon Apr 25 15:32:58 2016

Why?

Also, in agent.host03.log:

MainThread::INFO::2016-04-25
15:29:53,218::states::488::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
Engine down and local host has best score (3400), attempting to start
engine VM
MainThread::INFO::2016-04-25
15:29:53,223::brokerlink::111::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Trying: notify time=1461572993.22 type=state_transition
detail=EngineDown-EngineStart hostname='host03.ovirt.forest.go.th'
MainThread::ERROR::2016-04-25
15:30:23,253::brokerlink::279::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(_communicate)
Connection closed: Connection timed out

Why?

Also, in addition to the actions you stated, you changed maintenance mode a lot.

You can try something like this to get some interesting lines from agent.log:

egrep -i 'start eng|shut|vm started|vm running|vm is running on|maintenance detected|migra'
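
The same filter can be sketched in Python for cases where the log lines are already loaded in a script. The alternation is copied from the egrep pattern above; the function name interesting_lines and the sample entries (shortened from the excerpts in this message) are illustrative.

```python
import re

# Same alternatives as the egrep pattern, matched case-insensitively.
INTERESTING = re.compile(
    r"start eng|shut|vm started|vm running|vm is running on|"
    r"maintenance detected|migra",
    re.IGNORECASE,
)


def interesting_lines(lines):
    """Keep only the lines that match any of the alternatives."""
    return [line for line in lines if INTERESTING.search(line)]


if __name__ == "__main__":
    sample = [
        "Engine down and local host has best score (3400), attempting to start engine VM",
        "Engine VM started on localhost",
        "Score is 0 due to unexpected vm shutdown at Mon Apr 25 15:32:58 2016",
        "Connection closed: Connection timed out",
    ]
    for line in interesting_lines(sample):
        print(line)
```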

Best,

On Mon, Apr 25, 2016 at 12:27 PM, Wee Sritippho  wrote:
> The hosted engine storage is located in an external Fibre Channel SAN.
>
>
> On 25/4/2559 16:19, Martin Sivak wrote:
>>
>> Hi,
>>
>> it seems that all nodes lost access to storage for some reason after
>> the host was killed. Where is your hosted engine storage located?
>>
>> Regards
>>
>> --
>> Martin Sivak
>> SLA / oVirt
>>
>>
>> On Mon, Apr 25, 2016 at 10:58 AM, Wee Sritippho 
>> wrote:
>>>
>>> Hi,
>>>
>>>  From the hosted-engine FAQ, the engine VM should be up and running
>>> about 5 minutes after its host was forced to power off. However, after
>>> updating oVirt 3.6.4 to 3.6.5, the engine VM won't restart automatically
>>> even after 10+ minutes (I already made sure that global maintenance mode
>>> is set to none). I initially thought it was a time sync issue, so I
>>> installed and enabled ntp on the hosts and engine. However, the issue
>>> still persists.
>>>
>>> ###Versions:
>>> [root@host01 ~]# rpm -qa | grep ovirt
>>> libgovirt-0.3.3-1.el7_2.1.x86_64
>>> ovirt-vmconsole-1.0.0-1.el7.centos.noarch
>>> ovirt-vmconsole-host-1.0.0-1.el7.centos.noarch
>>> ovirt-hosted-engine-ha-1.3.5.3-1.el7.centos.noarch
>>> ovirt-host-deploy-1.4.1-1.el7.centos.noarch
>>> ovirt-engine-sdk-python-3.6.5.0-1.el7.centos.noarch
>>> ovirt-hosted-engine-setup-1.3.5.0-1.el7.centos.noarch
>>> ovirt-release36-007-1.noarch
>>> ovirt-setup-lib-1.0.1-1.el7.centos.noarch
>>> [root@host01 ~]# rpm -qa | grep vdsm
>>> vdsm-infra-4.17.26-0.el7.centos.noarch
>>> vdsm-jsonrpc-4.17.26-0.el7.centos.noarch
>>> vdsm-gluster-4.17.26-0.el7.centos.noarch
>>> vdsm-python-4.17.26-0.el7.centos.noarch
>>> vdsm-yajsonrpc-4.17.26-0.el7.centos.noarch
>>> vdsm-4.17.26-0.el7.centos.noarch
>>> vdsm-cli-4.17.26-0.el7.centos.noarch
>>> vdsm-xmlrpc-4.17.26-0.el7.centos.noarch
>>> vdsm-hook-vmfex-dev-4.17.26-0.el7.centos.noarch
>>>
>>> ###Log files:
>>> https://app.box.com/s/fkurmwagogwkv5smkwwq7i4ztmwf9q9r
>>>
>>> ###After host02 was killed:
>>> [root@host03 wees]# hosted-engine --vm-status
>>>
>>>
>>> --== Host 1 status ==--
>>>
>>> Status up-to-date  : True
>>> Hostname   : host01.ovirt.forest.go.th
>>> Host ID: 1
>>> Engine status  : {"reason": "vm not running on this
>>> host", "health": "bad", "vm": "down", "detail": "unknown"}
>>> Score  : 3400
>>> stopped: False
>>> Local maintenance  : False
>>> crc32  : 396766e0
>>> Host timestamp : 4391
>>>
>>>
>>> --== Host 2 status ==--
>>>
>>> Status up-to-date  : True
>>> Hostname   : host02.ovirt.forest.go.th
>>> Host ID: 2
>>> Engine status  : {"health": "good", "vm": "up",
>>> "detail": "up"}
>>> Score  : 0
>>> stopped: True
>>> Local maintenance  : False
>>> crc32  : 3a345b65
>>> Host timestamp : 1458
>>>
>>>
>>> --== Host 3 status ==--
>>>
>>> 

Re: [ovirt-users] VMs becoming non-responsive sporadically

2016-05-01 Thread nicolas

El 2016-04-30 23:22, Nir Soffer escribió:

On Sun, May 1, 2016 at 12:48 AM,   wrote:

El 2016-04-30 22:37, Nir Soffer escribió:


On Sat, Apr 30, 2016 at 10:28 PM, Nir Soffer  
wrote:


On Sat, Apr 30, 2016 at 7:16 PM,   wrote:


El 2016-04-30 16:55, Nir Soffer escribió:



On Sat, Apr 30, 2016 at 11:33 AM, Nicolás  
wrote:



Hi Nir,

El 29/04/16 a las 22:34, Nir Soffer escribió:




On Fri, Apr 29, 2016 at 9:17 PM,   wrote:




Hi,

We're running oVirt 3.6.5.3-1 and lately we're experiencing some issues
with some VMs being paused because they're marked as non-responsive.
Mostly, after a few seconds they recover, but we want to debug precisely
this problem so we can fix it consistently.

Our scenario is the following:

~495 VMs, of which ~120 are constantly up
3 datastores, all of them iSCSI-based:
   * ds1: 2T, currently has 276 disks
   * ds2: 2T, currently has 179 disks
   * ds3: 500G, currently has 65 disks
7 hosts: All have mostly the same hardware. CPU and memory are
currently very lowly used (< 10%).

ds1 and ds2 are physically the same backend which exports two 2TB
volumes. ds3 is a different storage backend where we're currently
migrating some disks from ds1 and ds2.




What is the storage backend behind ds1 and ds2?





The storage backend for ds1 and ds2 is the iSCSI-based HP LeftHand P4000
G2.

Usually, when VMs become unresponsive, the whole host where they run
gets unresponsive too, so that gives a hint about the problem, my bet is
the culprit is somewhere on the host side and not on the VMs side.




Probably the vm became unresponsive because connection to the host was
lost.





I forgot to mention that less commonly we have situations where the host
doesn't get unresponsive but the VMs on it do and they don't become
responsive ever again, so we have to forcibly power them off and start
them on a different host. But in this case the connection with the host
doesn't ever get lost (so basically the host is Up, but any VM run on
them is unresponsive).



When that happens, the host itself gets non-responsive and only
recoverable after reboot, since it's unable to reconnect.




Piotr, can you check engine log and explain why host is not
reconnected?


I must say this is not specific to this oVirt version, when we were
using v.3.6.4 the same happened, and it's also worthy mentioning we've
not done any configuration changes and everything had been working quite
well for a long time.

We were monitoring our ds1 and ds2 physical backend to see performance
and we suspect we've run out of IOPS since we're reaching the maximum
specified by the manufacturer, probably at certain times the host cannot
perform a storage operation within some time limit and it marks VMs as
unresponsive. That's why we've set up ds3 and we're migrating ds1 and
ds2 to ds3. When we run out of space on ds3 we'll create more smaller
volumes to keep migrating.

On the host side, when this happens, we've run repoplot on the vdsm log
and I'm attaching the result. Clearly there's a *huge* LVM response time
(~30 secs.).




Indeed the log show very slow vgck and vgs commands - these are called
every 5 minutes for checking the vg health and refreshing vdsm lvm
cache.


1. starting vgck

Thread-96::DEBUG::2016-04-29
13:17:48,682::lvm::290::Storage.Misc.excCmd::(cmd) /usr/bin/taskset
--cpu-list 0-23 /usr/bin/sudo -n /usr/sbin/lvm vgck --config ' devices
{ preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1
write_cache_state=0 disable_after_error_count=3 filter = [
'\''a|/dev/mapper/36000eb3a4f1acbc20043|'\'', '\''r|.*|'\'' ] }
global {  locking_type=1 prioritise_write_locks=1  wait_for_locks=1
use_lvmetad=0 }  backup { retain_min = 50  retain_days = 0 } '
5de4a000-a9c4-489c-8eee-10368647c413 (cwd None)

2. vgck ends after 55 seconds

Thread-96::DEBUG::2016-04-29
13:18:43,173::lvm::290::Storage.Misc.excCmd::(cmd) SUCCESS: <err> = '
WARNING: lvmetad is running but disabled. Restart lvmetad before
enabling it!\n'; <rc> = 0

3. starting vgs

Thread-96::DEBUG::2016-04-29
13:17:11,963::lvm::290::Storage.Misc.excCmd::(cmd) /usr/bin/taskset
--cpu-list 0-23 /usr/bin/sudo -n /usr/sbin/lvm vgs --config ' devices
{ preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1
write_cache_state=0 disable_after_error_count=3 filter = [
'\''a|/dev/mapper/36000eb3a4f1acbc20043|/dev/mapper/36000eb3a4f1acbc200b9|/dev/mapper/360014056f0dc8930d744f83af8ddc709|/dev/mapper/WDC_WD5003ABYZ-011FA0_WD-WMAYP0J73DU6|'\'',
'\''r|.*|'\'' ] }  global {
  locking_type=1  prioritise_write_locks=1  wait_for_locks=1
use_lvmetad=0 }  backup {  retain_min = 50  retain_days = 0 } '
--noheadings --units b --nosuffix --separator '|
' --ignoreskippedcluster -o
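
The "vgck ends after 55 seconds" figure can be checked directly from the two vgck timestamps quoted above. The following is only a minimal sketch: the helper names (parse_vdsm_timestamp, elapsed_seconds) are made up for illustration, and the timestamp layout is assumed from the quoted lines rather than taken from any vdsm documentation.

```python
from datetime import datetime

# Timestamps copied from the two vgck log lines quoted above.
VGCK_START = "2016-04-29 13:17:48,682"
VGCK_END = "2016-04-29 13:18:43,173"


def parse_vdsm_timestamp(text):
    """Parse 'YYYY-MM-DD HH:MM:SS,mmm' (layout assumed from the quoted log)."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S,%f")


def elapsed_seconds(start, end):
    """Return the number of seconds between two log timestamps."""
    delta = parse_vdsm_timestamp(end) - parse_vdsm_timestamp(start)
    return delta.total_seconds()


if __name__ == "__main__":
    print("vgck took %.3f s" % elapsed_seconds(VGCK_START, VGCK_END))
```

The same two helpers could be pointed at any pair of start/SUCCESS lines for the same thread to see how long each LVM command took.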




Re: [ovirt-users] VMs becoming non-responsive sporadically

2016-05-01 Thread Nir Soffer
On Sun, May 1, 2016 at 1:35 AM,   wrote:
> On 2016-04-30 23:22, Nir Soffer wrote:
>>
>> On Sun, May 1, 2016 at 12:48 AM,   wrote:
>>>
>>> On 2016-04-30 22:37, Nir Soffer wrote:


 On Sat, Apr 30, 2016 at 10:28 PM, Nir Soffer  wrote:
>
>
> On Sat, Apr 30, 2016 at 7:16 PM,   wrote:
>>
>>
>> On 2016-04-30 16:55, Nir Soffer wrote:
>>>
>>>
>>>
>>> On Sat, Apr 30, 2016 at 11:33 AM, Nicolás  wrote:



 Hi Nir,

 On 29/04/16 at 22:34, Nir Soffer wrote:
>
>
>
>
> On Fri, Apr 29, 2016 at 9:17 PM,   wrote:
>>
>>
>>
>>
>> Hi,
>>
>> We're running oVirt 3.6.5.3-1 and lately we're experiencing some
>> issues
>> with
>> some VMs being paused because they're marked as non-responsive.
>> Mostly,
>> after a few seconds they recover, but we want to debug precisely
>> this
>> problem so we can fix it consistently.
>>
>> Our scenario is the following:
>>
>> ~495 VMs, of which ~120 are constantly up
>> 3 datastores, all of them iSCSI-based:
>>* ds1: 2T, currently has 276 disks
>>* ds2: 2T, currently has 179 disks
>>* ds3: 500G, currently has 65 disks
>> 7 hosts: All have mostly the same hardware. CPU and memory are
>> currently
>> very lowly used (< 10%).
>>
>>ds1 and ds2 are physically the same backend which exports two
>> 2TB
>> volumes.
>> ds3 is a different storage backend where we're currently migrating
>> some
>> disks from ds1 and ds2.
>
>
>
>
> What is the storage backend behind ds1 and ds2?





 The storage backend for ds1 and ds2 is the iSCSI-based HP LeftHand
 P4000
 G2.

>> Usually, when VMs become unresponsive, the whole host where they
>> run
>> gets
>> unresponsive too, so that gives a hint about the problem, my bet
>> is
>> the
>> culprit is somewhere on the host side and not on the VMs side.
>
>
>
>
> Probably the vm became unresponsive because connection to the host
> was
> lost.





 I forgot to mention that less commonly we have situations where the
 host
 doesn't get unresponsive but the VMs on it do and they don't become
 responsive ever again, so we have to forcibly power them off and
 start
 them
 on a different host. But in this case the connection with the host
 doesn't
 ever get lost (so basically the host is Up, but any VM run on them
 is
 unresponsive).


>> When that
>> happens, the host itself gets non-responsive and only recoverable
>> after
>> reboot, since it's unable to reconnect.
>
>
>
>
> Piotr, can you check engine log and explain why host is not
> reconnected?
>
>> I must say this is not specific to
>> this oVirt version, when we were using v.3.6.4 the same happened,
>> and
>> it's
>> also worthy mentioning we've not done any configuration changes
>> and
>> everything had been working quite well for a long time.
>>
>> We were monitoring our ds1 and ds2 physical backend to see
>> performance
>> and
>> we suspect we've run out of IOPS since we're reaching the maximum
>> specified
>> by the manufacturer, probably at certain times the host cannot
>> perform
>> a
>> storage operation within some time limit and it marks VMs as
>> unresponsive.
>> That's why we've set up ds3 and we're migrating ds1 and ds2 to
>> ds3.
>> When
>> we
>> run out of space on ds3 we'll create more smaller volumes to keep
>> migrating.
>>
>> On the host side, when this happens, we've run repoplot on the
>> vdsm
>> log
>> and
>> I'm attaching the result. Clearly there's a *huge* LVM response
>> time
>> (~30
>> secs.).
>
>
>
>
> Indeed the log show very slow vgck and vgs commands - these are
> called
> every
> 5 minutes for checking the vg health and refreshing vdsm lvm cache.
>
> 1. starting vgck
>
> Thread-96::DEBUG::2016-04-29
> 13:17:48,682::lvm::290::Storage.Misc.excCmd::(cmd) 

[ovirt-users] iSCSI Data Domain Down

2016-05-01 Thread Clint Boggio
Greetings oVirt Family;

Due to a catastrophic power failure, my datacenter lost power. I am using a 
CentOS 7 server to provide iSCSI services to my oVirt platform.

When the power came back on and the iSCSI server booted back up, the filters 
in lvm.conf were faulty and LVM assumed control over the logical volumes that 
oVirt uses as the disks for the VMs. This tanked target.service, which reports 
"device already in use", and my datacenter is down.

I've tried several filter combinations in lvm.conf to no avail, and in my 
search I've found no documentation on how to make LVM "forget" about the 
volumes that it had assumed and release them.

Do any of you know of a procedure to make LVM forget about and release the 
volumes on /dev/sda?

oVirt 3.6.5 on CentOS 7
4 hypervisor nodes, CentOS 7
1 dedicated engine, CentOS 7
1 iSCSI SAN, CentOS 7, exporting a 10TB block device from a Dell PERC RAID 
controller (/dev/sda) with targetcli.
1 NFS server for ISO and Export Domains, 5TB

I'm out of ideas and any help would be greatly appreciated. 

I'm currently using dd to recover the VM disk drives over to the NFS server in 
case this cannot be recovered.


Re: [ovirt-users] Shared storage between DC

2016-05-01 Thread NUNIN Roberto
Hi,
I have a scenario in production similar to what you've described.
The "enabling factor" is a set of "storage virtualization" appliances that 
maintain mirrored logical volumes over FC physical volumes across two distinct 
datacenters, while giving simultaneous read-write access to the cluster 
hypervisors, split between the datacenters, that run the VMs.

So the cluster is also spread across the DCs, and there is no need to import 
anything.
Regards,

Roberto


On 1 May 2016, at 10:37, Arsène Gschwind wrote:

Hi,

Is it possible to have a shared Storage domain between 2 Datacenter in oVirt?
We do replicate a FC Volume between 2 datacenter using FC SAN storage 
technology and we have an oVirt cluster on each site defined in separate DCs. 
The idea behind this is to setup a DR site and also balance the load between 
each site.
What happens if I do import a storage domain already active in one DC, will it 
break the Storage domain?

Thanks for any information..
Regards,
Arsène



This message is for the designated recipient only and may contain privileged, 
proprietary, or otherwise private information. If you have received it in 
error, please notify the sender immediately, deleting the original and all 
copies and destroying any hard copies. Any other use is strictly prohibited and 
may be unlawful.


Re: [ovirt-users] Hosted Engine Almost setup

2016-05-01 Thread Yedidyah Bar David
On Sun, Apr 24, 2016 at 3:11 AM, Pat Riehecky  wrote:
> I realize now I shouldn't have set the default cluster name,

You have to manually create such a cluster in the engine after
'engine-setup' is finished and before telling '--deploy' to continue.

> is there a way
> I can resume the install of the hosted engine?
>
> I've got the engine up and running, so I just need to jump in from after the
> engine install.
>
> Ideas?

In principle, you can try creating a cluster now and then add the host
in the engine, but if this is a new setup (and indeed destined for
"Production"), I'd start from scratch.

>
>   Checking for oVirt-Engine status at ...
> [ INFO  ] Engine replied: DB Up!Welcome to Health Status!
> [ INFO  ] Acquiring internal CA cert from the engine
> [ INFO  ] The following CA certificate is going to be used, please
> immediately interrupt if not correct:
> [ INFO  ] Issuer: C=US,xx
> [ INFO  ] Connecting to the Engine
> [ ERROR ] Failed to execute stage 'Closing up': Specified cluster does not
> exist: Production
> [ INFO  ] Stage: Clean up
> [ INFO  ] Generating answer file
> '/var/lib/ovirt-hosted-engine-setup/answers/answers-20160424000549.conf'
> [ INFO  ] Stage: Pre-termination
> [ INFO  ] Stage: Termination
> [ ERROR ] Hosted Engine deployment failed: this system is not reliable,
> please check the issue, fix and redeploy
>   Log file is located at
> /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-20160423233732-5tl68l.log
>
> --
> Pat Riehecky
> Scientific Linux developer
>
> Fermi National Accelerator Laboratory
> www.fnal.gov
> www.scientificlinux.org
>



-- 
Didi


Re: [ovirt-users] 3.6 upgrade issue

2016-05-01 Thread Shirly Radco
Hi,

In order to figure out the problem, please open a bug detailing the error
you get and the exact version you are trying to upgrade to, and attach the
ovirt-engine-dwh.log and the engine-setup log.

Thank you,

Shirly Radco
BI Software Engineer
Red Hat Israel Ltd.
34 Jerusalem Road
Building A, 4th floor
Ra'anana, Israel 4350109


On Tue, Apr 26, 2016 at 9:28 PM, gregor  wrote:

> Hi,
>
> today I tried to upgrade from 3.6.4 to 3.6.5, as usual with
> engine-setup. This time it gives me a schema update error, and after
> this the database is corrupt and gives me the same error you get.
>
> Did you create a bug entry?
> I'm tired of creating bugs for oVirt. Every time I touch oVirt I hit 1-5
> bugs, it's so time consuming :-(
>
> cheers
> gregor
>
> On 22/10/15 14:24, Yaniv Dary wrote:
> > History database has an issue. Can you add logs? Maybe open a bug?
> >
> > Yaniv Dary Technical Product Manager Red Hat Israel Ltd. 34 Jerusalem
> > Road Building A, 4th floor Ra'anana, Israel 4350109 Tel : +972 (9)
> > 7692306 8272306 Email: yd...@redhat.com  IRC :
> > ydary
> >
> >
> > On Tue, Sep 29, 2015 at 4:45 PM, Jon Archer wrote:
> >
> > Hi all,
> >
> > Wonder if anyone can shed any light on an error I'm seeing while
> > running engine-setup.
> >
> > I've just upgraded the packages to the latest 3.6 ones today (from
> > 3.5), ran engine-setup, answered the questions, confirmed the install,
> > and then got presented with:
> > [ INFO  ] Cleaning async tasks and compensations
> > [ INFO  ] Unlocking existing entities
> > [ INFO  ] Checking the Engine database consistency
> > [ INFO  ] Stage: Transaction setup
> > [ INFO  ] Stopping engine service
> > [ INFO  ] Stopping ovirt-fence-kdump-listener service
> > [ INFO  ] Stopping websocket-proxy service
> > [ INFO  ] Stage: Misc configuration
> > [ INFO  ] Stage: Package installation
> > [ INFO  ] Stage: Misc configuration
> > [ ERROR ] Failed to execute stage 'Misc configuration': function
> > getdwhhistorytimekeepingbyvarname(unknown) does not exist LINE 2:
> >  select * from GetDwhHistoryTimekeepingByVarName(
> >^ HINT:  No function matches the given name
> > and argument types. You might need to add explicit type casts.
> > [ INFO  ] Yum Performing yum transaction rollback
> > [ INFO  ] Stage: Clean up
> >   Log file is located at
> > /var/log/ovirt-engine/setup/ovirt-engine-setup-20150929144137-7u5rhg.log
> > [ INFO  ] Generating answer file
> > '/var/lib/ovirt-engine/setup/answers/20150929144215-setup.conf'
> > [ INFO  ] Stage: Pre-termination
> > [ INFO  ] Stage: Termination
> > [ ERROR ] Execution of setup failed
> >
> >
> > Any ideas, where to look to fix things?
> >
> > Thanks
> >
> > Jon
> >
> >
> >
> >
> >
>


Re: [ovirt-users] hosted engine is down

2016-05-01 Thread Yedidyah Bar David
On Fri, Apr 22, 2016 at 10:31 AM, Budur Nagaraju  wrote:
> HI
>
> I have configured hosted engine with two hosts; one of the hosted engine is
> down and I am unable to make it active.
>
> Is there any way to fix the issue? I have restarted ha-agent and ha-broker
> but no luck.

If still not solved, please check/post relevant logs. Thanks.

>
> Thanks,
> Nagaraju
>
>
>



-- 
Didi


[ovirt-users] Shared storage between DC

2016-05-01 Thread Arsène Gschwind

Hi,

Is it possible to have a shared storage domain between 2 datacenters in 
oVirt?
We replicate an FC volume between 2 datacenters using FC SAN storage 
technology and we have an oVirt cluster on each site defined in separate 
DCs. The idea behind this is to set up a DR site and also balance the 
load between the sites.
What happens if I import a storage domain that is already active in one DC: 
will it break the storage domain?


Thanks for any information.
Regards,
Arsène


Re: [ovirt-users] Error deleting template

2016-05-01 Thread Idan Shaby
Hi,

Can you please attach the engine and vdsm logs?
Also, can you describe your setup and the steps that reproduced this error?


Thanks,
Idan

On Wed, Apr 27, 2016 at 1:47 PM, Giulio Casella  wrote:

> Hi all,
> I have a problem deleting a template from admin portal.
> In file /var/log/vdsm/vdsm.log (on SPM hypervisor) I got:
>
> jsonrpc.Executor/4::ERROR::2016-04-27
> 10:19:57,122::hsm::1518::Storage.HSM::(deleteImage) Empty or not found
> image  in SD  [...]
>
> Looking in the (data) storage domain, the disk with that UUID doesn't
> exist.
> It seems I reached an inconsistent state between engine database and
> images on disk.
>
> Is there a (safe) way to rebuild a consistent situation? Maybe deleting
> entries from database?
>
> My setup:
> manager RHEV 3.5.8-0.1.el6ev
> hypervisors: RHEV Hypervisor - 7.2 - 20160328.0.el7ev
>
>
> Thanx in advance,
> Giulio
>
>


Re: [ovirt-users] Import and Exporting domains (to export/import across datacenters) no longer working

2016-05-01 Thread Idan Shaby
Hi,

Can you please attach the engine and vdsm logs?


Thanks,
Idan

On Tue, Apr 26, 2016 at 1:13 AM,  wrote:

> In the past we had a 3.4 and a 3.5 ovirt.  I was able to attach an Export
> domain to one, export Templates and/or VMs... put in into Maint mode,
> Detach it from the datacenter, go to the other ovirt manager, import the
> Domain (sometimes forcing an attach, sometimes it would automatically
> attach) and import my VM and/or template.  Afterwards, placing it back
> into Maint mode and Detaching.
>
> Often times there would be a residual icon left under Storage for the
> detached item... and the process was fairly repeatable.
>
> Now, it's getting repeatable in a bad way.  Now it won't import, I have to
> manually do two table record deletes and a REST call to remove the
> connected storage and then and only then can I successfully import.
>
> psql db mods look like (from the ovirt manage box):
> First to find erroneous data:
> select * from storage_domain_static where storage_domain_type = 3;
> Then we locate the offending uuid (see full one below on the REST call)
> and delete the record out of storage_domain_dynamic and static like so:
> engine=# delete from storage_domain_dynamic where id =
> '9f00c1d9-3f2a-41b9-80c3-344900622b07';
> DELETE 1
> engine=# select
> Deletestorage_domain_static('9f00c1d9-3f2a-41b9-80c3-344900622b07');
>
> Folks here talked about the latter command, but it would fail if the
> reference in the storage_domain_dynamic table wasn't removed first.
>
>
> Rest call looks something like this:
> curl -v -u "admin@internal:ourpassword" -X DELETE
>
> https://ovirt.example.com/ovirt-engine/api/storageconnections/9f00c1d9-3f2a-41b9-80c3-344900622b07
>
> Any ideas about what is now apparently broken on our site?
>
> Is there any other way to do export/import across datacenters?
>
>
>
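
For reference, the REST call quoted above (a DELETE on the storageconnections collection, with basic authentication) might be expressed in Python roughly as follows. This is a sketch under stated assumptions, not a recommended procedure: the helper name build_storage_connection_delete is hypothetical, the URL, user and password are the placeholder values from the quoted curl command, and the request is only constructed here, never sent.

```python
import base64
import urllib.request


def build_storage_connection_delete(api_base, connection_id, user, password):
    """Build (but do not send) a DELETE request like the quoted curl call."""
    url = "%s/storageconnections/%s" % (api_base.rstrip("/"), connection_id)
    request = urllib.request.Request(url, method="DELETE")
    # Basic authentication, equivalent to curl's -u "user:password".
    token = base64.b64encode(("%s:%s" % (user, password)).encode()).decode()
    request.add_header("Authorization", "Basic " + token)
    return request


if __name__ == "__main__":
    # Placeholder values taken from the quoted message.
    req = build_storage_connection_delete(
        "https://ovirt.example.com/ovirt-engine/api",
        "9f00c1d9-3f2a-41b9-80c3-344900622b07",
        "admin@internal",
        "ourpassword",
    )
    print(req.get_method(), req.full_url)
```

Actually sending it would mean passing the request to urllib.request.urlopen, which is left out on purpose since the call removes data on the engine side.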


Re: [ovirt-users] Hosted-engine-Issue

2016-05-01 Thread Yedidyah Bar David
On Fri, Apr 22, 2016 at 1:29 PM, Budur Nagaraju  wrote:
> I have created vms  in the vmware  ESXi.

Please verify first that your infrastructure works well.

Create a new VM in your ESXi, install same OS as your host (centos7?),
create a KVM VM there and install on it same OS as your engine VM (centos6?),
make sure you can ping and ssh in both directions, ping the gateway
from both, etc.
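
Before digging further, it may be worth confirming that the addressing itself is consistent. The sketch below uses the engine VM address and gateway from the 'ip addr' and 'ip route' output quoted further down in this thread; the function name gateway_is_on_link is made up for illustration. It only checks that the gateway lies inside the interface's subnet, which says nothing about whether packets actually get through.

```python
import ipaddress

# Values copied from the 'ip addr' / 'ip route' output quoted later in this thread.
ENGINE_INTERFACE = "10.206.66.9/23"
GATEWAY = "10.206.67.254"


def gateway_is_on_link(interface_cidr, gateway):
    """Return True if the gateway address lies inside the interface's subnet."""
    network = ipaddress.ip_interface(interface_cidr).network
    return ipaddress.ip_address(gateway) in network


if __name__ == "__main__":
    network = ipaddress.ip_interface(ENGINE_INTERFACE).network
    print("network:", network, "broadcast:", network.broadcast_address)
    print("gateway on link:", gateway_is_on_link(ENGINE_INTERFACE, GATEWAY))
```

If the check passes, as it should for these values, the problem is more likely in the layers below (bridging, promiscuous mode, MAC filtering) than in the IP configuration.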

If your host were itself an oVirt host, you would have to enable MAC
spoofing to do what you are trying to do. No idea what's required for ESXi.

Best,

>
> Thanks,
> Nagaraju
>
>
> On Fri, Apr 22, 2016 at 3:38 PM, Simone Tiraboschi 
> wrote:
>>
>> On Fri, Apr 22, 2016 at 10:27 AM, Budur Nagaraju 
>> wrote:
>> > HI
>> >
>> > I thought Its promiscuous mode issue ,after reboot of engine unable to
>> > reach
>> > again ? is there anyways to resolve ?
>>
>> Are you using a nested env?
>>
>> > Thanks,
>> > Nagaraju
>> >
>> >
>> > On Fri, Apr 22, 2016 at 11:45 AM, Budur Nagaraju 
>> > wrote:
>> >>
>> >> I found the solution ,promiscuous mode was rejecting the packets this
>> >> was
>> >> causing the issue ,enabled and now able to ping without any issues.
>> >> Thanks for the support !
>> >>
>> >> On Fri, Apr 22, 2016 at 8:39 AM, Budur Nagaraju 
>> >> wrote:
>> >>>
>> >>> HI
>> >>>
>> >>> Any updates issue is blocking ?
>> >>>
>> >>> Thanks,
>> >>> Nagaraju
>> >>>
>> >>>
>> >>> On Thu, Apr 21, 2016 at 7:20 PM, Budur Nagaraju 
>> >>> wrote:
>> 
>>  Below are the  details, unable to ping the gateway from the ovirt
>>  engine
>>  ,able to ping the ovirt engine from host as both are there in the
>>  same
>>  network.
>> 
>> 
>>  [root@oe ~]# ip addr
>>  1: lo:  mtu 65536 qdisc noqueue state UNKNOWN
>>  link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>>  inet 127.0.0.1/8 scope host lo
>>  inet6 ::1/128 scope host
>> valid_lft forever preferred_lft forever
>>  2: eth0:  mtu 1500 qdisc pfifo_fast
>>  state UP qlen 1000
>>  link/ether 00:16:3e:33:38:e5 brd ff:ff:ff:ff:ff:ff
>>  inet 10.206.66.9/23 brd 10.206.67.255 scope global eth0
>>  inet6 fe80::216:3eff:fe33:38e5/64 scope link
>> valid_lft forever preferred_lft forever
>>  [root@oe ~]#
>>  [root@oe ~]#
>>  [root@oe ~]#
>>  [root@oe ~]# ip route
>>  10.206.66.0/23 dev eth0  proto kernel  scope link  src 10.206.66.9
>>  169.254.0.0/16 dev eth0  scope link  metric 1002
>>  default via 10.206.67.254 dev eth0
>>  [root@oe ~]#
>>  [root@oe ~]#
>>  [root@oe ~]#
>>  [root@oe ~]# brctl show
>>  bridge name bridge id   STP enabled interfaces
>>  [root@oe ~]#
>> 
>> 
>> 
>> 
>> 
>> 
>>  On Thu, Apr 21, 2016 at 4:51 PM, Yedidyah Bar David 
>>  wrote:
>> >
>> > On Thu, Apr 21, 2016 at 1:39 PM, Budur Nagaraju 
>> > wrote:
>> > > HI
>> > >
>> > > Installed hosted engine and  after configuring  IP to ovirt engine
>> > > unable to
>> > > ping the gateway and found that there is no issue with the Network
>> > > .
>> >
>> > You mean that from inside the engine machine, you cannot ping the
>> > gateway?
>> >
>> > Can you ping it from the host?
>> >
>> > Please share output of these commands, from both the host and the
>> > engine vm:
>> >
>> > ip addr
>> > ip route
>> > brctl show
>> >
>> > Thanks,
>> >
>> > >
>> > > Is there any thing am missing  while installing Hosted engine ?
>> > > below
>> > > is
>> > > output details,
>> > >
>> > >
>> > > --== CONFIGURATION PREVIEW ==--
>> > >
>> > >   Engine FQDN:
>> > > oe.bnglab.psecure.net
>> > >   Bridge name: ovirtmgmt
>> > >   SSH daemon port: 22
>> > >   Gateway address: 10.206.67.254
>> > >   Host name for web application  : hosted_engine_1
>> > >   Host ID: 1
>> > >   Image alias: hosted_engine
>> > >   Image size GB  : 25
>> > >   Storage connection :
>> > > 10.204.207.152:/home/heoe
>> > >   Console type   : vnc
>> > >   Memory size MB : 4096
>> > >   MAC address: 00:16:3e:2f:5c:40
>> > >   Boot type  : cdrom
>> > >   Number of CPUs : 2
>> > >   ISO image (for cdrom boot) :
>> > > /ovirt/CentOS-6.7-x86_64-minimal.iso
>> >