Re: [ovirt-users] Dedicated NICs for gluster network

2016-08-21 Thread Sahina Bose
On Fri, Aug 19, 2016 at 6:20 PM, Nicolas Ecarnot 
wrote:

> On 19/08/2016 at 13:43, Sahina Bose wrote:
>
>
> Or are you adding the 3 nodes to your existing cluster? If so, I suggest
>> you try adding them to a new cluster.
>>
>> OK, I tried and succeeded in creating a new cluster.
>> In this new cluster, I was ABLE to add the first new host, using its mgmt
>> DNS name.
>> This first host still has to have its NICs configured, and (in Chrome or
>> FF) opening the network settings window stalls the browser (I even tried
>> restarting the engine, to no avail). Thus, I cannot set up this first
>> node's NICs.
>>
>> Consequently, I cannot add any further hosts, because oVirt relies on the
>> first host to validate the following ones.
>>
>
>
> The network team should be able to help you here.
>
>
> OK, there was no way I could continue like this (browser crash), so I
> tried the following, which succeeded:
> - remove the newly created host and cluster
> - create a new DATACENTER
> - create a new cluster in this DC
> - add the first new host: OK
> - add the 2 other new hosts: OK
>
> Now, I can smoothly configure their NICs.
>
> Doing all this, I saw that oVirt detected that a gluster cluster and
> volume already existed, and integrated them into oVirt.
>
> Then, I was able to create a new storage domain in this new DC and
> cluster, using the *gluster* FQDN of one of the hosts. It went nicely.
>
> BUT, when viewing the volume tab and brick details, the displayed brick
> names are the hosts' mgmt DNS names, and NOT their GLUSTER DNS names.
>
> I'm worried about this, which is confirmed by what I read in the logs:
>
> 2016-08-19 14:46:30,484 WARN  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc] (DefaultQuartzScheduler_Worker-100) [107dc2e3] Could not associate brick 'serv-vm-al04-data.sdis.isere.fr:/gluster/data/brick04' of volume '35026521-e76e-4774-8ddf-0a701b9eb40c' with correct network as no gluster network found in cluster '1c8e75a0-af3f-4e97-a8fb-2f7ef3ed9f30'
> 2016-08-19 14:46:30,492 WARN  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc] (DefaultQuartzScheduler_Worker-100) [107dc2e3] Could not associate brick 'serv-vm-al05-data.sdis.isere.fr:/gluster/data/brick04' of volume '35026521-e76e-4774-8ddf-0a701b9eb40c' with correct network as no gluster network found in cluster '1c8e75a0-af3f-4e97-a8fb-2f7ef3ed9f30'
> 2016-08-19 14:46:30,500 WARN  [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListReturnForXmlRpc] (DefaultQuartzScheduler_Worker-100) [107dc2e3] Could not associate brick 'serv-vm-al06-data.sdis.isere.fr:/gluster/data/brick04' of volume '35026521-e76e-4774-8ddf-0a701b9eb40c' with correct network as no gluster network found in cluster '1c8e75a0-af3f-4e97-a8fb-2f7ef3ed9f30'
>
> [oVirt shell (connected)]# list clusters
>
> id : 0001-0001-0001-0001-0045
> name   : cluster51
> description: Cluster d'alerte de test
>
> id : 1c8e75a0-af3f-4e97-a8fb-2f7ef3ed9f30
> name   : cluster52
> description: Cluster d'alerte de test
>
> [oVirt shell (connected)]#
>
> "cluster52" is the recent cluster, and I do have a dedicated gluster
> network, marked as gluster network, in the correct DC and cluster.
> The only point is that :
> - Each host has its name ("serv-vm-al04") and a second name for gluster
> ("serv-vm-al04-data").
> - Using blahblahblah-data is correct on a gluster point of view
> - Maybe oVirt is disturb not to be able to ping the gluster FQDN (not
> routed) and then throwing this error?
>
>
We currently have a limitation: if you use multiple FQDNs, oVirt cannot
associate them with the gluster brick correctly. This is a problem only
when you try brick management from oVirt, i.e. removing or replacing a
brick. For monitoring brick status and detecting bricks it is not an
issue, and you can ignore the error in the logs.
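
To illustrate, the association amounts to checking whether the brick's
FQDN resolves to an address the engine knows on the cluster's gluster
network. A rough sketch of that idea, not the actual engine code (the
address below is a made-up example; the brick FQDN is from the logs above):

import socket

def resolves_to_gluster_net(brick_fqdn, gluster_net_addr):
    """True if the brick's FQDN resolves to the address the engine
    knows for the host's gluster-network interface."""
    try:
        brick_ips = set(ai[4][0] for ai in socket.getaddrinfo(brick_fqdn, None))
    except socket.gaierror:
        # Unresolvable from the engine (e.g. a non-routed storage VLAN):
        # the engine then cannot associate the brick with the network.
        return False
    return gluster_net_addr in brick_ips

# Hypothetical example values:
print(resolves_to_gluster_net('serv-vm-al04-data.sdis.isere.fr', '10.10.92.14'))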

Adding Ramesh, who has a patch to fix this.

-- 
> Nicolas ECARNOT
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Suddenly all VM's down including HostedEngine & NFSshares (except HE) unmounted

2016-08-21 Thread Matt .
Some extra info:

I see a very high load on all hosts without any VM started:


top - 02:36:36 up 56 min,  1 user,  load average: 9.95, 8.11, 7.67
Tasks: 247 total,   1 running, 246 sleeping,   0 stopped,   0 zombie
%Cpu(s):  1.2 us,  0.8 sy,  0.0 ni, 97.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 32780472 total, 31680580 free,   676704 used,   423188 buff/cache
KiB Swap: 25165820 total, 25165820 free,        0 used. 31827176 avail Mem

  PID USER  PR  NI    VIRT    RES   SHR S  %CPU %MEM   TIME+ COMMAND
  951 root  15  -5 1371512  35676  8648 S   4.7  0.1 0:48.57 supervdsmServer
  953 vdsm  20   0 5713492  52204  6496 S   4.7  0.2 1:00.16 ovirt-ha-broker
 4949 vdsm   0 -20 5148232 124364 12144 S   2.7  0.4 2:33.33 vdsm
 4952 vdsm  20   0  599528  21272  4932 S   0.7  0.1 0:16.44 python


The logs also show this:

periodic/91::DEBUG::2016-08-22 02:33:29,621::task::597::Storage.TaskManager.Task::(_updateState) Task=`65f12a56-dab5-4a19-a9ef-967e4b617087`::moving from state preparing -> state finished
periodic/91::DEBUG::2016-08-22 02:33:29,621::resourceManager::952::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
periodic/91::DEBUG::2016-08-22 02:33:29,622::resourceManager::989::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
periodic/91::DEBUG::2016-08-22 02:33:29,622::task::995::Storage.TaskManager.Task::(_decref) Task=`65f12a56-dab5-4a19-a9ef-967e4b617087`::ref 0 aborting False
periodic/89::ERROR::2016-08-22 02:33:29,636::brokerlink::279::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(_communicate) Connection closed: Connection timed out
periodic/89::ERROR::2016-08-22 02:33:29,637::api::253::root::(_getHaInfo) failed to retrieve Hosted Engine HA info
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 232, in _getHaInfo
    stats = instance.get_all_stats()
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 103, in get_all_stats
    self._configure_broker_conn(broker)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 180, in _configure_broker_conn
    dom_type=dom_type)
  File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 176, in set_storage_domain
    .format(sd_type, options, e))
RequestError: Failed to set storage domain FilesystemBackend, options {'dom_type': 'nfs3', 'sd_uuid': '4093ad17-bef5-4e4b-9a16-259a98e20321'}: Connection timed out
periodic/89::DEBUG::2016-08-22 02:33:29,638::executor::182::Executor::(_run) Worker was discarded
periodic/86::WARNING::2016-08-22 02:33:30,936::periodic::269::virt.periodic.VmDispatcher::(__call__) could not run  on ['5576ec24-112e-4995-89f8-57e40c43cc5a']
periodic/90::WARNING::2016-08-22 02:33:32,937::periodic::269::virt.periodic.VmDispatcher::(__call__) could not run  on ['5576ec24-112e-4995-89f8-57e40c43cc5a']
Reactor thread::INFO::2016-08-22 02:33:33,234::protocoldetector::72::ProtocolDetector.AcceptorImpl::(handle_accept) Accepting connection from :::172.16.30.11:56176


The strange thing is that I can mount my NFS share, where my VMs and the
HE are, manually without any issue. But in some way vdsmd fails, the RHEV
mounts that oVirt creates die, and a df -h times out.


I'm very curious what is going on here.
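
For what it's worth, df -h hangs because statvfs() blocks on a dead NFS
mount, so probing each mount with a timeout shows which one is stale. A
minimal sketch, assuming vdsm's default mount root (adjust the path to
your setup):

import os
import multiprocessing

MNT_ROOT = '/rhev/data-center/mnt'  # default vdsm mount root (assumption)

def probe(path):
    os.statvfs(path)  # blocks indefinitely on a dead NFS mount

def check_mounts(timeout=5):
    for name in os.listdir(MNT_ROOT):
        path = os.path.join(MNT_ROOT, name)
        proc = multiprocessing.Process(target=probe, args=(path,))
        proc.start()
        proc.join(timeout)
        if proc.is_alive():
            proc.terminate()
            print('STALE: %s (statvfs hung for %ss)' % (path, timeout))
        else:
            print('OK:    %s' % path)

if __name__ == '__main__':
    check_mounts()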

2016-08-22 2:01 GMT+02:00 Matt . :
> I see some very strange behaviour: hosts are randomly rebooting, and I
> get the feeling they crash on sanlock, as that is the last logline
> in /var/log/messages.
>
> I saw this happening on the latest kernel (after a yum update) and on
> different hardware; the filer is OK!
>
>
>
> Aug 22 01:35:27 host-01 sanlock[1024]: 2016-08-22 01:35:27+0200 5613 [5294]: hosted-e close_task_aio 2 0x7f3728000960 busy
> Aug 22 01:35:27 host-01 sanlock[1024]: 2016-08-22 01:35:27+0200 5613 [5294]: hosted-e close_task_aio 3 0x7f37280009b0 busy
> Aug 22 01:35:27 host-01 ovirt-ha-broker: INFO:ovirt_hosted_engine_ha.broker.listener.ConnectionHandler:Connection established
> Aug 22 01:35:27 host-01 journal: vdsm ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR Connection closed: Connection timed out
> Aug 22 01:35:27 host-01 journal: vdsm root ERROR failed to retrieve Hosted Engine HA info#012Traceback (most recent call last):#012  File "/usr/lib/python2.$
> Aug 22 01:35:27 host-01 sanlock[1024]: 2016-08-22 01:35:27+0200 5614 [5901]: s2 delta_renew read rv -2 offset 0 /rhev/data-center/mnt/flr-01
> Aug 22 01:35:27 host-01 sanlock[1024]: 2016-08-22 01:35:27+0200 5614 [5901]: s2 renewal error -2 delta_length 10 last_success 5518
> Aug 22 01:35:27 host-01 sanlock[1024]: 2016-08-22 01:35:27+0200 5614 [1024]: s2 kill 6871 sig 15 count 17
> Aug 22 01:35:28 host-01 wdmd[988]: test failed rem 24 now 5614 ping 5568 close 5578 renewal 5518 expire 5598 client 1024 sanlock_4093ad17-bef5-4e4b-9a16-259$
> Aug 22 01:35:28 

Re: [ovirt-users] Not Updating Statistics

2016-08-21 Thread Yaniv Dary
This can happen when the engine is down or very busy. Since this error
seems to happen only sometimes, that is likely the case here.
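
For what it's worth, DWH raises this warning when a heartbeat row the
engine should refresh goes stale. A minimal sketch of checking it,
assuming a 4.0-era schema (the dwh_history_timekeeping table and
'lastSync' row name are from memory, so verify them against your version;
credentials are placeholders, normally found in
/etc/ovirt-engine/engine.conf.d/10-setup-database.conf):

import psycopg2

conn = psycopg2.connect(dbname='engine', user='engine',
                        password='changeme', host='localhost')
cur = conn.cursor()
cur.execute("SELECT var_name, var_value, var_datetime "
            "FROM dwh_history_timekeeping "
            "WHERE var_name = 'lastSync'")
# A timestamp that stopped advancing while the engine runs matches
# the "not updating the statistics" warning in the DWH log.
print(cur.fetchone())
conn.close()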

Yaniv Dary
Technical Product Manager
Red Hat Israel Ltd.
34 Jerusalem Road
Building A, 4th floor
Ra'anana, Israel 4350109

Tel : +972 (9) 7692306
8272306
Email: yd...@redhat.com
IRC : ydary

On Aug 22, 2016 00:21, "Fernando Fuentes"  wrote:

> Shirly,
>
> Here is what I have:
>
> [root@hypervirt ~]# rpm -qa ovirt-engine-dwh
> ovirt-engine-dwh-4.0.2-1.el7.centos.noarch
> [root@hypervirt ~]# rpm -qa ovirt-engine
> ovirt-engine-4.0.2.6-1.el7.centos.noarch
> [root@hypervirt ~]#
>
> As requested the dwh log is attached.
>
> Regards,
>
>
> --
> Fernando Fuentes
> ffuen...@txweather.org
> http://www.txweather.org
>
>
>
> On Sun, Aug 21, 2016, at 03:02 AM, Shirly Radco wrote:
>
> Hi Fernando, Melissa
>
> What version are you using for engine and dwh?
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1344935
>
> Versions:
>
> ovirt-engine-dwh-setup-4.0.2-1.el7ev.noarch
>
> ovirt-engine-setup-4.0.2.6-0.1.el7ev.noarch
>
> should not have this issue.
>
> Please update me if you still have this issue and attach the
> ovirt-engine-dwh log.
>
>
> Best regards,
>
>
> Shirly Radco
>
> BI Software Engineer
> Red Hat Israel Ltd.
> 34 Jerusalem Road
> Building A, 4th floor
> Ra'anana, Israel 4350109
>
>
> On Sat, Aug 20, 2016 at 9:11 PM, Fernando Fuentes 
> wrote:
>
>
> Team,
>
> This is still an issue on my env. :(
>
> 2016-08-20 09:04:56|pIlSG1|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
> 2016-08-20 10:57:42|3ZzN5z|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
> 2016-08-20 11:31:33|reczvL|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
>
> Any ideas?
>
> Thanks for the help!
>
>
> --
> Fernando Fuentes
> ffuen...@txweather.org
> http://www.txweather.org
>
>
>
>
> On Thu, Aug 18, 2016, at 01:35 PM, Fernando Fuentes wrote:
>
> I am having similar issues, though my problem only started after recovering
> my DWH DB.
> But the errors are the same as posted below.
>
> Any ideas?
>
> Regards,
>
> --
> Fernando Fuentes
> ffuen...@txweather.org
> http://www.txweather.org
>
>
>
> On Wed, Aug 17, 2016, at 08:47 AM, Sandro Bonazzola wrote:
>
> Adding Shirly
>
> On Wed, Aug 17, 2016 at 3:30 PM, Melissa Mesler 
> wrote:
>
> All, we are running Ovirt 4.0 on CentOS 7. We have some errors in our
> dwh log. Look at the following:
>
> 2016-08-16 13:28:51|mFKcbP|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
> 2016-08-16 19:31:00|dgTmvI|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
> 2016-08-16 23:19:28|ByUf8M|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
> 2016-08-17 00:27:59|jzRGar|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
> 2016-08-17 02:55:01|WZ7M9t|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
> 2016-08-17 04:12:06|Q237ie|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
> 2016-08-17 04:46:12|8COWHX|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
>
> What does this mean?
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
>
>
> --
> Sandro Bonazzola
> Better technology. Faster innovation. Powered by community collaboration.
> See how it works at redhat.com
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
> ___
> Users mailing list
> Users@ovirt.org
> 

Re: [ovirt-users] Suddenly all VM's down including HostedEngine & NFSshares (except HE) unmounted

2016-08-21 Thread Matt .
I see some very strange behaviour: hosts are randomly rebooting, and I
get the feeling they crash on sanlock, as that is the last logline
in /var/log/messages.

I saw this happening on the latest kernel (after a yum update) and on
different hardware; the filer is OK!



Aug 22 01:35:27 host-01 sanlock[1024]: 2016-08-22 01:35:27+0200 5613 [5294]: hosted-e close_task_aio 2 0x7f3728000960 busy
Aug 22 01:35:27 host-01 sanlock[1024]: 2016-08-22 01:35:27+0200 5613 [5294]: hosted-e close_task_aio 3 0x7f37280009b0 busy
Aug 22 01:35:27 host-01 ovirt-ha-broker: INFO:ovirt_hosted_engine_ha.broker.listener.ConnectionHandler:Connection established
Aug 22 01:35:27 host-01 journal: vdsm ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR Connection closed: Connection timed out
Aug 22 01:35:27 host-01 journal: vdsm root ERROR failed to retrieve Hosted Engine HA info#012Traceback (most recent call last):#012  File "/usr/lib/python2.$
Aug 22 01:35:27 host-01 sanlock[1024]: 2016-08-22 01:35:27+0200 5614 [5901]: s2 delta_renew read rv -2 offset 0 /rhev/data-center/mnt/flr-01
Aug 22 01:35:27 host-01 sanlock[1024]: 2016-08-22 01:35:27+0200 5614 [5901]: s2 renewal error -2 delta_length 10 last_success 5518
Aug 22 01:35:27 host-01 sanlock[1024]: 2016-08-22 01:35:27+0200 5614 [1024]: s2 kill 6871 sig 15 count 17
Aug 22 01:35:28 host-01 wdmd[988]: test failed rem 24 now 5614 ping 5568 close 5578 renewal 5518 expire 5598 client 1024 sanlock_4093ad17-bef5-4e4b-9a16-259$
Aug 22 01:35:28 host-01 sanlock[1024]: 2016-08-22 01:35:28+0200 5615 [1024]: s2 kill 6871 sig 15 count 18
Aug 22 01:35:29 host-01 wdmd[988]: test failed rem 23 now 5615 ping 5568 close 5578 renewal 5518 expire 5598 client 1024 sanlock_4093ad17-bef5-4e4b-9a16-259$
Aug 22 01:35:29 host-01 sanlock[1024]: 2016-08-22 01:35:29+0200 5616 [1024]: s2 kill 6871 sig 15 count 19
Aug 22 01:35:30 host-01 wdmd[988]: test failed rem 22 now 5616 ping 5568 close 5578 renewal 5518 expire 5598 client 1024 sanlock_4093ad17-bef5-4e4b-9a16-259$
Aug 22 01:35:30 host-01 sanlock[1024]: 2016-08-22 01:35:30+0200 5617 [1024]: s2 kill 6871 sig 15 count 20
Aug 22 01:35:31 host-01 wdmd[988]: test failed rem 21 now 5617 ping 5568 close 5578 renewal 5518 expire 5598 client 1024 sanlock_4093ad17-bef5-4e4b-9a16-259$
Aug 22 01:35:31 host-01 ovirt-ha-agent: /usr/lib/python2.7/site-packages/yajsonrpc/stomp.py:352: DeprecationWarning: Dispatcher.pending is deprecated. Use D$
Aug 22 01:35:31 host-01 ovirt-ha-agent: pending = getattr(dispatcher, 'pending', lambda: 0)
Aug 22 01:35:31 host-01 ovirt-ha-agent: /usr/lib/python2.7/site-packages/yajsonrpc/stomp.py:352: DeprecationWarning: Dispatcher.pending is deprecated. Use D$
Aug 22 01:35:31 host-01 ovirt-ha-agent: pending = getattr(dispatcher, 'pending', lambda: 0)
Aug 22 01:35:31 host-01 sanlock[1024]: 2016-08-22 01:35:31+0200 5618 [1024]: s2 kill 6871 sig 15 count 21
Aug 22 01:35:32 host-01 wdmd[988]: test failed rem 20 now 5618 ping 5568 close 5578 renewal 5518 expire 5598 client 1024 sanlock_4093ad17-bef5-4e4b-9a16-259$
Aug 22 01:35:32 host-01 momd: /usr/lib/python2.7/site-packages/mom/Collectors/GuestMemory.py:52: DeprecationWarning: BaseException.message has been deprecat$
Aug 22 01:35:32 host-01 momd: self.stats_error('getVmMemoryStats(): %s' % e.message)
Aug 22 01:35:32 host-01 momd: /usr/lib/python2.7/site-packages/mom/Collectors/GuestMemory.py:52: DeprecationWarning: BaseException.message has been deprecat$
Aug 22 01:35:32 host-01 momd: self.stats_error('getVmMemoryStats(): %s' % e.message)
Aug 22 01:35:32 host-01 sanlock[1024]: 2016-08-22 01:35:32+0200 5619 [1024]: s2 kill 6871 sig 15 count 22
Aug 22 01:35:33 host-01 ovirt-ha-broker: WARNING:engine_health.CpuLoadNoEngine:bad health status: Hosted Engine is not up!
Aug 22 01:35:33 host-01 wdmd[988]: test failed rem 19 now 5619 ping 5568 close 5578 renewal 5518 expire 5598 client 1024 sanlock_4093ad17-bef5-4e4b-9a16-259$
Aug 22 01:35:33 host-01 sanlock[1024]: 2016-08-22 01:35:33+0200 5620 [1024]: s2 kill 6871 sig 15 count 23
Aug 22 01:35:34 host-01 wdmd[988]: test failed rem 18 now 5620 ping 5568 close 5578 renewal 5518 expire 5598 client 1024 sanlock_4093ad17-bef5-4e4b-9a16-259$
Aug 22 01:35:34 host-01 sanlock[1024]: 2016-08-22 01:35:34+0200 5621 [1024]: s2 kill 6871 sig 15 count 24
Aug 22 01:35:35 host-01 wdmd[988]: test failed rem 17 now 5621 ping 5568 close 5578 renewal 5518 expire 5598 client 1024 sanlock_4093ad17-bef5-4e4b-9a16-259$
Aug 22 01:35:35 host-01 sanlock[1024]: 2016-08-22 01:35:35+0200 5622 [10454]: 60c7bc7a close_task_aio 0 0x7f37180008c0 busy
Aug 22 01:35:35 host-01 sanlock[1024]: 2016-08-22 01:35:35+0200 5622 [10454]: 60c7bc7a close_task_aio 1 0x7f3718000910 busy
Aug 22 01:35:35 host-01 sanlock[1024]: 2016-08-22 01:35:35+0200 5622 [10454]: 60c7bc7a close_task_aio 2 0x7f3718000960 busy
Aug 22 01:35:35 host-01 sanlock[1024]: 2016-08-22 01:35:35+0200 5622 [10454]: 60c7bc7a close_task_aio 3 0x7f37180009b0 busy
Aug 22 01:35:35 host-01 sanlock[1024]: 2016-08-22 01:35:35+0200 5622 [1024]: 

Re: [ovirt-users] Not Updating Statistics

2016-08-21 Thread Fernando Fuentes
Shirly,

Here is what I have:

[root@hypervirt ~]# rpm -qa ovirt-engine-dwh
ovirt-engine-dwh-4.0.2-1.el7.centos.noarch
[root@hypervirt ~]# rpm -qa ovirt-engine
ovirt-engine-4.0.2.6-1.el7.centos.noarch
[root@hypervirt ~]#

As requested the dwh log is attached.

Regards,


--
Fernando Fuentes
ffuen...@txweather.org
http://www.txweather.org



On Sun, Aug 21, 2016, at 03:02 AM, Shirly Radco wrote:
> Hi Fernando, Melissa
>
> What version are you using for engine and dwh?
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1344935
>
> Versions: ovirt-engine-dwh-setup-4.0.2-1.el7ev.noarch
> ovirt-engine-setup-4.0.2.6-0.1.el7ev.noarch
>
> should not have this issue.
>
> Please update me if you still have this issue and attach the
> ovirt-engine-dwh log.
>
>
> Best regards,
>
>
> Shirly Radco BI Software Engineer Red Hat Israel Ltd. 34 Jerusalem
> Road Building A, 4th floor Ra'anana, Israel 4350109
>
> On Sat, Aug 20, 2016 at 9:11 PM, Fernando Fuentes
>  wrote:
>> Team,
>>
>> This is still an issue on my env. :(
>>
>> 2016-08-20 09:04:56|pIlSG1|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
>> 2016-08-20 10:57:42|3ZzN5z|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
>> 2016-08-20 11:31:33|reczvL|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
>>
>> Any ideas?
>>
>> Thanks for the help!
>>
>>
>> --
>> Fernando Fuentes
>> ffuen...@txweather.org
>> http://www.txweather.org
>>
>>
>>
>>
>> On Thu, Aug 18, 2016, at 01:35 PM, Fernando Fuentes wrote:
>>> I am having similar issues, though my problem only started after
>>> recovering my DWH DB.
>>> But the errors are the same as posted below.
>>>
>>> Any ideas?
>>>
>>> Regards,
>>>
>>> --
>>> Fernando Fuentes
>>> ffuen...@txweather.org
>>> http://www.txweather.org
>>>
>>>
>>>
>>> On Wed, Aug 17, 2016, at 08:47 AM, Sandro Bonazzola wrote:
 Adding Shirly

 On Wed, Aug 17, 2016 at 3:30 PM, Melissa Mesler
  wrote:
> All, we are running Ovirt 4.0 on CentOS 7. We have some errors
> in our
> dwh log. Look at the following:
>
> 2016-08-16 13:28:51|mFKcbP|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
> 2016-08-16 19:31:00|dgTmvI|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
> 2016-08-16 23:19:28|ByUf8M|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
> 2016-08-17 00:27:59|jzRGar|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
> 2016-08-17 02:55:01|WZ7M9t|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
> 2016-08-17 04:12:06|Q237ie|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
> 2016-08-17 04:46:12|8COWHX|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
>
> What does this mean?
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users



 --
 Sandro Bonazzola
 Better technology. Faster innovation. Powered by community
 collaboration.
 See how it works at redhat.com
 _
 Users mailing list
 Users@ovirt.org
 http://lists.ovirt.org/mailman/listinfo/users
>>>
>>> _
>>> Users mailing list
>>> Users@ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/users
>>
2016-07-19 09:35:42|ETL Service Started

Re: [ovirt-users] Suddenly all VM's down including HostedEngine & NFSshares (except HE) unmounted

2016-08-21 Thread Matt .
The strange thing is that there are no duplicated IPs in the oVirt
environment, the storage, or anything the VMs are running.

What happens, though, is that the statuses of all agents change, and
why... don't ask me.

There is really nothing in the logs that shows this behaviour.

Restarting the broker and agent, and rebooting the hosts, doesn't work
out. The only host I can start the HostedEngine on now is Host-4, though
I was earlier able to start it on the other hosts in their current states
as well.

Something is wobbly in the communication between the agents, if you ask
me. This has happened since 4.0.1.

--== Host 1 status ==--

Status up-to-date  : False
Hostname   : host-01.mydomain.tld
Host ID: 1
Engine status  : unknown stale-data
Score  : 0
stopped: True
Local maintenance  : False
crc32  : 6b73a02e
Host timestamp : 2710
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=2710 (Sun Aug 21 21:52:56 2016)
host-id=1
score=0
maintenance=False
state=AgentStopped
stopped=True


--== Host 2 status ==--

Status up-to-date  : False
Hostname   : host-02.mydomain.tld
Host ID: 2
Engine status  : unknown stale-data
Score  : 0
stopped: True
Local maintenance  : False
crc32  : 8e647fca
Host timestamp : 509
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=509 (Sun Aug 21 21:53:00 2016)
host-id=2
score=0
maintenance=False
state=AgentStopped
stopped=True


--== Host 3 status ==--

Status up-to-date  : False
Hostname   : host-01.mydomain.tld
Host ID: 3
Engine status  : unknown stale-data
Score  : 0
stopped: True
Local maintenance  : False
crc32  : 73748f9f
Host timestamp : 2888
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=2888 (Sun Aug 21 00:16:12 2016)
host-id=3
score=0
maintenance=False
state=AgentStopped
stopped=True


--== Host 4 status ==--

Status up-to-date  : False
Hostname   : host-02.mydomain.tld
Host ID: 4
Engine status  : unknown stale-data
Score  : 3400
stopped: False
Local maintenance  : False
crc32  : 86ef0447
Host timestamp : 67879
Extra metadata (valid at timestamp):
metadata_parse_version=1
metadata_feature_version=1
timestamp=67879 (Sun Aug 21 18:30:38 2016)
host-id=4
score=3400
maintenance=False
state=GlobalMaintenance
stopped=False
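
As an aside, the per-host metadata above (the output of hosted-engine
--vm-status) can also be read programmatically via the same HA client
the earlier vdsm traceback goes through; a minimal sketch (the stat key
names such as 'score' and 'stopped' are assumptions from the output
above and may differ between releases):

from ovirt_hosted_engine_ha.client import client

ha_cli = client.HAClient()
# get_all_stats() is the same call seen in the vdsm traceback earlier.
for host_id, stats in ha_cli.get_all_stats().items():
    print('host %s: score=%s stopped=%s' % (
        host_id, stats.get('score'), stats.get('stopped')))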



2016-08-21 22:09 GMT+02:00 Charles Kozler :
> This usually happens when the SPM falls off or the master storage domain
> was unreachable for a brief period in some capacity. Your logs should
> say something about an underlying storage problem, because oVirt offlines
> or pauses the VMs to avoid problems. I'd check the pathway to your master
> storage domain. You're probably right that something had a conflicting
> IP. This happened to me once when someone brought up a system on an IP
> that matched my SPM's.
>
>
> On Aug 21, 2016 3:33 PM, "Matt ."  wrote:
>>
>> Hi all,
>>
>> I'm trying to tackle an issue on 4.0.2 where suddenly all VMs,
>> including the HostedEngine, go down at once.
>>
>> I have also seen that all NFS shares are unmounted except the
>> HostedEngine storage, which is on the same NFS device as well.
>>
>> I have checked the logs, and there is nothing strange to see there, but
>> as I run a VRRP setup and also do some tests, I wonder if a duplicate IP
>> was brought up. Could this make the whole system go down and make the
>> Engine or VDSM unmount the NFS shares? My switches don't complain.
>>
>> It's strange that the HE share is the only one still available after it
>> happens.
>>
>> If so, this would be quite fragile and we should tackle where it goes
>> wrong.
>>
>> Anyone seen this behaviour?
>>
>> Thanks,
>>
>> Matt
>> ___
>> Users mailing list
>> Users@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
___
Users mailing 

Re: [ovirt-users] Suddenly all VM's down including HostedEngine & NFSshares (except HE) unmounted

2016-08-21 Thread Charles Kozler
This usually happens when the SPM falls off or the master storage domain
was unreachable for a brief period in some capacity. Your logs should
say something about an underlying storage problem, because oVirt offlines
or pauses the VMs to avoid problems. I'd check the pathway to your master
storage domain. You're probably right that something had a conflicting
IP. This happened to me once when someone brought up a system on an IP
that matched my SPM's.
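
To check the SPM and master-storage-domain state quickly, a REST query
against the engine works; a sketch with placeholder URL and credentials
(the storage_manager and master element names follow the 3.6/4.0-era v3
API from memory, so verify them against your version):

import requests
import xml.etree.ElementTree as ET

BASE = 'https://engine.example.com/ovirt-engine/api'  # placeholder
AUTH = ('admin@internal', 'password')                  # placeholder

def fetch(path):
    r = requests.get(BASE + path, auth=AUTH, verify=False)
    return ET.fromstring(r.content)

# Which host is currently SPM?
for host in fetch('/hosts').findall('host'):
    print(host.find('name').text, 'spm:', host.findtext('storage_manager'))

# Which storage domain is the master?
for sd in fetch('/storagedomains').findall('storage_domain'):
    print(sd.find('name').text, 'master:', sd.findtext('master'))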

On Aug 21, 2016 3:33 PM, "Matt ."  wrote:

> Hi all,
>
> I'm trying to tackle an issue on 4.0.2 where suddenly all VMs,
> including the HostedEngine, go down at once.
>
> I have also seen that all NFS shares are unmounted except the
> HostedEngine storage, which is on the same NFS device as well.
>
> I have checked the logs, and there is nothing strange to see there, but
> as I run a VRRP setup and also do some tests, I wonder if a duplicate IP
> was brought up. Could this make the whole system go down and make the
> Engine or VDSM unmount the NFS shares? My switches don't complain.
>
> It's strange that the HE share is the only one still available after it
> happens.
>
> If so, this would be quite fragile and we should tackle where it goes
> wrong.
>
> Anyone seen this behaviour?
>
> Thanks,
>
> Matt
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Suddenly all VM's down including HostedEngine & NFSshares (except HE) unmounted

2016-08-21 Thread Matt .
Hi all,

I'm trying to tackle an issue on 4.0.2 where suddenly all VMs,
including the HostedEngine, go down at once.

I have also seen that all NFS shares are unmounted except the
HostedEngine storage, which is on the same NFS device as well.

I have checked the logs, and there is nothing strange to see there, but
as I run a VRRP setup and also do some tests, I wonder if a duplicate IP
was brought up. Could this make the whole system go down and make the
Engine or VDSM unmount the NFS shares? My switches don't complain.

It's strange that the HE share is the only one still available after it
happens.

If so, this would be quite fragile and we should tackle where it goes wrong.

Anyone seen this behaviour?

Thanks,

Matt
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Cannot resolve prinicpal 'u...@corp.domain.com' after attaching AD to ovirt-engine

2016-08-21 Thread Ondra Machacek

On 08/21/2016 10:42 AM, Danny Rehelis wrote:

Hi,

After successfully attaching ovirt-engine 4.0.2.6-1.el7 to corporate AD,
I'm unable to log in using just 'user'. When typing 'u...@domain.com' it
works.

Is this behavior by design?


Yes, with the new aaa-ldap you have to use the UPN.
But you can set a default domain to be used; for more info please see the
AAA FAQ:


 Is it possible to change default domain of multi-domain Active 
Directory setup?

 http://www.ovirt.org/develop/release-management/features/infra/aaa_faq/



Cheers,


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Cannot resolve prinicpal 'u...@corp.domain.com' after attaching AD to ovirt-engine

2016-08-21 Thread Danny Rehelis
Hi,

After successfully attaching ovirt-engine 4.0.2.6-1.el7 to corporate AD,
I'm unable to log in using just 'user'. When typing 'u...@domain.com' it
works.

Is this behavior by design?

Cheers,
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Not Updating Statistics

2016-08-21 Thread Shirly Radco
Hi Fernando, Melissa

What version are you using for engine and dwh?

https://bugzilla.redhat.com/show_bug.cgi?id=1344935

Versions:

ovirt-engine-dwh-setup-4.0.2-1.el7ev.noarch

ovirt-engine-setup-4.0.2.6-0.1.el7ev.noarch

should not have this issue.

Please update me if you still have this issue and attach the
ovirt-engine-dwh log.


Best regards,


Shirly Radco

BI Software Engineer
Red Hat Israel Ltd.
34 Jerusalem Road
Building A, 4th floor
Ra'anana, Israel 4350109


On Sat, Aug 20, 2016 at 9:11 PM, Fernando Fuentes 
wrote:

> Team,
>
> This is still an issue on my env. :(
>
> 2016-08-20 09:04:56|pIlSG1|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
> 2016-08-20 10:57:42|3ZzN5z|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
> 2016-08-20 11:31:33|reczvL|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
>
> Any ideas?
>
> Thanks for the help!
>
> --
> Fernando Fuentes
> ffuen...@txweather.org
> http://www.txweather.org
>
>
>
> On Thu, Aug 18, 2016, at 01:35 PM, Fernando Fuentes wrote:
>
> I am having similar issues, though my problem only started after recovering
> my DWH DB.
> But the errors are the same as posted below.
>
> Any ideas?
>
> Regards,
>
> --
> Fernando Fuentes
> ffuen...@txweather.org
> http://www.txweather.org
>
>
>
> On Wed, Aug 17, 2016, at 08:47 AM, Sandro Bonazzola wrote:
>
> Adding Shirly
>
> On Wed, Aug 17, 2016 at 3:30 PM, Melissa Mesler 
> wrote:
>
> All, we are running Ovirt 4.0 on CentOS 7. We have some errors in our
> dwh log. Look at the following:
>
> 2016-08-16 13:28:51|mFKcbP|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
> 2016-08-16 19:31:00|dgTmvI|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
> 2016-08-16 23:19:28|ByUf8M|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
> 2016-08-17 00:27:59|jzRGar|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
> 2016-08-17 02:55:01|WZ7M9t|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
> 2016-08-17 04:12:06|Q237ie|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
> 2016-08-17 04:46:12|8COWHX|2EMUyU|snSJCy|OVIRT_ENGINE_DWH|SampleTimeKeepingJob|Default|5|tWarn|tWarn_1|Can not sample data, oVirt Engine is not updating the statistics. Please check your oVirt Engine status.|9704
>
> What does this mean?
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
>
>
> --
> Sandro Bonazzola
> Better technology. Faster innovation. Powered by community collaboration.
> See how it works at redhat.com
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Storage VLAN Issue

2016-08-21 Thread Edward Haas
On Wed, Aug 17, 2016 at 2:49 PM, Kendal Montgomery <
kmontgom...@cbuscollaboratory.com> wrote:

> Hi all,
>
> I just recently started testing out oVirt in our lab.  I’m using CentOS 7
> on my hosts and using the hosted-engine model, and the oVirt 3.6
> repository.  I have NFS storage.  I ran across what I think is a bug of
> some sort, and I’m curious if anyone else has tried this or know what’s
> going on.
>
> I wanted to be able to expose the NFS server (not necessarily the share
> used for oVirt storage domains, but other shares on the NFS server) to VMs
> running on my host (currently my setup only involves a single host). I have
> two 10GBase-T interfaces bonded together on the host, with two VLAN
> networks on it currently: one for the management network, one for storage.
> When the hosted-engine deployment was set up, I ended up with an ovirtmgmt
> interface that was bridged to my infrastructure vlan interface (vlan 1080).
> So, I added another network in my oVirt cluster named VM-Storage with vlan
> 1092 (my storage network).  Here is approximately how I expected this to
> end up:
>
> bond0 (bonded interface)
>   - bond0.1092 (STORAGE - vlan interface)
>     - VM-Storage (bridged interface)
>   - bond0.1080 (INFR - vlan interface)
>     - ovirtmgmt (bridged interface)
>
> However, when I did the network setup on the host, dragged the VM-Storage
> network over to the network interface, and hit OK, the UI just froze. For
> a few seconds I checked on the host via an ssh session and saw that the
> VM-Storage bridge was set up; then the server just rebooted.  After it
> rebooted, my vlan interface was no longer there, and it seems both the
> hosted engine VM and the host ended up being rebooted.  Thinking about it,
> I may have caused at least a temporary outage of my NFS storage when the
> new bridged interface was set up, which (maybe) caused the HA agent to
> think the hosted engine VM went away, and that caused the reboots.  Not
> entirely sure, but this was certainly unexpected.  I have tried several
> times, with the same result each time.
>
> I did check that any other VM network with a different VLAN ID provisions
> just fine on the host, so I assume there’s something that happens when this
> storage network is provisioned that is catching oVirt off-guard somehow.
>
> Anyone else have this issue before?  Can I solve this by adding another
> host, then moving the hosted-engine to a different host while I add the
> storage network to each host?
>
> Thanks.
>
> *Kendal Montgomery*
> *Lab Manager*
> O: 614.407.5584 | M: 614.571.0172
> kmontgom...@cbuscollaboratory.com
>


Hi Kendal,

Please provide the logs from your host, especially /var/log/messages,
vdsm.log, and supervdsm.log.
Is the ovirtmgmt network set as a VLAN network in Engine? (A screenshot of
your network configuration in Engine may help.)

Thanks,
Edy.
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] 3.6 : VLAN / non VLAN

2016-08-21 Thread Edward Haas
On Thu, Aug 18, 2016 at 6:57 PM, Alexis HAUSER <
alexis.hau...@telecom-bretagne.eu> wrote:

> hi,
>
> I'd like to know what happens when you create a new network tagged with a
> VLAN, for example 25, using em2:
>

Assuming that by 'tag' you mean the 4 bytes added to the Ethernet header,
see the answers below.
Note that you can verify this by simply running tcpdump or any other
sniffer.


> - the packets outgoing from em2.25 are tagged, right ?
>

Frames seen by this virtual interface are stripped of the vlan tag.
The purpose of this interface is to abstract the vlan from upper levels.

- the packets outgoing from em2 are tagged or not ?
>

If they originated from em2.25 (from the network you defined), then they
will be tagged with vlan ID 25.
If they originated from em2 (itself), then no tag will exist. (Usually this
occurs when you create a non-vlan network attached to em2.)

- the result is packets inside ovirt are tagged, but when you go out of it
> and reach something from em2, are the packets still tagged ?
>

If you created a VLAN network, then the following flow will exist (assuming
a VM network):
[external network]---[em2]---[em2.25]---[bridge]---[vm vnic]
     [tag25]        [tag25]  [no tag]   [no tag]   [no tag]

This is true for both ingress and egress traffic.

For a non-vlan network, the linux bridge is connected directly to em2;
whether a tag exists depends on the ingress and egress traffic, and
there is no stripping or adding of tags on the way.
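
As a scapy-based equivalent of the tcpdump check mentioned above, a
minimal sketch (needs root and the scapy package; the interface names
follow the em2/em2.25 example, and note that NIC VLAN offloading can
hide tags from the capture):

from scapy.all import sniff, Dot1Q

def report(iface, count=5):
    # Print the 802.1Q tag (or None) for a few sniffed frames.
    for pkt in sniff(iface=iface, count=count):
        tag = pkt[Dot1Q].vlan if pkt.haslayer(Dot1Q) else None
        print('%s: vlan tag = %s' % (iface, tag))

report('em2')     # traffic from the vlan network should carry tag 25
report('em2.25')  # the vlan interface should show untagged frames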

Thanks,
Edy.
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users