Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (Cannot add the host to cluster ... SSH has failed)

2015-03-09 Thread Bob Doolittle

On 03/09/2015 07:12 AM, Simone Tiraboschi wrote:

 - Original Message -
 From: Bob Doolittle b...@doolittle.us.com
 To: Simone Tiraboschi stira...@redhat.com
 Sent: Monday, March 9, 2015 12:02:49 PM
 Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 
 (Cannot add the host to cluster ... SSH
 has failed)

 On Mar 9, 2015 5:23 AM, Simone Tiraboschi stira...@redhat.com wrote:


 - Original Message -
 From: Bob Doolittle b...@doolittle.us.com
 To: users-ovirt users@ovirt.org
 Sent: Friday, March 6, 2015 9:21:20 PM
 Subject: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on
 F20 (Cannot add the host to cluster ... SSH has
 failed)

 Hi,

 I'm following the instructions here:
 http://www.ovirt.org/Hosted_Engine_Howto
 My self-hosted install failed near the end:

 To continue make a selection from the options below:
   (1) Continue setup - engine installation is complete
   (2) Power off and restart the VM
   (3) Abort setup
   (4) Destroy VM and abort setup

   (1, 2, 3, 4)[1]: 1
 [ INFO  ] Engine replied: DB Up!Welcome to Health Status!
   Enter the name of the cluster to which you want to add the
 host
   (Default) [Default]:
 [ ERROR ] Cannot automatically add the host to cluster Default: Cannot
 add
 Host. Connecting to host via SSH has failed, verify that the host is
 reachable (IP address, routable address etc.) You may refer to the
 engine.log file for further details.
 [ ERROR ] Failed to execute stage 'Closing up': Cannot add the host to
 cluster Default
 [ INFO  ] Stage: Clean up
 [ INFO  ] Generating answer file
 '/var/lib/ovirt-hosted-engine-setup/answers/answers-20150306135624.conf'
 [ INFO  ] Stage: Pre-termination
 [ INFO  ] Stage: Termination

 I can ssh into the engine VM both locally and remotely. There is no
 /root/.ssh directory, however. Did I need to set that up somehow?
 It's the engine that needs to open an SSH connection to the host calling
 it by its hostname.
 So please be sure that you can SSH to the host from the engine using its
 hostname and not its IP address.

 I'm assuming this should be a password-less login (key-based
 authentication?). 
 Yes, it is.

 As what user?
 root

OK, I see a couple of problems.
First off, I didn't have my deploying-host hostname in the hosts map for my 
engine.
After adding it to /etc/hosts (both hostname and FQDN), when I try to ssh from 
root@engine to root@host it is prompting me for a password.
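
A quick way to confirm the name-resolution half of this from the engine VM, as a minimal sketch: it only checks that a hostname resolves, not that SSH works. The function name and the example hostname are made up for illustration; `lookup` defaults to the standard `socket.getaddrinfo`.

```python
import socket


def resolve_host(hostname, lookup=socket.getaddrinfo):
    """Return the sorted list of addresses a hostname resolves to ([] if none)."""
    try:
        results = lookup(hostname, 22)
    except socket.gaierror:
        return []
    # Each getaddrinfo entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({entry[4][0] for entry in results})


if __name__ == "__main__":
    # "host.example.com" is a placeholder for the deploying host's name.
    print(resolve_host("host.example.com"))
```

An empty list from the engine VM would point at the missing /etc/hosts (or DNS) entry rather than at the SSH keys.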

On my engine, ~root/.ssh does not contain any keys.
On my host, ~root/.ssh has authorized_keys, and in it there is a key with the 
comment ovirt-engine.

It's possible that I inadvertently removed ~root/.ssh on engine while I was 
preparing the engine (I started to set up my own no-password logins and then 
thought better and cleaned up, not realizing that some prior setup affecting 
that directory had occurred). That would explain the second issue.

How/when does the key for root@engine get populated to the host's 
~root/.ssh/authorized_keys during setup?

-Bob


 -Bob

 Previously, hosted-engine hosts were simply identified by their IP address, but
 then we had some bug reports on side effects of that.
 So now we generate and sign certs using host hostnames and so the engine
 should be able to correctly resolve them.
 When I log into the Administration portal, the engine VM does not appear
 under the Virtual machine view (it's empty).
 It's because the setup didn't complete.

 I've attached what I think are the relevant logs.

 Also, when my host reboots, the ovirt-ha-broker and ovirt-ha-agent
 services
 do not come up automatically. I have to use systemctl to start them
 manually.
 It's because the setup didn't complete.

 This is a fresh Fedora 20 machine installing a fresh copy of Ovirt
 3.5.1.
 What's the cleanest approach to restore/complete sanity of my setup
 please?
 The first step is to clarify what went wrong in order to avoid it in the
 future.
 Then, if you want a really sane environment for production use, I'd
 suggest redeploying.
 So
  hosted-engine --vm-poweroff
 then empty the storage domain share and deploy again.

 Thanks,
 Bob


 I've linked 3 files to this email:
 server.log (12.4 MB) Dropbox https://db.tt/g5p09AaD
 vdsm.log (3.2 MB) Dropbox https://db.tt/P4572SUm
 ovirt-hosted-engine-setup-20150306123622-tad1fy.log (413 KB) Dropbox
 https://db.tt/XAM9ffhi


 ___
 Users mailing list
 Users@ovirt.org
 http://lists.ovirt.org/mailman/listinfo/users




Re: [ovirt-users] VDSM memory consumption

2015-03-09 Thread Darrell Budic
 On Mar 9, 2015, at 4:51 AM, Dan Kenigsberg dan...@redhat.com wrote:
 
 On Fri, Mar 06, 2015 at 10:58:53AM -0600, Darrell Budic wrote:
 I believe the supervdsm leak was fixed, but 3.5.1 versions of vdsmd still 
 leak slowly, ~300k/hr, yes.
 
 https://bugzilla.redhat.com/show_bug.cgi?id=1158108
 
 
 On Mar 6, 2015, at 10:23 AM, Chris Adams c...@cmadams.net wrote:
 
 Once upon a time, Federico Alberto Sayd fs...@uncu.edu.ar said:
 I am experiencing trouble with VDSM memory consumption.
 
 I am running
 
 Engine: ovirt 3.5.1
 
 Nodes:
 
 Centos 6.6
 VDSM 4.16.10-8
 Libvirt: libvirt-0.10.2-46
 Kernel: 2.6.32
 
 When the host boots, memory consumption is normal, but after 2 or 3
 days running, VDSM memory consumption grows and it consumes more
 memory than all VMs running on the host. If I restart the vdsm
 service, memory consumption normalizes, but then it starts growing
 again.
 
 I have seen some BZs about memory leaks in vdsm and supervdsm, but
 I don't know if VDSM 4.16.10-8 is still affected by a related bug.
 
 Can't help, but I see the same thing with CentOS 7 nodes and the same
 version of vdsm.
 -- 
 Chris Adams c...@cmadams.net
 
 I'm afraid that we are yet to find a solution for this issue, which is
 completely different from the horrible leak of supervdsm  4.16.7.
 
 Could you corroborate the claim of
Bug 1147148 - M2Crypto usage in vdsm leaks memory
 ? Does the leak disappear once you start using plaintext transport?
 
 Regards,
 Dan.

I don’t think this is crypto related, but I could try that if you still need 
some confirmation (and point me at a quick doc on switching to plaintext?).

This is from #ovirt around November 18th I think, Saggi thought he’d found 
something related:

9:58:43 AM saggi: YamakasY: Found the leak
9:58:48 AM saggi: YamakasY: Or at least the flow
9:58:57 AM saggi: YamakasY: The good news is that I can reproduce
9:59:20 AM YamakasY: saggi: that's kewl!
9:59:25 AM YamakasY: saggi: what happens ?
9:59:41 AM YamakasY: I know from Telsin (ping ping!) that he sees it going 
faster on gluster usage
10:01:54 AM saggi: YamakasY: it's in getCapabilities(). Here is the RSS graph. 
The flatlines are when I stopped calling it and called other verbs. 
http://i.imgur.com/CLm0Q75.png
10:02:46 AM saggi: YamakasY: horizontal is time since epoch and vertical is RSS 
in bytes
10:03:52 AM YamakasY: saggi: I have seen that line s much!
10:04:11 AM YamakasY: I think I even made a mailing about it
10:04:18 AM YamakasY: at least asked here
10:04:32 AM YamakasY: no-one knew, but those lines are almost blowing you away
10:04:35 AM YamakasY: can we patch it ?
10:04:59 AM YamakasY: wow, nice one to catch
10:05:28 AM saggi: YamakasY: I now have a smaller part of the code to scan 
through and a way to reproduce so hopefully I'll have a patch soon

was that ever followed up on?
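
For anyone who wants to put a number on the growth (like the ~300k/hr figure quoted above), here is a rough sketch, not anything taken from vdsm itself: sample the RSS of the vdsmd process every so often and fit a straight line through the samples. Both function names are made up for illustration; reading `VmRSS` from `/proc/<pid>/status` is Linux-specific.

```python
def read_rss_kb(pid):
    """Return the resident set size of a process in kB (Linux only)."""
    with open("/proc/%d/status" % pid) as status:
        for line in status:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])
    raise RuntimeError("VmRSS not found for pid %d" % pid)


def leak_rate_kb_per_hour(samples):
    """Least-squares slope of (unix_time_seconds, rss_kb) samples, in kB/hour."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_r = sum(r for _, r in samples) / n
    num = sum((t - mean_t) * (r - mean_r) for t, r in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("samples must not all share the same timestamp")
    return num / den * 3600.0
```

A slope that stays clearly positive over a day or two of samples would match what is being described in this thread; a flat slope would suggest the host is not affected.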




Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (Cannot add the host to cluster ... SSH has failed)

2015-03-09 Thread Simone Tiraboschi


- Original Message -
 From: Bob Doolittle b...@doolittle.us.com
 To: Simone Tiraboschi stira...@redhat.com
 Cc: users-ovirt users@ovirt.org
 Sent: Monday, March 9, 2015 6:26:30 PM
 Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 
 (Cannot add the host to cluster ... SSH
 has failed)
 
 
 On 03/09/2015 12:53 PM, Simone Tiraboschi wrote:
 
  - Original Message -
  From: Bob Doolittle b...@doolittle.us.com
  To: Simone Tiraboschi stira...@redhat.com
  Cc: users-ovirt users@ovirt.org
  Sent: Monday, March 9, 2015 12:48:37 PM
  Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on
  F20 (Cannot add the host to cluster ... SSH
  has failed)
 
 
  OK, I see a couple of problems.
  First off, I didn't have my deploying-host hostname in the hosts map for
  my
  engine.
  This is enough by itself to make the deploy procedure fail. If possible,
  we recommend relying on a DNS infrastructure, especially if you are deploying
  more than one host.
 
 OK, I've started over. Simply removing the storage domain was insufficient,
 the hosted-engine deploy failed when it found the HA and Broker services
 already configured. I decided to just start over fresh starting with
 re-installing the OS on my host.
 
 I can't deploy DNS at the moment, so I have to simply replicate /etc/hosts
 files on my host/engine. I did that this time, but have run into a new
 problem:
 
 [ INFO  ] Engine replied: DB Up!Welcome to Health Status!
   Enter the name of the cluster to which you want to add the host
   (Default) [Default]:
 [ INFO  ] Waiting for the host to become operational in the engine. This may
 take several minutes...
 [ ERROR ] The VDSM host was found in a failed state. Please check engine and
 bootstrap installation logs.
 [ ERROR ] Unable to add ovirt-vm to the manager
   Please shutdown the VM allowing the system to launch it as a
   monitored service.
   The system will wait until the VM is down.
 [ ERROR ] Failed to execute stage 'Closing up': [Errno 111] Connection
 refused
 [ INFO  ] Stage: Clean up
 [ ERROR ] Failed to execute stage 'Clean up': [Errno 111] Connection refused
 
 
 I've attached my engine log and the ovirt-hosted-engine-setup log. I think I
 had an issue with resolving external hostnames, or else a connectivity issue
 during the install.

For some reason your engine wasn't able to deploy your hosts but the SSH 
session this time was established.
2015-03-09 13:05:58,514 ERROR 
[org.ovirt.engine.core.bll.InstallVdsInternalCommand] 
(org.ovirt.thread.pool-8-thread-3) [3cf91626] Host installation failed for host 

Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (Cannot add the host to cluster ... SSH has failed)

2015-03-09 Thread Simone Tiraboschi


- Original Message -
 From: Bob Doolittle b...@doolittle.us.com
 To: Simone Tiraboschi stira...@redhat.com
 Cc: users-ovirt users@ovirt.org
 Sent: Monday, March 9, 2015 12:48:37 PM
 Subject: Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 
 (Cannot add the host to cluster ... SSH
 has failed)
 
 
 OK, I see a couple of problems.
 First off, I didn't have my deploying-host hostname in the hosts map for my
 engine.

This is enough by itself to make the deploy procedure fail. If possible, we 
recommend relying on a DNS infrastructure, especially if you are deploying more 
than one host. 

 After adding it to /etc/hosts (both hostname and FQDN), when I try to ssh
 from root@engine to root@host it is prompting me for a password.
 
 On my engine, ~root/.ssh does not contain any keys.
 On my host, ~root/.ssh has authorized_keys, and in it there is a key with the
 comment ovirt-engine.
 
 It's possible that I inadvertently removed ~root/.ssh on engine while I was
 preparing the engine (I started to set up my own no-password logins and then
 thought better and cleaned up, not realizing that some prior setup affecting
 that directory had occurred). That would explain the second issue.

No, it's OK: the private key is contained in 
/etc/pki/ovirt-engine/keys/engine.p12

 How/when does the key for root@engine get populated to the host's
 ~root/.ssh/authorized_keys during setup?

It's part of the hosted-engine deploy procedure: when the engine setup on the VM 
is completed, it gathers the engine SSH public key from 
http://{enginefqdn}/engine.ssh.key.txt and stores it under 
~root/.ssh/authorized_keys so that the engine can add the host without 
knowing the host root password.
Then hosted-engine setup contacts the engine via REST APIs to trigger the host 
setup procedure.

If the engine wasn't able to contact the host due to bad hostname resolution as 
we pointed out, you missed some steps to have a safe deployment.
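
As a rough illustration of the key-distribution step described above (this is not the actual hosted-engine code): build the key URL from the engine FQDN and add the fetched public key to the contents of root's authorized_keys file only if it is not already there. The function names are made up, and actually downloading the key and writing the file are left out.

```python
def engine_key_url(engine_fqdn):
    """URL the engine publishes its SSH public key at, per the description above."""
    return "http://%s/engine.ssh.key.txt" % engine_fqdn


def add_authorized_key(authorized_keys_text, public_key):
    """Return the authorized_keys content with public_key present exactly once."""
    key = public_key.strip()
    existing = [line.strip() for line in authorized_keys_text.splitlines()]
    if key in existing:
        return authorized_keys_text
    # Make sure the previous entry is newline-terminated before appending.
    if authorized_keys_text and not authorized_keys_text.endswith("\n"):
        authorized_keys_text += "\n"
    return authorized_keys_text + key + "\n"
```

Keeping the append idempotent means a repeated deploy attempt would not pile up duplicate entries with the ovirt-engine comment.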


Re: [ovirt-users] VDSM memory consumption

2015-03-09 Thread Chris Adams
Once upon a time, Dan Kenigsberg dan...@redhat.com said:
 I'm afraid that we are yet to find a solution for this issue, which is
 completely different from the horrible leak of supervdsm  4.16.7.
 
 Could you corroborate the claim of
 Bug 1147148 - M2Crypto usage in vdsm leaks memory
 ? Does the leak disappear once you start using plaintext transport?

So, to confirm, it looks like to do that, the steps would be:

- In the [vars] section of /etc/vdsm/vdsm.conf, set ssl = false.
- Restart the vdsmd service.

Is that all that is needed?  Is it safe to restart vdsmd on a node with
active VMs?
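
The first of those two steps could be scripted roughly like this. It is a sketch only: it covers just the vdsm.conf edit, assumes the file parses as a plain INI file, and the function name is made up. Note that Python's configparser drops comments when it rewrites a file, so keep a backup of the original.

```python
import configparser
import io


def set_vdsm_ssl(conf_text, enabled):
    """Return vdsm.conf content with 'ssl' in the [vars] section set as requested."""
    # interpolation=None so that any '%' characters in values are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section("vars"):
        parser.add_section("vars")
    parser.set("vars", "ssl", "true" if enabled else "false")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()
```

The vdsmd restart, and anything else that may be needed on the libvirt or engine side, is deliberately not covered here.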

-- 
Chris Adams c...@cmadams.net


[ovirt-users] hosted-engine --vm-status output

2015-03-09 Thread Filipe Guarino
Hello guys,
I installed oVirt using the hosted-engine procedure with six physical hosts
and more than 60 VMs, and until now everything has been OK and my environment
works fine.
I decided to use some of my hosts for other tasks, so I removed four of my
six hosts and took them out of my environment.
After a few days, my second host (hosted_engine_2) started to fail. It's a
hardware issue: my 10GbE interface stopped working. I decided to set up my host 4
as the new hosted_engine_2.
It works fine, but when I run hosted-engine --vm-status, it
still returns all of the old hosted-engine members (1 to 6).
How can I fix it so that only the active nodes are listed?
See below the output of my hosted-engine --vm-status:



[root@bmh0001 ~]# hosted-engine --vm-status

--== Host 1 status ==--

Status up-to-date                  : True
Hostname                           : bmh0001.place.brazil
Host ID                            : 1
Engine status                      : {reason: vm not running on this host, health: bad, vm: down, detail: unknown}
Score                              : 2400
Local maintenance                  : False
Host timestamp                     : 68830
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=68830 (Sun Mar  8 17:38:05 2015)
    host-id=1
    score=2400
    maintenance=False
    state=EngineDown


--== Host 2 status ==--

Status up-to-date                  : True
Hostname                           : bmh0004.place.brazil
Host ID                            : 2
Engine status                      : {health: good, vm: up, detail: up}
Score                              : 2400
Local maintenance                  : False
Host timestamp                     : 2427
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=2427 (Sun Mar  8 17:38:09 2015)
    host-id=2
    score=2400
    maintenance=False
    state=EngineUp


--== Host 3 status ==--

Status up-to-date                  : False
Hostname                           : bmh0003.place.brazil
Host ID                            : 3
Engine status                      : unknown stale-data
Score                              : 0
Local maintenance                  : True
Host timestamp                     : 331389
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=331389 (Tue Mar  3 14:48:25 2015)
    host-id=3
    score=0
    maintenance=True
    state=LocalMaintenance


--== Host 4 status ==--

Status up-to-date                  : False
Hostname                           : bmh0004.place.brazil
Host ID                            : 4
Engine status                      : unknown stale-data
Score                              : 0
Local maintenance                  : True
Host timestamp                     : 364358
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=364358 (Tue Mar  3 16:10:36 2015)
    host-id=4
    score=0
    maintenance=True
    state=LocalMaintenance


--== Host 5 status ==--

Status up-to-date                  : False
Hostname                           : bmh0005.place.brazil
Host ID                            : 5
Engine status                      : unknown stale-data
Score                              : 0
Local maintenance                  : True
Host timestamp                     : 241930
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=241930 (Fri Mar  6 09:40:31 2015)
    host-id=5
    score=0
    maintenance=True
    state=LocalMaintenance


--== Host 6 status ==--

Status up-to-date                  : False
Hostname                           : bmh0006.place.brazil
Host ID                            : 6
Engine status                      : unknown stale-data
Score                              : 0
Local maintenance                  : True
Host timestamp                     : 77376
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=77376 (Wed Mar  4 09:11:17 2015)
    host-id=6
    score=0
    maintenance=True
    state=LocalMaintenance
[root@bmh0001 ~]#
thank you very much.
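
To pick the stale entries out of output like the above without reading it by eye, a small sketch along these lines might help. It assumes only the text layout shown in this message (a banner per host, then "Label : value" lines); the function name is made up, and it only reports stale entries, it does not remove them.

```python
import re


def stale_hosts(status_text):
    """Return [(host_id, hostname)] for blocks whose 'Status up-to-date' is False."""
    stale = []
    # Each host block starts with a banner such as "--== Host 3 status ==--".
    blocks = re.split(r"--== Host \d+ status ==--", status_text)[1:]
    for block in blocks:
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                # Keep the first value seen for each label in this block.
                fields.setdefault(key.strip(), value.strip())
        if fields.get("Status up-to-date") == "False":
            stale.append((int(fields["Host ID"]), fields["Hostname"]))
    return stale
```

Fed the output above, this would be expected to list host IDs 3 to 6, which are the entries left behind by the removed hosts.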

-- 
Regards
*Filipe Guarino*


Re: [ovirt-users] VDSM memory consumption

2015-03-09 Thread Dan Kenigsberg
On Mon, Mar 09, 2015 at 10:40:51AM -0500, Darrell Budic wrote:
 I don’t think this is crypto related, but I could try that if you still need 
 some confirmation (and point me at a quick doc on switching to plaintext?).
 
 This is from #ovirt around November 18th I think, Saggi thought he’d found 
 something related:
 
 9:58:43 AM saggi: YamakasY: Found the leak
 9:58:48 AM saggi: YamakasY: Or at least the flow
 9:58:57 AM saggi: YamakasY: The good news is that I can reproduce
 9:59:20 AM YamakasY: saggi: that's kewl!
 9:59:25 AM YamakasY: saggi: what happens ?
 9:59:41 AM YamakasY: I know from Telsin (ping ping!) that he sees it going 
 faster on gluster usage
 10:01:54 AM saggi: YamakasY: it's in getCapabilities(). Here is the RSS 
 graph. The flatlines are when I stopped calling it and called other verbs. 
 http://i.imgur.com/CLm0Q75.png

I do not recall what issue Saggi and YamakasY were discussing (CCing
the pair), or whether it reached fruition as a patch. It is certainly
something other than Bug 1158108, as the latter speaks about a leak in a
normal working state, with no getCapabilities calls.




Re: [ovirt-users] VDSM memory consumption

2015-03-09 Thread Dan Kenigsberg
On Mon, Mar 09, 2015 at 12:17:00PM -0500, Chris Adams wrote:
 Once upon a time, Dan Kenigsberg dan...@redhat.com said:
  I'm afraid that we are yet to find a solution for this issue, which is
  completely different from the horrible leak of supervdsm  4.16.7.
  
  Could you corroborate the claim of
  Bug 1147148 - M2Crypto usage in vdsm leaks memory
  ? Does the leak disappear once you start using plaintext transport?
 
 So, to confirm, it looks like to do that, the steps would be:
 
 - In the [vars] section of /etc/vdsm/vdsm.conf, set ssl = false.
 - Restart the vdsmd service.
 
 Is that all that is needed?

No. You'd have to reconfigure libvirtd to work in plaintext

vdsm-tool configure --force

and also set your Engine to work in plaintext (unfortunately, I don't
recall how that's done; surely Yaniv does).

 Is it safe to restart vdsmd on a node with
 active VMs?

It's safe in the sense that I have not heard of a single failure to
reconnect to already-running VMs in years. However, this is still not
recommended for a production environment, and particularly not if one of
the VMs is defined as highly-available. This can end up with your host
being fenced and all your VMs dead.

Dan.


Re: [ovirt-users] VDSM memory consumption

2015-03-09 Thread Matt .
Hi,

I also see this on the latest 3.5 version; I'm thinking about setting
up a cron job to restart vdsm every night.

I cannot believe that people say they don't have this issue.

Could one of the devs dive in, maybe?

Thanks!

Matt





[ovirt-users] Error during host deploy for 3.5.1, package installation

2015-03-09 Thread Erik Brakke
Hello,
When deploying a new host from the admin portal to FC20 target, the package
dependency check fails (host-deploy log):

ERROR otopi.plugins.otopi.packagers.yumpackager yumpackager.error:97 Yum
[u'vdsm-4.14.8.1-0.fc20.i686 requires vdsm-xmlrpc = 4.14.8.1-0.fc20',
u'vdsm-4.14.8.1-0.fc20.i686 requires vdsm-python = 4.14.8.1-0.fc20',
u'vdsm-4.14.8.1-0.fc20.i686 requires vdsm-python-zombiereaper =
4.14.8.1-0.fc20']

I've tried the release 3.5 and 3.5-snapshot repos.  Installing the packages
manually does not satisfy host deploy.  It appears vdsm 4.16 packages are
available in the repository.

Engine was previously running 3.5.0, updated to 3.5.1, no change.  I was
able to deploy hosts in January with 3.5.0.

Any assistance greatly appreciated!

Best - Erik


Re: [ovirt-users] Troubles starting hosted engine

2015-03-09 Thread Simone Tiraboschi


- Original Message -
 From: John Florian jflor...@doubledog.org
 To: users@ovirt.org
 Sent: Sunday, March 8, 2015 9:37:39 PM
 Subject: [ovirt-users] Troubles starting hosted engine
 
 I have lots of extra fun bringing up my hosted engine right now due to
 two issues.
 
 First, either during the hosted-engine --deploy or engine-setup (I can't
 remember) I was prompted for the IP address of my gateway.  Since then
 that address has changed.  I'm unable to start the engine VM if that
 address isn't reachable so my temporary workaround is to add this old
 address onto the current gateway.  How/where do I change things so that
 this old address can be truly retired?

It's written in /etc/ovirt-hosted-engine/hosted-engine.conf
If you deployed more than one host, you need to explicitly fix it on each of 
them.
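
A minimal sketch of that edit, assuming the address is stored as a plain
`gateway=<address>` line in that file (the key name is an assumption here, so
check your own file first). `set_gateway` is a hypothetical helper, not part of
any oVirt tool; it only transforms the file's text, so back up the file before
writing the result to it:

```python
def set_gateway(conf_text, new_gateway, key="gateway"):
    """Return conf_text with the value of `key` replaced by new_gateway.

    Assumes simple key=value lines; the key name "gateway" is an assumption.
    Raises KeyError if the key is absent, so nothing is silently added.
    """
    lines = conf_text.splitlines()
    found = False
    for i, line in enumerate(lines):
        name, sep, _old_value = line.partition("=")
        if sep and name.strip() == key:
            lines[i] = "%s=%s" % (key, new_gateway)
            found = True
    if not found:
        raise KeyError(key)
    return "\n".join(lines) + "\n"
```

With more than one host deployed, the same change would have to be repeated on
each of them.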

 My second issue might be harder.  Again during the setup I was prompted
 for a location of an ISO file for installing the engine's OS.  That
 location is served by NFS and is auto-mounted by /etc/fstab (and
 systemd).  Here's the hitch: my NFS server is now a VM in my cluster.
 :-)  Since I only have a single hypervisor host right now that ISO isn't
 reachable when I'm trying to start my engine VM so that I can also start
 the VM that provides the NFS share.  I'm getting away with evil right
 now by touching an empty file at the same path, which gets obscured once
 the NFS share is mounted, but it's enough.

You need that ISO file only to install the OS when you create the engine VM on
the first host; you don't need a shared domain for that.
So my suggestion is simply to copy that ISO image to the first host and use it
locally. You can delete it when the setup is done.
 
 It's not at all clear to me how I'm supposed to edit things for my
 hosted engine setup.


Re: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 (Cannot add the host to cluster ... SSH has failed)

2015-03-09 Thread Simone Tiraboschi


- Original Message -
 From: Bob Doolittle b...@doolittle.us.com
 To: users-ovirt users@ovirt.org
 Sent: Friday, March 6, 2015 9:21:20 PM
 Subject: [ovirt-users] Error during hosted-engine-setup for 3.5.1 on F20 
 (Cannot add the host to cluster ... SSH has
 failed)
 
 Hi,
 
 I'm following the instructions here: http://www.ovirt.org/Hosted_Engine_Howto
 
 My self-hosted install failed near the end:
 
 To continue make a selection from the options below:
   (1) Continue setup - engine installation is complete
   (2) Power off and restart the VM
   (3) Abort setup
   (4) Destroy VM and abort setup
  
   (1, 2, 3, 4)[1]: 1
 [ INFO  ] Engine replied: DB Up!Welcome to Health Status!
   Enter the name of the cluster to which you want to add the host
   (Default) [Default]:
 [ ERROR ] Cannot automatically add the host to cluster Default: Cannot add
 Host. Connecting to host via SSH has failed, verify that the host is
 reachable (IP address, routable address etc.) You may refer to the
 engine.log file for further details.
 [ ERROR ] Failed to execute stage 'Closing up': Cannot add the host to
 cluster Default
 [ INFO  ] Stage: Clean up
 [ INFO  ] Generating answer file
 '/var/lib/ovirt-hosted-engine-setup/answers/answers-20150306135624.conf'
 [ INFO  ] Stage: Pre-termination
 [ INFO  ] Stage: Termination
 
 I can ssh into the engine VM both locally and remotely. There is no
 /root/.ssh directory, however. Did I need to set that up somehow?

It's the engine that needs to open an SSH connection to the host, addressing it
by its hostname.
So please make sure that you can SSH to the host from the engine using its
hostname and not its IP address.
Previously, hosted-engine hosts were simply identified by their IP address, but
then we had some bug reports about side effects of that.
So now we generate and sign certs using the hosts' hostnames, and so the engine
must be able to resolve them correctly.
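
A quick way to check this from the engine VM could look like the sketch below.
It only verifies that the hostname resolves and that TCP port 22 accepts a
connection; it does not test SSH authentication. `can_reach_ssh` is a
hypothetical helper name, and the `connect` parameter exists only so the
function can be exercised without a network:

```python
import socket


def can_reach_ssh(hostname, port=22, timeout=5.0,
                  connect=socket.create_connection):
    """Return (ok, detail) after resolving hostname and connecting to port."""
    try:
        sock = connect((hostname, port), timeout)
    except socket.gaierror as exc:
        # The name could not be resolved (DNS or /etc/hosts problem).
        return False, "name resolution failed: %s" % exc
    except OSError as exc:
        # The name resolved, but the TCP connection failed or timed out.
        return False, "connection failed: %s" % exc
    sock.close()
    return True, "TCP port %d reachable" % port
```

Run it with the host's hostname, not its IP address, since the hostname is
what the engine will use.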

 When I log into the Administration portal, the engine VM does not appear
 under the Virtual machine view (it's empty).

That's because the setup didn't complete.

 I've attached what I think are the relevant logs.
 
 Also, when my host reboots, the ovirt-ha-broker and ovirt-ha-agent services
 do not come up automatically. I have to use systemctl to start them
 manually.

That's because the setup didn't complete.

 This is a fresh Fedora 20 machine installing a fresh copy of Ovirt 3.5.1.
 
 What's the cleanest approach to restore/complete sanity of my setup please?

The first step is to clarify what went wrong in order to avoid it in the future.
Then, if you want a really sane environment for production use, I'd suggest
redeploying.

So:
 hosted-engine --vm-poweroff
then empty the storage domain share and deploy again.

 Thanks,
 Bob
 
 
 I've linked 3 files to this email:
 server.log (12.4 MB) Dropbox https://db.tt/g5p09AaD
 vdsm.log (3.2 MB) Dropbox https://db.tt/P4572SUm
 ovirt-hosted-engine-setup-20150306123622-tad1fy.log (413 KB) Dropbox
 https://db.tt/XAM9ffhi
 
 


Re: [ovirt-users] engine-image-uploader failing to update OVF

2015-03-09 Thread Simone Tiraboschi


- Original Message -
 From: Stephen Repetski srepe...@srepetsk.net
 To: users users@ovirt.org
 Sent: Friday, March 6, 2015 8:46:38 PM
 Subject: [ovirt-users] engine-image-uploader failing to update OVF
 
 Hi all,
 I'm trying to import an OVA (containing .ovf, disk, and disk.meta) into my
 ovirt environment, but it's failing during the import where it seems that it
 shouldn't. The command I'm running is:
 
 export TMPDIR=/data/backup/test; engine-image-uploader upload
 /data/convert/srepetsk-vm.ova --insecure --name=srepetsk-testvmimport -n
 $nfs_server:/backup/2fff9385-10b8-41e5-93c6-c0ef18b9840f -v
 
 The command mounts the nfs server, extracts the OVA into
 /data/backup/test/tmpxEpuMc/, parses the OVF file
 (/data/backup/test/tmpxEpuMc/srepetsk-vm.ovf), and creates the new .meta
 file and whatnot for the disk image.
 
 It then proceeds to fail saying:
 
 ERROR: Unable to update the OVF XML file. Message: [Errno 2] No such file or
 directory: '/data/backup/test/tmpxEpuMc/srepetsk-vm.ovf'
 
 however, this is the same file that it extracted and read earlier. What might
 I be doing wrong?

At that point it has to update the OVF file so it has to be able to write it.
Is /data/backup/test/ a local directory?
Can you please check SELinux logs?
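
As a first check before digging into the logs, something like the sketch below
can tell you whether the file and its directory exist and are writable for the
user running the uploader. `check_writable` is a hypothetical helper, and it
is not a substitute for looking at the audit log:

```python
import os


def check_writable(path):
    """Return a list of problems that would prevent rewriting path in place."""
    problems = []
    directory = os.path.dirname(os.path.abspath(path))
    if not os.path.isdir(directory):
        problems.append("directory missing: %s" % directory)
        return problems
    # Write and search permission on the directory are both needed to
    # create or replace files inside it.
    if not os.access(directory, os.W_OK | os.X_OK):
        problems.append("directory not writable: %s" % directory)
    if not os.path.exists(path):
        problems.append("file missing: %s" % path)
    elif not os.access(path, os.W_OK):
        problems.append("file not writable: %s" % path)
    return problems
```

An empty list means no obvious permission problem was found at the time of
the check. Note that the extract directory is temporary, so the check is only
meaningful while it still exists.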
 
 Full(er) log:
 DEBUG: local extract directory for OVF is /data/backup/test/tmpxEpuMc
 DEBUG: Size of /data/convert/srepetsk-vm.ova: 17179876069 bytes 16777222.7
 1K-blocks 16384.0 MB
 DEBUG: Available space in /data/backup/test/tmpxEpuMc: 5206184878080 bytes
 5084164920.0 1K-blocks 4965004.8 MB
 DEBUG: File is /data/backup/test/tmpxEpuMc/srepetsk-vm.ovf
 DEBUG: tag(Section) text(None) attr({'{http://www.w3.org/2001/XMLSchema-instance}type':
 'ovf:VirtualHardwareSection_Type'}) class(Element Section at 1e752b8)
 DEBUG: item tag(Item) item text(None) item attr({}) class(Element Item at
 1e75368)
 DEBUG: tag({http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_ResourceAllocationSettingData}Caption)
 value(1 virtual cpu)
 snip
 DEBUG: old meta
 file(/data/backup/test/tmpxEpuMc/5f63da48-3ced-42ad-b684-72b626aec727.meta)
 new meta
 file(/data/backup/test/tmpxEpuMc/fffc878e-8df9-4b81-b04a-614a0af437a3.meta)
 DEBUG: old dir(/data/backup/test/tmpxEpuMc) new
 dir(/data/backup/test/4cd70a4f-f979-4b42-a416-2c8b4d028a88)
 DEBUG: item tag(Item) item text(None) item attr({}) class(Element Item at
 1e75470)
 DEBUG: item tag(Item) item text(None) item attr({}) class(Element Item at
 1e754c8)
 DEBUG: item tag(Item) item text(None) item attr({}) class(Element Item at
 1e75520)
 DEBUG: item tag(Item) item text(None) item attr({}) class(Element Item at
 1e75578)
 DEBUG: item tag(Item) item text(None) item attr({}) class(Element Item at
 1e755d0)
 DEBUG: item tag(Item) item text(None) item attr({}) class(Element Item at
 1e75628)
 DEBUG: tag(Section) text(None) attr({'{http://www.w3.org/2001/XMLSchema-instance}type':
 'ovf:VirtualHardwareSection_Type'}) class(Element Section at 1e752b8)
 DEBUG: item tag(Item) item text(None) item attr({}) class(Element Item at
 1e75368)
 DEBUG: item tag(Item) item text(None) item attr({}) class(Element Item at
 1e753c0)
 DEBUG: item tag(Item) item text(None) item attr({}) class(Element Item at
 1e75418)
 DEBUG: item tag(Item) item text(None) item attr({}) class(Element Item at
 1e75470)
 DEBUG: item tag(Item) item text(None) item attr({}) class(Element Item at
 1e754c8)
 DEBUG: item tag(Item) item text(None) item attr({}) class(Element Item at
 1e75520)
 DEBUG: item tag(Item) item text(None) item attr({}) class(Element Item at
 1e75578)
 DEBUG: item tag(Item) item text(None) item attr({}) class(Element Item at
 1e755d0)
 DEBUG: item tag(Item) item text(None) item attr({}) class(Element Item at
 1e75628)
 ERROR: Unable to update the OVF XML file. Message: [Errno 2] No such file or
 directory: '/data/backup/test/tmpxEpuMc/srepetsk-vm.ovf'
 DEBUG: Cleaning up OVF extract directory /data/backup/test/tmpxEpuMc
 DEBUG: [Errno 2] No such file or directory: '/data/backup/test/tmpxEpuMc'
 DEBUG: /bin/umount -t nfs -f /data/backup/test/tmpd8kH1X
 DEBUG: /bin/umount -t nfs -f /data/backup/test/tmpd8kH1X
 DEBUG: _cmds(['/bin/umount', '-t', 'nfs', '-f',
 '/data/backup/test/tmpd8kH1X'])
 DEBUG: returncode(0)
 DEBUG: STDOUT()
 DEBUG: STDERR()
 
 
 Thanks,
 Stephen
 
 Stephen Repetski
 Rochester Institute of Technology '13 | http://srepetsk.net
 


Re: [ovirt-users] VDSM memory consumption

2015-03-09 Thread Dan Kenigsberg
On Fri, Mar 06, 2015 at 10:58:53AM -0600, Darrell Budic wrote:
 I believe the supervdsm leak was fixed, but 3.5.1 versions of vdsmd still
 leak slowly, ~300k/hr, yes.
 
 https://bugzilla.redhat.com/show_bug.cgi?id=1158108
 
 
  On Mar 6, 2015, at 10:23 AM, Chris Adams c...@cmadams.net wrote:
  
  Once upon a time, Federico Alberto Sayd fs...@uncu.edu.ar said:
  I am experiencing troubles with VDSM memory consumption.
  
  I am running
  
  Engine: ovirt 3.5.1
  
  Nodes:
  
  Centos 6.6
  VDSM 4.16.10-8
  Libvirt: libvirt-0.10.2-46
  Kernel: 2.6.32
  
  When the host boots, memory consumption is normal, but after 2 or 3
  days running, VDSM memory consumption grows and it consumes more
  memory than all the VMs running on the host. If I restart the vdsm
  service, memory consumption normalizes, but then it starts growing
  again.
  
  I have seen some BZs about memory leaks in vdsm and supervdsm, but
  I don't know if VDSM 4.16.10-8 is still affected by a related bug.
  
  Can't help, but I see the same thing with CentOS 7 nodes and the same
  version of vdsm.
  -- 
  Chris Adams c...@cmadams.net

I'm afraid that we are yet to find a solution for this issue, which is
completely different from the horrible leak of supervdsm  4.16.7.

Could you corroborate the claim of
Bug 1147148 - M2Crypto usage in vdsm leaks memory
? Does the leak disappear once you start using plaintext transport?
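
To compare the two configurations with numbers rather than impressions, one
option is to sample the vdsm process's resident memory at intervals and fit a
line through the samples. The sketch below assumes the samples come from the
`VmRSS` line of `/proc/<pid>/status`; both function names are hypothetical and
the sampling loop itself is left out:

```python
def rss_kib(status_text):
    """Extract VmRSS in KiB from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    raise ValueError("VmRSS not found")


def growth_kib_per_hour(samples):
    """Least-squares slope of (seconds, rss_kib) samples, in KiB per hour."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / float(n)
    mean_r = sum(r for _, r in samples) / float(n)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("samples need distinct timestamps")
    num = sum((t - mean_t) * (r - mean_r) for t, r in samples)
    return num / den * 3600.0
```

A slope that stays near zero over a day or two with plaintext transport, and
clearly positive with SSL, would support the claim of that bug.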

Regards,
Dan.