Re: [ovirt-users] power outage: HA vms not restarted

2014-05-21 Thread Yuriy Demchenko

Hi,

sorry for the delay, I guess I'll plan to upgrade to 3.4 soon

Eli, Artyom, Omer - big thanks for your valuable help, it was important 
to me to understand what went wrong in that incident.


Yuriy Demchenko

On 05/19/2014 06:26 PM, Artyom Lukianov wrote:

Bug already fixed in 3.3 https://bugzilla.redhat.com/show_bug.cgi?id=1074478 
and 3.4 https://bugzilla.redhat.com/show_bug.cgi?id=1078553
Thanks.

- Original Message -
From: Yuriy Demchenko demchenko...@gmail.com
To: Eli Mesika emes...@redhat.com
Cc: users@ovirt.org
Sent: Monday, May 19, 2014 4:29:54 PM
Subject: Re: [ovirt-users] power outage: HA vms not restarted

On 05/19/2014 05:13 PM, Eli Mesika wrote:

- Original Message -

From: Yuriy Demchenko demchenko...@gmail.com
To: Eli Mesika emes...@redhat.com
Cc: users@ovirt.org
Sent: Monday, May 19, 2014 4:01:04 PM
Subject: Re: [ovirt-users] power outage: HA vms not restarted

On 05/19/2014 04:56 PM, Eli Mesika wrote:

but shouldn't the engine restart the corresponding VMs after the host holding them came
up? (without manual fencing)
Because the hosts are up, the engine can query them about running/not running VMs
and get the actual state of the VMs - running or not.
The only host that was down at that point is srv5, which held only 1 VM -
and it was correctly put into the 'unknown' state; the other VMs were just 'down'
until we manually started them

Are you sure that those VMs are defined as Highly Available VMs ???


yes, I'm sure. I double-checked in the web interface, plus there are log entries like:

Might this be related? I think that in your case the host came up very fast while the 
fencing operation had already started: 
https://bugzilla.redhat.com/show_bug.cgi?id=1064860

doesn't seem so, as the VM wasn't put into the 'unknown' state and srv19 was
already up when the engine booted, so no fence attempt was ever made for it


2014-05-17 00:23:10,565 INFO
[org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo]
(DefaultQuartzScheduler_Worker-14) vm prod.gui running in db and not
running in vds - add to rerun treatment. vds srv19
2014-05-17 00:23:10,909 INFO
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(DefaultQuartzScheduler_Worker-14) [2989840c] Correlation ID: null, Call
Stack: null, Custom Event ID: -1, Message: Highly Available VM prod.gui
failed. It will be restarted automatically.
2014-05-17 00:23:10,911 INFO
[org.ovirt.engine.core.bll.VdsEventListener]
(DefaultQuartzScheduler_Worker-14) [2989840c] Highly Available VM went
down. Attempting to restart. VM Name: prod.gui, VM
Id:bbb7a605-d511-461d-99d2-c5a5bf8d9958



___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Hosted engine problem - Engine VM will not start

2014-05-21 Thread Sandro Bonazzola
On 20/05/2014 20:43, Bob Doolittle wrote:
 
 On 05/20/2014 10:41 AM, Sandro Bonazzola wrote:
 On 20/05/2014 16:36, Bob Doolittle wrote:
 On 05/20/2014 10:23 AM, Sandro Bonazzola wrote:
 On 20/05/2014 16:06, Bob Doolittle wrote:
 On 05/20/2014 09:42 AM, Sandro Bonazzola wrote:
 On 20/05/2014 15:09, Jiri Moskovcak wrote:
 On 05/20/2014 02:57 PM, Bob Doolittle wrote:
 Well that was interesting.
 When I ran hosted-engine --connect-storage, the Data Center went green,
 and I could see an unattached ISO domain and ovirt-image-repository 
 (but
 no Data domain).
 But after restarting ovirt-ha-broker and ovirt-ha-agent, the storage
 disappeared again and the Data Center went red.

 In retrospect, there appears to be a problem with iptables/firewalld
 that could be related.
 I noticed two things:
 - firewalld is stopped and disabled on the host
 Correct, hosted engine supports iptables only.
 You should have iptables configured and enabled.
 - I could not manually NFS mount (v3 or v4) from the host to the 
 engine,
 unless I did service iptables stop

 So it doesn't appear to me that hosted-engine did the right things with
 firewalld/iptables. If these problems occurred during the --deploy,
 could that result in this situation?
 I don't think so
 I have temporarily disabled iptables until I get things working, but
 clearly that's insufficient to resolve the problem at this point.
 - iptables/firewalld is configured during the setup, which is Sandro's 
 domain. Sandro, could you please take a look at this?
 iptables configuration is performed by the engine when adding the host.
 please attach iptables-save output from the host  and host-deploy logs 
 from the hosted-engine vm.
 host-deploy logs are ^^ in this thread.
 I see ovirt-hosted-engine-setup logs, not 
 /var/log/ovirt-engine/host-deploy logs.
 Oh sorry - from the engine then. Attached.

 But my problem is with the firewall on the host.

 I cannot NFS mount a share on the host (e.g. my Data Domain) on the engine.
 In this case the host is the NFS server, and the engine is the NFS client.
 Only the host firewall should be relevant, correct?

 Maybe what you are saying is that hosted-engine does not attempt to 
 configure the iptables on the host to allow NFS shares?
 Yes, to be clear:
 ovirt-hosted-engine-setup just enables ports for SPICE / VNC connections from the 
 remote host to the VM while performing the OS install on the VM.
 Once the VM is installed, ovirt-engine configures iptables on the host using the 
 ovirt-host-deploy package when the host is added to the engine.
 If you need other services on the host running the hosted engine you'll need 
 to configure iptables manually.
 
 Thanks,
 
 Jirka - since Sandro says this NFS issue is irrelevant to Hosted operation, 
 do you have any other suggestions or can I provide any additional data to
 help diagnose why my configuration is non-operational?
 I will eventually want to fix this and add Data and Export domains from my 
 host, but for the moment it appears no NFS exports from the host are
 required for oVirt operation.

I'm not saying the NFS issue is irrelevant :-)
I'm saying that if you're adding an NFS service on the node running the hosted engine, 
you'll need to configure iptables to allow mounting the shares.
This means at least opening the rpcbind port 111, the NFS port 2049, and ports 662, 
875, 892, 32769 and 32803, assuming you've configured NFS with:

RPCRQUOTADOPTS=-p 875
LOCKD_TCPPORT=32803
LOCKD_UDPPORT=32769
RPCMOUNTDOPTS=-p 892
STATDARG=-p 662 -o 2020

An alternative is to use NFS storage on a different host.
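
For reference, here is a rough, untested sketch of the iptables rules this
implies (port numbers taken from the configuration above; -I is used so the
rules land before any final REJECT rule, adjust to your own setup and then
persist them, e.g. with 'service iptables save'):

# rpcbind
iptables -I INPUT -p tcp --dport 111 -j ACCEPT
iptables -I INPUT -p udp --dport 111 -j ACCEPT
# nfsd
iptables -I INPUT -p tcp --dport 2049 -j ACCEPT
# statd, rquotad, mountd and lockd, pinned to the ports shown above
iptables -I INPUT -p tcp -m multiport --dports 662,875,892,32803 -j ACCEPT
iptables -I INPUT -p udp -m multiport --dports 662,875,892,32769 -j ACCEPT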



 So where are my domains? :)
 
 Thanks,
 Bob
 


 I have attached iptables-save output.
 I can't see anything blocking the mount from the host toward the engine 
 vm.
 Can you attach iptables-save also from the engine vm?
 (IIUC you have an NFS share there and you're trying to mount it from the host, 
 right?)
 Vice versa. My Data domain is on my host. So is my Export domain, but I 
 haven't tried to import it yet since the Datacenter is not operational.

 Thanks,
 Bob


 


-- 
Sandro Bonazzola
Better technology. Faster innovation. Powered by community collaboration.
See how it works at redhat.com
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] vm's not shutting down from admin portal

2014-05-21 Thread Michal Skrivanek

On May 20, 2014, at 02:05 , Jeff Clay jeffc...@gmail.com wrote:

 When selecting to shut down VMs from the admin portal, it often doesn't work, 
 although sometimes it does. These machines are all stateless and in the same 
 pool, yet sometimes they will shut down from the portal; most of the time they 
 don't. Here's what I see in engine.log when they don't shut down. 
 
 
 2014-05-19 18:17:42,477 INFO  
 [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] 
 (org.ovirt.thread.pool-6-thread-2) [4d427221] Correlation ID: 4d427221, Job 
 ID: ce662a5c-9474-4406-90f5-e941e130b47d, Call Stack: null, Custom Event ID: 
 -1, Message: VM shutdown initiated by Jeff.Clay on VM USAROVRTVZ-13 (Host: 
 USARPAOVRTHOST02).
 2014-05-19 18:22:45,333 INFO  
 [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] 
 (DefaultQuartzScheduler_Worker-53) VM USAROVRTVZ-13 
 67a51ec0-659d-4372-b4f1-85a56e6c0992 moved from PoweringDown --> Up

There is a 5-minute timeout for the guest to power off. If it fails to do so, we 
move the state back to Up.
Your guest is likely not configured properly (either the ovirt-guest-agent is missing 
or ACPI is disabled/misconfigured in the guest OS).
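
As a quick sanity check, something along these lines (just a sketch, assuming a
systemd-based Linux guest; for Windows guests you'd check the oVirt Guest Agent
service and the power/ACPI settings instead):

# is the guest agent installed and running?
systemctl status ovirt-guest-agent
# is acpid running, so the ACPI power-button event sent by the shutdown is handled?
systemctl status acpid
# watch for the ACPI event while clicking Shutdown in the portal
journalctl -f | grep -i acpi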

Thanks,
michal

 2014-05-19 18:22:45,381 INFO  
 [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] 
 (DefaultQuartzScheduler_Worker-53) Correlation ID: null, Call Stack: null, 
 Custom Event ID: -1, Message: Shutdown of VM USAROVRTVZ-13 failed.
 
 ___
 Users mailing list
 Users@ovirt.org
 http://lists.ovirt.org/mailman/listinfo/users

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Hosted engine problem - Engine VM will not start

2014-05-21 Thread Sven Kieske
I'd like to add that these rules are for NFSv3 -
asking Bob if he is maybe using NFSv4?

On 21.05.2014 08:43, Sandro Bonazzola wrote:
 I'm not saying NFS issue is irrelevant :-)
 I'm saying that if you're adding NFS service on the node running hosted 
 engine you'll need to configure iptables for allowing to mount the shares.
 This means at least opening rpc-bind port 111 and NFS port 2049 and ports 662 
 875 892 32769 32803 assuming you've configured NFS with:
 
 RPCRQUOTADOPTS=-p 875
 LOCKD_TCPPORT=32803
 LOCKD_UDPPORT=32769
 RPCMOUNTDOPTS=-p 892
 STATDARG=-p 662 -o 2020
 
 Alternative is to use NFS storage on a different host.

-- 
Mit freundlichen Grüßen / Regards

Sven Kieske

Systemadministrator
Mittwald CM Service GmbH & Co. KG
Königsberger Straße 6
32339 Espelkamp
T: +49-5772-293-100
F: +49-5772-293-333
https://www.mittwald.de
Geschäftsführer: Robert Meyer
St.Nr.: 331/5721/1033, USt-IdNr.: DE814773217, HRA 6640, AG Bad Oeynhausen
Komplementärin: Robert Meyer Verwaltungs GmbH, HRB 13260, AG Bad Oeynhausen
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] glusterfs questions/tips

2014-05-21 Thread Gabi C

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] [QE][ACTION NEEDED] oVirt 3.4.2 RC status

2014-05-21 Thread Sandro Bonazzola
Hi,
We're going to start composing oVirt 3.4.2 RC on *2014-05-27 08:00 UTC* from 
3.4 branches.

The bug tracker [1] shows no blocking bugs for the release.

There are still 71 bugs [2] targeted to 3.4.2.
Excluding node and documentation bugs we still have 39 bugs [3] targeted to 
3.4.2.

Maintainers / Assignee:
- Please add the bugs to the tracker if you think that 3.4.2 should not be 
released without them fixed.
- Please update the target to any next release for bugs that won't be in 3.4.2:
  it will ease gathering the blocking bugs for next releases.
- Please fill in the release notes; the page has been created here [4]
- If you need to rebuild packages, please build them before *2014-05-26 15:00 
UTC*.
  Otherwise we'll take the last available 3.4 snapshot.

Community:
- If you're going to test this release, please add yourself to the test page [5]

[1] http://bugzilla.redhat.com/1095370
[2] http://red.ht/1oqLLlr
[3] http://red.ht/1nIAZXO
[4] http://www.ovirt.org/OVirt_3.4.2_Release_Notes
[5] http://www.ovirt.org/Testing/oVirt_3.4.2_Testing


Thanks,

-- 
Sandro Bonazzola
Better technology. Faster innovation. Powered by community collaboration.
See how it works at redhat.com
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Jpackage down - permanently?

2014-05-21 Thread Jorick Astrego

Jpackage.org is down again; there isn't even an authoritative DNS server
configured anymore.

Does anyone know what's going on there? It seems to have dropped off the
planet, and this messes up the dependency repo big time.
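
(A quick way to check the DNS side of it, for example:

dig +short NS jpackage.org
dig +short A www.jpackage.org

if there really is no authoritative server left, these come back empty or
time out.)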

Kind regards,

Jorick Astrego
Netbulae BV

On Mon, 2014-05-19 at 10:31 +0200, Sandro Bonazzola wrote: 

 On 19/05/2014 10:26, Neil wrote:
  Hi guys,
  
  Sorry for the late reply.
  
  Thank you for all the responses, I'll do as suggested with regards to
  the Jboss and thank you for the upgrading links.
  
  Much appreciated.
 
 You're welcome :-)
 
 
  
  Regards.
  
  Neil Wilson.
  
  
  
  On Mon, May 19, 2014 at 8:28 AM, Sandro Bonazzola sbona...@redhat.com 
  wrote:
  On 16/05/2014 16:07, Neil wrote:
  Hi guys,
 
  I'm doing an urgent ovirt upgrade from 3.2(dreyou) to 3.4 (official)
  but I see that www.jpackage.org is down.
 
  I've managed to install jboss-as-7.1.1-11.el6.x86_64 from another
  location, however I'm not sure if this will be compatible with 3.4, as
  per the instructions from
  http://wiki.dreyou.org/dokuwiki/doku.php?id=ovirt_rpm_start33
 
  Hi,
  We ship a jboss-as package within the oVirt EL6 repository, so there is no need for 
  the jpackage repository just for jboss-as.
  http://resources.ovirt.org/pub/ovirt-3.4/rpm/el6/x86_64/jboss-as-7.1.1-11.el6.x86_64.rpm
 
 
 
 
  I've tested browsing to the site on two different internet links, as
  well as from the remote server but I'm just getting...
 
  RepoError: Cannot retrieve repository metadata (repomd.xml) for
  repository: ovirt-jpackage-6.0-generic. Please verify its path and try
  again
 
  Can I go ahead and attempt the upgrade using 7.1.1-11?
 
 
  You're upgrading from 3.2 so please follow upgrade instructions for 
  upgrading to 3.3.5 before trying to upgrade to 3.4.1:
  - http://www.ovirt.org/OVirt_3.3.5_release_notes
  - http://www.ovirt.org/OVirt_3.4.1_release_notes
 
 
 
 
  Thanks!
 
  Regards.
 
  Neil Wilson.
  ___
  Users mailing list
  Users@ovirt.org
  http://lists.ovirt.org/mailman/listinfo/users
 
 
 
  --
  Sandro Bonazzola
  Better technology. Faster innovation. Powered by community collaboration.
  See how it works at redhat.com
 
 


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] [ovirt-devel] [QE][ACTION NEEDED] oVirt 3.4.2 RC status

2014-05-21 Thread Sven Kieske
Hi,

and thanks for the list with the open bugs.

Here are my proposed blockers (I won't add
them directly without dev agreement)
The list is not ordered in any way:

Cache records in memory
https://bugzilla.redhat.com/show_bug.cgi?id=870330
Status: Post

ovirt-log-collector: conflicts with file from package sos >= 3.0
https://bugzilla.redhat.com/show_bug.cgi?id=1037663
Status: Post

Firmware missing
https://bugzilla.redhat.com/show_bug.cgi?id=1063001
Status: Post

Hot plug CPU - allow over-commit
https://bugzilla.redhat.com/show_bug.cgi?id=1097195
Status: New

Stucked tuned service during host deploying
https://bugzilla.redhat.com/show_bug.cgi?id=1069119
Status: New

Failed to reconfigure libvirt for VDSM
https://bugzilla.redhat.com/show_bug.cgi?id=1078309
Status: NEW

[vdsm] Fix make distcheck target
https://bugzilla.redhat.com/show_bug.cgi?id=1098179
Status: NEW

Run vm with odd number of cores drop libvirt error
https://bugzilla.redhat.com/show_bug.cgi?id=1070890
Status: New



-- 
Mit freundlichen Grüßen / Regards

Sven Kieske

Systemadministrator
Mittwald CM Service GmbH & Co. KG
Königsberger Straße 6
32339 Espelkamp
T: +49-5772-293-100
F: +49-5772-293-333
https://www.mittwald.de
Geschäftsführer: Robert Meyer
St.Nr.: 331/5721/1033, USt-IdNr.: DE814773217, HRA 6640, AG Bad Oeynhausen
Komplementärin: Robert Meyer Verwaltungs GmbH, HRB 13260, AG Bad Oeynhausen
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] [QE][ACTION NEEDED] oVirt 3.5.0 Alpha status

2014-05-21 Thread Sandro Bonazzola
Hi,
We released oVirt 3.5.0 Alpha on *2014-05-20* and we're now preparing for 
feature freeze scheduled for 2014-05-30.
We're going to compose a second Alpha on Friday *2014-05-30 08:00 UTC*.
Maintainers:
- Please be sure that the master snapshot allows creating VMs before *2014-05-29 
15:00 UTC*


The bug tracker [1] shows the following proposed blockers to be reviewed:

Bug ID  Whiteboard  Status  Summary
1001100 integration NEW Add log gathering for a new ovirt 
module (External scheduler)
1073944 integration ASSIGNED    Add log gathering for a new ovirt 
module (External scheduler)
1060198 integration NEW [RFE] add support for Fedora 20
1099432 virt        NEW noVNC client doesn't work: Server 
disconnected (code: 1006)


Feature freeze has been postponed to 2014-05-30 and the following features 
should be testable in 3.5.0 Alpha according to Features Status Table [2]

Group   oVirt BZ        Title
gluster 1096713 Monitoring (UI plugin) Dashboard (Integrated 
with Nagios monitoring)
infra   1090530 [RFE] Please add host count and guest count 
columns to Clusters tab in webadmin
infra   1054778 [RFE] Allow to perform fence operations from a 
host in another DC
infra   1090803 [RFE] Change the Slot field to Service 
Profile when cisco_ucs is selected as the fencing type
infra   1090511 [RFE] Improve fencing robustness by retrying 
failed attempts
infra   1090794 [RFE] Search VMs based on MAC address from 
web-admin portal
infra   1090793 consider the event type while printing events 
to engine.log
infra   1090796 [RFE] Re-work engine ovirt-node host-deploy 
sequence
infra   1090798 [RFE] Admin GUI - Add host uptime information 
to the General tab
infra   1090808 [RFE] Ability to dismiss alerts and events from 
web-admin portal
infra-api   1090797 [RFE] RESTAPI: Add /tags sub-collection for 
Template resource
infra-dwh   1091686 prevent OutOfMemoryError after starting the dwh 
service.
network 1078836 Add a warning when adding display network
network 1079719 Display of NIC Slave/Bond fault on Event Log
network 1080987 Support ethtool_opts functionality within oVirt
storage 1054241 Store OVF on any domains
storage 1083312 Disk alias recycling in web-admin portal
ux  1064543 oVirt new look and feel [PatternFly adoption] - 
phase #1
virt    1058832 Allow to clone a (down) VM without 
snapshot/template
virt    1031040 can't set different keymap for vnc via runonce 
option
virt    1043471 oVirt guest agent for SLES
virt    1083049 add progress bar for vm migration
virt    1083065 EL 7 guest compatibility
virt    1083059 Instance types (new template handling) - 
adding flavours
virt            Allow guest serial number to be configurable
virt    1047624 [RFE] support BIOS boot device menu
virt    1083129 allows setting netbios name, locale, language 
and keyboard settings for windows vm's
virt    1038632 spice-html5 button to show debug console/output 
window
virt    1080002 [RFE] Enable user defined Windows Sysprep file  
done



Some more features may be included since they were close to completion at the last 
sync meeting.
The table will be updated at the next sync meeting, scheduled for 2014-05-21.

There are still 382 bugs [3] targeted to 3.5.0.
Excluding node and documentation bugs we still have 319 bugs [4] targeted to 
3.5.0.

Maintainers / Assignee:
- Please remember to rebuild your packages before *2014-05-29 15:00 UTC* if 
needed, otherwise the nightly snapshot will be taken.
- Please be sure that the master snapshot allows creating VMs before *2014-05-29 
15:00 UTC*
- If you find a blocker bug please remember to add it to the tracker [1]
- Please start filling in the release notes; the page has been created here [5]

Community:
- You're welcome to join us testing this alpha release and getting involved in 
oVirt Quality Assurance[6]!


[1] http://bugzilla.redhat.com/1073943
[2] http://bit.ly/17qBn6F
[3] http://red.ht/1pVEk7H
[4] http://red.ht/1rLCJwF
[5] http://www.ovirt.org/OVirt_3.5_Release_Notes
[6] http://www.ovirt.org/OVirt_Quality_Assurance

Thanks,


-- 
Sandro Bonazzola
Better technology. Faster innovation. Powered by community collaboration.
See how it works at redhat.com
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] [ovirt-devel] [QE][ACTION NEEDED] oVirt 3.4.2 RC status

2014-05-21 Thread Sandro Bonazzola
On 21/05/2014 11:08, Sven Kieske wrote:
 Hi,
 
 and thanks for the list with the open bugs.
 
 Here are my proposed blockers (I won't add
 them directly without dev agreement)
 The list is not ordered in any way:
 
 Cache records in memory
 https://bugzilla.redhat.com/show_bug.cgi?id=870330
 Status: Post
 
 ovirt-log-collector: conflicts with file from package sos = 3.0
 https://bugzilla.redhat.com/show_bug.cgi?id=1037663
 Status: Post

I'm working with the sos maintainer to coordinate the oVirt 3.4.2 release with the sos 
3.1.1 release, which fixes the issue
by moving all oVirt sos plugins to upstream sos, fixing the conflicts once and for all.
Since a workaround exists (yum downgrade of sos to 2.2-31.fc19, still available in 
F19), I don't think it should be a blocker.
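
For anyone hitting the conflict in the meantime, the workaround boils down to
something like this on F19, assuming the old package is still in the repos:

yum downgrade sos-2.2-31.fc19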


 
 Firmware missing
 https://bugzilla.redhat.com/show_bug.cgi?id=1063001
 Status: Post
 
 Hot plug CPU - allow over-commit
 https://bugzilla.redhat.com/show_bug.cgi?id=1097195
 Status: New
 
 Stucked tuned service during host deploying
 https://bugzilla.redhat.com/show_bug.cgi?id=1069119
 Status: New
 
 Failed to reconfigure libvirt for VDSM
 https://bugzilla.redhat.com/show_bug.cgi?id=1078309
 Status: NEW

Let's discuss this at the sync meeting; I'm not sure it's important enough to block 
the release.

 
 [vdsm] Fix make distcheck target
 https://bugzilla.redhat.com/show_bug.cgi?id=1098179
 Status: NEW
 
 Run vm with odd number of cores drop libvirt error
 https://bugzilla.redhat.com/show_bug.cgi?id=1070890
 Status: New
 
 
 


-- 
Sandro Bonazzola
Better technology. Faster innovation. Powered by community collaboration.
See how it works at redhat.com
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] engine upgrade 3.2.2 -- 3.2.3 Database rename failed

2014-05-21 Thread Sven Kieske
Hi,

I don't know the exact resolution for this, but I'll add some people
who managed to make it work, following this tutorial:
http://wiki.dreyou.org/dokuwiki/doku.php?id=ovirt_rpm_start33

See this thread on the users ML:

http://lists.ovirt.org/pipermail/users/2013-December/018341.html

HTH


On 20.05.2014 17:00, Neil wrote:
 Hi guys,
 
 I'm trying to upgrade from Dreyou to the official repo, I've installed
 the official 3.2 repo (I'll do the 3.3 update once this works). I've
 updated to ovirt-engine-setup.noarch 0:3.2.3-1.el6 and when I run
 engine upgrade it bombs out when trying to rename my database with the
 following error...
 
 [root@engine01 /]#  cat
 /var/log/ovirt-engine/ovirt-engine-upgrade_2014_05_20_16_34_21.log
 2014-05-20 16:34:21::DEBUG::common_utils::804::root:: found existing
 pgpass file /etc/ovirt-engine/.pgpass, fetching DB host value
 2014-05-20 16:34:21::DEBUG::common_utils::804::root:: found existing
 pgpass file /etc/ovirt-engine/.pgpass, fetching DB port value
 2014-05-20 16:34:21::DEBUG::common_utils::804::root:: found existing
 pgpass file /etc/ovirt-engine/.pgpass, fetching DB user value
 2014-05-20 16:34:21::DEBUG::common_utils::332::root:: YUM: VERB:
 Loaded plugins: refresh-packagekit, versionlock
 2014-05-20 16:34:21::INFO::engine-upgrade::969::root:: Info:
 /etc/ovirt-engine/.pgpass file found. Continue.
 2014-05-20 16:34:21::DEBUG::common_utils::804::root:: found existing
 pgpass file /etc/ovirt-engine/.pgpass, fetching DB admin value
 2014-05-20 16:34:21::DEBUG::common_utils::804::root:: found existing
 pgpass file /etc/ovirt-engine/.pgpass, fetching DB host value
 2014-05-20 16:34:21::DEBUG::common_utils::804::root:: found existing
 pgpass file /etc/ovirt-engine/.pgpass, fetching DB port value
 2014-05-20 16:34:21::DEBUG::common_utils::481::root:: running sql
 query 'SELECT pg_database_size('engine')' on db server: 'localhost'.
 2014-05-20 16:34:21::DEBUG::common_utils::434::root:: Executing
 command -- '/usr/bin/psql -h localhost -p 5432 -U postgres -d
 postgres -c SELECT pg_database_size('engine')'
 2014-05-20 16:34:21::DEBUG::common_utils::472::root:: output =  
 pg_database_size
 --
  11976708
 (1 row)
 
 
 2014-05-20 16:34:21::DEBUG::common_utils::473::root:: stderr =
 2014-05-20 16:34:21::DEBUG::common_utils::474::root:: retcode = 0
 2014-05-20 16:34:21::DEBUG::common_utils::1567::root:: Found mount
 point of '/var/cache/yum' at '/'
 2014-05-20 16:34:21::DEBUG::common_utils::663::root:: Checking
 available space on /var/cache/yum
 2014-05-20 16:34:21::DEBUG::common_utils::668::root:: Available space
 on /var/cache/yum is 172329
 2014-05-20 16:34:21::DEBUG::common_utils::1567::root:: Found mount
 point of '/var/lib/ovirt-engine/backups' at '/'
 2014-05-20 16:34:21::DEBUG::common_utils::663::root:: Checking
 available space on /var/lib/ovirt-engine/backups
 2014-05-20 16:34:21::DEBUG::common_utils::668::root:: Available space
 on /var/lib/ovirt-engine/backups is 172329
 2014-05-20 16:34:21::DEBUG::common_utils::1567::root:: Found mount
 point of '/usr/share' at '/'
 2014-05-20 16:34:21::DEBUG::common_utils::663::root:: Checking
 available space on /usr/share
 2014-05-20 16:34:21::DEBUG::common_utils::668::root:: Available space
 on /usr/share is 172329
 2014-05-20 16:34:21::DEBUG::common_utils::1590::root:: Mount points
 are: {'/': {'required': 1511, 'free': 172329}}
 2014-05-20 16:34:21::DEBUG::common_utils::1599::root:: Comparing free
 space 172329 MB with required 1511 MB
 2014-05-20 16:34:21::DEBUG::common_utils::481::root:: running sql
 query 'SELECT compatibility_version FROM storage_pool;' on db server:
 'localhost'.
 2014-05-20 16:34:21::DEBUG::common_utils::434::root:: Executing
 command -- '/usr/bin/psql -h localhost -p 5432 -U engine -d engine -c
 SELECT compatibility_version FROM storage_pool;'
 2014-05-20 16:34:21::DEBUG::common_utils::472::root:: output =
 compatibility_version
 ---
  3.2
 (1 row)
 
 
 2014-05-20 16:34:21::DEBUG::common_utils::473::root:: stderr =
 2014-05-20 16:34:21::DEBUG::common_utils::474::root:: retcode = 0
 2014-05-20 16:34:21::DEBUG::common_utils::481::root:: running sql
 query 'SELECT compatibility_version FROM vds_groups;' on db server:
 'localhost'.
 2014-05-20 16:34:21::DEBUG::common_utils::434::root:: Executing
 command -- '/usr/bin/psql -h localhost -p 5432 -U engine -d engine -c
 SELECT compatibility_version FROM vds_groups;'
 2014-05-20 16:34:21::DEBUG::common_utils::472::root:: output =
 compatibility_version
 ---
  3.2
 (1 row)
 
 
 2014-05-20 16:34:21::DEBUG::common_utils::473::root:: stderr =
 2014-05-20 16:34:21::DEBUG::common_utils::474::root:: retcode = 0
 2014-05-20 16:34:21::DEBUG::engine-upgrade::280::root:: Yum unlock started
 2014-05-20 16:34:21::DEBUG::engine-upgrade::292::root:: Yum unlock
 completed successfully
 2014-05-20 16:34:22::DEBUG::common_utils::332::root:: YUM: VERB:
 Downloading: repomdu5SB03tmp.xml (0%)
 2014-05-20 

Re: [ovirt-users] glusterfs tips/questions

2014-05-21 Thread Kanagaraj


On 05/21/2014 02:04 PM, Gabi C wrote:

Hello!

I have an ovirt setup, 3.4.1, up to date, with gluster package 
3.5.0-3.fc19 on all 3 nodes. The Glusterfs setup is replicated on 3 
bricks. On 2 nodes 'gluster peer status' returns 2 peers connected with 
their UUIDs. On the third node 'gluster peer status' returns 3 peers, out of 
which two refer to the same node/IP but with different UUIDs.


On every node you can find the peers in /var/lib/glusterd/peers/

You can get the UUID of the current node using the command gluster 
system:: uuid get


From this you can find which file is wrong in the above location.
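
Something along these lines, run on each node, should make the comparison easy 
(purely illustrative):

# UUID of this node
gluster system:: uuid get
# peer files this node knows about; each file is named after a peer's UUID
# and records that peer's UUID and hostname
ls /var/lib/glusterd/peers/
cat /var/lib/glusterd/peers/*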

[Adding gluster-us...@ovirt.org]



What I have tried:
- stopped gluster volumes, put the 3rd node in maintenance, rebooted - no 
effect;
- stopped volumes, removed the bricks belonging to the 3rd node, re-added it, 
started the volumes, but still no effect.



Any ideas, hints?

TIA


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] engine upgrade 3.2.2 -- 3.2.3 Database rename failed

2014-05-21 Thread Neil
Hi guys,

Just a little more info on the problem. I've upgraded another oVirt
system from Dreyou before and it worked perfectly; however, on this
particular system we had to restore from backups (DB, PKI and
/etc/ovirt-engine) as the physical machine that was hosting the
engine died, so perhaps this is the reason we are encountering this problem
this time around...

Any help is greatly appreciated.

Thank you.

Regards.

Neil Wilson.



On Wed, May 21, 2014 at 11:46 AM, Sven Kieske s.kie...@mittwald.de wrote:
 Hi,

 I don't know the exact resolution for this, but I'll add some people
 who managed to make it work, following this tutorial:
 http://wiki.dreyou.org/dokuwiki/doku.php?id=ovirt_rpm_start33

 See this thread on the users ML:

 http://lists.ovirt.org/pipermail/users/2013-December/018341.html

 HTH


 On 20.05.2014 17:00, Neil wrote:
 Hi guys,

 I'm trying to upgrade from Dreyou to the official repo, I've installed
 the official 3.2 repo (I'll do the 3.3 update once this works). I've
 updated to ovirt-engine-setup.noarch 0:3.2.3-1.el6 and when I run
 engine upgrade it bombs out when trying to rename my database with the
 following error...

 [root@engine01 /]#  cat
 /var/log/ovirt-engine/ovirt-engine-upgrade_2014_05_20_16_34_21.log
 2014-05-20 16:34:21::DEBUG::common_utils::804::root:: found existing
 pgpass file /etc/ovirt-engine/.pgpass, fetching DB host value
 2014-05-20 16:34:21::DEBUG::common_utils::804::root:: found existing
 pgpass file /etc/ovirt-engine/.pgpass, fetching DB port value
 2014-05-20 16:34:21::DEBUG::common_utils::804::root:: found existing
 pgpass file /etc/ovirt-engine/.pgpass, fetching DB user value
 2014-05-20 16:34:21::DEBUG::common_utils::332::root:: YUM: VERB:
 Loaded plugins: refresh-packagekit, versionlock
 2014-05-20 16:34:21::INFO::engine-upgrade::969::root:: Info:
 /etc/ovirt-engine/.pgpass file found. Continue.
 2014-05-20 16:34:21::DEBUG::common_utils::804::root:: found existing
 pgpass file /etc/ovirt-engine/.pgpass, fetching DB admin value
 2014-05-20 16:34:21::DEBUG::common_utils::804::root:: found existing
 pgpass file /etc/ovirt-engine/.pgpass, fetching DB host value
 2014-05-20 16:34:21::DEBUG::common_utils::804::root:: found existing
 pgpass file /etc/ovirt-engine/.pgpass, fetching DB port value
 2014-05-20 16:34:21::DEBUG::common_utils::481::root:: running sql
 query 'SELECT pg_database_size('engine')' on db server: 'localhost'.
 2014-05-20 16:34:21::DEBUG::common_utils::434::root:: Executing
 command -- '/usr/bin/psql -h localhost -p 5432 -U postgres -d
 postgres -c SELECT pg_database_size('engine')'
 2014-05-20 16:34:21::DEBUG::common_utils::472::root:: output =  
 pg_database_size
 --
  11976708
 (1 row)


 2014-05-20 16:34:21::DEBUG::common_utils::473::root:: stderr =
 2014-05-20 16:34:21::DEBUG::common_utils::474::root:: retcode = 0
 2014-05-20 16:34:21::DEBUG::common_utils::1567::root:: Found mount
 point of '/var/cache/yum' at '/'
 2014-05-20 16:34:21::DEBUG::common_utils::663::root:: Checking
 available space on /var/cache/yum
 2014-05-20 16:34:21::DEBUG::common_utils::668::root:: Available space
 on /var/cache/yum is 172329
 2014-05-20 16:34:21::DEBUG::common_utils::1567::root:: Found mount
 point of '/var/lib/ovirt-engine/backups' at '/'
 2014-05-20 16:34:21::DEBUG::common_utils::663::root:: Checking
 available space on /var/lib/ovirt-engine/backups
 2014-05-20 16:34:21::DEBUG::common_utils::668::root:: Available space
 on /var/lib/ovirt-engine/backups is 172329
 2014-05-20 16:34:21::DEBUG::common_utils::1567::root:: Found mount
 point of '/usr/share' at '/'
 2014-05-20 16:34:21::DEBUG::common_utils::663::root:: Checking
 available space on /usr/share
 2014-05-20 16:34:21::DEBUG::common_utils::668::root:: Available space
 on /usr/share is 172329
 2014-05-20 16:34:21::DEBUG::common_utils::1590::root:: Mount points
 are: {'/': {'required': 1511, 'free': 172329}}
 2014-05-20 16:34:21::DEBUG::common_utils::1599::root:: Comparing free
 space 172329 MB with required 1511 MB
 2014-05-20 16:34:21::DEBUG::common_utils::481::root:: running sql
 query 'SELECT compatibility_version FROM storage_pool;' on db server:
 'localhost'.
 2014-05-20 16:34:21::DEBUG::common_utils::434::root:: Executing
 command -- '/usr/bin/psql -h localhost -p 5432 -U engine -d engine -c
 SELECT compatibility_version FROM storage_pool;'
 2014-05-20 16:34:21::DEBUG::common_utils::472::root:: output =
 compatibility_version
 ---
  3.2
 (1 row)


 2014-05-20 16:34:21::DEBUG::common_utils::473::root:: stderr =
 2014-05-20 16:34:21::DEBUG::common_utils::474::root:: retcode = 0
 2014-05-20 16:34:21::DEBUG::common_utils::481::root:: running sql
 query 'SELECT compatibility_version FROM vds_groups;' on db server:
 'localhost'.
 2014-05-20 16:34:21::DEBUG::common_utils::434::root:: Executing
 command -- '/usr/bin/psql -h localhost -p 5432 -U engine -d engine -c
 SELECT compatibility_version FROM vds_groups;'
 2014-05-20 

Re: [ovirt-users] glusterfs tips/questions

2014-05-21 Thread Gabi C
On the affected node:

gluster peer status

gluster peer status
Number of Peers: 3

Hostname: 10.125.1.194
Uuid: 85c2a08c-a955-47cc-a924-cf66c6814654
State: Peer in Cluster (Connected)

Hostname: 10.125.1.196
Uuid: c22e41b8-2818-4a96-a6df-a237517836d6
State: Peer in Cluster (Connected)

Hostname: 10.125.1.194
Uuid: 85c2a08c-a955-47cc-a924-cf66c6814654
State: Peer in Cluster (Connected)





ls -la /var/lib/gluster



ls -la /var/lib/glusterd/peers/
total 20
drwxr-xr-x. 2 root root 4096 May 21 11:10 .
drwxr-xr-x. 9 root root 4096 May 21 11:09 ..
-rw---. 1 root root   73 May 21 11:10
85c2a08c-a955-47cc-a924-cf66c6814654
-rw---. 1 root root   73 May 21 10:52
c22e41b8-2818-4a96-a6df-a237517836d6
-rw---. 1 root root   73 May 21 11:10
d95558a0-a306-4812-aec2-a361a9ddde3e


Should I delete d95558a0-a306-4812-aec2-a361a9ddde3e?





On Wed, May 21, 2014 at 12:00 PM, Kanagaraj kmayi...@redhat.com wrote:


 On 05/21/2014 02:04 PM, Gabi C wrote:

   Hello!

  I have an ovirt setup, 3.4.1, up-to date, with gluster package
 3.5.0-3.fc19 on all 3 nodes. Glusterfs setup is replicated on 3 bricks. On
 2 nodes 'gluster peeer status' raise 2 peer connected with it's UUID. On
 third node 'gluster peer status' raise 3 peers, out of which, two reffer to
 same node/IP but different UUID.


 in every node you can find the peers in /var/lib/glusterd/peers/

 you can get the uuid of the current node using the command gluster
 system:: uuid get

 From this you can find which file is wrong in the above location.

 [Adding gluster-us...@ovirt.org]


  What I have tried:
  - stopped gluster volumes, put 3rd node in maintenace, reboor - no
 effect;
  - stopped  volumes, removed bricks belonging to 3rd node, readded it,
 start volumes but still no effect.


  Any ideas, hints?

  TIA


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] [ovirt-devel] [QE][ACTION NEEDED] oVirt 3.4.2 RC status

2014-05-21 Thread Dan Kenigsberg
On Wed, May 21, 2014 at 11:33:13AM +0200, Sandro Bonazzola wrote:
 On 21/05/2014 11:08, Sven Kieske wrote:
  Hi,
  
  and thanks for the list with the open bugs.
  
  Here are my proposed blockers (I won't add
  them directly without dev agreement)
  The list is not ordered in any way:
  
  Cache records in memory
  https://bugzilla.redhat.com/show_bug.cgi?id=870330
  Status: Post
  
  ovirt-log-collector: conflicts with file from package sos = 3.0
  https://bugzilla.redhat.com/show_bug.cgi?id=1037663
  Status: Post
 
 I'm working with sos maintainer for coordinating oVirt 3.4.2 release with sos 
 3.1.1 release fixing the issue
 by moving all oVirt sos plugins to upstream sos fixing conflicts once for all.
 Since a workaround exists (yum downgrade sos to 2.2-31.fc19 still available 
 in F19) I don't think it should be a blocker.
 
 
  
  Firmware missing
  https://bugzilla.redhat.com/show_bug.cgi?id=1063001
  Status: Post
  
  Hot plug CPU - allow over-commit
  https://bugzilla.redhat.com/show_bug.cgi?id=1097195
  Status: New
  
  Stucked tuned service during host deploying
  https://bugzilla.redhat.com/show_bug.cgi?id=1069119
  Status: New
  
  Failed to reconfigure libvirt for VDSM
  https://bugzilla.redhat.com/show_bug.cgi?id=1078309
  Status: NEW
 
 Let's discuss this on sync meeting, not sure it's important enough for 
 blocking the release.
 
  
  [vdsm] Fix make distcheck target
  https://bugzilla.redhat.com/show_bug.cgi?id=1098179
  Status: NEW

And this most certainly should not block a micro release. It is a
code-only change, with no functional effects.

  
  Run vm with odd number of cores drop libvirt error
  https://bugzilla.redhat.com/show_bug.cgi?id=1070890
  Status: New
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] sanlock + gluster recovery -- RFE

2014-05-21 Thread Federico Simoncelli
- Original Message -
 From: Ted Miller tmil...@hcjb.org
 To: users users@ovirt.org
 Sent: Tuesday, May 20, 2014 11:31:42 PM
 Subject: [ovirt-users] sanlock + gluster recovery -- RFE
 
 As you are aware, there is an ongoing split-brain problem with running
 sanlock on replicated gluster storage. Personally, I believe that this is
 the 5th time that I have been bitten by this sanlock+gluster problem.
 
 I believe that the following are true (if not, my entire request is probably
 off base).
 
 
 * ovirt uses sanlock in such a way that when the sanlock storage is on a
 replicated gluster file system, very small storage disruptions can
 result in a gluster split-brain on the sanlock space

Although this is possible (at the moment) we are working hard to avoid it.
The hardest part here is to ensure that the gluster volume is properly
configured.

The suggested configuration for a volume to be used with ovirt is:

Volume Name: (...)
Type: Replicate
Volume ID: (...)
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
(...three bricks...)
Options Reconfigured:
network.ping-timeout: 10
cluster.quorum-type: auto

The two options ping-timeout and quorum-type are really important.
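
If a volume is missing them, they can be set on a live volume with something
like this (the volume name is a placeholder):

gluster volume set VOLNAME network.ping-timeout 10
gluster volume set VOLNAME cluster.quorum-type auto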

You would also need a build where this bug is fixed in order to avoid any
chance of a split-brain:

https://bugzilla.redhat.com/show_bug.cgi?id=1066996

 How did I get into this mess?
 
 ...
 
 What I would like to see in ovirt to help me (and others like me). Alternates
 listed in order from most desirable (automatic) to least desirable (set of
 commands to type, with lots of variables to figure out).

The real solution is to avoid the split-brain altogether. At the moment it
seems that with the suggested configuration and the bug fix we shouldn't
hit a split-brain.

 1. automagic recovery
 
 2. recovery subcommand
 
 3. script
 
 4. commands

I think that the commands to resolve a split-brain should be documented.
I just started a page here:

http://www.ovirt.org/Gluster_Storage_Domain_Reference

Could you add your documentation there? Thanks!
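
Until that page is fleshed out, a very rough sketch of the kind of commands it
should cover (volume name is a placeholder; please verify against the gluster
documentation for your version before acting on anything):

# list entries gluster currently considers split-brained
gluster volume heal VOLNAME info split-brain
# overall self-heal status
gluster volume heal VOLNAME info
# after choosing which replica to keep and removing the bad copy from its
# brick, trigger a heal so the good copy is propagated back
gluster volume heal VOLNAME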

-- 
Federico
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Hosted engine problem - Engine VM will not start

2014-05-21 Thread Bob Doolittle

On 05/21/2014 03:09 AM, Sven Kieske wrote:

I'd want to add that these rules are for NFSv3
asking Bob if he is maybe useing NFSv4 ?


At the moment I don't need either one. I need to solve my major issues 
first. Then when things are working I'll worry about setting up NFS to 
export new domains from my host.


Like - why didn't my default domains get configured properly?

Where is my Data Domain, and why is my ISO Domain unattached?
Why didn't hosted-engine --deploy set this up properly? I took the 
defaults during deployment for domain setup.


When I first log in to webadmin #vms, it shows HostedEngine as green/up.
At #storage it shows my ISO Domain and ovirt-image-repository as 
unattached. No Data Domain.
At #dataCenters it shows my Default datacenter as down/uninitialized.
If I go to #storage and select ISO_DOMAIN and select its Data Center 
tab (#storage-data_center), it doesn't show any Data Centers to attach to.


-Bob



On 21.05.2014 08:43, Sandro Bonazzola wrote:

I'm not saying NFS issue is irrelevant :-)
I'm saying that if you're adding NFS service on the node running hosted engine 
you'll need to configure iptables for allowing to mount the shares.
This means at least opening rpc-bind port 111 and NFS port 2049 and ports 662 
875 892 32769 32803 assuming you've configured NFS with:

RPCRQUOTADOPTS=-p 875
LOCKD_TCPPORT=32803
LOCKD_UDPPORT=32769
RPCMOUNTDOPTS=-p 892
STATDARG=-p 662 -o 2020

Alternative is to use NFS storage on a different host.


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] glusterfs tips/questions

2014-05-21 Thread Gabi C
..or should I:

- stop the volumes
- remove the brick belonging to the affected node
- remove the affected node/peer
- add the node and brick again, then start the volumes?



On Wed, May 21, 2014 at 1:13 PM, Gabi C gab...@gmail.com wrote:

 On afected node:

 gluster peer status

 gluster peer status
 Number of Peers: 3

 Hostname: 10.125.1.194
 Uuid: 85c2a08c-a955-47cc-a924-cf66c6814654
 State: Peer in Cluster (Connected)

 Hostname: 10.125.1.196
 Uuid: c22e41b8-2818-4a96-a6df-a237517836d6
 State: Peer in Cluster (Connected)

 Hostname: 10.125.1.194
 Uuid: 85c2a08c-a955-47cc-a924-cf66c6814654
 State: Peer in Cluster (Connected)





 ls -la /var/lib/gluster



 ls -la /var/lib/glusterd/peers/
 total 20
 drwxr-xr-x. 2 root root 4096 May 21 11:10 .
 drwxr-xr-x. 9 root root 4096 May 21 11:09 ..
 -rw---. 1 root root   73 May 21 11:10
 85c2a08c-a955-47cc-a924-cf66c6814654
 -rw---. 1 root root   73 May 21 10:52
 c22e41b8-2818-4a96-a6df-a237517836d6
 -rw---. 1 root root   73 May 21 11:10
 d95558a0-a306-4812-aec2-a361a9ddde3e


 Shoul I delete d95558a0-a306-4812-aec2-a361a9ddde3e??





 On Wed, May 21, 2014 at 12:00 PM, Kanagaraj kmayi...@redhat.com wrote:


 On 05/21/2014 02:04 PM, Gabi C wrote:

   Hello!

  I have an ovirt setup, 3.4.1, up-to date, with gluster package
 3.5.0-3.fc19 on all 3 nodes. Glusterfs setup is replicated on 3 bricks. On
 2 nodes 'gluster peeer status' raise 2 peer connected with it's UUID. On
 third node 'gluster peer status' raise 3 peers, out of which, two reffer to
 same node/IP but different UUID.


 in every node you can find the peers in /var/lib/glusterd/peers/

 you can get the uuid of the current node using the command gluster
 system:: uuid get

 From this you can find which file is wrong in the above location.

 [Adding gluster-us...@ovirt.org]


  What I have tried:
  - stopped gluster volumes, put 3rd node in maintenace, reboor - no
 effect;
  - stopped  volumes, removed bricks belonging to 3rd node, readded it,
 start volumes but still no effect.


  Any ideas, hints?

  TIA


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Hosted engine problem - Engine VM will not start

2014-05-21 Thread Jiri Moskovcak

On 05/21/2014 02:49 PM, Bob Doolittle wrote:

On 05/21/2014 03:09 AM, Sven Kieske wrote:

I'd want to add that these rules are for NFSv3
asking Bob if he is maybe useing NFSv4 ?


At the moment I don't need either one. I need to solve my major issues
first. Then when things are working I'll worry about setting up NFS to
export new domains from my host.

Like - why didn't my default domains get configured properly?

Where is my Data Domain, and why is my ISO Domain unattached?
Why didn't hosted-engine --deploy set this up properly? I took the
defaults during deployment for domain setup.

When I first login to webadmin #vms, it shows HostedEngine as green/up.
At #storage it shows my ISO Domain and ovirt-image-repository as
unattached. No Data Domain.
At #dataCenters it shows my Default datacenter as down/uninitialized
If I go to #storage and select ISO_DOMAIN and select it's Data Center
tab (#storage-data_center), it doesn't show any Data Centers to attach to.

-Bob


- can you log in to the VM running the engine and try to mount the NFS 
share manually to some directory, just to see if it works? Neither the 
engine nor the setup is responsible for setting up the NFS share (and 
configuring iptables for the NFS server), so it's up to you to set it up 
properly and make sure it's mountable from the engine.
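
Something like this from the engine VM, purely as an illustration (the host name
and export path are placeholders for your actual NFS server and share):

mkdir -p /mnt/nfstest
mount -t nfs your-host.example.com:/path/to/export /mnt/nfstest
# if the mount works, the export and firewall setup are fine; clean up again
umount /mnt/nfstest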


--Jirka





On 21.05.2014 08:43, Sandro Bonazzola wrote:

I'm not saying NFS issue is irrelevant :-)
I'm saying that if you're adding NFS service on the node running
hosted engine you'll need to configure iptables for allowing to mount
the shares.
This means at least opening rpc-bind port 111 and NFS port 2049 and
ports 662 875 892 32769 32803 assuming you've configured NFS with:

RPCRQUOTADOPTS=-p 875
LOCKD_TCPPORT=32803
LOCKD_UDPPORT=32769
RPCMOUNTDOPTS=-p 892
STATDARG=-p 662 -o 2020

Alternative is to use NFS storage on a different host.


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] glusterfs tips/questions

2014-05-21 Thread Kanagaraj

What are the steps which led to this situation?

Did you re-install one of the nodes after forming the cluster, or reboot one, 
which could have changed the IP?



On 05/21/2014 03:43 PM, Gabi C wrote:

On afected node:

gluster peer status

gluster peer status
Number of Peers: 3

Hostname: 10.125.1.194
Uuid: 85c2a08c-a955-47cc-a924-cf66c6814654
State: Peer in Cluster (Connected)

Hostname: 10.125.1.196
Uuid: c22e41b8-2818-4a96-a6df-a237517836d6
State: Peer in Cluster (Connected)

Hostname: 10.125.1.194
Uuid: 85c2a08c-a955-47cc-a924-cf66c6814654
State: Peer in Cluster (Connected)





ls -la /var/lib/gluster



ls -la /var/lib/glusterd/peers/
total 20
drwxr-xr-x. 2 root root 4096 May 21 11:10 .
drwxr-xr-x. 9 root root 4096 May 21 11:09 ..
-rw---. 1 root root   73 May 21 11:10 
85c2a08c-a955-47cc-a924-cf66c6814654
-rw---. 1 root root   73 May 21 10:52 
c22e41b8-2818-4a96-a6df-a237517836d6
-rw---. 1 root root   73 May 21 11:10 
d95558a0-a306-4812-aec2-a361a9ddde3e



Shoul I delete d95558a0-a306-4812-aec2-a361a9ddde3e??





On Wed, May 21, 2014 at 12:00 PM, Kanagaraj kmayi...@redhat.com wrote:



On 05/21/2014 02:04 PM, Gabi C wrote:

Hello!

I have an ovirt setup, 3.4.1, up-to date, with gluster package
3.5.0-3.fc19 on all 3 nodes. Glusterfs setup is replicated on 3
bricks. On 2 nodes 'gluster peeer status' raise 2 peer connected
with it's UUID. On third node 'gluster peer status' raise 3
peers, out of which, two reffer to same node/IP but different UUID.


in every node you can find the peers in /var/lib/glusterd/peers/

you can get the uuid of the current node using the command
gluster system:: uuid get

From this you can find which file is wrong in the above location.

[Adding gluster-us...@ovirt.org]



What I have tried:
- stopped gluster volumes, put 3rd node in maintenace, reboor -
no effect;
- stopped  volumes, removed bricks belonging to 3rd node, readded
it, start volumes but still no effect.


Any ideas, hints?

TIA







___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] glusterfs tips/questions

2014-05-21 Thread Gabi C
Hello!


I haven't changed the IP, nor reinstalled the nodes. All nodes are updated via
yum. All I can think of is that after having some issue with gluster, from the
WebGUI I deleted the VM, deactivated and detached the storage domains (I have 2),
then, *manually*, from one of the nodes, removed the bricks, then detached the peers,
probed them, added the bricks again, brought the volume up, and re-added the storage
domains from the webGUI.


On Wed, May 21, 2014 at 4:26 PM, Kanagaraj kmayi...@redhat.com wrote:

  What are the steps which led this situation?

 Did you re-install one of the nodes after forming the cluster or reboot
 which could have changed the ip?



 On 05/21/2014 03:43 PM, Gabi C wrote:

  On afected node:

 gluster peer status

 gluster peer status
 Number of Peers: 3

 Hostname: 10.125.1.194
 Uuid: 85c2a08c-a955-47cc-a924-cf66c6814654
 State: Peer in Cluster (Connected)

 Hostname: 10.125.1.196
 Uuid: c22e41b8-2818-4a96-a6df-a237517836d6
 State: Peer in Cluster (Connected)

 Hostname: 10.125.1.194
 Uuid: 85c2a08c-a955-47cc-a924-cf66c6814654
 State: Peer in Cluster (Connected)





  ls -la /var/lib/gluster



 ls -la /var/lib/glusterd/peers/
 total 20
 drwxr-xr-x. 2 root root 4096 May 21 11:10 .
 drwxr-xr-x. 9 root root 4096 May 21 11:09 ..
 -rw---. 1 root root   73 May 21 11:10
 85c2a08c-a955-47cc-a924-cf66c6814654
 -rw---. 1 root root   73 May 21 10:52
 c22e41b8-2818-4a96-a6df-a237517836d6
 -rw---. 1 root root   73 May 21 11:10
 d95558a0-a306-4812-aec2-a361a9ddde3e


  Shoul I delete d95558a0-a306-4812-aec2-a361a9ddde3e??





 On Wed, May 21, 2014 at 12:00 PM, Kanagaraj kmayi...@redhat.com wrote:


 On 05/21/2014 02:04 PM, Gabi C wrote:

   Hello!

  I have an ovirt setup, 3.4.1, up-to date, with gluster package
 3.5.0-3.fc19 on all 3 nodes. Glusterfs setup is replicated on 3 bricks. On
 2 nodes 'gluster peeer status' raise 2 peer connected with it's UUID. On
 third node 'gluster peer status' raise 3 peers, out of which, two reffer to
 same node/IP but different UUID.


  in every node you can find the peers in /var/lib/glusterd/peers/

 you can get the uuid of the current node using the command gluster
 system:: uuid get

 From this you can find which file is wrong in the above location.

 [Adding gluster-us...@ovirt.org]


  What I have tried:
  - stopped gluster volumes, put 3rd node in maintenace, reboor - no
 effect;
  - stopped  volumes, removed bricks belonging to 3rd node, readded it,
 start volumes but still no effect.


  Any ideas, hints?

  TIA







___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] glusterfs tips/questions

2014-05-21 Thread Kanagaraj

OK.

I am not sure whether deleting the file or re-running peer probe would be the 
right way to go.


Gluster-users can help you here.


On 05/21/2014 07:08 PM, Gabi C wrote:

Hello!


I haven't change the IP, nor reinstall nodes. All nodes are updated 
via yum. All I can think of was that after having some issue with 
gluster,from WebGUI I deleted VM, deactivate and detach storage 
domains ( I have 2) , than, _manually_, from one of the nodes , remove 
bricks, then detach peers, probe them, add bricks again, bring the 
volume up, and readd storage domains from the webGUI.



On Wed, May 21, 2014 at 4:26 PM, Kanagaraj kmayi...@redhat.com wrote:


What are the steps which led this situation?

Did you re-install one of the nodes after forming the cluster or
reboot which could have changed the ip?



On 05/21/2014 03:43 PM, Gabi C wrote:

On afected node:

gluster peer status

gluster peer status
Number of Peers: 3

Hostname: 10.125.1.194
Uuid: 85c2a08c-a955-47cc-a924-cf66c6814654
State: Peer in Cluster (Connected)

Hostname: 10.125.1.196
Uuid: c22e41b8-2818-4a96-a6df-a237517836d6
State: Peer in Cluster (Connected)

Hostname: 10.125.1.194
Uuid: 85c2a08c-a955-47cc-a924-cf66c6814654
State: Peer in Cluster (Connected)





ls -la /var/lib/gluster



ls -la /var/lib/glusterd/peers/
total 20
drwxr-xr-x. 2 root root 4096 May 21 11:10 .
drwxr-xr-x. 9 root root 4096 May 21 11:09 ..
-rw---. 1 root root   73 May 21 11:10
85c2a08c-a955-47cc-a924-cf66c6814654
-rw---. 1 root root   73 May 21 10:52
c22e41b8-2818-4a96-a6df-a237517836d6
-rw---. 1 root root   73 May 21 11:10
d95558a0-a306-4812-aec2-a361a9ddde3e


Shoul I delete d95558a0-a306-4812-aec2-a361a9ddde3e??





On Wed, May 21, 2014 at 12:00 PM, Kanagaraj kmayi...@redhat.com wrote:


On 05/21/2014 02:04 PM, Gabi C wrote:

Hello!

I have an ovirt setup, 3.4.1, up-to date, with gluster
package 3.5.0-3.fc19 on all 3 nodes. Glusterfs setup is
replicated on 3 bricks. On 2 nodes 'gluster peeer status'
raise 2 peer connected with it's UUID. On third node
'gluster peer status' raise 3 peers, out of which, two
reffer to same node/IP but different UUID.


in every node you can find the peers in /var/lib/glusterd/peers/

you can get the uuid of the current node using the command
gluster system:: uuid get

From this you can find which file is wrong in the above location.

[Adding gluster-us...@ovirt.org]



What I have tried:
- stopped gluster volumes, put 3rd node in maintenace,
reboor - no effect;
- stopped  volumes, removed bricks belonging to 3rd node,
readded it, start volumes but still no effect.


Any ideas, hints?

TIA










___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Hosted engine problem - Engine VM will not start

2014-05-21 Thread Bob Doolittle


On 05/21/2014 09:24 AM, Jiri Moskovcak wrote:

On 05/21/2014 02:49 PM, Bob Doolittle wrote:

On 05/21/2014 03:09 AM, Sven Kieske wrote:

I'd want to add that these rules are for NFSv3
asking Bob if he is maybe useing NFSv4 ?


At the moment I don't need either one. I need to solve my major issues
first. Then when things are working I'll worry about setting up NFS to
export new domains from my host.

Like - why didn't my default domains get configured properly?

Where is my Data Domain, and why is my ISO Domain unattached?
Why didn't hosted-engine --deploy set this up properly? I took the
defaults during deployment for domain setup.

When I first login to webadmin #vms, it shows HostedEngine as green/up.
At #storage it shows my ISO Domain and ovirt-image-repository as
unattached. No Data Domain.
At #dataCenters it shows my Default datacenter as down/uninitialized
If I go to #storage and select ISO_DOMAIN and select it's Data Center
tab (#storage-data_center), it doesn't show any Data Centers to 
attach to.


-Bob


- can you login to the VM running the engine and try to mount the nfs 
share manually to some directory, just to see if it works? Neither 
engine nor setup is responsible for setting the nfs share (and 
configuring iptables for nfs server), so it's up to you to set it up 
properly and make sure it's mountable from engine.


I'm afraid NFS was a red herring. NFS shares from the host to the engine 
are not required for basic oVirt operation, correct?
If I understand Sandro correctly, that should not be affecting my 
storage connections to engine. I'm sorry I brought it up - it was my 
misunderstanding.


I believe the first thing to look at is why the ISO domain is unattached 
to my Default Datacenter, is that correct? Then my Datacenter should 
become operational, and I can add a Data Domain.


I can manually mount the ISO_DOMAIN directory on both my host and my 
engine without issues (it resides on my engine).


So why is my Datacenter not visible when I go to attach my ISO domain?

-Bob

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] oVirt Weekly Meeting Minutes: May 21, 2014

2014-05-21 Thread Brian Proffitt
Minutes: http://ovirt.org/meetings/ovirt/2014/ovirt.2014-05-21-14.02.html
Minutes (text): http://ovirt.org/meetings/ovirt/2014/ovirt.2014-05-21-14.02.txt
Log:
http://ovirt.org/meetings/ovirt/2014/ovirt.2014-05-21-14.02.log.html

=========================
#ovirt: oVirt Weekly Sync
=========================


Meeting started by bkp at 14:02:03 UTC. The full logs are available at
http://ovirt.org/meetings/ovirt/2014/ovirt.2014-05-21-14.02.log.html .



Meeting summary
---------------
* Agenda and Roll Call  (bkp, 14:02:19)
  * infra updates  (bkp, 14:02:42)
  * 3.4.z updates  (bkp, 14:02:42)
  * 3.5 planning  (bkp, 14:02:42)
  * conferences and workshops  (bkp, 14:02:42)
  * other topics  (bkp, 14:02:42)

* infra updates  (bkp, 14:03:01)
  * infra os1 issues solved, will add new slaves in the following days
(bkp, 14:06:28)

* 3.4.z updates  (bkp, 14:08:05)
  * 3.4.z updates 3.4.2: RC scheduled for 2014-05-27  (bkp, 14:11:53)
  * 3.4.z updates SvenKieske sent a list of proposed blockers to be
reviewed:
http://lists.ovirt.org/pipermail/users/2014-May/024518.html  (bkp,
14:11:53)
  * 3.4.z updates Bug 1037663 status: ovirt-log-collector: conflicts
with file from package sos = 3.0, sbonazzo working with sos
maintainer for coordinating oVirt 3.4.2 release. Not a blocker.
(bkp, 14:11:53)
  * 3.4.z updates Discussion is ongoing on mailing list for other bugs
(bkp, 14:11:54)

* 3.5 planning  (bkp, 14:12:20)
  * 3.5 planning UX Phase one of patternfly is in, and the sorting infra
(bkp, 14:14:53)
  * 3.5 planning UX Issues fixed as they are reported.  (bkp, 14:14:53)
  * 3.5 planning SLA All planned 3.5 features will be ready on time,
fingers crossed :)  (bkp, 14:18:22)
  * 3.5 planning SLA NUMA feature: APIs (GUI and RESTful) still missing.
(bkp, 14:18:22)
  * 3.5 planning SLA optaplanner: in advanced development state.  (bkp,
14:18:22)
  * 3.5 planning SLA Limits (blkio and cpu, including disk and cpu
profile, including refactoring current network qos and vnic profile
to suit new sla/profiles infra): we are working on them nowadays (no
real issues there)  (bkp, 14:18:22)
  * 3.5 planning SLA Scheduling RESTful API: developed now - patch this
evening  (bkp, 14:18:22)
  * LINK: http://www.ovirt.org/Features/Self_Hosted_Engine_iSCSI_Support
all patches have been pushed and under review  (sbonazzo, 14:20:07)
  * LINK: http://www.ovirt.org/Features/oVirt_Windows_Guest_Tools
(sbonazzo, 14:20:55)
  * 3.5 planning integration ovirt 3.5.0 alpha released yesterday  (bkp,
14:22:47)
  * 3.5 planning integration ovirt live iso uploaded this afternoon
(bkp, 14:22:47)
  * 3.5 planning integration ovirt node iso will follow  (bkp, 14:22:47)
  * 3.5 planning integration 3.5.0 Second Alpha scheduled for May 30 for
Feature Freeze  (bkp, 14:22:47)
  * 3.5 planning integration
http://www.ovirt.org/Features/Self_Hosted_Engine_iSCSI_Support all
patches have been pushed and under review  (bkp, 14:22:47)
  * 3.5 planning integration There's an issue on additional host setup,
but should be fixed easily. Patch pushed and under review.  (bkp,
14:22:49)
  * 3.5 planning integration F20 support started for ovirt-engine,
hopefully ready for alpha 2  (bkp, 14:22:52)
  * 3.5 planning virt Unfinished features are closer to completion...
but nothing got in yet.  (bkp, 14:29:06)
  * 3.5 planning virt Last week spent fixing major bugs.  (bkp,
14:29:06)
  * 3.5 planning Gluster Volume capacity - vdsm dependency on
libgfapi-python needs to be added  (bkp, 14:34:31)
  * 3.5 planning Gluster Volume profile - review comments on patches
incorporated and submitted, need final review and approval  (bkp,
14:34:31)
  * 3.5 planning Gluster REST API in progress  (bkp, 14:34:31)
  * 3.5 planning node All features are in progress.  (bkp, 14:37:59)
  * 3.5 planning node Two features up for review (generic registration
and hosted engine plugin)  (bkp, 14:37:59)
  * 3.5 planning node Appliances can also be built, now all needs to be
reviewed and tested.  (bkp, 14:37:59)
  * 3.5 planning node  ETA for a node with the bits in place by early
next week.  (bkp, 14:37:59)
  * 3.5 planning network All Neutron appliance oVirt code is merged.
Feature now depends on some OpenStack repository processes that are
async with oVirt releases. oVirt side completed.  (bkp, 14:42:17)
  * 3.5 planning network The two MAC pool features are in a tough
review, so they're in danger even for the new deadline of end of
May.  (bkp, 14:42:17)
  * 3.5 planning network Progress with the RHEL bug on which bridge_opts
depends upon is unknown. danken adds that it is not a blocker for
3.5  (bkp, 14:42:17)
  * 3.5 planning network [UPDATE] Progress with the RHEL bug on which
bridge_opts depends upon done (with an asterisk).  (bkp, 14:43:34)
  * 3.5 planning storage Store OVF on any domains - merged  (bkp,
14:49:12)
  

[ovirt-users] sanlock + gluster recovery -- RFE

2014-05-21 Thread Giuseppe Ragusa
Hi,

 - Original Message -
  From: Ted Miller tmiller at hcjb.org
  To: users users at ovirt.org
  Sent: Tuesday, May 20, 2014 11:31:42 PM
  Subject: [ovirt-users] sanlock + gluster recovery -- RFE
  
  As you are aware, there is an ongoing split-brain problem with running
  sanlock on replicated gluster storage. Personally, I believe that this is
  the 5th time that I have been bitten by this sanlock+gluster problem.
  
  I believe that the following are true (if not, my entire request is probably
  off base).
  
  
  * ovirt uses sanlock in such a way that when the sanlock storage is on a
  replicated gluster file system, very small storage disruptions can
  result in a gluster split-brain on the sanlock space
 
 Although this is possible (at the moment) we are working hard to avoid it.
 The hardest part here is to ensure that the gluster volume is properly
 configured.
 
 The suggested configuration for a volume to be used with ovirt is:
 
 Volume Name: (...)
 Type: Replicate
 Volume ID: (...)
 Status: Started
 Number of Bricks: 1 x 3 = 3
 Transport-type: tcp
 Bricks:
 (...three bricks...)
 Options Reconfigured:
 network.ping-timeout: 10
 cluster.quorum-type: auto
 
 The two options ping-timeout and quorum-type are really important.
 
 You would also need a build where this bug is fixed in order to avoid any
 chance of a split-brain:
 
 https://bugzilla.redhat.com/show_bug.cgi?id=1066996

It seems that the aforementioned bug is peculiar to 3-brick setups.

I understand that a 3-brick setup can allow proper quorum formation without 
resorting to the first-configured-brick-has-more-weight convention used with only 
2 bricks and quorum auto (which makes one node special, so it is not properly 
any-single-fault tolerant).

But, since we are on ovirt-users, is there a similar suggested configuration 
for a 2-host oVirt+GlusterFS setup with oVirt-side power management properly 
configured and tested working?
I mean a configuration where any host can go south and oVirt (through the 
other one) fences it (forcibly powering it off, with confirmation from IPMI or 
similar) and then restarts the HA-marked VMs that were running there, all the 
while keeping the underlying GlusterFS-based storage domains responsive and 
readable/writeable (maybe apart from a lapse between detected other-node 
unresponsiveness and confirmed fencing)?

Furthermore: is such a suggested configuration possible in a self-hosted-engine 
scenario?

Regards,
Giuseppe

  How did I get into this mess?
  
  ...
  
  What I would like to see in ovirt to help me (and others like me). 
  Alternates
  listed in order from most desirable (automatic) to least desirable (set of
  commands to type, with lots of variables to figure out).
 
 The real solution is to avoid the split-brain altogether. At the moment it
 seems that using the suggested configurations and the bug fix we shouldn't
 hit a split-brain.
 
  1. automagic recovery
  
  2. recovery subcommand
  
  3. script
  
  4. commands
 
 I think that the commands to resolve a split-brain should be documented.
 I just started a page here:
 
 http://www.ovirt.org/Gluster_Storage_Domain_Reference
 
 Could you add your documentation there? Thanks!
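
 (For reference in the meantime, a rough sketch of the manual recovery steps commonly described for a replica-2 split-brain on the sanlock ids file; VOLNAME, the brick path and the gfid path are placeholders, and the copy to remove must be the one you have decided is stale:)

   # list files gluster currently flags as split-brained on the volume
   gluster volume heal VOLNAME info split-brain

   # on the brick holding the stale copy, remove that copy and its gfid
   # hard link under .glusterfs (both paths here are placeholders)
   rm /export/brick1/PATH/TO/dom_md/ids
   rm /export/brick1/.glusterfs/GF/ID/GFID-OF-THE-FILE

   # trigger a full self-heal so the surviving copy is replicated back
   gluster volume heal VOLNAME full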
 
 -- 
 Federico

  ___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Fwd: [Users] Ovirt 3.4 EqualLogic multipath Bug 953343

2014-05-21 Thread Gary Lloyd
Hi

I was just wondering if iSCSI multipathing is supported yet on Direct LUNs?
I have deployed 3.4.0.1 but I can only see the option for iSCSI
multipathing on storage domains.
We would be glad if it could be, as it would save us having to inject new code
into our vdsm nodes with each new version.

Thanks


*Gary Lloyd*
--
IT Services
Keele University
---

Guys,

Please note that this feature currently may not work. I resolved
several bugs related to this feature, but some of my patches are still
waiting to be merged.

Regards,

Sergey

- Original Message -
 From: Maor Lipchuk mlipc...@redhat.com
 To: Gary Lloyd g.ll...@keele.ac.uk, users@ovirt.org, Sergey Gotliv 
sgot...@redhat.com
 Sent: Thursday, March 27, 2014 6:50:10 PM
 Subject: Re: [Users] Ovirt 3.4 EqualLogic multipath Bug 953343

 IIRC it should also support direct luns as well.
 Sergey?

 regards,
 Maor

 On 03/27/2014 06:25 PM, Gary Lloyd wrote:
  Hi I have just had a look at this thanks. Whilst it seemed promising we
  are in a situation where we use Direct Lun for all our production VM's
  in order to take advantage of being able to individually replicate and
  restore vm volumes using the SAN tools. Is multipath supported for
  Direct Luns or only data domains ?
 
  Thanks
 
  /Gary Lloyd/
  --
  IT Services
  Keele University
  ---
 
 
  On 27 March 2014 16:02, Maor Lipchuk mlipc...@redhat.com
  mailto:mlipc...@redhat.com wrote:
 
  Hi Gary,
 
  Please take a look at
  http://www.ovirt.org/Feature/iSCSI-Multipath#User_Experience
 
  Regards,
  Maor
 
  On 03/27/2014 05:59 PM, Gary Lloyd wrote:
   Hello
  
   I have just deployed Ovirt 3.4 on our test environment. Does
  anyone know
   how the ISCSI multipath issue is resolved ? At the moment it is
  behaving
   as before and only opening one session per lun ( we bodged vdsm
   python
   code in previous releases to get it to work).
  
   The Planning sheet shows that its fixed but I am not sure what to
do
   next:
 
https://docs.google.com/spreadsheet/ccc?key=0AuAtmJW_VMCRdHJ6N1M3d1F1UTJTS1dSMnZwMF9XWVEusp=drive_web#gid=0
  
  
   Thanks
  
   /Gary Lloyd/
   --
   IT Services
   Keele University
   ---
  
  
   ___
   Users mailing list
   Users@ovirt.org mailto:Users@ovirt.org
   http://lists.ovirt.org/mailman/listinfo/users
  
 
 


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users



-- 
Martin Goldstone
IT Systems Administrator - Finance & IT
Keele University, Keele, Staffordshire, United Kingdom, ST5 5BG
Telephone: +44 1782 734457
G+: http://google.com/+MartinGoldstoneKeele
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] sanlock + gluster recovery -- RFE

2014-05-21 Thread Federico Simoncelli
- Original Message -
 From: Giuseppe Ragusa giuseppe.rag...@hotmail.com
 To: fsimo...@redhat.com
 Cc: users@ovirt.org
 Sent: Wednesday, May 21, 2014 5:15:30 PM
 Subject: sanlock + gluster recovery -- RFE
 
 Hi,
 
  - Original Message -
   From: Ted Miller tmiller at hcjb.org
   To: users users at ovirt.org
   Sent: Tuesday, May 20, 2014 11:31:42 PM
   Subject: [ovirt-users] sanlock + gluster recovery -- RFE
   
   As you are aware, there is an ongoing split-brain problem with running
   sanlock on replicated gluster storage. Personally, I believe that this is
   the 5th time that I have been bitten by this sanlock+gluster problem.
   
   I believe that the following are true (if not, my entire request is
   probably
   off base).
   
   
   * ovirt uses sanlock in such a way that when the sanlock storage is
   on a
   replicated gluster file system, very small storage disruptions can
   result in a gluster split-brain on the sanlock space
  
  Although this is possible (at the moment) we are working hard to avoid it.
  The hardest part here is to ensure that the gluster volume is properly
  configured.
  
  The suggested configuration for a volume to be used with ovirt is:
  
  Volume Name: (...)
  Type: Replicate
  Volume ID: (...)
  Status: Started
  Number of Bricks: 1 x 3 = 3
  Transport-type: tcp
  Bricks:
  (...three bricks...)
  Options Reconfigured:
  network.ping-timeout: 10
  cluster.quorum-type: auto
  
  The two options ping-timeout and quorum-type are really important.
  
  You would also need a build where this bug is fixed in order to avoid any
  chance of a split-brain:
  
  https://bugzilla.redhat.com/show_bug.cgi?id=1066996
 
 It seems that the aforementioned bug is peculiar to 3-bricks setups.
 
 I understand that a 3-bricks setup can allow proper quorum formation without
 resorting to first-configured-brick-has-more-weight convention used with
 only 2 bricks and quorum auto (which makes one node special, so not
 properly any-single-fault tolerant).

Correct.

 But, since we are on ovirt-users, is there a similar suggested configuration
 for a 2-hosts setup oVirt+GlusterFS with oVirt-side power management
 properly configured and tested-working?
 I mean a configuration where any host can go south and oVirt (through the
 other one) fences it (forcibly powering it off with confirmation from IPMI
 or similar) then restarts HA-marked vms that were running there, all the
 while keeping the underlying GlusterFS-based storage domains responsive and
 readable/writeable (maybe apart from a lapse between detected other-node
 unresponsiveness and confirmed fencing)?

We already had a discussion with gluster asking if it was possible to
add fencing to the replica 2 quorum/consistency mechanism.

The idea is that as soon as you can't replicate a write you have to
freeze all IO until either the connection is re-established or you
know that the other host has been killed.

Adding Vijay.
-- 
Federico
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] emulated machine error

2014-05-21 Thread Nathanaël Blanchet
Hello
I used to run oVirt 3.2.2 installed from the dreyou repo and it worked like a 
charm until now. I managed to upgrade to the official 3.3.5 repository, but I 
didn't pay attention to the host vdsm upgrade and installed vdsm 4.14, so the 
3.3.5 engine complained that it wasn't the appropriate vdsm. I then decided to 
upgrade the engine to 3.4.0 (el6), but this time none of my 6 hosts gets 
successfully activated, and the error message says that the host's emulated 
machine is not the expected one. I updated all of them to 6.5, with and without 
the official qemu, and it is the same. I remember I changed the 3.3 cluster 
compatibility to 3.4, but I can't roll back to 3.3 compatibility (it tells me 
it can't be decreased). What can I do now? Fortunately I saved the initial 3.2 
webadmin image, so I can revert back to my functional 3.2 webadmin.
Is there any reported issue like mine?
Thank you for your help.
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] sanlock + gluster recovery -- RFE

2014-05-21 Thread Ted Miller


On 5/21/2014 11:15 AM, Giuseppe Ragusa wrote:

Hi,

 - Original Message -
  From: Ted Miller tmiller at hcjb.org
  To: users users at ovirt.org
  Sent: Tuesday, May 20, 2014 11:31:42 PM
  Subject: [ovirt-users] sanlock + gluster recovery -- RFE
 
  As you are aware, there is an ongoing split-brain problem with running
  sanlock on replicated gluster storage. Personally, I believe that this is
  the 5th time that I have been bitten by this sanlock+gluster problem.
 
  I believe that the following are true (if not, my entire request is probably
  off base).
 
 
  * ovirt uses sanlock in such a way that when the sanlock storage is on a
  replicated gluster file system, very small storage disruptions can
  result in a gluster split-brain on the sanlock space

 Although this is possible (at the moment) we are working hard to avoid it.
 The hardest part here is to ensure that the gluster volume is properly
 configured.

 The suggested configuration for a volume to be used with ovirt is:

 Volume Name: (...)
 Type: Replicate
 Volume ID: (...)
 Status: Started
 Number of Bricks: 1 x 3 = 3
 Transport-type: tcp
 Bricks:
 (...three bricks...)
 Options Reconfigured:
 network.ping-timeout: 10
 cluster.quorum-type: auto

 The two options ping-timeout and quorum-type are really important.

 You would also need a build where this bug is fixed in order to avoid any
 chance of a split-brain:

 https://bugzilla.redhat.com/show_bug.cgi?id=1066996

It seems that the aforementioned bug is peculiar to 3-bricks setups.

I understand that a 3-bricks setup can allow proper quorum formation 
without resorting to first-configured-brick-has-more-weight convention 
used with only 2 bricks and quorum auto (which makes one node special, 
so not properly any-single-fault tolerant).


But, since we are on ovirt-users, is there a similar suggested 
configuration for a 2-hosts setup oVirt+GlusterFS with oVirt-side power 
management properly configured and tested-working?
I mean a configuration where any host can go south and oVirt (through the 
other one) fences it (forcibly powering it off with confirmation from IPMI 
or similar) then restarts HA-marked vms that were running there, all the 
while keeping the underlying GlusterFS-based storage domains responsive and 
readable/writeable (maybe apart from a lapse between detected other-node 
unresponsiveness and confirmed fencing)?


Furthermore: is such a suggested configuration possible in a 
self-hosted-engine scenario?


Regards,
Giuseppe

  How did I get into this mess?
 
  ...
 
  What I would like to see in ovirt to help me (and others like me). Alternates
  listed in order from most desirable (automatic) to least desirable (set of
  commands to type, with lots of variables to figure out).

 The real solution is to avoid the split-brain altogether. At the moment it
 seems that using the suggested configurations and the bug fix we shouldn't
 hit a split-brain.

  1. automagic recovery
 
  2. recovery subcommand
 
  3. script
 
  4. commands

 I think that the commands to resolve a split-brain should be documented.
 I just started a page here:

 http://www.ovirt.org/Gluster_Storage_Domain_Reference
I suggest you add these lines to the Gluster configuration, as I have seen 
this come up multiple times on the User list:


storage.owner-uid: 36
storage.owner-gid: 36
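
(For reference, a minimal sketch of applying these two options, together with the quorum/ping-timeout settings suggested earlier in the thread, via the gluster CLI; VOLNAME is a placeholder for the actual volume name:)

  gluster volume set VOLNAME network.ping-timeout 10
  gluster volume set VOLNAME cluster.quorum-type auto
  # bricks need to be owned by vdsm:kvm (uid/gid 36) for oVirt to use them
  gluster volume set VOLNAME storage.owner-uid 36
  gluster volume set VOLNAME storage.owner-gid 36
  # check that the options took effect
  gluster volume info VOLNAME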

Ted Miller
Elkhart, IN, USA

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Ovirt WebAdmin Portal

2014-05-21 Thread Carlos Castillo
regards,

My name is Carlos and I'm an IT admin. I recently noticed that my oVirt Web
Admin portal shows 0% memory usage for all my virtual machines, a situation
that does not seem right to me.

I have tried to search for information about why this may be happening,
but I have not found anything useful, so I'm turning to this list for
ideas.

My oVirt Engine Version: 3.3.1-2.el6,
O.S. CentOS release 6.4 (Final)


-- 
Carlos J. Castillo
--
Ingeniero de Soluciones de TI
+58 426 2542313
@Dr4g0nKn1ght
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Ovirt WebAdmin Portal

2014-05-21 Thread John Taylor
On Wed, May 21, 2014 at 4:17 PM, Carlos Castillo 
carlos.casti...@globalr.net wrote:

 regards,

 My Name is Carlos and I'm a I:T Admin. I recently noticed that my oVirt
 Web Admin portal shows 0% memory usage for all my virtual machines, a
 situation that does not seem right to me.

 I have tried to search for information about why it may be happening that,
 but I have not found anything useful, so I go to this list looking for
 ideas?

 My oVirt Engine Version: 3.3.1-2.el6,
 O.S. CentOS release 6.4 (Final)


 --
 Carlos J. Castillo

 --
 Ingeniero de Soluciones de TI
 +58 426 2542313
 @Dr4g0nKn1ght


 ___
 Users mailing list
 Users@ovirt.org
 http://lists.ovirt.org/mailman/listinfo/users

 Hi Carlos,
The memory usage is populated by the guest agent, so you need it installed
on the guests.
See http://www.ovirt.org/Guest_Agent
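
(A quick sketch of installing it on an EL6/CentOS 6 guest, assuming the ovirt-guest-agent package is available in the guest's configured repositories:)

  # inside the guest, as root
  yum install -y ovirt-guest-agent
  service ovirt-guest-agent start
  chkconfig ovirt-guest-agent on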

-John
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Ovirt WebAdmin Portal

2014-05-21 Thread Karli Sjöberg

Den 21 maj 2014 22:18 skrev Carlos Castillo carlos.casti...@globalr.net:

 regards,

 My Name is Carlos and I'm a I:T Admin. I recently noticed that my oVirt Web 
 Admin portal shows 0% memory usage for all my virtual machines, a situation 
 that does not seem right to me.

 I have tried to search for information about why it may be happening that, 
 but I have not found anything useful, so I go to this list looking for ideas?

You need to install the ovirt-guest-agent in all VMs for that to show. You 
will also get the VM’s IP, FQDN, some installed packages and who’s currently 
logged into it, among other things.

/K


 My oVirt Engine Version: 3.3.1-2.el6,
 O.S. CentOS release 6.4 (Final)


 --
 Carlos J. Castillo
 --
 Ingeniero de Soluciones de TI
 +58 426 2542313
 @Dr4g0nKn1ght

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] post glusterfs 3.4 - 3.5 upgrade issue in ovirt (3.4.0-1.fc19): bricks unavailable

2014-05-21 Thread Alastair Neil
I just did a rolling upgrade of my gluster storage cluster to the latest
3.5 bits. This all seems to have gone smoothly and all the volumes are
online. All volumes are replicated 1x2.

The oVirt console now insists that two of my volumes, including the
vm-store volume with my VMs happily running on it, have no bricks up.

It reports "Up but all bricks are down".

This would seem to be impossible. Gluster on the nodes itself reports no
issues:

[root@gluster1 ~]# gluster volume status vm-store
 Status of volume: vm-store
 Gluster process Port Online Pid

 --
 Brick gluster0:/export/brick0/vm-store 49158 Y 2675
 Brick gluster1:/export/brick4/vm-store 49158 Y 2309
 NFS Server on localhost 2049 Y 27012
 Self-heal Daemon on localhost N/A Y 27019
 NFS Server on gluster0 2049 Y 12875
 Self-heal Daemon on gluster0 N/A Y 12882

 Task Status of Volume vm-store

 --
 There are no active volume tasks



As I mentioned, the VMs are running happily.
Initially the ISOs volume had the same issue. I did a volume stop and
start on that volume, as it was not being actively used, and that cleared
up the issue in the console. However, as I have VMs running, I can't do
this for the vm-store volume.


Any suggestions?

Alastair
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Ovirt WebAdmin Portal

2014-05-21 Thread Carlos Castillo
Thank you both for the info, tomorrow  I will be doing tests on my
computers.

Regards


2014-05-21 16:02 GMT-04:30 Karli Sjöberg karli.sjob...@slu.se:


 Den 21 maj 2014 22:18 skrev Carlos Castillo carlos.casti...@globalr.net:

 
  regards,
 
  My Name is Carlos and I'm a I:T Admin. I recently noticed that my oVirt
 Web Admin portal shows 0% memory usage for all my virtual machines, a
 situation that does not seem right to me.
 
  I have tried to search for information about why it may be happening
 that, but I have not found anything useful, so I go to this list looking
 for ideas?

 You need to install the ovirt-guest-agent in all VM's for that to show.
 You will also get the VM’s IP, fqdn, some installed packages and who’s
 currently logged into it, among other things.

 /K

 
  My oVirt Engine Version: 3.3.1-2.el6,
  O.S. CentOS release 6.4 (Final)
 
 
  --
  Carlos J. Castillo
 
 --
  Ingeniero de Soluciones de TI
  +58 426 2542313
  @Dr4g0nKn1ght
 




-- 
Carlos J. Castillo
--
Ingeniero de Soluciones de TI
+58 426 2542313
@Dr4g0nKn1ght
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] 1009100 likely to be fixed for 3.5.0?

2014-05-21 Thread Paul Jansen
Hello.
There are a few bugs that are related to live snapshots/storage migrations not 
working.
https://bugzilla.redhat.com/show_bug.cgi?id=1009100 is one of them and is 
targeted for 3.5.0.
According to the bug there is some engine work still required.

I understand that with EL6-based hosts live storage migration will still not 
work (due to a too-old QEMU version), but it should work with F20/F21 hosts (and 
EL7 hosts when they come online).
Am I correct in assuming that in a cluster with both EL6 hosts and newer hosts 
(described above), oVirt will allow live storage migration on hosts that 
support it and prevent the option from appearing on hosts that do not?

The possibility of getting a newer QEMU on EL6 appears to be tied up in the 
CentOS Virt SIG and their proposed repository, which appears to be moving quite 
slowly.


I'm looking forward to oVirt closing the gap with VMware vCenter with regard to 
live storage migration.
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] post glusterfs 3.4 - 3.5 upgrade issue in ovirt (3.4.0-1.fc19): bricks unavailable

2014-05-21 Thread Kanagaraj

engine.log and vdsm.log?

This can mostly happen due to one of the following reasons:
- gluster volume status vm-store is not consistently returning the 
right output

- ovirt-engine is not able to identify the bricks properly

Anyway, engine.log will give better clarity.
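
(One quick check on the gluster nodes that may help, as a sketch — run it a couple of times on each node and compare the results; the engine/vdsm side consumes the XML form of this output:)

  gluster volume status vm-store
  gluster volume status vm-store --xml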


On 05/22/2014 02:24 AM, Alastair Neil wrote:
I just did a rolling upgrade of my gluster storage cluster to the 
latest 3.5 bits.  This all seems to have gone smoothly and all the 
volumes are on line.  All volumes are replicated 1x2


The ovirt console now insists that two of my volumes , including the 
vm-store  volume with my vm's happily running have no bricks up.


It reports Up but all bricks are down

This would seem to be impossible.  Gluster  on the nodes itself 
reports no issues


[root@gluster1 ~]# gluster volume status vm-store
Status of volume: vm-store
Gluster process                                 Port    Online  Pid
------------------------------------------------------------------------------
Brick gluster0:/export/brick0/vm-store          49158   Y       2675
Brick gluster1:/export/brick4/vm-store          49158   Y       2309
NFS Server on localhost                         2049    Y       27012
Self-heal Daemon on localhost                   N/A     Y       27019
NFS Server on gluster0                          2049    Y       12875
Self-heal Daemon on gluster0                    N/A     Y       12882

Task Status of Volume vm-store
------------------------------------------------------------------------------
There are no active volume tasks



As I mentioned the vms are running happily
initially the ISOs volume had the same issue.  I did a volume start 
and stop on the volume as it was not being activly used and that 
cleared up the issue in the console.  However, as I have VMs running 
 I can't so this for the vm-store volume.



Any suggestions, Alastair



___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] engine upgrade 3.2.2 -- 3.2.3 Database rename failed

2014-05-21 Thread Neil
Hi guys, sorry to repost but I'm getting a bit desperate. Is anyone able to
assist?

Thanks.

Regards.

Neil Wilson
On 21 May 2014 12:06 PM, Neil nwilson...@gmail.com wrote:

 Hi guys,

 Just a little more info on the problem. I've upgraded another oVirt
 system before from Dreyou and it worked perfectly. However, on this
 particular system we had to restore from backups (DB, PKI and
 /etc/ovirt-engine) as the physical machine that was hosting the
 engine died, so perhaps this is the reason we are encountering this
 problem this time around...

 Any help is greatly appreciated.

 Thank you.

 Regards.

 Neil Wilson.



 On Wed, May 21, 2014 at 11:46 AM, Sven Kieske s.kie...@mittwald.de
 wrote:
  Hi,
 
  I don't know the exact resolution for this, but I'll add some people
  who managed to make it work, following this tutorial:
  http://wiki.dreyou.org/dokuwiki/doku.php?id=ovirt_rpm_start33
 
  See this thread on the users ML:
 
  http://lists.ovirt.org/pipermail/users/2013-December/018341.html
 
  HTH
 
 
  Am 20.05.2014 17:00, schrieb Neil:
  Hi guys,
 
  I'm trying to upgrade from Dreyou to the official repo, I've installed
  the official 3.2 repo (I'll do the 3.3 update once this works). I've
  updated to ovirt-engine-setup.noarch 0:3.2.3-1.el6 and when I run
  engine upgrade it bombs out when trying to rename my database with the
  following error...
 
  [root@engine01 /]#  cat
  /var/log/ovirt-engine/ovirt-engine-upgrade_2014_05_20_16_34_21.log
  2014-05-20 16:34:21::DEBUG::common_utils::804::root:: found existing
  pgpass file /etc/ovirt-engine/.pgpass, fetching DB host value
  2014-05-20 16:34:21::DEBUG::common_utils::804::root:: found existing
  pgpass file /etc/ovirt-engine/.pgpass, fetching DB port value
  2014-05-20 16:34:21::DEBUG::common_utils::804::root:: found existing
  pgpass file /etc/ovirt-engine/.pgpass, fetching DB user value
  2014-05-20 16:34:21::DEBUG::common_utils::332::root:: YUM: VERB:
  Loaded plugins: refresh-packagekit, versionlock
  2014-05-20 16:34:21::INFO::engine-upgrade::969::root:: Info:
  /etc/ovirt-engine/.pgpass file found. Continue.
  2014-05-20 16:34:21::DEBUG::common_utils::804::root:: found existing
  pgpass file /etc/ovirt-engine/.pgpass, fetching DB admin value
  2014-05-20 16:34:21::DEBUG::common_utils::804::root:: found existing
  pgpass file /etc/ovirt-engine/.pgpass, fetching DB host value
  2014-05-20 16:34:21::DEBUG::common_utils::804::root:: found existing
  pgpass file /etc/ovirt-engine/.pgpass, fetching DB port value
  2014-05-20 16:34:21::DEBUG::common_utils::481::root:: running sql
  query 'SELECT pg_database_size('engine')' on db server: 'localhost'.
  2014-05-20 16:34:21::DEBUG::common_utils::434::root:: Executing
  command -- '/usr/bin/psql -h localhost -p 5432 -U postgres -d
  postgres -c SELECT pg_database_size('engine')'
  2014-05-20 16:34:21::DEBUG::common_utils::472::root:: output =
  pg_database_size
  --
   11976708
  (1 row)
 
 
  2014-05-20 16:34:21::DEBUG::common_utils::473::root:: stderr =
  2014-05-20 16:34:21::DEBUG::common_utils::474::root:: retcode = 0
  2014-05-20 16:34:21::DEBUG::common_utils::1567::root:: Found mount
  point of '/var/cache/yum' at '/'
  2014-05-20 16:34:21::DEBUG::common_utils::663::root:: Checking
  available space on /var/cache/yum
  2014-05-20 16:34:21::DEBUG::common_utils::668::root:: Available space
  on /var/cache/yum is 172329
  2014-05-20 16:34:21::DEBUG::common_utils::1567::root:: Found mount
  point of '/var/lib/ovirt-engine/backups' at '/'
  2014-05-20 16:34:21::DEBUG::common_utils::663::root:: Checking
  available space on /var/lib/ovirt-engine/backups
  2014-05-20 16:34:21::DEBUG::common_utils::668::root:: Available space
  on /var/lib/ovirt-engine/backups is 172329
  2014-05-20 16:34:21::DEBUG::common_utils::1567::root:: Found mount
  point of '/usr/share' at '/'
  2014-05-20 16:34:21::DEBUG::common_utils::663::root:: Checking
  available space on /usr/share
  2014-05-20 16:34:21::DEBUG::common_utils::668::root:: Available space
  on /usr/share is 172329
  2014-05-20 16:34:21::DEBUG::common_utils::1590::root:: Mount points
  are: {'/': {'required': 1511, 'free': 172329}}
  2014-05-20 16:34:21::DEBUG::common_utils::1599::root:: Comparing free
  space 172329 MB with required 1511 MB
  2014-05-20 16:34:21::DEBUG::common_utils::481::root:: running sql
  query 'SELECT compatibility_version FROM storage_pool;' on db server:
  'localhost'.
  2014-05-20 16:34:21::DEBUG::common_utils::434::root:: Executing
  command -- '/usr/bin/psql -h localhost -p 5432 -U engine -d engine -c
  SELECT compatibility_version FROM storage_pool;'
  2014-05-20 16:34:21::DEBUG::common_utils::472::root:: output =
  compatibility_version
  ---
   3.2
  (1 row)
 
 
  2014-05-20 16:34:21::DEBUG::common_utils::473::root:: stderr =
  2014-05-20 16:34:21::DEBUG::common_utils::474::root:: retcode = 0
  2014-05-20 16:34:21::DEBUG::common_utils::481::root:: running sql
  query 'SELECT compatibility_version 

Re: [ovirt-users] IO errors when adding new disk on iSCSI storage

2014-05-21 Thread Morten A. Middelthon

Hi list,

I just tried this again with a preallocated disk, otherwise following the exact 
same procedure as described in my original post. No problems at all so far.


with regards,

--
Morten A. Middelthon
Email: mor...@flipp.net
Phone: +47 907 83 708
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users