Re: [ovirt-users] [Users] Migrate cluster 3.3 -> 3.4 hosted on existing hosts

2014-05-23 Thread Ted Miller


On 4/2/2014 1:58 AM, Yedidyah Bar David wrote:

- Original Message -

From: "Ted Miller" 
To: "users" 
Sent: Tuesday, April 1, 2014 10:40:38 PM
Subject: [Users] Migrate cluster 3.3 -> 3.4 hosted on existing hosts

Current setup:
 * 3 identical hosts running on HP GL180 g5 servers
 * gluster running 5 volumes in replica 3
 * engine running on VMWare Server on another computer (that computer is
 NOT available to convert to a host)

Where I want to end up:
 * 3 identical hosted-engine hosts running on HP GL180 g5 servers
 * gluster running 6 volumes in replica 3
 * new volume will be nfs storage for engine VM
 * hosted engine in oVirt VM
 * as few changes to current setup as possible

The two pages I found on the wiki are: Hosted Engine Howto and Migrate to
Hosted Engine. Both were written during the testing process, and have not
been updated to reflect production status. I don't know if anything in the
process has changed since they were written.

Basically things remained the same, with some details changing perhaps.


Process outlined in above two pages (as I understand it):

have nfs file store ready to hold VM

Do minimal install (not clear if ovirt node, Centos, or Fedora was used--I am
Centos-based)

Fedora/Centos/RHEL are supposed to work. ovirt node is currently not
supported - iirc it's planned to be supported soon, not sure.


# yum install ovirt-hosted-engine-setup
# hosted-engine --deploy


Install OS on VM


return to host console


at "Please install the engine in the VM" prompt on host


on VM console
# yum install ovirt-engine


on old engine:
service ovirt-engine stop
chkconfig ovirt-engine off

set up dns for new engine


# engine-backup --mode=backup --file=backup1 --log=backup1.log
scp backup file to new engine VM


on new VM:

Please see [1]. Specifically, if you had a local db, you'll first have
to create it yourself.

[1] http://www.ovirt.org/Ovirt-engine-backup#Howto


# engine-backup --mode=restore --file=backup1 --log=backup1-restore.log
--change-db-credentials --db-host=didi-lap --db-user=engine --db-password
--db-name=engine

The above assumes a db was already created and ready to use (access etc)
using the supplied credentials. You'll naturally have to provide your own.


# engine-setup

on host:
run script until: "The system will wait until the VM is down."

on new VM:
# reboot

on Host: finish script
My questions:

1. Is the above still the recommended way to do a hosted-engine install?

Yes.


2. Will it blow up at me if I use my existing host (with glusterfs all set
up, etc) as the starting point, instead of a clean install?

a. Probably yes, for now. I did not hear much about testing such a migration
using an existing host - ovirt or gluster or both. I did not test that myself
either.

If at all possible, you should use a new clean host. Do plan well and test.

Also see discussions on the mailing lists, e.g. this one:

http://lists.ovirt.org/pipermail/users/2014-March/thread.html#22441

Good luck, and please report back!

I have good news and bad news.

I migrated the 3 host cluster from 3.4 to 3.4 hosted.  The process went 
fairly smoothly.  Engine ran, I was able to add the three hosts to the 
engine's domain, etc.  That was all working about Thursday. (I did not get 
fencing set up).


Friday, at the end of the day, I shut down the entire system (it is not yet 
in production) because I was leaving for a week's vacation/holiday.  I am 
fairly certain that I put the system into global maintenance mode before 
shutting down.  I know I shut down the engine before shutting down the hosts.


Monday (10 days later) I came back from vacation and powered up the three 
machines.  The hosts came up fine, but the engine will not start.  (I found 
some gluster split-brain errors, and chased that for a couple of days, until 
I realized that the split-brain was not the fundamental problem.)


During bootup /var/log/messages shows:

May 21 19:22:00 s2 ovirt-ha-broker mgmt_bridge.MgmtBridge ERROR Failed to 
getVdsCapabilities: VDSM initialization timeout
May 21 19:22:00 s2 ovirt-ha-broker mem_free.MemFree ERROR Failed to 
getVdsStats: VDSM initialization timeout
May 21 19:22:00 s2 ovirt-ha-broker cpu_load_no_engine.EngineHealth ERROR Failed 
to getVmStats: VDSM initialization timeout
May 21 19:22:00 s2 ovirt-ha-broker engine_health.CpuLoadNoEngine ERROR Failed 
to getVmStats: VDSM initialization timeout
May 21 19:22:03 s2 vdsm vds WARNING Unable to load the json rpc server module. 
Please make sure it is installed.


and then /var/log/ovirt-hosted-engine-ha/agent.log shows:

MainThread::ERROR::2014-05-21 
19:22:04,198::hosted_engine::414::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm)
 Failed trying to connect storage:
MainThread::CRITICAL::2014-05-21 
19:22:04,199::agent::103::ovirt_hosted_engine_ha.agent.agent.Agent::

Re: [ovirt-users] sanlock + gluster recovery -- RFE

2014-05-23 Thread Ted Miller

Vijay, I am not a member of the developer list, so my comments are at end.

On 5/23/2014 6:55 AM, Vijay Bellur wrote:

On 05/21/2014 10:22 PM, Federico Simoncelli wrote:

- Original Message -

From: "Giuseppe Ragusa" 
To: fsimo...@redhat.com
Cc: users@ovirt.org
Sent: Wednesday, May 21, 2014 5:15:30 PM
Subject: sanlock + gluster recovery -- RFE

Hi,


- Original Message -

From: "Ted Miller" 
To: "users" 
Sent: Tuesday, May 20, 2014 11:31:42 PM
Subject: [ovirt-users] sanlock + gluster recovery -- RFE

As you are aware, there is an ongoing split-brain problem with running
sanlock on replicated gluster storage. Personally, I believe that this is
the 5th time that I have been bitten by this sanlock+gluster problem.

I believe that the following are true (if not, my entire request is
probably off base).


 * ovirt uses sanlock in such a way that when the sanlock storage is on a
   replicated gluster file system, very small storage disruptions can
   result in a gluster split-brain on the sanlock space


Although this is possible (at the moment) we are working hard to avoid it.
The hardest part here is to ensure that the gluster volume is properly
configured.

The suggested configuration for a volume to be used with ovirt is:

Volume Name: (...)
Type: Replicate
Volume ID: (...)
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
(...three bricks...)
Options Reconfigured:
network.ping-timeout: 10
cluster.quorum-type: auto

The two options ping-timeout and quorum-type are really important.
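
For an existing volume, the two options above can be set from the gluster CLI. A minimal sketch, assuming a hypothetical volume name "myvol"; the helper function name is also made up for illustration, and it only prints the commands so they can be reviewed before being run on a gluster node:

```shell
# Print the "gluster volume set" commands for the two options recommended above.
# "myvol" and the function name are placeholders, not part of any oVirt tooling.
volume_option_cmds() {
  vol=$1
  echo "gluster volume set $vol network.ping-timeout 10"
  echo "gluster volume set $vol cluster.quorum-type auto"
}

volume_option_cmds myvol          # review the output first
# volume_option_cmds myvol | sh   # then run it on one of the gluster nodes
```

Afterwards, "gluster volume info myvol" should list both settings under "Options Reconfigured", as in the output quoted above.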

You would also need a build where this bug is fixed in order to avoid any
chance of a split-brain:

https://bugzilla.redhat.com/show_bug.cgi?id=1066996


It seems that the aforementioned bug is peculiar to 3-bricks setups.

I understand that a 3-bricks setup can allow proper quorum formation without
resorting to "first-configured-brick-has-more-weight" convention used with
only 2 bricks and quorum "auto" (which makes one node "special", so not
properly any-single-fault tolerant).


Correct.


But, since we are on ovirt-users, is there a similar suggested configuration
for a 2-hosts setup oVirt+GlusterFS with oVirt-side power management
properly configured and tested-working?
I mean a configuration where "any" host can go south and oVirt (through the
other one) fences it (forcibly powering it off with confirmation from IPMI
or similar) then restarts HA-marked vms that were running there, all the
while keeping the underlying GlusterFS-based storage domains responsive and
readable/writeable (maybe apart from a lapse between detected other-node
unresponsiveness and confirmed fencing)?


We already had a discussion with gluster asking if it was possible to
add fencing to the replica 2 quorum/consistency mechanism.

The idea is that as soon as you can't replicate a write you have to
freeze all IO until either the connection is re-established or you
know that the other host has been killed.

Adding Vijay.
There is a related thread on gluster-devel [1] to have a better behavior in 
GlusterFS for prevention of split brains with sanlock and 2-way replicated 
gluster volumes.


Please feel free to comment on the proposal there.

Thanks,
Vijay

[1] http://supercolony.gluster.org/pipermail/gluster-devel/2014-May/040751.html

One quick note before my main comment: I see references to quorum being "N/2 
+ 1".  Isn't it more accurate to say that quorum is "(N + 1)/2" or "N/2 + 0.5"?
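
A quick way to compare the two formulas is to evaluate both for a few brick counts. The sketch below assumes integer (floor) division, which is what shell arithmetic does; it is only arithmetic, not a statement about how gluster itself computes quorum, and the function names are invented for the example:

```shell
# Strict majority of N bricks: floor(N/2) + 1
majority() { echo $(( $1 / 2 + 1 )); }
# The alternative form: floor((N + 1) / 2)
half_up() { echo $(( ($1 + 1) / 2 )); }

for n in 2 3 4 5; do
  echo "N=$n  N/2+1=$(majority "$n")  (N+1)/2=$(half_up "$n")"
done
# N=2  N/2+1=2  (N+1)/2=1
# N=3  N/2+1=2  (N+1)/2=2
# N=4  N/2+1=3  (N+1)/2=2
# N=5  N/2+1=3  (N+1)/2=3
```

With floor division the two forms agree for odd N. For even N, only "N/2 + 1" is a strict majority; "(N + 1)/2" gives exactly half.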


Now to my main comment.

I see a case that is not being addressed.  I have no proof of how often this 
use-case occurs, but I believe that it does occur.  (It could (theoretically) 
occur in any situation where multiple bricks are writing to different parts 
of the same file.)


Use-case: sanlock via fuse client.

Steps to produce originally

   (not tested for reproducibility, because I was unable to recover the
   ovirt cluster after occurrence, had to rebuild from scratch), time frame
   was late 2013 or early 2014

   2 node ovirt cluster using replicated gluster storage
   ovirt cluster up and running VMs
   remove power from network switch
   restore power to network switch after a few minutes

Result

   both copies of .../dom_md/ids file accused the other of being out of sync

Hypothesis of cause

   servers (ovirt nodes and gluster bricks) are called A and B
   At the moment when network communication was lost, or just a moment after
   communication was lost

   A had written to local ids file
   A had started process to send write to B
   A had not received write confirmation from B
   and
   B had written to local ids file
   B had started process to send write to A
   B had not received write confirmation from A

   Thus, each file had a segment that had been written to the local file,
   but had not been confirmed written on the remote file.  Each file
   correctl

Re: [ovirt-users] sanlock + gluster recovery -- RFE

2014-05-21 Thread Ted Miller


On 5/21/2014 11:15 AM, Giuseppe Ragusa wrote:

Hi,

> - Original Message -
> > From: "Ted Miller" 
> > To: "users" 
> > Sent: Tuesday, May 20, 2014 11:31:42 PM
> > Subject: [ovirt-users] sanlock + gluster recovery -- RFE
> >
> > As you are aware, there is an ongoing split-brain problem with running
> > sanlock on replicated gluster storage. Personally, I believe that this is
> > the 5th time that I have been bitten by this sanlock+gluster problem.
> >
> > I believe that the following are true (if not, my entire request is 
> > probably off base).
> >
> >
> > * ovirt uses sanlock in such a way that when the sanlock storage is on a
> > replicated gluster file system, very small storage disruptions can
> > result in a gluster split-brain on the sanlock space
>
> Although this is possible (at the moment) we are working hard to avoid it.
> The hardest part here is to ensure that the gluster volume is properly
> configured.
>
> The suggested configuration for a volume to be used with ovirt is:
>
> Volume Name: (...)
> Type: Replicate
> Volume ID: (...)
> Status: Started
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> (...three bricks...)
> Options Reconfigured:
> network.ping-timeout: 10
> cluster.quorum-type: auto
>
> The two options ping-timeout and quorum-type are really important.
>
> You would also need a build where this bug is fixed in order to avoid any
> chance of a split-brain:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1066996

It seems that the aforementioned bug is peculiar to 3-bricks setups.

I understand that a 3-bricks setup can allow proper quorum formation 
without resorting to "first-configured-brick-has-more-weight" convention 
used with only 2 bricks and quorum "auto" (which makes one node "special", 
so not properly any-single-fault tolerant).


But, since we are on ovirt-users, is there a similar suggested 
configuration for a 2-hosts setup oVirt+GlusterFS with oVirt-side power 
management properly configured and tested-working?
I mean a configuration where "any" host can go south and oVirt (through the 
other one) fences it (forcibly powering it off with confirmation from IPMI 
or similar) then restarts HA-marked vms that were running there, all the 
while keeping the underlying GlusterFS-based storage domains responsive and 
readable/writeable (maybe apart from a lapse between detected other-node 
unresponsiveness and confirmed fencing)?


Furthermore: is such a suggested configuration possible in a 
self-hosted-engine scenario?


Regards,
Giuseppe

> > How did I get into this mess?
> >
> > ...
> >
> > What I would like to see in ovirt to help me (and others like me). 
> > Alternates
> > listed in order from most desirable (automatic) to least desirable (set of
> > commands to type, with lots of variables to figure out).
>
> The real solution is to avoid the split-brain altogether. At the moment it
> seems that using the suggested configurations and the bug fix we shouldn't
> hit a split-brain.
>
> > 1. automagic recovery
> >
> > 2. recovery subcommand
> >
> > 3. script
> >
> > 4. commands
>
> I think that the commands to resolve a split-brain should be documented.
> I just started a page here:
>
> http://www.ovirt.org/Gluster_Storage_Domain_Reference
I suggest you add these lines to the Gluster configuration, as I have seen 
this come up multiple times on the User list:


storage.owner-uid: 36
storage.owner-gid: 36
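
Set from the gluster CLI, the two options above would look roughly like this. It is a sketch, assuming a placeholder volume name "myvol" and an invented helper name; 36 is the uid of the vdsm user and the gid of the kvm group on an oVirt host. The function only prints the commands so they can be checked first:

```shell
# Print the commands that give vdsm:kvm (36:36) ownership of the volume root.
# "myvol" and the function name are placeholders for illustration.
owner_option_cmds() {
  vol=$1
  echo "gluster volume set $vol storage.owner-uid 36"
  echo "gluster volume set $vol storage.owner-gid 36"
}

owner_option_cmds myvol          # review the output first
# owner_option_cmds myvol | sh   # then run it on one of the gluster nodes
```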

Ted Miller
Elkhart, IN, USA

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] sanlock + gluster recovery -- RFE

2014-05-20 Thread Ted Miller
Itamar, I am addressing this to you because one of your assignments seems to 
be to coordinate other oVirt contributors when dealing with issues that are 
raised on the ovirt-users email list.


As you are aware, there is an ongoing split-brain problem with running 
sanlock on replicated gluster storage.  Personally, I believe that this is 
the 5th time that I have been bitten by this sanlock+gluster problem.


I believe that the following are true (if not, my entire request is probably 
off base).


 * ovirt uses sanlock in such a way that when the sanlock storage is on a
   replicated gluster file system, very small storage disruptions can result
   in a gluster split-brain on the sanlock space
     o gluster is aware of the problem, and is working on a different way of
       replicating data, which will reduce these problems.
 * most (maybe all) of the sanlock locks have a short duration, measured in
   seconds
 * there are only a couple of things that a user can safely do from the
   command line when a file is in split-brain
     o delete the file
     o rename (mv) the file
 * x

_How did I get into this mess?_

had 3 hosts running ovirt 3.3
each hosted VMs
gluster replica 3 storage
engine was external to cluster
upgraded 3 hosts from ovirt 3.3 to 3.4
hosted-engine deploy
used new gluster volume (accessed via nfs) for storage
storage was accessed using localhost:engVM1 link (localhost was 
probably a poor choice)

created new engine on VM (did not transfer any data from old engine)
added 3 hosts to new engine via web-gui
ran above setup for 3 days
shut entire system down before I left on vacation (holiday)
came back from vacation
powered on hosts
found that iptables did not have rules for gluster access
(a continuing problem if host installation is allowed to set up firewall)
added rules for gluster
glusterfs now up and running
added storage manually
tried "hosted-engine --vm-start"
vm did not start
logs show sanlock errors
"gluster volume heal engVM1 full"
"gluster volume heal engVM1 info split-brain" showed 6 files in split-brain
all 5 prefixed by /rhev/data-center/mnt/localhost\:_engVM1
UUID/dom_md/ids
UUID/images/UUID/UUID (VM hard disk)
UUID/images/UUID/UUID.lease
UUID/ha_agent/hosted-engine.lockspace
UUID/ha_agent/hosted-engine.metadata
I copied each of the above files off of each of the three bricks to a safe 
place (15 files copied)

I renamed the 5 files on /rhev/
I copied the 5 files from one of the bricks to /rhev/
files can now be read OK (e.g. cat ids)
sanlock.log shows error sets like these:

2014-05-20 03:23:39-0400 36199 [2843]: s3358 lockspace 
5ebb3b40-a394-405b-bbac-4c0e21ccd659:1:/rhev/data-center/mnt/localhost:_engVM1/5ebb3b40-a394-405b-bbac-4c0e21ccd659/dom_md/ids:0
2014-05-20 03:23:39-0400 36199 [18873]: open error -5 
/rhev/data-center/mnt/localhost:_engVM1/5ebb3b40-a394-405b-bbac-4c0e21ccd659/dom_md/ids
2014-05-20 03:23:39-0400 36199 [18873]: s3358 open_disk 
/rhev/data-center/mnt/localhost:_engVM1/5ebb3b40-a394-405b-bbac-4c0e21ccd659/dom_md/ids
 error -5
2014-05-20 03:23:40-0400 36200 [2843]: s3358 add_lockspace fail result -19

I am now stuck
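
The firewall step in the list above ("added rules for gluster") might look something like the following. This is a sketch under assumptions that are not stated in this message: glusterd listening on TCP 24007-24008, and each brick on a host using one TCP port starting at 49152 (the default from GlusterFS 3.4 on; older releases started at 24009). The function name is invented, and it only prints the iptables commands for review. Gluster's NFS server needs further ports that are not covered here.

```shell
# Print iptables rules for glusterd and for a given number of bricks on this host.
# Port numbers are assumptions (GlusterFS 3.4+ defaults); check your own version.
gluster_iptables_cmds() {
  bricks=${1:-1}
  last=$(( 49152 + bricks - 1 ))
  echo "iptables -I INPUT -p tcp --dport 24007:24008 -j ACCEPT"
  echo "iptables -I INPUT -p tcp --dport 49152:$last -j ACCEPT"
}

gluster_iptables_cmds 6          # e.g. six volumes, one brick each per host
# gluster_iptables_cmds 6 | sh   # apply as root after reviewing
```

Rules added this way do not survive a reboot unless they are also saved, for example with "service iptables save" on CentOS 6.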

What I would like to see in ovirt to help me (and others like me). Alternates 
listed in order from most desirable (automatic) to least desirable (set of 
commands to type, with lots of variables to figure out).


1. automagic recovery

 * When a host is not able to access sanlock, it writes a small "problem"
   text file into the shared storage
     o the host-ID as part of the name (so only one host ever accesses that
       file)
     o a status number for the error causing problems
     o time stamp
     o time stamp when last sanlock lease will expire
     o if sanlock is able to access the file, the "problem" file is deleted
 * when time passes for its last sanlock lease to be expired, highest number
   host does a survey
     o did all other hosts create "problem" files?
     o do all "problem" files show same (or compatible) error codes related
       to file access problems?
     o are all hosts communicating by network?
     o if yes to all above
         * delete all sanlock storage space
         * initialize sanlock from scratch
         * restart whatever may have given up because of sanlock
         * restart VM if necessary
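
The first survey question above ("did all other hosts create problem files?") could be sketched as below. Nothing like this exists in oVirt as far as this thread shows; the file naming (problem.<host-id> in one shared directory) and the function name are hypothetical, chosen only to illustrate the idea:

```shell
# Hypothetical layout: each host writes <dir>/problem.<host-id> when it cannot
# reach sanlock. Returns 0 only if every listed host has such a file.
all_hosts_report_problem() {
  dir=$1
  shift
  for host in "$@"; do
    [ -f "$dir/problem.$host" ] || return 1
  done
  return 0
}

# Example: three hosts, but only hosts 1 and 2 have reported a problem.
demo=$(mktemp -d)
touch "$demo/problem.1" "$demo/problem.2"
if all_hosts_report_problem "$demo" 1 2 3; then
  echo "all hosts report a problem: recovery could proceed"
else
  echo "not all hosts report a problem: do nothing"
fi
rm -rf "$demo"
```

A real implementation would also have to compare the error codes and lease-expiry time stamps in the files, and confirm that the hosts can reach each other, before touching the sanlock space.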

2. recovery subcommand

 * add "hosted-engine --lock-initialize" command that would delete sanlock,
   start over from scratch

3. script

 * publish a script (in ovirt packages or available on web) which, when run,
   does all (or most) of the recovery process needed.

4. commands

 * publish on the web a "recipe" for dealing with files that commonly go
   split-brain
     o ids
     o *.lease
     o *.lockspace

Any chance of any help on any of the above levels?

Ted Miller
Elkhart, IN, USA



Re: [ovirt-users] SPM error

2014-04-14 Thread Ted Miller



On 4/12/2014 12:23 PM, Itamar Heim wrote:

On 04/12/2014 03:40 PM, Maurice James wrote:
What did you do to try to fix the sanlock? Anything is better than nothing 
at this point

My thread is at http://lists.ovirt.org/pipermail/users/2014-January/020394.html


- Original Message -
From: "Ted Miller" 
To: "Maurice James" 
Sent: Friday, April 11, 2014 7:27:24 PM
Subject: Re: [ovirt-users] SPM error

I did receive some help on one stage of rebuilding my sanlock, but there were
too many other things wrong to get it started again. Only advice I have is --
look at your sanlock logs, and see if you can find anything there that is
helpful.

On 4/11/2014 7:23 PM, Maurice James wrote:

Nooo.


Sent from my Galaxy S®III

 Original message ----
From: Ted Miller 
Date:04/11/2014  7:08 PM  (GMT-05:00)
To: Maurice James 
Subject: Re: [ovirt-users] SPM error



I didn't, really.  I did something wrong along the way, and ended up having
to rebuild the engine and hosts.  (My problems were due to a glusterfs
split-brain.)
Ted Miller

On 4/11/2014 6:03 PM, Maurice James wrote:

How did you fix it?


Sent from my Galaxy S®III

 Original message 
From: Ted Miller 
Date:04/11/2014  6:00 PM  (GMT-05:00)
To: users@ovirt.org
Subject: Re: [ovirt-users] SPM error



On 4/11/2014 2:05 PM, Maurice James wrote:

I have an error trying to bring the master DC back online. After several
reboots, no luck. I took the other cluster members offline to try to
troubleshoot. The remaining host is constantly in contention with itself
for SPM


ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(DefaultQuartzScheduler_Worker-40) [38d400ea]
IrsBroker::Failed::GetStoragePoolInfoVDS due to:
IrsSpmStartFailedException: IRSGenericException: IRSErrorException:
SpmStart failed


I'm no expert, but the last time I beat my head on that rock, something was
wrong with my sanlock storage.  YMMV
Ted Miller
Elkhart, IN, USA





Maurice - which type of storage is this?


--
"He is no fool who gives what he cannot keep, to gain what he cannot lose." - - 
Jim Elliot
For more information about Jim Elliot and his unusual life, see 
http://www.christianliteratureandliving.com/march2003/carolyn.html.

Ted Miller
Design Engineer
HCJB Global Technology Center, a ministry of Reach Beyond
2830 South 17th St
Elkhart, IN  46517
574--970-4272 my desk
574--970-4252 receptionist



Re: [ovirt-users] SPM error

2014-04-11 Thread Ted Miller

On 4/11/2014 2:05 PM, Maurice James wrote:
I have an error trying to bring the master DC back online. After several 
reboots, no luck. I took the other cluster members offline to try to 
troubleshoot. The remaining host is constantly in contention with itself 
for SPM



ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] 
(DefaultQuartzScheduler_Worker-40) [38d400ea] 
IrsBroker::Failed::GetStoragePoolInfoVDS due to: 
IrsSpmStartFailedException: IRSGenericException: IRSErrorException: 
SpmStart failed


I'm no expert, but the last time I beat my head on that rock, something was 
wrong with my sanlock storage.  YMMV

Ted Miller
Elkhart, IN, USA



[Users] Migrate cluster 3.3 -> 3.4 hosted on existing hosts

2014-04-01 Thread Ted Miller

Current setup:

 * 3 identical hosts running on HP GL180 g5 servers
     o gluster running 5 volumes in replica 3
 * engine running on VMWare Server on another computer (that computer is NOT
   available to convert to a host)

Where I want to end up:

 * 3 identical hosted-engine hosts running on HP GL180 g5 servers
     o gluster running 6 volumes in replica 3
         + new volume will be nfs storage for engine VM
 * hosted engine in oVirt VM
 * as few changes to current setup as possible

The two pages I found on the wiki are: Hosted Engine Howto 
<http://www.ovirt.org/Hosted_Engine_Howto> and Migrate to Hosted Engine 
<http://www.ovirt.org/Migrate_to_Hosted_Engine>.  Both were written during 
the testing process, and have not been updated to reflect production status. 
I don't know if anything in the process has changed since they were written.


Process outlined in above two pages (as I understand it):

   have nfs file store ready to hold VM

   Do minimal install (not clear if ovirt node, Centos, or Fedora was
   used--I am Centos-based)

   # yum install ovirt-hosted-engine-setup
   # hosted-engine --deploy

   Install OS on VM

   return to host console

   at "Please install the engine in the VM" prompt on host

   on VM console
   # yum install ovirt-engine

   on old engine:
   service ovirt-engine stop
   chkconfig ovirt-engine off

   set up dns for new engine

   # engine-backup --mode=backup --file=backup1 --log=backup1.log
   scp backup file to new engine VM

   on new VM:
   # engine-backup --mode=restore --file=backup1 --log=backup1-restore.log
   --change-db-credentials --db-host=didi-lap --db-user=engine --db-password
   --db-name=engine
   # engine-setup

   on host:
   run script until: "The system will wait until the VM is down."

   on new VM:
   # reboot

   on Host: finish script

My questions:

1. Is the above still the recommended way to do a hosted-engine install?

2. Will it blow up at me if I use my existing host (with glusterfs all set 
up, etc) as the starting point, instead of a clean install?


Thank you for letting me benefit from your experience,
Ted Miller
Elkhart, IN, USA



Re: [Users] Force certain VMs to be on different hosts

2014-03-24 Thread Ted Miller

Scott

On 3/19/2014 10:30 AM, Scott Ocken wrote:

Ted,

Yes!  This is exactly what I was looking for.  I think you described it 
better than I did.  This feature would be really nice.


Thanks
Scott

Quoting Ted Miller :

I think what the OP is asking for is a designation as a "redundant group 1".  
He may have 10 hosts and 3 VMs in "redundant group 1". He doesn't care 
which hosts they run on, as long as they are three separate hosts.


I can see this as being fairly widely applicable.  If you have multiple 
web servers for load sharing, you don't want them all running on the same 
host, because VM load is going to peak on them at the same times.  oVirt 
has no way of knowing that unless you give oVirt a hint to spread things 
around.  The web group might also want to split up the server that spreads 
the jobs around, and a database server used by all the web hosts.  I can 
see easily ending up with a group of 5 machines (3 web servers, a load 
sharing controller, and a database server) that you want spread across any 
5 of the 15 servers in a cluster, because their loads are all going to 
spike together. You don't want oVirt having to try to migrate some of them 
during a load spike, because oVirt noticed that a host with 3 of the 5 is 
overloaded.


Not my situation, but one I can see the usefulness of.
Ted Miller
Elkhart, IN, USA

On 3/18/2014 11:54 AM, Meital Bourvine wrote:

Hi Scott,

Click on a vm
Edit
Show Advanced Options
Host
"Start Running on"

- Original Message -

From: "Scott Ocken" 
To: Users@ovirt.org
Sent: Tuesday, March 18, 2014 5:08:24 PM
Subject: [Users] Force certain VMs to be on different hosts

Is there a way to have certain VMs to be on different hosts? (assuming
there are enough hosts)

IE.  I have a db cluster of 3 VMs.  I would like each one to always be
on different hosts.  That way if a host goes down my db cluster is
still happy while migration happens.  Or if migration fails I am still
good.

Thanks
Scott


Ted Miller
It looks like the Negative Affinity/Anti-Affinity feature that Itamar Heim 
pointed out in his email, with a feature page at 
http://www.ovirt.org/Features/VM-Affinity includes what you are trying to 
do.  This is in 3.4, which is in the QA process now.


Ted Miller
Elkhart, IN, USA



Re: [Users] Force certain VMs to be on different hosts

2014-03-19 Thread Ted Miller
I think what the OP is asking for is a designation as a "redundant group 1".  He 
may have 10 hosts and 3 VMs in "redundant group 1". He doesn't care which 
hosts they run on, as long as they are three separate hosts.


I can see this as being fairly widely applicable.  If you have multiple web 
servers for load sharing, you don't want them all running on the same host, 
because VM load is going to peak on them at the same times.  oVirt has no way 
of knowing that unless you give oVirt a hint to spread things around.  The 
web group might also want to split up the server that spreads the jobs 
around, and a database server used by all the web hosts.  I can see easily 
ending up with a group of 5 machines (3 web servers, a load sharing 
controller, and a database server) that you want spread across any 5 of the 
15 servers in a cluster, because their loads are all going to spike together. 
You don't want oVirt having to try to migrate some of them during a load 
spike, because oVirt noticed that a host with 3 of the 5 is overloaded.


Not my situation, but one I can see the usefulness of.
Ted Miller
Elkhart, IN, USA

On 3/18/2014 11:54 AM, Meital Bourvine wrote:

Hi Scott,

Click on a vm
Edit
Show Advanced Options
Host
"Start Running on"

- Original Message -

From: "Scott Ocken" 
To: Users@ovirt.org
Sent: Tuesday, March 18, 2014 5:08:24 PM
Subject: [Users] Force certain VMs to be on different hosts

Is there a way to have certain VMs to be on different hosts? (assuming
there are enough hosts)

IE.  I have a db cluster of 3 VMs.  I would like each one to always be
on different hosts.  That way if a host goes down my db cluster is
still happy while migration happens.  Or if migration fails I am still
good.

Thanks
Scott






Re: [Users] [RFI] GUI Changes for oVirt 4.0

2014-03-18 Thread Ted Miller

On 3/18/2014 6:29 AM, Itamar Heim wrote:

we are brainstorming on what should we change in the oVirt UI for 4.0.
for current brainstorming phase, "anything goes" - i.e., I'd like us to 
ignore current limitations and flows, and envision/fantasize the "perfect 
solution".


SO - what do YOU think we should consider for 4.0 UI concept, flows, etc.

I have an idea, though it may be relevant only to smaller setups.

I would like one place that I can go and see the health of my system. Right 
now I am running a cluster in test mode, and I have to look at several places 
before I have confidence that all is well:

Data Center
Storage (I am using gluster)
Hosts
VMs

The natural place for me to look would seem to be the left bar. If the icons 
there had a color change to reflect status, I could just hit "Expand All" and 
the color would immediately tell me the system status. Same icons would work, 
with just a background or little square of color.


I realize that my little 3-host system is the exception, because (so far) I 
can hit "Expand all" and I can still see the whole thing. I will not have to 
deploy many more VMs before it will not all fit.


There may be a better way to do this, e.g. another choice between "Expand 
all" and "Collapse All" that would expand all except the lowest level. It 
would then show me categories like "Storage", "Networks", "Hosts", "Volumes" 
and "VMs" with a health color indication for each cluster. If I see anything 
I am not expecting, I can expand that heading and see the status of the 
individual items.


There may be a better way to do this, but I know that it is somewhat 
frustrating to check first thing in the morning and have to do so many 
clicks before I have confidence that nothing bad happened overnight. I have 
actually found it faster to click on the "Events" tab and see if there are 
any nasty messages there, rather than checking current status.


I look forward to the insights of others as to how they monitor cluster status.

Ted Miller
Elkhart, IN, USA



Re: [Users] SPICE causes migration failure?

2014-03-04 Thread Ted Miller


On 3/3/2014 12:26 PM, Dafna Ron wrote:
I don't see a reason why an open monitor would fail migration - at most, if 
there is a problem I would close the spice session on src and restart it 
at the dst.
can you please attach vdsm/libvirt/qemu logs from both hosts and engine 
logs so that we can see the migration failure reason?


Thanks,
Dafna



On 03/03/2014 05:16 PM, Ted Miller wrote:
I just got my Data Center running again, and am proceeding with some setup 
& testing.


I created a VM (not doing anything useful)
I clicked on the "Console" and had a SPICE console up (viewed in Win7).
I had it printing the time on the screen once per second (while date;do 
sleep 1; done).

I tried to migrate the VM to another host and got in the GUI:

Migration started (VM: web1, Source: s1, Destination: s3, User: 
admin@internal).


Migration failed due to Error: Fatal error during migration (VM: web1, 
Source: s1, Destination: s3).


As I started the migration I happened to think "I wonder how they handle 
the SPICE console, since I think that is a link from the host to my 
machine, letting me see the VM's screen."


After the failure, I tried shutting down the SPICE console, and found that 
the migration succeeded.  I again opened SPICE and had a migration fail.  
Closed SPICE, migration failed.


I can understand how migrating SPICE is a problem, but, at least could we 
give the victim of this condition a meaningful error message?  I have seen 
a lot of questions about failed migrations (mostly due to attached CDs), 
but I have never seen this discussed. If I had not had that particular 
thought cross my brain at that particular time, I doubt that SPICE would 
have been where I went looking for a solution.


If this is the first time this issue has been raised, I am willing to file 
a bug.


Ted Miller
Elkhart, IN, USA




In finding the right one-minute slice of the logs, I saw something that makes 
me think this is due to a missing method in the glusterfs support.  Others 
who understand more of what the logs are saying can verify or correct my hunch.


Was trying to migrate from s2 to s1.

Logs on fpaste.org:
http://ur1.ca/gr48c
http://ur1.ca/gr48r
http://ur1.ca/gr493
http://ur1.ca/gr49e
http://ur1.ca/gr49i
http://ur1.ca/gr49x
http://ur1.ca/gr4a6

Ted Miller
Elkhart, IN, USA





[Users] SPICE causes migration failure?

2014-03-03 Thread Ted Miller
I just got my Data Center running again, and am proceeding with some setup & 
testing.


I created a VM (not doing anything useful)
I clicked on the "Console" and had a SPICE console up (viewed in Win7).
I had it printing the time on the screen once per second (while date;do sleep 
1; done).

I tried to migrate the VM to another host and got in the GUI:

Migration started (VM: web1, Source: s1, Destination: s3, User: admin@internal).

Migration failed due to Error: Fatal error during migration (VM: web1, Source: 
s1, Destination: s3).

As I started the migration I happened to think "I wonder how they handle the 
SPICE console, since I think that is a link from the host to my machine, 
letting me see the VM's screen."


After the failure, I tried shutting down the SPICE console, and found that 
the migration succeeded.  I again opened SPICE and had a migration fail.  
Closed SPICE, migration failed.


I can understand how migrating SPICE is a problem, but, at least could we 
give the victim of this condition a meaningful error message?  I have seen a 
lot of questions about failed migrations (mostly due to attached CDs), but I 
have never seen this discussed. If I had not had that particular thought 
cross my brain at that particular time, I doubt that SPICE would have been 
where I went looking for a solution.


If this is the first time this issue has been raised, I am willing to file a 
bug.

Ted Miller
Elkhart, IN, USA



Re: [Users] Opinions needed: 3 node gluster replica 3 | NFS async | snapshots for consistency

2014-02-24 Thread Ted Miller
 server does not get 
along with gluster.  You need to run gluster's own NFS server, and turn off 
the kernel NFS server.  Gluster's own NFS server is gluster-aware, so I think 
some of the problems you envision may be covered in that server.


Ted Miller
Elkhart, IN, USA



Re: [Users] vmware image conversion

2014-02-20 Thread Ted Miller

On 2/19/2014 8:54 PM, Bob Doolittle wrote:

Yes.
So:

VMware non-ESX -> ESX, using VMware's tool, then
ESX -> RHEV using virt-v2v

no?

NO!!!

You apparently have not done this, only talk about it?  The last step (ESX -> 
RHEV using virt-v2v) doesn't work.  It had problems with some images, and 
they pulled it completely from current versions of virt-v2v.  See 
https://rhn.redhat.com/errata/RHBA-2013-1749.html


Ted Miller

If this is viable, it's easy to understand why nobody wants to put effort 
into supporting a bevy of VMware VM formats, when there's a tool already 
available to convert to one and they can focus on it.


-Bob

On 02/19/2014 05:52 PM, Maurice James wrote:

I want to change it from VMware to RHEV/oVirt

-Original Message-
From: Bob Doolittle [mailto:b...@doolittle.us.com]
Sent: Wednesday, February 19, 2014 8:51 PM
To: Maurice James; 'Ted Miller'; users@ovirt.org
Subject: Re: [Users] vmware image conversion

My recollection is that VMware provides a converter to change your VMware
non-ESX VMs into ESX format.
Do you have to buy ESX to gain access to it?

-Bob

On 02/19/2014 05:46 PM, Maurice James wrote:

I even opened a feature request that they closed pretty quickly with
WONTFIX
https://bugzilla.redhat.com/show_bug.cgi?id=1062910 . Why is this such
a touchy issue?

-Original Message-
From: users-boun...@ovirt.org [mailto:users-boun...@ovirt.org] On
Behalf Of Ted Miller
Sent: Wednesday, February 19, 2014 7:28 PM
To: users@ovirt.org
Subject: Re: [Users] vmware image conversion


On 2/9/2014 4:27 PM, Itamar Heim wrote:

On 02/09/2014 10:28 PM, Maurice James wrote:

The instructions assume that I have an ESX instance to connect to.
How do I do this with an already exported vmware image with no esx
available to connect to? I have a turnkey drupal vm in ovf format

-Original Message-
From: Itamar Heim [mailto:ih...@redhat.com]
Sent: Sunday, February 09, 2014 2:24 PM
To: Maurice James; 'users'
Subject: Re: [Users] vmware image conversion

On 02/09/2014 07:02 PM, Maurice James wrote:

According to this https://rhn.redhat.com/errata/RHBA-2013-1749.html
It does not do it

please review:
https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Virtualization/3.3/html-single/V2V_Guide/index.html


-Original Message-
From: Itamar Heim [mailto:ih...@redhat.com]
Sent: Sunday, February 09, 2014 4:52 AM
To: Maurice James; 'users'
Subject: Re: [Users] vmware image conversion

On 02/08/2014 04:18 PM, Maurice James wrote:

I submitted an RFE to have vmware image conversion added to 3.5. I
think that is a key feature that is lacking. I'm just trying to get
some eyes on it here.

https://bugzilla.redhat.com/show_bug.cgi?id=1062910





can you comment on the gaps from virt-v2v which does this today (in
the bug as well).

thanks,
Itamar


iirc, you need an ESX currently.

Some of us are stuck without a way to try out ovirt because of this.
ESX is not the only platform that people run VMWare on.  I am trying
to bring over VMs from an old VMWare Server setup on Centos 5.  Works
fine, but there is no migration path.  Other people may have VMs on
VMWare Workstation or other, older products.  We just get told to go fly a
kite?

If the only choice is to bring up a full-blown, working ESX instance,
I may bring up ESXi and stay there.

Ted Miller
Elkhart, IN





--
"He is no fool who gives what he cannot keep, to gain what he cannot lose." - - 
Jim Elliot
For more information about Jim Elliot and his unusual life, see 
http://www.christianliteratureandliving.com/march2003/carolyn.html.

Ted Miller
Design Engineer
HCJB Global Technology Center
2830 South 17th St
Elkhart, IN  46517
574--970-4272 my desk
574--970-4252 receptionist



Re: [Users] vmware image conversion

2014-02-19 Thread Ted Miller


On 2/9/2014 4:27 PM, Itamar Heim wrote:

On 02/09/2014 10:28 PM, Maurice James wrote:

The instructions assume that I have an ESX instance to connect to. How do I
do this with an already exported vmware image with no esx available to
connect to? I have a turnkey drupal vm in ovf format

-Original Message-
From: Itamar Heim [mailto:ih...@redhat.com]
Sent: Sunday, February 09, 2014 2:24 PM
To: Maurice James; 'users'
Subject: Re: [Users] vmware image conversion

On 02/09/2014 07:02 PM, Maurice James wrote:

According to this https://rhn.redhat.com/errata/RHBA-2013-1749.html It
does not do it


please review:
https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Virtualization/3.3/html-single/V2V_Guide/index.html



-Original Message-
From: Itamar Heim [mailto:ih...@redhat.com]
Sent: Sunday, February 09, 2014 4:52 AM
To: Maurice James; 'users'
Subject: Re: [Users] vmware image conversion

On 02/08/2014 04:18 PM, Maurice James wrote:

I submitted an RFE to have vmware image conversion added to 3.5. I
think that is a key feature that is lacking. I'm just trying to get
some eyes on it here.

https://bugzilla.redhat.com/show_bug.cgi?id=1062910






can you comment on the gaps from virt-v2v which does this today (in
the bug as well).

thanks,
  Itamar





iirc, you need an ESX currently.
Some of us are stuck without a way to try out ovirt because of this.  ESX is 
not the only platform that people run VMWare on.  I am trying to bring over 
VMs from an old VMWare Server setup on Centos 5.  Works fine, but there is no 
migration path.  Other people may have VMs on VMWare Workstation or other, 
older products.  We just get told to go fly a kite?


If the only choice is to bring up a full-blown, working ESX instance, I may 
bring up ESXi and stay there.


Ted Miller
Elkhart, IN



Re: [Users] Asking for advice on hosted engine

2014-02-18 Thread Ted Miller


On 2/17/2014 4:20 AM, Giorgio Bersano wrote:

Hello everybody,
I discovered oVirt a couple of months ago when I was looking for the
best way to manage our small infrastructure. I have read any document
I considered useful but I would like to receive advice from the many
experts that are on this list.

I think it is worth an introduction (I hope it doesn't bore you).

I work in a small local government entity and I try to manage
effectively our limited resources.
We have many years of experience with Linux and especially with CentOS
which we have deployed on PC (i.e. for using as firewall in remote
locations) and moreover on servers.

We have been using Xen virtualization from the early days of CentOS 5
and  we have built our positive experience on KVM too.
I have to say that libvirt in a small environment like ours is really
a nice tool.
So nothing to regret.

Trying to go a little further, as already said, I stumbled upon oVirt
and I've found the project intriguing.

At the moment we are thinking of deploying it on a small environment
of four very similar servers each having:
- a couple of Xeon E5504
- 6 x 1Gb ethernet interfaces
- 40 GB of RAM
two of them have 72 GB of disk (mirrored)
two of them have almost 500GB of useful RAID array

Moreover we have an HP iSCSI storage that should easily satisfy our
current storage requirement.

So, given our small server pool, the necessity of another host just to
run the supervisor seems a requirement too high.

Enter "hosted engine" and the picture takes brighter colors. Well, I'm
usually not the adventurous guy but after experimenting a little with
oVirt 3.4 I developed better confidence.
We would want to install the engine over the two hosts with smaller disks.

As far as I know, installing hosted engine mandates NFS storage. But we
want this to be highly available too, and possibly to have it on the
very same hosts.

Here is my solution: make a gluster replicated volume across the two
hosts and take advantage of that NFS server.
Then I put 127.0.0.1 as the address of the NFS server in the
hosted-engine-setup so  the host is always able to reach the storage
server (itself).
GlusterFS configuration is done outside of oVirt that, regarding
engine's storage, doesn't even know that it's a gluster thing.

Relax, we've finally reached the point where I'm asking advice :-)

Storage and virtualization experts, do you see in this configuration
any pitfall that I've overlooked given my inexperience in oVirt,
Gluster, NFS or clustered filesystems?
Do you think that it's not only feasible (I know it is, I made it and
it's working now) but it's also reliable and dependable and I'm not
risking my neck on this setup?

I've obviously run some tests but I'm not at the confidence level of
saying that all is right in the way it is designed.

OK, I think I've already written too much. I'd better stop and humbly
wait for your opinion, but I'm obviously here if any clarification on
my part is needed.

Thank you very much for reading until this point.
Best Regards,
Giorgio.

Giorgio,

Gluster on two hosts only is not a good idea.  Installed for high reliability 
(quorum activated), gluster requires that >50% of the nodes be working before 
anything can be written.  When you have only two nodes, that means both nodes 
must be up before anything can happen.


You can turn off quorum, but then you are almost guaranteeing yourself a 
split-brain headache the first time communication between the two hosts is 
interrupted, even briefly (been there, done that). oVirt is constantly 
writing to the storage, so if they are not communicating you WILL get 
different things written to the same files in both servers, especially the 
sanlock files.  This is called split-brain, and it will give you a splitting 
headache.


For replicated gluster to work well, you need a minimum of three gluster 
nodes in replica mode.  Two nodes is a recipe for unhappiness.  It is either 
low-availability (quorum on) or a split-brain waiting to spring on you 
(quorum off).  You don't want either one.


Figure out how to use some storage on some third computer to provide a third 
gluster node.  That way only two of the three have to be working for things 
to keep working.
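
The quorum arithmetic above can be checked with a minimal sketch. It models only the "more than 50% of the nodes" rule as stated in this message; the function names are made up for illustration, and real gluster quorum behaviour depends on how the volume is configured.

```python
def has_quorum(nodes_up: int, nodes_total: int) -> bool:
    """True when strictly more than half of the nodes are up (the >50% rule)."""
    return 2 * nodes_up > nodes_total


def tolerable_failures(nodes_total: int) -> int:
    """Largest number of nodes that can be down while the >50% rule still holds."""
    return (nodes_total - 1) // 2


if __name__ == "__main__":
    for total in (2, 3):
        print(f"{total} nodes: can lose {tolerable_failures(total)} and keep quorum")
```

With two nodes the tolerable number of failures is zero, which is the low-availability case described above; with three nodes one node can be down and writes can continue.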


Ted Miller
Elkhart, IN


Re: [Users] Import VMware

2014-02-05 Thread Ted Miller
From: Maurice James 

Do you know of the best way to get a vmware guest into ovirt without virt-v2v 
by chance?

From: users-boun...@ovirt.org [mailto:users-boun...@ovirt.org] On Behalf Of Ted 
Miller

>> On 2/4/2014 10:49 AM, Maurice James wrote:

>>> Is it possible to import vmware images into ovirt 3.3, or is a running ESX 
>>> instance still required?

>> This bug https://rhn.redhat.com/errata/RHBA-2013-1749.html officially 
>> withdrew support for importing image files directly, because it didn't 
>> always work.

>> Ted Miller


> From: Maurice James 

> Do you know of the best way to get a vmware guest into ovirt without virt-v2v 
> by chance?



No.  I had a gluster + sanlock problem take out my ovirt cluster (2 hosts), and 
I only have it partially back up.  My dozen VMs are currently available only 
when my (dual boot) hardware isn't running oVirt.  Or, to put it the other way, 
I can only run oVirt when I can take down the VMWare group, because I don't 
have spare hardware.  Working on rebuilding one VM in KVM today (VMWare copy 
had a problem).

The only ways I have heard of succeeding are to use ESX/ESXi or the "hollow 
pig" method.  Create a VM in ovirt, including the hard drive.  Replace the hard 
drive file with the file from VMWare (or otherwise get the data into the file).  
Fiddle with the VM hardware & settings until it runs.

Ted Miller
Elkhart, IN



Re: [Users] Import VMware

2014-02-04 Thread Ted Miller

On 2/4/2014 10:49 AM, Maurice James wrote:
Is it possible to import vmware images into ovirt 3.3, or is a running ESX 
instance still required?
This bug https://rhn.redhat.com/errata/RHBA-2013-1749.html officially 
withdrew support for importing image files directly, because it didn't always 
work.


Ted Miller



Re: [Users] Data Center stuck between "Non Responsive" and "Contending"

2014-01-27 Thread Ted Miller

Federico, thank you for your help so far.  Lots of more information below.

On 1/27/2014 4:46 PM, Federico Simoncelli wrote:

- Original Message -

From: "Ted Miller" 

On 1/27/2014 3:47 AM, Federico Simoncelli wrote:

Maybe someone from gluster can identify easily what happened. Meanwhile if
you just want to repair your data-center you could try with:

   $ cd /rhev/data-center/mnt/glusterSD/10.41.65.2\:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/
   $ touch ids
   $ sanlock direct init -s 0322a407-2b16-40dc-ac67-13d387c6eb4c:0:ids:1048576


I tried your suggestion, and it helped, but it was not enough.

   [root@office4a ~]$ cd /rhev/data-center/mnt/glusterSD/10.41.65.2\:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/
   [root@office4a dom_md]$ touch ids
   [root@office4a dom_md]$ sanlock direct init -s 0322a407-2b16-40dc-ac67-13d387c6eb4c:0:ids:1048576
   init done 0

Let me explain a little.

When the problem originally happened, the sanlock.log started having -223 
error messages.  10 seconds later the log switched from -223 messages to -90 
messages.  Running your little script changed the error from -90 back to -223.


I hope you can send me another script that will get rid of the -223 messages.

Here is the sanlock.log as I ran your script:

   2014-01-27 19:40:41-0500 39281 [3803]: s13 lockspace 0322a407-2b16-40dc-ac67-13d387c6eb4c:2:/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids:0
   2014-01-27 19:40:41-0500 39281 [22751]: 0322a407 aio collect 0 0x7f54240008c0:0x7f54240008d0:0x7f5424101000 result 0:0 match len 512
   2014-01-27 19:40:41-0500 39281 [22751]: read_sectors delta_leader offset 512 rv -90 /rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
   2014-01-27 19:40:42-0500 39282 [3803]: s13 add_lockspace fail result -90
   2014-01-27 19:40:47-0500 39287 [3803]: s14 lockspace 0322a407-2b16-40dc-ac67-13d387c6eb4c:2:/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids:0
   2014-01-27 19:40:47-0500 39287 [22795]: 0322a407 aio collect 0 0x7f54240008c0:0x7f54240008d0:0x7f5424101000 result 0:0 match len 512
   2014-01-27 19:40:47-0500 39287 [22795]: read_sectors delta_leader offset 512 rv -90 /rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
   2014-01-27 19:40:48-0500 39288 [3803]: s14 add_lockspace fail result -90
   2014-01-27 19:40:56-0500 39296 [3802]: s15 lockspace 0322a407-2b16-40dc-ac67-13d387c6eb4c:2:/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids:0
   2014-01-27 19:40:56-0500 39296 [22866]: verify_leader 2 wrong magic 0 /rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
   2014-01-27 19:40:56-0500 39296 [22866]: leader1 delta_acquire_begin error -223 lockspace 0322a407-2b16-40dc-ac67-13d387c6eb4c host_id 2
   2014-01-27 19:40:56-0500 39296 [22866]: leader2 path /rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids offset 0
   2014-01-27 19:40:56-0500 39296 [22866]: leader3 m 0 v 0 ss 0 nh 0 mh 0 oi 0 og 0 lv 0
   2014-01-27 19:40:56-0500 39296 [22866]: leader4 sn  rn  ts 0 cs 0
   2014-01-27 19:40:57-0500 39297 [3802]: s15 add_lockspace fail result -223
   2014-01-27 19:40:57-0500 39297 [3802]: s16 lockspace 0322a407-2b16-40dc-ac67-13d387c6eb4c:2:/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids:0
   2014-01-27 19:40:57-0500 39297 [22870]: verify_leader 2 wrong magic 0 /rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids
   2014-01-27 19:40:57-0500 39297 [22870]: leader1 delta_acquire_begin error -223 lockspace 0322a407-2b16-40dc-ac67-13d387c6eb4c host_id 2
   2014-01-27 19:40:57-0500 39297 [22870]: leader2 path /rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids offset 0
   2014-01-27 19:40:57-0500 39297 [22870]: leader3 m 0 v 0 ss 0 nh 0 mh 0 oi 0 og 0 lv 0
   2014-01-27 19:40:57-0500 39297 [22870]: leader4 sn  rn  ts 0 cs 0
   2014-01-27 19:40:58-0500 39298 [3802]: s16 add_lockspace fail result -223
   2014-01-27 19:41:07-0500 39307 [3802]: s17 lockspace 0322a407-2b16-40dc-ac67-13d387c6eb4c:2:/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids:0

Unfortunately, I think the error looks about the same to vdsm, because 
/var/log/messages shows the same two lines in the calling scripts on the 
callback lists (66 & 425, if I remember right).
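
The switch from -90 to -223 described above is easier to spot when the failures are tallied by result code. The following is a minimal sketch, not a sanlock tool: the helper name is hypothetical, and the sample lines are the four "add_lockspace fail result" entries copied from the sanlock.log excerpt in this message.

```python
import re
from collections import Counter

# Matches sanlock's "add_lockspace fail result <code>" log lines.
FAIL_RE = re.compile(r"add_lockspace fail result (-?\d+)")


def tally_add_lockspace_failures(lines):
    """Count add_lockspace failures per result code."""
    counts = Counter()
    for line in lines:
        match = FAIL_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Sample entries copied from the log excerpt in this message.
SAMPLE = [
    "2014-01-27 19:40:42-0500 39282 [3803]: s13 add_lockspace fail result -90",
    "2014-01-27 19:40:48-0500 39288 [3803]: s14 add_lockspace fail result -90",
    "2014-01-27 19:40:57-0500 39297 [3802]: s15 add_lockspace fail result -223",
    "2014-01-27 19:40:58-0500 39298 [3802]: s16 add_lockspace fail result -223",
]

if __name__ == "__main__":
    for code, count in tally_add_lockspace_failures(SAMPLE).items():
        print(f"result {code}: {count} failure(s)")
```

To run it against a real log, pass an open file object instead of SAMPLE; the log's location on a given host is not stated in this thread.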


When I get up in the morning, I will be looking for another magic potion from 
your pen. :)



Federico,

I won't be able to do anything to the ovirt setup for another 5 hours or so
(it is a trial system I am working on  at home, I am at work), but I will try
your repair script and report

Re: [Users] Data Center stuck between "Non Responsive" and "Contending"

2014-01-27 Thread Ted Miller


On 1/27/2014 3:47 AM, Federico Simoncelli wrote:

- Original Message -

From: "Itamar Heim" 
To: "Ted Miller" , users@ovirt.org, "Federico Simoncelli" 

Cc: "Allon Mureinik" 
Sent: Sunday, January 26, 2014 11:17:04 PM
Subject: Re: [Users] Data Center stuck between "Non Responsive" and "Contending"

On 01/27/2014 12:00 AM, Ted Miller wrote:

On 1/26/2014 4:00 PM, Itamar Heim wrote:

On 01/26/2014 10:51 PM, Ted Miller wrote:

On 1/26/2014 3:10 PM, Itamar Heim wrote:

On 01/26/2014 10:08 PM, Ted Miller wrote:
is this gluster storage (guessing since you mentioned a 'volume')

yes (mentioned under "setup" above)

does it have a quorum?

Volume Name: VM2
Type: Replicate
Volume ID: 7bea8d3b-ec2a-4939-8da8-a82e6bda841e
Status: Started
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.41.65.2:/bricks/01/VM2
Brick2: 10.41.65.4:/bricks/01/VM2
Brick3: 10.41.65.4:/bricks/101/VM2
Options Reconfigured:
cluster.server-quorum-type: server
storage.owner-gid: 36
storage.owner-uid: 36
auth.allow: *
user.cifs: off
nfs.disa

(there were reports of split brain on the domain metadata before when
no quorum exists for gluster)

after full heal:

[root@office4a ~]$ gluster volume heal VM2 info
Gathering Heal info on volume VM2 has been successful

Brick 10.41.65.2:/bricks/01/VM2
Number of entries: 0

Brick 10.41.65.4:/bricks/01/VM2
Number of entries: 0

Brick 10.41.65.4:/bricks/101/VM2
Number of entries: 0
[root@office4a ~]$ gluster volume heal VM2 info split-brain
Gathering Heal info on volume VM2 has been successful

Brick 10.41.65.2:/bricks/01/VM2
Number of entries: 0

Brick 10.41.65.4:/bricks/01/VM2
Number of entries: 0

Brick 10.41.65.4:/bricks/101/VM2
Number of entries: 0

noticed this in host /var/log/messages (while looking for something else).  
Loop seems to repeat over and over.

Jan 26 15:35:52 office4a sanlock[3763]: 2014-01-26 15:35:52-0500 14678 [30419]: 
read_sectors delta_leader offset 512 rv -90 
/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids


Jan 26 15:35:53 office4a sanlock[3763]: 2014-01-26 15:35:53-0500 14679 [3771]: 
s1997 add_lockspace fail result -90
Jan 26 15:35:58 office4a vdsm TaskManager.Task ERROR Task=`89885661-88eb-4ea3-8793-00438735e4ab`::Unexpected 
error#012Traceback (most recent call last):#012  File "/usr/share/vdsm/storage/task.py", line 857, in 
_run#012 return fn(*args, **kargs)#012  File "/usr/share/vdsm/logUtils.py", line 45, in wrapper#012res = 
f(*args, **kwargs)#012  File "/usr/share/vdsm/storage/hsm.py", line 2111, in getAllTasksStatuses#012
allTasksStatus = sp.getAllTasksStatuses()#012 File "/usr/share/vdsm/storage/securable.py", line 66, in 
wrapper#012
raise SecureError()#012SecureError
Jan 26 15:35:59 office4a sanlock[3763]: 2014-01-26 15:35:59-0500 14686 [30495]: 
read_sectors delta_leader offset 512 rv -90 
/rhev/data-center/mnt/glusterSD/10.41.65.2:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ids


Jan 26 15:36:00 office4a sanlock[3763]: 2014-01-26 15:36:00-0500 14687 [3772]: 
s1998 add_lockspace fail result -90
Jan 26 15:36:00 office4a vdsm TaskManager.Task ERROR Task=`8db9ff1a-2894-407a-915a-279f6a7eb205`::Unexpected error#012Traceback 
(most recent call last):#012  File "/usr/share/vdsm/storage/task.py", line 857, in _run#012 return fn(*args, 
**kargs)#012  File "/usr/share/vdsm/storage/task.py", line 318, in run#012return self.cmd(*self.argslist, 
**self.argsdict)#012 File "/usr/share/vdsm/storage/sp.py", line 273, in startSpm#012 
self.masterDomain.acquireHostId(self.id)#012  File "/usr/share/vdsm/storage/sd.py", line 458, in acquireHostId#012 
self._clusterLock.acquireHostId(hostId, async)#012  File "/usr/share/vdsm/storage/clusterlock.py", line 189, in 
acquireHostId#012raise se.AcquireHostIdFailure(self._sdUUID, e)#012AcquireHostIdFailure: Cannot acquire host id: 
('0322a407-2b16-40dc-ac67-13d387c6eb4c', SanlockException(90, 'Sanlock lockspace add failure', 'Message too long'))

fede - thoughts on above?
(vojtech reported something similar, but it sorted out for him after
some retries)

Something truncated the ids file, as also reported by:


[root@office4a ~]$ ls /rhev/data-center/mnt/glusterSD/10.41.65.2\:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ -l
total 1029
-rw-rw 1 vdsm kvm 0 Jan 22 00:44 ids
-rw-rw 1 vdsm kvm 0 Jan 16 18:50 inbox
-rw-rw 1 vdsm kvm 2097152 Jan 21 18:20 leases
-rw-r--r-- 1 vdsm kvm 491 Jan 21 18:20 metadata
-rw-rw 1 vdsm kvm 0 Jan 16 18:50 outbox

In the past I saw that happening because of a glusterfs bug:

https://bugzilla.redhat.com/show_bug.cgi?id=862975

Anyway in general it seems that glusterfs is not always able to reconcile
the ids file (as it's written by all the hosts at the same time).
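
A truncated file like the one in the listing above can be found by checking for zero-length files in the dom_md directory. This is a minimal sketch with hypothetical helper names; the demo rebuilds the listing's file sizes in a temporary directory rather than touching a real storage domain. Note that the listing also shows inbox and outbox at 0 bytes, and the thread does not say whether that is a problem, so an empty file is only a hint, not a diagnosis.

```python
import os
import tempfile


def empty_files(directory):
    """Return the sorted names of regular files in `directory` that are 0 bytes long."""
    names = []
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file() and entry.stat().st_size == 0:
                names.append(entry.name)
    return sorted(names)


def demo():
    """Recreate the file sizes from the listing in a temp dir and check them."""
    sizes = {"ids": 0, "inbox": 0, "leases": 2097152, "metadata": 491, "outbox": 0}
    with tempfile.TemporaryDirectory() as tmp:
        for name, size in sizes.items():
            with open(os.path.join(tmp, name), "wb") as handle:
                handle.write(b"\0" * size)
        return empty_files(tmp)


if __name__ == "__main__":
    print(demo())
```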

Maybe someone from gluster can 

Re: [Users] Data Center stuck between "Non Responsive" and "Contending"

2014-01-26 Thread Ted Miller


On 1/26/2014 6:24 PM, Ted Miller wrote:

On 1/26/2014 5:17 PM, Itamar Heim wrote:

On 01/27/2014 12:00 AM, Ted Miller wrote:


On 1/26/2014 4:00 PM, Itamar Heim wrote:

On 01/26/2014 10:51 PM, Ted Miller wrote:


On 1/26/2014 3:10 PM, Itamar Heim wrote:

On 01/26/2014 10:08 PM, Ted Miller wrote:

My Data Center is down, and won't come back up.

Data Center Status on the GUI flips between "Non Responsive" and
"Contending"

Also noted:
Host sometimes seen flipping between "Low" and "Contending" in SPM
column.
Storage VM2 "Data (Master)" is in "Cross Data-Center Status" = Unknown
VM2 is "up" under "Volumes" tab

Created another volume for VM storage.  It shows up in "volumes" tab,
but when I try to add "New Domain" in storage tab, says that "There are
No Data Centers to which the Storage Domain can be attached"

Setup:
2 hosts w/ glusterfs storage
1 engine
all 3 computers Centos 6.5, just updated
ovirt-engine   3.3.0.1-1.el6
ovirt-engine-lib 3.3.2-1.el6
ovirt-host-deploy.noarch  1.1.3-1.el6
glusterfs.x86_64   3.4.2-1.el6

This loop seems to repeat in the ovirt-engine log (grep of log showing
only DefaultQuartzScheduler_Worker-79 thread):

2014-01-26 14:44:58,416 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(DefaultQuartzScheduler_Worker-79) Irs placed on server
9a591103-83be-4ca9-b207-06929223b541 failed. Proceed Failover
2014-01-26 14:44:58,511 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(DefaultQuartzScheduler_Worker-79) hostFromVds::selectedVds -
office4a,
spmStatus Free, storage pool mill
2014-01-26 14:44:58,550 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(DefaultQuartzScheduler_Worker-79) SpmStatus on vds
127ed939-34af-41a8-87a0-e2f6174b1877: Free
2014-01-26 14:44:58,571 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(DefaultQuartzScheduler_Worker-79) starting spm on vds office4a,
storage
pool mill, prevId 2, LVER 15
2014-01-26 14:44:58,579 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand]
(DefaultQuartzScheduler_Worker-79) START,
SpmStartVDSCommand(HostName =
office4a, HostId = 127ed939-34af-41a8-87a0-e2f6174b1877,
storagePoolId =
536a864d-83aa-473a-a675-e38aafdd9071, prevId=2, prevLVER=15,
storagePoolFormatType=V3, recoveryMode=Manual, SCSIFencing=false), log
id: 74c38eb7
2014-01-26 14:44:58,617 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand]
(DefaultQuartzScheduler_Worker-79) spmStart polling started: taskId =
e8986753-fc80-4b11-a11d-6d3470b1728c
2014-01-26 14:45:00,662 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetTaskStatusVDSCommand]
(DefaultQuartzScheduler_Worker-79) Failed in HSMGetTaskStatusVDS
method
2014-01-26 14:45:00,664 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetTaskStatusVDSCommand]
(DefaultQuartzScheduler_Worker-79) Error code AcquireHostIdFailure and
error message VDSGenericException: VDSErrorException: Failed to
HSMGetTaskStatusVDS, error = Cannot acquire host id
2014-01-26 14:45:00,665 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand]
(DefaultQuartzScheduler_Worker-79) spmStart polling ended: taskId =
e8986753-fc80-4b11-a11d-6d3470b1728c task status = finished
2014-01-26 14:45:00,666 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand]
(DefaultQuartzScheduler_Worker-79) Start SPM Task failed - result:
cleanSuccess, message: VDSGenericException: VDSErrorException:
Failed to
HSMGetTaskStatusVDS, error = Cannot acquire host id
2014-01-26 14:45:00,695 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand]
(DefaultQuartzScheduler_Worker-79) spmStart polling ended, spm
status: Free
2014-01-26 14:45:00,702 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand]
(DefaultQuartzScheduler_Worker-79) START,
HSMClearTaskVDSCommand(HostName = office4a, HostId =
127ed939-34af-41a8-87a0-e2f6174b1877,
taskId=e8986753-fc80-4b11-a11d-6d3470b1728c), log id: 336ec5a6
2014-01-26 14:45:00,722 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand]
(DefaultQuartzScheduler_Worker-79) FINISH, HSMClearTaskVDSCommand, log
id: 336ec5a6
2014-01-26 14:45:00,724 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand]
(DefaultQuartzScheduler_Worker-79) FINISH, SpmStartVDSCommand, return:
org.ovirt.engine.core.common.businessentities.SpmStatusResult@13652652,

log id: 74c38eb7
2014-01-26 14:45:00,733 INFO
[org.ovirt.engine.core.bll.storage.SetStoragePoolStatusCommand]
(DefaultQuartzScheduler_Worker-79) Running command:
SetStoragePoolStatusCommand internal: true. Entities affected : ID:
536a864d-83aa-473a-a675-e38aafdd9071 Type: StoragePool
2014-01-26 14:45:00,778 ERROR
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(DefaultQuartzScheduler_Worker-79)
IrsBroker::Failed::GetStoragePoolInfoVDS due t

[Users] sanlock can't read empty 'ids' file

2014-01-26 Thread Ted Miller
igure is that something is supposed to be in dom_md/ids, but that 
file is empty:


ls /rhev/data-center/mnt/glusterSD/10.41.65.2\:VM2/0322a407-2b16-40dc-ac67-13d387c6eb4c/dom_md/ -l
total 1029
-rw-rw 1 vdsm kvm   0 Jan 22 00:44 ids
-rw-rw 1 vdsm kvm   0 Jan 16 18:50 inbox
-rw-rw 1 vdsm kvm 2097152 Jan 21 18:20 leases
-rw-r--r-- 1 vdsm kvm 491 Jan 21 18:20 metadata
-rw-rw 1 vdsm kvm   0 Jan 16 18:50 outbox

Any hints as to how to put whatever is needed into 'ids', or reinitialize the 
sanlock system--or a better diagnosis and solution--gladly accepted.


Ted Miller
Elkhart, IN, USA



Re: [Users] Data Center stuck between "Non Responsive" and "Contending"

2014-01-26 Thread Ted Miller

On 1/26/2014 5:17 PM, Itamar Heim wrote:

On 01/27/2014 12:00 AM, Ted Miller wrote:


On 1/26/2014 4:00 PM, Itamar Heim wrote:

On 01/26/2014 10:51 PM, Ted Miller wrote:


On 1/26/2014 3:10 PM, Itamar Heim wrote:

On 01/26/2014 10:08 PM, Ted Miller wrote:

My Data Center is down, and won't come back up.

Data Center Status on the GUI flips between "Non Responsive" and
"Contending"

Also noted:
Host sometimes seen flipping between "Low" and "Contending" in SPM
column.
Storage VM2 "Data (Master)" is in "Cross Data-Center Status" = Unknown
VM2 is "up" under "Volumes" tab

Created another volume for VM storage.  It shows up in "volumes" tab,
but when I try to add "New Domain" in storage tab, says that "There
are
No Data Centers to which the Storage Domain can be attached"

Setup:
2 hosts w/ glusterfs storage
1 engine
all 3 computers Centos 6.5, just updated
ovirt-engine   3.3.0.1-1.el6
ovirt-engine-lib 3.3.2-1.el6
ovirt-host-deploy.noarch  1.1.3-1.el6
glusterfs.x86_64   3.4.2-1.el6

This loop seems to repeat in the ovirt-engine log (grep of log showing
only the DefaultQuartzScheduler_Worker-79 thread):

2014-01-26 14:44:58,416 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(DefaultQuartzScheduler_Worker-79) Irs placed on server
9a591103-83be-4ca9-b207-06929223b541 failed. Proceed Failover
2014-01-26 14:44:58,511 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(DefaultQuartzScheduler_Worker-79) hostFromVds::selectedVds -
office4a,
spmStatus Free, storage pool mill
2014-01-26 14:44:58,550 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(DefaultQuartzScheduler_Worker-79) SpmStatus on vds
127ed939-34af-41a8-87a0-e2f6174b1877: Free
2014-01-26 14:44:58,571 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(DefaultQuartzScheduler_Worker-79) starting spm on vds office4a,
storage
pool mill, prevId 2, LVER 15
2014-01-26 14:44:58,579 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand]
(DefaultQuartzScheduler_Worker-79) START,
SpmStartVDSCommand(HostName =
office4a, HostId = 127ed939-34af-41a8-87a0-e2f6174b1877,
storagePoolId =
536a864d-83aa-473a-a675-e38aafdd9071, prevId=2, prevLVER=15,
storagePoolFormatType=V3, recoveryMode=Manual, SCSIFencing=false), log
id: 74c38eb7
2014-01-26 14:44:58,617 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand]
(DefaultQuartzScheduler_Worker-79) spmStart polling started: taskId =
e8986753-fc80-4b11-a11d-6d3470b1728c
2014-01-26 14:45:00,662 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetTaskStatusVDSCommand]
(DefaultQuartzScheduler_Worker-79) Failed in HSMGetTaskStatusVDS
method
2014-01-26 14:45:00,664 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetTaskStatusVDSCommand]
(DefaultQuartzScheduler_Worker-79) Error code AcquireHostIdFailure and
error message VDSGenericException: VDSErrorException: Failed to
HSMGetTaskStatusVDS, error = Cannot acquire host id
2014-01-26 14:45:00,665 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand]
(DefaultQuartzScheduler_Worker-79) spmStart polling ended: taskId =
e8986753-fc80-4b11-a11d-6d3470b1728c task status = finished
2014-01-26 14:45:00,666 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand]
(DefaultQuartzScheduler_Worker-79) Start SPM Task failed - result:
cleanSuccess, message: VDSGenericException: VDSErrorException:
Failed to
HSMGetTaskStatusVDS, error = Cannot acquire host id
2014-01-26 14:45:00,695 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand]
(DefaultQuartzScheduler_Worker-79) spmStart polling ended, spm
status: Free
2014-01-26 14:45:00,702 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand]
(DefaultQuartzScheduler_Worker-79) START,
HSMClearTaskVDSCommand(HostName = office4a, HostId =
127ed939-34af-41a8-87a0-e2f6174b1877,
taskId=e8986753-fc80-4b11-a11d-6d3470b1728c), log id: 336ec5a6
2014-01-26 14:45:00,722 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand]
(DefaultQuartzScheduler_Worker-79) FINISH, HSMClearTaskVDSCommand, log
id: 336ec5a6
2014-01-26 14:45:00,724 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand]
(DefaultQuartzScheduler_Worker-79) FINISH, SpmStartVDSCommand, return:
org.ovirt.engine.core.common.businessentities.SpmStatusResult@13652652,
log id: 74c38eb7
2014-01-26 14:45:00,733 INFO
[org.ovirt.engine.core.bll.storage.SetStoragePoolStatusCommand]
(DefaultQuartzScheduler_Worker-79) Running command:
SetStoragePoolStatusCommand internal: true. Entities affected : ID:
536a864d-83aa-473a-a675-e38aafdd9071 Type: StoragePool
2014-01-26 14:45:00,778 ERROR
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(DefaultQuartzScheduler_Worker-79)
IrsBroker::Failed::GetStoragePoolInfoVDS due to:
IrsSpmStartFailedException: IRSGenericException: IRSEr

Re: [Users] Data Center stuck between "Non Responsive" and "Contending"

2014-01-26 Thread Ted Miller


On 1/26/2014 4:00 PM, Itamar Heim wrote:

On 01/26/2014 10:51 PM, Ted Miller wrote:


On 1/26/2014 3:10 PM, Itamar Heim wrote:

On 01/26/2014 10:08 PM, Ted Miller wrote:

My Data Center is down, and won't come back up.

Data Center Status on the GUI flips between "Non Responsive" and
"Contending"

Also noted:
Host sometimes seen flipping between "Low" and "Contending" in SPM
column.
Storage VM2 "Data (Master)" is in "Cross Data-Center Status" = Unknown
VM2 is "up" under "Volumes" tab

Created another volume for VM storage.  It shows up in "volumes" tab,
but when I try to add "New Domain" in storage tab, says that "There are
No Data Centers to which the Storage Domain can be attached"

Setup:
2 hosts w/ glusterfs storage
1 engine
all 3 computers Centos 6.5, just updated
ovirt-engine   3.3.0.1-1.el6
ovirt-engine-lib 3.3.2-1.el6
ovirt-host-deploy.noarch  1.1.3-1.el6
glusterfs.x86_64   3.4.2-1.el6

This loop seems to repeat in the ovirt-engine log (grep of log showing
only the DefaultQuartzScheduler_Worker-79 thread):

2014-01-26 14:44:58,416 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(DefaultQuartzScheduler_Worker-79) Irs placed on server
9a591103-83be-4ca9-b207-06929223b541 failed. Proceed Failover
2014-01-26 14:44:58,511 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(DefaultQuartzScheduler_Worker-79) hostFromVds::selectedVds - office4a,
spmStatus Free, storage pool mill
2014-01-26 14:44:58,550 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(DefaultQuartzScheduler_Worker-79) SpmStatus on vds
127ed939-34af-41a8-87a0-e2f6174b1877: Free
2014-01-26 14:44:58,571 INFO
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(DefaultQuartzScheduler_Worker-79) starting spm on vds office4a, storage
pool mill, prevId 2, LVER 15
2014-01-26 14:44:58,579 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand]
(DefaultQuartzScheduler_Worker-79) START, SpmStartVDSCommand(HostName =
office4a, HostId = 127ed939-34af-41a8-87a0-e2f6174b1877, storagePoolId =
536a864d-83aa-473a-a675-e38aafdd9071, prevId=2, prevLVER=15,
storagePoolFormatType=V3, recoveryMode=Manual, SCSIFencing=false), log
id: 74c38eb7
2014-01-26 14:44:58,617 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand]
(DefaultQuartzScheduler_Worker-79) spmStart polling started: taskId =
e8986753-fc80-4b11-a11d-6d3470b1728c
2014-01-26 14:45:00,662 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetTaskStatusVDSCommand]
(DefaultQuartzScheduler_Worker-79) Failed in HSMGetTaskStatusVDS method
2014-01-26 14:45:00,664 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetTaskStatusVDSCommand]
(DefaultQuartzScheduler_Worker-79) Error code AcquireHostIdFailure and
error message VDSGenericException: VDSErrorException: Failed to
HSMGetTaskStatusVDS, error = Cannot acquire host id
2014-01-26 14:45:00,665 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand]
(DefaultQuartzScheduler_Worker-79) spmStart polling ended: taskId =
e8986753-fc80-4b11-a11d-6d3470b1728c task status = finished
2014-01-26 14:45:00,666 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand]
(DefaultQuartzScheduler_Worker-79) Start SPM Task failed - result:
cleanSuccess, message: VDSGenericException: VDSErrorException: Failed to
HSMGetTaskStatusVDS, error = Cannot acquire host id
2014-01-26 14:45:00,695 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand]
(DefaultQuartzScheduler_Worker-79) spmStart polling ended, spm
status: Free
2014-01-26 14:45:00,702 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand]
(DefaultQuartzScheduler_Worker-79) START,
HSMClearTaskVDSCommand(HostName = office4a, HostId =
127ed939-34af-41a8-87a0-e2f6174b1877,
taskId=e8986753-fc80-4b11-a11d-6d3470b1728c), log id: 336ec5a6
2014-01-26 14:45:00,722 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand]
(DefaultQuartzScheduler_Worker-79) FINISH, HSMClearTaskVDSCommand, log
id: 336ec5a6
2014-01-26 14:45:00,724 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand]
(DefaultQuartzScheduler_Worker-79) FINISH, SpmStartVDSCommand, return:
org.ovirt.engine.core.common.businessentities.SpmStatusResult@13652652,
log id: 74c38eb7
2014-01-26 14:45:00,733 INFO
[org.ovirt.engine.core.bll.storage.SetStoragePoolStatusCommand]
(DefaultQuartzScheduler_Worker-79) Running command:
SetStoragePoolStatusCommand internal: true. Entities affected : ID:
536a864d-83aa-473a-a675-e38aafdd9071 Type: StoragePool
2014-01-26 14:45:00,778 ERROR
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(DefaultQuartzScheduler_Worker-79)
IrsBroker::Failed::GetStoragePoolInfoVDS due to:
IrsSpmStartFailedException: IRSGenericException: IRSErrorException:
SpmStart failed

Ted Miller
Elkhart, IN, USA




[Users] Data Center stuck between "Non Responsive" and "Contending"

2014-01-26 Thread Ted Miller

My Data Center is down, and won't come back up.

Data Center Status on the GUI flips between "Non Responsive" and "Contending"

Also noted:
Host sometimes seen flipping between "Low" and "Contending" in SPM column.
Storage VM2 "Data (Master)" is in "Cross Data-Center Status" = Unknown
VM2 is "up" under "Volumes" tab

I created another volume for VM storage.  It shows up in the "Volumes" tab,
but when I try to add a "New Domain" in the Storage tab, it says that "There
are No Data Centers to which the Storage Domain can be attached"


Setup:
2 hosts w/ glusterfs storage
1 engine
all 3 computers Centos 6.5, just updated
ovirt-engine   3.3.0.1-1.el6
ovirt-engine-lib 3.3.2-1.el6
ovirt-host-deploy.noarch  1.1.3-1.el6
glusterfs.x86_64   3.4.2-1.el6

This loop seems to repeat in the ovirt-engine log (grep of log showing only 
the DefaultQuartzScheduler_Worker-79 thread):


2014-01-26 14:44:58,416 INFO  
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] 
(DefaultQuartzScheduler_Worker-79) Irs placed on server 
9a591103-83be-4ca9-b207-06929223b541 failed. Proceed Failover
2014-01-26 14:44:58,511 INFO 
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] 
(DefaultQuartzScheduler_Worker-79) hostFromVds::selectedVds - office4a, 
spmStatus Free, storage pool mill
2014-01-26 14:44:58,550 INFO 
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] 
(DefaultQuartzScheduler_Worker-79) SpmStatus on vds 
127ed939-34af-41a8-87a0-e2f6174b1877: Free
2014-01-26 14:44:58,571 INFO 
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] 
(DefaultQuartzScheduler_Worker-79) starting spm on vds office4a, storage pool 
mill, prevId 2, LVER 15
2014-01-26 14:44:58,579 INFO 
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] 
(DefaultQuartzScheduler_Worker-79) START, SpmStartVDSCommand(HostName = 
office4a, HostId = 127ed939-34af-41a8-87a0-e2f6174b1877, storagePoolId = 
536a864d-83aa-473a-a675-e38aafdd9071, prevId=2, prevLVER=15, 
storagePoolFormatType=V3, recoveryMode=Manual, SCSIFencing=false), log id: 
74c38eb7
2014-01-26 14:44:58,617 INFO 
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] 
(DefaultQuartzScheduler_Worker-79) spmStart polling started: taskId = 
e8986753-fc80-4b11-a11d-6d3470b1728c
2014-01-26 14:45:00,662 ERROR 
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetTaskStatusVDSCommand] 
(DefaultQuartzScheduler_Worker-79) Failed in HSMGetTaskStatusVDS method
2014-01-26 14:45:00,664 ERROR 
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetTaskStatusVDSCommand] 
(DefaultQuartzScheduler_Worker-79) Error code AcquireHostIdFailure and error 
message VDSGenericException: VDSErrorException: Failed to 
HSMGetTaskStatusVDS, error = Cannot acquire host id
2014-01-26 14:45:00,665 INFO 
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] 
(DefaultQuartzScheduler_Worker-79) spmStart polling ended: taskId = 
e8986753-fc80-4b11-a11d-6d3470b1728c task status = finished
2014-01-26 14:45:00,666 ERROR 
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] 
(DefaultQuartzScheduler_Worker-79) Start SPM Task failed - result: 
cleanSuccess, message: VDSGenericException: VDSErrorException: Failed to 
HSMGetTaskStatusVDS, error = Cannot acquire host id
2014-01-26 14:45:00,695 INFO 
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] 
(DefaultQuartzScheduler_Worker-79) spmStart polling ended, spm status: Free
2014-01-26 14:45:00,702 INFO 
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand] 
(DefaultQuartzScheduler_Worker-79) START, HSMClearTaskVDSCommand(HostName = 
office4a, HostId = 127ed939-34af-41a8-87a0-e2f6174b1877, 
taskId=e8986753-fc80-4b11-a11d-6d3470b1728c), log id: 336ec5a6
2014-01-26 14:45:00,722 INFO 
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMClearTaskVDSCommand] 
(DefaultQuartzScheduler_Worker-79) FINISH, HSMClearTaskVDSCommand, log id: 
336ec5a6
2014-01-26 14:45:00,724 INFO 
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] 
(DefaultQuartzScheduler_Worker-79) FINISH, SpmStartVDSCommand, return: 
org.ovirt.engine.core.common.businessentities.SpmStatusResult@13652652, log 
id: 74c38eb7
2014-01-26 14:45:00,733 INFO 
[org.ovirt.engine.core.bll.storage.SetStoragePoolStatusCommand] 
(DefaultQuartzScheduler_Worker-79) Running command: 
SetStoragePoolStatusCommand internal: true. Entities affected : ID: 
536a864d-83aa-473a-a675-e38aafdd9071 Type: StoragePool
2014-01-26 14:45:00,778 ERROR 
[org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] 
(DefaultQuartzScheduler_Worker-79) IrsBroker::Failed::GetStoragePoolInfoVDS 
due to: IrsSpmStartFailedException: IRSGenericException: IRSErrorException: 
SpmStart failed
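
Because the mail client wrapped every log record over several lines, a plain grep for ERROR only shows the first fragment of each record. The sketch below is one way to rejoin the records and keep the errors for a single thread. It assumes every record starts with a timestamp of the form shown above; the function names are illustrative and not part of any oVirt tool.

```python
import re

# A record starts with a timestamp such as "2014-01-26 14:45:00,662 ERROR".
RECORD_START = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+([A-Z]+)\b\s*(.*)$"
)


def parse_records(lines):
    """Join wrapped lines into (timestamp, level, message) tuples."""
    records = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        match = RECORD_START.match(line)
        if match:
            records.append([match.group(1), match.group(2), match.group(3)])
        elif records:
            # Continuation of the previous record (mail wrapping).
            records[-1][2] = (records[-1][2] + " " + line).strip()
    return [tuple(r) for r in records]


def errors_for_thread(lines, thread):
    """Return ERROR records whose message mentions the given thread name."""
    return [r for r in parse_records(lines)
            if r[1] == "ERROR" and thread in r[2]]


# Two records copied from the log above, still wrapped as in the mail.
sample = """\
2014-01-26 14:45:00,662 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetTaskStatusVDSCommand]
(DefaultQuartzScheduler_Worker-79) Failed in HSMGetTaskStatusVDS method
2014-01-26 14:45:00,665 INFO
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand]
(DefaultQuartzScheduler_Worker-79) spmStart polling ended: taskId =
e8986753-fc80-4b11-a11d-6d3470b1728c task status = finished
""".splitlines()

for ts, level, msg in errors_for_thread(sample, "DefaultQuartzScheduler_Worker-79"):
    print(ts, level, msg)
```

Wrapped fragments are rejoined with a single space, which is adequate for reading but will not reproduce the original line byte for byte.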


Ted Miller
Elkhart, IN, USA



Re: [Users] fencing: HP ilo100 status does NMI, reboots computer

2014-01-22 Thread Ted Miller

On 1/22/2014 2:44 PM, Joop wrote:

Ted Miller wrote:
I am having trouble getting fencing to work on my HP DL180 g6 servers. 
They have ilo100 controllers.  The documentation mentions ipmi compliance, 
but there are problems.


The ipmilan driver gets a response, but it is the wrong response.  A 
status request results in the NMI line being asserted, which (in standard 
PC architecture) is the same as pressing the reset button (which these 
servers don't have).
That's weird. I have 2 ML110 G6 desktop servers, used as storage servers, and 
those have ilo100 controllers too. I just checked, and they are set up as 
ipmilan in engine.
I have used the Test button more than once and never had problems. My 
Summary page says:

*IPMI Version:* 2.0
*Firmware Version:* 4.23
*Hardware Version:* 1.0
*Description:*  ProLiant ML110 G6
*System GUID:*  33221100-5544-7766-8899-AABBCCDDEEFF


Maybe that helps you to track down the problem. If you have any questions, 
please ask.


Joop

Joop, thanks for the info.  That tells me I was not totally off track when I 
was trying the ipmilan driver.


I have firmware 4.21 in my controllers.  I'll have to see about updating that.

One thing I have figured out: this problem would not have been so noticeable, 
except that something is causing host s1 to go "non responsive" every few 
hours.  That provokes a Restart -> Stop -> Status -> Start sequence from the 
fencing system.


I will have to deal with what is causing the "non responsive" condition, but 
first I want to work through the fencing problem.


I tried the ilo2 driver, but the test of that produced even more convulsive 
messages from the computer.


Ted Miller



[Users] fencing: HP ilo100 status does NMI, reboots computer

2014-01-22 Thread Ted Miller
I am having trouble getting fencing to work on my HP DL180 g6 servers.  They 
have ilo100 controllers.  The documentation mentions ipmi compliance, but 
there are problems.


The ipmilan driver gets a response, but it is the wrong response.  A status 
request results in the NMI line being asserted, which (in standard PC 
architecture) is the same as pressing the reset button (which these servers 
don't have).


Here are some log excerpts:

16:33
just after re-running re-install from engine, which ended:

*From oVirt GUI "Events" tab*
Host s1 installed
State was set to up for host s1.
Host s3 from cluster Default was *chosen* as a proxy to execute Status
command on Host s1
Host s1 power management was verified successfully

16:34
*on ssh screen:*
Message from syslogd@s1 at Jan 21 16:34:14 ...
 kernel:Uhhuh. NMI received for unknown reason 31 on CPU 0.

Message from syslogd@s1 at Jan 21 16:34:14 ...
 kernel:Do you have a strange power saving mode enabled?

Message from syslogd@s1 at Jan 21 16:34:14 ...
 kernel:Dazed and confused, but trying to continue

*from IPMI web interface event log:*
Generic  01/21/2014  21:34:15  Gen ID 0x21     Bus Uncorrectable Error  Assertion
Generic  01/21/2014  21:34:15  IOH_NMI_DETECT  State Asserted           Assertion

*From oVirt GUI "Events" tab*
Host s1 is non responsive
Host s3 from cluster Default was chosen as a proxy to execute Restart command
on Host s1
Host s3 from cluster Default was chosen as a proxy to execute Stop command on
Host s1
Host s3 from cluster Default was chosen as a proxy to execute Status command
on Host s1
Host s1 was stopped by engine
Manual fence for host s1 was started
Host s3 from cluster Default was chosen as a proxy to execute Status command
on Host s1
Host s3 from cluster Default was chosen as a proxy to execute Start command
on Host s1
Host s3 from cluster Default was chosen as a proxy to execute Status command
on Host s1
Host s1 was started by engine
Host s1 is rebooting
State was set to up for host s1.
Host s3 from cluster Default was chosen as a proxy to execute Status command
on Host s1

16:41
saw kernel panic output on remote KVM terminal
computer rebooted itself
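
A note for anyone lining up the excerpts above: the syslog messages and the IPMI event log entries carry timestamps about five hours apart, as the small calculation below shows. One possible explanation, which is an assumption and is not confirmed anywhere in this thread, is that the IPMI controller clock runs on UTC while the host uses US Eastern time (UTC-5 in January). The year in the syslog timestamp is also assumed, since syslog does not print it.

```python
from datetime import datetime

# Timestamps copied from the excerpts above (year of the syslog line assumed).
syslog_time = datetime(2014, 1, 21, 16, 34, 14)   # "Jan 21 16:34:14" on s1
ipmi_time = datetime(2014, 1, 21, 21, 34, 15)     # "01/21/2014 21:34:15"

offset = ipmi_time - syslog_time
print(offset)  # 5:00:01
```

If that reading is right, the NMI assertion in the IPMI log and the kernel NMI messages describe the same event, one second apart.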


I have searched for ilo100, but find nothing related to ovirt, so am clueless 
as to what is the "correct" driver for this hardware.


So far I have seen this mostly on server1 (s1), but that is also the one I 
have cycled up and down most often.


I have also seen where the commands are apparently issued too fast (these 
servers are fairly slow booting).  For example, I found that one server was 
powered down when the boot process had gotten to the stage where the RAID 
controller screen was up, so it had not had time to complete the boot that 
was already in progress.


Ted Miller
Elkhart, IN, USA



Re: [Users] online storage domain resize

2014-01-20 Thread Ted Miller

On 1/20/2014 9:57 AM, Jiří Sléžka wrote:

Hello,

I'm just curious; I haven't tried it yet. I'm using FC storage (Dell
MD3620f) with some logical disks on it. I should be able to increase
virtual disk capacity online using storage management (I have some free
capacity in the disk group).

Is there any way to extend the volume group used for VM image storage
online without breaking anything?


I just found this hint by Eduardo on the list:

1. Shutdown all VMs
2. Manually connect iscsi on the SPM host
3. Run pvresize on the LUN
4. Put the domain in maintenance
5. Activate the domains

Is it possible to do this online without shutting down all VMs? If not, it
could be a really nice feature for upcoming releases.


Thanks in advance

Jiri

I don't know if oVirt is "breaking the rules" for using LVM, but in regular 
Linux all you have to do is:

pvcreate to create a new PV on the available space
vgextend to add the new pv to the existing vg
enjoy additional space.

Others will have to chime in on whether oVirt breaks this process somehow (I 
am using gluster for my storage), but I doubt it.
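
The two LVM steps named above can be written down as a dry run. This sketch only builds and prints the command lines; it runs nothing. The device and volume group names are hypothetical placeholders, and as noted above, whether this is safe on a volume group that oVirt manages is an open question in this thread.

```python
import shlex


def lvm_extend_commands(device, volume_group):
    """Build (not run) the two commands named in the reply above."""
    return [
        ["pvcreate", device],                 # initialize the new space as a PV
        ["vgextend", volume_group, device],   # add that PV to the existing VG
    ]


if __name__ == "__main__":
    # Both names are hypothetical placeholders.
    for argv in lvm_extend_commands("/dev/sdX1", "my_vg"):
        print(" ".join(shlex.quote(a) for a in argv))
```

Keeping the commands as argument lists makes it easy to review them before handing them to something that actually executes them.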


Ted Miller
Elkhart, IN, USA



Re: [Users] Making v2v easier?

2014-01-20 Thread Ted Miller
Unfortunately I do not have an ESX or ESXi server.  These VMs were running on 
a VMWare Server.  Ted


On 1/20/2014 4:23 AM, Sander Grendelman wrote:

FWIW, importing directly from an ESX server still works:

virt-v2v-host:
- RHEL/CentOS 6.5 physical host ( virt-v2v uses qemu-kvm = extra++ slow on a VM)
- Packages:
   virt-v2v-0.9.1-5.el6_5.x86_64
   libguestfs-winsupport-1.0-7.el6.x86_64
   libguestfs-tools-c-1.20.11-2.el6.x86_64
   libguestfs-tools-1.20.11-2.el6.x86_64
   libguestfs-1.20.11-2.el6.x86_64
   virtio-win-1.6.7-2.el6.noarch ( RHEL only? )
- network access to:
 oVirt export domain (NFS)
 esx host(s) to import from (HTTPS)
- virt-v2v has to run as root to mount the oVirt NFS export domain
- Edit ~/.netrc and add a line for the esx host(s) to import from
(change the <> parts):
machine  login  password 
- Fix permissions on netrc file:
chmod 600 ~/.netrc
- Run virt-v2v ( again: change the <> parts, ?no_verify=1 is needed
when esx uses self signed certs)
LIBGUESTFS_DEBUG=1 virt-v2v -ic esx:///?no_verify=1 -o
rhev -os 
--network  

Conversion can take quite some time after the disk copy,
especially when virt-v2v removes the vmware tools.
Running on a physical host (or using nested virtualization) helps.
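
The placeholders in the .netrc line above were lost in the archive. In the standard netrc format each of the keywords machine, login and password is followed by its value. The sketch below appends such an entry and applies the same permissions as the chmod step. The host name, user and password are hypothetical, and values containing spaces are not handled.

```python
import netrc
import os


def add_netrc_entry(path, machine, login, password):
    """Append one 'machine ... login ... password ...' line and set mode 600."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    with os.fdopen(fd, "a") as handle:
        handle.write("machine %s login %s password %s\n"
                     % (machine, login, password))
    os.chmod(path, 0o600)  # same effect as "chmod 600" in the steps above


def lookup(path, machine):
    """Return (login, password) for `machine`, or None if it is not listed."""
    auth = netrc.netrc(path).authenticators(machine)
    return None if auth is None else (auth[0], auth[2])


if __name__ == "__main__":
    import tempfile
    # Written to a throwaway file, not to the real ~/.netrc.
    demo = os.path.join(tempfile.mkdtemp(), "netrc-demo")
    add_netrc_entry(demo, "esx.example.com", "root", "not-a-real-password")
    print(lookup(demo, "esx.example.com"))
```

Reading the entry back with the standard-library parser is a cheap way to confirm the line is well formed before starting a long virt-v2v run.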

On Mon, Jan 20, 2014 at 8:59 AM, Sander Grendelman
 wrote:

https://rhn.redhat.com/errata/RHBA-2013-1749.html

"""
This update fixes the following bug:

* An update to virt-v2v included upstream support for the import of OVA images
exported by VMware servers. Unfortunately, testing has shown that VMDK images
created by recent versions of VMware ESX cannot be reliably supported, thus this
feature has been withdrawn. (BZ#1028983)

Users of virt-v2v are advised to upgrade to this updated package, which fixes
this bug.
"""


--
"He is no fool who gives what he cannot keep, to gain what he cannot lose." - - 
Jim Elliot
For more information about Jim Elliot and his unusual life, see 
http://www.christianliteratureandliving.com/march2003/carolyn.html.

Ted Miller
Design Engineer
HCJB Global Technology Center
2830 South 17th St
Elkhart, IN  46517
574--970-4272 my desk
574--970-4252 receptionist



Re: [Users] Making v2v easier?

2014-01-19 Thread Ted Miller

I am wide open to suggestions (see discussion at bottom, as usual).

On 1/19/2014 5:25 AM, Gianluca Cecchi wrote:

On Sun, Jan 19, 2014 at 7:13 AM, Ted Miller wrote:


* BEAT HEAD AGAINST WALL because virt-v2v.x86_64 0.9.1-5.el6_5 from Centos
updates doesn't seem to know about .ova files.  I was following the
instructions in
Red_Hat_Enterprise_Virtualization-3.3-Beta-V2V_Guide-en-US.pdf guide, but I
figured out that the v2v they are talking about has an "-i ova" option,
while the help file for the version I am using does not list ova as an
option for -i, and if I try to use it, it tells me that it is an invalid
option, and if I leave it off it goes off looking for a qemu:///system to
attach to.  help files for v2v say nothing at all about .ova files.

I am wondering where to find a v2v program that knows about .ova files, or
else am I going to have to import all my VMWare files to my (non-ovirt) KVM
host, and then drag them into ovirt from libvirt?

I did a bit of research about this.

Strange: I just updated a CentOS 6.4 VM to the latest 6.5 and see that
there (also matching RHEL 6.5, I think) it is indeed as you wrote:

virt-v2v-0.9.1-5.el6_5.x86_64

And it seems ova is missing as an option...

On a Fedora 19 system, however, with
virt-v2v-0.9.0-3.fc19.x86_64

I do have it.
So was it removed in newer packages for some reason?
It also seems strange to see a Fedora package (even if 19 and not 20)
older than a RHEL 6 one ...

RHEL 6 version bumped this way skipping 0.9.0:
* Wed Jun 12 2013 Matthew Booth  - 0.9.1-1
- Rebase to new upstream release

* Mon Oct 22 2012 Matthew Booth  - 0.8.9-2

while Fedora 19 is currently stopped at

* Wed Jul 03 2013 Richard W.M. Jones  - 0.9.0-3
- Default to using the appliance backend, since in Fedora >= 18 the
   libvirt backend doesn't support the 'iface' parameter which virt-v2v
   requires.
- Add BR perl(Sys::Syslog), required to run the tests.
- Remove some cruft from the spec file.

BTW in F20 we do have ova too:
virt-v2v-0.9.0-5.fc20.x86_64

and in fact it has the older version...

For RHEL 6 I remained here:
http://lists.ovirt.org/pipermail/users/2013-May/014457.html

And there is no particular virt-v2v package in the rhev source repo
http://ftp.redhat.com/redhat/linux/enterprise/6Server/en/RHEV/SRPMS/

For sure the rhev 3.3 beta guide is incorrect at the moment

https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Virtualization/3.2/html-single/V2V_Guide/index.html#chap-V2V_Guide-Installing_virt_v2v

because it says
"
virt-v2v is available on Red Hat Network (RHN) in the Red Hat
Enterprise Linux Server (v.6 for 64-bit x86_64) or Red Hat Enterprise
Linux Workstation (v.6 for x86_64) channel. Ensure the system is
subscribed to the appropriate channel before installing virt-v2v.
"
and some lines below
"
7.1. virt-v2v Parameters
The following parameters can be used with virt-v2v:
  -i input Specifies the input method to obtain the guest for
conversion. The default is libvirt. Supported options are:
libvirt
Guest argument is the name of a libvirt domain.
libvirtxml
Guest argument is the path to an XML file containing a libvirt domain.
ova
"

Any light to shed on this?

Thanks
Gianluca

I think I have some light, but you'd better get out your rose-colored 
glasses, because that is the only way the light will look good.   ;(


I spun up a Fedora 20 64-bit VM (in my brand new oVirt environment) to take 
advantage of your wonderful discovery.  I did a minimal install, then "yum 
upgrade", then "yum install virt-v2v", which brought in 489MB of 
dependencies!  This is what I found (not necessarily in the order I found 
them).


1. virt-v2v has a missing dependency: perl-Archive-Tar
 yum install perl-Archive-Tar

2. virt-v2v would error out fairly early in the conversion process:

   Error extracting archive '/media/VMold/vmware.dud/Fedora13A.ova':
   /usr/bin/tar: Fedora13A-disk1.vmdk: Wrote only 6144 of 10240 bytes

 That error went away when I increased VM memory from 1G to 4G (see below).

3. The *.ova files produced by vmware-vdiskmanager build 835872 (downloaded 
yesterday as part of VDDK 5.0) give the following error messages when running 
through virt-v2v:


   Use of uninitialized value $file in hash element at
   /usr/share/perl5/vendor_perl/Sys/VirtConvert/Connection/VMwareOVASource.pm
   line 261, <$manifest> line 2.
   Reading from filehandle failed at
   /usr/share/perl5/vendor_perl/Sys/VirtConvert/Connection/VMwareOVASource.pm
   line 271.

Though I am not a Perl programmer, I took a look at the code, stuck in some 
debugging "print" statements, and came to this conclusion:


   The *.ova files produced by my version of vmware-vdiskmanager contain a
   "blank" line at the end of the manifest with about 62 spaces in it
   (nothing else).  When "sub _verify_manifest" in the file
   "VMwareOVASource.pm" reads that line, it throws the error.
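
To illustrate the kind of tolerance that would avoid this failure, here is a small manifest reader that skips whitespace-only lines. It is a Python sketch for discussion only. It is not the Perl code in VMwareOVASource.pm and not a patch for virt-v2v, and it assumes the usual OVF manifest line shape of an algorithm name, the file name in parentheses, an equals sign and a hex digest.

```python
import re

# One manifest entry, e.g. "SHA1(Fedora13A-disk1.vmdk)= <40 hex digits>".
ENTRY = re.compile(r"^(SHA1|SHA256)\((.+)\)\s*=\s*([0-9a-fA-F]+)$")


def parse_manifest(text):
    """Map file name -> (algorithm, digest), ignoring whitespace-only lines."""
    entries = {}
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # tolerate the padded "blank" line described above
        match = ENTRY.match(line)
        if match is None:
            raise ValueError("unrecognized manifest line %d: %r" % (number, raw))
        algorithm, name, digest = match.groups()
        entries[name] = (algorithm, digest.lower())
    return entries
```

Lines that are neither blank nor a recognizable entry still raise an error, so a genuinely damaged manifest is not silently accepted.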

   

Re: [Users] Making v2v easier?

2014-01-19 Thread Ted Miller

On 01/17/2014 10:19 AM, Itamar Heim wrote:

I see a lot of threads about v2v pains (mostly from ESX?)

I'm interested to see if we can make this simpler/easier.

if you have experience with this, please describe the steps you are using
(also the source platform), and how you would like to see this make simpler
(I'm assuming that would start from somewhere in the webadmin probably).


I have spent most of the day trying to do this, and so far have failed.

Source: VMWare Server 2.0 disk files (.vmx, .vmdk, etc.), about 10 VMs to 
transfer.


Eliminating all the false starts and detours along the way, this is what I 
have done so far.


* copy my tree of vmware files to local storage;
in case I goof up or get fumble-fingered and need to start over clean
again.

* Set up a 32-bit VM (running Centos 6)
because vmware-vdiskmanager only seems to come in 32 bit in the VDDK
package I found to download.
   I was planning to do this anyway, to run GoogleEarth and other
   software that doesn't come in pure-64bit format.

* run vmware-vdiskmanager -R 
to clean up errors that kept next step from happening on about 1/3 of
*.vmdk files.

* run ovftool .vmx .ova
to turn .vmx and .vmdk files into .ova files -- long process

* BEAT HEAD AGAINST WALL because virt-v2v.x86_64 0.9.1-5.el6_5 from Centos 
updates doesn't seem to know about .ova files.  I was following the 
instructions in 
Red_Hat_Enterprise_Virtualization-3.3-Beta-V2V_Guide-en-US.pdf guide, but I 
figured out that the v2v they are talking about has an "-i ova" option, 
while the help file for the version I am using does not list ova as an 
option for -i, and if I try to use it, it tells me that it is an invalid 
option, and if I leave it off it goes off looking for a qemu:///system to 
attach to.  help files for v2v say nothing at all about .ova files.
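
With about 10 VMs to process, the repair and export steps in the list above are natural candidates for a loop. The file arguments to both commands were lost in the archive, so the sketch below assumes the usual forms: vmware-vdiskmanager -R followed by a disk file, and ovftool followed by a source .vmx and a target .ova. It only builds the command lists and runs nothing, and it treats every .vmdk file as a disk to repair, which is a simplification.

```python
from pathlib import Path


def conversion_commands(tree):
    """Build (not run) repair and export commands for every VM under `tree`."""
    commands = []
    root = Path(tree)
    # Repair pass first, so the export pass sees cleaned-up disks.
    for vmdk in sorted(root.rglob("*.vmdk")):
        commands.append(["vmware-vdiskmanager", "-R", str(vmdk)])
    # One .ova next to each .vmx, with the same base name.
    for vmx in sorted(root.rglob("*.vmx")):
        commands.append(["ovftool", str(vmx), str(vmx.with_suffix(".ova"))])
    return commands
```

Running this against the copied tree rather than the originals keeps the "start over clean" option from the first step intact.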


I am wondering where to find a v2v program that knows about .ova files, or 
else am I going to have to import all my VMWare files to my (non-ovirt) KVM 
host, and then drag them into ovirt from libvirt?


My setup:
All hosts running Centos 6.5, fully up to date.
2 hosts
engine running in KVM VM, hosted on a non-oVirt KVM host.
gluster replica 3 file system across 2 ovirt hosts and on KVM host.

Going to bed now to give head some rest.
Ted Miller
Elkhart, IN, USA


[Users] engine-iso-uploader -- REST API not usable?

2014-01-18 Thread Ted Miller
I ran into this problem when I tried to use engine-iso-uploader, but reading 
on the lists makes it sound like it may be a more general problem.  There was 
a bug that caused this, but that was back in the ver. 3.0/3.1 days, and 
doesn't seem common since then.


Back on Dec 24 I was able to upload an ISO file OK, so I am not sure what has 
changed since then.


I am running a test setup, fully up to date:
office2a  host w/ glusterfs Centos 6
office4a  host w/ glusterfs Centos 6
ov-eng01 engine on Centos 6 VM (not hosted on oVirt)
office9  KVM host (not oVirt) for ov-eng01

Whether I log in to ov-eng01 by ssh or execute the command from the console, 
I get:


# engine-iso-uploader list -v
Please provide the REST API password for the admin@internal oVirt Engine user 
(CTRL+D to abort):
ERROR: Problem connecting to the REST API.  Is the service available and does 
the CA certificate exist?


Checking on some things suggested on a thread about engine-iso-uploader back 
in March, I get:


# ls -la /etc/pki/ovirt-engine/ca.pem
-rw-r--r--. 1 root root 4569 Nov 10 15:13 /etc/pki/ovirt-engine/ca.pem

# cat /var/log/ovirt-engine/ovirt-iso-uploader/ovirt-iso-uploader/20140117112938.log
2014-01-17 11:29:44::ERROR::engine-iso-uploader::512::root:: Problem 
connecting to the REST API.  Is the service available and does the CA 
certificate exist?


The thread back in March gave a work-around to upload ISO images directly, so 
I am not "blocked" from uploading images, but I would like to get things 
working "right", as I am afraid the problem will "turn around and bite me" 
down the road.
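
The error message names two possible causes, the service and the CA certificate. A minimal sketch of how the two might be told apart without going through engine-iso-uploader is below. It assumes that curl is installed, that the engine serves its REST API entry point at https://<engine-host>/api, and that ov-eng01 resolves from wherever the check is run; the function names are made up for this example.

```shell
#!/bin/bash
# Sketch only: probe the engine REST API with the same CA file the uploader uses.
# Assumptions: the API entry point is https://<engine-host>/api, and the CA
# file is /etc/pki/ovirt-engine/ca.pem (the path shown in the post).

CA_FILE=${CA_FILE:-/etc/pki/ovirt-engine/ca.pem}

# Build the URL of the API entry point for a given engine host.
api_url() {
    printf 'https://%s/api\n' "$1"
}

# Fetch the API entry point; curl prompts for the admin@internal password.
# A TLS error points at the certificate or host name, while a refused
# connection points at the engine service not listening.
check_rest_api() {
    local engine_host=$1
    if [ ! -r "$CA_FILE" ]; then
        echo "CA certificate $CA_FILE is missing or unreadable" >&2
        return 2
    fi
    curl --verbose --cacert "$CA_FILE" --user 'admin@internal' \
        "$(api_url "$engine_host")"
}
```

Running `check_rest_api ov-eng01` on the engine itself and again from another machine may show whether the problem is local to the engine.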


Ted Miller





[Users] tuned profile for Centos hosts -- new Bugzilla or Regression

2014-01-06 Thread Ted Miller
I posted a script (a while back) to get oVirt running on Centos hosts.

One of the items in it has to do with what "tuned" profile to use.  At the time 
I first ran into it, this was a fatal error.  It is now just a warning, so it 
does not prevent installing a host.  But, as a warning, a lot of people are 
probably missing it.

When using Centos 6 as the host OS, the script tries to install a 
"rhs-virtualization" profile.  That profile is not included in Centos.  I 
substituted the "virtual-host" profile.

I believe that this may be a regression as a result of Bugzilla 
987293<https://bugzilla.redhat.com/show_bug.cgi?id=987293>, where 
"rhs-virtualization" was substituted for "virtual-host" for RHEV + RHS.  I am 
guessing that whatever is used as a switch to determine RHEV + RHS is also 
shoving Centos into that same path, which is not appropriate.

My suggestion would be to write the script so that it uses "rhs-virtualization" 
when present, and if it is not present, then it falls back to "virtual-host".  
(I don't know what (if any) differences there are between the two profiles.)
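
The fallback suggested above could be sketched roughly as follows. This is not the actual host-deploy code; it assumes tuned profiles are directories under /etc/tune-profiles (as on Centos 6), and the function name is invented for illustration.

```shell
#!/bin/bash
# Sketch of the suggested fallback: prefer rhs-virtualization when it is
# installed, otherwise use virtual-host.
# Assumption: each tuned profile is a directory under /etc/tune-profiles.

choose_tuned_profile() {
    local profile_dir=${1:-/etc/tune-profiles}
    if [ -d "$profile_dir/rhs-virtualization" ]; then
        echo rhs-virtualization
    else
        echo virtual-host
    fi
}

# Usage (not run here): tuned-adm profile "$(choose_tuned_profile)"
```

With a check like this, a Centos host would get "virtual-host" without a warning, and a RHEV + RHS host would still get "rhs-virtualization".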

Should I open a new bug, make a comment on 987293, or take some other path?

Ted Miller



Re: [Users] Centos 6.5 host configuration script -- a tale begun

2013-12-16 Thread Ted Miller

On 12/13/2013 2:59 AM, Sven Kieske wrote:

Hi,

first, thanks for this script!
I'll have to setup some CentOS 6.5 machines too, maybe it will help.
Here are some questions/improvements from me:

There's no need to use "localinstall" anymore, "install" is fine with
yum :-)

Then you install "virt-manager", for what purpose, may I ask?
I also never needed to manually create the ovirt-management bridge;
has this behaviour changed in a recent vdsm/CentOS release?


On 13.12.2013 05:48, Ted Miller wrote:


# script to prepare Centos 6.5 for ovirt host install process

echo "=  Ted's personal preferences--early "

yum -y install nano deltarpm yum-plugin-priorities yum-presto mlocate

echo "=== end of Ted's personal preferences--early "

yum -y upgrade

echo "install some repos (if not already done).."
cd /etc/yum.repos.d
if [ ! -f glusterfs-epel.repo ] ; then
   echo "..installing gluster repo..."
   yum -y install wget
   wget http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/glusterfs-epel.repo
   echo "..done installing gluster repo.."
fi

if [ ! -f el6-ovirt.repo ] ; then
   echo "..installing ovirt repo..."
   yum -y localinstall http://ovirt.org/releases/ovirt-release-el.noarch.rpm
   echo "..done installing ovirt repo.."
fi

if [ ! -f epel.repo ] ; then
   echo "..installing epel repo..."
   yum -y localinstall http://mirror.us.leaseweb.net/epel/6/i386/epel-release-6-8.noarch.rpm
   echo "..done installing epel repo.."
fi


echo "install libvirt"
 yum -y install libvirt qemu-kvm tuned
echo "install virt-manager" # with unlisted dependencies
yum -y install virt-manager xorg-x11-xauth dejavu-lgc-sans-mono-fonts

# create the ovirtmgmt bridge
if [ ! -f /etc/sysconfig/network-scripts/ifcfg-ovirtmgmt ]; then
echo "creating ovirtmgmt bridge.."
service libvirtd start
service libvirtd status
virsh net-destroy default
virsh net-undefine default
virsh iface-bridge eth0 ovirtmgmt
service network restart
service libvirtd stop
service libvirtd status
fi

echo ".copy tuned profile..."
# copy virtual-host --> rhs-virtualization so ovirt is happy
cp -r /etc/tune-profiles/virtual-host /etc/tune-profiles/rhs-virtualization

yum -y install vdsm

echo "=  Ted's personal preferences--late ="

# add lines to send messages to TTY12 (only if not already configured)
if ! grep -q tty12 /etc/rsyslog.conf ; then
   echo "...Adding for tty12"
   echo " " >> /etc/rsyslog.conf
   echo "# Log everything to tty12"  >> /etc/rsyslog.conf
   echo "*.* /dev/tty12" >> /etc/rsyslog.conf
   service rsyslog restart
fi

echo "...install gkrellm.."
yum -y install gkrellm

# add poll=0 to kill nouveau messages (only if not already present)
if ! grep -q poll=0 /boot/grub/grub.conf ; then
   echo "Adding poll=0.."
   # append the option to every kernel line, whatever its indentation
   sed -i '/^[[:space:]]*kernel/ s/$/ drm-kms-helper.poll=0/' /boot/grub/grub.conf
fi
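
The last step of the script (append a kernel option only if it is missing) can also be written as a small reusable function. The sketch below assumes a GRUB legacy style file in which boot entries have indented "kernel" lines, and GNU sed for the -i option; the function name is hypothetical, and the argument must not contain a "/" character.

```shell
#!/bin/bash
# Sketch: append a kernel argument to every "kernel" line of a grub.conf-style
# file, but only if that argument does not already appear in the file.
# Assumptions: GRUB legacy layout, GNU sed, and no "/" in the argument.

add_kernel_arg() {
    local arg=$1
    local conf=${2:-/boot/grub/grub.conf}
    # -F: treat the argument as a fixed string, not a regular expression
    if grep -qF -- "$arg" "$conf"; then
        return 0    # already present, nothing to do
    fi
    # Match optional leading whitespace, the word "kernel", then whitespace.
    sed -i "/^[[:space:]]*kernel[[:space:]]/ s/\$/ $arg/" "$conf"
}
```

Calling `add_kernel_arg drm-kms-helper.poll=0` twice leaves the file the same as calling it once, which matters when the script is rerun on the same host.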

I followed your instructions (and similar suggestions by others).  I was 
installing the host only, so commented out everything in my file except 
install EPEL and ovirt repos.  I'm not interested in hearing about what 
"should" happen, or what happened on 6.4. These are the (unfixed) problems 
with 6.5, and they are real.


INSTALL FAILED with message:

Failed to install Host office4a. Yum [u'glusterfs-server-3.4.0-8.el6.x86_64 
requires glusterfs-libs = 3.4.0-8.el6', u'glusterfs-server-3.4.0-8.el6.x86_64 
requires glusterfs = 3.4.0-8.el6', u'glusterfs-server-3.4.0-8.el6.x86_64 
requires glusterfs-fuse = 3.4.0-8.el6', u'glusterfs-cli-3.4.0-8.el6.x86_64 
requires glusterfs-libs = 3.4.0-8.el6'].
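
One possible reading of this error (an assumption, not something confirmed here) is that the glusterfs packages are being offered by more than one repository at different versions. A small helper to summarize which repositories provide glusterfs packages, given "yum list" style output on standard input, might look like this; the function name is invented.

```shell
#!/bin/bash
# Sketch: read "yum list"-style lines (name.arch  version  repo) on stdin and
# print the distinct repositories that provide glusterfs packages.
# Assumption: the three columns are separated by whitespace, as yum prints them.

gluster_repos() {
    awk '$1 ~ /^glusterfs/ && NF == 3 { print $3 }' | sort -u
}

# Usage (not run here): yum list 'glusterfs*' | gluster_repos
```

More than one repository name in the output would be a hint to look at repo priorities or excludes.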


# yum list gluster
base: mirror.oss.ou.edu
epel: mirrors.servercentral.net
extras: mirror.dattobackup.com
updates: centos.sonn.com
Available Packages
glusterfs.x86_64              3.4.0.36rhs-1.el6   base
glusterfs-api.x86_64          3.4.0.36rhs-1.el6   base
glusterfs-api-devel.x86_64    3.4.0.36rhs-1.el6   base
glusterfs-cli.x86_64          3.4.0-8.el6         glusterfs-epel
glusterfs-debuginfo.x86_64    3.4.0-8.el6         glusterfs-epel
glusterfs-devel.x86_64        3.4.0.36rhs-1.el6   base
glusterfs-fuse.x

Re: [Users] simple networking? [SOLVED] mostly

2013-12-13 Thread Ted Miller


On 12/13/2013 7:56 AM, Bob Doolittle wrote:


On 12/12/2013 11:04 PM, Ted Miller wrote:


From: users-boun...@ovirt.org  on behalf of Ted Miller
Sent: Wednesday, November 27, 2013 12:18 PM
To: users@ovirt.org
Subject: [Users] simple networking?

I am trying to set up a testing network using o-virt, but the networking is
refusing to cooperate.  I am testing for possible use in two different
production setups.

My previous experience has been with VMWare.  I have always set up a single
bridged network on each host.  All my hosts, VMs, and non-VM computers were
peers on the LAN.  They could all talk to each other, and things worked very
well.  There was a firewall/gateway that provided access to the Internet, and
hosts, VMs, and non-VM computers could all communicate with the Internet as needed.

o-virt seems to be compartmentalizing things beyond all reason.
Is there any way to set up simple networking, so ALL computers can see each
other?
Is there anywhere that describes the philosophy behind the networking setup?
What reason is there that networks are so divided?

After banging my head against the wall trying to configure just one host, I
am very frustrated.  I have spent several HOURS Googling for a coherent
explanation of how/why networking is supposed to work, but only find obscure
references like "letting non-VMs see VM traffic would be a huge security
violation".  I have no concept of what kind of an installation the o-virt
designers have in mind, but it is obviously worlds different from what I am
trying to do.

The best I can tell, o-virt networking works like this (at least when you
have only one NIC):
there must be an ovirtmgt network, which cannot be combined with any other
network.
   the ovirtmgt network cannot talk to VMs (unless that VM is running the
engine)
   the ovirtmgt network can only talk to hosts, not to other non-VM 
computers

a VM network can talk only to VMs
   cannot talk to hosts
   cannot talk to non-VMs
hosts cannot talk to my LAN
hosts cannot talk to VMs
VMs cannot talk to my LAN
All of the above are enforced by a boatload of firewall rules that o-virt
puts into every host and VM under its jurisdiction.

All of the above is inferred from things I Googled, because I can't find
anywhere that explains what or how things are supposed to work--only things
telling people WHAT THEY CAN'T DO.  All I see on the mailing lists is people
getting their hands slapped because they are trying to do SIMPLE SETUPS that
should work, but don't (due to either design restrictions or software bugs).

My use case A:
   * My (2 or 3) hosts have only one physical NIC.
   * My VMs exist to provide services to non-VM computers.
  *  The VMs do not run X-windows, but they provide GUI programs to
non-VMs via "ssh -X" connections.
   * MY VMs need access to storage that is shared with hosts and non-VMs on
the LAN.

Is there some way to TURN OFF network control in o-virt?  My systems are
small and static.  I can hand-configure the networking a whole lot easier
than I can deal with o-virt (as I have used it so far). Mostly I would need
to be able to turn off the firewall rules on both hosts and VMs.

banging head against wall,
Ted
*

I have spent the last three days getting a Centos 6.5 host running under
O-virt.

Since the networking was just a small part of this, I am going to open a
new thread to discuss the Centos 6.5 host setup process.  Look for a thread
titled something like "Centos 6.5 host configuration" if you want the gory
details, or want to try it for yourself.

My biggest problem is that the o-virt GUI is apparently incapable of setting
up a bridge in Centos, which turned out to be what I needed.  I had to set
up the bridge BEFORE adding the host to the ovirt cluster.  If the bridge was
not set up ahead of time, the whole installation failed completely.

The bridge was only one of a list of things that had to be done ahead of
time, in order for the process to complete correctly.


Ted, I have RHEL 6.5 running in a VM, and it can talk to all my VMs and 
hosts on my LAN, and I didn't have to do anything special. I didn't define 
any new networks or bridges or anything of the sort, either in oVirt or on 
my host or engine. It just worked.


I am running RHEL 6.5 on both my engine and my host, as well as in this 
particular VM.


-Bob

Do you have the Engine on a separate machine, or did you set up the host as 
an All-In-One?


Did you install 6.5 or upgrade to 6.5?

Ted


[Users] Centos 6.5 host configuration script -- a tale begun

2013-12-12 Thread Ted Miller
I have been working since Monday to get a Centos 6.5 host node added to ovirt.  
Since 6.5 is just out, I figured I might as well use the latest and greatest to 
build my host.  Today I succeeded (I think).  I have not actually added a VM to 
the host, but at least ovirt is willing to accept that the host is part of the 
default cluster.  My next task will be bringing up gluster on that same node.

Since I will be doing at least three more hosts, I wrote a shell script to do 
the setup that is needed to make the ovirt process succeed.

No, I do not have a similar script for the engine.  I got that running under 
Centos 6.4 in a VM, without too much problem.  That VM is temporarily running 
on a KVM host (but that host is not under ovirt).

Feel welcome to stare at or run my script and make any comments or observations.

  *   The script is run after a clean install of Centos 6.5 from the "minimal" 
      ISO.
  *   I will try to remember what each element was there for, if anything is 
      not clear.
  *   There are probably a few (not many) things there that are not needed.
      *   Mostly they result in doing something ahead of time that ovirt was 
          going to do later anyway.
  *   Feel free to point out a better way to do whatever needs to be done.
  *   Bits and pieces of the script were stolen from googling here and there.
  *   Parts of the script were cooked up by stewing logs over low heat until 
      something useful bubbled to the top.
  *   The number of clean reinstalls to test the script is beyond count.
      *   I almost broke down and learned how to write a kickstart file (but 
          didn't).
  *   No guarantees or representations.
      *   So far this script has been tested on exactly one set of bare-metal 
          hardware.
          *   That hardware is not server-grade. (ovirt keeps complaining 
              because I have not configured Power Management :)
  *   There are a few things that are personal preferences (things I install 
      on all my Linux machines).
      *   I believe those preferences are clearly marked.
      *   I am leaving them in because they may (incidentally) be installing 
          some dependencies that influence the outcome of the process.

I hope to see a day when a similar script is either not needed, or is available 
and maintained as part of the Centos distro, or as part of ovirt.  Meanwhile we 
try to muddle through.

I will copy my script into this webmail interface (OWA) (since I am writing at 
home and this is all I have to work with) and see how bad it mangles it.  
You'll probably need a wide window so that lines don't wrap, as Microsoft 
thinks this OWA interface doesn't ever need to let me specify text as 
"preformat".  I called my script ov_host-start.sh


# script to prepare Centos 6.5 for ovirt host install process

echo "=  Ted's personal preferences--early "

yum -y install nano deltarpm yum-plugin-priorities yum-presto mlocate

echo "=== end of Ted's personal preferences--early "

yum -y upgrade

echo "install some repos (if not already done).."
cd /etc/yum.repos.d
if [ ! -f glusterfs-epel.repo ] ; then
  echo "..installing gluster repo..."
  yum -y install wget
  wget http://download.gluster.org/pub/gluster/glusterfs/LATEST/EPEL.repo/glusterfs-epel.repo
  echo "..done installing gluster repo.."
fi

if [ ! -f el6-ovirt.repo ] ; then
  echo "..installing ovirt repo..."
  yum -y localinstall http://ovirt.org/releases/ovirt-release-el.noarch.rpm
  echo "..done installing ovirt repo.."
fi

if [ ! -f epel.repo ] ; then
  echo "..installing epel repo..."
  yum -y localinstall http://mirror.us.leaseweb.net/epel/6/i386/epel-release-6-8.noarch.rpm
  echo "..done installing epel repo.."
fi


echo "install libvirt"
yum -y install libvirt qemu-kvm tuned
echo "install virt-manager" # with unlisted dependencies
yum -y install virt-manager xorg-x11-xauth dejavu-lgc-sans-mono-fonts

# create the ovirtmgmt bridge
if [ ! -f /etc/sysconfig/network-scripts/ifcfg-ovirtmgmt ]; then
   echo "creating ovirtmgmt bridge.."
   service libvirtd start
   service libvirtd status
   virsh net-destroy default
   virsh net-undefine default
   virsh iface-bridge eth0 ovirtmgmt
   service network restart
   service libvirtd stop
   service libvirtd status
fi

echo ".copy tuned profile..."
# copy virtual-host --> rhs-virtualization so ovirt is happy
cp -r /etc/tune-profiles/virtual-host /etc/tune-profiles/rhs-virtualization

yum -y install vdsm

echo "=  Ted's personal preferences--late ="

#add lines to send messages to TTY12
cat /etc/rsyslog.conf | grep tty12
if [ ! $? -eq 0 ] ; then
   ech

[Users] Re: simple networking? [SOLVED] mostly

2013-12-12 Thread Ted Miller


From: users-boun...@ovirt.org  on behalf of Ted Miller 

Sent: Wednesday, November 27, 2013 12:18 PM
To: users@ovirt.org
Subject: [Users] simple networking?

I am trying to set up a testing network using o-virt, but the networking is
refusing to cooperate.  I am testing for possible use in two different
production setups.

My previous experience has been with VMWare.  I have always set up a single
bridged network on each host.  All my hosts, VMs, and non-VM computers were
peers on the LAN.  They could all talk to each other, and things worked very
well.  There was a firewall/gateway that provided access to the Internet, and
hosts, VMs, and non-VM computers could all communicate with the Internet as needed.

o-virt seems to be compartmentalizing things beyond all reason.
Is there any way to set up simple networking, so ALL computers can see each
other?
Is there anywhere that describes the philosophy behind the networking setup?
What reason is there that networks are so divided?

After banging my head against the wall trying to configure just one host, I
am very frustrated.  I have spent several HOURS Googling for a coherent
explanation of how/why networking is supposed to work, but only find obscure
references like "letting non-VMs see VM traffic would be a huge security
violation".  I have no concept of what kind of an installation the o-virt
designers have in mind, but it is obviously worlds different from what I am
trying to do.

The best I can tell, o-virt networking works like this (at least when you
have only one NIC):
there must be an ovirtmgt network, which cannot be combined with any other
network.
  the ovirtmgt network cannot talk to VMs (unless that VM is running the
engine)
  the ovirtmgt network can only talk to hosts, not to other non-VM computers
a VM network can talk only to VMs
  cannot talk to hosts
  cannot talk to non-VMs
hosts cannot talk to my LAN
hosts cannot talk to VMs
VMs cannot talk to my LAN
All of the above are enforced by a boatload of firewall rules that o-virt
puts into every host and VM under its jurisdiction.

All of the above is inferred from things I Googled, because I can't find
anywhere that explains what or how things are supposed to work--only things
telling people WHAT THEY CAN'T DO.  All I see on the mailing lists is people
getting their hands slapped because they are trying to do SIMPLE SETUPS that
should work, but don't (due to either design restrictions or software bugs).

My use case A:
  * My (2 or 3) hosts have only one physical NIC.
  * My VMs exist to provide services to non-VM computers.
 *  The VMs do not run X-windows, but they provide GUI programs to
non-VMs via "ssh -X" connections.
  * MY VMs need access to storage that is shared with hosts and non-VMs on
the LAN.

Is there some way to TURN OFF network control in o-virt?  My systems are
small and static.  I can hand-configure the networking a whole lot easier
than I can deal with o-virt (as I have used it so far). Mostly I would need
to be able to turn off the firewall rules on both hosts and VMs.

banging head against wall,
Ted
*

I have spent the last three days getting a Centos 6.5 host running under O-virt.

Since the networking was just a small part of this, I am going to open a new 
thread to discuss the Centos 6.5 host setup process.  Look for a thread titled 
something like "Centos 6.5 host configuration" if you want the gory details, or 
want to try it for yourself.

My biggest problem is that the o-virt GUI is apparently incapable of setting
up a bridge in Centos, which turned out to be what I needed.  I had to set up 
the bridge BEFORE adding the host to the ovirt cluster.  If the bridge was not 
set up ahead of time, the whole installation failed completely.

The bridge was only one of a list of things that had to be done ahead of time, 
in order for the process to complete correctly.
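
The precondition described above (the bridge must exist before the host is added) can be checked with something like the sketch below. It assumes that a bridge named ovirtmgmt, once created, leaves an ifcfg-ovirtmgmt file in /etc/sysconfig/network-scripts; the function name is made up for this example.

```shell
#!/bin/bash
# Sketch: report whether the management bridge has been defined yet.
# Assumption: a defined bridge has an ifcfg-<name> file in the
# network-scripts directory (Centos/RHEL 6 layout).

bridge_is_defined() {
    local bridge=${1:-ovirtmgmt}
    local scripts_dir=${2:-/etc/sysconfig/network-scripts}
    [ -f "$scripts_dir/ifcfg-$bridge" ]
}

# Usage (not run here):
#   bridge_is_defined || echo "create the ovirtmgmt bridge before adding the host"
```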

Ted Miller


Re: [Users] simple networking?

2013-12-02 Thread Ted Miller
Thank you for your response, Mike.  I am slow answering because of the 
American Thanksgiving holiday.  Answers are below.


On 11/28/2013 1:41 AM, Mike Kolesnik wrote:

- Original Message -

I am trying to set up a testing network using o-virt, but the networking is
refusing to cooperate.  I am testing for possible use in two different
production setups.

My previous experience has been with VMWare.  I have always set up a single
bridged network on each host.  All my hosts, VMs, and non-VM computers were
peers on the LAN.  They could all talk to each other, and things worked very
well.  There was a firewall/gateway that provided access to the Internet, and
hosts, VMs, and non-VM computers could all communicate with the Internet as needed.

o-virt seems to be compartmentalizing things beyond all reason.
Is there any way to set up simple networking, so ALL computers can see each
other?
Is there anywhere that describes the philosophy behind the networking setup?
What reason is there that networks are so divided?

Yes there is lack of documentation in this area, it's a shame but given it's an
open source project with an open wiki, everyone is invited to contribute and
improve this.

I'll see if I can get a page started..

Please post a link if you succeed.



After banging my head against the wall trying to configure just one host, I
am very frustrated.  I have spent several HOURS Googling for a coherent
explanation of how/why networking is supposed to work, but only find obscure
references like "letting non-VMs see VM traffic would be a huge security
violation".  I have no concept of what kind of an installation the o-virt
designers have in mind, but it is obviously worlds different from what I am
trying to do.

The best I can tell, o-virt networking works like this (at least when you
have only one NIC):
there must be an ovirtmgt network, which cannot be combined with any other
network.
   the ovirtmgt network cannot talk to VMs (unless that VM is running the
engine)
   the ovirtmgt network can only talk to hosts, not to other non-VM
   computers
a VM network can talk only to VMs
   cannot talk to hosts
   cannot talk to non-VMs
hosts cannot talk to my LAN
hosts cannot talk to VMs
VMs cannot talk to my LAN
All of the above are enforced by a boatload of firewall rules that o-virt
puts into every host and VM under its jurisdiction.

Not sure what you mean by all these "restrictions", from what I know the
firewall rules that are set on each host are to allow host to talk to engine
(ssh, vdsm, VM consoles traffic, etc) no more no less..

Usually the default behavior of firewall is to block almost all communication so
when you add a host and check the "Configure firewall" box it modifies it so
that your host can function properly.


I need my host to be on my LAN (for multiple reasons).  Ovirtmgt "stole" the 
LAN connection, and cut off the host from the LAN, a connection which worked 
fine until then.


oVirt has no sense of firewall otherwise. For all it cares you can turn it off
completely, or configure it by yourself (manually or via
puppet/chef/foreman/etc) and not use the capability of the system to configure
it for you.

How do I keep the engine from reconfiguring the firewall again if I change it 
manually?  I saw a blog post that mentioned being able to uncheck a box (on 
the o-virt web GUI) called "configure IPTables". That /might/ be what I 
need.  I didn't see that box, but I wasn't looking for it (and at the moment 
I don't have o-virt available to me).

You can also change it so that it uses the rules you want by modifying
IPTablesConfig via engine-config tool.
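
A rough sketch of what reading and changing that key might look like is below. It assumes engine-config is run as root on the engine machine and uses its -g (get) and -s (set) options; the ENGINE_CONFIG variable and the function names are only hooks invented for this example, and the rule text itself is left to the reader.

```shell
#!/bin/bash
# Sketch: inspect or replace the iptables template stored in the engine.
# Assumptions: engine-config is on the PATH of the engine machine;
# ENGINE_CONFIG exists only so the command can be swapped out.

ENGINE_CONFIG=${ENGINE_CONFIG:-engine-config}

# Print the current value of the IPTablesConfig key.
show_iptables_config() {
    "$ENGINE_CONFIG" -g IPTablesConfig
}

# Store a new rule set (passed as a single string) under the same key.
set_iptables_config() {
    "$ENGINE_CONFIG" -s "IPTablesConfig=$1"
}
```

After a change the engine service would most likely need a restart (service ovirt-engine restart) before the new template is used, and it would presumably apply only to hosts installed or reinstalled afterwards.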


Where can I find documentation on changing firewall rules using engine-config?

From what I understand, I want my LAN to be my non-VLAN bridge.  Can I move 
the ovirtmgt functionality to run over the LAN, or can I/will I have to put 
ovirt-mgt onto a VLAN?

All of the above is inferred from things I Googled, because I can't find
anywhere that explains what or how things are supposed to work--only things
telling people WHAT THEY CAN'T DO.  All I see on the mailing lists is people
getting their hands slapped because they are trying to do SIMPLE SETUPS that
should work, but don't (due to either design restrictions or software bugs).
My use case A:
   * My (2 or 3) hosts have only one physical NIC.
   * My VMs exist to provide services to non-VM computers.
  *  The VMs do not run X-windows, but they provide GUI programs to
non-VMs via "ssh -X" connections.
   * MY VMs need access to storage that is shared with hosts and non-VMs on
the LAN.

Your VMs will be sitting on the ovirtmgmt network, or on a VLAN?

I want them to sit on the LAN (which may be ovirtmgt, if I can get the IP 
filtering turned off).  If they have to be on something else too, that is OK, 
as long as it does not interfere with them being on the LAN.


FYI, the LANs on both of my applications are fairly small.  One of them less 
than 10 nodes, the other less than 

Re: [Users] simple networking?

2013-12-02 Thread Ted Miller

On 11/28/2013 3:54 AM, noc wrote:

On 27-11-2013 18:18, Ted Miller wrote:
I am trying to set up a testing network using o-virt, but the networking 
is refusing to cooperate.  I am testing for possible use in two different 
production setups.


My previous experience has been with VMWare.  I have always set up a 
single bridged network on each host.  All my hosts, VMs, and non-VM 
computers were peers on the LAN.  They could all talk to each other, and 
things worked very well.  There was a firewall/gateway that provided 
access to the Internet, and hosts, VMs, and non-VM computers could all 
communicate with the Internet as needed.


o-virt seems to be compartmentalizing things beyond all reason.

That is a way to use oVirt, but the following simple setup should work and 
give you a way to check against your setup.


I have two setups, one at home and one at work. The one at home is a setup 
of 2 hosts and one of those is a hacked up host/engine.
engine/host1: standard fedora19 kde install, static ip (192.168.1.11) 
configured with my NAS (192.168.1.16) as dhcp/dns server and my internet 
router (192.168.1.254) as gateway
Just make sure that NetworkManager is off and that your interfaces are not 
NM managed, network on.
This was an all-in-one setup but I got a NAS with NFS so I turned my AIO setup 
into an engine/host system. It has problems with that but nothing network 
related.


Host2: same as above but without the engine install, ip:192.168.1.22, gw 
192.168.1.254 DNS:192.168.1.16.


How does it all come together?
Well in your case, and mine if I were to start over, start with a static 
network which is NOT managed by NetworkManager. Use either Fedora or Centos, 
whichever you are more comfortable with; it also depends on whether you 
want to test/use all the features in oVirt. Currently, there are a few 
features not available in Centos because the versions of 
libvirt/kvm/qemu/gluster are too old in Centos.
Install ovirt-engine on your first 'server', probably choose NFS as your 
storage domain, either on your engine server or from somewhere else on your 
network. Make sure it's NFS v3 and not v4! The local default is v4!
Make sure that IP addresses on your network are resolvable, either through 
/etc/hosts or through DNS! Engine-setup will complain if this doesn't work, 
using localhost will not work either!
On the engine server there will be no bridge and nothing will change the 
network config.
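
The name-resolution requirement can be checked before running engine-setup with something like the sketch below. It assumes getent is available (it consults /etc/hosts and DNS according to the system resolver configuration); the function name is invented, and rejecting "localhost" up front simply mirrors the remark above.

```shell
#!/bin/bash
# Sketch: decide whether a host name looks usable for engine-setup.
# Assumption: "usable" means it is not empty, not localhost, and the system
# resolver (/etc/hosts or DNS) can find an address for it.

name_is_usable() {
    local name=$1
    case "$name" in
        ''|localhost|localhost.*) return 1 ;;   # localhost will not work
    esac
    getent hosts "$name" > /dev/null
}

# Usage (not run here): name_is_usable "$(hostname -f)" || echo "fix DNS or /etc/hosts first"
```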


Next the first host.
Prepare the host in a similar way you did the engine server. You can choose 
a minimal install of either Centos or Fedora or install a full desktop but 
make sure that ips are static and NOT managed by NetworkManager, hostname 
resolvable, ovirt repo available.
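
As a rough check of the "static and not NetworkManager managed" requirement, an interface file can be inspected as in the sketch below. It assumes Fedora/Centos style ifcfg files in which NM_CONTROLLED=no takes the interface away from NetworkManager and BOOTPROTO is "none" or "static" for a fixed address; the function name is made up.

```shell
#!/bin/bash
# Sketch: succeed only if an ifcfg file is both unmanaged by NetworkManager
# and configured with a static address.
# Assumptions: NM_CONTROLLED=no and BOOTPROTO=none|static are the settings
# that express this; values may or may not be wrapped in double quotes.

ifcfg_is_static_unmanaged() {
    local file=$1
    grep -Eqi '^NM_CONTROLLED="?no"?[[:space:]]*$' "$file" &&
    grep -Eqi '^BOOTPROTO="?(none|static)"?[[:space:]]*$' "$file"
}

# Usage (not run here):
#   ifcfg_is_static_unmanaged /etc/sysconfig/network-scripts/ifcfg-eth0
```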


From the webui add your prepared host and if everything went OK you'll see 
that on that host you will now have a bridge, ovirtmgmt, which acts as the 
primary interface.
Create a VM and choose ovirtmgmt as the network for its NICs; you can't choose 
anything else. Either give the VMs a static address or use a DHCP server, 
but the VMs should be able to talk to each other, to the host(s), the 
engine and to the internet.


Every host that you add after the first will also have its network turned 
into a bridge, ovirtmgmt, and communication/migration/display/etc will take 
place over this network. One caveat, storage domain mapping is from the 
host to the storage, the engine, if it is NOT the NFS server, doesn't have 
to have access to the storage.


If you have servers with more than 1 nic then you can create additional 
networks using the webui of oVirt and assign these to clusters and to VMs.


If you need vlans to coexist with ovirtmgmt on the same physical nic, I 
think that is possible but haven't tried it myself. In theory you need to 
setup the network first outside of oVirt, including your vlan structure and 
then install ovirt.


Some concepts:
oVirt engine: is just the manager, does 'nothing' related to running VMs 
itself. You can turn it off and all hosts with their VMs will keep running. 
You just can't start new ones, in short manage them.
oVirt host: is the real workhorse and is managed using oVirt-engine. Runs 
VDSM which communicates with engine and starts/manages the VMs on the host 
on behalf of engine.
oVirt node: is a special slimmed down Fedora distro that includes VDSM and 
a small setup so that it can be used as an oVirt host


People tend to mix and match ovirt-host and ovirt-node which makes for nice 
communication problems :-)


If you haven't done so, there is an irc channel, ovirt, on irc.oftc.net 
with helpful people, if they are awake.


Joop
--
#irc jvandewege

When I get another project out of the way (hopefully this week), I will be 
able to get back to my test setup and try again.  Between your info, 
something I stumbled onto on a blog, and the info from Mike, I hope to have 
enough to make some progress when I take another stab at it.


Ted Miller


Re: [Users] simple networking?

2013-12-02 Thread Ted Miller


On 11/27/2013 4:35 PM, Thomas Suckow wrote:

On 11/27/2013 01:00 PM, Ted Miller wrote:

I am not using an all-in-one.

Do you have more than one host?  If not, that is a very different story,
because it only has to "talk to itself".  I have the engine on a VM (at the
moment on a KVM host not managed by ovirt).  I was trying to bring up one
host, but couldn't get past that point.  Will then have to add another host,
and migrate the engine to running on one of those two hosts.
Ted Miller

I don't currently, I had dabbled with adding another host but found out the
other server had a different processor and removed it. That said, my vms
can talk to each other and the host can talk to vms and vice versa.


That still doesn't offer what I need: VMs and host all talking on LAN to all 
other LAN residents.



It works better than when I just used virt-manager.

After setting up the bridge on the host does it lose all network connectivity?


No, it could still talk to ovirt-engine.  It seemed to work the way o-virt 
wanted it to, just not the way I need it to.



If so it may be the same issue I was having where I had to manually
manipulate the network configuration to fix the bridge.


Thanks for the answer,
Ted Miller



[Users] simple networking?

2013-11-27 Thread Ted Miller
I am trying to set up a testing network using o-virt, but the networking is 
refusing to cooperate.  I am testing for possible use in two different 
production setups.


My previous experience has been with VMWare.  I have always set up a single 
bridged network on each host.  All my hosts, VMs, and non-VM computers were 
peers on the LAN.  They could all talk to each other, and things worked very 
well.  There was a firewall/gateway that provided access to the Internet, and 
hosts, VMs, and non-VM computers could all communicate with the Internet as needed.


o-virt seems to be compartmentalizing things beyond all reason.
Is there any way to set up simple networking, so ALL computers can see each 
other?

Is there anywhere that describes the philosophy behind the networking setup?
What reason is there that networks are so divided?

After banging my head against the wall trying to configure just one host, I 
am very frustrated.  I have spent several HOURS Googling for a coherent 
explanation of how/why networking is supposed to work, but only find obscure 
references like "letting non-VMs see VM traffic would be a huge security 
violation".  I have no concept of what kind of an installation the o-virt 
designers have in mind, but it is obviously worlds different from what I am 
trying to do.


The best I can tell, o-virt networking works like this (at least when you 
have only one NIC):
there must be an ovirtmgt network, which cannot be combined with any other 
network.
 the ovirtmgt network cannot talk to VMs (unless that VM is running the 
engine)

 the ovirtmgt network can only talk to hosts, not to other non-VM computers
a VM network can talk only to VMs
 cannot talk to hosts
 cannot talk to non-VMs
hosts cannot talk to my LAN
hosts cannot talk to VMs
VMs cannot talk to my LAN
All of the above are enforced by a boatload of firewall rules that o-virt 
puts into every host and VM under its jurisdiction.


All of the above is inferred from things I Googled, because I can't find 
anywhere that explains what or how things are supposed to work--only things 
telling people WHAT THEY CAN'T DO.  All I see on the mailing lists is people 
getting their hands slapped because they are trying to do SIMPLE SETUPS that 
should work, but don't (due to either design restrictions or software bugs).


My use case A:
 * My (2 or 3) hosts have only one physical NIC.
 * My VMs exist to provide services to non-VM computers.
*  The VMs do not run X-windows, but they provide GUI programs to 
non-VMs via "ssh -X" connections.
 * MY VMs need access to storage that is shared with hosts and non-VMs on 
the LAN.


Is there some way to TURN OFF network control in o-virt?  My systems are 
small and static.  I can hand-configure the networking a whole lot easier 
than I can deal with o-virt (as I have used it so far). Mostly I would need 
to be able to turn off the firewall rules on both hosts and VMs.


banging head against wall,
Ted
