[Users] general (totally noob) questions...

2012-09-21 Thread Michael Hauber
While I am not completely new to running instances in KVM/Qemu, I am a total 
noob when it comes to clustering.

While I realize that not all of these questions pertain specifically to the 
oVirt project, I would much rather get input from people that have experience 
in this environment.

I am in the process of designing my next server farm and would like to know 
some basics that I can't seem to find the answers to via google search.


Background:

I have a server farm setup for a large family.  It consists of 9 servers 
(minimal-to-medium loads for the most part, but they are in poor condition 
physically).  My hopes are to run this same setup virtually in a 3-node 
cluster.  I can run the entire farm in a virtual environment on a single 
dual-socket, quad-core box with 64 GB of RAM.  That said, I know that each of 
these nodes will be capable of handling the entire load by itself if 
absolutely necessary.

For disk space, my hopes are to use two drive arrays (RAID 10 configuration?) 
which I can build (up and out) from without too much trouble.

Everything will be fiber, including drops to the computers and televisions.  
Copper will exist as well for things like PoE, printers, etc. (but not in 
scope).


Questions:

1.  As I understand it, an N-to-N configuration means that there will be load 
balancing between the nodes as well as failover.  Is the load balancing 
manual (i.e., I have to monitor and balance the load myself), or is it done 
automatically?

2.  If it is done automatically, how do the loads get split up?  Is the 
virtual machine itself the unit of load that transfers from one node to 
another or does it go so far as balancing services running inside those 
virtual machines?

3.  For the failover, is it seamless in the sense that users' connections 
don't get reset, or is there a short period of downtime before the service is 
available again?  While this isn't a big issue for me, it is something that 
I've been wondering about.

4.  Fibre Channel or FCoE?  (I've spent entire evenings trying to get a 
straight answer through Google searches, but there seem to be way too many 
agendas.)  Given that one of the virtual servers will be a media server for 
the televisions (new addition), my worry is lag (I would like to serve at 
least 5 televisions without lag).  The array will also support things like a 
file server, space for about two dozen family web pages (lots of pictures), 
space for mail, space for backups (rsync, Amanda), ISO boots, etc.

I have the logical and physical topology drawn up for the most part, but until 
I understand a bit more, I'm scared _less about dropping any money into the 
project, because I've never seen a setup like this in action, and what I find 
online seems a bit vague in the way of details (too dumbed-down for the likes 
of me to know whether the hardware I'm buying is necessary and/or sufficient 
to achieve my goals).

Any and all pointers, suggestions, patience, etc. would be greatly 
appreciated.

mchauber


 
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Problem with creating a glusterfs volume

2012-09-21 Thread Jason Brooks

On Fri 21 Sep 2012 04:19:27 PM PDT, Dominic Kaiser wrote:

Yes I can mount both to another computer.  Just not to ovirt.  I
noticed on the other computer which is Ubuntu 12.04 if you leave
mountproto=tcp out of the command it does not mount.  Does engine
default to tcp?


I believe that the gluster nfs server only supports tcp. On my setup, 
I've edited /etc/nfsmount.conf with Defaultvers=3, Nfsvers=3, and 
Defaultproto=tcp
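
For reference, a minimal /etc/nfsmount.conf along those lines might look like 
this (the section name follows nfs-utils' stock file; adjust to your 
distribution):

```ini
[ NFSMount_Global_Options ]
# Force NFSv3 over TCP for every mount, matching gluster's NFS server
Defaultvers=3
Nfsvers=3
Defaultproto=tcp
```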




Dk

On Sep 21, 2012 6:36 PM, "Jason Brooks" <jbro...@redhat.com> wrote:

On 09/21/2012 08:09 AM, Dominic Kaiser wrote:

I can mount to another computer with this command:

mount -o mountproto=tcp,vers=3 -t nfs
gfs1.bostonvineyard.org:/data
/home/administrator/test


I notice that in your previous message, citing the mount that
didn't work, you were mounting :/export, and above you're mounting
:/data. Can you also mount the export volume from another computer?



So volumes work but I get a 500 error timeout when trying to
add as a
storage domain in ovirt.  weird?

dk

On Fri, Sep 21, 2012 at 10:44 AM, Dominic Kaiser
<domi...@bostonvineyard.org> wrote:

Hey All,

So I finally found the problem.  Cheap NIC's.  Installed
Intel NIC's
no problems creating gluster volumes and distributed
replicated
ones.  Broadcom and Realtek yuk!  So now I am trying to
mount the
gluster volume as a nfs mount and am having a problem.  It
is timing
out like it is blocked by a firewall.

I am trying to:  mount -t nfs
gfs1.bostonvineyard.org:/export
/home/administrator/test

Here is gfs1 tail vdsm.log

[root@gfs1 vdsm]# tail vdsm.log
Thread-88731::DEBUG::2012-09-21
10:35:56,566::resourceManager::844::ResourceManager.Owner::(cancelAll)
Owner.cancelAll requests {}
Thread-88731::DEBUG::2012-09-21
10:35:56,567::task::978::TaskManager.Task::(_decref)
Task=`01b69eed-de59-4e87-8b28-5268b5dcbb50`::ref 0 aborting False
Thread-88737::DEBUG::2012-09-21
10:36:06,890::task::588::TaskManager.Task::(_updateState)
Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::moving from state init
-> state preparing
Thread-88737::INFO::2012-09-21
10:36:06,891::logUtils::37::dispatcher::(wrapper) Run and protect:
repoStats(options=None)
Thread-88737::INFO::2012-09-21
10:36:06,891::logUtils::39::dispatcher::(wrapper) Run and protect:
repoStats, Return response: {}
Thread-88737::DEBUG::2012-09-21
10:36:06,891::task::1172::TaskManager.Task::(prepare)
Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::finished: {}
Thread-88737::DEBUG::2012-09-21
10:36:06,892::task::588::TaskManager.Task::(_updateState)
Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::moving from state
preparing -> state finished
Thread-88737::DEBUG::2012-09-21
10:36:06,892::resourceManager::809::ResourceManager.Owner::(releaseAll)
Owner.releaseAll requests {} resources {}
Thread-88737::DEBUG::2012-09-21
10:36:06,892::resourceManager::844::ResourceManager.Owner::(cancelAll)
Owner.cancelAll requests {}
Thread-88737::DEBUG::2012-09-21
10:36:06,893::task::978::TaskManager.Task::(_decref)
Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::ref 0 aborting False

Do you know why I can not connect via NFS?  Using an older
kernel
not 3.5 and iptables are off.

Dominic


On Mon, Sep 10, 2012 at 12:20 PM, Haim Ateya
<hat...@redhat.com> wrote:

On 09/10/2012 06:27 PM, Dominic Kaiser wrote:

Here is the message and the logs again except zipped I
failed the first delivery:

Ok here are the logs 4 node and 1 engine log.
 Tried making
/data folder owned by root and then tried by 36:36
neither
worked.  Name of volume is data to match folders
on nodes also.

Let me know what you think,

Dominic


this is the actual failure (taken from gfs2vdsm.log).

Thread-332442::DEBUG::2012-09-10
10:28:05,788::BindingXMLRPC::859::vds::(wrapper)
client
[10.3.0.241]::call volumeCreate with ('data',

Re: [Users] Problem with creating a glusterfs volume

2012-09-21 Thread Dominic Kaiser
Yes I can mount both to another computer.  Just not to ovirt.  I noticed on
the other computer which is Ubuntu 12.04 if you leave mountproto=tcp out of
the command it does not mount.  Does engine default to tcp?

Dk
On Sep 21, 2012 6:36 PM, "Jason Brooks"  wrote:

> On 09/21/2012 08:09 AM, Dominic Kaiser wrote:
>
>> I can mount to another computer with this command:
>>
>> mount -o mountproto=tcp,vers=3 -t nfs gfs1.bostonvineyard.org:/data
>> /home/administrator/test
>>
>
> I notice that in your previous message, citing the mount that didn't work,
> you were mounting :/export, and above you're mounting :/data. Can you also
> mount the export volume from another computer?
>
>
>
>> So volumes work but I get a 500 error timeout when trying to add as a
>> storage domain in ovirt.  weird?
>>
>> dk
>>
>> On Fri, Sep 21, 2012 at 10:44 AM, Dominic Kaiser
>> <domi...@bostonvineyard.org> wrote:
>>
>> Hey All,
>>
>> So I finally found the problem.  Cheap NIC's.  Installed Intel NIC's
>> no problems creating gluster volumes and distributed replicated
>> ones.  Broadcom and Realtek yuk!  So now I am trying to mount the
>> gluster volume as a nfs mount and am having a problem.  It is timing
>> out like it is blocked by a firewall.
>>
>> I am trying to:  mount -t nfs gfs1.bostonvineyard.org:/export
>> /home/administrator/test
>>
>> Here is gfs1 tail vdsm.log
>>
>> [root@gfs1 vdsm]# tail vdsm.log
>> Thread-88731::DEBUG::2012-09-21
>> 10:35:56,566::resourceManager::844::ResourceManager.Owner::(cancelAll)
>> Owner.cancelAll requests {}
>> Thread-88731::DEBUG::2012-09-21
>> 10:35:56,567::task::978::TaskManager.Task::(_decref)
>> Task=`01b69eed-de59-4e87-8b28-5268b5dcbb50`::ref 0 aborting False
>> Thread-88737::DEBUG::2012-09-21
>> 10:36:06,890::task::588::TaskManager.Task::(_updateState)
>> Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::moving from state init
>> -> state preparing
>> Thread-88737::INFO::2012-09-21
>> 10:36:06,891::logUtils::37::dispatcher::(wrapper) Run and protect:
>> repoStats(options=None)
>> Thread-88737::INFO::2012-09-21
>> 10:36:06,891::logUtils::39::dispatcher::(wrapper) Run and protect:
>> repoStats, Return response: {}
>> Thread-88737::DEBUG::2012-09-21
>> 10:36:06,891::task::1172::TaskManager.Task::(prepare)
>> Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::finished: {}
>> Thread-88737::DEBUG::2012-09-21
>> 10:36:06,892::task::588::TaskManager.Task::(_updateState)
>> Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::moving from state
>> preparing -> state finished
>> Thread-88737::DEBUG::2012-09-21
>> 10:36:06,892::resourceManager::809::ResourceManager.Owner::(releaseAll)
>> Owner.releaseAll requests {} resources {}
>> Thread-88737::DEBUG::2012-09-21
>> 10:36:06,892::resourceManager::844::ResourceManager.Owner::(cancelAll)
>> Owner.cancelAll requests {}
>> Thread-88737::DEBUG::2012-09-21
>> 10:36:06,893::task::978::TaskManager.Task::(_decref)
>> Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::ref 0 aborting False
>>
>> Do you know why I can not connect via NFS?  Using an older kernel
>> not 3.5 and iptables are off.
>>
>> Dominic
>>
>>
>> On Mon, Sep 10, 2012 at 12:20 PM, Haim Ateya <hat...@redhat.com> wrote:
>> On 09/10/2012 06:27 PM, Dominic Kaiser wrote:
>>
>> Here is the message and the logs again except zipped I
>> failed the first delivery:
>>
>> Ok here are the logs 4 node and 1 engine log.  Tried making
>> /data folder owned by root and then tried by 36:36 neither
>> worked.  Name of volume is data to match folders on nodes
>> also.
>>
>> Let me know what you think,
>>
>> Dominic
>>
>>
>> this is the actual failure (taken from gfs2vdsm.log).
>>
>> Thread-332442::DEBUG::2012-09-10
>> 10:28:05,788::BindingXMLRPC::859::vds::(wrapper) client
>> [10.3.0.241]::call volumeCreate with ('data',
>> ['10.4.0.97:/data', '10.4.0.98:/data', '10.4.0.99:/data',
>> '10.4.0.100:/data'],
>>   2, 0, ['TCP']) {} flowID [406f2c8e]
>> MainProcess|Thread-332442::DEBUG::2012-09-10
>> 10:28:05,792::__init__::1249::Storage.Misc.excCmd::(_log)
>> '/usr/sbin/gluster --mode=script volume create data replica 2
>> transport TCP 10.4.0.97:/data 10.4.0.98:/data 10
>> .4.0.99:/data 10.4.0.100:/data' (cwd None)
>> MainProcess|Thread-332442::DEBUG::2012-09-10
>> 10:28:05,900::__init__::1249::Storage.Misc.excCmd::(_log)
>> FAILED: <err> = 'Host 10.4.0.99 not a friend\n'; <rc> = 255
>> MainProcess|Thread-332442::ERROR::2012-09-10
>> 10:28:05,900::supervdsmServer::76::SuperVdsm.ServerCallback::(wrapper)
>>  

Re: [Users] Problem with creating a glusterfs volume

2012-09-21 Thread Jason Brooks

On 09/21/2012 08:09 AM, Dominic Kaiser wrote:

I can mount to another computer with this command:

mount -o mountproto=tcp,vers=3 -t nfs gfs1.bostonvineyard.org:/data
/home/administrator/test


I notice that in your previous message, citing the mount that didn't 
work, you were mounting :/export, and above you're mounting :/data. Can 
you also mount the export volume from another computer?





So volumes work but I get a 500 error timeout when trying to add as a
storage domain in ovirt.  weird?

dk

On Fri, Sep 21, 2012 at 10:44 AM, Dominic Kaiser
<domi...@bostonvineyard.org> wrote:

Hey All,

So I finally found the problem.  Cheap NIC's.  Installed Intel NIC's
no problems creating gluster volumes and distributed replicated
ones.  Broadcom and Realtek yuk!  So now I am trying to mount the
gluster volume as a nfs mount and am having a problem.  It is timing
out like it is blocked by a firewall.

I am trying to:  mount -t nfs gfs1.bostonvineyard.org:/export
/home/administrator/test

Here is gfs1 tail vdsm.log

[root@gfs1 vdsm]# tail vdsm.log
Thread-88731::DEBUG::2012-09-21
10:35:56,566::resourceManager::844::ResourceManager.Owner::(cancelAll)
Owner.cancelAll requests {}
Thread-88731::DEBUG::2012-09-21
10:35:56,567::task::978::TaskManager.Task::(_decref)
Task=`01b69eed-de59-4e87-8b28-5268b5dcbb50`::ref 0 aborting False
Thread-88737::DEBUG::2012-09-21
10:36:06,890::task::588::TaskManager.Task::(_updateState)
Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::moving from state init
-> state preparing
Thread-88737::INFO::2012-09-21
10:36:06,891::logUtils::37::dispatcher::(wrapper) Run and protect:
repoStats(options=None)
Thread-88737::INFO::2012-09-21
10:36:06,891::logUtils::39::dispatcher::(wrapper) Run and protect:
repoStats, Return response: {}
Thread-88737::DEBUG::2012-09-21
10:36:06,891::task::1172::TaskManager.Task::(prepare)
Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::finished: {}
Thread-88737::DEBUG::2012-09-21
10:36:06,892::task::588::TaskManager.Task::(_updateState)
Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::moving from state
preparing -> state finished
Thread-88737::DEBUG::2012-09-21
10:36:06,892::resourceManager::809::ResourceManager.Owner::(releaseAll)
Owner.releaseAll requests {} resources {}
Thread-88737::DEBUG::2012-09-21
10:36:06,892::resourceManager::844::ResourceManager.Owner::(cancelAll)
Owner.cancelAll requests {}
Thread-88737::DEBUG::2012-09-21
10:36:06,893::task::978::TaskManager.Task::(_decref)
Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::ref 0 aborting False

Do you know why I can not connect via NFS?  Using an older kernel
not 3.5 and iptables are off.

Dominic


On Mon, Sep 10, 2012 at 12:20 PM, Haim Ateya <hat...@redhat.com> wrote:

On 09/10/2012 06:27 PM, Dominic Kaiser wrote:

Here is the message and the logs again except zipped I
failed the first delivery:

Ok here are the logs 4 node and 1 engine log.  Tried making
/data folder owned by root and then tried by 36:36 neither
worked.  Name of volume is data to match folders on nodes also.

Let me know what you think,

Dominic


this is the actual failure (taken from gfs2vdsm.log).

Thread-332442::DEBUG::2012-09-10
10:28:05,788::BindingXMLRPC::859::vds::(wrapper) client
[10.3.0.241]::call volumeCreate with ('data',
['10.4.0.97:/data', '10.4.0.98:/data', '10.4.0.99:/data',
'10.4.0.100:/data'],
  2, 0, ['TCP']) {} flowID [406f2c8e]
MainProcess|Thread-332442::DEBUG::2012-09-10
10:28:05,792::__init__::1249::Storage.Misc.excCmd::(_log)
'/usr/sbin/gluster --mode=script volume create data replica 2
transport TCP 10.4.0.97:/data 10.4.0.98:/data 10
.4.0.99:/data 10.4.0.100:/data' (cwd None)
MainProcess|Thread-332442::DEBUG::2012-09-10
10:28:05,900::__init__::1249::Storage.Misc.excCmd::(_log)
FAILED: <err> = 'Host 10.4.0.99 not a friend\n'; <rc> = 255
MainProcess|Thread-332442::ERROR::2012-09-10
10:28:05,900::supervdsmServer::76::SuperVdsm.ServerCallback::(wrapper)
Error in wrapper
Traceback (most recent call last):
   File "/usr/share/vdsm/supervdsmServer.py", line 74, in wrapper
 return func(*args, **kwargs)
   File "/usr/share/vdsm/supervdsmServer.py", line 286, in wrapper
 return func(*args, **kwargs)
   File "/usr/share/vdsm/gluster/cli.py", line 46, in wrapper
 return func(*args, **kwargs)
   File "/usr/share/vdsm/gluster/cli.py", line 176, in volumeCreate
 raise ge.GlusterVolumeCreateFailedException(rc, out, err)
GlusterVolumeCreateFailedException: Volume create failed
error: Hos
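
For anyone hitting the same failure: "not a friend" from gluster generally 
means the target host is not yet part of the trusted storage pool. A sketch 
of the usual fix, run from a host already in the pool (the addresses are 
taken from the log above):

```shell
# Probe the missing peer so it joins the trusted pool, then verify
gluster peer probe 10.4.0.99
gluster peer status

# Retry the volume create that vdsm attempted
gluster --mode=script volume create data replica 2 transport TCP \
    10.4.0.97:/data 10.4.0.98:/data 10.4.0.99:/data 10.4.0.100:/data
```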

Re: [Users] Problem with creating a glusterfs volume

2012-09-21 Thread Dominic Kaiser
Any ideas?  Pretty please.

dk

On Fri, Sep 21, 2012 at 11:51 AM, Dominic Kaiser  wrote:

> I noticed something.  If I am trying to mount the gluster share from
> another computer and do not include mountproto=tcp, it times out.  vers=3 or
> 4 does not matter.  Could this be why I can not add it from the engine gui?
>
> dk
>
>
> On Fri, Sep 21, 2012 at 11:12 AM, Dominic Kaiser <
> domi...@bostonvineyard.org> wrote:
>
>> Here is the engine.log info:
>>
>> [root@ovirt ovirt-engine]# tail engine.log
>> 2012-09-21 11:10:00,007 INFO
>>  [org.ovirt.engine.core.bll.AutoRecoveryManager]
>> (QuartzScheduler_Worker-49) Autorecovering 0 hosts
>> 2012-09-21 11:10:00,007 INFO
>>  [org.ovirt.engine.core.bll.AutoRecoveryManager]
>> (QuartzScheduler_Worker-49) Checking autorecoverable hosts done
>> 2012-09-21 11:10:00,008 INFO
>>  [org.ovirt.engine.core.bll.AutoRecoveryManager]
>> (QuartzScheduler_Worker-49) Checking autorecoverable storage domains
>> 2012-09-21 11:10:00,009 INFO
>>  [org.ovirt.engine.core.bll.AutoRecoveryManager]
>> (QuartzScheduler_Worker-49) Autorecovering 0 storage domains
>> 2012-09-21 11:10:00,010 INFO
>>  [org.ovirt.engine.core.bll.AutoRecoveryManager]
>> (QuartzScheduler_Worker-49) Checking autorecoverable storage domains done
>> 2012-09-21 11:10:22,710 ERROR
>> [org.ovirt.engine.core.engineencryptutils.EncryptionUtils]
>> (QuartzScheduler_Worker-84) Failed to decryptData must not be longer than
>> 256 bytes
>> 2012-09-21 11:10:22,726 ERROR
>> [org.ovirt.engine.core.engineencryptutils.EncryptionUtils]
>> (QuartzScheduler_Worker-12) Failed to decryptData must start with zero
>> 2012-09-21 11:10:54,519 INFO
>>  [org.ovirt.engine.core.bll.storage.RemoveStorageServerConnectionCommand]
>> (ajp--0.0.0.0-8009-11) [3769be9c] Running command:
>> RemoveStorageServerConnectionCommand internal: false. Entities affected :
>>  ID: aaa0----123456789aaa Type: System
>> 2012-09-21 11:10:54,537 INFO
>>  
>> [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStorageServerVDSCommand]
>> (ajp--0.0.0.0-8009-11) [3769be9c] START,
>> DisconnectStorageServerVDSCommand(vdsId =
>> 3822e6c0-0295-11e2-86e6-d74ad5358c03, storagePoolId =
>> ----, storageType = NFS, connectionList =
>> [{ id: null, connection: gfs1.bostonvineyard.org:/data };]), log id:
>> 16dd4a1b
>> 2012-09-21 11:10:56,417 INFO
>>  
>> [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStorageServerVDSCommand]
>> (ajp--0.0.0.0-8009-11) [3769be9c] FINISH,
>> DisconnectStorageServerVDSCommand, return:
>> {----=477}, log id: 16dd4a1b
>>
>> Thanks,
>>
>> dk
>>
>> On Fri, Sep 21, 2012 at 11:09 AM, Dominic Kaiser <
>> domi...@bostonvineyard.org> wrote:
>>
>>> I can mount to another computer with this command:
>>>
>>> mount -o mountproto=tcp,vers=3 -t nfs gfs1.bostonvineyard.org:/data
>>> /home/administrator/test
>>>
>>> So volumes work but I get a 500 error timeout when trying to add as a
>>> storage domain in ovirt.  weird?
>>>
>>> dk
>>>
>>> On Fri, Sep 21, 2012 at 10:44 AM, Dominic Kaiser <
>>> domi...@bostonvineyard.org> wrote:
>>>
 Hey All,

 So I finally found the problem.  Cheap NIC's.  Installed Intel NIC's no
 problems creating gluster volumes and distributed replicated ones.
  Broadcom and Realtek yuk!  So now I am trying to mount the gluster volume
 as a nfs mount and am having a problem.  It is timing out like it is
 blocked by a firewall.

 I am trying to:  mount -t nfs gfs1.bostonvineyard.org:/export
 /home/administrator/test

 Here is gfs1 tail vdsm.log

 [root@gfs1 vdsm]# tail vdsm.log
 Thread-88731::DEBUG::2012-09-21
 10:35:56,566::resourceManager::844::ResourceManager.Owner::(cancelAll)
 Owner.cancelAll requests {}
 Thread-88731::DEBUG::2012-09-21
 10:35:56,567::task::978::TaskManager.Task::(_decref)
 Task=`01b69eed-de59-4e87-8b28-5268b5dcbb50`::ref 0 aborting False
 Thread-88737::DEBUG::2012-09-21
 10:36:06,890::task::588::TaskManager.Task::(_updateState)
 Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::moving from state init ->
 state preparing
 Thread-88737::INFO::2012-09-21
 10:36:06,891::logUtils::37::dispatcher::(wrapper) Run and protect:
 repoStats(options=None)
 Thread-88737::INFO::2012-09-21
 10:36:06,891::logUtils::39::dispatcher::(wrapper) Run and protect:
 repoStats, Return response: {}
 Thread-88737::DEBUG::2012-09-21
 10:36:06,891::task::1172::TaskManager.Task::(prepare)
 Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::finished: {}
 Thread-88737::DEBUG::2012-09-21
 10:36:06,892::task::588::TaskManager.Task::(_updateState)
 Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::moving from state preparing ->
 state finished
 Thread-88737::DEBUG::2012-09-21
 10:36:06,892::resourceManager::809::ResourceManager.Owner::(releaseAll)
 Owner.releaseAll requests {} resources {}
 Thread-88737::DEBUG::2012-09-21
>>>

Re: [Users] Installation problem

2012-09-21 Thread Joop

Dave Neary wrote:

Hi,

Additional information:

After removing and re-running engine-setup, I try once again to run 
ovirt-engine. I get the following error in the logs:

==> /var/log/ovirt-engine/console.log <==
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

Nothing else, unfortunately, and I have no idea how to debug JBoss AS 
to figure out what's going wrong.


Are there any requirements in terms of the JRE which aren't specified 
in the quick start guide?


@Dave,
Don't know if you have the luxury of starting from scratch, but to be 
honest I have had problems with multiple engine-setup/cleanup cycles and 
had to start from scratch, which was faster than figuring out what went 
wrong. Those problems were with the pre-3.1 releases. This is my 
ovirt.repo, which I guess will be the same for you:


[ovirt-stable]
name=Stable builds of the oVirt project
baseurl=http://ovirt.org/releases/stable/rpm/Fedora/$releasever/
enabled=1
skip_if_unavailable=1
gpgcheck=0

[ovirt-beta]
name=Beta builds of the oVirt project
baseurl=http://ovirt.org/releases/beta/rpm/Fedora/$releasever/
enabled=1
skip_if_unavailable=1
gpgcheck=0

[ovirt-nightly]
name=Nightly builds of the oVirt project
baseurl=http://ovirt.org/releases/nightly/rpm/Fedora/$releasever/
enabled=0
skip_if_unavailable=1
gpgcheck=0


And the following is the first part of my history of my managment server

   1  chkconfig sshd on
   2  service sshd start
   3  ifconfig
   4  yum update --exclude=kernel*
   5  uname -a
   6  yum install -y firefox
   7  yum install -y wget
   8  yum install -y mc
   9  yum install -y spice-xpi
  10  init 6
  11  system-config-network
  12  cd /etc/sysconfig/networking/
  13  ls
  14  ls -l devices/
  15  cat devices/ifcfg-em1
  16  nano devices/ifcfg-em1
  17  nano devices/ifcfg-p2p1
  18  nano devices/ifcfg-p2p2
  19  ls
  20  ls devices/
  21  chkconfig NetworkManager off
  22  service NetworkManager stop
  23  chkconfig network on
  24  service network start
  25  ifconfig
  26  init 6
  27  exit
  28  ifconfig
  29  ifconfig
  30  yum localinstall 
http://ovirt.org/releases/ovirt-release-fedora.noarch.rpm

  31  nano /etc/yum.repos.d/ovirt.repo
  32  yum install engine-setup -y
  33  ifconfig
  34  ifconfig
  35  pwd
  36  system-config-network
  37  ifconfig
  38  service network restart
  39  ifconfig
  40  init 6
  41  exit
  42  cat /proc/cpuinfo
  43  dmesg | grep CZ100
  44  yum install dmidecode
  45  cat /proc/meminfo
  46  dmidecode | less
  47  dmidecode | grep CZ
  48  ps
  49  ps -ef
  50  top
  51  cat /etc/exports
  52  nano /etc/exports


So all I did was follow the quick install guide: I started off by getting 
some things installed that I like, stopped NetworkManager and set up the 
network, installed the ovirt.repo, entered my server in RT ;-) (the dmidecode 
stuff), and then ran engine-setup (not shown, because I probably did it in 
another console).
Both my engine server and the two hosts I run for testing are HP 
ProLiant ML110s, plus one DL3xx host. I had one migration problem because 
the UUID in the ML110s is the same, and then you can't migrate; the error 
is: migrating to same host, or something close to it.

Base install is from Fedora 17 KDE live CD.

Joop




Re: [Users] Installation problem

2012-09-21 Thread Joop

Steve Gordon wrote:

Has anyone else experienced this issue?
  

Yes, not related to oVirt but on a database server also running
Postgres. It seems that either the package maintainer is very
conservative or postgres itself is. Standard on the Debian 6 server
was
also very low shmmax.
What is the OS you run ovirt-engine on?



I'm going to take a stab and guess Fedora. This came up for an unrelated reason 
in #fedora-devel the other day. Because Fedora (and I suspect Debian as well) 
has a policy of sticking as close to upstream as possible, it uses the shmmax of 
the upstream kernel, which is, as you note, quite low. In RHEL and other EL6 
derivatives this value is modified and set much higher.

Steve

  
OK, learned something; I didn't suspect that Fedora's default would be so 
low, since it's set higher in RHEL and, for example, CentOS, which I use on a 
production database platform.

Thought that Fedora and RHEL were much closer.

Joop




Re: [Users] Can oVirt be installed in a virtual machine?

2012-09-21 Thread Alan Johnson
On Wed, Sep 12, 2012 at 9:22 PM, Nicolas Chenier  wrote:

> I was under the impression that my oVirt VM would show up in oVirt and
> that I could manage it through there...
>
> What you're saying is that I should just run it seperatly and not manage
> it with itself (oVirt)? keep it on my shared storage so that I can run it
> off any of the 2 servers? But not manage it with oVirt (itself). I think
> I'm starting to get it now...
>
> I really appreciate your help!
>
> Nic
>

Nic, how did you make out with this?  I'm looking to do the same thing and
am wondering if there is any risk in running the engine on a VM managed by
the same engine, as you were suggesting before.  Did you give this a shot?

Itamar, why did you steer Nic away from this?

___
Alan Johnson
a...@datdec.com


Re: [Users] Installation problem

2012-09-21 Thread Dave Neary

Hi,

Additional information:

After removing and re-running engine-setup, I try once again to run 
ovirt-engine. I get the following error in the logs:

==> /var/log/ovirt-engine/console.log <==
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

Nothing else, unfortunately, and I have no idea how to debug JBoss AS to 
figure out what's going wrong.


Are there any requirements in terms of the JRE which aren't specified in 
the quick start guide?


Thanks,
Dave.

On 09/21/2012 03:57 PM, Joop wrote:

Dave Neary wrote:



It turns out, in /var/log/messages, that I have these error messages:

Sep 21 14:00:59 clare pg_ctl[5298]: FATAL:  could not create shared
memory segment: Invalid argument
Sep 21 14:00:59 clare pg_ctl[5298]: DETAIL:  Failed system call was
shmget(key=5432001, size=36519936, 03600).
Sep 21 14:00:59 clare pg_ctl[5298]: HINT:  This error usually means
that PostgreSQL's request for a shared memory segment exceeded your
kernel's SHMMAX parameter.  You can either reduce the request size or
reconfigure the kernel with larger SHMMAX.  To reduce the request
size (currently 36519936 bytes), reduce PostgreSQL's shared memory
usage, perhaps by reducing shared_buffers or max_connections.
Sep 21 14:00:59 clare pg_ctl[5298]: If the request size is already
small, it's possible that it is less than your kernel's SHMMIN
parameter, in which case raising the request size or reconfiguring
SHMMIN is called for.
Sep 21 14:00:59 clare pg_ctl[5298]: The PostgreSQL documentation
contains more information about shared memory configuration.
Sep 21 14:01:03 clare pg_ctl[5298]: pg_ctl: could not start server
Sep 21 14:01:03 clare pg_ctl[5298]: Examine the log output.
Sep 21 14:01:03 clare systemd[1]: postgresql.service: control process
exited, code=exited status=1
Sep 21 14:01:03 clare systemd[1]: Unit postgresql.service entered
failed state.


I increased the kernel's SHMMAX, and engine-cleanup worked correctly.
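
For anyone hitting the same wall: the failed shmget above only asked for 
about 36 MB, so SHMMAX merely needs to exceed that. A sketch of the 
persistent fix (the value below is illustrative, not a recommendation):

```ini
# /etc/sysctl.conf (or a file under /etc/sysctl.d/)
# Raise the maximum shared memory segment size; anything comfortably
# above PostgreSQL's request (36519936 bytes here) works.
kernel.shmmax = 68719476736
```

Apply it without rebooting via `sysctl -p`, or as a one-off with 
`sysctl -w kernel.shmmax=<value>`.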

Has anyone else experienced this issue?

Yes, not related to oVirt but on a database server also running
Postgres. It seems that either the package maintainer is very
conservative or postgres itself is. Standard on the Debian 6 server was
also very low shmmax.
What is the OS you run ovirt-engine on?




When I re-ran engine-setup, I also got stuck when reconfiguring NFS.
When engine-setup asked me if I wanted to configure the NFS domain, I
said "yes", but then it refused to accept my input of "/mnt/iso" since
it was already in /etc/exports. Perhaps engine-cleanup should also
remove ISO shares managed by ovirt-engine, or else handle it more
gracefully when someone enters an existing export? The only fix I
found was to interrupt and restart engine-setup.


Just switch to a different terminal and edit /etc/exports and continue
engine-setup.
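
For context, the leftover entry looks something like the line below (the 
export options are a guess; whatever engine-setup actually wrote for the ISO 
domain is what needs deleting before re-entering /mnt/iso at the prompt):

```ini
# /etc/exports -- hypothetical ISO-domain entry left over from a
# previous engine-setup run; remove this line, then continue setup
/mnt/iso    *(rw,sync,no_subtree_check)
```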


Also, I have no idea whether allowing oVirt to manage iptables will
keep any extra rules I have added (specifically for DNS services on
port 53 UDP) which I added to the iptables config. I didn't take the
risk of allowing it to reconfigure iptables the second time.

After all that, I got an error when starting the JBoss service:


Starting JBoss Service... [ ERROR ]
Error: Can't start the ovirt-engine service
Please check log file
/var/log/ovirt-engine/engine-setup_2012_09_21_14_28_11.log for more
information


And when I checked that log file:

2012-09-21 14:30:02::DEBUG::common_utils::790::root:: starting
ovirt-engine
2012-09-21 14:30:02::DEBUG::common_utils::835::root:: executing
action ovirt-engine on service start
2012-09-21 14:30:02::DEBUG::common_utils::309::root:: Executing
command --> '/sbin/service ovirt-engine start'
2012-09-21 14:30:02::DEBUG::common_utils::335::root:: output =
2012-09-21 14:30:02::DEBUG::common_utils::336::root:: stderr =
Redirecting to /bin/systemctl start  ovirt-engine.service
Job failed. See system journal and 'systemctl status' for details.

2012-09-21 14:30:02::DEBUG::common_utils::337::root:: retcode = 1
2012-09-21 14:30:02::DEBUG::setup_sequences::62::root:: Traceback
(most recent call last):
  File "/usr/share/ovirt-engine/scripts/setup_sequences.py", line 60,
in run
function()
  File "/bin/engine-setup", line 1535, in _startJboss
srv.start(True)
  File "/usr/share/ovirt-engine/scripts/common_utils.py", line 795,
in start
raise Exception(output_messages.ERR_FAILED_START_SERVICE %
self.name)
Exception: Error: Can't start the ovirt-engine service


And when I check the system journal, we're back to the earlier symptom:
the service starts, but the PID mentioned in the PID file does not exist.

Any pointers into how I might debug this issue? I haven't found
anything similar in a troubleshooting page, so perhaps it's not a
common error?

Cheers,
Dave.






Are you following the setup instructions from the wiki?
I have done that a couple of times now and haven't had problems so far.
Had lots of problems with the pre-3.1 releases, though.

Joop



--
Dave Neary
Com

Re: [Users] Is there a way to force remove a host?

2012-09-21 Thread Douglas Landgraf

Hi Dominic,

On 09/20/2012 12:11 PM, Dominic Kaiser wrote:

Sorry I did not explain.

I had tried to remove the host and had no luck troubleshooting it.  I 
then removed it and repurposed it as a storage unit, reinstalling Fedora 
17.  I foolishly thought that I could just remove the host manually. 
 It physically is not there. (My fault, I know.)  Is there a way that 
you know of to remove a host by brute force?


dk


Feel free to try the script below (not part of the official project) for 
brute-force removal:


(from the engine side)
# yum install python-psycopg2 -y
# wget 
https://raw.github.com/dougsland/misc-rhev/master/engine_force_remove_Host.py

# (edit the file and change the db password)
# python ./engine_force_remove_Host.py
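For the curious, here is the rough shape of what such a brute-force removal does: delete the host's rows from the engine database, child tables first. The table names below (vds_statistics, vds_dynamic, vds_static) are my recollection of the 3.1 schema, not taken from the linked script — verify them with \dt against your own 'engine' database before running anything. This sketch only builds the SQL and parameters; it executes nothing:

```python
def host_removal_sql(host_name):
    """Build parameterized DELETEs for a brute-force host removal.

    Table names are assumptions based on the oVirt 3.1 engine schema;
    verify them against your own 'engine' database first.
    """
    # Child tables first so foreign-key constraints don't block the delete.
    tables = ["vds_statistics", "vds_dynamic", "vds_static"]
    sub = "SELECT vds_id FROM vds_static WHERE vds_name = %(name)s"
    stmts = [f"DELETE FROM {t} WHERE vds_id IN ({sub});" for t in tables]
    return stmts, {"name": host_name}
```

To actually run it you would feed each statement to a psycopg2 cursor with the params dict, inside a single transaction, and commit only once all three succeed.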

Thanks

--
Cheers
Douglas

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Problem with creating a glusterfs volume

2012-09-21 Thread Dominic Kaiser
I noticed something.  If I try to mount the gluster share from
another computer and do not include mountproto=tcp, it times out.  vers=3 or
4 does not matter.  Could this be why I cannot add it from the engine GUI?

dk

On Fri, Sep 21, 2012 at 11:12 AM, Dominic Kaiser  wrote:

> Here is the engine.log info:
>
> [root@ovirt ovirt-engine]# tail engine.log
> 2012-09-21 11:10:00,007 INFO
>  [org.ovirt.engine.core.bll.AutoRecoveryManager]
> (QuartzScheduler_Worker-49) Autorecovering 0 hosts
> 2012-09-21 11:10:00,007 INFO
>  [org.ovirt.engine.core.bll.AutoRecoveryManager]
> (QuartzScheduler_Worker-49) Checking autorecoverable hosts done
> 2012-09-21 11:10:00,008 INFO
>  [org.ovirt.engine.core.bll.AutoRecoveryManager]
> (QuartzScheduler_Worker-49) Checking autorecoverable storage domains
> 2012-09-21 11:10:00,009 INFO
>  [org.ovirt.engine.core.bll.AutoRecoveryManager]
> (QuartzScheduler_Worker-49) Autorecovering 0 storage domains
> 2012-09-21 11:10:00,010 INFO
>  [org.ovirt.engine.core.bll.AutoRecoveryManager]
> (QuartzScheduler_Worker-49) Checking autorecoverable storage domains done
> 2012-09-21 11:10:22,710 ERROR
> [org.ovirt.engine.core.engineencryptutils.EncryptionUtils]
> (QuartzScheduler_Worker-84) Failed to decryptData must not be longer than
> 256 bytes
> 2012-09-21 11:10:22,726 ERROR
> [org.ovirt.engine.core.engineencryptutils.EncryptionUtils]
> (QuartzScheduler_Worker-12) Failed to decryptData must start with zero
> 2012-09-21 11:10:54,519 INFO
>  [org.ovirt.engine.core.bll.storage.RemoveStorageServerConnectionCommand]
> (ajp--0.0.0.0-8009-11) [3769be9c] Running command:
> RemoveStorageServerConnectionCommand internal: false. Entities affected :
>  ID: aaa0----123456789aaa Type: System
> 2012-09-21 11:10:54,537 INFO
>  [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStorageServerVDSCommand]
> (ajp--0.0.0.0-8009-11) [3769be9c] START,
> DisconnectStorageServerVDSCommand(vdsId =
> 3822e6c0-0295-11e2-86e6-d74ad5358c03, storagePoolId =
> ----, storageType = NFS, connectionList =
> [{ id: null, connection: gfs1.bostonvineyard.org:/data };]), log id:
> 16dd4a1b
> 2012-09-21 11:10:56,417 INFO
>  [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStorageServerVDSCommand]
> (ajp--0.0.0.0-8009-11) [3769be9c] FINISH,
> DisconnectStorageServerVDSCommand, return:
> {----=477}, log id: 16dd4a1b
>
> Thanks,
>
> dk
>
> On Fri, Sep 21, 2012 at 11:09 AM, Dominic Kaiser <
> domi...@bostonvineyard.org> wrote:
>
>> I can mount to another computer with this command:
>>
>> mount -o mountproto=tcp,vers=3 -t nfs gfs1.bostonvineyard.org:/data
>> /home/administrator/test
>>
>> So volumes work but I get a 500 error timeout when trying to add as a
>> storage domain in ovirt.  weird?
>>
>> dk
>>
>> On Fri, Sep 21, 2012 at 10:44 AM, Dominic Kaiser <
>> domi...@bostonvineyard.org> wrote:
>>
>>> Hey All,
>>>
>>> So I finally found the problem.  Cheap NIC's.  Installed Intel NIC's no
>>> problems creating gluster volumes and distributed replicated ones.
>>>  Broadcom and Realtek yuk!  So now I am trying to mount the gluster volume
>>> as a nfs mount and am having a problem.  It is timing out like it is
>>> blocked by a firewall.
>>>
>>> I am trying to:  mount -t nfs gfs1.bostonvineyard.org:/export
>>> /home/administrator/test
>>>
>>> Here is gfs1 tail vdsm.log
>>>
>>> [root@gfs1 vdsm]# tail vdsm.log
>>> Thread-88731::DEBUG::2012-09-21
>>> 10:35:56,566::resourceManager::844::ResourceManager.Owner::(cancelAll)
>>> Owner.cancelAll requests {}
>>> Thread-88731::DEBUG::2012-09-21
>>> 10:35:56,567::task::978::TaskManager.Task::(_decref)
>>> Task=`01b69eed-de59-4e87-8b28-5268b5dcbb50`::ref 0 aborting False
>>> Thread-88737::DEBUG::2012-09-21
>>> 10:36:06,890::task::588::TaskManager.Task::(_updateState)
>>> Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::moving from state init ->
>>> state preparing
>>> Thread-88737::INFO::2012-09-21
>>> 10:36:06,891::logUtils::37::dispatcher::(wrapper) Run and protect:
>>> repoStats(options=None)
>>> Thread-88737::INFO::2012-09-21
>>> 10:36:06,891::logUtils::39::dispatcher::(wrapper) Run and protect:
>>> repoStats, Return response: {}
>>> Thread-88737::DEBUG::2012-09-21
>>> 10:36:06,891::task::1172::TaskManager.Task::(prepare)
>>> Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::finished: {}
>>> Thread-88737::DEBUG::2012-09-21
>>> 10:36:06,892::task::588::TaskManager.Task::(_updateState)
>>> Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::moving from state preparing ->
>>> state finished
>>> Thread-88737::DEBUG::2012-09-21
>>> 10:36:06,892::resourceManager::809::ResourceManager.Owner::(releaseAll)
>>> Owner.releaseAll requests {} resources {}
>>> Thread-88737::DEBUG::2012-09-21
>>> 10:36:06,892::resourceManager::844::ResourceManager.Owner::(cancelAll)
>>> Owner.cancelAll requests {}
>>> Thread-88737::DEBUG::2012-09-21
>>> 10:36:06,893::task::978::TaskManager.Task::(_decref)
>>> Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::ref 0 a

Re: [Users] Is there a way to force remove a host?

2012-09-21 Thread Dominic Kaiser
No, there is an active host in the cluster.  It still will not allow removal
of the non-existent host.

dk

On Fri, Sep 21, 2012 at 11:45 AM, Andrew Cathrow wrote:

>
>
> - Original Message -
> > From: "Itamar Heim" 
> > To: "Dominic Kaiser" 
> > Cc: users@ovirt.org
> > Sent: Thursday, September 20, 2012 2:38:23 PM
> > Subject: Re: [Users] Is there a way to force remove a host?
> >
> > On 09/20/2012 07:11 PM, Dominic Kaiser wrote:
> > > Sorry I did not explain.
> > >
> > > I had tried to remove the host and had not luck troubleshooting it.
> > >  I
> > > then had removed it and used it for a storage unit reinstalling
> > > fedora
> > > 17.  I foolishly thought that I could just remove the host
> > > manually.  It
> > > physically is not there. (My fault I know)  Is there a way that you
> > > know
> > > of to remove a host brute force.
> >
> > why can't you just move it to maint and delete it?
> > (you can right click and 'confirm host shutdown manually' to release
> > any
> > resources supposedly held by it)
>
> Is it the only host in the DC? If that's the case, you can't put it into
> maintenance mode.
>
> >
> > >
> > > dk
> > >
> > > On Thu, Sep 20, 2012 at 12:00 PM, Eli Mesika  > > > wrote:
> > >
> > >
> > >
> > > - Original Message -
> > >  > From: "Dominic Kaiser"  > > >
> > >  > To: users@ovirt.org 
> > >  > Sent: Thursday, September 20, 2012 6:44:58 PM
> > >  > Subject: [Users] Is there a way to force remove a host?
> > >  >
> > >  >
> > >  > I could not remove old host even if others where up. Can I
> > >  > force
> > >  > remove I do not need it anymore.
> > >
> > > Dominic, please attach engine/vdsm logs so we will be able to
> > > see
> > > why the Host is not removed.
> > > Thanks
> > >  >
> > >  >
> > >  > --
> > >  > Dominic Kaiser
> > >  > Greater Boston Vineyard
> > >  > Director of Operations
> > >  >
> > >  > cell: 617-230-1412 
> > >  > fax: 617-252-0238 
> > >  > email: domi...@bostonvineyard.org
> > >  > 
> > >  >
> > >  >
> > >  >
> > >  > ___
> > >  > Users mailing list
> > >  > Users@ovirt.org 
> > >  > http://lists.ovirt.org/mailman/listinfo/users
> > >  >
> > >
> > >
> > >
> > >
> > > --
> > > Dominic Kaiser
> > > Greater Boston Vineyard
> > > Director of Operations
> > >
> > > cell: 617-230-1412
> > > fax: 617-252-0238
> > > email: domi...@bostonvineyard.org
> > > 
> > >
> > >
> > >
> > >
> > > ___
> > > Users mailing list
> > > Users@ovirt.org
> > > http://lists.ovirt.org/mailman/listinfo/users
> > >
> >
> >
> > ___
> > Users mailing list
> > Users@ovirt.org
> > http://lists.ovirt.org/mailman/listinfo/users
> >
>



-- 
Dominic Kaiser
Greater Boston Vineyard
Director of Operations

cell: 617-230-1412
fax: 617-252-0238
email: domi...@bostonvineyard.org
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Is there a way to force remove a host?

2012-09-21 Thread Andrew Cathrow


- Original Message -
> From: "Itamar Heim" 
> To: "Dominic Kaiser" 
> Cc: users@ovirt.org
> Sent: Thursday, September 20, 2012 2:38:23 PM
> Subject: Re: [Users] Is there a way to force remove a host?
> 
> On 09/20/2012 07:11 PM, Dominic Kaiser wrote:
> > Sorry I did not explain.
> >
> > I had tried to remove the host and had not luck troubleshooting it.
> >  I
> > then had removed it and used it for a storage unit reinstalling
> > fedora
> > 17.  I foolishly thought that I could just remove the host
> > manually.  It
> > physically is not there. (My fault I know)  Is there a way that you
> > know
> > of to remove a host brute force.
> 
> why can't you just move it to maint and delete it?
> (you can right click and 'confirm host shutdown manually' to release
> any
> resources supposedly held by it)

Is it the only host in the DC? If that's the case, you can't put it into 
maintenance mode.

> 
> >
> > dk
> >
> > On Thu, Sep 20, 2012 at 12:00 PM, Eli Mesika  > > wrote:
> >
> >
> >
> > - Original Message -
> >  > From: "Dominic Kaiser"  > >
> >  > To: users@ovirt.org 
> >  > Sent: Thursday, September 20, 2012 6:44:58 PM
> >  > Subject: [Users] Is there a way to force remove a host?
> >  >
> >  >
> >  > I could not remove old host even if others where up. Can I
> >  > force
> >  > remove I do not need it anymore.
> >
> > Dominic, please attach engine/vdsm logs so we will be able to
> > see
> > why the Host is not removed.
> > Thanks
> >  >
> >  >
> >  > --
> >  > Dominic Kaiser
> >  > Greater Boston Vineyard
> >  > Director of Operations
> >  >
> >  > cell: 617-230-1412 
> >  > fax: 617-252-0238 
> >  > email: domi...@bostonvineyard.org
> >  > 
> >  >
> >  >
> >  >
> >  > ___
> >  > Users mailing list
> >  > Users@ovirt.org 
> >  > http://lists.ovirt.org/mailman/listinfo/users
> >  >
> >
> >
> >
> >
> > --
> > Dominic Kaiser
> > Greater Boston Vineyard
> > Director of Operations
> >
> > cell: 617-230-1412
> > fax: 617-252-0238
> > email: domi...@bostonvineyard.org
> > 
> >
> >
> >
> >
> > ___
> > Users mailing list
> > Users@ovirt.org
> > http://lists.ovirt.org/mailman/listinfo/users
> >
> 
> 
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
> 
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Installation problem

2012-09-21 Thread Steve Gordon
- Original Message -
> From: "Joop" 
> To: "Dave Neary" 
> Cc: "users" 
> Sent: Friday, September 21, 2012 9:57:02 AM
> Subject: Re: [Users] Installation problem
> 
> Dave Neary wrote:
> >
> >
> > It turns out, in /var/log/messages, that I have these error
> > messages:
> >> Sep 21 14:00:59 clare pg_ctl[5298]: FATAL:  could not create
> >> shared
> >> memory segment: Invalid argument
> >> Sep 21 14:00:59 clare pg_ctl[5298]: DETAIL:  Failed system call
> >> was
> >> shmget(key=5432001, size=36519936, 03600).
> >> Sep 21 14:00:59 clare pg_ctl[5298]: HINT:  This error usually
> >> means
> >> that PostgreSQL's request for a shared memory segment exceeded
> >> your
> >> kernel's SHMMAX parameter.  You can either reduce the request size
> >> or
> >> reconfigure the kernel with larger SHMMAX.  To reduce the request
> >> size (currently 36519936 bytes), reduce PostgreSQL's shared memory
> >> usage, perhaps by reducing shared_buffers or max_connections.
> >> Sep 21 14:00:59 clare pg_ctl[5298]: If the request size is already
> >> small, it's possible that it is less than your kernel's SHMMIN
> >> parameter, in which case raising the request size or reconfiguring
> >> SHMMIN is called for.
> >> Sep 21 14:00:59 clare pg_ctl[5298]: The PostgreSQL documentation
> >> contains more information about shared memory configuration.
> >> Sep 21 14:01:03 clare pg_ctl[5298]: pg_ctl: could not start server
> >> Sep 21 14:01:03 clare pg_ctl[5298]: Examine the log output.
> >> Sep 21 14:01:03 clare systemd[1]: postgresql.service: control
> >> process
> >> exited, code=exited status=1
> >> Sep 21 14:01:03 clare systemd[1]: Unit postgresql.service entered
> >> failed state.
> >
> > I increased the kernel's SHMMAX, and engine-cleanup worked
> > correctly.
> >
> > Has anyone else experienced this issue?
> Yes, though not related to oVirt: I saw this on a database server also
> running Postgres. It seems that either the package maintainer or Postgres
> itself is very conservative. The default on the Debian 6 server was also
> a very low shmmax.
> What OS do you run ovirt-engine on?

I'm going to take a stab and guess Fedora. This came up for an unrelated reason 
in #fedora-devel the other day. Because Fedora (and I suspect Debian as well) 
has a policy of sticking as close to upstream as possible, it uses the shmmax of 
the upstream kernel, which is, as you note, quite low. In RHEL and other EL6 
derivatives this value is modified and set much higher.
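The failing call in Dave's log was shmget(key=5432001, size=36519936, ...), which the kernel rejects whenever the requested segment exceeds kernel.shmmax. A small sketch of the check — the 36519936 figure is taken from the DETAIL line above, and the /proc read is Linux-only:

```python
def shmmax_ok(requested, shmmax):
    """shmget() succeeds only if the requested segment fits under SHMMAX."""
    return requested <= shmmax

REQUESTED = 36519936  # bytes, from PostgreSQL's DETAIL line above

try:
    with open("/proc/sys/kernel/shmmax") as f:  # readable without root on Linux
        current = int(f.read())
    verdict = "ok" if shmmax_ok(REQUESTED, current) else "too small"
    print(f"kernel.shmmax = {current}: {verdict} for a {REQUESTED}-byte request")
except OSError:
    pass  # not on Linux, or /proc unavailable
```

To raise the limit, run `sysctl -w kernel.shmmax=<value>` as root and put the same `kernel.shmmax = <value>` line in /etc/sysctl.conf so it survives reboots; pick a value comfortably above PostgreSQL's request.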

Steve
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Installation problem

2012-09-21 Thread Dave Neary

Hi,

On 09/21/2012 03:57 PM, Joop wrote:

Are you following the setup instructions from the Wiki?
I have done that a couple of times now and haven't had problems so far.
Had lots of problems with the pre-3.1 releases, though.


I am installing RPMs on Fedora 17 through the 
http://ovirt.org/releases/stable/rpm/Fedora/$releasever/ repository.


I've been following the quick start guide: 
http://wiki.ovirt.org/wiki/Quick_Start_Guide and have been noting my 
various difficulties for a "Troubleshooting" page.


Is there a better document to follow?

Cheers,
Dave.

--
Dave Neary
Community Action and Impact
Open Source and Standards, Red Hat
Ph: +33 9 50 71 55 62 / Cell: +33 6 77 01 92 13
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Problem with creating a glusterfs volume

2012-09-21 Thread Dominic Kaiser
Here is the engine.log info:

[root@ovirt ovirt-engine]# tail engine.log
2012-09-21 11:10:00,007 INFO
 [org.ovirt.engine.core.bll.AutoRecoveryManager]
(QuartzScheduler_Worker-49) Autorecovering 0 hosts
2012-09-21 11:10:00,007 INFO
 [org.ovirt.engine.core.bll.AutoRecoveryManager]
(QuartzScheduler_Worker-49) Checking autorecoverable hosts done
2012-09-21 11:10:00,008 INFO
 [org.ovirt.engine.core.bll.AutoRecoveryManager]
(QuartzScheduler_Worker-49) Checking autorecoverable storage domains
2012-09-21 11:10:00,009 INFO
 [org.ovirt.engine.core.bll.AutoRecoveryManager]
(QuartzScheduler_Worker-49) Autorecovering 0 storage domains
2012-09-21 11:10:00,010 INFO
 [org.ovirt.engine.core.bll.AutoRecoveryManager]
(QuartzScheduler_Worker-49) Checking autorecoverable storage domains done
2012-09-21 11:10:22,710 ERROR
[org.ovirt.engine.core.engineencryptutils.EncryptionUtils]
(QuartzScheduler_Worker-84) Failed to decryptData must not be longer than
256 bytes
2012-09-21 11:10:22,726 ERROR
[org.ovirt.engine.core.engineencryptutils.EncryptionUtils]
(QuartzScheduler_Worker-12) Failed to decryptData must start with zero
2012-09-21 11:10:54,519 INFO
 [org.ovirt.engine.core.bll.storage.RemoveStorageServerConnectionCommand]
(ajp--0.0.0.0-8009-11) [3769be9c] Running command:
RemoveStorageServerConnectionCommand internal: false. Entities affected :
 ID: aaa0----123456789aaa Type: System
2012-09-21 11:10:54,537 INFO
 [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStorageServerVDSCommand]
(ajp--0.0.0.0-8009-11) [3769be9c] START,
DisconnectStorageServerVDSCommand(vdsId =
3822e6c0-0295-11e2-86e6-d74ad5358c03, storagePoolId =
----, storageType = NFS, connectionList =
[{ id: null, connection: gfs1.bostonvineyard.org:/data };]), log id:
16dd4a1b
2012-09-21 11:10:56,417 INFO
 [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStorageServerVDSCommand]
(ajp--0.0.0.0-8009-11) [3769be9c] FINISH,
DisconnectStorageServerVDSCommand, return:
{----=477}, log id: 16dd4a1b

Thanks,

dk

On Fri, Sep 21, 2012 at 11:09 AM, Dominic Kaiser  wrote:

> I can mount to another computer with this command:
>
> mount -o mountproto=tcp,vers=3 -t nfs gfs1.bostonvineyard.org:/data
> /home/administrator/test
>
> So volumes work but I get a 500 error timeout when trying to add as a
> storage domain in ovirt.  weird?
>
> dk
>
> On Fri, Sep 21, 2012 at 10:44 AM, Dominic Kaiser <
> domi...@bostonvineyard.org> wrote:
>
>> Hey All,
>>
>> So I finally found the problem.  Cheap NIC's.  Installed Intel NIC's no
>> problems creating gluster volumes and distributed replicated ones.
>>  Broadcom and Realtek yuk!  So now I am trying to mount the gluster volume
>> as a nfs mount and am having a problem.  It is timing out like it is
>> blocked by a firewall.
>>
>> I am trying to:  mount -t nfs gfs1.bostonvineyard.org:/export
>> /home/administrator/test
>>
>> Here is gfs1 tail vdsm.log
>>
>> [root@gfs1 vdsm]# tail vdsm.log
>> Thread-88731::DEBUG::2012-09-21
>> 10:35:56,566::resourceManager::844::ResourceManager.Owner::(cancelAll)
>> Owner.cancelAll requests {}
>> Thread-88731::DEBUG::2012-09-21
>> 10:35:56,567::task::978::TaskManager.Task::(_decref)
>> Task=`01b69eed-de59-4e87-8b28-5268b5dcbb50`::ref 0 aborting False
>> Thread-88737::DEBUG::2012-09-21
>> 10:36:06,890::task::588::TaskManager.Task::(_updateState)
>> Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::moving from state init ->
>> state preparing
>> Thread-88737::INFO::2012-09-21
>> 10:36:06,891::logUtils::37::dispatcher::(wrapper) Run and protect:
>> repoStats(options=None)
>> Thread-88737::INFO::2012-09-21
>> 10:36:06,891::logUtils::39::dispatcher::(wrapper) Run and protect:
>> repoStats, Return response: {}
>> Thread-88737::DEBUG::2012-09-21
>> 10:36:06,891::task::1172::TaskManager.Task::(prepare)
>> Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::finished: {}
>> Thread-88737::DEBUG::2012-09-21
>> 10:36:06,892::task::588::TaskManager.Task::(_updateState)
>> Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::moving from state preparing ->
>> state finished
>> Thread-88737::DEBUG::2012-09-21
>> 10:36:06,892::resourceManager::809::ResourceManager.Owner::(releaseAll)
>> Owner.releaseAll requests {} resources {}
>> Thread-88737::DEBUG::2012-09-21
>> 10:36:06,892::resourceManager::844::ResourceManager.Owner::(cancelAll)
>> Owner.cancelAll requests {}
>> Thread-88737::DEBUG::2012-09-21
>> 10:36:06,893::task::978::TaskManager.Task::(_decref)
>> Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::ref 0 aborting False
>>
>> Do you know why I can not connect via NFS?  Using an older kernel not 3.5
>> and iptables are off.
>>
>> Dominic
>>
>>
>> On Mon, Sep 10, 2012 at 12:20 PM, Haim Ateya  wrote:
>>
>>> On 09/10/2012 06:27 PM, Dominic Kaiser wrote:
>>>
 Here is the message and the logs again except zipped I failed the first
 delivery:

 Ok here are the logs 4 node and 1 engine log.  Tried making /data
 folder owned by root and then tried by 36:3

Re: [Users] Problem with creating a glusterfs volume

2012-09-21 Thread Dominic Kaiser
I can mount to another computer with this command:

mount -o mountproto=tcp,vers=3 -t nfs gfs1.bostonvineyard.org:/data
/home/administrator/test

So volumes work, but I get a 500 error timeout when trying to add one as a
storage domain in oVirt.  Weird?

dk

On Fri, Sep 21, 2012 at 10:44 AM, Dominic Kaiser  wrote:

> Hey All,
>
> So I finally found the problem.  Cheap NIC's.  Installed Intel NIC's no
> problems creating gluster volumes and distributed replicated ones.
>  Broadcom and Realtek yuk!  So now I am trying to mount the gluster volume
> as a nfs mount and am having a problem.  It is timing out like it is
> blocked by a firewall.
>
> I am trying to:  mount -t nfs gfs1.bostonvineyard.org:/export
> /home/administrator/test
>
> Here is gfs1 tail vdsm.log
>
> [root@gfs1 vdsm]# tail vdsm.log
> Thread-88731::DEBUG::2012-09-21
> 10:35:56,566::resourceManager::844::ResourceManager.Owner::(cancelAll)
> Owner.cancelAll requests {}
> Thread-88731::DEBUG::2012-09-21
> 10:35:56,567::task::978::TaskManager.Task::(_decref)
> Task=`01b69eed-de59-4e87-8b28-5268b5dcbb50`::ref 0 aborting False
> Thread-88737::DEBUG::2012-09-21
> 10:36:06,890::task::588::TaskManager.Task::(_updateState)
> Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::moving from state init ->
> state preparing
> Thread-88737::INFO::2012-09-21
> 10:36:06,891::logUtils::37::dispatcher::(wrapper) Run and protect:
> repoStats(options=None)
> Thread-88737::INFO::2012-09-21
> 10:36:06,891::logUtils::39::dispatcher::(wrapper) Run and protect:
> repoStats, Return response: {}
> Thread-88737::DEBUG::2012-09-21
> 10:36:06,891::task::1172::TaskManager.Task::(prepare)
> Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::finished: {}
> Thread-88737::DEBUG::2012-09-21
> 10:36:06,892::task::588::TaskManager.Task::(_updateState)
> Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::moving from state preparing ->
> state finished
> Thread-88737::DEBUG::2012-09-21
> 10:36:06,892::resourceManager::809::ResourceManager.Owner::(releaseAll)
> Owner.releaseAll requests {} resources {}
> Thread-88737::DEBUG::2012-09-21
> 10:36:06,892::resourceManager::844::ResourceManager.Owner::(cancelAll)
> Owner.cancelAll requests {}
> Thread-88737::DEBUG::2012-09-21
> 10:36:06,893::task::978::TaskManager.Task::(_decref)
> Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::ref 0 aborting False
>
> Do you know why I can not connect via NFS?  Using an older kernel not 3.5
> and iptables are off.
>
> Dominic
>
>
> On Mon, Sep 10, 2012 at 12:20 PM, Haim Ateya  wrote:
>
>> On 09/10/2012 06:27 PM, Dominic Kaiser wrote:
>>
>>> Here is the message and the logs again except zipped I failed the first
>>> delivery:
>>>
>>> Ok here are the logs 4 node and 1 engine log.  Tried making /data folder
>>> owned by root and then tried by 36:36 neither worked.  Name of volume is
>>> data to match folders on nodes also.
>>>
>>> Let me know what you think,
>>>
>>> Dominic
>>>
>>
>> this is the actual failure (taken from gfs2vdsm.log).
>>
>> Thread-332442::DEBUG::2012-09-10 
>> 10:28:05,788::BindingXMLRPC::859::vds::(wrapper)
>> client [10.3.0.241]::call volumeCreate with ('data', ['10.4.0.97:/data',
>> '10.4.0.98:/data', '10.4.0.99:/data', '10.4.0.100:/data'],
>>  2, 0, ['TCP']) {} flowID [406f2c8e]
>> MainProcess|Thread-332442::DEBUG::2012-09-10
>> 10:28:05,792::__init__::1249::Storage.Misc.excCmd::(_log)
>> '/usr/sbin/gluster --mode=script volume create data replica 2 transport TCP
>> 10.4.0.97:/data 10.4.0.98:/data 10
>> .4.0.99:/data 10.4.0.100:/data' (cwd None)
>> MainProcess|Thread-332442::DEBUG::2012-09-10
>> 10:28:05,900::__init__::1249::Storage.Misc.excCmd::(_log) FAILED:
>> <err> = 'Host 10.4.0.99 not a friend\n'; <rc> = 255
>> MainProcess|Thread-332442::ERROR::2012-09-10
>> 10:28:05,900::supervdsmServer::76::SuperVdsm.ServerCallback::(wrapper)
>> Error in wrapper
>> Traceback (most recent call last):
>>   File "/usr/share/vdsm/supervdsmServer.py", line 74, in wrapper
>> return func(*args, **kwargs)
>>   File "/usr/share/vdsm/supervdsmServer.py", line 286, in wrapper
>> return func(*args, **kwargs)
>>   File "/usr/share/vdsm/gluster/cli.py", line 46, in wrapper
>> return func(*args, **kwargs)
>>   File "/usr/share/vdsm/gluster/cli.py", line 176, in volumeCreate
>> raise ge.GlusterVolumeCreateFailedException(rc, out, err)
>> GlusterVolumeCreateFailedException: Volume create failed
>> error: Host 10.4.0.99 not a friend
>> return code: 255
>> Thread-332442::ERROR::2012-09-10 
>> 10:28:05,901::BindingXMLRPC::877::vds::(wrapper)
>> unexpected error
>> Traceback (most recent call last):
>>   File "/usr/share/vdsm/BindingXMLRPC.py", line 864, in wrapper
>> res = f(*args, **kwargs)
>>   File "/usr/share/vdsm/gluster/api.py", line 32, in wrapper
>> rv = func(*args, **kwargs)
>>   File "/usr/share/vdsm/gluster/api.py", line 87, in volumeCreate
>> transportList)
>>   File "/usr/share/vdsm/supervdsm.py", line 67, in __call__
>> return callMethod()
>>   File "/usr/share/v

Re: [Users] Problem with creating a glusterfs volume

2012-09-21 Thread Dominic Kaiser
Hey All,

So I finally found the problem: cheap NICs.  After installing Intel NICs, I had
no problems creating gluster volumes, including distributed replicated ones.
Broadcom and Realtek, yuck!  So now I am trying to mount the gluster volume
as an NFS mount and am having a problem.  It is timing out as if it were
blocked by a firewall.

I am trying to:  mount -t nfs gfs1.bostonvineyard.org:/export
/home/administrator/test

Here is gfs1 tail vdsm.log

[root@gfs1 vdsm]# tail vdsm.log
Thread-88731::DEBUG::2012-09-21
10:35:56,566::resourceManager::844::ResourceManager.Owner::(cancelAll)
Owner.cancelAll requests {}
Thread-88731::DEBUG::2012-09-21
10:35:56,567::task::978::TaskManager.Task::(_decref)
Task=`01b69eed-de59-4e87-8b28-5268b5dcbb50`::ref 0 aborting False
Thread-88737::DEBUG::2012-09-21
10:36:06,890::task::588::TaskManager.Task::(_updateState)
Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::moving from state init ->
state preparing
Thread-88737::INFO::2012-09-21
10:36:06,891::logUtils::37::dispatcher::(wrapper) Run and protect:
repoStats(options=None)
Thread-88737::INFO::2012-09-21
10:36:06,891::logUtils::39::dispatcher::(wrapper) Run and protect:
repoStats, Return response: {}
Thread-88737::DEBUG::2012-09-21
10:36:06,891::task::1172::TaskManager.Task::(prepare)
Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::finished: {}
Thread-88737::DEBUG::2012-09-21
10:36:06,892::task::588::TaskManager.Task::(_updateState)
Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::moving from state preparing ->
state finished
Thread-88737::DEBUG::2012-09-21
10:36:06,892::resourceManager::809::ResourceManager.Owner::(releaseAll)
Owner.releaseAll requests {} resources {}
Thread-88737::DEBUG::2012-09-21
10:36:06,892::resourceManager::844::ResourceManager.Owner::(cancelAll)
Owner.cancelAll requests {}
Thread-88737::DEBUG::2012-09-21
10:36:06,893::task::978::TaskManager.Task::(_decref)
Task=`f70222ad-f8b4-4733-9526-eff1d214ebd8`::ref 0 aborting False

Do you know why I cannot connect via NFS?  I am using an older kernel (not 3.5),
and iptables is off.

Dominic


On Mon, Sep 10, 2012 at 12:20 PM, Haim Ateya  wrote:

> On 09/10/2012 06:27 PM, Dominic Kaiser wrote:
>
>> Here is the message and the logs again except zipped I failed the first
>> delivery:
>>
>> Ok here are the logs 4 node and 1 engine log.  Tried making /data folder
>> owned by root and then tried by 36:36 neither worked.  Name of volume is
>> data to match folders on nodes also.
>>
>> Let me know what you think,
>>
>> Dominic
>>
>
> this is the actual failure (taken from gfs2vdsm.log).
>
> Thread-332442::DEBUG::2012-09-10 
> 10:28:05,788::BindingXMLRPC::859::vds::(wrapper)
> client [10.3.0.241]::call volumeCreate with ('data', ['10.4.0.97:/data',
> '10.4.0.98:/data', '10.4.0.99:/data', '10.4.0.100:/data'],
>  2, 0, ['TCP']) {} flowID [406f2c8e]
> MainProcess|Thread-332442::DEBUG::2012-09-10
> 10:28:05,792::__init__::1249::Storage.Misc.excCmd::(_log)
> '/usr/sbin/gluster --mode=script volume create data replica 2 transport TCP
> 10.4.0.97:/data 10.4.0.98:/data 10
> .4.0.99:/data 10.4.0.100:/data' (cwd None)
> MainProcess|Thread-332442::DEBUG::2012-09-10
> 10:28:05,900::__init__::1249::Storage.Misc.excCmd::(_log) FAILED: <err>
> = 'Host 10.4.0.99 not a friend\n'; <rc> = 255
> MainProcess|Thread-332442::ERROR::2012-09-10
> 10:28:05,900::supervdsmServer::76::SuperVdsm.ServerCallback::(wrapper)
> Error in wrapper
> Traceback (most recent call last):
>   File "/usr/share/vdsm/supervdsmServer.py", line 74, in wrapper
> return func(*args, **kwargs)
>   File "/usr/share/vdsm/supervdsmServer.py", line 286, in wrapper
> return func(*args, **kwargs)
>   File "/usr/share/vdsm/gluster/cli.py", line 46, in wrapper
> return func(*args, **kwargs)
>   File "/usr/share/vdsm/gluster/cli.py", line 176, in volumeCreate
> raise ge.GlusterVolumeCreateFailedException(rc, out, err)
> GlusterVolumeCreateFailedException: Volume create failed
> error: Host 10.4.0.99 not a friend
> return code: 255
> Thread-332442::ERROR::2012-09-10 
> 10:28:05,901::BindingXMLRPC::877::vds::(wrapper)
> unexpected error
> Traceback (most recent call last):
>   File "/usr/share/vdsm/BindingXMLRPC.py", line 864, in wrapper
> res = f(*args, **kwargs)
>   File "/usr/share/vdsm/gluster/api.py", line 32, in wrapper
> rv = func(*args, **kwargs)
>   File "/usr/share/vdsm/gluster/api.py", line 87, in volumeCreate
> transportList)
>   File "/usr/share/vdsm/supervdsm.py", line 67, in __call__
> return callMethod()
>   File "/usr/share/vdsm/supervdsm.py", line 65, in <lambda>
> **kwargs)
>   File "<string>", line 2, in glusterVolumeCreate
>   File "/usr/lib64/python2.7/multiprocessing/managers.py", line 759, in
> _callmethod
> kind, result = conn.recv()
> TypeError: ('__init__() takes exactly 4 arguments (1 given)', <class 'gluster.exception.GlusterVolumeCreateFailedException'>, ())
>
> can you please run  gluster peer status on all your nodes ? also, it
> appears that '10.4.0.99' is problematic, can you
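The "Host 10.4.0.99 not a friend" failure above means that brick's host was never probed into the gluster trusted pool, which is what `gluster peer status` reveals; the usual fix is `gluster peer probe 10.4.0.99` run from an existing pool member. As an illustration only, here is a hypothetical helper (not part of vdsm or gluster) that derives the missing probe commands from a brick list:

```python
def probe_commands(bricks, known_peers, local_host):
    """Return 'gluster peer probe' commands for brick hosts not yet in the pool.

    bricks: entries like '10.4.0.99:/data'; known_peers: hosts already shown
    by 'gluster peer status'; local_host: the node you are running on (it is
    implicitly a pool member and must not be probed).
    """
    missing = []
    for brick in bricks:
        host = brick.split(":", 1)[0]
        if host != local_host and host not in known_peers and host not in missing:
            missing.append(host)
    return [f"gluster peer probe {h}" for h in missing]
```

With the four bricks from the log and 10.4.0.99 absent from peer status, this yields exactly the one probe command needed before retrying the volume create.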

Re: [Users] vdsmd doesn't restart after rebooting

2012-09-21 Thread Douglas Landgraf

Hi Nathanaël,

On 09/21/2012 10:18 AM, Nathanaël Blanchet wrote:

Hi all,

In the latest vdsm build from git 
(vdsm-4.10.0-0.452.git87594e3.fc17.x86_64), vdsmd.service never starts 
on its own after rebooting.

I have had a look at journalctl and I've found this:

systemd-vdsmd[538]: vdsm: Failed to define network filters on 
libvirt[FAILED]


[root@node ~]# service vdsmd status
Redirecting to /bin/systemctl  status vdsmd.service
vdsmd.service - Virtual Desktop Server Manager
  Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled)
  Active: failed (Result: exit-code) since Fri, 21 Sep 2012 
12:13:01 +0200; 4min 56s ago
 Process: 543 ExecStart=/lib/systemd/systemd-vdsmd start 
(code=exited, status=1/FAILURE)

  CGroup: name=systemd:/system/vdsmd.service

Sep 21 12:12:55 node.abes.fr systemd-vdsmd[543]: Note: Forwarding 
request to 'systemctl disable libvirt-guests.service'.
Sep 21 12:12:56 node.abes.fr systemd-vdsmd[543]: vdsm: libvirt already 
configured for vdsm [  OK  ]

Sep 21 12:12:56 node.abes.fr systemd-vdsmd[543]: Starting wdmd...
Sep 21 12:12:56 node.abes.fr systemd-vdsmd[543]: Starting sanlock...
Sep 21 12:12:56 node.abes.fr systemd-vdsmd[543]: Starting iscsid:
Sep 21 12:13:01 node.abes.fr systemd-vdsmd[543]: Starting libvirtd 
(via systemctl):  [  OK  ]


Could this nwfilter failure be the cause? If so, do I need to 
open a BZ?



Thanks for your report. If you can, please open a BZ.
Just for the record: I remember we fixed something very similar 
to this report with the patch below; I am going to check.


commit dccf4d0260f735265158c3d0cda6e65a0d8c9a4d
Author: Dan Kenigsberg 
Date:   Thu Sep 6 16:37:32 2012 +0300

vdsmd: set nwfilter on ovirt-node

ovirt-node is shipped with .pyc only. Do not try running a missing
executable.

Change-Id: I35e9a3526f6ec70ef40f586319259903e9e1f5fe
Signed-off-by: Dan Kenigsberg 
Reviewed-on: http://gerrit.ovirt.org/7821
Reviewed-by: Douglas Schilling Landgraf 
Tested-by: Moti Asayag 
Tested-by: Douglas Schilling Landgraf 



--
Cheers
Douglas

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[Users] vdsmd doesn't restart after rebooting

2012-09-21 Thread Nathanaël Blanchet

Hi all,

In the latest vdsm build from git 
(vdsm-4.10.0-0.452.git87594e3.fc17.x86_64), vdsmd.service never starts 
on its own after rebooting.

I have had a look at journalctl and I've found this:

systemd-vdsmd[538]: vdsm: Failed to define network filters on 
libvirt[FAILED]


[root@node ~]# service vdsmd status
Redirecting to /bin/systemctl  status vdsmd.service
vdsmd.service - Virtual Desktop Server Manager
  Loaded: loaded (/usr/lib/systemd/system/vdsmd.service; enabled)
  Active: failed (Result: exit-code) since Fri, 21 Sep 2012 
12:13:01 +0200; 4min 56s ago
 Process: 543 ExecStart=/lib/systemd/systemd-vdsmd start 
(code=exited, status=1/FAILURE)

  CGroup: name=systemd:/system/vdsmd.service

Sep 21 12:12:55 node.abes.fr systemd-vdsmd[543]: Note: Forwarding 
request to 'systemctl disable libvirt-guests.service'.
Sep 21 12:12:56 node.abes.fr systemd-vdsmd[543]: vdsm: libvirt already 
configured for vdsm [  OK  ]

Sep 21 12:12:56 node.abes.fr systemd-vdsmd[543]: Starting wdmd...
Sep 21 12:12:56 node.abes.fr systemd-vdsmd[543]: Starting sanlock...
Sep 21 12:12:56 node.abes.fr systemd-vdsmd[543]: Starting iscsid:
Sep 21 12:13:01 node.abes.fr systemd-vdsmd[543]: Starting libvirtd (via 
systemctl):  [  OK  ]


Could this nwfilter be the cause of the failure? If so, do I need to open 
a BZ?


Thank you for your answer

--
Nathanaël Blanchet

Network supervision
Operations and maintenance
Information systems department
227 avenue Professeur-Jean-Louis-Viala
34193 MONTPELLIER CEDEX 5   
Tél. 33 (0)4 67 54 84 55
Fax  33 (0)4 67 54 84 14
blanc...@abes.fr

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Installation problem

2012-09-21 Thread Joop

Dave Neary wrote:



It turns out, in /var/log/messages, that I have these error messages:
Sep 21 14:00:59 clare pg_ctl[5298]: FATAL:  could not create shared 
memory segment: Invalid argument
Sep 21 14:00:59 clare pg_ctl[5298]: DETAIL:  Failed system call was 
shmget(key=5432001, size=36519936, 03600).
Sep 21 14:00:59 clare pg_ctl[5298]: HINT:  This error usually means 
that PostgreSQL's request for a shared memory segment exceeded your 
kernel's SHMMAX parameter.  You can either reduce the request size or 
reconfigure the kernel with larger SHMMAX.  To reduce the request 
size (currently 36519936 bytes), reduce PostgreSQL's shared memory 
usage, perhaps by reducing shared_buffers or max_connections.
Sep 21 14:00:59 clare pg_ctl[5298]: If the request size is already 
small, it's possible that it is less than your kernel's SHMMIN 
parameter, in which case raising the request size or reconfiguring 
SHMMIN is called for.
Sep 21 14:00:59 clare pg_ctl[5298]: The PostgreSQL documentation 
contains more information about shared memory configuration.

Sep 21 14:01:03 clare pg_ctl[5298]: pg_ctl: could not start server
Sep 21 14:01:03 clare pg_ctl[5298]: Examine the log output.
Sep 21 14:01:03 clare systemd[1]: postgresql.service: control process 
exited, code=exited status=1
Sep 21 14:01:03 clare systemd[1]: Unit postgresql.service entered 
failed state.


I increased the kernel's SHMMAX, and engine-cleanup worked correctly.

Has anyone else experienced this issue?
Yes - not related to oVirt, but on a database server also running 
Postgres. It seems that either the package maintainer is very 
conservative or Postgres itself is. The default shmmax on a Debian 6 
server was also very low.

What is the OS you run ovirt-engine on?




When I re-ran engine-setup, I also got stuck when reconfiguring NFS - 
when engine-setup asked me if I wanted to configure the NFS domain, I 
said "yes", but then it refused to accept my input of "/mnt/iso" since 
it was already in /etc/exports - perhaps engine-cleanup should also 
remove ISO shares managed by ovirt-engine, or else handle it more 
gracefully when someone enters an existing export? The only fix I 
found was to interrupt and restart the engine set-up.


Just switch to a different terminal and edit /etc/exports and continue 
engine-setup.


Also, I have no idea whether allowing oVirt to manage iptables will 
keep any extra rules I have added (specifically for DNS services on 
port 53 UDP) which I added to the iptables config. I didn't take the 
risk of allowing it to reconfigure iptables the second time.


After all that, I got an error when starting the JBoss service:


Starting JBoss Service... [ ERROR ]
Error: Can't start the ovirt-engine service
Please check log file 
/var/log/ovirt-engine/engine-setup_2012_09_21_14_28_11.log for more 
information


And when I checked that log file:
2012-09-21 14:30:02::DEBUG::common_utils::790::root:: starting 
ovirt-engine
2012-09-21 14:30:02::DEBUG::common_utils::835::root:: executing 
action ovirt-engine on service start
2012-09-21 14:30:02::DEBUG::common_utils::309::root:: Executing 
command --> '/sbin/service ovirt-engine start'

2012-09-21 14:30:02::DEBUG::common_utils::335::root:: output =
2012-09-21 14:30:02::DEBUG::common_utils::336::root:: stderr = 
Redirecting to /bin/systemctl start  ovirt-engine.service

Job failed. See system journal and 'systemctl status' for details.

2012-09-21 14:30:02::DEBUG::common_utils::337::root:: retcode = 1
2012-09-21 14:30:02::DEBUG::setup_sequences::62::root:: Traceback 
(most recent call last):
  File "/usr/share/ovirt-engine/scripts/setup_sequences.py", line 60, 
in run

function()
  File "/bin/engine-setup", line 1535, in _startJboss
srv.start(True)
  File "/usr/share/ovirt-engine/scripts/common_utils.py", line 795, 
in start
raise Exception(output_messages.ERR_FAILED_START_SERVICE % 
self.name)

Exception: Error: Can't start the ovirt-engine service


And when I check the system journal, we're back to: the service starts, 
but the PID mentioned in the PID file does not exist.


Any pointers into how I might debug this issue? I haven't found 
anything similar in a troubleshooting page, so perhaps it's not a 
common error?


Cheers,
Dave.






Are you following the setup instructions from the Wiki?
I have done that a couple of times now and haven't had problems so far. 
I had lots of problems with the pre-3.1 releases, though.


Joop

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] API Documentation

2012-09-21 Thread Michael Pasternak

Also, our API has a descriptor (RSDL); you should be able to find
everything you need there. Try: /api?rsdl
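As an illustrative sketch of what these calls look like (not taken from this thread - the endpoint path and element names follow the oVirt 3.x REST API guides referenced below, and any VM id is hypothetical), the request body for creating a snapshot can be built like this:

```python
import xml.etree.ElementTree as ET

def snapshot_request_body(description):
    # Body for: POST /api/vms/{vm:id}/snapshots
    # (element names as documented in the oVirt/RHEV 3.x REST API guide)
    snapshot = ET.Element("snapshot")
    ET.SubElement(snapshot, "description").text = description
    return ET.tostring(snapshot, encoding="unicode")

print(snapshot_request_body("before upgrade"))
# → <snapshot><description>before upgrade</description></snapshot>
```

Suspending a VM is similarly a POST to /api/vms/{vm:id}/suspend with an empty <action/> body; the RSDL at /api?rsdl lists the exact parameters each action accepts.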

On 09/21/2012 12:41 PM, Laszlo Hornyak wrote:
> Hi!
> 
> Try this:
> https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Virtualization/3.0/html-single/REST_API_Guide/index.html
> https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Virtualization/3.1-Beta/html/Developer_Guide/pt02.html
> 
> Laszlo
> 
> - Original Message -
>> From: "??" 
>> To: users@ovirt.org
>> Sent: Friday, September 21, 2012 10:11:02 AM
>> Subject: [Users] API Documentation
>>
>>
>>
>>
>>
>> Hi, where can I find documentation for the oVirt API?
>>
>> Features of interest:
>>
>> 1. Suspending a virtual machine
>>
>> 2. Creating a snapshot
>>
>> 3. Importing a snapshot to export storage
>>
>>
>>
>>
>>
>>
>> ___
>> Users mailing list
>> Users@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users


-- 

Michael Pasternak
RedHat, ENG-Virtualization R&D
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[Users] Installation problem

2012-09-21 Thread Dave Neary

Hi all,

I was working through the installation of ovirt-engine today (after 
spending more time than I care to admit struggling with networking & DNS 
issues - VPNs, dnsmasq, "classic" network start-up and iptables/firewall 
rules can interact with each other in strange and surprising ways).


Anyway - I went through the engine set-up successfully, and got the 
expected message at the end: " Installation completed successfully 
**" with a message to visit the engine web application to finish set-up.


Unfortunately, when I connected (after resolving networking issues) to 
the server in question, I got a "Service temporarily unavailable" error 
(503) from Apache.


in httpd's error.log, I have:

 [Fri Sep 21 13:37:03 2012] [error] (111)Connection refused: proxy: AJP: 
attempt to connect to 127.0.0.1:8009 (localhost) failed
 [Fri Sep 21 13:37:03 2012] [error] ap_proxy_connect_backend disabling worker 
for (localhost)
 [Fri Sep 21 13:37:03 2012] [error] proxy: AJP: failed to make connection to 
backend: localhost




When I try to restart the ovirt-engine service, I get the following in 
journalctl:

 Sep 21 13:34:44 clare.neary.home engine-service.py[5172]: The engine PID file 
"/var/run/ovirt-engine.pid" already exists.
 Sep 21 13:34:44 clare.neary.home systemd[1]: PID 1264 read from file 
/var/run/ovirt-engine.pid does not exist.
 Sep 21 13:34:44 clare.neary.home systemd[1]: Unit ovirt-engine.service entered 
failed state.
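Not advice from the thread, but a generic sketch for this symptom: when the PID recorded in the file no longer exists, the file is stale and can be removed before retrying (path taken from the journal output above; run as root):

```shell
#!/bin/sh
# Remove the engine PID file only if the process it names is gone
PIDFILE=/var/run/ovirt-engine.pid
if [ -f "$PIDFILE" ] && ! kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
    echo "removing stale $PIDFILE"
    rm -f "$PIDFILE"
fi
```

The `kill -0` probe sends no signal; it only checks whether the process exists.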




I tried to clean up and restart, but engine-cleanup failed:

[root@clare ovirt-engine]# engine-cleanup -u

Stopping JBoss service...[ DONE ]

Error: Couldn't connect to the database server.Check that connection is working 
and rerun the cleanup utility
Error: Cleanup failed.
please check log at /var/log/ovirt-engine/engine-cleanup_2012_09_21_14_02_37.log




It turns out, in /var/log/messages, that I have these error messages:

Sep 21 14:00:59 clare pg_ctl[5298]: FATAL:  could not create shared memory 
segment: Invalid argument
Sep 21 14:00:59 clare pg_ctl[5298]: DETAIL:  Failed system call was 
shmget(key=5432001, size=36519936, 03600).
Sep 21 14:00:59 clare pg_ctl[5298]: HINT:  This error usually means that 
PostgreSQL's request for a shared memory segment exceeded your kernel's SHMMAX 
parameter.  You can either reduce the request size or reconfigure the kernel 
with larger SHMMAX.  To reduce the request size (currently 36519936 bytes), 
reduce PostgreSQL's shared memory usage, perhaps by reducing shared_buffers or 
max_connections.
Sep 21 14:00:59 clare pg_ctl[5298]: If the request size is already small, it's 
possible that it is less than your kernel's SHMMIN parameter, in which case 
raising the request size or reconfiguring SHMMIN is called for.
Sep 21 14:00:59 clare pg_ctl[5298]: The PostgreSQL documentation contains more 
information about shared memory configuration.
Sep 21 14:01:03 clare pg_ctl[5298]: pg_ctl: could not start server
Sep 21 14:01:03 clare pg_ctl[5298]: Examine the log output.
Sep 21 14:01:03 clare systemd[1]: postgresql.service: control process exited, 
code=exited status=1
Sep 21 14:01:03 clare systemd[1]: Unit postgresql.service entered failed state.


I increased the kernel's SHMMAX, and engine-cleanup worked correctly.
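For anyone hitting the same thing, the change described here is the usual sysctl tweak (a sketch; the 256 MiB value is illustrative - anything comfortably above the ~36 MB request in the log works, and the commands need root):

```shell
# Check the current limit (bytes)
sysctl kernel.shmmax

# Raise it for the running kernel
sysctl -w kernel.shmmax=268435456

# Persist across reboots
echo 'kernel.shmmax = 268435456' >> /etc/sysctl.conf
```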

Has anyone else experienced this issue?


When I re-ran engine-setup, I also got stuck when reconfiguring NFS - 
when engine-setup asked me if I wanted to configure the NFS domain, I 
said "yes", but then it refused to accept my input of "/mnt/iso" since 
it was already in /etc/exports - perhaps engine-cleanup should also 
remove ISO shares managed by ovirt-engine, or else handle it more 
gracefully when someone enters an existing export? The only fix I found 
was to interrupt and restart the engine set-up.


Also, I have no idea whether allowing oVirt to manage iptables will keep 
any extra rules I have added (specifically for DNS services on port 53 
UDP) which I added to the iptables config. I didn't take the risk of 
allowing it to reconfigure iptables the second time.


After all that, I got an error when starting the JBoss service:


Starting JBoss Service... [ ERROR ]
Error: Can't start the ovirt-engine service
Please check log file 
/var/log/ovirt-engine/engine-setup_2012_09_21_14_28_11.log for more information


And when I checked that log file:

2012-09-21 14:30:02::DEBUG::common_utils::790::root:: starting ovirt-engine
2012-09-21 14:30:02::DEBUG::common_utils::835::root:: executing action 
ovirt-engine on service start
2012-09-21 14:30:02::DEBUG::common_utils::309::root:: Executing command --> 
'/sbin/service ovirt-engine start'
2012-09-21 14:30:02::DEBUG::common_utils::335::root:: output =
2012-09-21 14:30:02::DEBUG::common_utils::336::root:: stderr = Redirecting to 
/bin/systemctl start  ovirt-engine.service
Job failed. See system journal and 'systemctl status' for details.

2012-09-21 14:30:02::DEBUG::common_utils::337::root:: retcode = 1
2012-09-21 14:30:02

Re: [Users] non-operational state as host does not meet clusters' minimu CPU level.

2012-09-21 Thread wujieke
Currently, I have created a cluster with the Conroe family. My host/node
works, and I can create some VMs on it.
But I am not sure whether the issue discussed here will impact the VMs'
performance or stability.

-Original Message-
From: Itamar Heim [mailto:ih...@redhat.com] 
Sent: Friday, September 21, 2012 3:57 PM
To: wujieke
Cc: 'Mark Wu'; users@ovirt.org
Subject: Re: [Users] non-operational state as host does not meet clusters'
minimu CPU level.

On 09/21/2012 09:24 AM, wujieke wrote:
> Thanks a lot. Mark.
>
> Attach output for reference.

OK, so libvirt/virsh detects SandyBridge, but vdsm only reports Conroe.
There used to be a bug around this in vdsm - which version of vdsm are you
running?

>
> -Original Message-
> From: Mark Wu [mailto:wu...@linux.vnet.ibm.com]
> Sent: Friday, September 21, 2012 2:15 PM
> To: wujieke
> Cc: 'Itamar Heim'; users@ovirt.org
> Subject: Re: [Users] non-operational state as host does not meet clusters'
> minimu CPU level.
>
> On 09/21/2012 01:01 PM, wujieke wrote:
>> I followed the wiki page to re-install oVirt with the all-in-one version;
>> my local host in oVirt is working now.
>> Thanks a lot.
>>
>> Btw: the cmd "virsh capabilities" complains:
>>
>> [root@localhost ~]# virsh capabilities
>> Please enter your authentication name:
>> Please enter your password:
>> error: Failed to reconnect to the hypervisor
>> error: no valid connection
>> error: authentication failed: Failed to step SASL negotiation: -1
> (SASL(-1):
>> generic failure: All-whitespace username.)
>>
>> any idea?
> Please try "virsh -r capabilities"
>>
>> -Original Message-
>> From: Itamar Heim [mailto:ih...@redhat.com]
>> Sent: Friday, September 21, 2012 12:44 PM
>> To: wujieke
>> Cc: node-de...@ovirt.org; users@ovirt.org
>> Subject: Re: [Users] non-operational state as host does not meet
clusters'
>> minimu CPU level.
>>
>> On 09/21/2012 03:54 AM, wujieke wrote:
>>> [root@localhost ~]# vdsClient -s 0 getVdsCaps | grep -i flags
>>>cpuFlags = fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,mca,cmov,pat,pse36,clflush,dts,acpi,mmx,fxsr,sse,sse2,ss,ht,tm,pbe,syscall,nx,pdpe1gb,rdtscp,lm,constant_tsc,arch_perfmon,pebs,bts,rep_good,nopl,xtopology,nonstop_tsc,aperfmperf,pni,pclmulqdq,dtes64,monitor,ds_cpl,vmx,smx,est,tm2,ssse3,cx16,xtpr,pdcm,pcid,dca,sse4_1,sse4_2,x2apic,popcnt,tsc_deadline_timer,aes,xsave,avx,lahf_lm,ida,arat,epb,xsaveopt,pln,pts,dts,tpr_shadow,vnmi,flexpriority,ept,vpid,model_coreduo,model_Conroe
>>>
>>> seems to only support model_Conroe?
>> and output of: virsh capabilities?
>>
>>
>>> -Original Message-
>>> From: Itamar Heim [mailto:ih...@redhat.com]
>>> Sent: Thursday, September 20, 2012 10:04 PM
>>> To: wujieke
>>> Cc: node-de...@ovirt.org; users@ovirt.org
>>> Subject: Re: [Users] non-operational state as host does not meet
> clusters'
>>> minimu CPU level.
>>>
>>> On 09/20/2012 12:19 PM, wujieke wrote:
 Hi everyone - if this isn't the right mailing list, please point me to it; thanks.

 I am trying to install oVirt on my Xeon E5-2650 processor on a Dell 
 server, which is installed with Fedora 17, while creating a new 
 host that is actually the same server ovirt-engine is running on.

 The host is created and starts "installing", but it ends in a 
 "Non operational" state.

 Error:

 Host CPU type is not compatible with cluster properties, missing 
 CPU
 feature: model_sandybridge.

 But in my cluster I selected the "SandyBridge" CPU, and my Xeon E5 is 
 also in the Sandy Bridge family. This error also causes my server to reboot.

 Any help is appreciated.

 Btw: I have enabled Intel VT in the BIOS and modprobed the kvm and 
 kvm-intel modules. Attached is a screenshot of the error.



 ___
 Users mailing list
 Users@ovirt.org
 http://lists.ovirt.org/mailman/listinfo/users

>>> please send output of this command from the host (not engine) 
>>> vdsClient -s 0 getVdsCaps | grep -i flags
>>>
>>
>> ___
>> Users mailing list
>> Users@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>


___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Fatal error during migration

2012-09-21 Thread Mike Burns
On Fri, 2012-09-21 at 01:58 -0400, Michal Skrivanek wrote:
> Well, looks like 16514 is not open on the node. I guess it should be; TLS 
> migration is new in 3.1, isn't it?
> 

I'm surprised this wasn't caught earlier.  I've submitted a patch to add
the port to the default firewall [1].  

You can run the following command to open the firewall port manually on
ovirt-node.  

python -c 'from ovirtnode.ovirtfunctions import *; 
manage_firewall_port("16514","open","tcp")'

To make it work across reboots, do the following:

 1. Press F2 on the TUI to get a shell
 2. scp the attached patch file to /tmp on ovirt-node (you need to
initiate this from ovirt-node, not from your local machine)
 3. on ovirt-node, run # mount -o remount,rw /
 4. cd /usr/libexec
 5. patch

[1] http://gerrit.ovirt.org/8116
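On a plain Fedora host (rather than ovirt-node), the equivalent one-off fix is to insert the rule the thread identifies as missing and persist it (a sketch; needs root, and "service iptables save" assumes the classic iptables service scripts are in use):

```shell
# Allow inbound libvirt TLS, used for secure migration
iptables -I INPUT -p tcp --dport 16514 -j ACCEPT

# Persist the running ruleset
service iptables save    # or: iptables-save > /etc/sysconfig/iptables
```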



> On 20 Sep 2012, at 15:25, Mike Burns  wrote:
> 
> > On Thu, 2012-09-20 at 06:46 -0400, Doron Fediuck wrote:
> >> 
> >> __
> >>From: "Dmitriy A Pyryakov" 
> >>To: "Michal Skrivanek" 
> >>Cc: users@ovirt.org
> >>Sent: Thursday, September 20, 2012 1:34:46 PM
> >>Subject: Re: [Users] Fatal error during migration
> >> 
> >> 
> >> 
> >>Michal Skrivanek wrote on 20.09.2012 16:23:31:
> >> 
> >>> From: Michal Skrivanek 
> >>> To: Dmitriy A Pyryakov 
> >>> Cc: users@ovirt.org
> >>> Date: 20.09.2012 16:24
> >>> Subject: Re: [Users] Fatal error during migration
> >>> 
> >>> 
> >>> On Sep 20, 2012, at 12:19 , Dmitriy A Pyryakov wrote:
> >>> 
 Michal Skrivanek wrote on 20.09.2012 16:13:16:
>  
> From: Michal Skrivanek 
> To: Dmitriy A Pyryakov 
> Cc: users@ovirt.org
> Date: 20.09.2012 16:13
> Subject: Re: [Users] Fatal error during migration
> > 
> > 
> > On Sep 20, 2012, at 12:07 , Dmitriy A Pyryakov wrote:
> > 
> >> Michal Skrivanek wrote on 20.09.2012 16:02:11:
> >> 
> >>> From: Michal Skrivanek 
> >>> To: Dmitriy A Pyryakov 
> >>> Cc: users@ovirt.org
> >>> Date: 20.09.2012 16:02
> >>> Subject: Re: [Users] Fatal error during migration
> >>> 
> >>> Hi,
> >>> well, so what is the other side saying? Maybe some
> >>connectivity 
> >>> problems between those 2 hosts? firewall? 
> >>> 
> >>> Thanks,
> >>> michal
> >> 
> >> Yes, the firewall is not configured properly by default.
> >> If I stop it, migration completes.
> >> Thanks.
> > The default is supposed to be:
> > 
> > # oVirt default firewall configuration. Automatically
> >>generated by 
> > vdsm bootstrap script.
> > *filter
> > :INPUT ACCEPT [0:0]
> > :FORWARD ACCEPT [0:0]
> > :OUTPUT ACCEPT [0:0]
> > -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
> > -A INPUT -p icmp -j ACCEPT
> > -A INPUT -i lo -j ACCEPT
> > # vdsm
> > -A INPUT -p tcp --dport 54321 -j ACCEPT
> > # libvirt tls
> > -A INPUT -p tcp --dport 16514 -j ACCEPT
> > # SSH
> > -A INPUT -p tcp --dport 22 -j ACCEPT
> > # guest consoles
> > -A INPUT -p tcp -m multiport --dports 5634:6166 -j
> >>ACCEPT
> > # migration
> > -A INPUT -p tcp -m multiport --dports 49152:49216 -j
> >>ACCEPT
> > # snmp
> > -A INPUT -p udp --dport 161 -j ACCEPT
> > # Reject any other input traffic
> > -A INPUT -j REJECT --reject-with icmp-host-prohibited
> > -A FORWARD -m physdev ! --physdev-is-bridged -j REJECT
> >>--reject-with
> > icmp-host-prohibited
> > COMMIT
>  
>  my default is:
>  
>  # cat /etc/sysconfig/iptables
>  # oVirt automatically generated firewall configuration
>  *filter
>  :INPUT ACCEPT [0:0]
>  :FORWARD ACCEPT [0:0]
>  :OUTPUT ACCEPT [0:0]
>  -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
>  -A INPUT -p icmp -j ACCEPT
>  -A INPUT -i lo -j ACCEPT
>  #vdsm
>  -A INPUT -p tcp --dport 54321 -j ACCEPT
>  # SSH
>  -A INPUT -p tcp --dport 22 -j ACCEPT
>  # guest consoles
>  -A INPUT -p tcp -m multiport --dports 5634:6166 -j ACCEPT
>  # migration
>  -A INPUT -p tcp -m multiport --dports 49152:49216 -j
> >>ACCEPT
>  # snmp
>  -A INPUT -p udp --dport 161 -j ACCEPT
>  #
>  -A INPUT -j REJECT --reject-with icmp-host-prohibited
>  -A FORWARD -m physdev ! --physdev-is-bridged -j REJECT
> >>--reject-
> >>> with icmp-host-prohibited
>  COMMIT
>  
> > 
> > did you change it manually or is the default missing
> >>anything?
>  
 the default is missing the "libvirt tls" rule.
> >>> was it an upgrade of some sort?
> >>No.
> >> 
> >>> These are installed at node setup 
> >>> from ovirt-engine. Check the engine version and/or the 
> >>> IPTablesConfig value in the vdc_options table on the engine.
> >> 
> >>oVirt engine version: 3.1.0-2.fc17

Re: [Users] SPM not selected after host failed

2012-09-21 Thread Marc-Christian Schröer | ingenit GmbH & Co. KG
Am 20.09.2012 16:01, schrieb Itamar Heim:

Thanks again.

> yes... no auto recovery if can't verify node was fenced.
> for your tests, maybe power off the machine for your tests as opposed to "no 
> power"?
>

So I figured I could use our Eaton/Raritan metered PDUs to allow fencing
the designated SPM nodes, but then realized that controlling PDUs via SNMP
is not supported by oVirt. Any chance that is going to change? Or can you
point me to the Java class where I can add this?

Kind regards,
   Marc

PS: Sent the reply to Itamar directly. Sorry about that...
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] API Documentation

2012-09-21 Thread Laszlo Hornyak
Hi!

Try this:
https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Virtualization/3.0/html-single/REST_API_Guide/index.html
https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Virtualization/3.1-Beta/html/Developer_Guide/pt02.html

Laszlo

- Original Message -
> From: "??" 
> To: users@ovirt.org
> Sent: Friday, September 21, 2012 10:11:02 AM
> Subject: [Users] API Documentation
> 
> 
> 
> 
> 
> Hi, where can I find documentation for the oVirt API?
> 
> Features of interest:
> 
> 1. Suspending a virtual machine
> 
> 2. Creating a snapshot
> 
> 3. Importing a snapshot to export storage
> 
> 
> 
> 
> 
> 
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
> 
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] non-operational state as host does not meet clusters' minimu CPU level.

2012-09-21 Thread wujieke
Some package info.

[root@localhost ~]# rpm -qa | grep vdsm
vdsm-bootstrap-4.10.0-7.fc17.noarch
vdsm-xmlrpc-4.10.0-7.fc17.noarch
vdsm-python-4.10.0-7.fc17.x86_64
vdsm-4.10.0-7.fc17.x86_64
vdsm-cli-4.10.0-7.fc17.noarch
[root@localhost ~]# rpm -qa | grep ovirt
ovirt-image-uploader-3.1.0-0.git9c42c8.fc17.noarch
ovirt-engine-notification-service-3.1.0-2.fc17.noarch
ovirt-engine-dbscripts-3.1.0-2.fc17.noarch
ovirt-engine-3.1.0-2.fc17.noarch
ovirt-release-fedora-4-2.noarch
ovirt-iso-uploader-3.1.0-0.git1841d9.fc17.noarch
ovirt-engine-setup-3.1.0-2.fc17.noarch
ovirt-engine-webadmin-portal-3.1.0-2.fc17.noarch
ovirt-engine-setup-plugin-allinone-3.1.0-2.fc17.noarch
ovirt-engine-sdk-3.1.0.4-1.fc17.noarch
ovirt-engine-backend-3.1.0-2.fc17.noarch
ovirt-engine-config-3.1.0-2.fc17.noarch
ovirt-engine-userportal-3.1.0-2.fc17.noarch
ovirt-engine-genericapi-3.1.0-2.fc17.noarch
ovirt-engine-restapi-3.1.0-2.fc17.noarch
ovirt-log-collector-3.1.0-0.git10d719.fc17.noarch
ovirt-engine-tools-common-3.1.0-2.fc17.noarch
[root@localhost ~]# cat /etc/issue
Fedora release 17 (Beefy Miracle)
Kernel \r on an \m (\l)



-Original Message-
From: Itamar Heim [mailto:ih...@redhat.com] 
Sent: Friday, September 21, 2012 3:57 PM
To: wujieke
Cc: 'Mark Wu'; users@ovirt.org
Subject: Re: [Users] non-operational state as host does not meet clusters'
minimu CPU level.

On 09/21/2012 09:24 AM, wujieke wrote:
> Thanks a lot. Mark.
>
> Attach output for reference.

OK, so libvirt/virsh detects SandyBridge, but vdsm only reports Conroe.
There used to be a bug around this in vdsm - which version of vdsm are you
running?

>
> -Original Message-
> From: Mark Wu [mailto:wu...@linux.vnet.ibm.com]
> Sent: Friday, September 21, 2012 2:15 PM
> To: wujieke
> Cc: 'Itamar Heim'; users@ovirt.org
> Subject: Re: [Users] non-operational state as host does not meet clusters'
> minimu CPU level.
>
> On 09/21/2012 01:01 PM, wujieke wrote:
>> I followed the wiki page to re-install oVirt with the all-in-one version;
>> my local host in oVirt is working now.
>> Thanks a lot.
>>
>> Btw: the cmd "virsh capabilities" complains:
>>
>> [root@localhost ~]# virsh capabilities
>> Please enter your authentication name:
>> Please enter your password:
>> error: Failed to reconnect to the hypervisor
>> error: no valid connection
>> error: authentication failed: Failed to step SASL negotiation: -1
> (SASL(-1):
>> generic failure: All-whitespace username.)
>>
>> any idea?
> Please try "virsh -r capabilities"
>>
>> -Original Message-
>> From: Itamar Heim [mailto:ih...@redhat.com]
>> Sent: Friday, September 21, 2012 12:44 PM
>> To: wujieke
>> Cc: node-de...@ovirt.org; users@ovirt.org
>> Subject: Re: [Users] non-operational state as host does not meet
clusters'
>> minimu CPU level.
>>
>> On 09/21/2012 03:54 AM, wujieke wrote:
>>> [root@localhost ~]# vdsClient -s 0 getVdsCaps | grep -i flags
>>>cpuFlags = fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,mca,cmov,pat,pse36,clflush,dts,acpi,mmx,fxsr,sse,sse2,ss,ht,tm,pbe,syscall,nx,pdpe1gb,rdtscp,lm,constant_tsc,arch_perfmon,pebs,bts,rep_good,nopl,xtopology,nonstop_tsc,aperfmperf,pni,pclmulqdq,dtes64,monitor,ds_cpl,vmx,smx,est,tm2,ssse3,cx16,xtpr,pdcm,pcid,dca,sse4_1,sse4_2,x2apic,popcnt,tsc_deadline_timer,aes,xsave,avx,lahf_lm,ida,arat,epb,xsaveopt,pln,pts,dts,tpr_shadow,vnmi,flexpriority,ept,vpid,model_coreduo,model_Conroe
>>>
>>> seems to only support model_Conroe?
>> and output of: virsh capabilities?
>>
>>
>>> -Original Message-
>>> From: Itamar Heim [mailto:ih...@redhat.com]
>>> Sent: Thursday, September 20, 2012 10:04 PM
>>> To: wujieke
>>> Cc: node-de...@ovirt.org; users@ovirt.org
>>> Subject: Re: [Users] non-operational state as host does not meet
> clusters'
>>> minimu CPU level.
>>>
>>> On 09/20/2012 12:19 PM, wujieke wrote:
 Hi everyone - if this isn't the right mailing list, please point me to it; thanks.

 I am trying to install oVirt on my Xeon E5-2650 processor on a Dell 
 server, which is installed with Fedora 17, while creating a new 
 host that is actually the same server ovirt-engine is running on.

 The host is created and starts "installing", but it ends in a 
 "Non operational" state.

 Error:

 Host CPU type is not compatible with cluster properties, missing 
 CPU
 feature: model_sandybridge.

 But in my cluster I selected the "SandyBridge" CPU, and my Xeon E5 is 
 also in the Sandy Bridge family. This error also causes my server to reboot.

 Any help is appreciated.

 Btw: I have enabled Intel VT in the BIOS and modprobed the kvm and 
 kvm-intel modules. Attached is a screenshot of the error.



 ___
 Users mailing list
 Users@ovirt.org
 http://lists.ovirt.org/mailman/listinfo/users

>>> please send output of thi

[Users] API Documentation

2012-09-21 Thread ??????
Hi, where can I find documentation for the oVirt API?

Features of interest:

1. Suspending a virtual machine

2. Creating a snapshot

3. Importing a snapshot to export storage

 

 

 

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] non-operational state as host does not meet clusters' minimu CPU level.

2012-09-21 Thread Itamar Heim

On 09/21/2012 09:24 AM, wujieke wrote:

Thanks a lot. Mark.

Attach output for reference.


OK, so libvirt/virsh detects SandyBridge, but vdsm only reports Conroe.
There used to be a bug around this in vdsm - which version of vdsm are 
you running?




-Original Message-
From: Mark Wu [mailto:wu...@linux.vnet.ibm.com]
Sent: Friday, September 21, 2012 2:15 PM
To: wujieke
Cc: 'Itamar Heim'; users@ovirt.org
Subject: Re: [Users] non-operational state as host does not meet clusters'
minimu CPU level.

On 09/21/2012 01:01 PM, wujieke wrote:

I followed the wiki page to re-install oVirt with the all-in-one version;
my local host in oVirt is working now.
Thanks a lot.

Btw: the cmd "virsh capabilities" complains:

[root@localhost ~]# virsh capabilities
Please enter your authentication name:
Please enter your password:
error: Failed to reconnect to the hypervisor
error: no valid connection
error: authentication failed: Failed to step SASL negotiation: -1

(SASL(-1):

generic failure: All-whitespace username.)

any idea?

Please try "virsh -r capabilities"


-Original Message-
From: Itamar Heim [mailto:ih...@redhat.com]
Sent: Friday, September 21, 2012 12:44 PM
To: wujieke
Cc: node-de...@ovirt.org; users@ovirt.org
Subject: Re: [Users] non-operational state as host does not meet clusters'
minimu CPU level.

On 09/21/2012 03:54 AM, wujieke wrote:

[root@localhost ~]# vdsClient -s 0 getVdsCaps | grep -i flags
   cpuFlags = fpu,vme,de,pse,tsc,msr,pae,mce,cx8,apic,sep,mtrr,pge,mca,cmov,pat,pse36,clflush,dts,acpi,mmx,fxsr,sse,sse2,ss,ht,tm,pbe,syscall,nx,pdpe1gb,rdtscp,lm,constant_tsc,arch_perfmon,pebs,bts,rep_good,nopl,xtopology,nonstop_tsc,aperfmperf,pni,pclmulqdq,dtes64,monitor,ds_cpl,vmx,smx,est,tm2,ssse3,cx16,xtpr,pdcm,pcid,dca,sse4_1,sse4_2,x2apic,popcnt,tsc_deadline_timer,aes,xsave,avx,lahf_lm,ida,arat,epb,xsaveopt,pln,pts,dts,tpr_shadow,vnmi,flexpriority,ept,vpid,model_coreduo,model_Conroe

seems to only support model_Conroe?

and output of: virsh capabilities?



-Original Message-
From: Itamar Heim [mailto:ih...@redhat.com]
Sent: Thursday, September 20, 2012 10:04 PM
To: wujieke
Cc: node-de...@ovirt.org; users@ovirt.org
Subject: Re: [Users] non-operational state as host does not meet

clusters'

minimu CPU level.

On 09/20/2012 12:19 PM, wujieke wrote:

Hi everyone - if this isn't the right mailing list, please point me to it; thanks.

I am trying to install oVirt on my Xeon E5-2650 processor on a Dell
server, which is installed with Fedora 17, while creating a new host
that is actually the same server ovirt-engine is running on.

The host is created and starts "installing", but it ends in a
"Non operational" state.

Error:

Host CPU type is not compatible with cluster properties, missing CPU
feature: model_sandybridge.

But in my cluster I selected the "SandyBridge" CPU, and my Xeon E5 is
also in the Sandy Bridge family. This error also causes my server to reboot.

Any help is appreciated.

Btw: I have enabled Intel VT in the BIOS and modprobed the kvm and
kvm-intel modules. Attached is a screenshot of the error.



___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


please send output of this command from the host (not engine)
vdsClient -s 0 getVdsCaps | grep -i flags



___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users




___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users