[ovirt-users] SPM contending loop

2020-01-13 Thread Jayme
My cluster appears to be experiencing an SPM problem. I recently placed
each host in maintenance to move the ovirt management network to another
interface.  All was successful and all VMs are currently running.  However,
I'm now facing an SPM contending loop, with the data center going in and out of
responsive status.

I have a 3-server HCI setup. All volumes are active and healed, and there
are no unsynced entries or split brains.

Does anyone know how I could diagnose the SPM issue?

engine.log:

2020-01-13 22:24:54,777-04 INFO
 [org.ovirt.engine.core.vdsbroker.gluster.GlusterTasksListVDSCommand]
(DefaultQuartzScheduler2) [213adf4f] START,
GlusterTasksListVDSCommand(HostName = Orchard0,
VdsIdVDSCommandParametersBase:{hostId='771c67eb-56e6-4736-8c67-668502d4ecf5'}),
log id: 349f80a9
2020-01-13 22:24:55,231-04 INFO
 [org.ovirt.engine.core.vdsbroker.gluster.GlusterTasksListVDSCommand]
(DefaultQuartzScheduler2) [213adf4f] FINISH, GlusterTasksListVDSCommand,
return: [], log id: 349f80a9
2020-01-13 22:24:58,245-04 INFO
 [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand]
(DefaultQuartzScheduler3) [4f66c75b] START,
GlusterServersListVDSCommand(HostName = Orchard0,
VdsIdVDSCommandParametersBase:{hostId='771c67eb-56e6-4736-8c67-668502d4ecf5'}),
log id: 7b04f110
2020-01-13 22:24:58,887-04 INFO
 [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-72) [] Command
'org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand'
return value '
TaskStatusListReturn:{status='Status [code=654, message=Not SPM]'}
'
2020-01-13 22:24:58,888-04 INFO
 [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-72) [] HostName = Orchard1
2020-01-13 22:24:58,888-04 ERROR
[org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-72) [] Command
'HSMGetAllTasksStatusesVDSCommand(HostName = Orchard1,
VdsIdVDSCommandParametersBase:{hostId='fb1e62d5-1dc1-4ccc-8b2b-cf48f7077d0d'})'
execution failed: IRSGenericException: IRSErrorException:
IRSNonOperationalException: Not SPM
2020-01-13 22:24:59,034-04 INFO
 [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand]
(DefaultQuartzScheduler3) [4f66c75b] FINISH, GlusterServersListVDSCommand,
return: [10.12.0.220/24:CONNECTED, orchard1.grove.silverorange.com:CONNECTED,
orchard2.grove.silverorange.com:DISCONNECTED], log id: 7b04f110
2020-01-13 22:24:59,049-04 INFO
 [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand]
(DefaultQuartzScheduler3) [4f66c75b] START,
GlusterServersListVDSCommand(HostName = Orchard2,
VdsIdVDSCommandParametersBase:{hostId='fd0752d8-2d41-45b0-887a-0ffacbb8a237'}),
log id: 43f1dd82
2020-01-13 22:24:59,099-04 INFO
 [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-72) [] START,
ConnectStoragePoolVDSCommand(HostName = Orchard1,
ConnectStoragePoolVDSCommandParameters:{hostId='fb1e62d5-1dc1-4ccc-8b2b-cf48f7077d0d',
vdsId='fb1e62d5-1dc1-4ccc-8b2b-cf48f7077d0d',
storagePoolId='a45e442e-9989-11e8-b0e4-00163e4bf18a', masterVersion='1'}),
log id: 2b397b31
2020-01-13 22:24:59,099-04 INFO
 [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-72) [] Executing with
domain map: {edc68a7c-7604-47e6-89bc-3738d727e8fc=active,
23c22a0f-0482-425e-8ada-730cf8ec0751=active,
390c0320-e843-4ff3-a4bb-a9973058447f=active,
fb43d33a-82c8-44cb-8169-090cd0d8f56e=active,
d70b171e-7488-4d52-8cad-bbc581dbf16e=active,
1f2e9989-9ab3-43d5-971d-568b8feca918=active}
2020-01-13 22:24:59,850-04 INFO
 [org.ovirt.engine.core.vdsbroker.gluster.GlusterServersListVDSCommand]
(DefaultQuartzScheduler3) [4f66c75b] FINISH, GlusterServersListVDSCommand,
return: [10.12.0.222/24:CONNECTED, 10.11.0.220:CONNECTED,
orchard1.grove.silverorange.com:CONNECTED], log id: 43f1dd82
2020-01-13 22:24:59,852-04 INFO
 [org.ovirt.engine.core.vdsbroker.gluster.GlusterVolumesListVDSCommand]
(DefaultQuartzScheduler3) [4f66c75b] START,
GlusterVolumesListVDSCommand(HostName = Orchard0,
GlusterVolumesListVDSParameters:{hostId='771c67eb-56e6-4736-8c67-668502d4ecf5'}),
log id: 263be6f8
2020-01-13 22:25:00,019-04 INFO
 [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStoragePoolVDSCommand]
(EE-ManagedThreadFactory-engineScheduled-Thread-72) [] FINISH,
ConnectStoragePoolVDSCommand, return: , log id: 2b397b31
2020-01-13 22:25:00,036-04 INFO
 [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy]
(EE-ManagedThreadFactory-engineScheduled-Thread-72) []
hostFromVds::selectedVds - 'Orchard1', spmStatus 'Free', storage pool
'Default', storage pool version '4.3'
2020-01-13 22:25:00,056-04 INFO
 [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy]
(EE-ManagedThreadFactory-engineScheduled-Thread-72) [] starting spm on vds
'Orchard1', storage pool 'Default', prevI
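
One way to start diagnosing a contending loop like the one in the excerpt above is to tally the SPM-related events per host. The following is a minimal Python sketch, not an oVirt tool: the category names and the summarize_spm_events helper are made up for illustration, the regular expressions are copied from the messages quoted above, and it assumes each entry sits on a single line in the real engine.log (the excerpt above is wrapped by the mail client).

```python
import re
from collections import Counter

# Patterns copied from the engine.log messages quoted in this thread.
# The category names on the left are illustrative, not oVirt terminology.
SPM_PATTERNS = {
    "not_spm": re.compile(r"message=Not SPM|IRSNonOperationalException: Not SPM"),
    "spm_start": re.compile(r"starting spm on vds '(?P<host>[^']+)'"),
    "spm_status": re.compile(
        r"selectedVds - '(?P<host>[^']+)', spmStatus '(?P<status>[^']+)'"
    ),
}


def summarize_spm_events(lines):
    """Count SPM-related log entries, keyed by (category, host).

    The host is an empty string when the matching entry does not name one.
    """
    counts = Counter()
    for line in lines:
        for name, pattern in SPM_PATTERNS.items():
            match = pattern.search(line)
            if match:
                host = match.groupdict().get("host", "")
                counts[(name, host)] += 1
    return counts


# Sample entries shortened from the excerpt above (assumed to be one line each).
sample = [
    "INFO [...IrsProxy] hostFromVds::selectedVds - 'Orchard1', spmStatus 'Free', storage pool 'Default'",
    "INFO [...IrsProxy] starting spm on vds 'Orchard1', storage pool 'Default'",
    "ERROR [...] execution failed: IRSGenericException: IRSErrorException: IRSNonOperationalException: Not SPM",
]

for (event, host), count in sorted(summarize_spm_events(sample).items()):
    print(event, host or "-", count)
```

The same function accepts an open file object, so it can be run over a whole log. A host that shows many spm_start entries in a short window is the one repeatedly contending.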

[ovirt-users] SPM and Task error ...

2019-07-25 Thread Enrico

Hi all,
my oVirt cluster has 3 hypervisors running CentOS 7.5.1804; vdsm is
4.20.39.1-1.el7, oVirt engine is 4.2.4.5-1.el7, and the storage systems are
HP MSA P2000 and 2050 (Fibre Channel).

I need to stop one of the hypervisors for maintenance, but this system is
the storage pool manager (SPM).

For this reason I decided to manually activate SPM on one of the other
nodes, but this operation is not successful.

In the ovirt engine (engine.log) the error is this:

2019-07-25 12:39:16,744+02 INFO 
[org.ovirt.engine.core.bll.storage.pool.ForceSelectSPMCommand] (default 
task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] Running command: 
ForceSelectSPMCommand internal: false. Entities affected : ID: 
81c9bd3c-ae0a-467f-bf7f-63ab30cd8d9e Type: VDSAction group 
MANIPULATE_HOST with role type ADMIN
2019-07-25 12:39:16,745+02 INFO 
[org.ovirt.engine.core.vdsbroker.irsbroker.SpmStopOnIrsVDSCommand] 
(default task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] START, 
SpmStopOnIrsVDSCommand( 
SpmStopOnIrsVDSCommandParameters:{storagePoolId='18d57688-6ed4-43b8-bd7c-0665b55950b7', 
ignoreFailoverLimit='false'}), log id: 37bf4639
2019-07-25 12:39:16,747+02 INFO 
[org.ovirt.engine.core.vdsbroker.irsbroker.ResetIrsVDSCommand] (default 
task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] START, 
ResetIrsVDSCommand( 
ResetIrsVDSCommandParameters:{storagePoolId='18d57688-6ed4-43b8-bd7c-0665b55950b7', 
ignoreFailoverLimit='false', 
vdsId='751f3e99-b95e-4c31-bc38-77f5661a0bdc', 
ignoreStopFailed='false'}), log id: 2522686f
2019-07-25 12:39:16,749+02 INFO 
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default 
task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] START, 
SpmStopVDSCommand(HostName = infn-vm05.management, 
SpmStopVDSCommandParameters:{hostId='751f3e99-b95e-4c31-bc38-77f5661a0bdc', 
storagePoolId='18d57688-6ed4-43b8-bd7c-0665b55950b7'}), log id: 1810fd8b
2019-07-25 12:39:16,758+02 *ERROR* 
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default 
task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] SpmStopVDSCommand::Not 
stopping SPM on vds 'infn-vm05.management', pool id 
'18d57688-6ed4-43b8-bd7c-0665b55950b7' as there are uncleared tasks 
'Task 'fdcf4d1b-82fe-49a6-b233-323ebe568f8e', status 'running''
2019-07-25 12:39:16,758+02 INFO 
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default 
task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] FINISH, 
SpmStopVDSCommand, log id: 1810fd8b
2019-07-25 12:39:16,758+02 INFO 
[org.ovirt.engine.core.vdsbroker.irsbroker.ResetIrsVDSCommand] (default 
task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] FINISH, 
ResetIrsVDSCommand, log id: 2522686f
2019-07-25 12:39:16,758+02 INFO 
[org.ovirt.engine.core.vdsbroker.irsbroker.SpmStopOnIrsVDSCommand] 
(default task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] FINISH, 
SpmStopOnIrsVDSCommand, log id: 37bf4639
2019-07-25 12:39:16,760+02 *ERROR* 
[org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] 
(default task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] EVENT_ID: 
USER_FORCE_SELECTED_SPM_STOP_FAILED(4,096), Failed to force select 
infn-vm07.management as the SPM due to a failure to stop the current SPM.
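
The ERROR entry above names the task that blocks the SPM handover. As a hedged illustration, here is a small Python sketch that pulls the task IDs and their status out of such a message. The uncleared_tasks helper is hypothetical, and the regular expression is based only on the message format quoted above. It identifies the blocking task; how to clear it is not covered here.

```python
import re

# Matches the "Task '<uuid>', status '<state>'" fragment seen in the
# SpmStopVDSCommand error quoted above.
TASK_RE = re.compile(r"Task '(?P<task_id>[0-9a-f-]{36})', status '(?P<status>\w+)'")


def uncleared_tasks(log_text):
    """Return (task_id, status) pairs from 'uncleared tasks' messages."""
    if "uncleared tasks" not in log_text:
        return []
    return [(m["task_id"], m["status"]) for m in TASK_RE.finditer(log_text)]


# The message from the excerpt above, joined onto one line.
message = (
    "SpmStopVDSCommand::Not stopping SPM on vds 'infn-vm05.management', "
    "pool id '18d57688-6ed4-43b8-bd7c-0665b55950b7' as there are uncleared "
    "tasks 'Task 'fdcf4d1b-82fe-49a6-b233-323ebe568f8e', status 'running''"
)

for task_id, status in uncleared_tasks(message):
    print(task_id, status)
```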


while in the hypervisor (SPM) vdsm.log:

2019-07-25 12:39:16,744+02 INFO 
[org.ovirt.engine.core.bll.storage.pool.ForceSelectSPMCommand] (default 
task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] Running command: 
ForceSelectSPMCommand internal: false. Entities affected : ID: 
81c9bd3c-ae0a-467f-bf7f-63ab30cd8d9e Type: VDSAction group 
MANIPULATE_HOST with role type ADMIN
2019-07-25 12:39:16,745+02 INFO 
[org.ovirt.engine.core.vdsbroker.irsbroker.SpmStopOnIrsVDSCommand] 
(default task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] START, 
SpmStopOnIrsVDSCommand( 
SpmStopOnIrsVDSCommandParameters:{storagePoolId='18d57688-6ed4-43b8-bd7c-0665b55950b7', 
ignoreFailoverLimit='false'}), log id: 37bf4639
2019-07-25 12:39:16,747+02 INFO 
[org.ovirt.engine.core.vdsbroker.irsbroker.ResetIrsVDSCommand] (default 
task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] START, 
ResetIrsVDSCommand( 
ResetIrsVDSCommandParameters:{storagePoolId='18d57688-6ed4-43b8-bd7c-0665b55950b7', 
ignoreFailoverLimit='false', 
vdsId='751f3e99-b95e-4c31-bc38-77f5661a0bdc', 
ignoreStopFailed='false'}), log id: 2522686f
2019-07-25 12:39:16,749+02 INFO 
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default 
task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] START, 
SpmStopVDSCommand(HostName = infn-vm05.management, 
SpmStopVDSCommandParameters:{hostId='751f3e99-b95e-4c31-bc38-77f5661a0bdc', 
storagePoolId='18d57688-6ed4-43b8-bd7c-0665b55950b7'}), log id: 1810fd8b
2019-07-25 12:39:16,758+02 *ERROR* 
[org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (default 
task-30) [7c374384-f884-4dc9-87d0-7af27dce706b] SpmStopVDSCommand::Not 
stopping SPM on vds 'infn-vm05.management', pool id 
'18d57688-6ed4-43b8-bd7c-0665b55950b7' as there are uncleared tasks 
'Task 'fdcf4d1b-82fe-49a6-b233-323ebe568f8e', status 'running''
2019

[ovirt-users] SPM recovery after disaster

2017-10-02 Thread Alexander Vrublevskiy

Hello Community!
Recently we had a disaster with our oVirt 4.1 three-node cluster
with HE and a GlusterFS (RF=3) storage domain. We moved one node to
maintenance, and during the actual maintenance one of the working nodes,
the one with the SPM role, went down. It was a hardware failure, so we had
to remove it from the cluster.
After tinkering around we now have an almost working cluster with two
nodes and GlusterFS RF=2. But the problem is that oVirt can't find an SPM
and is spamming the web interface logs with the "HSMGetAllTasksStatusesVDS
failed: Not SPM" error.
After some time of operating with this configuration we somehow lost the
contents of dom_md.
It looks like these two problems are related and the second one is a
consequence of the first.
Please suggest how to recover the SPM and dom_md. Is there a way to recreate both?
TIA
Regards
Alex
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] SPM in case of Failure

2017-06-13 Thread Maor Lipchuk
Hi arsene,

See my comments inline

On Mon, Jun 12, 2017 at 1:02 PM, Arsène Gschwind
 wrote:
> Hi
>
> Our setup looks like:
>
> - 2 clusters in 2 different sites connected with a 10 Gbit LAN
> - Storage based on FC SAN, replicated on both sites and available to both
> sites (the LUNs are available over 4 paths, 2 from each site)
>
> My observation:
>
> In case one site goes down and that site owned the SPM, it is not possible to
> move or force the SPM to the second site.

It could be a sanlock issue.
The SPM uses sanlock on the storage domain, so once the SPM host is
rebooted and its sanlock lock on the storage domain is released (IINM
after 80 seconds), another host can obtain a lock on that storage
domain and become the new SPM.
What message do you get in the logs when you try to do that?
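
To make the timing in the paragraph above concrete, here is a deliberately simplified Python model, not sanlock's actual algorithm. It assumes the roughly 80-second figure from the reply (which is itself hedged with "IINM") and treats the lock as free once that much time has passed since the old SPM host last held it. The names ASSUMED_LEASE_TIMEOUT, earliest_takeover and can_take_over are invented for this sketch.

```python
from datetime import datetime, timedelta

# Assumption: ~80 s is the figure recalled in the reply above ("IINM"),
# not a value verified against sanlock itself.
ASSUMED_LEASE_TIMEOUT = timedelta(seconds=80)


def earliest_takeover(last_held, timeout=ASSUMED_LEASE_TIMEOUT):
    """Earliest time another host could obtain the lock in this model."""
    return last_held + timeout


def can_take_over(now, last_held, timeout=ASSUMED_LEASE_TIMEOUT):
    """True once the assumed timeout has elapsed since the old SPM went away."""
    return now >= earliest_takeover(last_held, timeout)


# Hypothetical moment at which the old SPM host went down.
went_down = datetime(2017, 6, 12, 13, 0, 0)
print(earliest_takeover(went_down))
print(can_take_over(went_down + timedelta(seconds=30), went_down))
print(can_take_over(went_down + timedelta(seconds=90), went_down))
```

If a second host still cannot become SPM well after this window, the model suggests looking elsewhere, which is why the log message asked for above matters.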


> On the site which is down it's possible to reset all VMs that crashed using
> the "Confirm Host rebooted" menu on the oVirt Host but this does not reset
> SPM.
> The only solution I found was to bring the Host which owned SPM up again to
> be able to move it to the other site and then reactivate the storage
> domains.

I would try to attach the storage domain (detach it first if it is
already attached) so you can register any VMs/Templates/Disks that
were added in the original env.

>
> Is this a normal behavior?
> Is there any way to force SPM reelection ?
>
> Thanks for your help or idea...
>
> Regards,
> Arsène
>
> --
>
> Arsène Gschwind
> Fa. Sapify AG im Auftrag der Universität Basel
> IT Services
> Klingelbergstr. 70 |  CH-4056 Basel  |  Switzerland
> Tel. +41 79 449 25 63  |  http://its.unibas.ch
> ITS-ServiceDesk: support-...@unibas.ch | +41 61 267 14 11
>
>


[ovirt-users] SPM in case of Failure

2017-06-12 Thread Arsène Gschwind

Hi

Our setup looks like:

- 2 clusters in 2 different sites connected with a 10 Gbit LAN
- Storage based on FC SAN, replicated on both sites and available to both
site (the LUNs are available over 4 paths, 2 from each site)


My observation:

In case one site goes down and that site owned the SPM, it is not possible to
move or force the SPM to the second site.
On the site which is down it's possible to reset all VMs that crashed
using the "Confirm Host rebooted" menu on the oVirt Host, but this does
not reset the SPM.
The only solution I found was to bring the Host which owned the SPM up again
to be able to move it to the other site and then reactivate the storage
domains.


Is this normal behavior?
Is there any way to force SPM re-election?

Thanks for your help or idea...

Regards,
Arsène

--

*Arsène Gschwind*
Fa. Sapify AG im Auftrag der Universität Basel
IT Services
Klingelbergstr. 70 |  CH-4056 Basel  |  Switzerland
Tel. +41 79 449 25 63  | http://its.unibas.ch 
ITS-ServiceDesk: support-...@unibas.ch | +41 61 267 14 11



Re: [ovirt-users] SPM

2016-01-02 Thread Nir Soffer
On Thu, Dec 31, 2015 at 5:20 PM, Fernando Fuentes  wrote:
> Team,
>
> I noticed that my SPM moved to another host which was odd because I have
> a set SPM.

What do you mean by "set SPM"?

> Somehow when that happened two of my hosts went down and all my VMs went
> into pause state.
> The oddity behind all this is that my primary storage, which has always
> been my SPM, was online without any issues.

Your primary storage is a hypervisor used as SPM?

> What could have caused that? And is there a way to prevent the SPM from
> migrating unless there is an issue?

Can you attach the engine log showing the timeframe when the SPM moved to another
host?

Can you attach logs from the host used as SPM showing the same timeframe?

Nir


[ovirt-users] SPM

2015-12-31 Thread Fernando Fuentes
Team,

I noticed that my SPM moved to another host which was odd because I have
a set SPM.
Somehow when that happened two of my hosts went down and all my VMs went
into pause state.
The oddity behind all this is that my primary storage, which has always
been my SPM, was online without any issues.

What could have caused that? And is there a way to prevent the SPM from
migrating unless there is an issue?

-- 
Fernando Fuentes
ffuen...@txweather.org
http://www.txweather.org


Re: [ovirt-users] spm changes hourly due to unexpected error of getAllTasksStatuses

2015-06-23 Thread like...@cs2c.com.cn
Thank you for your reply.

The result on both hosts is zh_CN.utf8.
But I think the locale may not be the problem,
because when I run 'vdsClient -s 0 getAllTasksStatuses' manually on the SPM
host, there is no problem (no errors in vdsm.log).



like...@cs2c.com.cn
 


Re: [ovirt-users] spm changes hourly due to unexpected error of getAllTasksStatuses

2015-06-23 Thread Eli Mesika


- Original Message -
> From: "like ma" 
> To: de...@ovirt.org, users@ovirt.org
> Sent: Tuesday, June 23, 2015 11:26:17 AM
> Subject: [ovirt-users] spm changes hourly due to unexpected error of  
> getAllTasksStatuses
> 
> Hello all,
> 
> In our oVirt environment (oVirt 3.4.4), we have 2 hosts, and the SPM changes
> hourly between the 2 hosts. In the Data Center events there is the following
> message each hour:
> 'Data Center is being initialized, please wait for initialization to
> complete'
> 
> In vdsm.log I found the following error message:
> Thread-13::ERROR::2015-06-23
> 13:38:36,667::task::866::TaskManager.Task::(_setError)
> Task=`e450d7d6-771d-4c51-90b7-c6b10da37897`::Unexpected error
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/task.py", line 873, in _run
>     return fn(*args, **kargs)
>   File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
>     res = f(*args, **kwargs)
>   File "/usr/share/vdsm/storage/hsm.py", line 2125, in getAllTasksStatuses
>     allTasksStatus = sp.getAllTasksStatuses()
>   File "/usr/share/vdsm/storage/securable.py", line 73, in wrapper
>     raise SecureError("Secured object is not in safe state")
> SecureError: Secured object is not in safe state
> 
> The attached files are vdsm.log and engine.log. The error occurred at 2015-06-23
> 13:38 and 2015-06-23 15:38 in vdsm.log.
> (By the way, on the other host the error occurred at 2015-06-23 12:38 and
> 2015-06-23 14:38.)
> 
> And in engine.log I found the following error message:
> 2015-06-23 13:38:37,933 ERROR
> [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> (DefaultQuartzScheduler_Worker-34) [781beaea] IrsBroker::Failed::UpdateVMVDS
> due to: XmlRpcException: :'utf8' codec
> can't decode byte 0xeb in position 749: invalid continuation byte
> 
> 2015-06-23 13:38:37,971 ERROR [org.ovirt.engine.core.bll.OvfDataUpdater]
> (DefaultQuartzScheduler_Worker-34) [781beaea] Exception while trying to
> update or remove VMs/Templates ovf in Data Center Default.:
> org.ovirt.engine.core.common.errors.VdcBLLException: VdcBLLException:
> java.lang.reflect.UndeclaredThrowableException (Failed with error
> VDS_NETWORK_ERROR and code 5022)
>     at org.ovirt.engine.core.bll.VdsHandler.handleVdsResult(VdsHandler.java:116) [bll.jar:]
>     at org.ovirt.engine.core.bll.VDSBrokerFrontendImpl.RunVdsCommand(VDSBrokerFrontendImpl.java:33) [bll.jar:]
>     at org.ovirt.engine.core.bll.OvfDataUpdater.executeUpdateVmInSpmCommand(OvfDataUpdater.java:383) [bll.jar:]
>     at org.ovirt.engine.core.bll.OvfDataUpdater.performOvfUpdate(OvfDataUpdater.java:163) [bll.jar:]
>     at org.ovirt.engine.core.bll.OvfDataUpdater.updateOvfForVmsOfStoragePool(OvfDataUpdater.java:135) [bll.jar:]
>     at org.ovirt.engine.core.bll.OvfDataUpdater.ovfUpdate_timer(OvfDataUpdater.java:93) [bll.jar:]
>     at sun.reflect.GeneratedMethodAccessor250.invoke(Unknown Source) [:1.7.0_65]
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.7.0_65]
>     at java.lang.reflect.Method.invoke(Method.java:606) [rt.jar:1.7.0_65]
>     at org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:60) [scheduler.jar:]
>     at org.quartz.core.JobRunShell.run(JobRunShell.java:213) [quartz.jar:]
>     at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:557) [quartz.jar:]
> 
> Can anyone help?

Can you please share the result of the following on both hosts?

> echo $LANG

thanks 
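
For readers unfamiliar with the "'utf8' codec can't decode byte 0xeb ... invalid continuation byte" message quoted in this thread: it means some data was decoded as UTF-8 although it was not valid UTF-8. The Python sketch below reproduces that class of error with made-up payloads. It does not establish where the bad byte came from in this environment, and the try_decode_utf8 helper is hypothetical.

```python
def try_decode_utf8(payload: bytes):
    """Return (text, None) on success or (None, description) on failure."""
    try:
        return payload.decode("utf-8"), None
    except UnicodeDecodeError as exc:
        bad_byte = payload[exc.start]
        return None, f"byte 0x{bad_byte:02x} at position {exc.start}: {exc.reason}"


# Hypothetical payloads, not taken from the logs in this thread.
valid = "vm-disk-1".encode("utf-8")
# 0xeb announces a three-byte UTF-8 sequence, but "A" (0x41) is not a
# continuation byte, so decoding fails at the 0xeb.
broken = b"vm-\xebA"

print(try_decode_utf8(valid))
print(try_decode_utf8(broken))
```

A byte like this typically appears when text written in another encoding ends up in data that a later stage treats as UTF-8, which is why the locale of both hosts is a reasonable first thing to check.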


> Thanks
> 
> like...@cs2c.com.cn
> 


[ovirt-users] SPM host and snapshot deletion

2014-12-22 Thread Demeter Tibor
Hi,
I have oVirt 3.5 with GlusterFS and three nodes, running CentOS 6.5 and GlusterFS
3.5.2.
When I delete a snapshot of a stopped VM, the SPM host uses up all of its
virtual memory, and this kills all of the VMs running on the SPM
host.
It is a very big problem because I need to delete a lot of snapshots.
In this case I need to power off these VMs because there are no other
options for stopping them.
I have tried live migration, but in this case live migration does not
work.

Is it a known bug?

Thanks in advance. 
Tibor 


Re: [ovirt-users] SPM in oVirt 3.6

2014-08-05 Thread Sven Kieske
I'm not 100% sure what you mean by "forum",
but there is the oVirt bug tracker
with all reported and solved issues:

https://bugzilla.redhat.com/buglist.cgi?list_id=2714543&product=oVirt

HTH

On 05.08.2014 16:22, Chandrahasa S wrote:
> Dear All,
> 
> Why do we not have an oVirt forum like CentOS has, where I can see all reported /
> solved issues?

-- 
Mit freundlichen Grüßen / Regards

Sven Kieske

Systemadministrator
Mittwald CM Service GmbH & Co. KG
Königsberger Straße 6
32339 Espelkamp
T: +49-5772-293-100
F: +49-5772-293-333
https://www.mittwald.de
Geschäftsführer: Robert Meyer
St.Nr.: 331/5721/1033, USt-IdNr.: DE814773217, HRA 6640, AG Bad Oeynhausen
Komplementärin: Robert Meyer Verwaltungs GmbH, HRB 13260, AG Bad Oeynhausen


Re: [ovirt-users] SPM in oVirt 3.6

2014-08-05 Thread Chandrahasa S
Dear All,

Why do we not have an oVirt forum like CentOS has, where I can see all reported /
solved issues?

Regards,
Chandrahasa S
Tata Consultancy Services
Data Center- ( Non STPI)
2nd Pokharan Road,
Subash Nagar ,
Mumbai - 400601,Maharashtra
India
Ph:- +91 22 677-81825
Buzz:- 4221825
Mailto: chandrahas...@tcs.com
Website: http://www.tcs.com








Re: [ovirt-users] SPM in oVirt 3.6

2014-08-05 Thread Federico Simoncelli
- Original Message -
> From: "Nir Soffer" 
> To: "Daniel Helgenberger" 
> Cc: users@ovirt.org, "Federico Simoncelli" 
> Sent: Monday, July 28, 2014 6:43:30 PM
> Subject: Re: [ovirt-users] SPM in oVirt 3.6
> 
> - Original Message -
> > From: "Daniel Helgenberger" 
> > To: users@ovirt.org
> > Sent: Friday, July 25, 2014 7:51:33 PM
> > Subject: [ovirt-users] SPM in oVirt 3.6
> > 
> > just out of pure curiosity: In a BZ [1] Allon mentions SPM will go away
> > in ovirt 3.6.
> > 
> > This seems like a major change for me. I assume this will replace
> > sanlock as well? What will SPM be replaced with?
> 
> No, sanlock is not going anywhere.
> 
> The change is that we will not have an SPM node; instead, any node that needs to
> make metadata changes will take a lock using sanlock while it makes the
> changes.
> 
> Federico: can you describe in more detail how it is going to work?

Most of the information can be found on the feature page:

http://www.ovirt.org/Features/Decommission_Master_Domain_and_SPM

-- 
Federico


Re: [ovirt-users] SPM in oVirt 3.6

2014-07-28 Thread Nir Soffer
- Original Message -
> From: "Daniel Helgenberger" 
> To: users@ovirt.org
> Sent: Friday, July 25, 2014 7:51:33 PM
> Subject: [ovirt-users] SPM in oVirt 3.6
> 
> just out of pure curiosity: In a BZ [1] Allon mentions SPM will go away
> in ovirt 3.6.
> 
> This seems like a major change for me. I assume this will replace
> sanlock as well? What will SPM be replaced with?

No, sanlock is not going anywhere.

The change is that we will not have an SPM node; instead, any node that needs to
make metadata changes will take a lock using sanlock while it makes the
changes.

Federico: can you describe in more detail how it is going to work?

Nir


[ovirt-users] SPM in oVirt 3.6

2014-07-25 Thread Daniel Helgenberger
Hello,

just out of pure curiosity: In a BZ [1] Allon mentions SPM will go away
in ovirt 3.6.

This seems like a major change for me. I assume this will replace
sanlock as well? What will SPM be replaced with?

Thanks,
Daniel 

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1116558#c9

-- 

Daniel Helgenberger 
m box bewegtbild GmbH 

P: +49/30/2408781-22
F: +49/30/2408781-10

ACKERSTR. 19 
D-10115 BERLIN 


www.m-box.de  www.monkeymen.tv 

Geschäftsführer: Martin Retschitzegger / Michaela Göllner
Handeslregister: Amtsgericht Charlottenburg / HRB 112767 




Re: [ovirt-users] SPM error

2014-04-16 Thread Maurice James
I'm running CentOS 6.5, kernel 2.6.32-431.11.2.el6.x86_64

VDSM:
vdsm-python-4.14.6-15.git746e2e9.el6.x86_64
vdsm-hook-isolatedprivatevlan-4.14.6-15.git746e2e9.el6.noarch
vdsm-hook-vmfex-4.14.6-15.git746e2e9.el6.noarch
vdsm-hook-sriov-4.14.6-15.git746e2e9.el6.noarch
vdsm-hook-checkimages-4.14.6-15.git746e2e9.el6.noarch
vdsm-hook-smbios-4.14.6-15.git746e2e9.el6.noarch
vdsm-hook-hostusb-4.14.6-15.git746e2e9.el6.noarch
vdsm-4.14.6-15.git746e2e9.el6.x86_64
vdsm-hook-faqemu-4.14.6-15.git746e2e9.el6.noarch
vdsm-hook-promisc-4.14.6-15.git746e2e9.el6.noarch
vdsm-hook-scratchpad-4.14.6-15.git746e2e9.el6.noarch
vdsm-hook-qos-4.14.6-15.git746e2e9.el6.noarch
vdsm-hook-numa-4.14.6-15.git746e2e9.el6.noarch
vdsm-python-zombiereaper-4.14.6-15.git746e2e9.el6.noarch
vdsm-hook-qemucmdline-4.14.6-15.git746e2e9.el6.noarch
vdsm-cli-4.14.6-15.git746e2e9.el6.noarch
vdsm-hook-fileinject-4.14.6-15.git746e2e9.el6.noarch
vdsm-hook-directlun-4.14.6-15.git746e2e9.el6.noarch
vdsm-hook-vmdisk-4.14.6-15.git746e2e9.el6.noarch
vdsm-hook-macspoof-4.14.6-15.git746e2e9.el6.noarch
vdsm-gluster-4.14.6-15.git746e2e9.el6.noarch
vdsm-hook-floppy-4.14.6-15.git746e2e9.el6.noarch
vdsm-hook-hugepages-4.14.6-15.git746e2e9.el6.noarch
vdsm-xmlrpc-4.14.6-15.git746e2e9.el6.noarch
vdsm-hook-pincpu-4.14.6-15.git746e2e9.el6.noarch
vdsm-hook-openstacknet-4.14.6-15.git746e2e9.el6.noarch



Libvirt:
libvirt-lock-sanlock-0.10.2-29.el6_5.7.x86_64
libvirt-client-0.10.2-29.el6_5.7.x86_64
libvirt-0.10.2-29.el6_5.7.x86_64
libvirt-python-0.10.2-29.el6_5.7.x86_64

Ovirt-engine:
ovirt-engine-dwh-setup-3.4.1-0.0.master.20140406181125.git4081b13.el6.noarch
ovirt-engine-dwh-3.4.1-0.0.master.20140406181125.git4081b13.el6.noarch
ovirt-engine-websocket-proxy-3.4.1-0.0.master.20140413010852.git43746c6.el6.noarch
ovirt-engine-setup-plugin-ovirt-engine-3.4.1-0.0.master.20140413010852.git43746c6.el6.noarch
ovirt-engine-dbscripts-3.4.1-0.0.master.20140413010852.git43746c6.el6.noarch
ovirt-engine-sdk-python-3.4.0.7-1.20140228.git19e14c5.el6.noarch
ovirt-engine-reports-3.4.1-0.0.master.20140410124141.gitbf81400.el6.noarch
ovirt-engine-setup-base-3.4.1-0.0.master.20140413010852.git43746c6.el6.noarch
ovirt-engine-setup-3.4.1-0.0.master.20140413010852.git43746c6.el6.noarch
ovirt-engine-backend-3.4.1-0.0.master.20140413010852.git43746c6.el6.noarch
ovirt-engine-tools-3.4.1-0.0.master.20140413010852.git43746c6.el6.noarch
ovirt-engine-reports-setup-3.4.1-0.0.master.20140410124141.gitbf81400.el6.noarch
ovirt-engine-setup-plugin-websocket-proxy-3.4.1-0.0.master.20140413010852.git43746c6.el6.noarch
ovirt-engine-setup-plugin-allinone-3.4.1-0.0.master.20140413010852.git43746c6.el6.noarch
ovirt-engine-webadmin-portal-3.4.1-0.0.master.20140413010852.git43746c6.el6.noarch
ovirt-engine-3.4.1-0.0.master.20140413010852.git43746c6.el6.noarch
ovirt-engine-cli-3.4.0.6-1.20140227.gite87e2bc.el6.noarch
ovirt-engine-setup-plugin-ovirt-engine-common-3.4.1-0.0.master.20140413010852.git43746c6.el6.noarch
ovirt-engine-userportal-3.4.1-0.0.master.20140413010852.git43746c6.el6.noarch
ovirt-engine-lib-3.4.1-0.0.master.20140413010852.git43746c6.el6.noarch
ovirt-engine-restapi-3.4.1-0.0.master.20140413010852.git43746c6.el6.noarch
 

Qemu:
qemu-kvm-tools-0.12.1.2-2.415.el6_5.7.x86_64
qemu-kvm-0.12.1.2-2.415.el6_5.7.x86_64
vdsm-hook-faqemu-4.14.6-15.git746e2e9.el6.noarch
vdsm-hook-qemucmdline-4.14.6-15.git746e2e9.el6.noarch
qemu-img-0.12.1.2-2.415.el6_5.7.x86_64
qemu-guest-agent-0.12.1.2-2.415.el6_5.7.x86_64
gpxe-roms-qemu-0.9.7-6.10.el6.noarch

- Original Message -
From: "Liron Aravot" 
To: "Maurice James" , "fsimonce" 
Cc: users@ovirt.org
Sent: Wednesday, April 16, 2014 10:30:11 AM
Subject: Re: [ovirt-users] SPM error



- Original Message -
> From: "Maurice James" 
> To: "Liron Aravot" 
> Cc: users@ovirt.org
> Sent: Wednesday, April 16, 2014 4:49:18 PM
> Subject: Re: [ovirt-users] SPM error
> 
> After a few sleepless nights. I went through my entire system again and I
> found an interface on one of my hosts that had already been removed from the
> UI. Even after multiple "service network restart" it would still show up
> when I ran "ip addr". I had to end up forcefully removing it with rm -rf
> /etc/sysconfig/network-scripts/ifcfg-. After that I rebooted the
> node and the SPM came out of contention. I cant make sense of it but it
> worked

ok, so we might have a bug here - what os are you running? 

as it seems the initial issue SPM issue is as the bug i provided earlier,
seems like the bug you opened on that issue can be closed as a duplicate, 
adding federico to verify that there's no further sanlock issue there.
> 
> - Original Message -
> From: "Liron Aravot" 
> To: "Maurice \"Moe\" James" 
> Cc: users@ovirt.org
> Sent: Wednesday, April 16, 2014 8:49:05 AM
> Subject: Re: [ovirt-users] SPM error
> 
> Hi Maurice,
> any u

Re: [ovirt-users] SPM error

2014-04-16 Thread Liron Aravot


- Original Message -
> From: "Maurice James" 
> To: "Liron Aravot" 
> Cc: users@ovirt.org
> Sent: Wednesday, April 16, 2014 4:49:18 PM
> Subject: Re: [ovirt-users] SPM error
> 
> After a few sleepless nights. I went through my entire system again and I
> found an interface on one of my hosts that had already been removed from the
> UI. Even after multiple "service network restart" it would still show up
> when I ran "ip addr". I had to end up forcefully removing it with rm -rf
> /etc/sysconfig/network-scripts/ifcfg-. After that I rebooted the
> node and the SPM came out of contention. I cant make sense of it but it
> worked

OK, so we might have a bug here. What OS are you running?

The initial SPM issue seems to be the same as the bug I provided earlier, so
the bug you opened for it can be closed as a duplicate. Adding Federico to
verify that there's no further sanlock issue there.
> 
> - Original Message -
> From: "Liron Aravot" 
> To: "Maurice \"Moe\" James" 
> Cc: users@ovirt.org
> Sent: Wednesday, April 16, 2014 8:49:05 AM
> Subject: Re: [ovirt-users] SPM error
> 
> Hi Maurice,
> any updates on the above?
> 
> thanks, Liron
> 
> - Original Message -----
> > From: "Liron Aravot" 
> > To: "Maurice \"Moe\" James" 
> > Cc: users@ovirt.org
> > Sent: Tuesday, April 15, 2014 11:53:40 AM
> > Subject: Re: [ovirt-users] SPM error
> > 
> > 
> > 
> > - Original Message -
> > > From: "Maurice \"Moe\" James" 
> > > To: "Liron Aravot" 
> > > Cc: "Itamar Heim" , users@ovirt.org
> > > Sent: Tuesday, April 15, 2014 3:14:16 AM
> > > Subject: Re: [ovirt-users] SPM error
> > > 
> > > Sorry forgot to paste
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1086951
> > 
> > Hi Maurice,
> > the issue is that the host doesn't have access to all the storage domains
> > which causes to the spm start process to fail.
> > There's a bug open for that issue -
> > https://bugzilla.redhat.com/show_bug.cgi?id=1072900 so seems we'll be able
> > to close the one you opened as a duplicate but let's wait with that till
> > your issue is solved.
> > From looking in the logs, it seems like that host have problem accessing
> > two
> > storage domains -
> > 3406665e-4adc-4fd4-aa1e-037547b29adb
> > f3b51811-4a7f-43af-8633-322b3db23c48
> > 
> > Can you verify that the host can access those domains? from the log it
> > seems
> > like the nfs paths for those are:
> > shtistg01.suprtekstic.com:/storage/infrastructure
> > shtistg01.suprtekstic.com:/storage/exports
> > 
> > 
> > log snippet:
> > 1.
> > Thread-14::DEBUG::2014-04-11
> > 22:54:44,331::mount::226::Storage.Misc.excCmd::(_runcmd) '/usr/bin/sudo -n
> > /bin/mount -t nfs -o soft,nosharecache,timeo=600,retra
> > ns=6,nfsvers=3 ashtistg01.suprtekstic.com:/storage/exports
> > /rhev/data-center/mnt/ashtistg01.suprtekstic.com:_storage_exports' (cwd
> > None)
> > Thread-14::ERROR::2014-04-11
> > 22:55:36,659::storageServer::209::StorageServer.MountConnection::(connect)
> > Mount failed: (32, ';mount.nfs: Failed to resolve serv
> > er ashtistg01.suprtekstic.com: Name or service not known\n')
> > Traceback (most recent call last):
> >   File "/usr/share/vdsm/storage/storageServer.py", line 207, in connect
> > self._mount.mount(self.options, self._vfsType)
> >   File "/usr/share/vdsm/storage/mount.py", line 222, in mount
> > return self._runcmd(cmd, timeout)
> >   File "/usr/share/vdsm/storage/mount.py", line 238, in _runcmd
> > raise MountError(rc, ";".join((out, err)))
> > MountError: (32, ';mount.nfs: Failed to resolve server
> > ashtistg01.suprtekstic.com: Name or service not known\n')
> > Thread-14::ERROR::2014-04-11
> > 22:55:36,705::hsm::2379::Storage.HSM::(connectStorageServer) Could not
> > connect to storageServer
> > Traceback (most recent call last):
> >   File "/usr/share/vdsm/storage/hsm.py", line 2376, in connectStorageServer
> > conObj.connect()
> >   File "/usr/share/vdsm/storage/storageServer.py", line 320, in connect
> > return self._mountCon.connect()
> >   File "/usr/share/vdsm/storage/storageServer.py", line 215, in connect
> > raise e
> > MountError: (32, ';mount.nfs: Failed to resolve server
> > ashtistg01.suprtekst

Re: [ovirt-users] SPM error

2014-04-16 Thread Maurice James
After a few sleepless nights, I went through my entire system again and found
an interface on one of my hosts that had already been removed from the UI. Even
after multiple "service network restart" runs it would still show up when I ran
"ip addr". I ended up forcefully removing it with rm -rf
/etc/sysconfig/network-scripts/ifcfg-. After that I rebooted the
node and the SPM came out of contention. I can't make sense of it, but it worked.
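
For anyone hitting the same thing, a rough way to spot a leftover like this is
to compare the ifcfg-* files on disk with the interfaces the host is supposed
to have. This is only a sketch: find_stale_ifcfg and the "expected" set are
hypothetical names of mine, not part of oVirt or VDSM, and the interface names
in the demo are made up.

```python
import os
import tempfile


def find_stale_ifcfg(scripts_dir, expected):
    """Return interface names that have an ifcfg-<name> file in scripts_dir
    but are not in the expected set (hypothetical helper, not an oVirt API)."""
    stale = []
    for entry in sorted(os.listdir(scripts_dir)):
        if not entry.startswith("ifcfg-"):
            continue
        name = entry[len("ifcfg-"):]
        # ifcfg-lo is the loopback config, so it is never reported
        if name and name != "lo" and name not in expected:
            stale.append(name)
    return stale


if __name__ == "__main__":
    # Demo against a throwaway directory with made-up interface names.
    with tempfile.TemporaryDirectory() as d:
        for n in ("ifcfg-lo", "ifcfg-eth0", "ifcfg-ovirtmgmt", "ifcfg-eth1.100"):
            open(os.path.join(d, n), "w").close()
        print(find_stale_ifcfg(d, {"eth0", "ovirtmgmt"}))
```

On a real host you would point it at /etc/sysconfig/network-scripts and pass
the interfaces the UI still shows; anything it returns is worth a closer look
before deleting.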

- Original Message -
From: "Liron Aravot" 
To: "Maurice \"Moe\" James" 
Cc: users@ovirt.org
Sent: Wednesday, April 16, 2014 8:49:05 AM
Subject: Re: [ovirt-users] SPM error

Hi Maurice,
any updates on the above?

thanks, Liron

- Original Message -
> From: "Liron Aravot" 
> To: "Maurice \"Moe\" James" 
> Cc: users@ovirt.org
> Sent: Tuesday, April 15, 2014 11:53:40 AM
> Subject: Re: [ovirt-users] SPM error
> 
> 
> 
> - Original Message -
> > From: "Maurice \"Moe\" James" 
> > To: "Liron Aravot" 
> > Cc: "Itamar Heim" , users@ovirt.org
> > Sent: Tuesday, April 15, 2014 3:14:16 AM
> > Subject: Re: [ovirt-users] SPM error
> > 
> > Sorry forgot to paste
> > https://bugzilla.redhat.com/show_bug.cgi?id=1086951
> 
> Hi Maurice,
> the issue is that the host doesn't have access to all the storage domains
> which causes to the spm start process to fail.
> There's a bug open for that issue -
> https://bugzilla.redhat.com/show_bug.cgi?id=1072900 so seems we'll be able
> to close the one you opened as a duplicate but let's wait with that till
> your issue is solved.
> From looking in the logs, it seems like that host have problem accessing two
> storage domains -
> 3406665e-4adc-4fd4-aa1e-037547b29adb
> f3b51811-4a7f-43af-8633-322b3db23c48
> 
> Can you verify that the host can access those domains? from the log it seems
> like the nfs paths for those are:
> shtistg01.suprtekstic.com:/storage/infrastructure
> shtistg01.suprtekstic.com:/storage/exports
> 
> 
> log snippet:
> 1.
> Thread-14::DEBUG::2014-04-11
> 22:54:44,331::mount::226::Storage.Misc.excCmd::(_runcmd) '/usr/bin/sudo -n
> /bin/mount -t nfs -o soft,nosharecache,timeo=600,retra
> ns=6,nfsvers=3 ashtistg01.suprtekstic.com:/storage/exports
> /rhev/data-center/mnt/ashtistg01.suprtekstic.com:_storage_exports' (cwd
> None)
> Thread-14::ERROR::2014-04-11
> 22:55:36,659::storageServer::209::StorageServer.MountConnection::(connect)
> Mount failed: (32, ';mount.nfs: Failed to resolve serv
> er ashtistg01.suprtekstic.com: Name or service not known\n')
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/storageServer.py", line 207, in connect
> self._mount.mount(self.options, self._vfsType)
>   File "/usr/share/vdsm/storage/mount.py", line 222, in mount
> return self._runcmd(cmd, timeout)
>   File "/usr/share/vdsm/storage/mount.py", line 238, in _runcmd
> raise MountError(rc, ";".join((out, err)))
> MountError: (32, ';mount.nfs: Failed to resolve server
> ashtistg01.suprtekstic.com: Name or service not known\n')
> Thread-14::ERROR::2014-04-11
> 22:55:36,705::hsm::2379::Storage.HSM::(connectStorageServer) Could not
> connect to storageServer
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/hsm.py", line 2376, in connectStorageServer
> conObj.connect()
>   File "/usr/share/vdsm/storage/storageServer.py", line 320, in connect
> return self._mountCon.connect()
>   File "/usr/share/vdsm/storage/storageServer.py", line 215, in connect
> raise e
> MountError: (32, ';mount.nfs: Failed to resolve server
> ashtistg01.suprtekstic.com: Name or service not known\n')
> 
> 
> 
> 
> 
> 2.
> Thread-14::ERROR::2014-04-11
> 22:56:29,307::storageServer::209::StorageServer.MountConnection::(connect)
> Mount failed: (32, ';mount.nfs: Failed to resolve serv
> er ashtistg01.suprtekstic.com: Name or service not known\n')
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/storageServer.py", line 207, in connect
> self._mount.mount(self.options, self._vfsType)
>   File "/usr/share/vdsm/storage/mount.py", line 222, in mount
> return self._runcmd(cmd, timeout)
>   File "/usr/share/vdsm/storage/mount.py", line 238, in _runcmd
> raise MountError(rc, ";".join((out, err)))
> MountError: (32, ';mount.nfs: Failed to resolve server
> ashtistg01.suprtekstic.com: Name or service not known\n')
> Thread-14::ERROR::2014-04-11
> 22:56:29,309::hsm::2379::Stor

Re: [ovirt-users] SPM error

2014-04-16 Thread Liron Aravot
Hi Maurice,
any updates on the above?

Thanks, Liron

- Original Message -
> From: "Liron Aravot" 
> To: "Maurice \"Moe\" James" 
> Cc: users@ovirt.org
> Sent: Tuesday, April 15, 2014 11:53:40 AM
> Subject: Re: [ovirt-users] SPM error
> 
> 
> 
> - Original Message -
> > From: "Maurice \"Moe\" James" 
> > To: "Liron Aravot" 
> > Cc: "Itamar Heim" , users@ovirt.org
> > Sent: Tuesday, April 15, 2014 3:14:16 AM
> > Subject: Re: [ovirt-users] SPM error
> > 
> > Sorry forgot to paste
> > https://bugzilla.redhat.com/show_bug.cgi?id=1086951
> 
> Hi Maurice,
> the issue is that the host doesn't have access to all the storage domains
> which causes to the spm start process to fail.
> There's a bug open for that issue -
> https://bugzilla.redhat.com/show_bug.cgi?id=1072900 so seems we'll be able
> to close the one you opened as a duplicate but let's wait with that till
> your issue is solved.
> From looking in the logs, it seems like that host have problem accessing two
> storage domains -
> 3406665e-4adc-4fd4-aa1e-037547b29adb
> f3b51811-4a7f-43af-8633-322b3db23c48
> 
> Can you verify that the host can access those domains? from the log it seems
> like the nfs paths for those are:
> shtistg01.suprtekstic.com:/storage/infrastructure
> shtistg01.suprtekstic.com:/storage/exports
> 
> 
> log snippet:
> 1.
> Thread-14::DEBUG::2014-04-11
> 22:54:44,331::mount::226::Storage.Misc.excCmd::(_runcmd) '/usr/bin/sudo -n
> /bin/mount -t nfs -o soft,nosharecache,timeo=600,retra
> ns=6,nfsvers=3 ashtistg01.suprtekstic.com:/storage/exports
> /rhev/data-center/mnt/ashtistg01.suprtekstic.com:_storage_exports' (cwd
> None)
> Thread-14::ERROR::2014-04-11
> 22:55:36,659::storageServer::209::StorageServer.MountConnection::(connect)
> Mount failed: (32, ';mount.nfs: Failed to resolve serv
> er ashtistg01.suprtekstic.com: Name or service not known\n')
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/storageServer.py", line 207, in connect
> self._mount.mount(self.options, self._vfsType)
>   File "/usr/share/vdsm/storage/mount.py", line 222, in mount
> return self._runcmd(cmd, timeout)
>   File "/usr/share/vdsm/storage/mount.py", line 238, in _runcmd
> raise MountError(rc, ";".join((out, err)))
> MountError: (32, ';mount.nfs: Failed to resolve server
> ashtistg01.suprtekstic.com: Name or service not known\n')
> Thread-14::ERROR::2014-04-11
> 22:55:36,705::hsm::2379::Storage.HSM::(connectStorageServer) Could not
> connect to storageServer
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/hsm.py", line 2376, in connectStorageServer
> conObj.connect()
>   File "/usr/share/vdsm/storage/storageServer.py", line 320, in connect
> return self._mountCon.connect()
>   File "/usr/share/vdsm/storage/storageServer.py", line 215, in connect
> raise e
> MountError: (32, ';mount.nfs: Failed to resolve server
> ashtistg01.suprtekstic.com: Name or service not known\n')
> 
> 
> 
> 
> 
> 2.
> Thread-14::ERROR::2014-04-11
> 22:56:29,307::storageServer::209::StorageServer.MountConnection::(connect)
> Mount failed: (32, ';mount.nfs: Failed to resolve serv
> er ashtistg01.suprtekstic.com: Name or service not known\n')
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/storageServer.py", line 207, in connect
> self._mount.mount(self.options, self._vfsType)
>   File "/usr/share/vdsm/storage/mount.py", line 222, in mount
> return self._runcmd(cmd, timeout)
>   File "/usr/share/vdsm/storage/mount.py", line 238, in _runcmd
> raise MountError(rc, ";".join((out, err)))
> MountError: (32, ';mount.nfs: Failed to resolve server
> ashtistg01.suprtekstic.com: Name or service not known\n')
> Thread-14::ERROR::2014-04-11
> 22:56:29,309::hsm::2379::Storage.HSM::(connectStorageServer) Could not
> connect to storageServer
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/hsm.py", line 2376, in connectStorageServer
> conObj.connect()
>   File "/usr/share/vdsm/storage/storageServer.py", line 320, in connect
> return self._mountCon.connect()
>   File "/usr/share/vdsm/storage/storageServer.py", line 215, in connect
> raise e
> MountError: (32, ';mount.nfs: Failed to resolve server
> ashtistg01.suprtekstic.com: Name or service not known\n')
> 
> 
> Regardless of that, there are sanlock errors over the log when try

Re: [ovirt-users] SPM error

2014-04-15 Thread Liron Aravot


- Original Message -
> From: "Maurice \"Moe\" James" 
> To: "Liron Aravot" 
> Cc: "Itamar Heim" , users@ovirt.org
> Sent: Tuesday, April 15, 2014 3:14:16 AM
> Subject: Re: [ovirt-users] SPM error
> 
> Sorry forgot to paste
> https://bugzilla.redhat.com/show_bug.cgi?id=1086951

Hi Maurice,
the issue is that the host doesn't have access to all the storage domains,
which causes the SPM start process to fail.
There's a bug open for that issue -
https://bugzilla.redhat.com/show_bug.cgi?id=1072900 - so it seems we'll be able
to close the one you opened as a duplicate, but let's wait with that until your
issue is solved.
From looking at the logs, it seems that the host has a problem accessing two
storage domains:
3406665e-4adc-4fd4-aa1e-037547b29adb
f3b51811-4a7f-43af-8633-322b3db23c48

Can you verify that the host can access those domains? From the log it seems
that the NFS paths for those are:
ashtistg01.suprtekstic.com:/storage/infrastructure
ashtistg01.suprtekstic.com:/storage/exports


log snippet:
1.
Thread-14::DEBUG::2014-04-11 22:54:44,331::mount::226::Storage.Misc.excCmd::(_runcmd) '/usr/bin/sudo -n /bin/mount -t nfs -o soft,nosharecache,timeo=600,retrans=6,nfsvers=3 ashtistg01.suprtekstic.com:/storage/exports /rhev/data-center/mnt/ashtistg01.suprtekstic.com:_storage_exports' (cwd None)
Thread-14::ERROR::2014-04-11 22:55:36,659::storageServer::209::StorageServer.MountConnection::(connect) Mount failed: (32, ';mount.nfs: Failed to resolve server ashtistg01.suprtekstic.com: Name or service not known\n')
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/storageServer.py", line 207, in connect
    self._mount.mount(self.options, self._vfsType)
  File "/usr/share/vdsm/storage/mount.py", line 222, in mount
    return self._runcmd(cmd, timeout)
  File "/usr/share/vdsm/storage/mount.py", line 238, in _runcmd
    raise MountError(rc, ";".join((out, err)))
MountError: (32, ';mount.nfs: Failed to resolve server ashtistg01.suprtekstic.com: Name or service not known\n')
Thread-14::ERROR::2014-04-11 22:55:36,705::hsm::2379::Storage.HSM::(connectStorageServer) Could not connect to storageServer
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 2376, in connectStorageServer
    conObj.connect()
  File "/usr/share/vdsm/storage/storageServer.py", line 320, in connect
    return self._mountCon.connect()
  File "/usr/share/vdsm/storage/storageServer.py", line 215, in connect
    raise e
MountError: (32, ';mount.nfs: Failed to resolve server ashtistg01.suprtekstic.com: Name or service not known\n')





2.
Thread-14::ERROR::2014-04-11 22:56:29,307::storageServer::209::StorageServer.MountConnection::(connect) Mount failed: (32, ';mount.nfs: Failed to resolve server ashtistg01.suprtekstic.com: Name or service not known\n')
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/storageServer.py", line 207, in connect
    self._mount.mount(self.options, self._vfsType)
  File "/usr/share/vdsm/storage/mount.py", line 222, in mount
    return self._runcmd(cmd, timeout)
  File "/usr/share/vdsm/storage/mount.py", line 238, in _runcmd
    raise MountError(rc, ";".join((out, err)))
MountError: (32, ';mount.nfs: Failed to resolve server ashtistg01.suprtekstic.com: Name or service not known\n')
Thread-14::ERROR::2014-04-11 22:56:29,309::hsm::2379::Storage.HSM::(connectStorageServer) Could not connect to storageServer
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 2376, in connectStorageServer
    conObj.connect()
  File "/usr/share/vdsm/storage/storageServer.py", line 320, in connect
    return self._mountCon.connect()
  File "/usr/share/vdsm/storage/storageServer.py", line 215, in connect
    raise e
MountError: (32, ';mount.nfs: Failed to resolve server ashtistg01.suprtekstic.com: Name or service not known\n')
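
If you want to pull failures like these out of a long vdsm.log yourself,
something along these lines might help. It is a minimal sketch that assumes
each log record sits on a single line (mail wrapping aside) and that the
mount.nfs message looks exactly like the one quoted above; unresolved_servers
is a name I made up for illustration, not a VDSM function.

```python
import re
from collections import Counter

# Pattern taken from the mount.nfs message quoted in the snippet above.
_RESOLVE_RE = re.compile(r"Failed to resolve server\s+(\S+?):")


def unresolved_servers(lines):
    """Count ERROR records per server name that mount.nfs failed to resolve.

    Only lines carrying the ::ERROR:: level marker are counted, so the
    MountError lines of the tracebacks are not counted a second time.
    """
    counts = Counter()
    for line in lines:
        if "::ERROR::" not in line:
            continue
        match = _RESOLVE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        r"Thread-14::ERROR::2014-04-11 22:55:36,659::storageServer::209::"
        r"StorageServer.MountConnection::(connect) Mount failed: (32, "
        r"';mount.nfs: Failed to resolve server ashtistg01.suprtekstic.com: "
        r"Name or service not known\n')",
        r"MountError: (32, ';mount.nfs: Failed to resolve server "
        r"ashtistg01.suprtekstic.com: Name or service not known\n')",
    ]
    print(unresolved_servers(sample))
```

Since the function accepts any iterable of lines, an open file object for the
host's vdsm.log can be passed in directly.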


Regardless of that, there are sanlock errors in the log when trying to
acquire the host ID.
I'd start by checking the connectivity issues to the storage domains above;
later on you can check that sanlock is running and operational and/or
attach the sanlock logs.
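
Since the mount error in the log is a name-resolution failure, a quick first
check is whether the host can resolve the NFS server name at all. The sketch
below only uses Python's standard socket module; split_nfs_path and
can_resolve are hypothetical helper names, and a successful lookup says
nothing yet about whether the export itself is reachable.

```python
import socket


def split_nfs_path(path):
    """Split 'server:/export' into (server, export)."""
    server, sep, export = path.partition(":")
    if not sep or not server or not export.startswith("/"):
        raise ValueError("not a server:/export path: %r" % path)
    return server, export


def can_resolve(server):
    """Return True if the local resolver can turn the name into an address."""
    try:
        socket.getaddrinfo(server, None)
    except socket.gaierror:
        return False
    return True
```

Run on the affected host, you would pass the server part of each NFS path
(as it appears in the mount command in the log) to can_resolve; a False there
points at DNS or /etc/hosts rather than at the storage itself.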


> 
> On Mon, 2014-04-14 at 17:11 -0400, Liron Aravot wrote:
> > 
> > - Original Message -
> > > From: "Maurice \"Moe\" James" 
> > > To: "Itamar Heim" 
> > > Cc: users@ovirt.org
> > > Sent: Sunday, April 13, 2014 2:28:45 AM
> > > Subject: Re: [ovirt-users] SPM error
> > > 
> > > Were you able to find out anything? Is there anything that I can check
> > > in the meanwhile?
> > > 
> 

Re: [ovirt-users] SPM error

2014-04-14 Thread Maurice "Moe" James
Sorry, forgot to paste:
https://bugzilla.redhat.com/show_bug.cgi?id=1086951 

On Mon, 2014-04-14 at 17:11 -0400, Liron Aravot wrote:
> 
> - Original Message -
> > From: "Maurice \"Moe\" James" 
> > To: "Itamar Heim" 
> > Cc: users@ovirt.org
> > Sent: Sunday, April 13, 2014 2:28:45 AM
> > Subject: Re: [ovirt-users] SPM error
> > 
> > Were you able to find out anything? Is there anything that I can check
> > in the meanwhile?
> > 
> 
> Hi Muarice,
> can you please attach the ovirt engine/vdsm logs?
> thanks,
> Liron
> > 
> > On Sat, 2014-04-12 at 19:23 +0300, Itamar Heim wrote:
> > > On 04/12/2014 03:40 PM, Maurice James wrote:
> > > > What did you do to try to fix the sanlock? Anything is better than
> > > > nothing at this point
> > > >
> > > > - Original Message -
> > > > From: "Ted Miller" 
> > > > To: "Maurice James" 
> > > > Sent: Friday, April 11, 2014 7:27:24 PM
> > > > Subject: Re: [ovirt-users] SPM error
> > > >
> > > > I did receive some help on one stage of rebuilding my sanlock, but there
> > > > were
> > > > too many other things wrong to get it started again. Only advice I have
> > > > is --
> > > > look at your sanlock logs, and see if you can find anything there that 
> > > > is
> > > > helpful.
> > > >
> > > > On 4/11/2014 7:23 PM, Maurice James wrote:
> > > >> Nooo.
> > > >>
> > > >>
> > > >> Sent from my Galaxy S®III
> > > >>
> > > >>  Original message 
> > > >> From: Ted Miller 
> > > >> Date:04/11/2014  7:08 PM  (GMT-05:00)
> > > >> To: Maurice James 
> > > >> Subject: Re: [ovirt-users] SPM error
> > > >>
> > > >>
> > > >>
> > > >> I didn't, really.  I did something wrong along the way, and ended up
> > > >> having
> > > >> to rebuild the engine and hosts.  (My problems were due to a glusterfs
> > > >> split-brain.)
> > > >> Ted Miller
> > > >>
> > > >> On 4/11/2014 6:03 PM, Maurice James wrote:
> > > >>> How did you fix it?
> > > >>>
> > > >>>
> > > >>> Sent from my Galaxy S®III
> > > >>>
> > > >>>  Original message 
> > > >>> From: Ted Miller 
> > > >>> Date:04/11/2014  6:00 PM  (GMT-05:00)
> > > >>> To: users@ovirt.org
> > > >>> Subject: Re: [ovirt-users] SPM error
> > > >>>
> > > >>>
> > > >>>
> > > >>> On 4/11/2014 2:05 PM, Maurice James wrote:
> > > >>>> I have an error trying to bring the master DC back online. After
> > > >>>> several
> > > >>>> reboots, no luck. I took the other cluster members offline to try to
> > > >>>> troubleshoot. The remaining host is constantly in contention with
> > > >>>> itself
> > > >>>> for SPM
> > > >>>>
> > > >>>>
> > > >>>> ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> > > >>>> (DefaultQuartzScheduler_Worker-40) [38d400ea]
> > > >>>> IrsBroker::Failed::GetStoragePoolInfoVDS due to:
> > > >>>> IrsSpmStartFailedException: IRSGenericException: IRSErrorException:
> > > >>>> SpmStart failed
> > > >>>>
> > > >>> I'm no expert, but the last time I beat my head on that rock, 
> > > >>> something
> > > >>> was
> > > >>> wrong with my sanlock storage.  YMMV
> > > >>> Ted Miller
> > > >>> Elkhart, IN, USA
> > > >>>
> > > >
> > > 
> > > Maurice - which type of storage is this?
> > 
> > 
> > ___
> > Users mailing list
> > Users@ovirt.org
> > http://lists.ovirt.org/mailman/listinfo/users
> > 




Re: [ovirt-users] SPM error

2014-04-14 Thread Maurice "Moe" James
Log files are here

On Mon, 2014-04-14 at 17:11 -0400, Liron Aravot wrote:
> 
> - Original Message -
> > From: "Maurice \"Moe\" James" 
> > To: "Itamar Heim" 
> > Cc: users@ovirt.org
> > Sent: Sunday, April 13, 2014 2:28:45 AM
> > Subject: Re: [ovirt-users] SPM error
> > 
> > Were you able to find out anything? Is there anything that I can check
> > in the meanwhile?
> > 
> 
> Hi Muarice,
> can you please attach the ovirt engine/vdsm logs?
> thanks,
> Liron
> > 
> > On Sat, 2014-04-12 at 19:23 +0300, Itamar Heim wrote:
> > > On 04/12/2014 03:40 PM, Maurice James wrote:
> > > > What did you do to try to fix the sanlock? Anything is better than
> > > > nothing at this point
> > > >
> > > > - Original Message -
> > > > From: "Ted Miller" 
> > > > To: "Maurice James" 
> > > > Sent: Friday, April 11, 2014 7:27:24 PM
> > > > Subject: Re: [ovirt-users] SPM error
> > > >
> > > > I did receive some help on one stage of rebuilding my sanlock, but there
> > > > were
> > > > too many other things wrong to get it started again. Only advice I have
> > > > is --
> > > > look at your sanlock logs, and see if you can find anything there that 
> > > > is
> > > > helpful.
> > > >
> > > > On 4/11/2014 7:23 PM, Maurice James wrote:
> > > >> Nooo.
> > > >>
> > > >>
> > > >> Sent from my Galaxy S®III
> > > >>
> > > >>  Original message 
> > > >> From: Ted Miller 
> > > >> Date:04/11/2014  7:08 PM  (GMT-05:00)
> > > >> To: Maurice James 
> > > >> Subject: Re: [ovirt-users] SPM error
> > > >>
> > > >>
> > > >>
> > > >> I didn't, really.  I did something wrong along the way, and ended up
> > > >> having
> > > >> to rebuild the engine and hosts.  (My problems were due to a glusterfs
> > > >> split-brain.)
> > > >> Ted Miller
> > > >>
> > > >> On 4/11/2014 6:03 PM, Maurice James wrote:
> > > >>> How did you fix it?
> > > >>>
> > > >>>
> > > >>> Sent from my Galaxy S®III
> > > >>>
> > > >>>  Original message 
> > > >>> From: Ted Miller 
> > > >>> Date:04/11/2014  6:00 PM  (GMT-05:00)
> > > >>> To: users@ovirt.org
> > > >>> Subject: Re: [ovirt-users] SPM error
> > > >>>
> > > >>>
> > > >>>
> > > >>> On 4/11/2014 2:05 PM, Maurice James wrote:
> > > >>>> I have an error trying to bring the master DC back online. After
> > > >>>> several
> > > >>>> reboots, no luck. I took the other cluster members offline to try to
> > > >>>> troubleshoot. The remaining host is constantly in contention with
> > > >>>> itself
> > > >>>> for SPM
> > > >>>>
> > > >>>>
> > > >>>> ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> > > >>>> (DefaultQuartzScheduler_Worker-40) [38d400ea]
> > > >>>> IrsBroker::Failed::GetStoragePoolInfoVDS due to:
> > > >>>> IrsSpmStartFailedException: IRSGenericException: IRSErrorException:
> > > >>>> SpmStart failed
> > > >>>>
> > > >>> I'm no expert, but the last time I beat my head on that rock, 
> > > >>> something
> > > >>> was
> > > >>> wrong with my sanlock storage.  YMMV
> > > >>> Ted Miller
> > > >>> Elkhart, IN, USA
> > > >>>
> > > >
> > > 
> > > Maurice - which type of storage is this?
> > 
> > 
> > ___
> > Users mailing list
> > Users@ovirt.org
> > http://lists.ovirt.org/mailman/listinfo/users
> > 




Re: [ovirt-users] SPM error

2014-04-14 Thread Liron Aravot


- Original Message -
> From: "Maurice \"Moe\" James" 
> To: "Itamar Heim" 
> Cc: users@ovirt.org
> Sent: Sunday, April 13, 2014 2:28:45 AM
> Subject: Re: [ovirt-users] SPM error
> 
> Were you able to find out anything? Is there anything that I can check
> in the meanwhile?
> 

Hi Maurice,
can you please attach the oVirt engine and VDSM logs?
Thanks,
Liron
> 
> On Sat, 2014-04-12 at 19:23 +0300, Itamar Heim wrote:
> > On 04/12/2014 03:40 PM, Maurice James wrote:
> > > What did you do to try to fix the sanlock? Anything is better than
> > > nothing at this point
> > >
> > > - Original Message -
> > > From: "Ted Miller" 
> > > To: "Maurice James" 
> > > Sent: Friday, April 11, 2014 7:27:24 PM
> > > Subject: Re: [ovirt-users] SPM error
> > >
> > > I did receive some help on one stage of rebuilding my sanlock, but there
> > > were
> > > too many other things wrong to get it started again. Only advice I have
> > > is --
> > > look at your sanlock logs, and see if you can find anything there that is
> > > helpful.
> > >
> > > On 4/11/2014 7:23 PM, Maurice James wrote:
> > >> Nooo.
> > >>
> > >>
> > >> Sent from my Galaxy S®III
> > >>
> > >>  Original message 
> > >> From: Ted Miller 
> > >> Date:04/11/2014  7:08 PM  (GMT-05:00)
> > >> To: Maurice James 
> > >> Subject: Re: [ovirt-users] SPM error
> > >>
> > >>
> > >>
> > >> I didn't, really.  I did something wrong along the way, and ended up
> > >> having
> > >> to rebuild the engine and hosts.  (My problems were due to a glusterfs
> > >> split-brain.)
> > >> Ted Miller
> > >>
> > >> On 4/11/2014 6:03 PM, Maurice James wrote:
> > >>> How did you fix it?
> > >>>
> > >>>
> > >>> Sent from my Galaxy S®III
> > >>>
> > >>>  Original message 
> > >>> From: Ted Miller 
> > >>> Date:04/11/2014  6:00 PM  (GMT-05:00)
> > >>> To: users@ovirt.org
> > >>> Subject: Re: [ovirt-users] SPM error
> > >>>
> > >>>
> > >>>
> > >>> On 4/11/2014 2:05 PM, Maurice James wrote:
> > >>>> I have an error trying to bring the master DC back online. After
> > >>>> several
> > >>>> reboots, no luck. I took the other cluster members offline to try to
> > >>>> troubleshoot. The remaining host is constantly in contention with
> > >>>> itself
> > >>>> for SPM
> > >>>>
> > >>>>
> > >>>> ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> > >>>> (DefaultQuartzScheduler_Worker-40) [38d400ea]
> > >>>> IrsBroker::Failed::GetStoragePoolInfoVDS due to:
> > >>>> IrsSpmStartFailedException: IRSGenericException: IRSErrorException:
> > >>>> SpmStart failed
> > >>>>
> > >>> I'm no expert, but the last time I beat my head on that rock, something
> > >>> was
> > >>> wrong with my sanlock storage.  YMMV
> > >>> Ted Miller
> > >>> Elkhart, IN, USA
> > >>>
> > >
> > 
> > Maurice - which type of storage is this?
> 
> 
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
> 


Re: [ovirt-users] SPM error

2014-04-14 Thread Ted Miller



On 4/12/2014 12:23 PM, Itamar Heim wrote:

On 04/12/2014 03:40 PM, Maurice James wrote:
What did you do to try to fix the sanlock? Anything is better than nothing 
at this point

My thread is at http://lists.ovirt.org/pipermail/users/2014-January/020394.html



--
"He is no fool who gives what he cannot keep, to gain what he cannot lose." - - 
Jim Elliot
For more information about Jim Elliot and his unusual life, see 
http://www.christianliteratureandliving.com/march2003/carolyn.html.

Ted Miller
Design Engineer
HCJB Global Technology Center, a ministry of Reach Beyond
2830 South 17th St
Elkhart, IN  46517
574--970-4272 my desk
574--970-4252 receptionist



Re: [ovirt-users] SPM error

2014-04-12 Thread Maurice "Moe" James
Were you able to find out anything? Is there anything that I can check
in the meantime?


On Sat, 2014-04-12 at 19:23 +0300, Itamar Heim wrote:
> On 04/12/2014 03:40 PM, Maurice James wrote:
> > What did you do to try to fix the sanlock? Anything is better than nothing 
> > at this point
> >
> > - Original Message -
> > From: "Ted Miller" 
> > To: "Maurice James" 
> > Sent: Friday, April 11, 2014 7:27:24 PM
> > Subject: Re: [ovirt-users] SPM error
> >
> > I did receive some help on one stage of rebuilding my sanlock, but there 
> > were
> > too many other things wrong to get it started again. Only advice I have is 
> > --
> > look at your sanlock logs, and see if you can find anything there that is
> > helpful.
> >
> > On 4/11/2014 7:23 PM, Maurice James wrote:
> >> Nooo.
> >>
> >>
> >> Sent from my Galaxy S®III
> >>
> >>  Original message 
> >> From: Ted Miller 
> >> Date:04/11/2014  7:08 PM  (GMT-05:00)
> >> To: Maurice James 
> >> Subject: Re: [ovirt-users] SPM error
> >>
> >>
> >>
> >> I didn't, really.  I did something wrong along the way, and ended up having
> >> to rebuild the engine and hosts.  (My problems were due to a glusterfs
> >> split-brain.)
> >> Ted Miller
> >>
> >> On 4/11/2014 6:03 PM, Maurice James wrote:
> >>> How did you fix it?
> >>>
> >>>
> >>> Sent from my Galaxy S®III
> >>>
> >>>  Original message 
> >>> From: Ted Miller 
> >>> Date:04/11/2014  6:00 PM  (GMT-05:00)
> >>> To: users@ovirt.org
> >>> Subject: Re: [ovirt-users] SPM error
> >>>
> >>>
> >>>
> >>> On 4/11/2014 2:05 PM, Maurice James wrote:
> >>>> I have an error trying to bring the master DC back online. After several
> >>>> reboots, no luck. I took the other cluster members offline to try to
> >>>> troubleshoot. The remaining host is constantly in contention with itself
> >>>> for SPM
> >>>>
> >>>>
> >>>> ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> >>>> (DefaultQuartzScheduler_Worker-40) [38d400ea]
> >>>> IrsBroker::Failed::GetStoragePoolInfoVDS due to:
> >>>> IrsSpmStartFailedException: IRSGenericException: IRSErrorException:
> >>>> SpmStart failed
> >>>>
> >>> I'm no expert, but the last time I beat my head on that rock, something 
> >>> was
> >>> wrong with my sanlock storage.  YMMV
> >>> Ted Miller
> >>> Elkhart, IN, USA
> >>>
> >
> 
> Maurice - which type of storage is this?




Re: [ovirt-users] SPM error

2014-04-12 Thread Maurice "Moe" James
I uploaded logs to this bug report
https://bugzilla.redhat.com/show_bug.cgi?id=1086951 

On Sat, 2014-04-12 at 19:23 +0300, Itamar Heim wrote:
> On 04/12/2014 03:40 PM, Maurice James wrote:
> > What did you do to try to fix the sanlock? Anything is better than nothing 
> > at this point
> >
> > - Original Message -
> > From: "Ted Miller" 
> > To: "Maurice James" 
> > Sent: Friday, April 11, 2014 7:27:24 PM
> > Subject: Re: [ovirt-users] SPM error
> >
> > I did receive some help on one stage of rebuilding my sanlock, but there 
> > were
> > too many other things wrong to get it started again. Only advice I have is 
> > --
> > look at your sanlock logs, and see if you can find anything there that is
> > helpful.
> >
> > On 4/11/2014 7:23 PM, Maurice James wrote:
> >> Nooo.
> >>
> >>
> >> Sent from my Galaxy S®III
> >>
> >>  Original message 
> >> From: Ted Miller 
> >> Date:04/11/2014  7:08 PM  (GMT-05:00)
> >> To: Maurice James 
> >> Subject: Re: [ovirt-users] SPM error
> >>
> >>
> >>
> >> I didn't, really.  I did something wrong along the way, and ended up having
> >> to rebuild the engine and hosts.  (My problems were due to a glusterfs
> >> split-brain.)
> >> Ted Miller
> >>
> >> On 4/11/2014 6:03 PM, Maurice James wrote:
> >>> How did you fix it?
> >>>
> >>>
> >>> Sent from my Galaxy S®III
> >>>
> >>>  Original message 
> >>> From: Ted Miller 
> >>> Date:04/11/2014  6:00 PM  (GMT-05:00)
> >>> To: users@ovirt.org
> >>> Subject: Re: [ovirt-users] SPM error
> >>>
> >>>
> >>>
> >>> On 4/11/2014 2:05 PM, Maurice James wrote:
> >>>> I have an error trying to bring the master DC back online. After several
> >>>> reboots, no luck. I took the other cluster members offline to try to
> >>>> troubleshoot. The remaining host is constantly in contention with itself
> >>>> for SPM
> >>>>
> >>>>
> >>>> ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> >>>> (DefaultQuartzScheduler_Worker-40) [38d400ea]
> >>>> IrsBroker::Failed::GetStoragePoolInfoVDS due to:
> >>>> IrsSpmStartFailedException: IRSGenericException: IRSErrorException:
> >>>> SpmStart failed
> >>>>
> >>> I'm no expert, but the last time I beat my head on that rock, something 
> >>> was
> >>> wrong with my sanlock storage.  YMMV
> >>> Ted Miller
> >>> Elkhart, IN, USA
> >>>
> >
> 
> Maurice - which type of storage is this?




Re: [ovirt-users] SPM error

2014-04-12 Thread Maurice "Moe" James
It is NFS storage.

On Sat, 2014-04-12 at 19:23 +0300, Itamar Heim wrote:
> On 04/12/2014 03:40 PM, Maurice James wrote:
> > What did you do to try to fix the sanlock? Anything is better than nothing 
> > at this point
> >
> > - Original Message -
> > From: "Ted Miller" 
> > To: "Maurice James" 
> > Sent: Friday, April 11, 2014 7:27:24 PM
> > Subject: Re: [ovirt-users] SPM error
> >
> > I did receive some help on one stage of rebuilding my sanlock, but there 
> > were
> > too many other things wrong to get it started again. Only advice I have is 
> > --
> > look at your sanlock logs, and see if you can find anything there that is
> > helpful.
> >
> > On 4/11/2014 7:23 PM, Maurice James wrote:
> >> Nooo.
> >>
> >>
> >> Sent from my Galaxy S®III
> >>
> >>  Original message 
> >> From: Ted Miller 
> >> Date:04/11/2014  7:08 PM  (GMT-05:00)
> >> To: Maurice James 
> >> Subject: Re: [ovirt-users] SPM error
> >>
> >>
> >>
> >> I didn't, really.  I did something wrong along the way, and ended up having
> >> to rebuild the engine and hosts.  (My problems were due to a glusterfs
> >> split-brain.)
> >> Ted Miller
> >>
> >> On 4/11/2014 6:03 PM, Maurice James wrote:
> >>> How did you fix it?
> >>>
> >>>
> >>> Sent from my Galaxy S®III
> >>>
> >>>  Original message 
> >>> From: Ted Miller 
> >>> Date:04/11/2014  6:00 PM  (GMT-05:00)
> >>> To: users@ovirt.org
> >>> Subject: Re: [ovirt-users] SPM error
> >>>
> >>>
> >>>
> >>> On 4/11/2014 2:05 PM, Maurice James wrote:
> >>>> I have an error trying to bring the master DC back online. After several
> >>>> reboots, no luck. I took the other cluster members offline to try to
> >>>> troubleshoot. The remaining host is constantly in contention with itself
> >>>> for SPM
> >>>>
> >>>>
> >>>> ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
> >>>> (DefaultQuartzScheduler_Worker-40) [38d400ea]
> >>>> IrsBroker::Failed::GetStoragePoolInfoVDS due to:
> >>>> IrsSpmStartFailedException: IRSGenericException: IRSErrorException:
> >>>> SpmStart failed
> >>>>
> >>> I'm no expert, but the last time I beat my head on that rock, something 
> >>> was
> >>> wrong with my sanlock storage.  YMMV
> >>> Ted Miller
> >>> Elkhart, IN, USA
> >>>
> >
> 
> Maurice - which type of storage is this?




Re: [ovirt-users] SPM error

2014-04-12 Thread Itamar Heim

On 04/12/2014 03:40 PM, Maurice James wrote:
> What did you do to try to fix the sanlock? Anything is better than nothing at
> this point

Maurice - which type of storage is this?


Re: [ovirt-users] SPM error

2014-04-12 Thread Maurice James
What did you do to try to fix the sanlock? Anything is better than nothing at 
this point

- Original Message -
From: "Ted Miller" 
To: "Maurice James" 
Sent: Friday, April 11, 2014 7:27:24 PM
Subject: Re: [ovirt-users] SPM error

I did receive some help on one stage of rebuilding my sanlock, but there were 
too many other things wrong to get it started again. Only advice I have is -- 
look at your sanlock logs, and see if you can find anything there that is 
helpful.

On 4/11/2014 7:23 PM, Maurice James wrote:
> Nooo.
>
>
> Sent from my Galaxy S®III
>
>  Original message 
> From: Ted Miller 
> Date:04/11/2014  7:08 PM  (GMT-05:00)
> To: Maurice James 
> Subject: Re: [ovirt-users] SPM error
>
>
>
> I didn't, really.  I did something wrong along the way, and ended up having
> to rebuild the engine and hosts.  (My problems were due to a glusterfs
> split-brain.)
> Ted Miller
>
> On 4/11/2014 6:03 PM, Maurice James wrote:
>> How did you fix it?
>>
>>
>> Sent from my Galaxy S®III
>>
>>  Original message
>> From: Ted Miller 
>> Date:04/11/2014  6:00 PM  (GMT-05:00)
>> To: users@ovirt.org
>> Subject: Re: [ovirt-users] SPM error
>>
>>
>>
>> On 4/11/2014 2:05 PM, Maurice James wrote:
>>> I have an error trying to bring the master DC back online. After several
>>> reboots, no luck. I took the other cluster members offline to try to
>>> troubleshoot. The remaining host is constantly in contention with itself
>>> for SPM
>>>
>>>
>>> ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
>>> (DefaultQuartzScheduler_Worker-40) [38d400ea]
>>> IrsBroker::Failed::GetStoragePoolInfoVDS due to:
>>> IrsSpmStartFailedException: IRSGenericException: IRSErrorException:
>>> SpmStart failed
>>>
>> I'm no expert, but the last time I beat my head on that rock, something was
>> wrong with my sanlock storage.  YMMV
>> Ted Miller
>> Elkhart, IN, USA
>>

-- 
"He is no fool who gives what he cannot keep, to gain what he cannot lose." - - 
Jim Elliot
For more information about Jim Elliot and his unusual life, see 
http://www.christianliteratureandliving.com/march2003/carolyn.html.

Ted Miller
Design Engineer
HCJB Global Technology Center, a ministry of Reach Beyond
2830 South 17th St
Elkhart, IN  46517
574--970-4272 my desk
574--970-4252 receptionist



Re: [ovirt-users] SPM error

2014-04-11 Thread Maurice James
Nooo. 


Sent from my Galaxy S®III

 Original message 
From: Ted Miller  
Date:04/11/2014  7:08 PM  (GMT-05:00) 
To: Maurice James  
Subject: Re: [ovirt-users] SPM error 



I didn't, really.  I did something wrong along the way, and ended up having 
to rebuild the engine and hosts.  (My problems were due to a glusterfs 
split-brain.)
Ted Miller

On 4/11/2014 6:03 PM, Maurice James wrote:
> How did you fix it?
>
>
> Sent from my Galaxy S®III
>
>  Original message 
> From: Ted Miller 
> Date:04/11/2014  6:00 PM  (GMT-05:00)
> To: users@ovirt.org
> Subject: Re: [ovirt-users] SPM error
>
>
>
> On 4/11/2014 2:05 PM, Maurice James wrote:
>> I have an error trying to bring the master DC back online. After several
>> reboots, no luck. I took the other cluster members offline to try to
>> troubleshoot. The remaining host is constantly in contention with itself
>> for SPM
>>
>>
>> ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
>> (DefaultQuartzScheduler_Worker-40) [38d400ea]
>> IrsBroker::Failed::GetStoragePoolInfoVDS due to:
>> IrsSpmStartFailedException: IRSGenericException: IRSErrorException:
>> SpmStart failed
>>
> I'm no expert, but the last time I beat my head on that rock, something was
> wrong with my sanlock storage.  YMMV
> Ted Miller
> Elkhart, IN, USA
>

-- 
"He is no fool who gives what he cannot keep, to gain what he cannot lose." - - 
Jim Elliot
For more information about Jim Elliot and his unusual life, see 
http://www.christianliteratureandliving.com/march2003/carolyn.html.

Ted Miller
Design Engineer
HCJB Global Technology Center, a ministry of Reach Beyond
2830 South 17th St
Elkhart, IN  46517
574--970-4272 my desk
574--970-4252 receptionist



Re: [ovirt-users] SPM error

2014-04-11 Thread Maurice James
How did you fix it?


Sent from my Galaxy S®III

 Original message 
From: Ted Miller  
Date:04/11/2014  6:00 PM  (GMT-05:00) 
To: users@ovirt.org 
Subject: Re: [ovirt-users] SPM error 



On 4/11/2014 2:05 PM, Maurice James wrote:
> I have an error trying to bring the master DC back online. After several 
> reboots, no luck. I took the other cluster members offline to try to 
> troubleshoot. The remaining host is constantly in contention with itself 
> for SPM
>
>
> ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] 
> (DefaultQuartzScheduler_Worker-40) [38d400ea] 
> IrsBroker::Failed::GetStoragePoolInfoVDS due to: 
> IrsSpmStartFailedException: IRSGenericException: IRSErrorException: 
> SpmStart failed
>
I'm no expert, but the last time I beat my head on that rock, something was 
wrong with my sanlock storage.  YMMV
Ted Miller
Elkhart, IN, USA



Re: [ovirt-users] SPM error

2014-04-11 Thread Ted Miller

On 4/11/2014 2:05 PM, Maurice James wrote:
I have an error trying to bring the master DC back online. After several 
reboots, no luck. I took the other cluster members offline to try to 
troubleshoot. The remaining host is constantly in contention with itself 
for SPM



ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] 
(DefaultQuartzScheduler_Worker-40) [38d400ea] 
IrsBroker::Failed::GetStoragePoolInfoVDS due to: 
IrsSpmStartFailedException: IRSGenericException: IRSErrorException: 
SpmStart failed


I'm no expert, but the last time I beat my head on that rock, something was 
wrong with my sanlock storage.  YMMV

Ted Miller
Elkhart, IN, USA
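
For anyone who wants to follow up on the sanlock angle, here is a minimal Python sketch that pulls suspicious lines out of a sanlock log. It assumes the log is at /var/log/sanlock.log, and the keyword list is only a guess at what is worth looking at; both are assumptions, not anything confirmed in this thread.

```python
from pathlib import Path

# Assumed keywords; adjust them to whatever your sanlock version actually logs.
KEYWORDS = ("error", "fail", "timeout", "timed out")


def find_suspect_lines(lines, keywords=KEYWORDS):
    """Return (line_number, text) pairs for lines that mention any keyword."""
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        if any(keyword in lowered for keyword in keywords):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_file(path="/var/log/sanlock.log"):
    """Scan a log file; the default path is an assumption about the host."""
    with Path(path).open(errors="replace") as handle:
        return find_suspect_lines(handle)
```

Run scan_file() on the host that keeps contending for SPM and print the pairs it returns. The matching is case-insensitive and deliberately broad, so expect some noise.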



Re: [ovirt-users] SPM error

2014-04-11 Thread Maurice James
Anyone? 

- Original Message -

From: "Maurice James"  
To: users@ovirt.org 
Sent: Friday, April 11, 2014 2:05:17 PM 
Subject: [ovirt-users] SPM error 

I have an error trying to bring the master DC back online. After several 
reboots, no luck. I took the other cluster members offline to try to 
troubleshoot. The remaining host is constantly in contention with itself for 
SPM 


ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] 
(DefaultQuartzScheduler_Worker-40) [38d400ea] 
IrsBroker::Failed::GetStoragePoolInfoVDS due to: IrsSpmStartFailedException: 
IRSGenericException: IRSErrorException: SpmStart failed 








[ovirt-users] SPM error

2014-04-11 Thread Maurice James
I get an error when trying to bring the master DC back online, and several
reboots have not helped. I took the other cluster members offline to
troubleshoot, and the remaining host is now constantly in contention with
itself for SPM:


ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] 
(DefaultQuartzScheduler_Worker-40) [38d400ea] 
IrsBroker::Failed::GetStoragePoolInfoVDS due to: IrsSpmStartFailedException: 
IRSGenericException: IRSErrorException: SpmStart failed 
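
The "due to:" part of the entry above is a chain of nested exceptions, with the innermost cause last. Below is a minimal Python sketch for splitting such an entry into its parts. The function name and the regular expression are illustrative assumptions based only on the layout of the line pasted here, not on any documented engine.log format.

```python
import re

# Assumed layout: LEVEL [logger] (thread) [correlation-id] message
ENTRY = re.compile(
    r"^(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]+)\)\s+"
    r"\[(?P<correlation>[^\]]*)\]\s+"
    r"(?P<message>.*)$"
)


def parse_entry(text):
    """Split one (possibly mail-wrapped) log entry into named fields."""
    # Mail clients wrap long lines, so collapse all whitespace first.
    flat = " ".join(text.split())
    match = ENTRY.match(flat)
    if match is None:
        return None
    fields = match.groupdict()
    operation, sep, cause = fields["message"].partition(" due to: ")
    fields["operation"] = operation
    # The cause is a colon-separated chain, outermost exception first.
    fields["chain"] = (
        [part.strip() for part in cause.split(":") if part.strip()] if sep else []
    )
    return fields


SAMPLE = """ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand]
(DefaultQuartzScheduler_Worker-40) [38d400ea]
IrsBroker::Failed::GetStoragePoolInfoVDS due to: IrsSpmStartFailedException:
IRSGenericException: IRSErrorException: SpmStart failed"""

if __name__ == "__main__":
    entry = parse_entry(SAMPLE)
    print(entry["operation"])
    print(" -> ".join(entry["chain"]))
```

The last element of the chain ("SpmStart failed") only says that the host could not become SPM. The actual reason usually has to be looked up on the host side, for example in the vdsm and sanlock logs.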




