What is the status of your Datacenter? Are both hosts operational?
Are you experiencing any storage problems other than the
inconsistent task state? Do you also see the KeyError: 'VERSION' message
for domain b6730d64-2cf8-42a3-8f08-24b8cc2c0cd8 on Node02?
Did you experience any disaster (power outage, FC storage outage, network,
etc.) around the time this started happening?
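
If it helps with the Node02 question, here is a minimal Python sketch (not
an official vdsm tool) that scans a vdsm.log for storage monitor setup
failures and for the KeyError: 'VERSION' message. The function names are
made up for illustration, and the pattern is based only on the ERROR line
format quoted in this thread, so treat it as an assumption about the log
layout rather than a guaranteed format.

```python
import re

# Pattern taken from the "[storage.Monitor] Setting up monitor for <uuid> failed"
# line quoted in this thread (an assumption about the log layout).
MONITOR_FAILED = re.compile(r"Setting up monitor for ([0-9a-f-]{36}) failed")


def failed_monitor_domains(log_text):
    """Return {domain_uuid: number_of_failures} found in log_text."""
    # Collapse whitespace so lines wrapped by a mail client still match.
    flat = " ".join(log_text.split())
    counts = {}
    for uuid in MONITOR_FAILED.findall(flat):
        counts[uuid] = counts.get(uuid, 0) + 1
    return counts


def has_version_keyerror(log_text):
    """True if the KeyError: 'VERSION' message appears anywhere in log_text."""
    return "KeyError: 'VERSION'" in log_text


def scan_file(path):
    """Read a log file and return (failed_domains, version_keyerror_seen)."""
    with open(path) as f:
        text = f.read()
    return failed_monitor_domains(text), has_version_keyerror(text)
```

Running scan_file("/var/log/vdsm/vdsm.log") on each node and comparing the
results would show whether both hosts fail on the same domain.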

On Wed, Oct 11, 2017 at 9:21 AM, yayo (j) <jag...@gmail.com> wrote:

> Hi all,
>
> ovirt 4.1 hosted engine on 2 node cluster and FC LUN Storage
>
> I'm trying to clear some tasks that have been pending for months using
> vdsClient, but nothing works. Below are the steps (on node 1, the SPM):
>
> 1. Show all tasks:
>
> # vdsClient -s 0 getAllTasksInfo
>
> fd319af4-d160-48ce-b682-5a908333a5e1 :
>      verb = createVolume
>      id = fd319af4-d160-48ce-b682-5a908333a5e1
> 9bbc2bc4-3c73-4814-a785-6ea737904528 :
>      verb = prepareMerge
>      id = 9bbc2bc4-3c73-4814-a785-6ea737904528
> e70feb21-964d-49d9-9b5a-8e3f70a92db1 :
>      verb = prepareMerge
>      id = e70feb21-964d-49d9-9b5a-8e3f70a92db1
> cf064461-f0ab-4e44-a68f-b2d58fa83a21 :
>      verb = prepareMerge
>      id = cf064461-f0ab-4e44-a68f-b2d58fa83a21
> 85b7cf4e-d658-4785-94f0-391fe9616b41 :
>      verb = prepareMerge
>      id = 85b7cf4e-d658-4785-94f0-391fe9616b41
> 7416627a-fe50-4353-b129-e01bba066a66 :
>      verb = prepareMerge
>      id = 7416627a-fe50-4353-b129-e01bba066a66
>
>
> 2. Stop all tasks (repeated for every task):
>
> # vdsClient -s 0 stopTask 7416627a-fe50-4353-b129-e01bba066a66
> Task is aborted: u'7416627a-fe50-4353-b129-e01bba066a66' - code 411
>
> 3. Trying to clear tasks:
>
> # vdsClient -s 0 clearTask 7416627a-fe50-4353-b129-e01bba066a66
> Operation is not allowed in this task state: ("can't clean in state
> running",)
>
>
>
> On Node 01 (the SPM) I have multiple errors
> in /var/log/vdsm/vdsm.log like this:
>
> 2017-10-11 15:09:53,719+0200 INFO  (jsonrpc/3) [storage.TaskManager.Task]
> (Task='9519d4db-2960-4b88-82f2-e4c1094eac54') aborting: Task is aborted:
> u'Operation is not allowed in this task state: ("can\'t clean in state
> running",)' - code 100 (task:1175)
> 2017-10-11 15:09:53,719+0200 ERROR (jsonrpc/3) [storage.Dispatcher]
> FINISH clearTask error=Operation is not allowed in this task state: ("can't
> clean in state running",) (dispatcher:78)
> 2017-10-11 15:09:53,720+0200 INFO  (jsonrpc/3) [jsonrpc.JsonRpcServer]
> RPC call Task.clear failed (error 410) in 0.01 seconds (__init__:539)
> 2017-10-11 15:09:53,743+0200 INFO  (jsonrpc/6) [vdsm.api] START
> clearTask(taskID=u'7416627a-fe50-4353-b129-e01bba066a66', spUUID=None,
> options=None) from=::ffff:192.168.0.226,36724, flow_id=7cd340ec (api:46)
> 2017-10-11 15:09:53,743+0200 INFO  (jsonrpc/6) [vdsm.api] FINISH
> clearTask error=Operation is not allowed in this task state: ("can't clean
> in state running",) from=::ffff:192.168.0.226,36724, flow_id=7cd340ec
> (api:50)
> 2017-10-11 15:09:53,743+0200 ERROR (jsonrpc/6) [storage.TaskManager.Task]
> (Task='0e12e052-2aca-480d-b50f-5de01ddebe35') Unexpected error (task:870)
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/task.py", line 877, in _run
>     return fn(*args, **kargs)
>   File "<string>", line 2, in clearTask
>   File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in
> method
>     ret = func(*args, **kwargs)
>   File "/usr/share/vdsm/storage/hsm.py", line 2258, in clearTask
>     return self.taskMng.clearTask(taskID=taskID)
>   File "/usr/share/vdsm/storage/taskManager.py", line 175, in clearTask
>     t.clean()
>   File "/usr/share/vdsm/storage/task.py", line 1047, in clean
>     raise se.TaskStateError("can't clean in state %s" % self.state)
> TaskStateError: Operation is not allowed in this task state: ("can't
> clean in state running",)
>
>
> On Node 02 (this is a 2-node cluster) I have other errors (I don't know
> if they are related):
>
> 2017-10-11 15:11:57,083+0200 INFO  (jsonrpc/7) [storage.LVM] Refreshing
> lvs: vg=b50c1f5c-aa2c-4a53-9f89-83517fa70d3b lvs=['leases'] (lvm:1291)
> 2017-10-11 15:11:57,084+0200 INFO  (jsonrpc/7) [storage.LVM] Refreshing
> LVs (vg=b50c1f5c-aa2c-4a53-9f89-83517fa70d3b, lvs=['leases']) (lvm:1319)
> 2017-10-11 15:11:57,124+0200 INFO  (jsonrpc/7) [storage.VolumeManifest]
> b50c1f5c-aa2c-4a53-9f89-83517fa70d3b/d42f671e-1745-46c1-9e1c-2833245675fc/c86afaa5-6ca8-4fcb-a27e-ffbe0133fe23
> info is {'status': 'OK', 'domain': 'b50c1f5c-aa2c-4a53-9f89-83517fa70d3b',
> 'voltype': 'LEAF', 'description': 'hosted-engine.metadata', 'parent':
> '00000000-0000-0000-0000-000000000000', 'format': 'RAW', 'generation': 0,
> 'image': 'd42f671e-1745-46c1-9e1c-2833245675fc', 'ctime': '1499437345',
> 'disktype': '2', 'legality': 'LEGAL', 'mtime': '0', 'apparentsize':
> '134217728', 'children': [], 'pool': '', 'capacity': '134217728', 'uuid':
> u'c86afaa5-6ca8-4fcb-a27e-ffbe0133fe23', 'truesize': '134217728', 'type':
> 'PREALLOCATED', 'lease': {'owners': [], 'version': None}} (volume:272)
> 2017-10-11 15:11:57,125+0200 INFO  (jsonrpc/7) [vdsm.api] FINISH
> getVolumeInfo return={'info': {'status': 'OK', 'domain':
> 'b50c1f5c-aa2c-4a53-9f89-83517fa70d3b', 'voltype': 'LEAF', 'description':
> 'hosted-engine.metadata', 'parent': '00000000-0000-0000-0000-000000000000',
> 'format': 'RAW', 'generation': 0, 'image':
> 'd42f671e-1745-46c1-9e1c-2833245675fc', 'ctime': '1499437345', 'disktype':
> '2', 'legality': 'LEGAL', 'mtime': '0', 'apparentsize': '134217728',
> 'children': [], 'pool': '', 'capacity': '134217728', 'uuid':
> u'c86afaa5-6ca8-4fcb-a27e-ffbe0133fe23', 'truesize': '134217728', 'type':
> 'PREALLOCATED', 'lease': {'owners': [], 'version': None}}} from=::1,56906
> (api:52)
> 2017-10-11 15:11:57,126+0200 INFO  (jsonrpc/7) [jsonrpc.JsonRpcServer]
> RPC call Volume.getInfo succeeded in 0.05 seconds (__init__:539)
> 2017-10-11 15:11:57,758+0200 INFO  (Reactor thread)
> [ProtocolDetector.AcceptorImpl] Accepted connection from ::1:56908
> (protocoldetector:72)
> 2017-10-11 15:11:57,764+0200 INFO  (Reactor thread)
> [ProtocolDetector.Detector] Detected protocol stomp from ::1:56908
> (protocoldetector:127)
> 2017-10-11 15:11:57,765+0200 INFO  (Reactor thread) [Broker.StompAdapter]
> Processing CONNECT request (stompreactor:103)
> 2017-10-11 15:11:57,765+0200 INFO  (JsonRpc (StompReactor))
> [Broker.StompAdapter] Subscribe command received (stompreactor:130)
> 2017-10-11 15:11:57,930+0200 INFO  (jsonrpc/0) [jsonrpc.JsonRpcServer]
> RPC call Host.getHardwareInfo succeeded in 0.01 seconds (__init__:539)
> 2017-10-11 15:11:57,933+0200 INFO  (jsonrpc/1) [vdsm.api] START
> repoStats(options=None) from=::1,56908 (api:46)
> 2017-10-11 15:11:57,933+0200 INFO  (jsonrpc/1) [vdsm.api] FINISH
> repoStats return={u'b50c1f5c-aa2c-4a53-9f89-83517fa70d3b': {'code': 0,
> 'actual': True, 'version': 4, 'acquired': True, 'delay': '0.000138003',
> 'lastCheck': '4.9', 'valid': True},
> u'b6730d64-2cf8-42a3-8f08-24b8cc2c0cd8': {'code': 200, 'actual': True,
> 'version': -1, 'acquired': False, 'delay': '0', 'lastCheck': '9.7',
> 'valid': False}, u'c7d32f1b-f32c-4a21-995b-2e3b415aae4e': {'code': 0,
> 'actual': True, 'version': 0, 'acquired': True, 'delay': '0.000618471',
> 'lastCheck': '1.4', 'valid': True},
> u'05ab1dd9-24bc-409b-80b8-6c5b00c52aa9': {'code': 0, 'actual': True,
> 'version': 4, 'acquired': True, 'delay': '0.00027591', 'lastCheck': '5.2',
> 'valid': True}} from=::1,56908 (api:52)
> 2017-10-11 15:11:57,998+0200 INFO  (jsonrpc/1) [jsonrpc.JsonRpcServer]
> RPC call Host.getStats succeeded in 0.06 seconds (__init__:539)
> 2017-10-11 15:11:58,253+0200 ERROR (monitor/b6730d6) [storage.Monitor]
> Setting up monitor for b6730d64-2cf8-42a3-8f08-24b8cc2c0cd8 failed
> (monitor:329)
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/monitor.py", line 326, in _setupLoop
>     self._setupMonitor()
>   File "/usr/share/vdsm/storage/monitor.py", line 349, in _setupMonitor
>     self._produceDomain()
>   File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 401, in
> wrapper
>     value = meth(self, *a, **kw)
>   File "/usr/share/vdsm/storage/monitor.py", line 367, in _produceDomain
>     self.domain = sdCache.produce(self.sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 112, in produce
>     domain.getRealDomain()
>   File "/usr/share/vdsm/storage/sdc.py", line 53, in getRealDomain
>     return self._cache._realProduce(self._sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 136, in _realProduce
>     domain = self._findDomain(sdUUID)
>   File "/usr/share/vdsm/storage/sdc.py", line 153, in _findDomain
>     return findMethod(sdUUID)
>   File "/usr/share/vdsm/storage/nfsSD.py", line 126, in findDomain
>     return NfsStorageDomain(NfsStorageDomain.findDomainPath(sdUUID))
>   File "/usr/share/vdsm/storage/fileSD.py", line 359, in __init__
>     manifest = self.manifestClass(domainPath)
>   File "/usr/share/vdsm/storage/fileSD.py", line 171, in __init__
>     sd.StorageDomainManifest.__init__(self, sdUUID, domaindir, metadata)
>   File "/usr/share/vdsm/storage/sd.py", line 332, in __init__
>     self._domainLock = self._makeDomainLock()
>   File "/usr/share/vdsm/storage/sd.py", line 526, in _makeDomainLock
>     domVersion = self.getVersion()
>   File "/usr/share/vdsm/storage/sd.py", line 403, in getVersion
>     return self.getMetaParam(DMDK_VERSION)
>   File "/usr/share/vdsm/storage/sd.py", line 400, in getMetaParam
>     return self._metadata[key]
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/persistent.py",
> line 91, in __getitem__
>     return dec(self._dict[key])
>   File "/usr/lib/python2.7/site-packages/vdsm/storage/persistent.py",
> line 203, in __getitem__
>     raise KeyError(key)
> KeyError: 'VERSION'
>
>
> Can you help me?
>
> Restarting the hosted engine doesn't solve the problem.
>
> Thank you
>
>
> P.S. Related question: are the tasks above the same as (or related to) the
> ones reported by the engine in this screenshot?
> https://snag.gy/XDmoUt.jpg ... How can I also clear these tasks from the
> engine?
>
> _______________________________________________
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
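
To get a quick summary of which stuck tasks exist and what they were doing,
the getAllTasksInfo output can be parsed with a few lines of Python. This is
only a sketch: it assumes vdsClient prints each task as a "UUID :" header
followed by indented "key = value" lines, and the names parse_tasks and
count_by_verb are hypothetical helpers, not part of vdsm.

```python
import re

# Assumed layout: "<task-uuid> :" header, then indented "key = value" lines.
HEADER = re.compile(
    r"^\s*([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})\s*:\s*$"
)
FIELD = re.compile(r"^\s*(\w+)\s*=\s*(\S+)\s*$")


def parse_tasks(output):
    """Return {task_uuid: {field: value}} parsed from getAllTasksInfo text."""
    tasks = {}
    current = None
    for line in output.splitlines():
        header = HEADER.match(line)
        if header:
            # A new task section starts here.
            current = tasks.setdefault(header.group(1), {})
            continue
        field = FIELD.match(line)
        if field and current is not None:
            current[field.group(1)] = field.group(2)
    return tasks


def count_by_verb(tasks):
    """Return {verb: number_of_tasks} for the parsed tasks."""
    counts = {}
    for fields in tasks.values():
        verb = fields.get("verb", "unknown")
        counts[verb] = counts.get(verb, 0) + 1
    return counts
```

Saving the command output to a file and passing its contents to
parse_tasks() gives a dictionary keyed by task ID, which is easier to
compare against what the engine reports.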


-- 
Adam Litke