On Tue, Mar 06, 2012 at 10:57:49AM +0530, Deepak C Shetty wrote:
> On 03/06/2012 04:21 AM, Dan Kenigsberg wrote:
> >On Mon, Mar 05, 2012 at 12:04:36AM +0530, Deepak C Shetty wrote:
> >>On 03/02/2012 11:54 PM, Deepak C Shetty wrote:
> >>>On 03/02/2012 11:27 PM, Deepak C Shetty wrote:
> >>>>Hi,
> >>>>    In my simple experiment, I connected to a SHAREDFS storage
> >>>>server and then created a data domain.
> >>>>But createStorageDomain failed with code 351, which only
> >>>>says "Error creating a storage domain".
> >>>>
> >>>>How do I find out the real reason behind the failure?
> >>>>
> >>>>Surprisingly, the domain directory structure does get created, so
> >>>>it looks like it worked, yet the call still returns a failure.
> >>>>Why?
> >>>>
> >>>>>>Sample code...
> >>>>#!/usr/bin/python
> >>>># GPLv2+
> >>>>
> >>>>import sys
> >>>>import uuid
> >>>>import time
> >>>>
> >>>>sys.path.append('/usr/share/vdsm')
> >>>>
> >>>>import vdscli
> >>>>from storage.sd import SHAREDFS_DOMAIN, DATA_DOMAIN, ISO_DOMAIN
> >>>>from storage.volume import COW_FORMAT, SPARSE_VOL, LEAF_VOL, BLANK_UUID
> >>>>spUUID = str(uuid.uuid4())
> >>>>sdUUID = str(uuid.uuid4())
> >>>>imgUUID = str(uuid.uuid4())
> >>>>volUUID = str(uuid.uuid4())
> >>>>
> >>>>print "spUUID = %s"%spUUID
> >>>>print "sdUUID = %s"%sdUUID
> >>>>print "imgUUID = %s"%imgUUID
> >>>>print "volUUID = %s"%volUUID
> >>>>
> >>>>gluster_conn = "llm65.in.ibm.com:myvol"
> >>>>
> >>>>s = vdscli.connect()
> >>>>
> >>>>masterVersion = 1
> >>>>hostID = 1
> >>>>
> >>>>def vdsOK(d):
> >>>>    print d
> >>>>    if d['status']['code']:
> >>>>        raise Exception(str(d))
> >>>>    return d
> >>>>
> >>>>def waitTask(s, taskid):
> >>>>    while vdsOK(s.getTaskStatus(taskid))['taskStatus']['taskState'] != 'finished':
> >>>>        time.sleep(3)
> >>>>    vdsOK(s.clearTask(taskid))
> >>>>
> >>>>vdsOK(s.connectStorageServer(SHAREDFS_DOMAIN, "my gluster mount",
> >>>>      [dict(id=1, spec=gluster_conn, vfs_type="glusterfs", mnt_options="")]))
> >>>>
> >>>>vdsOK(s.createStorageDomain(SHAREDFS_DOMAIN, sdUUID, "my gluster domain",
> >>>>      gluster_conn, DATA_DOMAIN, 0))
> >>>>
> >>>>>>Output...
> >>>>./dpk-sharedfs-vm.py
> >>>>spUUID = 852110d5-c3d2-456e-ae75-b72e929e9bae
> >>>>sdUUID = 1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe
> >>>>imgUUID = c29100e7-19cd-4a27-adc6-4c35cc5e690c
> >>>>volUUID = 1d074f24-8bf0-4b68-8a35-40c3f2c33723
> >>>>{'status': {'message': 'OK', 'code': 0}, 'statuslist':
> >>>>[{'status': 0, 'id': 1}]}
> >>>>{'status': {'message': "Error creating a storage domain:
> >>>>('storageType=6, sdUUID=1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe,
> >>>>domainName=my gluster domain, domClass=1,
> >>>>typeSpecificArg=llm65.in.ibm.com:myvol domVersion=0',)", 'code':
> >>>>351}}
> >>>>Traceback (most recent call last):
> >>>>  File "./dpk-sharedfs-vm.py", line 74, in<module>
> >>>>    vdsOK(s.createStorageDomain(SHAREDFS_DOMAIN, sdUUID, "my
> >>>>gluster domain", gluster_conn, DATA_DOMAIN, 0))
> >>>>  File "./dpk-sharedfs-vm.py", line 62, in vdsOK
> >>>>    raise Exception(str(d))
> >>>>Exception: {'status': {'message': "Error creating a storage
> >>>>domain: ('storageType=6,
> >>>>sdUUID=1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe, domainName=my
> >>>>gluster domain, domClass=1,
> >>>>typeSpecificArg=llm65.in.ibm.com:myvol domVersion=0',)", 'code':
> >>>>351}}
> >>>>
> >>>>>>But it did create the dir structure...
> >>>>]# find /rhev/data-center/mnt/llm65.in.ibm.com\:myvol/
> >>>>/rhev/data-center/mnt/llm65.in.ibm.com:myvol/
> >>>>/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe
> >>>>
> >>>>/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md
> >>>>
> >>>>/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/metadata
> >>>>
> >>>>/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/leases
> >>>>
> >>>>/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/outbox
> >>>>
> >>>>/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/inbox
> >>>>
> >>>>/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/ids
> >>>>
> >>>>/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/images
> >>>>
> >>>>
> >>>># mount | grep gluster
> >>>>llm65.in.ibm.com:myvol on
> >>>>/rhev/data-center/mnt/llm65.in.ibm.com:myvol type fuse.glusterfs 
> >>>>(rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)
> >>>>
> >>>>
> >>>Attaching the vdsm.log....
> >>>
> >>>Thread-46::INFO::2012-03-03
> >>>04:49:16,092::nfsSD::64::Storage.StorageDomain::(create)
> >>>sdUUID=1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe domainName=my gluster
> >>>domain remotePath=llm65.in.ibm.com:myvol domClass=1
> >>>Thread-46::DEBUG::2012-03-03 
> >>>04:49:16,111::persistentDict::175::Storage.PersistentDict::(__init__)
> >>>Created a persistant dict with FileMetadataRW backend
> >>>Thread-46::DEBUG::2012-03-03
> >>>04:49:16,113::persistentDict::216::Storage.PersistentDict::(refresh)
> >>>read lines (FileMetadataRW)=[]
> >>>Thread-46::WARNING::2012-03-03
> >>>04:49:16,113::persistentDict::238::Storage.PersistentDict::(refresh)
> >>>data has no embedded checksum - trust it as it is
> >>>Thread-46::DEBUG::2012-03-03 
> >>>04:49:16,113::persistentDict::152::Storage.PersistentDict::(transaction)
> >>>Starting transaction
> >>>Thread-46::DEBUG::2012-03-03 
> >>>04:49:16,114::persistentDict::158::Storage.PersistentDict::(transaction)
> >>>Flushing changes
> >>>Thread-46::DEBUG::2012-03-03
> >>>04:49:16,114::persistentDict::277::Storage.PersistentDict::(flush)
> >>>about to write lines (FileMetadataRW)=['CLASS=Data',
> >>>'DESCRIPTION=my gluster domain', 'IOOPTIMEOUTSEC=1',
> >>>'LEASERETRIES=3', 'LEASETIMESEC=5', 'LOCKPOLICY=',
> >>>'LOCKRENEWALINTERVALSEC=5', 'POOL_UUID=',
> >>>'REMOTE_PATH=llm65.in.ibm.com:myvol', 'ROLE=Regular',
> >>>'SDUUID=1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe', 'TYPE=SHAREDFS',
> >>>'VERSION=0',
> >>>'_SHA_CKSUM=c8ba67889d4b62ccd9fd368c584501404e8ee84e']
> >>>Thread-46::DEBUG::2012-03-03 
> >>>04:49:16,118::persistentDict::160::Storage.PersistentDict::(transaction)
> >>>Finished transaction
> >>>Thread-46::DEBUG::2012-03-03
> >>>04:49:16,120::fileSD::98::Storage.StorageDomain::(__init__)
> >>>Reading domain in path 
> >>>/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe
> >>>Thread-46::DEBUG::2012-03-03 
> >>>04:49:16,120::persistentDict::175::Storage.PersistentDict::(__init__)
> >>>Created a persistant dict with FileMetadataRW backend
> >>>Thread-46::ERROR::2012-03-03
> >>>04:49:16,121::task::855::TaskManager.Task::(_setError)
> >>>Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::Unexpected error
> >>>Traceback (most recent call last):
> >>>  File "/usr/share/vdsm/storage/task.py", line 863, in _run
> >>>    return fn(*args, **kargs)
> >>>  File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
> >>>    res = f(*args, **kwargs)
> >>>  File "/usr/share/vdsm/storage/hsm.py", line 1922, in
> >>>createStorageDomain
> >>>    typeSpecificArg, storageType, domVersion)
> >>>  File "/usr/share/vdsm/storage/nfsSD.py", line 87, in create
> >>>    fsd = cls(os.path.join(mntPoint, sdUUID))
> >>>  File "/usr/share/vdsm/storage/fileSD.py", line 104, in __init__
> >>>    sdUUID = metadata[sd.DMDK_SDUUID]
> >>>  File "/usr/share/vdsm/storage/persistentDict.py", line 75, in
> >>>__getitem__
> >>>    return dec(self._dict[key])
> >>>  File "/usr/share/vdsm/storage/persistentDict.py", line 183, in
> >>>__getitem__
> >>>    with self._accessWrapper():
> >>>  File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
> >>>    return self.gen.next()
> >>>  File "/usr/share/vdsm/storage/persistentDict.py", line 137, in
> >>>_accessWrapper
> >>>    self.refresh()
> >>>  File "/usr/share/vdsm/storage/persistentDict.py", line 214, in refresh
> >>>    lines = self._metaRW.readlines()
> >>>  File "/usr/share/vdsm/storage/fileSD.py", line 71, in readlines
> >>>    return misc.stripNewLines(self._oop.directReadLines(self._metafile))
> >>>  File "/usr/share/vdsm/storage/processPool.py", line 53, in wrapper
> >>>    return self.runExternally(func, *args, **kwds)
> >>>  File "/usr/share/vdsm/storage/processPool.py", line 64, in
> >>>runExternally
> >>>    return self._procPool.runExternally(*args, **kwargs)
> >>>  File "/usr/share/vdsm/storage/processPool.py", line 154, in
> >>>runExternally
> >>>    raise err
> >>>
> >>>OSError: [Errno 22] Invalid argument: 
> >>>'/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/metadata'
> >>>Thread-46::DEBUG::2012-03-03
> >>>04:49:16,129::task::874::TaskManager.Task::(_run)
> >>>Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::Task._run:
> >>>9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0 (6,
> >>>'1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe', 'my gluster domain',
> >>>'llm65.in.ibm.com:myvol', 1, 0) {} failed - stopping task
> >>>Thread-46::DEBUG::2012-03-03
> >>>04:49:16,130::task::1201::TaskManager.Task::(stop)
> >>>Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::stopping in state
> >>>preparing (force False)
> >>>Thread-46::DEBUG::2012-03-03
> >>>04:49:16,130::task::980::TaskManager.Task::(_decref)
> >>>Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::ref 1 aborting True
> >>>Thread-46::INFO::2012-03-03
> >>>04:49:16,130::task::1159::TaskManager.Task::(prepare)
> >>>Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::aborting: Task is
> >>>aborted: "[Errno 22] Invalid argument: 
> >>>'/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/metadata'"
> >>>- code 100
> >>>Thread-46::DEBUG::2012-03-03
> >>>04:49:16,130::task::1164::TaskManager.Task::(prepare)
> >>>Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::Prepare: aborted:
> >>>[Errno 22] Invalid argument: 
> >>>'/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/metadata'
> >>>Thread-46::DEBUG::2012-03-03
> >>>04:49:16,130::task::980::TaskManager.Task::(_decref)
> >>>Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::ref 0 aborting True
> >>>Thread-46::DEBUG::2012-03-03
> >>>04:49:16,131::task::915::TaskManager.Task::(_doAbort)
> >>>Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::Task._doAbort: force
> >>>False
> >>>Thread-46::DEBUG::2012-03-03 
> >>>04:49:16,131::resourceManager::841::ResourceManager.Owner::(cancelAll)
> >>>Owner.cancelAll requests {}
> >>>Thread-46::DEBUG::2012-03-03
> >>>04:49:16,131::task::588::TaskManager.Task::(_updateState)
> >>>Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::moving from state
> >>>preparing ->  state aborting
> >>>Thread-46::DEBUG::2012-03-03
> >>>04:49:16,131::task::537::TaskManager.Task::(__state_aborting)
> >>>Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::_aborting: recover
> >>>policy none
> >>>Thread-46::DEBUG::2012-03-03
> >>>04:49:16,132::task::588::TaskManager.Task::(_updateState)
> >>>Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::moving from state
> >>>aborting ->  state failed
> >>>Thread-46::DEBUG::2012-03-03 
> >>>04:49:16,132::resourceManager::806::ResourceManager.Owner::(releaseAll)
> >>>Owner.releaseAll requests {} resources {}
> >>>Thread-46::DEBUG::2012-03-03 
> >>>04:49:16,132::resourceManager::841::ResourceManager.Owner::(cancelAll)
> >>>Owner.cancelAll requests {}
> >>>Thread-46::ERROR::2012-03-03
> >>>04:49:16,132::dispatcher::93::Storage.Dispatcher.Protect::(run)
> >>>[Errno 22] Invalid argument: 
> >>>'/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/metadata'
> >>>Traceback (most recent call last):
> >>>  File "/usr/share/vdsm/storage/dispatcher.py", line 85, in run
> >>>    result = ctask.prepare(self.func, *args, **kwargs)
> >>>  File "/usr/share/vdsm/storage/task.py", line 1166, in prepare
> >>>    raise self.error
> >>>OSError: [Errno 22] Invalid argument: 
> >>>'/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/metadata'
> >>>
> >>>
> >>Hi Saggie,
> >>     Wondering if you could offer some help here...
> >>
> >>I did some more debugging and figured out that the metadata file (for
> >>which the above exception is being thrown) is opened in
> >>O_DIRECT|O_RDONLY mode in fileUtils.py. A sample Python script I
> >>tried throws the same exception (Errno 22) when reading any file
> >>opened in O_DIRECT mode via os.read(f, 100).
> >>
> >>Going deeper into the vdsm code, I see that
> >>DirectFile.read()/readall() uses libc.read rather than os.read, and
> >>libc.read is fed aligned buffers, so I am wondering why Errno 22
> >>still occurs.
> >Could it be that O_DIRECT is simply not supported on your gluster mount?
> >Would the following script explode, too?
> >
> >import sys
> >sys.path.append('/usr/share/vdsm')
> >
> >import storage.fileUtils
> >
> >f = storage.fileUtils.open_ex('/gluster/mounted/file', 'dr')
> >s = f.read()
> >
> 
> Will try to figure out whether the gluster mount supports O_DIRECT.
> Until then, I worked around it by using readLines instead of
> directReadLines; that gets me past the issue, and createStorageDomain
> now seems successful, judging from the vdsOK output below:
> 
> {'status': {'message': 'OK', 'code': 0}}
> 
> But, in vdsm.log, i see this...
> 
> Thread-31::DEBUG::2012-03-06
> 16:21:24,676::safelease::54::Storage.Misc.excCmd::(initLock) FAILED:
> <err> = 'sudo: sorry, a password is required to run sudo\n'; <rc> =
> 1
> Thread-31::WARNING::2012-03-06
> 16:21:24,676::safelease::56::ClusterLock::(initLock) could not
> initialise spm lease (1): []
> Thread-31::WARNING::2012-03-06
> 16:21:24,677::sd::328::Storage.StorageDomain::(initSPMlease) lease
> did not initialize successfully
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/sd.py", line 324, in initSPMlease
>     safelease.ClusterLock.initLock(self._getLeasesFilePath())
>   File "/usr/share/vdsm/storage/safelease.py", line 57, in initLock
>     raise se.ClusterLockInitError()
> ClusterLockInitError: Could not initialize cluster lock: ()
> Thread-31::INFO::2012-03-06
> 16:21:24,677::logUtils::39::dispatcher::(wrapper) Run and protect:
> createStorageDomain, Return response: None

Taking the SPM lock requires O_DIRECT, too. Let's start by understanding how to
enable it over gluster.
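
One quick way to check is a small standalone probe, independent of vdsm
(a sketch only, Python 3 for brevity; assumes Linux, and that `path`
points at a file on the gluster mount in question):

```python
import errno
import mmap
import os

def o_direct_readable(path, size=4096):
    """Probe whether `path` can be read with O_DIRECT on its filesystem."""
    try:
        fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
    except OSError as e:
        if e.errno == errno.EINVAL:      # filesystem rejects O_DIRECT at open
            return False
        raise
    try:
        buf = mmap.mmap(-1, size)        # anonymous mmap: page-aligned buffer
        try:
            os.readv(fd, [buf])          # aligned buffer, aligned offset 0
            return True
        except OSError as e:
            if e.errno == errno.EINVAL:  # O_DIRECT alignment constraint hit
                return False
            raise
    finally:
        os.close(fd)
```

If this returns False on the gluster mount but True on a local
filesystem, the mount itself is rejecting O_DIRECT rather than vdsm
misaligning its buffers.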

> 
> So I am wondering why the lock init is failing. In fact, if I try
> createStoragePool, I get more issues.
> 
> Lastly, I figured that the vdsm code does not use os.read but
> libc.read with an aligned buffer, in which case Errno 22 should have
> been avoided, right?

O_DIRECT is notoriously fragile. Someone has to debug the issue and understand
why it fails for you... hint, hint ;-)
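
To make the alignment requirement concrete, here is a minimal sketch of
the aligned libc.read pattern under discussion (a hypothetical
illustration only, not vdsm's actual DirectFile code; assumes Linux with
glibc):

```python
import ctypes
import os

_libc = ctypes.CDLL("libc.so.6", use_errno=True)
# Declare argtypes/restype explicitly: size_t is 64-bit on x86_64, and
# letting ctypes default to c_int can silently corrupt arguments.
_libc.posix_memalign.argtypes = [ctypes.POINTER(ctypes.c_void_p),
                                 ctypes.c_size_t, ctypes.c_size_t]
_libc.read.argtypes = [ctypes.c_int, ctypes.c_void_p, ctypes.c_size_t]
_libc.read.restype = ctypes.c_ssize_t
_libc.free.argtypes = [ctypes.c_void_p]

def direct_read(path, size=4096, align=4096):
    """Read up to `size` bytes with O_DIRECT into a posix_memalign'd buffer."""
    buf = ctypes.c_void_p()
    rc = _libc.posix_memalign(ctypes.byref(buf), align, size)
    if rc:
        raise MemoryError(os.strerror(rc))
    fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
    try:
        n = _libc.read(fd, buf, size)
        if n < 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
        return ctypes.string_at(buf, n)
    finally:
        os.close(fd)
        _libc.free(buf)
```

Note that even with a correctly aligned buffer, offset, and length, the
read still fails with EINVAL if the underlying filesystem (or FUSE
layer) does not honor O_DIRECT, which is the suspicion with the gluster
mount here.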
_______________________________________________
vdsm-devel mailing list
[email protected]
https://fedorahosted.org/mailman/listinfo/vdsm-devel