On 03/06/2012 01:22 PM, Dan Kenigsberg wrote:
On Tue, Mar 06, 2012 at 10:57:49AM +0530, Deepak C Shetty wrote:
On 03/06/2012 04:21 AM, Dan Kenigsberg wrote:
On Mon, Mar 05, 2012 at 12:04:36AM +0530, Deepak C Shetty wrote:
On 03/02/2012 11:54 PM, Deepak C Shetty wrote:
On 03/02/2012 11:27 PM, Deepak C Shetty wrote:
Hi,
    In my simple experiment, I connected to a SHAREDFS storage
server and then created a data domain.
But createStorageDomain failed with code 351, which just
says "Error creating a storage domain".

How can I find out the real reason behind the failure?

Surprisingly, the domain directory structure does get created, so
it looks like it worked, yet the call still returns
a failure. Why?

Sample code...
#!/usr/bin/python
# GPLv2+

import sys
import uuid
import time

sys.path.append('/usr/share/vdsm')

import vdscli
from storage.sd import SHAREDFS_DOMAIN, DATA_DOMAIN, ISO_DOMAIN
from storage.volume import COW_FORMAT, SPARSE_VOL, LEAF_VOL, BLANK_UUID
spUUID = str(uuid.uuid4())
sdUUID = str(uuid.uuid4())
imgUUID = str(uuid.uuid4())
volUUID = str(uuid.uuid4())

print "spUUID = %s"%spUUID
print "sdUUID = %s"%sdUUID
print "imgUUID = %s"%imgUUID
print "volUUID = %s"%volUUID

gluster_conn = "llm65.in.ibm.com:myvol"

s = vdscli.connect()

masterVersion = 1
hostID = 1

def vdsOK(d):
    # Print the raw vdsm response and raise if its status code is non-zero.
    print d
    if d['status']['code']:
        raise Exception(str(d))
    return d

def waitTask(s, taskid):
    # Poll every 3 seconds until the task reports 'finished', then clear it.
    while (vdsOK(s.getTaskStatus(taskid))['taskStatus']['taskState']
           != 'finished'):
        time.sleep(3)
    vdsOK(s.clearTask(taskid))

vdsOK(s.connectStorageServer(SHAREDFS_DOMAIN, "my gluster mount",
    [dict(id=1, spec=gluster_conn, vfs_type="glusterfs", mnt_options="")]))

vdsOK(s.createStorageDomain(SHAREDFS_DOMAIN, sdUUID, "my gluster domain",
    gluster_conn, DATA_DOMAIN, 0))

Output...
./dpk-sharedfs-vm.py
spUUID = 852110d5-c3d2-456e-ae75-b72e929e9bae
sdUUID = 1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe
imgUUID = c29100e7-19cd-4a27-adc6-4c35cc5e690c
volUUID = 1d074f24-8bf0-4b68-8a35-40c3f2c33723
{'status': {'message': 'OK', 'code': 0}, 'statuslist':
[{'status': 0, 'id': 1}]}
{'status': {'message': "Error creating a storage domain:
('storageType=6, sdUUID=1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe,
domainName=my gluster domain, domClass=1,
typeSpecificArg=llm65.in.ibm.com:myvol domVersion=0',)", 'code':
351}}
Traceback (most recent call last):
  File "./dpk-sharedfs-vm.py", line 74, in <module>
    vdsOK(s.createStorageDomain(SHAREDFS_DOMAIN, sdUUID, "my gluster domain", gluster_conn, DATA_DOMAIN, 0))
  File "./dpk-sharedfs-vm.py", line 62, in vdsOK
    raise Exception(str(d))
Exception: {'status': {'message': "Error creating a storage
domain: ('storageType=6,
sdUUID=1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe, domainName=my
gluster domain, domClass=1,
typeSpecificArg=llm65.in.ibm.com:myvol domVersion=0',)", 'code':
351}}

But it did create the dir structure...
# find /rhev/data-center/mnt/llm65.in.ibm.com\:myvol/
/rhev/data-center/mnt/llm65.in.ibm.com:myvol/
/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe
/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md
/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/metadata
/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/leases
/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/outbox
/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/inbox
/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/ids
/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/images


# mount | grep gluster
llm65.in.ibm.com:myvol on /rhev/data-center/mnt/llm65.in.ibm.com:myvol type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)


Attaching the vdsm.log....

Thread-46::INFO::2012-03-03
04:49:16,092::nfsSD::64::Storage.StorageDomain::(create)
sdUUID=1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe domainName=my gluster
domain remotePath=llm65.in.ibm.com:myvol domClass=1
Thread-46::DEBUG::2012-03-03 
04:49:16,111::persistentDict::175::Storage.PersistentDict::(__init__)
Created a persistant dict with FileMetadataRW backend
Thread-46::DEBUG::2012-03-03
04:49:16,113::persistentDict::216::Storage.PersistentDict::(refresh)
read lines (FileMetadataRW)=[]
Thread-46::WARNING::2012-03-03
04:49:16,113::persistentDict::238::Storage.PersistentDict::(refresh)
data has no embedded checksum - trust it as it is
Thread-46::DEBUG::2012-03-03 
04:49:16,113::persistentDict::152::Storage.PersistentDict::(transaction)
Starting transaction
Thread-46::DEBUG::2012-03-03 
04:49:16,114::persistentDict::158::Storage.PersistentDict::(transaction)
Flushing changes
Thread-46::DEBUG::2012-03-03
04:49:16,114::persistentDict::277::Storage.PersistentDict::(flush)
about to write lines (FileMetadataRW)=['CLASS=Data',
'DESCRIPTION=my gluster domain', 'IOOPTIMEOUTSEC=1',
'LEASERETRIES=3', 'LEASETIMESEC=5', 'LOCKPOLICY=',
'LOCKRENEWALINTERVALSEC=5', 'POOL_UUID=',
'REMOTE_PATH=llm65.in.ibm.com:myvol', 'ROLE=Regular',
'SDUUID=1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe', 'TYPE=SHAREDFS',
'VERSION=0',
'_SHA_CKSUM=c8ba67889d4b62ccd9fd368c584501404e8ee84e']
Thread-46::DEBUG::2012-03-03 
04:49:16,118::persistentDict::160::Storage.PersistentDict::(transaction)
Finished transaction
Thread-46::DEBUG::2012-03-03
04:49:16,120::fileSD::98::Storage.StorageDomain::(__init__)
Reading domain in path 
/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe
Thread-46::DEBUG::2012-03-03 
04:49:16,120::persistentDict::175::Storage.PersistentDict::(__init__)
Created a persistant dict with FileMetadataRW backend
Thread-46::ERROR::2012-03-03
04:49:16,121::task::855::TaskManager.Task::(_setError)
Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 863, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1922, in
createStorageDomain
    typeSpecificArg, storageType, domVersion)
  File "/usr/share/vdsm/storage/nfsSD.py", line 87, in create
    fsd = cls(os.path.join(mntPoint, sdUUID))
  File "/usr/share/vdsm/storage/fileSD.py", line 104, in __init__
    sdUUID = metadata[sd.DMDK_SDUUID]
  File "/usr/share/vdsm/storage/persistentDict.py", line 75, in
__getitem__
    return dec(self._dict[key])
  File "/usr/share/vdsm/storage/persistentDict.py", line 183, in
__getitem__
    with self._accessWrapper():
  File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/usr/share/vdsm/storage/persistentDict.py", line 137, in
_accessWrapper
    self.refresh()
  File "/usr/share/vdsm/storage/persistentDict.py", line 214, in refresh
    lines = self._metaRW.readlines()
  File "/usr/share/vdsm/storage/fileSD.py", line 71, in readlines
    return misc.stripNewLines(self._oop.directReadLines(self._metafile))
  File "/usr/share/vdsm/storage/processPool.py", line 53, in wrapper
    return self.runExternally(func, *args, **kwds)
  File "/usr/share/vdsm/storage/processPool.py", line 64, in
runExternally
    return self._procPool.runExternally(*args, **kwargs)
  File "/usr/share/vdsm/storage/processPool.py", line 154, in
runExternally
    raise err

OSError: [Errno 22] Invalid argument: 
'/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/metadata'
Thread-46::DEBUG::2012-03-03
04:49:16,129::task::874::TaskManager.Task::(_run)
Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::Task._run:
9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0 (6,
'1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe', 'my gluster domain',
'llm65.in.ibm.com:myvol', 1, 0) {} failed - stopping task
Thread-46::DEBUG::2012-03-03
04:49:16,130::task::1201::TaskManager.Task::(stop)
Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::stopping in state
preparing (force False)
Thread-46::DEBUG::2012-03-03
04:49:16,130::task::980::TaskManager.Task::(_decref)
Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::ref 1 aborting True
Thread-46::INFO::2012-03-03
04:49:16,130::task::1159::TaskManager.Task::(prepare)
Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::aborting: Task is
aborted: "[Errno 22] Invalid argument: 
'/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/metadata'"
- code 100
Thread-46::DEBUG::2012-03-03
04:49:16,130::task::1164::TaskManager.Task::(prepare)
Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::Prepare: aborted:
[Errno 22] Invalid argument: 
'/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/metadata'
Thread-46::DEBUG::2012-03-03
04:49:16,130::task::980::TaskManager.Task::(_decref)
Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::ref 0 aborting True
Thread-46::DEBUG::2012-03-03
04:49:16,131::task::915::TaskManager.Task::(_doAbort)
Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::Task._doAbort: force
False
Thread-46::DEBUG::2012-03-03 
04:49:16,131::resourceManager::841::ResourceManager.Owner::(cancelAll)
Owner.cancelAll requests {}
Thread-46::DEBUG::2012-03-03
04:49:16,131::task::588::TaskManager.Task::(_updateState)
Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::moving from state
preparing ->   state aborting
Thread-46::DEBUG::2012-03-03
04:49:16,131::task::537::TaskManager.Task::(__state_aborting)
Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::_aborting: recover
policy none
Thread-46::DEBUG::2012-03-03
04:49:16,132::task::588::TaskManager.Task::(_updateState)
Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::moving from state
aborting ->   state failed
Thread-46::DEBUG::2012-03-03 
04:49:16,132::resourceManager::806::ResourceManager.Owner::(releaseAll)
Owner.releaseAll requests {} resources {}
Thread-46::DEBUG::2012-03-03 
04:49:16,132::resourceManager::841::ResourceManager.Owner::(cancelAll)
Owner.cancelAll requests {}
Thread-46::ERROR::2012-03-03
04:49:16,132::dispatcher::93::Storage.Dispatcher.Protect::(run)
[Errno 22] Invalid argument: 
'/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/metadata'
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/dispatcher.py", line 85, in run
    result = ctask.prepare(self.func, *args, **kwargs)
  File "/usr/share/vdsm/storage/task.py", line 1166, in prepare
    raise self.error
OSError: [Errno 22] Invalid argument: 
'/rhev/data-center/mnt/llm65.in.ibm.com:myvol/1c15bc91-f62b-43c8-b68a-fd2bd3ed18fe/dom_md/metadata'
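
To the earlier question of how to find the real reason behind code 351: the log above carries it in the ERROR records and the traceback lines that follow them. Here is a minimal sketch for pulling those out of a log, assuming each record starts on its own line (unlike the mail-wrapped copy above) in the thread::level::timestamp::module::line::logger::(function) layout seen here. The regular expression, the extract_errors name and the shortened SAMPLE text are illustrative assumptions, not part of vdsm.

```python
import re

# Record header layout as it appears in the log excerpt above (assumed stable).
RECORD = re.compile(
    r"^(?P<thread>[^:]+)::(?P<level>[A-Z]+)::"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
    r"(?P<module>\w+)::(?P<lineno>\d+)::(?P<logger>[\w.]+)::"
    r"\((?P<func>\w+)\) (?P<message>.*)$"
)


def extract_errors(text):
    """Return ERROR records, each with its continuation lines (tracebacks)."""
    errors = []
    current = None
    for line in text.splitlines():
        match = RECORD.match(line)
        if match:
            # A new record header ends any traceback being collected.
            current = None
            if match.group("level") == "ERROR":
                current = match.groupdict()
                current["detail"] = []
                errors.append(current)
        elif current is not None:
            # Lines without a header belong to the preceding ERROR record.
            current["detail"].append(line)
    return errors


# Shortened sample built from the records quoted above.
SAMPLE = """\
Thread-46::DEBUG::2012-03-03 04:49:16,120::persistentDict::175::Storage.PersistentDict::(__init__) Created a persistant dict with FileMetadataRW backend
Thread-46::ERROR::2012-03-03 04:49:16,121::task::855::TaskManager.Task::(_setError) Task=`9d108fc4-5fd4-4c88-8f4a-f44309ea0ce0`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 863, in _run
    return fn(*args, **kargs)
OSError: [Errno 22] Invalid argument
"""

for err in extract_errors(SAMPLE):
    print(err["timestamp"], err["module"], err["func"], err["message"])
    print("\n".join(err["detail"]))
```

On a real host the same function could be fed the contents of the vdsm log file; its location is not given in this thread, so the path is left to the reader.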


Hi Saggie,
     Wondering if you could offer some help here...

I did some more debugging and figured out that the metafile (for which the
above exception is being thrown) is opened
in O_DIRECT|O_RDONLY mode in fileUtils.py. A sample Python script I
tried throws the same exception (Errno 22)
when trying to read any file opened in O_DIRECT mode
using os.read(f, 100).

Going deeper into the vdsm code, I see that
DirectFile.read()/readall() uses libc.read and not
os.read. libc.read is being fed aligned buffers, so I wonder
why Errno 22 still occurs?
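
For context on why os.read(f, 100) is expected to fail here: O_DIRECT can require the buffer address, the transfer length and the file offset to all be multiples of the block size, and a 100-byte read into an ordinary Python bytes object satisfies neither the length nor (reliably) the address requirement. Below is a minimal sketch of getting an aligned buffer from plain Python, assuming Python 3 and that page alignment is sufficient for the filesystem in question. The helper names are hypothetical and this is not how vdsm's DirectFile is implemented.

```python
import ctypes
import mmap


def round_up(length, alignment):
    """Round length up to the next multiple of alignment."""
    return -(-length // alignment) * alignment


def buffer_address(buf):
    """Return the memory address of a writable buffer object."""
    return ctypes.addressof(ctypes.c_char.from_buffer(buf))


def aligned_buffer(length, alignment=mmap.PAGESIZE):
    """Allocate an anonymous mmap, which the kernel places on a page boundary."""
    return mmap.mmap(-1, round_up(length, alignment))


# A 100-byte request becomes a whole number of pages at a page-aligned address.
buf = aligned_buffer(100)
print("length:", len(buf))
print("address modulo page size:", buffer_address(buf) % mmap.PAGESIZE)
```

An aligned buffer only removes one possible cause of EINVAL. If the filesystem itself rejects O_DIRECT, the read fails no matter how the buffer is allocated.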
Could it be that O_DIRECT is simply not supported on your gluster mount?
Would the following script explode, too?

import storage.fileUtils

f = storage.fileUtils.open_ex('/gluster/mounted/file', 'dr')
s = f.read()
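
The same question can be checked without vdsm on the path. This is a minimal sketch, assuming Python 3 on Linux (os.O_DIRECT is not available elsewhere) and a 4096-byte block size; the supports_o_direct name is hypothetical. It writes a scratch file in the given directory, reopens it with O_DIRECT and reads one block into a page-aligned buffer, treating EINVAL or EOPNOTSUPP as "not supported".

```python
import errno
import mmap
import os
import tempfile


def supports_o_direct(directory, block_size=4096):
    """Return True if a scratch file in `directory` can be read with O_DIRECT."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        # Fill the scratch file with exactly one block using buffered I/O.
        try:
            os.write(fd, b"x" * block_size)
            os.fsync(fd)
        finally:
            os.close(fd)

        # Some filesystems reject O_DIRECT at open(), others only at read().
        fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
        try:
            buf = mmap.mmap(-1, block_size)  # anonymous mmap is page-aligned
            try:
                return os.readv(fd, [buf]) == block_size
            finally:
                buf.close()
        finally:
            os.close(fd)
    except OSError as e:
        if e.errno in (errno.EINVAL, errno.EOPNOTSUPP):
            return False
        raise
    finally:
        os.unlink(path)


# Probe the system temp directory; pass a mount point to probe that instead.
print(supports_o_direct(tempfile.gettempdir()))
```

Passing the gluster mount point from this thread as the directory would test that mount. The result depends on the kernel and filesystem, so none is shown here.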

Will try to figure out whether the gluster mount supports O_DIRECT.
Until then, I worked around it by using readLines instead of directReadLines.
That gets me past the issue, and createStorageDomain now seems successful,
going by the vdsOK print below...

{'status': {'message': 'OK', 'code': 0}}

But in vdsm.log, I see this...

Thread-31::DEBUG::2012-03-06
16:21:24,676::safelease::54::Storage.Misc.excCmd::(initLock) FAILED:
<err>  = 'sudo: sorry, a password is required to run sudo\n';<rc>  =
1
Thread-31::WARNING::2012-03-06
16:21:24,676::safelease::56::ClusterLock::(initLock) could not
initialise spm lease (1): []
Thread-31::WARNING::2012-03-06
16:21:24,677::sd::328::Storage.StorageDomain::(initSPMlease) lease
did not initialize successfully
Traceback (most recent call last):
   File "/usr/share/vdsm/storage/sd.py", line 324, in initSPMlease
     safelease.ClusterLock.initLock(self._getLeasesFilePath())
   File "/usr/share/vdsm/storage/safelease.py", line 57, in initLock
     raise se.ClusterLockInitError()
ClusterLockInitError: Could not initialize cluster lock: ()
Thread-31::INFO::2012-03-06
16:21:24,677::logUtils::39::dispatcher::(wrapper) Run and protect:
createStorageDomain, Return response: None
Taking the SPM lock requires O_DIRECT, too. Let's start by understanding how to
enable it over gluster.

Ah, I didn't realise that; looks like I missed it in the code.

From #gluster I gather that fuse still does not support O_DIRECT.
From linux-fsdevel, it looks like patches to enable O_DIRECT in fuse
are just getting in.

So I am wondering why the lock init is failing. In fact, if I try
createStoragePool, I get more issues.

Lastly, I figured out that the vdsm code does not use os.read but uses
libc.read, passing an aligned
buffer, in which case Errno 22 should have been avoided, right?
O_DIRECT is notoriously fragile. Someone has to debug the issue and understand
why it fails for you... hint, hint ;-)



_______________________________________________
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel
