On Mon, Nov 25, 2013 at 05:43:19PM +0900, MORITA Kazutaka wrote:
> At Mon, 25 Nov 2013 15:03:46 +0800,
> Robin Dong wrote:
> >
> > The present implementation of http/swift is not perfect, it can't create
> > too much containers or objects. So we want to store all objects in one
> > hyper volume vdi and use new structure 'obj-inode' to identify its offset
> > and length in this vdi, just like some local file system. To achieve this,
> > we need distributed locks to ensure that only one thread can create a new
> > 'obj-inode' (or delete) in this vdi at a same time.
> >
> > This patch set is a try to implement the distributed lock.
> >
> > If we add code in sheep/cluster/zookeeper.c and use the framework of
> > cluster to implement this distributed lock, then we have to add
> > implementation for corosync, local and shepherd. That's too complicated. So
> > what we need is adding lock.c in sheep/http/ and only use it in http
> > interface.
>
> If possible, I don't like to see zookeeper specific codes out side of
> sheep/cluster/zookeeper.c. Can we use a SD_OP_TYPE_CLUSTER operation
> for your purpose? It works like a cluster-wide distributed lock.
>
> For example, vdi creation works like as follows.
>
> 1. When sheep receives a SD_OP_NEW_VDI operation, sheep calls
>    cdrv->block() to block all the other cluster operations.
>
> 2. Sheep calls cluster_new_vdi() in sd_block_handler(). It is
>    ensured that no other sheep call sd_block_handler() at the same
>    time. This is necessary here because sheepdog doesn't allow
>    concurrent vdi creation requests.
>
> 3. All the sheep in the cluster call post_cluster_new_vdi() in
>    sd_notify_handler(). It is usually used for notification or
>    cleanups.
>

I don't think this approach is efficient, though it is simpler because we
can make use of the existing mechanism, since:

 - it can't scale: there is only one lock in the cluster, so object
   creations from different containers will all compete for this lock.

 - it can be affected by operations that are not even related to http
   operations. For example, 'vdi create' will block the cluster, which
   means that until it unblocks the cluster, we can't create/delete
   objects or containers at all.

I think a lock per operation is really needed. E.g., every container has
its own lock, so that objects can be created concurrently without
interfering with other containers.

Thanks
Yuan
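
To illustrate the per-container locking idea, here is a minimal
single-process sketch in C. It is only an analogy: it uses pthread mutexes
inside one process rather than a distributed lock, and struct container,
container_init() and container_create_object() are hypothetical names made
up for this example, not existing sheepdog code. The point it tries to show
is that creating an object takes only the lock of the container it belongs
to, so work on one container does not wait for another.

```c
#include <pthread.h>

/* Hypothetical container: each one carries its own lock. */
struct container {
	const char *name;
	pthread_mutex_t lock;      /* guards nr_objects of this container only */
	unsigned int nr_objects;   /* number of objects created so far */
};

/* Set up an empty container with its private lock. */
void container_init(struct container *c, const char *name)
{
	c->name = name;
	c->nr_objects = 0;
	pthread_mutex_init(&c->lock, NULL);
}

/*
 * Create one object in the container and return the new object count.
 * Only this container's lock is taken, so callers working on other
 * containers are never blocked here.
 */
unsigned int container_create_object(struct container *c)
{
	unsigned int nr;

	pthread_mutex_lock(&c->lock);
	nr = ++c->nr_objects;
	pthread_mutex_unlock(&c->lock);

	return nr;
}
```

With a single cluster-wide lock, every call above would serialize on the
same mutex regardless of the container. In a real cluster the mutex would
have to be replaced by some distributed lock keyed by container, which this
sketch does not attempt to model.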
