Hi Sage,

We are trying to integrate a new proprietary key-value store into Ceph. To
integrate this KV store, which is a closed-source shared library, we propose a
new Ceph class called PropDBStore, which calls dlopen() on the library and
imports the required symbols. This framework will help in integrating
vendor-specific extensions into Ceph.

The gist of the implementation is as follows.

1. Implement a wrapper around the proprietary KV store. Let us call it
KVExtension. It is a shared library that implements all the interfaces required
by the Ceph KeyValueStore.
2. Derive a new class, PropDBStore, from KeyValueDB. It honors the semantics
of KeyValueStore and KeyValueDB and acts as the mediator between Ceph and
KVExtension, converting bufferlists and similar types to const char pointers
or strings that the extension can understand.
3. PropDBStore loads KVExtension with dlopen() during OSD initialization. The
path to KVExtension can be specified in ceph.conf.
4. The interfaces that KVExtension must implement, and that PropDBStore
imports, are declared in a new header called PropDBWrapper.h. This header
contains the signatures of the necessary interfaces such as init(), close(),
submit_transaction(), get() and get_iterator(). Similarly, for the iterator
functionality, PropDBIterator.h specifies the signatures of seek_to_first(),
seek_to_last(), lower_bound(), upper_bound() and so on. PropDBStore includes
these headers and imports the symbols using dlsym().
5. The proprietary DB is selected as the OSD backend through a config option
in /etc/ceph/ceph.conf, in the same way as rocksdb or leveldb.
6. The rest of the existing functionality is not disturbed by this change.
Changing the OSD backend option changes the backend implementation, but not
dynamically: the backend type must be chosen at OSD creation time, and the OSD
continues to use that backend until it is reformatted.
7. The new KV store we are trying to integrate works on a raw partition, so we
divide the OSD drive into two partitions. One partition holds the OSD metadata
(superblock, fsid, etc.), and the other is given to the new DB to manage. The
OSD partition is therefore no longer the entire disk, but only the 2-4 GB
needed for the metadata.
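
To make steps 3 and 4 more concrete, here is a minimal sketch of how the
loading path could look. It is only an illustration, not the actual patch: the
function-pointer types stand in for what PropDBWrapper.h would declare, and the
exported symbol names (kv_init, kv_close, kv_get), their signatures, and the
KVExtensionApi struct are all assumptions made up for this example.

```cpp
#include <dlfcn.h>

#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical C ABI of the extension. The real signatures would come from
// PropDBWrapper.h; these are placeholders for illustration only.
extern "C" {
typedef int (*kv_init_fn)(const char *options);
typedef void (*kv_close_fn)(void);
typedef int (*kv_get_fn)(const char *key, size_t key_len,
                         char **val, size_t *val_len);
}

// Table of entry points resolved from the shared library (assumed layout).
struct KVExtensionApi {
  void *handle = nullptr;
  kv_init_fn init = nullptr;
  kv_close_fn close = nullptr;
  kv_get_fn get = nullptr;
};

// Look up one symbol and throw if it is missing.
static void *resolve(void *handle, const char *name) {
  dlerror();  // clear any stale error state before calling dlsym()
  void *sym = dlsym(handle, name);
  const char *err = dlerror();
  if (err != nullptr || sym == nullptr) {
    throw std::runtime_error(std::string("missing symbol ") + name +
                             (err ? std::string(": ") + err : std::string()));
  }
  return sym;
}

// Open the library at 'path' (for example, a path read from ceph.conf) and
// import the required symbols. Throws std::runtime_error on any failure.
KVExtensionApi load_kv_extension(const std::string &path) {
  KVExtensionApi api;
  api.handle = dlopen(path.c_str(), RTLD_NOW | RTLD_LOCAL);
  if (api.handle == nullptr) {
    const char *err = dlerror();
    throw std::runtime_error("dlopen failed for " + path + ": " +
                             (err ? err : "unknown error"));
  }
  try {
    // The symbol names below are hypothetical.
    api.init = reinterpret_cast<kv_init_fn>(resolve(api.handle, "kv_init"));
    api.close = reinterpret_cast<kv_close_fn>(resolve(api.handle, "kv_close"));
    api.get = reinterpret_cast<kv_get_fn>(resolve(api.handle, "kv_get"));
  } catch (...) {
    dlclose(api.handle);  // do not leak the handle on a partial import
    throw;
  }
  return api;
}
```

In this sketch PropDBStore would call load_kv_extension() once during OSD
initialization and fail the OSD start if it throws. RTLD_NOW makes any
unresolved dependencies of the library show up at load time rather than on the
first call into it.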

Please share your thoughts on this.
Thanks,
Varada


