Hi,

On 24/02/2015 17:13, Somnath Roy wrote:
> Hi Loic,
> This is an effort to make the Ceph interface pluggable to any proprietary k/v db 
> available. The integrator has to implement a dynamically loadable shim layer 
> by implementing these interfaces. That shim layer can do whatever is specific 
> to their k/v db.
> Now, regarding our k/v db: yes, it is written with a flash backend in mind, 
> not HDD. This is the major difference from leveldb/rocksdb 
> etc. Our db reduces flash write amplification (WA) dramatically, and 
> performance should be similar to or better than rocksdb. 
> Also, I think there will be more proprietary dbs like this that people want to 
> integrate with Ceph, as I don't think leveldb/rocksdb will be able to 
> serve all kinds of workloads.

Thanks for sharing these details :-) Would this db be specific to a product 
line, for instance by making ioctl calls that only a specific driver for 
specific hardware would understand? Or is this a db that is designed to 
optimize workloads for flash drives using only standard and documented APIs or 
system calls?

> Thanks & Regards
> Somnath 
> 
> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Loic Dachary
> Sent: Tuesday, February 24, 2015 6:30 AM
> To: Varada Kari; Ceph Development
> Subject: Re: Adding a proprietary key value store to CEPH
> 
> Hi,
> 
> I'm curious about the reasons why the key/value store you mention is not 
> published as Free Software. Is it because it implements a proprietary 
> interface to specific hardware? Because it has additional functionality 
> compared to rocksdb etc.? Because it performs better under some workloads?
> 
> Cheers
> 
> On 24/02/2015 14:20, Varada Kari wrote:
>> Hi Sage,
>>
>> We are trying to integrate a new proprietary key value store into CEPH. To 
>> integrate this KV-store, which is a closed source shared library, we propose 
>> a new class in CEPH called PropDBStore which does a dlopen and imports the 
>> required symbols. This framework will help in integrating vendor-specific 
>> extensions into CEPH.
>>
>> The gist of the implementation is as follows.
>>
>> 1. Implement a wrapper around the proprietary KVStore. Let us call it 
>> KVExtension. This is a shared library which implements all interfaces 
>> required by CEPH KeyValueStore.
>> 2. A new class called PropDBStore is derived from KeyValueDB and honors 
>> the semantics of KeyValueStore and KeyValueDB. This class acts as a mediator 
>> between CEPH and KVExtension. It transforms bufferlist etc. to 
>> const char pointers or strings for the extension to understand.
>> 3. PropDBStore loads (dlopen) the KVExtension during OSD initialization. 
>> The path to the KVExtension can be specified in ceph.conf.
>> 4. The interfaces that need to be implemented in KVExtension and are 
>> imported by PropDBStore are declared in a new header called 
>> PropDBWrapper.h. This header contains the signatures for the necessary 
>> interfaces like init(), close(), submit_transaction(), get() and 
>> get_iterator(). Similarly, for iterator functionality, PropDBIterator.h 
>> specifies the signatures of seek_to_first(), seek_to_last(), 
>> lower_bound(), upper_bound() etc. PropDBStore includes these headers 
>> and imports the symbols using dlsym().
>> 5. Choosing the proprietary DB as the backend of the OSD is controlled 
>> by ceph config options (/etc/ceph/ceph.conf), in the same way as for 
>> rocksdb or leveldb.
>> 6. The rest of the existing functionality is not disturbed by this change. 
>> Changing the osd backend option will change the backend implementation, but 
>> this change is not dynamic. The type of backend should be chosen at osd 
>> creation time, and the osd will continue to use that backend until it is 
>> reformatted.
>> 7. The new KVStore we are trying to integrate works on a raw partition, so 
>> we divided the osd drive into two partitions. One partition is given to osd 
>> metadata (super block, fsid etc.), and the other is given to the new db 
>> to manage. The OSD partition is now not the entire disk, but the 2-4GB 
>> needed for the metadata.
>>
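
To make steps 3 and 4 above concrete, here is a rough sketch of what the 
dlopen()/dlsym() loading could look like. It is only an illustration under 
assumptions, not the proposed PropDBStore code: the KVExtensionOps struct, the 
KVExtensionLoader class, the "propdb_*" symbol names and the function 
signatures are all hypothetical, since the thread does not show the contents 
of PropDBWrapper.h. Only dlopen(), dlsym(), dlerror() and dlclose() are real 
POSIX calls.

```cpp
// Sketch only: every name except the dl* calls is hypothetical.
#include <dlfcn.h>

#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical C-style entry points a KVExtension might export.
struct KVExtensionOps {
  int (*init)(const char* options) = nullptr;
  int (*close)() = nullptr;
  int (*get)(const char* key, std::size_t key_len,
             char** value, std::size_t* value_len) = nullptr;
};

// Owns the dlopen() handle and the resolved function pointers.
class KVExtensionLoader {
 public:
  KVExtensionLoader() = default;
  KVExtensionLoader(const KVExtensionLoader&) = delete;
  KVExtensionLoader& operator=(const KVExtensionLoader&) = delete;
  ~KVExtensionLoader() { unload(); }

  // Loads the library at `path` and resolves every required symbol.
  // Returns false and fills `*err` if the library or a symbol is missing.
  bool load(const std::string& path, std::string* err) {
    assert(err != nullptr);
    unload();
    handle_ = dlopen(path.c_str(), RTLD_NOW | RTLD_LOCAL);
    if (handle_ == nullptr) {
      const char* e = dlerror();
      *err = e ? e : "dlopen failed";
      return false;
    }
    // Hypothetical symbol names; a real extension would define its own.
    if (!resolve("propdb_init", &ops_.init, err) ||
        !resolve("propdb_close", &ops_.close, err) ||
        !resolve("propdb_get", &ops_.get, err)) {
      unload();
      return false;
    }
    return true;
  }

  bool loaded() const { return handle_ != nullptr; }
  const KVExtensionOps& ops() const { return ops_; }

 private:
  // Looks up one symbol and stores it as a typed function pointer.
  template <typename Fn>
  bool resolve(const char* name, Fn* out, std::string* err) {
    dlerror();  // clear any stale error before calling dlsym()
    void* sym = dlsym(handle_, name);
    const char* e = dlerror();
    if (e != nullptr || sym == nullptr) {
      *err = e ? std::string(e) : std::string("missing symbol: ") + name;
      return false;
    }
    *out = reinterpret_cast<Fn>(sym);
    return true;
  }

  // Closes the library (if open) and resets all function pointers.
  void unload() {
    if (handle_ != nullptr) {
      dlclose(handle_);
      handle_ = nullptr;
    }
    ops_ = KVExtensionOps{};
  }

  void* handle_ = nullptr;
  KVExtensionOps ops_;
};
```

With something like this, the OSD would call load() once at initialization 
with the path taken from ceph.conf and refuse to start if it returns false, 
so that a missing or incomplete extension is caught before any I/O is issued.
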
>> Please share your thoughts around this.
>> Thanks,
>> Varada
> 
> --
> Loïc Dachary, Artisan Logiciel Libre
> 

-- 
Loïc Dachary, Artisan Logiciel Libre
