I suppose we just need to adhere to the indexing scheme that is
followed by on-disk key-value DBs.

According to [1], MariaDB has a LevelDB-based storage engine, and the
page also explains how indexing is handled on top of LevelDB for its use.
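To make the indexing point a bit more concrete, here is a minimal sketch of how a lexicographically sorted key space can act as an index. It is not MB or LevelDB code: a java.util.TreeMap stands in for the on-disk store, and the class name PrefixScanSketch, the queue names and the stored values are all made up for illustration. The keys follow the QUEUE.$queue_name.$message_id.MESSAGE_METADATA pattern proposed earlier in this thread. With LevelDB itself the same idea would presumably use the iterator's seek(byte[]) call instead of tailMap().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: a TreeMap stands in for the sorted on-disk store.
class PrefixScanSketch {

    // Keys are kept in lexicographic order, which is the property we rely on.
    private final TreeMap<String, String> store = new TreeMap<>();

    void put(String key, String value) {
        store.put(key, value);
    }

    // Jump straight to the first key >= prefix and stop at the first key
    // that no longer starts with it. Keys of other queues that sort before
    // the prefix are never visited.
    List<String> valuesWithPrefix(String prefix) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> e : store.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // left the contiguous block of matching keys
            }
            result.add(e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        PrefixScanSketch db = new PrefixScanSketch();
        db.put("QUEUE.billing.0001.MESSAGE_METADATA", "billing-1");
        db.put("QUEUE.orders.0001.MESSAGE_METADATA", "orders-1");
        db.put("QUEUE.orders.0002.MESSAGE_METADATA", "orders-2");
        // Only the two "orders" entries are returned.
        System.out.println(db.valuesWithPrefix("QUEUE.orders."));
    }
}
```

The trailing dot in the prefix matters: without it, a queue named "orders2" would also match a scan for "orders".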

Also, I believe we don't need to stick with LevelDB; we just need to
evaluate the possibilities of integrating with a key-value DB, as
such stores are used by most broker vendors for their performance. The
goal would be to eventually develop a storage engine tuned
specifically for the brokering domain. We could also adopt some of the
techniques used by Kafka to allow fast message transmission to its
consumers. AFAIK they leverage the Java FileChannel transferTo API [2]
to feed the file's data directly to its consumers, instead of reading the
byte data into the application.
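As a rough illustration of the transferTo idea, the sketch below sends a file to a channel without copying its bytes through an application buffer. It only uses java.nio from the JDK; the class name ZeroCopySketch and the method sendFile are hypothetical, and this is not how Kafka's code is actually structured. Whether the transfer is truly zero-copy depends on the OS and on the target channel type.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch of sending a stored message file to a consumer channel.
class ZeroCopySketch {

    // Transfers the whole file at 'source' to 'target' and returns the
    // number of bytes that were moved.
    static long sendFile(Path source, WritableByteChannel target) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ)) {
            long position = 0;
            long size = in.size();
            // transferTo may move fewer bytes than requested, so loop.
            while (position < size) {
                long sent = in.transferTo(position, size - position, target);
                if (sent <= 0) {
                    break; // nothing transferred (e.g. the file shrank); stop
                }
                position += sent;
            }
            return position;
        }
    }
}
```

In a broker the target would typically be the consumer's SocketChannel, so the application never has to read the message bytes into its own buffers.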

Also, given that transactions are vital for a broker, IMO it would be better
to do the performance evaluation after implementing transactions. Since they
are a core part of the feature, we cannot omit transactional commits.

[1] https://mariadb.com/kb/en/mariadb/leveldb/
[2] http://docs.oracle.com/javase/6/docs/api/java/nio/channels/FileChannel.html#transferTo(long, long, java.nio.channels.WritableByteChannel)

On Wed, Aug 16, 2017 at 12:08 PM, Asanka Abeyweera <asank...@wso2.com>
wrote:

> Hi Wishmitha,
>
> On Tue, Aug 15, 2017 at 5:34 PM, Wishmitha Mendis <wishmi...@wso2.com>
> wrote:
>
>> Hi Asanka,
>>
>> 1. We can initially use a Network File System as you mentioned for HA.
>> However data replication in LevelDB is used in ActiveMQ replicated store.
>> [1]
>>
>
> Then it is better to run the performance test with an NFS to really compare
> the results.
>
>
>> 2. Yes, the iterator should traverse all the related keys to get message
>> data. This is why the key schema is designed in such a way that all the
>> messages in a queue are stored successively, to reduce the traversal time.
>> As you said, it is much more complex than the RDBMS, as the data cannot
>> be retrieved by simply executing a query. However, LevelDB can still
>> provide much faster data retrieval rates than an RDBMS due to its high
>> performance. Hence, even though the data retrieval operation is complex,
>> it is not much of an overhead when it comes to overall performance.
>>
>
> What happens if we have a lot of messages for a queue/queues which are stored
> before the queue of interest? Wouldn't we have to traverse until we find
> that queue? If my understanding is correct, that will result in
> consuming a considerable amount of resources (CPU, bandwidth, etc.). An RDBMS
> gets around this by maintaining indexes. Do we have something similar in
> LevelDB?
>
>
>>
>> (Additional: most RDBMS engines use file-based stores underneath.
>> As an example, LevelDB is used as a database engine in MariaDB. [2] Hence,
>> even when executing a query in an RDBMS, this kind of traversal operation
>> may occur underneath.)
>>
>> 3. Data cannot be inserted while traversing. Actually, traversing through
>> the keys is done by an iterator, which should eventually be closed after the
>> operation is completed. Sample code is shown below. [3]
>>
>> DBIterator iterator = db.iterator();
>> try {
>>   for(iterator.seekToFirst(); iterator.hasNext(); iterator.next()) {
>>     String key = asString(iterator.peekNext().getKey());
>>     String value = asString(iterator.peekNext().getValue());
>>     System.out.println(key+" = "+value);
>>   }
>> } finally {
>>   // Make sure you close the iterator to avoid resource leaks.
>>   iterator.close();
>> }
>>
>> These iterators are mainly used in methods such as getMetaDataList() and
>> deleteMessages() in the implementation. The iterator should be closed in
>> those methods, as displayed in the above code.
>>
>
> Isn't that a bottleneck when there are concurrent consumers and
> publishers? Data writing threads will have to wait until the data reading
> thread is done which can result in lower performance numbers when there are
> multiple queues with multiple consumers and publishers.
>
>
>>
>> 4. Yes, this will be a performance limitation. The throughput gets
>> considerably lower when publishing/retrieving messages in a multi-threaded
>> environment. Even though, according to the test results, LevelDB is capable
>> of providing higher throughput than an RDBMS even in a multi-threaded
>> environment, concurrent access to the DB can be a bottleneck. The main
>> purpose of this PoC is actually to develop a generic key schema, so that we
>> can switch between file-based stores and select the optimal one for the
>> message broker.
>>
>> 5. LevelDB does not have built-in transaction support. Therefore
>> transactions should be implemented as an external layer within the
>> application. Currently I am working on this and exploring how transactions
>> are implemented in the ActiveMQ LevelDB store. [4] I will post a separate
>> thread on LevelDB transactions.
>>
>
> Maybe before implementing a transaction layer, we should do a proper
> performance test and decide on the path forward.
>
> WDYT?
>
>
>>
>>
>> [1] http://activemq.apache.org/replicated-leveldb-store.html
>> [2] https://mariadb.com/kb/en/mariadb/leveldb/
>> [3] https://github.com/fusesource/leveldbjni
>> [4] https://github.com/apache/activemq/tree/master/activemq-leveldb-store
>>
>> On Tue, Aug 15, 2017 at 11:21 AM, Asanka Abeyweera <asank...@wso2.com>
>> wrote:
>>
>>> Hi Wishmitha,
>>>
>>>    1. How are we going to support HA deployment with LevelDB? Are we
>>>    going to use a network file system or replicate data?
>>>    2. If we wanted to get a set of messages matching a given queue ID,
>>>    do we have to traverse all messages to get that? In an RDBMS this is
>>>    easier to do with a WHERE clause.
>>>    3. What happens if we insert data while traversing?
>>>    4. It seems "*only a single process (possibly multi-threaded) can
>>>    access a particular database at a time*"[1] in LevelDB. Will this be
>>>    a bottleneck when we need to access the DB concurrently?
>>>    5. How are transactions handled in LevelDB? When
>>>    implementing the distributed transactions feature we required row-level
>>>    locking instead of table-level locking. Does LevelDB support that?
>>>
>>> [1] https://github.com/google/leveldb
>>>
>>> On Tue, Aug 15, 2017 at 10:47 AM, Wishmitha Mendis <wishmi...@wso2.com>
>>> wrote:
>>>
>>>> Hi Sumedha,
>>>>
>>>> The Java library for LevelDB (leveldbjni) creates the database as
>>>> follows, as described in its docs. [1]
>>>>
>>>> Options options = new Options();
>>>> DB db = factory.open(new File("example"), options);
>>>>
>>>> This will create the database in a directory at the given path. The
>>>> library docs mention that the library supports several platforms
>>>> by default, without specific configuration. Therefore using this library
>>>> does not require shipping LevelDB, and it also won't take away the
>>>> platform-agnostic installation capability of MB. However, the
>>>> implementation has so far been tested only on Linux; I will test it on
>>>> Windows and other platforms and let you know.
>>>>
>>>> When considering the LevelDB architecture, it is already used as a
>>>> broker store in ActiveMQ. [2] [3] This proves that LevelDB has the
>>>> architectural capability to efficiently insert and delete messages in a
>>>> broker.
>>>>
>>>> [1] https://github.com/fusesource/leveldbjni
>>>> [2] http://activemq.apache.org/leveldb-store.html
>>>> [3] https://github.com/apache/activemq/tree/master/activemq-leveldb-store
>>>>
>>>> Best Regards,
>>>>
>>>> On Tue, Aug 15, 2017 at 2:29 AM, Sumedha Rubasinghe <sume...@wso2.com>
>>>> wrote:
>>>>
>>>>> Hi Wishmitha,
>>>>> Would leveldb architecture be efficient for a message broker where
>>>>> removing delivered messages is very frequent?
>>>>>
>>>>> This requires WSO2 Message Broker to ship leveldb. leveldb (
>>>>> https://github.com/google/leveldb) has native distributions for
>>>>> platforms. AFAIC this will take away platform agnostic installation
>>>>> capability of MB.
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Aug 15, 2017 at 2:20 AM, Wishmitha Mendis <wishmi...@wso2.com>
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I am working on a project to replace the current RDBMS-based database
>>>>>> of the message broker store with a file-based database system. Currently
>>>>>> the implementation is carried out in LevelDB, which is a key-value
>>>>>> data store. The following is an explanation of the suggested key schema
>>>>>> for the data store, with the related design decisions.
>>>>>>
>>>>>> *Overview :*
>>>>>>
>>>>>> LevelDB is a key-value database where a value can be stored
>>>>>> under a certain unique key. This key-value mapping is one-directional,
>>>>>> which means a value can only be retrieved by accessing the corresponding
>>>>>> key. One of the main features of LevelDB is that it stores keys in
>>>>>> lexicographically (alphabetically) sorted order. All keys and values
>>>>>> are stored as byte arrays in the database store, and should be converted
>>>>>> to string format accordingly within the application.
>>>>>>
>>>>>> For this LevelDB store implementation, leveldbjni-1.8 [1] is used. It
>>>>>> provides a Java-based API for LevelDB with the following main
>>>>>> functionalities.
>>>>>>
>>>>>>    1. put(key,value) : stores the given value under the provided key
>>>>>>    2. get(key) : returns the value corresponding to the key
>>>>>>    3. delete(key) : deletes the given key
>>>>>>    4. batch() : provides atomicity for the operations
>>>>>>    5. iterator() : traverses through the stored keys
>>>>>>
>>>>>>
>>>>>> When designing the key schema in LevelDB, the following factors are
>>>>>> mainly considered.
>>>>>>
>>>>>>    1. Lexicographical order of the stored keys
>>>>>>    2. Traversing through the keys
>>>>>>    3. Data organization
>>>>>>
>>>>>>
>>>>>> *Key Schema :*
>>>>>>
>>>>>> The key schema implementation was carried out for the following tables
>>>>>> of the current RDBMS database.
>>>>>>
>>>>>> [image: Screenshot from 2017-08-14 01-13-33.png]
>>>>>>
>>>>>> The key schema is mainly designed by analyzing the queries implemented
>>>>>> for data retrieval and insertion in the RDBMS. The key schema for the
>>>>>> above three tables is shown in the table below.
>>>>>>
>>>>>>
>>>>>> [image: Screenshot from 2017-08-15 02-11-24.png]
>>>>>>
>>>>>> *Key : Value*
>>>>>>     *Purpose*
>>>>>>
>>>>>> MESSAGE.$message_id.QUEUE_ID : queue_id
>>>>>>     Stores the queue id of the message.
>>>>>>
>>>>>> MESSAGE.$message_id.DLC_QUEUE_ID : dlc_queue_id
>>>>>>     Stores the dlc queue id of the message.
>>>>>>
>>>>>> MESSAGE.$message_id.MESSAGE_METADATA : message_metadata
>>>>>>     Stores the metadata of the message.
>>>>>>
>>>>>> MESSAGE.$message_id.$content_offset.MESSAGE_CONTENT : message_content
>>>>>>     Stores the message content for a given offset of the message.
>>>>>>
>>>>>> QUEUE.$queue_id.QUEUE_NAME : queue_name
>>>>>>     Stores the name of the queue under the id.
>>>>>>
>>>>>> QUEUE.$queue_name.QUEUE_ID : queue_id
>>>>>>     Stores the id of the queue under the name.
>>>>>>
>>>>>> QUEUE.$queue_name.$message_id.MESSAGE_METADATA : message_metadata
>>>>>>     Stores the metadata of the messages which belong to the queue.
>>>>>>
>>>>>> LAST_MESSAGE_ID
>>>>>>     Stores the last message id.
>>>>>>
>>>>>> LAST_QUEUE_ID
>>>>>>     Stores the last queue id.
>>>>>>
>>>>>> As can be seen, data repetition is higher when using this
>>>>>> schema. That is mainly due to the one-directional key-value mapping of
>>>>>> LevelDB. As an example, two keys (QUEUE.$queue_id.QUEUE_NAME,
>>>>>> QUEUE.$queue_name.QUEUE_ID) are required to build the bidirectional
>>>>>> relation (get queue name given queue id, and get queue id given queue
>>>>>> name) between the queue name and the queue id. As LevelDB has better
>>>>>> write performance than an RDBMS, data repetition may not be much of an
>>>>>> overhead when inserting data. Moreover, batch operations can be used for
>>>>>> multiple insertions.
>>>>>>
>>>>>> The main purpose of using prefixes like MESSAGE and QUEUE in keys
>>>>>> is to organize them properly. As LevelDB stores keys lexicographically,
>>>>>> these prefixes make sure that message-related and queue-related keys
>>>>>> are stored separately, as displayed below. The following shows the keys
>>>>>> of the LevelDB store after publishing a JMS message to the broker. It can
>>>>>> be clearly seen that the keys are stored in lexicographical order.
>>>>>>
>>>>>> [image: Screenshot from 2017-08-14 19-57-13.png]
>>>>>>
>>>>>> Organizing keys in such a manner also improves the efficiency of
>>>>>> traversing the keys using iterators when retrieving and deleting data. As
>>>>>> displayed in the diagram below, iterators traverse by starting from the
>>>>>> first stored key in the store. When the iterator head reaches a key, it
>>>>>> can move to either the next key or the previous key (similar to a doubly
>>>>>> linked list). Hence storing related keys successively improves the
>>>>>> efficiency of traversing when retrieving and deleting data by reducing
>>>>>> the seeking time.
>>>>>>
>>>>>>
>>>>>> [image: Screenshot from 2017-08-15 02-11-40.png]
>>>>>>
>>>>>>
>>>>>> Basically, these are the factors and decisions which have been taken
>>>>>> into account in implementing this key schema. The schema should also be
>>>>>> extended to provide functionalities like storing message expiration data.
>>>>>> It would be great to have feedback on the proposed schema, especially
>>>>>> regarding how to reduce data repetition and improve efficiency further.
>>>>>>
>>>>>> [1] https://github.com/fusesource/leveldbjni
>>>>>>
>>>>>>
>>>>>> Best Regards,
>>>>>> --
>>>>>>
>>>>>> *Wishmitha Mendis*
>>>>>>
>>>>>> *Intern - Software Engineering*
>>>>>> *WSO2*
>>>>>>
>>>>>> *Mobile : +94 777577706*
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Architecture mailing list
>>>>>> Architecture@wso2.org
>>>>>> https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> *Wishmitha Mendis*
>>>>
>>>> *Intern - Software Engineering*
>>>> *WSO2*
>>>>
>>>> *Mobile : +94 777577706*
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Asanka Abeyweera
>>> Senior Software Engineer
>>> WSO2 Inc.
>>>
>>> Phone: +94 712228648
>>> Blog: a5anka.github.io
>>>
>>> <https://wso2.com/signature>
>>>
>>>
>>>
>>
>>
>> --
>>
>> *Wishmitha Mendis*
>>
>> *Intern - Software Engineering*
>> *WSO2*
>>
>> *Mobile : +94 777577706*
>>
>>
>>
>
>
> --
> Asanka Abeyweera
> Senior Software Engineer
> WSO2 Inc.
>
> Phone: +94 712228648
> Blog: a5anka.github.io
>
> <https://wso2.com/signature>
>



-- 
*Pamod Sylvester *

*WSO2 Inc.; http://wso2.com*
cell: +94 77 7779495
