Hi Pamod,

Ah yes, I agree. I thought you were suggesting a DB operation without any
content duplication. Yes, with per-node content duplication we can do a
content clean-up job at start-up and stick with in-memory reference
counting. BTW, depending on the message count for topics, the start-up time
will vary though.
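A minimal sketch of the per-node, in-memory reference counting we are
converging on (class and method names here are hypothetical, not the actual
Andes API):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch only: per-node, in-memory reference counting for topic message
// content. One count per message id, local to this node, since content is
// duplicated per node. All names are hypothetical.
public class ContentRefCounter {

    private final Map<Long, AtomicInteger> counts = new ConcurrentHashMap<>();

    // Called once per metadata copy inserted on this node (one per local subscriber).
    public void increment(long messageId) {
        counts.computeIfAbsent(messageId, id -> new AtomicInteger()).incrementAndGet();
    }

    // Called when an ack arrives; returns true when the local content copy
    // can be deleted (no more local subscribers reference it).
    public boolean decrementAndCheckRemovable(long messageId) {
        AtomicInteger c = counts.get(messageId);
        if (c == null) {
            return false;
        }
        if (c.decrementAndGet() == 0) {
            counts.remove(messageId);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        ContentRefCounter counter = new ContentRefCounter();
        counter.increment(42L); // two subscribers on this node
        counter.increment(42L);
        System.out.println(counter.decrementAndCheckRemovable(42L)); // false
        System.out.println(counter.decrementAndCheckRemovable(42L)); // true
    }
}
```

Since the count is purely local and in memory, the start-up clean-up job
discussed below is what covers the node-killed case.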
+1 for in-memory reference counting with a content clean-up job at startup.

Thanks,

On Sun, Oct 5, 2014 at 2:29 PM, Pamod Sylvester <[email protected]> wrote:

> Hi Asitha,
>
> I agree the content should be written before the meta data. What I meant
> was not having a separate process to do the content clean up, rather
> going with the solution proposed by Hasitha where the message count will
> be maintained in memory instead of the DB.
>
> Also, if we're going to duplicate both the message content and meta data
> per node it should not affect this, as initially mentioned. Instead of
> duplicating, if we're going to share the content among all the nodes,
> then we cannot maintain a local reference anyhow, since even if the
> reference count goes to 0 locally there will be other nodes with
> subscribers referring to the same content.
>
> The solution I proposed was to address the problem of losing the
> in-memory counts when the node gets killed. If a node was killed and the
> in-memory reference was lost, when the node is restarted it will first
> check for the ids which have not been purged, through a comparison
> between the meta data and the content, and will do the needful.
>
> Thanks,
> Pamod
>
>
> On Sun, Oct 5, 2014 at 1:01 PM, Asitha Nanayakkara <[email protected]>
> wrote:
>
>> Hi Pamod,
>>
>> In a clustered set-up, while other nodes are running, they store message
>> content for a topic first and then store the message meta data. This is
>> not done atomically. While this is happening, if we start another node
>> with logic to scan the database and delete inconsistent content, it
>> will pick up some of the new topic messages that have stored content
>> but are still in the process of storing metadata, and the process will
>> delete that content too. This will leave the database with messages
>> that have meta data but no corresponding content.
>> I think there is a possibility of this happening if there is a working
>> cluster with topic messages being published at a high rate with high
>> concurrency (publishing) and a new node is started at the same time.
>> Correct me if I'm wrong.
>>
>> Yes, for each message we will have to store content, store metadata and
>> update the reference count. But we can increment the reference count per
>> message, not per duplicate metadata (since we know how many duplicates
>> of metadata we need). If there is a bigger performance hit due to the DB
>> update call, it's better to go with the in-memory approach rather than
>> trying to clean the content at start-up, I guess.
>>
>> Thanks.
>>
>> On Sun, Oct 5, 2014 at 12:20 PM, Pamod Sylvester <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> How would this approach impact performance? This will result in a DB
>>> operation each time a message is published as well as each time the
>>> subscriber acks?
>>>
>>> I agree with you on the fact that maintaining the counters in-memory
>>> could result in message content being persisted in the DB with no way
>>> of deleting it if the node gets killed.
>>>
>>> Also, what would be the possibility of checking, at node start-up,
>>> which message content needs to be deleted? There should be a comparison
>>> between the meta data and the content column families; all the ids
>>> which are in the content table but not in the meta data CF should be
>>> purged:
>>>
>>> {MessageContentCF} \ {MessageMetaData} = Message Content to be deleted.
>>>
>>> This can affect the start-up time of the node, but IMO it will not
>>> affect the performance of the main flows.
>>>
>>> WDYT?
>>>
>>> Thanks,
>>> Pamod
>>>
>>> On Sun, Oct 5, 2014 at 11:09 AM, Asitha Nanayakkara <[email protected]>
>>> wrote:
>>>
>>>> Hi Hasitha,
>>>>
>>>> In this approach, if a node with a reference count gets killed, all
>>>> the details regarding reference counts are lost, right? Is there a
>>>> way to delete the content?
>>>>
>>>> Btw, what if we keep the reference count in the database? Something
>>>> similar to what we have for queue message counting now (we create a
>>>> counter when a queue is created and then increment/decrement the
>>>> count when messages are received and sent).
>>>>
>>>> What I suggest is: when a topic message is created, we add a
>>>> reference counter for the message (via a new AndesContextStore
>>>> method, createReferenceCounter(long messageID)); when meta data is
>>>> duplicated we increment the counter; when an acknowledgment is
>>>> received we decrement the counter (two methods in the context store
>>>> to increment/decrement counts). And we will have a scheduled task to
>>>> periodically check for messages with a reference count of zero and
>>>> delete their content. This way, by creating a separate insert
>>>> statement to create a ref counter and a separate statement to update
>>>> the count, we can overcome writing vendor-specific SQL queries for
>>>> reference counting (for RDBMS). Since the idea is to recommend
>>>> Cassandra for the MessageStore and an RDBMS for the
>>>> AndesContextStore, we would be better off that way. Plus, this will
>>>> avoid the need to track reference counts in memory, avoiding losing
>>>> the reference counts when a node gets killed. WDYT?
>>>>
>>>> Thanks
>>>>
>>>> On Sun, Oct 5, 2014 at 6:57 AM, Hasitha Hiranya <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Team,
>>>>>
>>>>> Following is my vision on integrating topics into MB:
>>>>>
>>>>> >> we duplicate metadata per subscriber. It will not create a big
>>>>> overhead.
>>>>> >> we do not duplicate content per subscriber, but we duplicate
>>>>> content per node
>>>>> >> I hereby assume that we do handle acks for topics. We need
>>>>> research on that.
>>>>>
>>>>> When a topic subscriber is created:
>>>>> 1. qpid creates a temp queue.
>>>>> 2. qpid creates a binding for that queue to the topic exchange,
>>>>> using the topic name as the binding key.
>>>>> 3. qpid creates a subscription for the temp queue.
>>>>>
>>>>> When a topic subscriber is closed, qpid does the above 3 things in
>>>>> reverse order.
>>>>>
>>>>> Adhering to this model:
>>>>>
>>>>> 1. We store metadata in the same way we do for normal queues.
>>>>> 2. We use the same SlotDelivery worker and the flusher. There is
>>>>> NOTHING called a topic delivery worker.
>>>>> 3. When showing in the UI, we filter the durable ones and show them.
>>>>> 4. When a subscriber closes, the queue is deleted. We do the same
>>>>> thing as for normal queues.
>>>>> 5. Whenever we insert metadata, we duplicate metadata for each temp
>>>>> queue (per subscriber). We know the nodes where subscribers lie, so
>>>>> we can duplicate content for those nodes (one copy per node).
>>>>> 6. We need to introduce new per-subscriber tracking in the on-flight
>>>>> message tracker, which is common for queues as well. When a metadata
>>>>> entry is inserted for a message id, we increase a count. When an ack
>>>>> comes for that metadata, we decrement it. If it reaches zero, the
>>>>> content is ready to be removed. We do not track this count globally,
>>>>> as we have a copy of the content per node. Thus the reference count
>>>>> does not need to be global; it is local, in-memory tracking.
>>>>> 7. Queue change handler - if delete - execute the normal delete
>>>>> (remove all metadata) and decrement the reference counts. The thread
>>>>> that deletes content will detect that and will delete offline. This
>>>>> way, content is removed only once all subscribers are gone.
>>>>> 8. We should be careful about hierarchical topics. We use our maps
>>>>> to identify queues bound to a topic. MQTT/AMQP confusion should be
>>>>> solved there.
>>>>>
>>>>> *Thanks*
>>>>>
>>>>> --
>>>>> *Hasitha Abeykoon*
>>>>> Senior Software Engineer; WSO2, Inc.; http://wso2.com
>>>>> *cell:* *+94 719363063*
>>>>> *blog: **abeykoon.blogspot.com* <http://abeykoon.blogspot.com>
>>>>
>>>> --
>>>> *Asitha Nanayakkara*
>>>> Software Engineer
>>>> WSO2, Inc.
>>>> http://wso2.com/
>>>> Mob: +94 77 85 30 682
>>>
>>> --
>>> *Pamod Sylvester*
>>> *Senior Software Engineer*
>>> Integration Technologies Team, WSO2 Inc.; http://wso2.com
>>> email: [email protected] cell: +94 77 7779495
>>
>> --
>> *Asitha Nanayakkara*
>> Software Engineer
>> WSO2, Inc. http://wso2.com/
>> Mob: +94 77 85 30 682
>
> --
> *Pamod Sylvester*
> *Senior Software Engineer*
> Integration Technologies Team, WSO2 Inc.; http://wso2.com
> email: [email protected] cell: +94 77 7779495

-- 
*Asitha Nanayakkara*
Software Engineer
WSO2, Inc. http://wso2.com/
Mob: +94 77 85 30 682
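For reference, the start-up clean-up job discussed in the thread —
{MessageContentCF} \ {MessageMetaData} = content to be purged — amounts to
a set difference over the stored ids. A minimal sketch, with the two column
families modelled as plain id sets and all names hypothetical:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the start-up content purge: any id present in the content
// store but absent from the metadata store is an orphan and can be
// deleted. Stores are modelled as id sets; names are hypothetical.
public class StartupContentPurger {

    // Returns {content ids} \ {metadata ids}: orphaned content to delete.
    public static Set<Long> orphanedContentIds(Set<Long> contentIds,
                                               Set<Long> metadataIds) {
        Set<Long> orphans = new HashSet<>(contentIds);
        orphans.removeAll(metadataIds);
        return orphans;
    }

    public static void main(String[] args) {
        // id 1 has content but no metadata left -> purge it at start-up.
        Set<Long> content = new HashSet<>(Set.of(1L, 2L, 3L));
        Set<Long> metadata = new HashSet<>(Set.of(2L, 3L));
        System.out.println(orphanedContentIds(content, metadata)); // [1]
    }
}
```

As noted above, this scan runs once at node start-up, so its cost grows
with the number of stored topic messages but stays off the publish/ack
hot path. The race Asitha raised still applies: a message whose content is
written but whose metadata insert is in flight would look like an orphan
to a concurrently scanning node.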
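Asitha's alternative (a DB-backed counter in AndesContextStore, kept
vendor-neutral by using one plain INSERT to create the counter row and
plain UPDATEs to change it) can also be sketched. The interface and SQL
below are illustrative assumptions, not the actual Andes context store API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a DB-backed reference counter. The SQL strings
// show the vendor-neutral statements the thread describes; the in-memory
// map stands in for the database so the sketch is runnable.
public class DbRefCounterSketch {

    // Plain ANSI SQL: one statement per operation, no vendor-specific upsert.
    static final String CREATE_SQL =
            "INSERT INTO MB_REF_COUNT (MESSAGE_ID, REF_COUNT) VALUES (?, 0)";
    static final String INCREMENT_SQL =
            "UPDATE MB_REF_COUNT SET REF_COUNT = REF_COUNT + 1 WHERE MESSAGE_ID = ?";
    static final String DECREMENT_SQL =
            "UPDATE MB_REF_COUNT SET REF_COUNT = REF_COUNT - 1 WHERE MESSAGE_ID = ?";

    private final Map<Long, Integer> table = new HashMap<>(); // stands in for the DB

    public void createReferenceCounter(long messageId) { table.put(messageId, 0); }

    public void increment(long messageId) { table.merge(messageId, 1, Integer::sum); }

    public void decrement(long messageId) { table.merge(messageId, -1, Integer::sum); }

    // What the periodic clean-up task would check before deleting content.
    public boolean isRemovable(long messageId) {
        return table.getOrDefault(messageId, 0) <= 0;
    }

    public static void main(String[] args) {
        DbRefCounterSketch store = new DbRefCounterSketch();
        store.createReferenceCounter(9L);
        store.increment(9L); // metadata duplicated once
        System.out.println(store.isRemovable(9L)); // false
        store.decrement(9L); // ack received
        System.out.println(store.isRemovable(9L)); // true
    }
}
```

The trade-off the thread settled on: this survives a node being killed,
but adds a DB round-trip per publish and per ack, which is why the +1 went
to the in-memory counter plus start-up purge instead.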
_______________________________________________
Dev mailing list
[email protected]
http://wso2.org/cgi-bin/mailman/listinfo/dev
