Hi Pamod,

Ah yes, I agree. I thought you were suggesting a DB operation without any
content duplication. Yes, with per-node content duplication we can do a
content clean-up job at start-up and stick with in-memory reference
counting. BTW, depending on the message count for topics, the start-up time
will vary though.
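A minimal sketch of the per-node, in-memory reference counting we are
converging on (class and method names here are hypothetical, not the actual
Andes API):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch only: per-node, in-memory reference counting for topic message
// content. One count per message id, local to this node, since content is
// duplicated per node. All names are hypothetical.
public class ContentRefCounter {

    private final Map<Long, AtomicInteger> counts = new ConcurrentHashMap<>();

    // Called once per metadata copy inserted on this node (one per local subscriber).
    public void increment(long messageId) {
        counts.computeIfAbsent(messageId, id -> new AtomicInteger()).incrementAndGet();
    }

    // Called when an ack arrives; returns true when the local content copy
    // can be deleted (no more local subscribers reference it).
    public boolean decrementAndCheckRemovable(long messageId) {
        AtomicInteger c = counts.get(messageId);
        if (c == null) {
            return false;
        }
        if (c.decrementAndGet() == 0) {
            counts.remove(messageId);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        ContentRefCounter counter = new ContentRefCounter();
        counter.increment(42L); // two subscribers on this node
        counter.increment(42L);
        System.out.println(counter.decrementAndCheckRemovable(42L)); // false
        System.out.println(counter.decrementAndCheckRemovable(42L)); // true
    }
}
```

Since the count is purely local and in memory, the start-up clean-up job
discussed below is what covers the node-killed case.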
+1 for in-memory reference counting with a content clean-up job at startup.

Thanks,

On Sun, Oct 5, 2014 at 2:29 PM, Pamod Sylvester <[email protected]> wrote:

> Hi Asitha,
>
> I agree the content should be written before the meta data. What I meant
> was not having a separate process to do the content clean up, rather
> going with the solution proposed by Hasitha where the message count will
> be maintained in memory instead of the DB.
>
> Also, if we're going to duplicate both the message content and meta data
> per node it should not affect this, as initially mentioned. Instead of
> duplicating, if we're going to share the content among all the nodes,
> then we cannot maintain a local reference anyhow, since even if the
> reference count goes to 0 locally there will be other nodes with
> subscribers referring to the same content.
>
> The solution I proposed was to address the problem of losing the
> in-memory counts when the node gets killed. If a node was killed and the
> in-memory reference was lost, when the node is restarted it will first
> check for the ids which have not been purged, through a comparison
> between the meta data and the content, and will do the needful.
>
> Thanks,
> Pamod
>
>
> On Sun, Oct 5, 2014 at 1:01 PM, Asitha Nanayakkara <[email protected]>
> wrote:
>
>> Hi Pamod,
>>
>> In a clustered set-up, while other nodes are running, they store message
>> content for a topic first and then store the message meta data. This is
>> not done atomically. While this is happening, if we start another node
>> with logic to scan the database and delete inconsistent content, it
>> will pick up some of the new topic messages that have stored content
>> but are still in the process of storing metadata, and the process will
>> delete that content too. This will leave the database with messages
>> that have meta data but no corresponding content.
>> I think there is a possibility of this happening if there is a working
>> cluster with topic messages being published at a high rate with high
>> concurrency (publishing) and a new node is started at the same time.
>> Correct me if I'm wrong.
>>
>> Yes, for each message we will have to store content, store metadata and
>> update the reference count. But we can increment the reference count per
>> message, not per duplicate metadata (since we know how many duplicates
>> of metadata we need). If there is a bigger performance hit due to the DB
>> update call, it's better to go with the in-memory approach rather than
>> trying to clean the content at start-up, I guess.
>>
>> Thanks.
>>
>> On Sun, Oct 5, 2014 at 12:20 PM, Pamod Sylvester <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> How would this approach impact performance? This will result in a DB
>>> operation each time a message is published as well as each time the
>>> subscriber acks?
>>>
>>> I agree with you on the fact that maintaining the counters in-memory
>>> could result in message content being persisted in the DB with no way
>>> of deleting it if the node gets killed.
>>>
>>> Also, what would be the possibility of checking, at node start-up,
>>> which message content needs to be deleted? There should be a comparison
>>> between the meta data and the content column families; all the ids
>>> which are in the content table but not in the meta data CF should be
>>> purged:
>>>
>>> {MessageContentCF} \ {MessageMetaData} = Message Content to be deleted.
>>>
>>> This can affect the start-up time of the node, but IMO it will not
>>> affect the performance of the main flows.
>>>
>>> WDYT?
>>>
>>> Thanks,
>>> Pamod
>>>
>>> On Sun, Oct 5, 2014 at 11:09 AM, Asitha Nanayakkara <[email protected]>
>>> wrote:
>>>
>>>> Hi Hasitha,
>>>>
>>>> In this approach, if a node with a reference count gets killed, all
>>>> the details regarding reference counts are lost, right? Is there a
>>>> way to delete the content?
>>>>
>>>> Btw, what if we keep the reference count in the database? Something
>>>> similar to what we have for queue message counting now (we create a
>>>> counter when a queue is created and then increment/decrement the
>>>> count when messages are received and sent).
>>>>
>>>> What I suggest is: when a topic message is created, we add a
>>>> reference counter for the message (via a new AndesContextStore
>>>> method, createReferenceCounter(long messageID)); when meta data is
>>>> duplicated we increment the counter; when an acknowledgment is
>>>> received we decrement the counter (two methods in the context store
>>>> to increment/decrement counts). And we will have a scheduled task to
>>>> periodically check for messages with a reference count of zero and
>>>> delete their content. This way, by creating a separate insert
>>>> statement to create a ref counter and a separate statement to update
>>>> the count, we can overcome writing vendor-specific SQL queries for
>>>> reference counting (for RDBMS). Since the idea is to recommend
>>>> Cassandra for the MessageStore and an RDBMS for the
>>>> AndesContextStore, we would be better off that way. Plus, this will
>>>> avoid the need to track reference counts in memory, avoiding losing
>>>> the reference counts when a node gets killed. WDYT?
>>>>
>>>> Thanks
>>>>
>>>> On Sun, Oct 5, 2014 at 6:57 AM, Hasitha Hiranya <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Team,
>>>>>
>>>>> Following is my vision on integrating topics into MB:
>>>>>
>>>>> >> we duplicate metadata per subscriber. It will not create a big
>>>>> overhead.
>>>>> >> we do not duplicate content per subscriber, but we duplicate
>>>>> content per node
>>>>> >> I hereby assume that we do handle acks for topics. We need
>>>>> research on that.
>>>>>
>>>>> When a topic subscriber is created:
>>>>> 1. qpid creates a temp queue.
>>>>> 2. qpid creates a binding for that queue to the topic exchange,
>>>>> using the topic name as the binding key.
>>>>> 3. qpid creates a subscription for the temp queue.
>>>>>
>>>>> When a topic subscriber is closed, qpid does the above 3 things in
>>>>> reverse order.
>>>>>
>>>>> Adhering to this model:
>>>>>
>>>>> 1. We store metadata in the same way we do for normal queues.
>>>>> 2. We use the same SlotDelivery worker and the flusher. There is
>>>>> NOTHING called a topic delivery worker.
>>>>> 3. When showing in the UI, we filter the durable ones and show them.
>>>>> 4. When a subscriber closes, the queue is deleted. We do the same
>>>>> thing as for normal queues.
>>>>> 5. Whenever we insert metadata, we duplicate metadata for each temp
>>>>> queue (per subscriber). We know the nodes where subscribers lie, so
>>>>> we can duplicate content for those nodes (one copy per node).
>>>>> 6. We need to introduce new per-subscriber tracking in the on-flight
>>>>> message tracker, which is common for queues as well. When a metadata
>>>>> entry is inserted for a message id, we increase a count. When an ack
>>>>> comes for that metadata, we decrement it. If it reaches zero, the
>>>>> content is ready to be removed. We do not track this count globally,
>>>>> as we have a copy of the content per node. Thus the reference count
>>>>> does not need to be global; it is local, in-memory tracking.
>>>>> 7. Queue change handler - if delete - execute the normal delete
>>>>> (remove all metadata) and decrement the reference counts. The thread
>>>>> that deletes content will detect that and will delete offline. This
>>>>> way, content is removed only once all subscribers are gone.
>>>>> 8. We should be careful about hierarchical topics. We use our maps
>>>>> to identify queues bound to a topic. MQTT/AMQP confusion should be
>>>>> solved there.
>>>>>
>>>>> *Thanks*
>>>>>
>>>>> --
>>>>> *Hasitha Abeykoon*
>>>>> Senior Software Engineer; WSO2, Inc.; http://wso2.com
>>>>> *cell:* *+94 719363063*
>>>>> *blog: **abeykoon.blogspot.com* <http://abeykoon.blogspot.com>
>>>>
>>>> --
>>>> *Asitha Nanayakkara*
>>>> Software Engineer
>>>> WSO2, Inc.
>>>> http://wso2.com/
>>>> Mob: +94 77 85 30 682
>>>
>>> --
>>> *Pamod Sylvester*
>>> *Senior Software Engineer*
>>> Integration Technologies Team, WSO2 Inc.; http://wso2.com
>>> email: [email protected] cell: +94 77 7779495
>>
>> --
>> *Asitha Nanayakkara*
>> Software Engineer
>> WSO2, Inc. http://wso2.com/
>> Mob: +94 77 85 30 682
>
> --
> *Pamod Sylvester*
> *Senior Software Engineer*
> Integration Technologies Team, WSO2 Inc.; http://wso2.com
> email: [email protected] cell: +94 77 7779495

-- 
*Asitha Nanayakkara*
Software Engineer
WSO2, Inc. http://wso2.com/
Mob: +94 77 85 30 682
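For reference, the start-up clean-up job discussed in the thread —
{MessageContentCF} \ {MessageMetaData} = content to be purged — amounts to
a set difference over the stored ids. A minimal sketch, with the two column
families modelled as plain id sets and all names hypothetical:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the start-up content purge: any id present in the content
// store but absent from the metadata store is an orphan and can be
// deleted. Stores are modelled as id sets; names are hypothetical.
public class StartupContentPurger {

    // Returns {content ids} \ {metadata ids}: orphaned content to delete.
    public static Set<Long> orphanedContentIds(Set<Long> contentIds,
                                               Set<Long> metadataIds) {
        Set<Long> orphans = new HashSet<>(contentIds);
        orphans.removeAll(metadataIds);
        return orphans;
    }

    public static void main(String[] args) {
        // id 1 has content but no metadata left -> purge it at start-up.
        Set<Long> content = new HashSet<>(Set.of(1L, 2L, 3L));
        Set<Long> metadata = new HashSet<>(Set.of(2L, 3L));
        System.out.println(orphanedContentIds(content, metadata)); // [1]
    }
}
```

As noted above, this scan runs once at node start-up, so its cost grows
with the number of stored topic messages but stays off the publish/ack
hot path. The race Asitha raised still applies: a message whose content is
written but whose metadata insert is in flight would look like an orphan
to a concurrently scanning node.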
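Asitha's alternative (a DB-backed counter in AndesContextStore, kept
vendor-neutral by using one plain INSERT to create the counter row and
plain UPDATEs to change it) can also be sketched. The interface and SQL
below are illustrative assumptions, not the actual Andes context store API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a DB-backed reference counter. The SQL strings
// show the vendor-neutral statements the thread describes; the in-memory
// map stands in for the database so the sketch is runnable.
public class DbRefCounterSketch {

    // Plain ANSI SQL: one statement per operation, no vendor-specific upsert.
    static final String CREATE_SQL =
            "INSERT INTO MB_REF_COUNT (MESSAGE_ID, REF_COUNT) VALUES (?, 0)";
    static final String INCREMENT_SQL =
            "UPDATE MB_REF_COUNT SET REF_COUNT = REF_COUNT + 1 WHERE MESSAGE_ID = ?";
    static final String DECREMENT_SQL =
            "UPDATE MB_REF_COUNT SET REF_COUNT = REF_COUNT - 1 WHERE MESSAGE_ID = ?";

    private final Map<Long, Integer> table = new HashMap<>(); // stands in for the DB

    public void createReferenceCounter(long messageId) { table.put(messageId, 0); }

    public void increment(long messageId) { table.merge(messageId, 1, Integer::sum); }

    public void decrement(long messageId) { table.merge(messageId, -1, Integer::sum); }

    // What the periodic clean-up task would check before deleting content.
    public boolean isRemovable(long messageId) {
        return table.getOrDefault(messageId, 0) <= 0;
    }

    public static void main(String[] args) {
        DbRefCounterSketch store = new DbRefCounterSketch();
        store.createReferenceCounter(9L);
        store.increment(9L); // metadata duplicated once
        System.out.println(store.isRemovable(9L)); // false
        store.decrement(9L); // ack received
        System.out.println(store.isRemovable(9L)); // true
    }
}
```

The trade-off the thread settled on: this survives a node being killed,
but adds a DB round-trip per publish and per ack, which is why the +1 went
to the in-memory counter plus start-up purge instead.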
_______________________________________________
Dev mailing list
[email protected]
http://wso2.org/cgi-bin/mailman/listinfo/dev
