Hi Malinga,

This sounds great, and the fixes look very good! I will come over, and let's discuss how we keep the architecture clean.
Very nice job! As you mentioned in the other thread, we now need perf numbers.

--Srinath

On Sun, May 26, 2013 at 9:58 PM, Malinga Purnasiri <[email protected]> wrote:
> Hi,
>
> While working on the OOM issue in the MB, I found some design limitations
> that lead to it. Let me summarize my findings in point form.
>
> * Static executor pool (org.wso2.andes.pool.AndesExecuter.java) issue
> Unfortunately, MB has a static executor pool to which we submit all the
> runnables that need to run in parallel. When we send messages in a burst,
> the pool is soon exhausted (with runnables), the internal queue backing
> the executor pool grows, and that leads to OOM. This was observed from
> heap dump analysis with MAT.
>
> Solution: Group the runnables by task nature and send them to different
> executor pools.
>
> * The way we insert data into Cassandra
> Currently, most of the time we create mutators and execute them on the
> fly. I have run some benchmarks that simulate this situation. Here is a
> small code block for that:
>
> for (int i = 0; i < 1000000; i++) {
>     Mutator<String> messageMutator =
>         HFactory.createMutator(keyspaceOperator, stringSerializer);
>     messageMutator.addInsertion(..);
>     messageMutator.execute();
> }
>
> When we loop and execute on the fly like this, after a few thousand
> iterations each insertion takes far longer than expected. According to
> the Cassandra documentation, this also has the side effect of sending
> many small messages over the network, which exhausts network bandwidth
> too.
>
> Solution: Operate in batch mode, e.g. BatchMutation.
>
> * Message accumulation in LinkedBlockingQueue (observed from heap dump
> with MAT)
> Inside the CassendraMessageStore we use a LinkedBlockingQueue to hold
> messages temporarily until we insert them into Cassandra. There we have
> a huge bottleneck: the producer inserts into the queue very fast, but
> the consumer end is very slow. So the blocking queue keeps growing and
> creates an OOM.
>
> Solution: Add a BatchMutation mode at the consumer end to make the
> consumer fast, so the queue holds fewer messages at any given time.
>
> * PublishMessageWriter's run method executes serially
> Inside the thread, we take messages one by one and try to insert them
> into Cassandra, but a burst of messages creates a bottleneck here. We
> must introduce more parallelism.
>
> ---------
>
> Any ideas on this?
>
> Note: I have made the above code changes, and now MB can take messages
> in a burst and run without OOM. But we still need to design and
> implement this to production quality.
>
> --
> Malinga Pathmal,
> Technical Lead, WSO2, Inc. : http://wso2.com/
> Phone : (+94) 715335898

--
============================
Srinath Perera, Ph.D.
Senior Software Architect, WSO2 Inc.
Visiting Faculty, University of Moratuwa
Member, Apache Software Foundation
Research Scientist, Lanka Software Foundation

Blog: http://srinathsview.blogspot.com/
Photos: http://www.flickr.com/photos/hemapani/
Phone: 0772360902
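[Editor's note] The executor-pool grouping Malinga proposes could look roughly like the sketch below, using plain java.util.concurrent (class and group names are illustrative, not the actual MB code). A bounded queue plus a caller-runs rejection policy keeps a burst from growing an unbounded internal queue toward OOM:

```java
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Sketch: one bounded executor per task group instead of a single
// shared static pool. Group names ("metadata", "content") are made up.
public class GroupedExecutors {
    private final Map<String, ExecutorService> pools = new ConcurrentHashMap<>();

    private ExecutorService newPool() {
        // Bounded queue + CallerRunsPolicy: when a group's queue fills up
        // during a burst, the submitting thread runs the task itself,
        // which throttles producers instead of queuing without limit.
        return new ThreadPoolExecutor(4, 4, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(1000),
                new ThreadPoolExecutor.CallerRunsPolicy());
    }

    // One pool per task group, created lazily on first use.
    public void submit(String group, Runnable task) {
        pools.computeIfAbsent(group, g -> newPool()).submit(task);
    }

    public void shutdown() {
        for (ExecutorService pool : pools.values()) {
            pool.shutdown();
            try {
                pool.awaitTermination(10, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        GroupedExecutors executors = new GroupedExecutors();
        executors.submit("metadata", () -> System.out.println("metadata task"));
        executors.submit("content", () -> System.out.println("content task"));
        executors.shutdown();
    }
}
```

The point of separate pools is isolation: a burst of slow content writes can no longer starve the pool that metadata tasks depend on.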
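[Editor's note] On the batching point: the benchmark loop above pays one network round trip per message, because execute() is called inside the loop. The fix is to accumulate insertions and execute once per batch (with Hector, roughly many addInsertion calls followed by a single execute). A minimal stand-alone sketch of the pattern, with a fake store standing in for Cassandra and all names illustrative:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the batch-write fix: accumulate insertions and flush them in
// fixed-size batches, so N messages cost N/batchSize round trips, not N.
public class BatchWriteSketch {
    // Stand-in for the Cassandra mutator; counts "network" round trips.
    public static class FakeStore {
        public int roundTrips = 0;
        public final List<String> rows = new ArrayList<>();
        public void execute(List<String> batch) {
            roundTrips++;           // each execute() == one round trip
            rows.addAll(batch);
        }
    }

    public static int writeInBatches(FakeStore store, List<String> messages,
                                     int batchSize) {
        List<String> batch = new ArrayList<>(batchSize);
        for (String message : messages) {
            batch.add(message);
            if (batch.size() == batchSize) {
                store.execute(batch);               // flush one full batch
                batch = new ArrayList<>(batchSize);
            }
        }
        if (!batch.isEmpty()) {
            store.execute(batch);                   // flush the remainder
        }
        return store.roundTrips;
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        List<String> messages = new ArrayList<>();
        for (int i = 0; i < 1000; i++) messages.add("msg-" + i);
        int trips = writeInBatches(store, messages, 100);
        System.out.println(trips);  // 10 round trips instead of 1000
    }
}
```

The same shape applies whether the flush is a Hector BatchMutation or a single Mutator whose execute() is moved outside the loop.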
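[Editor's note] For the last two points (the slow consumer behind the LinkedBlockingQueue, and PublishMessageWriter running serially), one common pattern is to drain the queue in batches and hand each batch to a writer pool, so persistence is both batched and parallel. A sketch under those assumptions; the class and method names are illustrative, not the MB's:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Sketch: a writer that batch-drains the incoming queue and persists
// each batch on a thread pool instead of one message at a time.
public class BatchDrainingWriter {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final ExecutorService writers = Executors.newFixedThreadPool(4);

    public void publish(String message) {
        queue.add(message);
    }

    // Block for the first message, then grab whatever else is already
    // queued, up to batchSize. One take() per batch, not one per message.
    public List<String> nextBatch(int batchSize) throws InterruptedException {
        List<String> batch = new ArrayList<>(batchSize);
        batch.add(queue.take());
        queue.drainTo(batch, batchSize - 1);
        return batch;
    }

    // One iteration of the writer loop: hand a whole batch to the pool so
    // several batches can be persisted concurrently.
    public void drainOnce(Consumer<List<String>> persist) throws InterruptedException {
        final List<String> batch = nextBatch(100);
        writers.submit(() -> persist.accept(batch));
    }

    public void shutdown() {
        writers.shutdown();
    }

    public static void main(String[] args) throws InterruptedException {
        BatchDrainingWriter writer = new BatchDrainingWriter();
        for (int i = 0; i < 5; i++) writer.publish("message-" + i);
        writer.drainOnce(batch ->
                System.out.println("persisting " + batch.size() + " messages"));
        writer.shutdown();
    }
}
```

Because the consumer now removes many messages per pass and writes them concurrently, the queue depth stays bounded in practice even under bursts, which is exactly the OOM symptom described above.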
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
