Hi Sinthuja, I used MySQL on RDS, and I used a version of the Smart Home CApp with indexing disabled to isolate the issue. I have attached it here. With that setup I could not see any errors on the DAS side, which may be why CPU usage was lower in DAS than in the publisher, compared to your setup, as we discussed offline.
Thanks.

*Maninda Edirisooriya*
Senior Software Engineer

*WSO2, Inc.*lean.enterprise.middleware.

*Blog* : http://maninda.blogspot.com/
*E-mail* : [email protected]
*Skype* : @manindae
*Twitter* : @maninda

On Tue, Sep 1, 2015 at 8:06 AM, Sinthuja Ragendran <[email protected]> wrote:

> Hi Maninda,
>
> I tested this locally now, and I was able to see some hiccups when
> publishing. At the point when the publisher had more or less paused
> publishing, I started a new publisher, and that also succeeded only up to
> the point where its event queue became full; then that publisher also
> stopped pushing. Can you confirm that the same behaviour was observed in
> your publisher? I think this made you think the publisher had hung,
> but actually the receiver queue was full and it stopped accepting
> further events.
>
> During that time, I was able to see multiple error logs on the DAS
> side. Therefore I think the event persisting thread has become very slow,
> and hence this behaviour was observed. I have attached the DAS
> thread dump, and I could see many threads in the blocked state on the H2
> database. Which database are you using to test? I think it is better to
> try with MySQL or some other production-recommended database.
>
> [1]
>
> [2015-08-31 19:17:04,359] ERROR {org.wso2.carbon.analytics.dataservice.indexing.AnalyticsDataIndexer} - Error in processing index batch operations: [-1000:__INDEX_DATA__] does not exist
> org.wso2.carbon.analytics.datasource.commons.exception.AnalyticsTableNotAvailableException: [-1000:__INDEX_DATA__] does not exist
>     at org.wso2.carbon.analytics.datasource.rdbms.RDBMSAnalyticsRecordStore.get(RDBMSAnalyticsRecordStore.java:319)
>     at org.wso2.carbon.analytics.dataservice.indexing.AnalyticsDataIndexer.loadIndexOperationRecords(AnalyticsDataIndexer.java:588)
>     at org.wso2.carbon.analytics.dataservice.indexing.AnalyticsDataIndexer.processIndexOperations(AnalyticsDataIndexer.java:391)
>     at org.wso2.carbon.analytics.dataservice.indexing.AnalyticsDataIndexer.processIndexOperations(AnalyticsDataIndexer.java:381)
>     at org.wso2.carbon.analytics.dataservice.indexing.AnalyticsDataIndexer.access$100(AnalyticsDataIndexer.java:130)
>     at org.wso2.carbon.analytics.dataservice.indexing.AnalyticsDataIndexer$IndexWorker.run(AnalyticsDataIndexer.java:1791)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> org.wso2.carbon.analytics.datasource.commons.exception.AnalyticsException: Error in deleting records: Timeout trying to lock table "ANX___8GIVT7RC_"; SQL statement:
> DELETE FROM ANX___8GIvT7Rc_ WHERE record_id IN (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?) [50200-140]
>     at org.wso2.carbon.analytics.datasource.rdbms.RDBMSAnalyticsRecordStore.delete(RDBMSAnalyticsRecordStore.java:519)
>     at org.wso2.carbon.analytics.datasource.rdbms.RDBMSAnalyticsRecordStore.delete(RDBMSAnalyticsRecordStore.java:491)
>     at org.wso2.carbon.analytics.dataservice.indexing.AnalyticsDataIndexer.deleteIndexRecords(AnalyticsDataIndexer.java:581)
>     at org.wso2.carbon.analytics.dataservice.indexing.AnalyticsDataIndexer.processIndexOperations(AnalyticsDataIndexer.java:414)
>     at org.wso2.carbon.analytics.dataservice.indexing.AnalyticsDataIndexer.processIndexOperations(AnalyticsDataIndexer.java:381)
>     at org.wso2.carbon.analytics.dataservice.indexing.AnalyticsDataIndexer.access$100(AnalyticsDataIndexer.java:130)
>     at org.wso2.carbon.analytics.dataservice.indexing.AnalyticsDataIndexer$IndexWorker.run(AnalyticsDataIndexer.java:1791)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: org.h2.jdbc.JdbcSQLException: Timeout trying to lock table "ANX___8GIVT7RC_"; SQL statement:
> DELETE FROM ANX___8GIvT7Rc_ WHERE record_id IN (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?) [50200-140]
>     at org.h2.message.DbException.getJdbcSQLException(DbException.java:327)
>     at org.h2.message.DbException.get(DbException.java:167)
>     at org.h2.message.DbException.get(DbException.java:144)
>     at org.h2.table.RegularTable.doLock(RegularTable.java:466)
>     at org.h2.table.RegularTable.lock(RegularTable.java:404)
>     at org.h2.command.dml.Delete.update(Delete.java:50)
>     at org.h2.command.CommandContainer.update(CommandContainer.java:70)
>     at org.h2.command.Command.executeUpdate(Command.java:199)
>     at org.h2.jdbc.JdbcPreparedStatement.executeUpdateInternal(JdbcPreparedStatement.java:141)
>     at org.h2.jdbc.JdbcPreparedStatement.executeUpdate(JdbcPreparedStatement.java:127)
>     at org.wso2.carbon.analytics.datasource.rdbms.RDBMSAnalyticsRecordStore.delete(RDBMSAnalyticsRecordStore.java:514)
>     ... 9 more
>
>
> On Mon, Aug 31, 2015 at 10:01 AM, Sinthuja Ragendran <[email protected]> wrote:
>
>> Hi Maninda,
>>
>> OK, thanks for the information. I'll do the test locally and get back to you.
>>
>> Thanks,
>> Sinthuja.
>>
>> On Mon, Aug 31, 2015 at 9:53 AM, Maninda Edirisooriya <[email protected]> wrote:
>>
>>> Hi Sinthuja,
>>>
>>> I tested with the smart-home sample in the latest DAS, using the config in [1],
>>> and DAS with the attached config directory. (Its data-bridge-config.xml is as in [2].)
>>> I did the test on EC2 instances with a MySQL RDS instance as the DB.
>>> This issue was always reproducible when 10M events were published with
>>> the sample. Events get published for some time, and then DAS suddenly
>>> stops receiving events. But you can see that the client is busy, with high CPU
>>> usage, while DAS is almost idle.
>>> No debug or logging was enabled.
>>>
>>> [1]
>>>
>>> <Agent>
>>>     <Name>Thrift</Name>
>>>     <DataEndpointClass>org.wso2.carbon.databridge.agent.endpoint.thrift.ThriftDataEndpoint</DataEndpointClass>
>>>     <TrustSore>src/main/resources/client-truststore.jks</TrustSore>
>>>     <TrustSorePassword>wso2carbon</TrustSorePassword>
>>>     <QueueSize>32768</QueueSize>
>>>     <BatchSize>200</BatchSize>
>>>     <CorePoolSize>5</CorePoolSize>
>>>     <MaxPoolSize>10</MaxPoolSize>
>>>     <KeepAliveTimeInPool>20</KeepAliveTimeInPool>
>>>     <ReconnectionInterval>30</ReconnectionInterval>
>>>     <MaxTransportPoolSize>250</MaxTransportPoolSize>
>>>     <MaxIdleConnections>250</MaxIdleConnections>
>>>     <EvictionTimePeriod>5500</EvictionTimePeriod>
>>>     <MinIdleTimeInPool>5000</MinIdleTimeInPool>
>>>     <SecureMaxTransportPoolSize>250</SecureMaxTransportPoolSize>
>>>     <SecureMaxIdleConnections>250</SecureMaxIdleConnections>
>>>     <SecureEvictionTimePeriod>5500</SecureEvictionTimePeriod>
>>>     <SecureMinIdleTimeInPool>5000</SecureMinIdleTimeInPool>
>>> </Agent>
>>>
>>> [2]
>>>
>>> <dataBridgeConfiguration>
>>>
>>>     <workerThreads>10</workerThreads>
>>>     <eventBufferCapacity>1000</eventBufferCapacity>
>>>     <clientTimeoutMin>30</clientTimeoutMin>
>>>
>>>     <dataReceiver name="Thrift">
>>>         <config name="tcpPort">7611</config>
>>>         <config name="sslPort">7711</config>
>>>     </dataReceiver>
>>>
>>>     <dataReceiver name="Binary">
>>>         <config name="tcpPort">9611</config>
>>>         <config name="sslPort">9711</config>
>>>         <config name="sslReceiverThreadPoolSize">100</config>
>>>         <config name="tcpReceiverThreadPoolSize">100</config>
>>>     </dataReceiver>
>>>
>>> </dataBridgeConfiguration>
>>>
>>> Thanks.
>>>
>>>
>>> *Maninda Edirisooriya*
>>> Senior Software Engineer
>>>
>>> *WSO2, Inc.*lean.enterprise.middleware.
>>>
>>> *Blog* : http://maninda.blogspot.com/
>>> *E-mail* : [email protected]
>>> *Skype* : @manindae
>>> *Twitter* : @maninda
>>>
>>> On Mon, Aug 31, 2015 at 8:08 PM, Sinthuja Ragendran <[email protected]> wrote:
>>>
>>>> Are you running with debug mode in logging?
>>>> And can you constantly reproduce this, or is it intermittent?
>>>>
>>>> Please provide the publisher and receiver side configs so that I can test this.
>>>> As I have already tested with more than 10M records, I'm not sure what is
>>>> the case here.
>>>>
>>>> Thanks,
>>>> Sinthuja.
>>>>
>>>>
>>>> On Monday, August 31, 2015, Maninda Edirisooriya <[email protected]> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> When I started a 10M load test from the Smart Home sample in DAS, it ran
>>>>> for some time and then suddenly stopped receiving events.
>>>>> The publisher in the client was running with high CPU usage while DAS was
>>>>> running with very low CPU usage.
>>>>> When another data agent was spawned, it started to publish correctly,
>>>>> which suggested that the issue is on the client side.
>>>>> We analyzed the thread dump twice, and the thread using the most CPU had
>>>>> the following stack traces.
>>>>>
>>>>> 1.
>>>>> "main" prio=10 tid=0x00007f85ec00a800 nid=0x7843 runnable [0x00007f85f250f000]
>>>>>    java.lang.Thread.State: RUNNABLE
>>>>>     at org.wso2.carbon.databridge.agent.endpoint.DataEndpointGroup$EventQueue.put(DataEndpointGroup.java:148)
>>>>>     at org.wso2.carbon.databridge.agent.endpoint.DataEndpointGroup$EventQueue.access$300(DataEndpointGroup.java:97)
>>>>>     at org.wso2.carbon.databridge.agent.endpoint.DataEndpointGroup.publish(DataEndpointGroup.java:94)
>>>>>     at org.wso2.carbon.databridge.agent.DataPublisher.publish(DataPublisher.java:183)
>>>>>     at org.wso2.carbon.das.smarthome.sample.SmartHomeAgent.publishLogEvents(Unknown Source)
>>>>>     at org.wso2.carbon.das.smarthome.sample.SmartHomeAgent.main(Unknown Source)
>>>>>
>>>>> 2.
>>>>> "main" prio=10 tid=0x00007f85ec00a800 nid=0x7843 runnable [0x00007f85f250f000]
>>>>>    java.lang.Thread.State: RUNNABLE
>>>>>     at org.apache.log4j.Category.callAppenders(Category.java:202)
>>>>>     at org.apache.log4j.Category.forcedLog(Category.java:391)
>>>>>     at org.apache.log4j.Category.log(Category.java:856)
>>>>>     at org.apache.commons.logging.impl.Log4JLogger.debug(Log4JLogger.java:177)
>>>>>     at org.wso2.carbon.databridge.agent.endpoint.DataEndpointGroup.isActiveDataEndpointExists(DataEndpointGroup.java:264)
>>>>>     at org.wso2.carbon.databridge.agent.endpoint.DataEndpointGroup.access$400(DataEndpointGroup.java:46)
>>>>>     at org.wso2.carbon.databridge.agent.endpoint.DataEndpointGroup$EventQueue.put(DataEndpointGroup.java:155)
>>>>>     at org.wso2.carbon.databridge.agent.endpoint.DataEndpointGroup$EventQueue.access$300(DataEndpointGroup.java:97)
>>>>>     at org.wso2.carbon.databridge.agent.endpoint.DataEndpointGroup.publish(DataEndpointGroup.java:94)
>>>>>     at org.wso2.carbon.databridge.agent.DataPublisher.publish(DataPublisher.java:183)
>>>>>     at org.wso2.carbon.das.smarthome.sample.SmartHomeAgent.publishLogEvents(Unknown Source)
>>>>>     at org.wso2.carbon.das.smarthome.sample.SmartHomeAgent.main(Unknown Source)
>>>>>
>>>>>
>>>>> We suspect that the *isActiveDataEndpointExists()* method of the
>>>>> *org.wso2.carbon.databridge.agent.endpoint.DataEndpointGroup*
>>>>> class (the package shown in the stack traces above) is called repeatedly
>>>>> because the disruptor ring buffer is full on the client side. Not sure why
>>>>> this happens.
>>>>>
>>>>>
>>>>> *Maninda Edirisooriya*
>>>>> Senior Software Engineer
>>>>>
>>>>> *WSO2, Inc.*lean.enterprise.middleware.
>>>>>
>>>>> *Blog* : http://maninda.blogspot.com/
>>>>> *E-mail* : [email protected]
>>>>> *Skype* : @manindae
>>>>> *Twitter* : @maninda
>>>>>
>>>>
>>>> --
>>>> Sent from iPhone
>>>>
>>
>> --
>> *Sinthuja Rajendran*
>> Associate Technical Lead
>> WSO2, Inc.:http://wso2.com
>>
>> Blog: http://sinthu-rajan.blogspot.com/
>> Mobile: +94774273955
>>
>
> --
> *Sinthuja Rajendran*
> Associate Technical Lead
> WSO2, Inc.:http://wso2.com
>
> Blog: http://sinthu-rajan.blogspot.com/
> Mobile: +94774273955
>
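
The behaviour discussed in this thread (a publisher that seems to hang once its bounded event queue fills up because the receiving side has stopped draining it) can be illustrated with a small, self-contained sketch. This is only an illustration under stated assumptions: it uses java.util.concurrent.ArrayBlockingQueue as a stand-in for the agent's event queue (the thread mentions a disruptor ring buffer), the class and method names are hypothetical, and it uses the non-blocking offer() so that the demo terminates instead of spinning the way the real publisher did in EventQueue.put().

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical demo class; not part of the WSO2 data-bridge agent.
class BoundedQueueDemo {

    // Publishes `events` items into a bounded queue that nobody drains
    // and returns how many were accepted before the queue filled up.
    public static int publishWithoutConsumer(int capacity, int events) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        int accepted = 0;
        for (int i = 0; i < events; i++) {
            // offer() returns false immediately when the queue is full.
            if (queue.offer(i)) {
                accepted++;
            }
        }
        return accepted;
    }

    // Same publisher, but a stand-in consumer removes one item whenever
    // the queue is full, so the retry succeeds and every event is accepted.
    public static int publishWithConsumer(int capacity, int events) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        int accepted = 0;
        for (int i = 0; i < events; i++) {
            boolean ok = queue.offer(i);
            if (!ok) {
                queue.poll();          // consumer frees one slot
                ok = queue.offer(i);   // retry into the freed slot
            }
            if (ok) {
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        System.out.println("accepted without consumer: " + publishWithoutConsumer(4, 10));
        System.out.println("accepted with consumer: " + publishWithConsumer(4, 10));
    }
}
```

With a capacity of 4 and 10 events, the first method accepts only 4 events, while the second accepts all 10. A publisher that waits for space instead of giving up would stay busy at that point, which matches the high client CPU and idle DAS reported above.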
Smart_Home.car
Description: Binary data
_______________________________________________
Dev mailing list
[email protected]
http://wso2.org/cgi-bin/mailman/listinfo/dev
