Re: [Dev] New data publisher is hanging

Maninda Edirisooriya Mon, 31 Aug 2015 20:47:34 -0700

Hi Sinthuja,

I used MySQL on RDS, and to isolate the issue I used a version of the
Smart Home CApp with indexing disabled, which I have attached. With that
setup I could not see any errors on the DAS side. That may explain why DAS
showed lower CPU usage than the publisher, compared to what you saw in your
setup, as we discussed offline.


Thanks.


*Maninda Edirisooriya*
Senior Software Engineer

*WSO2, Inc.*lean.enterprise.middleware.

*Blog* : http://maninda.blogspot.com/
*E-mail* : [email protected]
*Skype* : @manindae
*Twitter* : @maninda

On Tue, Sep 1, 2015 at 8:06 AM, Sinthuja Ragendran <[email protected]>
wrote:

> Hi Maninda,
>
> I tested this locally just now, and I was able to see some hiccups when
> publishing. At the point where the publisher appeared to pause
> publishing, I started a new publisher. That one also succeeded only up to
> the point where its event queue became full, and then it also
> stopped pushing. Can you confirm that you observed the same behaviour in
> your publisher? I think this led you to believe that the publisher had
> hung, when actually the receiver queue was full and had stopped accepting
> further events.
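
To illustrate the behaviour described above, here is a minimal,
self-contained sketch. It uses plain java.util.concurrent and is not the
data-bridge agent code; the class name `QueueFullDemo`, the `publish` helper
and the capacity of 4 are made up for illustration. With nothing draining the
queue, a publisher can hand over only as many events as the queue holds, and
a freshly started publisher with its own empty queue stalls at the same
point.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class QueueFullDemo {

    // Offers `attempts` events to a bounded queue that nobody drains
    // and returns how many of them were accepted.
    static int publish(BlockingQueue<Integer> queue, int attempts) {
        int accepted = 0;
        for (int i = 0; i < attempts; i++) {
            // offer() is non-blocking and returns false once the queue is full.
            if (queue.offer(i)) {
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        // Hypothetical capacity; the agent config in this thread uses QueueSize 32768.
        int capacity = 4;

        BlockingQueue<Integer> firstQueue = new ArrayBlockingQueue<>(capacity);
        System.out.println("first publisher accepted: " + publish(firstQueue, 10));
        System.out.println("first publisher, retry:   " + publish(firstQueue, 10));

        // A new publisher gets a new, empty queue, so it makes progress again,
        // but only until that queue is full as well.
        BlockingQueue<Integer> secondQueue = new ArrayBlockingQueue<>(capacity);
        System.out.println("second publisher accepted: " + publish(secondQueue, 10));
    }
}
```

In this sketch the first publisher gets 4 events accepted and then none, and
the second publisher again gets exactly 4, which matches the "succeeds only
until the queue is full" pattern.
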
>
> During that time, I was able to see multiple error logs on the DAS
> side (see [1]). Therefore I think the event persisting thread had become
> very slow, and hence this behaviour was observed. I have attached the DAS
> thread dump, in which I could see many threads in the blocked state on the
> H2 database. Which database are you using for the test? I think it is
> better to try with MySQL or another database recommended for production.
>
> [1]
>
> [2015-08-31 19:17:04,359] ERROR {org.wso2.carbon.analytics.dataservice.indexing.AnalyticsDataIndexer} - Error in processing index batch operations: [-1000:__INDEX_DATA__] does not exist
> org.wso2.carbon.analytics.datasource.commons.exception.AnalyticsTableNotAvailableException: [-1000:__INDEX_DATA__] does not exist
>     at org.wso2.carbon.analytics.datasource.rdbms.RDBMSAnalyticsRecordStore.get(RDBMSAnalyticsRecordStore.java:319)
>     at org.wso2.carbon.analytics.dataservice.indexing.AnalyticsDataIndexer.loadIndexOperationRecords(AnalyticsDataIndexer.java:588)
>     at org.wso2.carbon.analytics.dataservice.indexing.AnalyticsDataIndexer.processIndexOperations(AnalyticsDataIndexer.java:391)
>     at org.wso2.carbon.analytics.dataservice.indexing.AnalyticsDataIndexer.processIndexOperations(AnalyticsDataIndexer.java:381)
>     at org.wso2.carbon.analytics.dataservice.indexing.AnalyticsDataIndexer.access$100(AnalyticsDataIndexer.java:130)
>     at org.wso2.carbon.analytics.dataservice.indexing.AnalyticsDataIndexer$IndexWorker.run(AnalyticsDataIndexer.java:1791)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> org.wso2.carbon.analytics.datasource.commons.exception.AnalyticsException: Error in deleting records: Timeout trying to lock table "ANX___8GIVT7RC_"; SQL statement:
> DELETE FROM ANX___8GIvT7Rc_ WHERE record_id IN (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?) [50200-140]
>     at org.wso2.carbon.analytics.datasource.rdbms.RDBMSAnalyticsRecordStore.delete(RDBMSAnalyticsRecordStore.java:519)
>     at org.wso2.carbon.analytics.datasource.rdbms.RDBMSAnalyticsRecordStore.delete(RDBMSAnalyticsRecordStore.java:491)
>     at org.wso2.carbon.analytics.dataservice.indexing.AnalyticsDataIndexer.deleteIndexRecords(AnalyticsDataIndexer.java:581)
>     at org.wso2.carbon.analytics.dataservice.indexing.AnalyticsDataIndexer.processIndexOperations(AnalyticsDataIndexer.java:414)
>     at org.wso2.carbon.analytics.dataservice.indexing.AnalyticsDataIndexer.processIndexOperations(AnalyticsDataIndexer.java:381)
>     at org.wso2.carbon.analytics.dataservice.indexing.AnalyticsDataIndexer.access$100(AnalyticsDataIndexer.java:130)
>     at org.wso2.carbon.analytics.dataservice.indexing.AnalyticsDataIndexer$IndexWorker.run(AnalyticsDataIndexer.java:1791)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: org.h2.jdbc.JdbcSQLException: Timeout trying to lock table "ANX___8GIVT7RC_"; SQL statement:
> DELETE FROM ANX___8GIvT7Rc_ WHERE record_id IN (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?) [50200-140]
>     at org.h2.message.DbException.getJdbcSQLException(DbException.java:327)
>     at org.h2.message.DbException.get(DbException.java:167)
>     at org.h2.message.DbException.get(DbException.java:144)
>     at org.h2.table.RegularTable.doLock(RegularTable.java:466)
>     at org.h2.table.RegularTable.lock(RegularTable.java:404)
>     at org.h2.command.dml.Delete.update(Delete.java:50)
>     at org.h2.command.CommandContainer.update(CommandContainer.java:70)
>     at org.h2.command.Command.executeUpdate(Command.java:199)
>     at org.h2.jdbc.JdbcPreparedStatement.executeUpdateInternal(JdbcPreparedStatement.java:141)
>     at org.h2.jdbc.JdbcPreparedStatement.executeUpdate(JdbcPreparedStatement.java:127)
>     at org.wso2.carbon.analytics.datasource.rdbms.RDBMSAnalyticsRecordStore.delete(RDBMSAnalyticsRecordStore.java:514)
>     ... 9 more
>
>
> On Mon, Aug 31, 2015 at 10:01 AM, Sinthuja Ragendran <[email protected]>
> wrote:
>
>> Hi Maninda,
>>
>> OK, thanks for the information. I'll do the test locally and get back to you.
>>
>> Thanks,
>> Sinthuja.
>>
>> On Mon, Aug 31, 2015 at 9:53 AM, Maninda Edirisooriya <[email protected]>
>> wrote:
>>
>>> Hi Sinthuja,
>>>
>>> I tested with the smart-home sample in the latest DAS, with the agent
>>> config in [1] and DAS running with the attached config directory. (The
>>> data-bridge-config.xml there is as in [2].)
>>> I ran the test on EC2 instances with a MySQL RDS instance as the database.
>>> The issue was always reproducible when 10M events were published with
>>> the sample. Events get published for some time, and then DAS suddenly
>>> stops receiving them. However, the client stays busy, with high CPU
>>> usage, while DAS is almost idle.
>>> No debug logging was enabled.
>>>
>>> [1]
>>>
>>>     <Agent>
>>>         <Name>Thrift</Name>
>>>         <DataEndpointClass>org.wso2.carbon.databridge.agent.endpoint.thrift.ThriftDataEndpoint</DataEndpointClass>
>>>         <TrustSore>src/main/resources/client-truststore.jks</TrustSore>
>>>         <TrustSorePassword>wso2carbon</TrustSorePassword>
>>>         <QueueSize>32768</QueueSize>
>>>         <BatchSize>200</BatchSize>
>>>         <CorePoolSize>5</CorePoolSize>
>>>         <MaxPoolSize>10</MaxPoolSize>
>>>         <KeepAliveTimeInPool>20</KeepAliveTimeInPool>
>>>         <ReconnectionInterval>30</ReconnectionInterval>
>>>         <MaxTransportPoolSize>250</MaxTransportPoolSize>
>>>         <MaxIdleConnections>250</MaxIdleConnections>
>>>         <EvictionTimePeriod>5500</EvictionTimePeriod>
>>>         <MinIdleTimeInPool>5000</MinIdleTimeInPool>
>>>         <SecureMaxTransportPoolSize>250</SecureMaxTransportPoolSize>
>>>         <SecureMaxIdleConnections>250</SecureMaxIdleConnections>
>>>         <SecureEvictionTimePeriod>5500</SecureEvictionTimePeriod>
>>>         <SecureMinIdleTimeInPool>5000</SecureMinIdleTimeInPool>
>>>     </Agent>
>>>
>>> [2]
>>>
>>> <dataBridgeConfiguration>
>>>
>>>     <workerThreads>10</workerThreads>
>>>     <eventBufferCapacity>1000</eventBufferCapacity>
>>>     <clientTimeoutMin>30</clientTimeoutMin>
>>>
>>>     <dataReceiver name="Thrift">
>>>         <config name="tcpPort">7611</config>
>>>         <config name="sslPort">7711</config>
>>>     </dataReceiver>
>>>
>>>     <dataReceiver name="Binary">
>>>         <config name="tcpPort">9611</config>
>>>         <config name="sslPort">9711</config>
>>>         <config name="sslReceiverThreadPoolSize">100</config>
>>>         <config name="tcpReceiverThreadPoolSize">100</config>
>>>     </dataReceiver>
>>>
>>> </dataBridgeConfiguration>
>>>
>>> Thanks.
>>>
>>>
>>> *Maninda Edirisooriya*
>>> Senior Software Engineer
>>>
>>> *WSO2, Inc.*lean.enterprise.middleware.
>>>
>>> *Blog* : http://maninda.blogspot.com/
>>> *E-mail* : [email protected]
>>> *Skype* : @manindae
>>> *Twitter* : @maninda
>>>
>>> On Mon, Aug 31, 2015 at 8:08 PM, Sinthuja Ragendran <[email protected]>
>>> wrote:
>>>
>>>> Are you running with debug mode enabled in logging? And can you
>>>> reproduce this consistently, or is it intermittent?
>>>>
>>>> Please provide the publisher and receiver side configs so that I can
>>>> test this. As I have already tested with more than 10M records, I'm not
>>>> sure what the case is here.
>>>>
>>>> Thanks,
>>>> Sinthuja.
>>>>
>>>>
>>>> On Monday, August 31, 2015, Maninda Edirisooriya <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> When I started a 10M-event load test with the Smart Home sample against
>>>>> DAS, it ran for some time and then DAS suddenly stopped receiving events.
>>>>> However, the publisher in the client was running with high CPU usage
>>>>> while DAS was running with very low CPU usage.
>>>>> When another data agent was spawned, it started to publish correctly,
>>>>> which seemed to confirm that the issue is on the client side.
>>>>> We analyzed the thread dump twice, and the thread with the highest CPU
>>>>> usage had the following stack traces.
>>>>>
>>>>> 1.
>>>>> "main" prio=10 tid=0x00007f85ec00a800 nid=0x7843 runnable [0x00007f85f250f000]
>>>>>    java.lang.Thread.State: RUNNABLE
>>>>>     at org.wso2.carbon.databridge.agent.endpoint.DataEndpointGroup$EventQueue.put(DataEndpointGroup.java:148)
>>>>>     at org.wso2.carbon.databridge.agent.endpoint.DataEndpointGroup$EventQueue.access$300(DataEndpointGroup.java:97)
>>>>>     at org.wso2.carbon.databridge.agent.endpoint.DataEndpointGroup.publish(DataEndpointGroup.java:94)
>>>>>     at org.wso2.carbon.databridge.agent.DataPublisher.publish(DataPublisher.java:183)
>>>>>     at org.wso2.carbon.das.smarthome.sample.SmartHomeAgent.publishLogEvents(Unknown Source)
>>>>>     at org.wso2.carbon.das.smarthome.sample.SmartHomeAgent.main(Unknown Source)
>>>>>
>>>>> 2.
>>>>> "main" prio=10 tid=0x00007f85ec00a800 nid=0x7843 runnable [0x00007f85f250f000]
>>>>>    java.lang.Thread.State: RUNNABLE
>>>>>     at org.apache.log4j.Category.callAppenders(Category.java:202)
>>>>>     at org.apache.log4j.Category.forcedLog(Category.java:391)
>>>>>     at org.apache.log4j.Category.log(Category.java:856)
>>>>>     at org.apache.commons.logging.impl.Log4JLogger.debug(Log4JLogger.java:177)
>>>>>     at org.wso2.carbon.databridge.agent.endpoint.DataEndpointGroup.isActiveDataEndpointExists(DataEndpointGroup.java:264)
>>>>>     at org.wso2.carbon.databridge.agent.endpoint.DataEndpointGroup.access$400(DataEndpointGroup.java:46)
>>>>>     at org.wso2.carbon.databridge.agent.endpoint.DataEndpointGroup$EventQueue.put(DataEndpointGroup.java:155)
>>>>>     at org.wso2.carbon.databridge.agent.endpoint.DataEndpointGroup$EventQueue.access$300(DataEndpointGroup.java:97)
>>>>>     at org.wso2.carbon.databridge.agent.endpoint.DataEndpointGroup.publish(DataEndpointGroup.java:94)
>>>>>     at org.wso2.carbon.databridge.agent.DataPublisher.publish(DataPublisher.java:183)
>>>>>     at org.wso2.carbon.das.smarthome.sample.SmartHomeAgent.publishLogEvents(Unknown Source)
>>>>>     at org.wso2.carbon.das.smarthome.sample.SmartHomeAgent.main(Unknown Source)
>>>>>
>>>>>
>>>>> We suspect that the *isActiveDataEndpointExists()* method of the
>>>>> *org.wso2.carbon.databridge.agent.endpoint.DataEndpointGroup*
>>>>> class (as seen in the stack traces above) is called repeatedly because
>>>>> the disruptor ring buffer is full on the client side. We are not sure
>>>>> why this happens.
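
The thread dumps above show the main thread RUNNABLE inside put(), once with
isActiveDataEndpointExists() on the stack. That would be consistent with a
put() that keeps retrying while the queue is full instead of parking the
thread. Below is a minimal sketch of that pattern, under the assumption (not
confirmed from the agent source) that the retry loop polls an endpoint check
between attempts. The class `SpinPutDemo`, the methods `spinPut` and
`isAnyEndpointActive`, and the timeout are hypothetical and exist only so
that the demo terminates.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

public class SpinPutDemo {

    // Counts how often the endpoint check runs while the producer is stuck.
    static final AtomicLong endpointChecks = new AtomicLong();

    // Hypothetical stand-in for an "is any endpoint still active?" check.
    static boolean isAnyEndpointActive() {
        endpointChecks.incrementAndGet();
        return true;
    }

    // Retries a non-blocking offer until it succeeds, no endpoint is active,
    // or the timeout passes. The loop never sleeps or parks, so the calling
    // thread stays RUNNABLE and burns CPU for as long as the queue is full.
    static boolean spinPut(BlockingQueue<String> queue, String event, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!queue.offer(event)) {
            if (!isAnyEndpointActive()) {
                return false;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // give up so that this demo terminates
            }
        }
        return true;
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1);
        queue.offer("first"); // the queue is now full and is never drained

        boolean accepted = spinPut(queue, "second", 50);
        System.out.println("accepted=" + accepted
                + ", endpoint checks while waiting=" + endpointChecks.get());
    }
}
```

If the real put() behaves like this, a busy client and an idle receiver are
what one would expect whenever events are consumed more slowly than they are
produced.
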
>>>>>
>>>>>
>>>>> *Maninda Edirisooriya*
>>>>> Senior Software Engineer
>>>>>
>>>>> *WSO2, Inc.*lean.enterprise.middleware.
>>>>>
>>>>> *Blog* : http://maninda.blogspot.com/
>>>>> *E-mail* : [email protected]
>>>>> *Skype* : @manindae
>>>>> *Twitter* : @maninda
>>>>>
>>>>
>>>>
>>>> --
>>>> Sent from iPhone
>>>>
>>>
>>>
>>
>>
>> --
>> *Sinthuja Rajendran*
>> Associate Technical Lead
>> WSO2, Inc.:http://wso2.com
>>
>> Blog: http://sinthu-rajan.blogspot.com/
>> Mobile: +94774273955
>>
>>
>>
>
>
> --
> *Sinthuja Rajendran*
> Associate Technical Lead
> WSO2, Inc.:http://wso2.com
>
> Blog: http://sinthu-rajan.blogspot.com/
> Mobile: +94774273955
>
>
>

Smart_Home.car
Description: Binary data
