Re: [Dev] ISSUE with load on analytics worker

Bernard Paris Tue, 19 Mar 2019 08:13:10 -0700

Hi,

I dropped all the Analytics tables in the database so that I could reuse these 
DBs to test WSO2 Stream Processor 4.4.0-alpha, which was released this week.


The behavior of this new version is completely different: when I send a lot of 
requests to EI, the analytics CPU load rises by only one or two percent 
(instead of going to 100%), then drops to almost nothing when the traffic stops 
(instead of staying at 100%).
This looks very good!

So this new SP version is a real improvement.

Bernard



On 14 March 2019 at 11:22, Bernard Paris <[email protected]> wrote:


We have two remote ESB (EI 6.4.0) instances in a cluster sending data to the 
analytics worker.
Here is what I can reproduce:

>  Start the Analytics worker (EI package 6.4.0) with the options -Xms6G -Xmx6G.
I tested the worker with both Java versions: "1.8.0_162" and Java(TM) SE 
Runtime Environment (build 1.8.0_192-ea-b02).

>  First I send some very light traffic to the ESBs: analytics CPU load stays 
> very low  -> ok

> (Repeated 4 or 5 times) I send some more data to the EI-6.4.0 ESBs: 
> analytics CPU load rises to 100% while receiving data from the ESBs, then 
> goes back to very low when the traffic stops  -> pretty good

Running jstat with interval 60s:

Timestamp     S0     S1      E      O      M    CCS  YGC    YGCT  FGC   FGCT     GCT
     36.8 100.00   0.00  31.50   5.00  88.04  80.26    4   0.648    3  0.306   0.955
     96.9 100.00   0.00  38.30   5.00  88.04  80.26    4   0.648    3  0.306   0.955
    156.9 100.00   0.00  40.12   5.00  88.04  80.26    4   0.648    3  0.306   0.955


> Now I send a lot of traffic to EI-6.4.0:
>  analytics CPU load rises to 100%
>  it works fine for a few minutes
>  then the problems start: first, the Thrift connection is lost on both EI 
> ESBs (the worker no longer responds)

[2019-03-14 10:47:35,457] [-1] [] [DataBridge-ReconnectionService-pool-5-thread-1]  WARN {org.wso2.carbon.databridge.agent.endpoint.DataEndpointGroup} -  No receiver is reachable at reconnection, will try to reconnect every 30 sec
[2019-03-14 10:48:58,795] [-1] [] [DataBridge-tcp://10.1.3.12:7612-pool-7-thread-1] ERROR {org.wso2.carbon.databridge.agent.endpoint.DataEndpoint} -  Unable to send events to the endpoint.
org.wso2.carbon.databridge.agent.exception.DataEndpointException: Cannot send Events
at org.wso2.carbon.databridge.agent.endpoint.thrift.ThriftDataEndpoint.send(ThriftDataEndpoint.java:88)
at org.wso2.carbon.databridge.agent.endpoint.DataEndpoint$EventPublisher.publish(DataEndpoint.java:314)



> I stop the traffic to EI-6.4.0, but the analytics CPU load no longer 
> decreases; it stays around 100%

<PastedGraphic-2.png>

> The ESBs can no longer reconnect to the worker

[2019-03-14 11:14:06,337] [-1] [] [DataBridge-ReconnectionService-pool-5-thread-1]  WARN {org.wso2.carbon.databridge.agent.endpoint.DataEndpointGroup} -  No receiver is reachable at reconnection, will try to reconnect every 30 sec
[2019-03-14 11:14:36,384] [-1] [] [DataBridge-tcp://10.1.3.12:7612-pool-7-thread-1] ERROR {org.wso2.carbon.databridge.agent.endpoint.DataEndpoint} -  Unable to send events to the endpoint.
org.wso2.carbon.databridge.agent.exception.DataEndpointException: Cannot send Events
at org.wso2.carbon.databridge.agent.endpoint.thrift.ThriftDataEndpoint.send(ThriftDataEndpoint.java:88)
at org.wso2.carbon.databridge.agent.endpoint.DataEndpoint$EventPublisher.publish(DataEndpoint.java:314)
at org.wso2.carbon.databridge.agent.endpoint.DataEndpoint$EventPublisher.run(DataEndpoint.java:272)
at java.lang.Thread.run(Thread.java:748)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
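
A quick way to tell "port closed" apart from "port open but worker not 
answering" is a plain TCP connect test against the receiver address from the 
log (10.1.3.12:7612). The sketch below is only an illustration: the function 
name receiver_reachable is hypothetical and not part of any WSO2 tooling. A 
successful connect only shows that the port accepts connections; since the 
trace ends in "Read timed out", the worker may well accept the socket and then 
never reply.

```python
import socket


def receiver_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper for illustration; it only tests the TCP handshake,
    not whether the Thrift receiver behind the port actually answers.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host and connect timeout.
        return False


# Example call, using the receiver address that appears in the agent log:
#   receiver_reachable("10.1.3.12", 7612)
```

If this returns True while the agents still log "Read timed out", the problem 
is more likely a busy or blocked worker than a network or listener failure.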




In this ugly state, jstat with a 60s interval gives:

Timestamp     S0     S1      E      O      M    CCS  YGC    YGCT  FGC   FGCT     GCT
   4837.0   0.00   9.22  10.78  45.72  88.60  80.09   27  10.479    3  0.306  10.785
   4897.0   0.00   9.22  37.64  45.72  88.60  80.09   27  10.479    3  0.306  10.785
   4957.0   0.00   9.22  59.97  45.72  88.60  80.09   27  10.479    3  0.306  10.785
   5017.0   0.00   9.22  85.25  45.72  88.60  80.09   27  10.479    3  0.306  10.785
   5077.0   9.60   0.00   9.03  47.27  88.65  80.09   28  10.728    3  0.306  11.034
   5137.0   9.60   0.00  32.90  47.27  88.65  80.09   28  10.728    3  0.306  11.034
   5197.0   9.60   0.00  55.66  47.27  88.65  80.09   28  10.728    3  0.306  11.034
   5257.0   9.60   0.00  76.92  47.27  88.65  80.09   28  10.728    3  0.306  11.034

Reading this jstat output, I understand that the GC is not working hard enough 
to explain the 100% CPU load.
The Thrift connections from both ESBs remain lost, BUT the interface at 
https://worker_host:7443/stores/query keeps responding (slowly, but it does 
respond).
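
The point about GC can be checked with a little arithmetic on the jstat 
samples: between timestamps 4837.0 and 5257.0 the cumulative GC time (GCT) 
goes from 10.785 s to 11.034 s, i.e. about 0.25 s of GC in 420 s of wall-clock 
time, or roughly 0.06%. The following is a minimal sketch of that calculation; 
the helper names parse_gcutil and gc_overhead are made up for this example, 
and the input is assumed to be one sample per line (not wrapped by the mail 
client).

```python
def parse_gcutil(text):
    """Parse timestamped jstat utilisation output into a list of dicts of floats."""
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, map(float, row))) for row in rows]


def gc_overhead(samples):
    """Fraction of wall-clock time spent in GC between the first and last sample."""
    first, last = samples[0], samples[-1]
    elapsed = last["Timestamp"] - first["Timestamp"]  # seconds of JVM uptime
    return (last["GCT"] - first["GCT"]) / elapsed     # GCT is cumulative seconds


# First and last samples from the "bad state" jstat output above.
SAMPLE = """
Timestamp S0 S1 E O M CCS YGC YGCT FGC FGCT GCT
4837.0 0.00 9.22 10.78 45.72 88.60 80.09 27 10.479 3 0.306 10.785
5257.0 9.60 0.00 76.92 47.27 88.65 80.09 28 10.728 3 0.306 11.034
"""

samples = parse_gcutil(SAMPLE)
print(f"GC overhead: {gc_overhead(samples):.4%}")
```

With one young collection in seven minutes and no new full GC, the collector 
accounts for a negligible share of the CPU, so the load must come from 
application threads.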


Bernard

_______________________________________________
Dev mailing list
[email protected]
http://wso2.org/cgi-bin/mailman/listinfo/dev
