Hong,
    Nice link to the parallel threads issue - very timely and useful, as we 
just put in the workaround of raising the replicaSet to 3 yesterday to fix an 
issue with running on only one core.
    Will look more into the logstash config as well. The issue is that we now 
baseline at 30 logs/sec on an idle system, so some CPU usage is unavoidable - 
a more granular VM cluster will help up to a point.
    The replicaCount:3 fix will not be sufficient on its own - it would also 
need an autoscaler and a CPU limit in the yaml.
    A better fix is a switch to a DaemonSet - 1 pod per VM - this has been in 
review since last night.
https://jira.onap.org/browse/LOG-376
https://jira.onap.org/browse/LOG-181
https://gerrit.onap.org/r/#/c/48139/
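
For illustration only (this is not the change under review in the gerrit link 
above): a minimal Python sketch of the "DaemonSet plus CPU limit" idea. It 
builds a manifest that schedules one pod per node and caps each pod's CPU, then 
prints it as JSON, which kubectl accepts. The resource name, labels, image and 
the request/limit values are hypothetical placeholders, and apiVersion apps/v1 
is an assumption: clusters older than Kubernetes 1.9 would need a beta API 
group instead.

```python
import json


def logstash_daemonset(image, cpu_limit="1", cpu_request="250m", namespace="onap"):
    """Build a DaemonSet manifest: one pod per node, each with a CPU cap."""
    labels = {"app": "log-logstash"}  # hypothetical label, not from the OOM charts
    return {
        "apiVersion": "apps/v1",  # assumption: pre-1.9 clusters need a beta group
        "kind": "DaemonSet",      # no replica count: the scheduler runs one per node
        "metadata": {"name": "log-logstash", "namespace": namespace, "labels": labels},
        "spec": {
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": "logstash",
                            "image": image,
                            "resources": {
                                # request: what the scheduler reserves on the node
                                "requests": {"cpu": cpu_request},
                                # limit: hard cap, so one pod cannot hog the host
                                "limits": {"cpu": cpu_limit},
                            },
                        }
                    ]
                },
            },
        },
    }


if __name__ == "__main__":
    # The image name is a placeholder, not the one the log charts use.
    print(json.dumps(logstash_daemonset("example/logstash:placeholder"), indent=2))
```

The printed JSON could be piped to `kubectl apply -f -`. The point of the 
limit is that the per-node pod count fixes placement, while the CPU cap is 
what actually stops a single pod from saturating its host.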

    This hogging of all available CPUs on a particular host is also a problem 
for a couple of other applications in ONAP - each one will require resource 
tuning similar to what is currently occurring in the log pods.
     My main CD cluster is still 4 x 64g, but a move to 9 x 16g also helps with 
the CPU granularity of the pods - this is the same layout that Gary's CD system 
runs.
https://git.onap.org/logging-analytics/tree/deploy/rancher
https://git.onap.org/integration/tree/deployment/heat/onap-oom

     The nodejs issue is separate from this though, right?

Thank you
/michael


From: GUAN, HONG [mailto:[email protected]]
Sent: Friday, May 18, 2018 9:16 AM
To: Michael O'Brien <[email protected]>; [email protected]; 
[email protected]
Subject: RE: OOM Beijing CPU utilization

FYI

Below are what we found out about CPU Management of Logstash. 
https://discuss.elastic.co/t/cpu-management-of-logstash/99487

Before deploying 'log' (CPU 6% on node 6v5ip2)

[centos@server-k8s-cluster-1node-kubernetes-master-host-afxat7 kubernetes]$ kubectl top node
NAME                                                     CPU(cores)   CPU%      MEMORY(bytes)   MEMORY%
server-k8s-cluster-1node-kubernetes-node-host-645o52     312m         3%        12273Mi         77%
server-k8s-cluster-1node-kubernetes-node-host-s891z4     1586m        19%       4082Mi          25%
server-k8s-cluster-1node-kubernetes-node-host-6v5ip2     531m         6%        2278Mi          14%
server-k8s-cluster-1node-kubernetes-master-host-afxat7   124m         1%        2933Mi          18%
server-k8s-cluster-1node-kubernetes-node-host-vpsi6z     197m         2%        12344Mi         78%

After deploying 'log' (CPU 97% on node 6v5ip2)
[centos@server-k8s-cluster-1node-kubernetes-master-host-afxat7 kubernetes]$ kubectl get pod -n onap -o wide
NAME                                            READY     STATUS    RESTARTS   AGE       IP           NODE
onap-appc-appc-0                                2/2       Running   0          15h       10.47.0.8    server-k8s-cluster-1node-kubernetes-node-host-645o52
onap-appc-appc-cdt-7878d75dd8-nmhld             1/1       Running   0          15h       10.36.0.3    server-k8s-cluster-1node-kubernetes-node-host-s891z4
onap-appc-appc-db-0                             2/2       Running   0          15h       10.42.0.4    server-k8s-cluster-1node-kubernetes-node-host-6v5ip2
onap-appc-appc-dgbuilder-989bc9898-prbzg        1/1       Running   0          15h       10.36.0.4    server-k8s-cluster-1node-kubernetes-node-host-s891z4
onap-consul-consul-6d9946f754-2qv8g             1/1       Running   0          15h       10.42.0.5    server-k8s-cluster-1node-kubernetes-node-host-6v5ip2
onap-consul-consul-server-0                     1/1       Running   0          15h       10.36.0.5    server-k8s-cluster-1node-kubernetes-node-host-s891z4
onap-consul-consul-server-1                     1/1       Running   0          15h       10.42.0.6    server-k8s-cluster-1node-kubernetes-node-host-6v5ip2
onap-consul-consul-server-2                     1/1       Running   0          15h       10.47.0.9    server-k8s-cluster-1node-kubernetes-node-host-645o52
onap-log-log-elasticsearch-f4cdbb4b8-d8kgd      1/1       Running   0          5m        10.36.0.8    server-k8s-cluster-1node-kubernetes-node-host-s891z4
onap-log-log-kibana-9f8768474-pps9r             1/1       Running   0          5m        10.42.0.8    server-k8s-cluster-1node-kubernetes-node-host-6v5ip2
onap-log-log-logstash-7dd49fd4d-7vhhs           1/1       Running   0          5m        10.42.0.9    server-k8s-cluster-1node-kubernetes-node-host-6v5ip2
onap-log-log-logstash-7dd49fd4d-l5thf           1/1       Running   0          5m        10.36.0.7    server-k8s-cluster-1node-kubernetes-node-host-s891z4
onap-log-log-logstash-7dd49fd4d-sllqv           1/1       Running   0          5m        10.47.0.11   server-k8s-cluster-1node-kubernetes-node-host-645o52
onap-msb-kube2msb-69b4cfb74d-sxc47              1/1       Running   0          15h       10.42.0.3    server-k8s-cluster-1node-kubernetes-node-host-6v5ip2
onap-msb-msb-consul-b946c8486-dcbm9             1/1       Running   0          15h       10.36.0.1    server-k8s-cluster-1node-kubernetes-node-host-s891z4

[centos@server-k8s-cluster-1node-kubernetes-master-host-afxat7 kubernetes]$ kubectl top node
NAME                                                     CPU(cores)   CPU%      MEMORY(bytes)   MEMORY%
server-k8s-cluster-1node-kubernetes-node-host-645o52     971m         12%       12452Mi         78%
server-k8s-cluster-1node-kubernetes-node-host-s891z4     825m         10%       5182Mi          32%
server-k8s-cluster-1node-kubernetes-node-host-6v5ip2     7807m        97%       4354Mi          27%
server-k8s-cluster-1node-kubernetes-master-host-afxat7   158m         1%        2952Mi          18%
server-k8s-cluster-1node-kubernetes-node-host-vpsi6z     213m         2%        12461Mi         78%
[centos@server-k8s-cluster-1node-kubernetes-master-host-afxat7 kubernetes]$
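
A quick way to spot the saturated node in output like the above is to filter 
on the CPU% column. The following is only a sketch: the function name and the 
90% threshold are made up for illustration, and it assumes the five-column 
layout that `kubectl top node` printed here. The sample text is the 
after-deploy output from this message.

```python
def hot_nodes(top_output, threshold=90):
    """Return (node, cpu_percent) pairs at or above threshold.

    Expects `kubectl top node` text with the columns
    NAME, CPU(cores), CPU%, MEMORY(bytes), MEMORY%.
    """
    hot = []
    for line in top_output.strip().splitlines():
        fields = line.split()
        # Skip the header, shell prompts and anything without a numeric CPU%.
        if len(fields) != 5 or fields[0] == "NAME":
            continue
        cpu_text = fields[2]
        if not cpu_text.endswith("%") or not cpu_text[:-1].isdigit():
            continue
        cpu = int(cpu_text[:-1])
        if cpu >= threshold:
            hot.append((fields[0], cpu))
    return hot


SAMPLE = """\
NAME                                                     CPU(cores)   CPU%      MEMORY(bytes)   MEMORY%
server-k8s-cluster-1node-kubernetes-node-host-645o52     971m         12%       12452Mi         78%
server-k8s-cluster-1node-kubernetes-node-host-s891z4     825m         10%       5182Mi          32%
server-k8s-cluster-1node-kubernetes-node-host-6v5ip2     7807m        97%       4354Mi          27%
server-k8s-cluster-1node-kubernetes-master-host-afxat7   158m         1%        2952Mi          18%
server-k8s-cluster-1node-kubernetes-node-host-vpsi6z     213m         2%        12461Mi         78%
"""

if __name__ == "__main__":
    for node, cpu in hot_nodes(SAMPLE):
        print(f"{node}: {cpu}% CPU")
```

On this sample only the 6v5ip2 node crosses the threshold. That node hosts 
the kibana pod and one of the three logstash pods in the pod listing above.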

Thanks,
Hong

From: [email protected] [mailto:[email protected]] On Behalf Of OBRIEN, FRANK MICHAEL
Sent: Friday, May 18, 2018 12:03 AM
To: [email protected]; [email protected]
Subject: Re: [onap-discuss] OOM Beijing CPU utilization

Hi,
   I have seen this 3 times from Dec to March - tracking this nodejs issue via 
OOM-834 (not an OOM issue) - last saw it 27th March under 1.8.10 (current 
version) - but running helm 2.6.1 (current version 2.8.2)
   
https://jira.onap.org/browse/OOM-834

   Something in the infrastructure is causing this - as I have seen it on an 
idle kubernetes cluster (no onap pods installed)
   Will look again through the k8s jiras

   You are correct - it is not the .ru crypto miner that targets 10250/pods or 
the new one that targets a cluster without oauth lockdown
    Tracking anti-crypto here
    
https://jira.onap.org/browse/LOG-353

    I think I will ask for 5 min to go over the lockdown of clusters with the 
security subcommittee - the oauth lockdown will cover off 10249-10255 as well.

   /michael

From: [email protected] [mailto:[email protected]] On Behalf Of [email protected]
Sent: Thursday, May 17, 2018 7:28 PM
To: [email protected]
Subject: [onap-discuss] OOM Beijing CPU utilization

Hi,
I have a running OOM ONAP Beijing deployment on 2 nodes.

After a few days running OK, I noticed around 100% CPU on all 16 vCPUs on the 
1st node.

I see a process nodejs running with 815% CPU as shown below.

What is this process doing?

I checked for mining and there's none. I have port 10250 blocked, and I don't 
see any suspicious processes.

I had to kill the nodejs process in order to regain interactivity with my onap 
deployment.

Thanks.

root@olc-oom-bjng:~# top
top - 22:58:14 up 13 days,  1:28,  1 user,  load average: 53.66, 49.26, 48.69
Tasks: 1181 total,   1 running, 1175 sleeping,   0 stopped,   5 zombie
%Cpu(s): 84.0 us, 14.9 sy,  0.1 ni,  0.4 id,  0.0 wa,  0.0 hi,  0.2 si,  0.3 st
KiB Mem : 10474657+total,  1037308 free, 88390000 used, 15319272 buff/cache
KiB Swap:        0 total,        0 free,        0 used. 14242952 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
20119 root      20   0 1431420  65876   1124 S 815.4  0.1  34465:04 nodjs -c /bin/config.json
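
To put the numbers above in proportion, here is a small hedged calculation. 
It assumes top is in its default mode, where %CPU is relative to a single 
core, so values above 100% mean several cores are busy. The function name is 
hypothetical; the inputs are the process %CPU, the 16 vCPUs and the 1-minute 
load average reported in this message.

```python
def cpu_summary(proc_cpu_percent, vcpus, load_1min):
    """Convert top's per-core %CPU and load average into per-host figures."""
    cores_used = proc_cpu_percent / 100.0  # 100% in top == one full core
    return {
        "cores_used": round(cores_used, 2),
        "host_share": round(cores_used / vcpus, 3),    # fraction of all vCPUs
        "load_per_vcpu": round(load_1min / vcpus, 2),  # runnable tasks per vCPU
    }


if __name__ == "__main__":
    # 815.4 %CPU, 16 vCPUs and load average 53.66 are taken from the top output.
    print(cpu_summary(815.4, 16, 53.66))
```

Under that assumption the process accounts for roughly 8 of the 16 vCPUs, 
about half of the host, while the summary line shows the host almost fully 
busy (84.0 us + 14.9 sy). If so, the remainder would be coming from other 
processes.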


_________________________________________________________________________________________________________________________



This message and its attachments may contain confidential or privileged 
information that may be protected by law;

they should not be distributed, used or copied without authorisation.

If you have received this email in error, please notify the sender and delete 
this message and its attachments.

As emails may be altered, Orange is not liable for messages that have been 
modified, changed or falsified.

Thank you.
This message and the information contained herein is proprietary and 
confidential and subject to the Amdocs policy statement,
you may review at https://www.amdocs.com/about/email-disclaimer
_______________________________________________
onap-discuss mailing list
[email protected]
https://lists.onap.org/mailman/listinfo/onap-discuss
