Hi,
Could you post your running pod list? From the logs it looks like you may
have some stuck container deletions. For now these usually require manual
deletion, as a workaround for a Kubernetes issue relating to PVs with the
current release of Docker.
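For reference, here is a minimal Python sketch of how a pod list could be filtered for pods that are not healthy, such as ones stuck in Terminating. It assumes kubectl is on the PATH and that the usual table printed by `kubectl get pods --all-namespaces` (with NAMESPACE, NAME and STATUS columns) is what gets parsed. The function names and the set of states treated as healthy are illustrative only, not part of OOM.

```python
import subprocess

# States treated as healthy here; this set is an assumption for illustration.
HEALTHY = {"Running", "Completed"}


def fetch_pod_table():
    """Return the text table printed by `kubectl get pods --all-namespaces`."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "--all-namespaces"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def unhealthy_pods(table):
    """Return (namespace, name, status) for every pod whose STATUS is not healthy."""
    lines = [line for line in table.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    # Locate the columns by name instead of hard-coding their positions.
    ns_col, name_col, status_col = (
        header.index(column) for column in ("NAMESPACE", "NAME", "STATUS")
    )
    found = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) <= status_col:
            continue  # skip malformed rows rather than guessing
        if fields[status_col] not in HEALTHY:
            found.append((fields[ns_col], fields[name_col], fields[status_col]))
    return found
```

Calling unhealthy_pods(fetch_pod_table()) on a machine with cluster access would list the candidates to look at. If a pod really is stuck in Terminating, `kubectl delete pod <name> -n <namespace> --grace-period=0 --force` is the usual manual deletion, provided that is acceptable for your setup.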
I checked my 2 long-lived CD systems and posted some log stats in the JIRA
below. I don't see over 12 MB of syslog data, but we should definitely watch
the FS size, as this will usually be our first point of failure. I have,
however, seen a fully saturated master once in the past.
Usually the cluster uses about 5G for the /dockerdata-nfs share; the master
will need under 60G and each cluster host around 100G to run for days, with
under 40G of docker downloads. The issue is also things like ONAP log files
saturating the HD, which we are keenly interested in: a 100% used HD is usually
the first point of failure of a system, so we should track this via a JIRA:
https://jira.onap.org/browse/LOG-453
On cluster.onap.info I only see 12 MB:
ubuntu@ip-172-31-28-156:~$ df
Filesystem                                       1K-blocks     Used        Available Use% Mounted on
udev                                              15691140        0         15691140   0% /dev
tmpfs                                              3139820   351036          2788784  12% /run
/dev/xvda1                                        81254044 65222160         16015500  81% /
tmpfs                                             15699096     6892         15692204   1% /dev/shm
tmpfs                                                 5120        0             5120   0% /run/lock
tmpfs                                             15699096        0         15699096   0% /sys/fs/cgroup
fs-023adc1b.efs.us-west-1.amazonaws.com:/ 9007199254739968  3729408 9007199251010560   1% /dockerdata-nfs
tmpfs                                              3139820        0          3139820   0% /run/user/1000
-rw-r----- 1 syslog adm 8292567 Jun 4 21:07 syslog
-rw-r----- 1 syslog adm 12395747 Jun 4 06:25 syslog.1
-rw-r----- 1 syslog adm 710414 Jun 3 06:25 syslog.2.gz
-rw-r----- 1 syslog adm 700347 Jun 2 06:25 syslog.3.gz
-rw-r----- 1 syslog adm 721147 Jun 1 06:25 syslog.4.gz
-rw-r----- 1 syslog adm 636081 May 31 06:25 syslog.5.gz
-rw-r----- 1 syslog adm 373696 May 30 06:25 syslog.6.gz
-rw-r----- 1 syslog adm 109797 May 29 06:25 syslog.7.gz
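To watch the FS size as suggested above, something along these lines could run periodically on each host. This is only a sketch: the 80% and 1 GiB thresholds, the function names, and the /var/log/syslog* pattern are assumptions for illustration, not values taken from LOG-453.

```python
import glob
import os
import shutil


def percent_used(path):
    """Percentage of the filesystem holding `path` that is in use (close to df's Use%)."""
    usage = shutil.disk_usage(path)
    denominator = usage.used + usage.free
    return 100.0 * usage.used / denominator if denominator else 0.0


def log_bytes(pattern):
    """Combined size in bytes of all files matching the glob `pattern`."""
    return sum(os.path.getsize(name) for name in glob.glob(pattern))


def check(path="/", pattern="/var/log/syslog*",
          max_percent=80.0, max_log_bytes=1024 ** 3):
    """Return a list of warning strings; an empty list means nothing to report."""
    warnings = []
    used = percent_used(path)
    if used > max_percent:
        warnings.append(f"{path} is {used:.0f}% full")
    size = log_bytes(pattern)
    if size > max_log_bytes:
        warnings.append(f"{pattern} totals {size} bytes")
    return warnings
```

A cron job could then do `for warning in check(): print(warning)` and mail the output whenever it is non-empty.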
Thanks for bringing this up
/michael
From: [email protected]
[mailto:[email protected]] On Behalf Of
[email protected]
Sent: Monday, June 4, 2018 6:27 AM
To: [email protected]
Subject: Re: [onap-discuss] /var/log/syslog taking up a lot of space on OOM
Beijing
Hi,
After 4 days, the installation is still stable; however,
/var/log/syslog and /var/log/syslog.1 have reached 44 GB again.
-rw-r----- 1 syslog adm 7043982261 Jun 4 10:23 syslog
-rw-r----- 1 syslog adm 37589664321 Jun 4 06:25 syslog.1
-rw-r----- 1 syslog adm 99405 Jun 3 06:25 syslog.2.gz
-rw-r----- 1 syslog adm 79388 Jun 2 06:25 syslog.3.gz
-rw-r----- 1 syslog adm 155344 Jun 1 06:25 syslog.4.gz
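As a cross-check of the 44 GB figure, the sizes in the listing above can simply be added up. The small Python sketch below does that by reading the fifth field of each `ls -l` line; the helper name ls_sizes is made up for this example, and it assumes file names without spaces.

```python
def ls_sizes(listing):
    """Map file name -> size in bytes for each `ls -l` line in `listing`."""
    sizes = {}
    for line in listing.splitlines():
        fields = line.split()
        # A regular `ls -l` entry has 9 fields; the size is the fifth one.
        if len(fields) >= 9 and fields[4].isdigit():
            sizes[fields[-1]] = int(fields[4])
    return sizes


# The listing quoted in this message, copied verbatim.
LISTING = """\
-rw-r----- 1 syslog adm 7043982261 Jun 4 10:23 syslog
-rw-r----- 1 syslog adm 37589664321 Jun 4 06:25 syslog.1
-rw-r----- 1 syslog adm 99405 Jun 3 06:25 syslog.2.gz
-rw-r----- 1 syslog adm 79388 Jun 2 06:25 syslog.3.gz
-rw-r----- 1 syslog adm 155344 Jun 1 06:25 syslog.4.gz
"""

sizes = ls_sizes(LISTING)
total = sum(sizes.values())
print(f"{total} bytes = {total / 1e9:.1f} GB = {total / 2**30:.2f} GiB")
# prints: 44633980719 bytes = 44.6 GB = 41.57 GiB
```

Almost all of it is in syslog and syslog.1; the compressed rotations are negligible. The same sum for the two large files in the original report quoted below (9811749726 and 33351210056 bytes) gives about 43.2 GB.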
Abdelmuhaimen Seaudi
Orange Labs Egypt
Email: [email protected]
Mobile: +2012 84644 733
From: SEAUDI Abdelmuhaimen OBS/CSO
Sent: Saturday, June 2, 2018 1:01 PM
To: [email protected]
Subject: /var/log/syslog taking up a lot of space on OOM Beijing
Hi,
I have an OOM Beijing instance running on 1 VM for the Rancher server and 3 VMs
as Rancher hosts, each with 8 vCPUs, 52 GB RAM, a 50 GB root partition, and a
100 GB second partition for /var/lib/docker/.
After running for 2 days, the installation is stable so far. Only OOM-SNIRO
gives FAIL in the robot health check, and only the onap-oof pod is failing; it
has been failing since the installation.
However, I noticed one of the nodes taking a lot of root storage space, and I
found out it's /var/log/syslog and /var/log/syslog.1, which are taking ~43 GB
of space.
What is the reason for this behaviour?
root@olc-bjng-2:~# free -h
              total        used        free      shared  buff/cache   available
Mem:            51G         25G         12G        557M         13G         24G
Swap:            0B          0B          0B
root@olc-bjng-2:~# df -h /dev/vda1 /dev/vdb
Filesystem      Size  Used Avail Use% Mounted on
/dev/vda1        49G   44G  5.0G  90% /     <<<<<<<<<<<<<<<<<<<< a lot of space, not from /var/lib/docker
/dev/vdb         99G   30G   64G  32% /mnt
root@olc-bjng-2:~#
root@olc-bjng-2:/var/log# ls -l
total 42153316
...
-rw-r----- 1 syslog adm 9811749726 Jun 2 10:40 syslog
-rw-r----- 1 syslog adm 33351210056 Jun 2 06:25 syslog.1
-rw-r----- 1 syslog adm 149784 Jun 1 06:25 syslog.2.gz
I see the following lines near the top of syslog.1:
Jun 1 08:58:44 olc-bjng-2 dockerd[9856]: time="2018-06-01T08:58:44.599081414Z"
level=warning msg="Unknown healthcheck type 'NONE' (expected 'CMD') in
container 6b3b60d6dde71c0ea16b698b1b7a964e53fb3dd54bdf1e198a49d51f117a2c40"
Jun 1 08:58:45 olc-bjng-2 dockerd[9856]: time="2018-06-01T08:58:45.907586780Z"
level=error msg="Handler for GET
/v1.22/containers/952333fb19fec201d9adf226847d8a3e21045d831fe73c10082fe1124271f31a/json
returned error: No such container:
952333fb19fec201d9adf226847d8a3e21045d831fe73c10082fe1124271f31a"
Jun 1 08:59:43 olc-bjng-2 dockerd[9856]: time="2018-06-01T08:59:43.372667000Z"
level=warning msg="failed to close stdin: rpc error: code = 2 desc = write
/var/run/docker/libcontainerd/containerd/4780e1142fbea7740b0eded42057be2ab17cc0998b2319596654a9a3606b9045/6f635acd5a26750d0f84ee506ed28c4b18b39a64c8994966636d138a88db8e6c/control:
bad file descriptor"
Jun 1 09:02:28 olc-bjng-2 dockerd[9856]: time="2018-06-01T09:02:28.564875759Z"
level=error msg="Handler for DELETE
/v1.27/images/sha256:b7fa6b9cb097d4be9c482f44a2ab2d84d0067b598f80dacfedc11b30feaf2fc6
returned error: conflict: unable to delete b7fa6b9cb097 (cannot be forced) -
image is being used by running container c55cddbe095c"
Jun 1 09:02:28 olc-bjng-2 dockerd[9856]: time="2018-06-01T09:02:28.568346021Z"
level=error msg="Handler for DELETE
/v1.27/images/sha256:14de771cc17886ac2b6e6eace825c8c67afb59b88804944421a5c2dbebe1ddaf
returned error: conflict: unable to delete 14de771cc178 (cannot be forced) -
image is being used by running container 1f3008e7bc83"
Jun 1 09:02:28 olc-bjng-2 dockerd[9856]: time="2018-06-01T09:02:28.570397502Z"
level=error msg="Handler for DELETE
/v1.27/images/sha256:bd33f8c865b1cefab6e876006de8542892e21205f33ecd0de82c778be72a2b39
returned error: conflict: unable to delete bd33f8c865b1 (cannot be forced) -
image is being used by running container 82a6677eaf0f"
And I see the following lines near the bottom of syslog.1:
Jun 2 06:25:01 olc-bjng-2 dockerd[9856]: time="2018-06-02T06:25:01.634823245Z"
level=error msg="Failed to log msg \"\\tat
org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:419)\"
for logger json-file: write
/mnt/containers/36441e295ace461c6c1e5a2a60f8004272c056619d381846349ce47c8900385a/36441e295ace461c6c1e5a2a60f8004272c056619d381846349ce47c8900385a-json.log:
no space left on device"
Jun 2 06:25:01 olc-bjng-2 dockerd[9856]: time="2018-06-02T06:25:01.635142774Z"
level=error msg="Failed to log msg \"\\tat
org.glassfish.jersey.client.InboundJaxrsResponse.readEntity(InboundJaxrsResponse.java:108)\"
for logger json-file: write
/mnt/containers/36441e295ace461c6c1e5a2a60f8004272c056619d381846349ce47c8900385a/36441e295ace461c6c1e5a2a60f8004272c056619d381846349ce47c8900385a-json.log:
no space left on device"
Jun 2 06:25:01 olc-bjng-2 dockerd[9856]: time="2018-06-02T06:25:01.635522112Z"
level=error msg="Failed to log msg \"\\tat
org.onap.usecaseui.server.util.DmaapSubscriber.getDMaaPData(DmaapSubscriber.java:112)\"
for logger json-file: write
/mnt/containers/36441e295ace461c6c1e5a2a60f8004272c056619d381846349ce47c8900385a/36441e295ace461c6c1e5a2a60f8004272c056619d381846349ce47c8900385a-json.log:
no space left on device"
Jun 2 06:25:01 olc-bjng-2 dockerd[9856]: time="2018-06-02T06:25:01.636113243Z"
level=error msg="Failed to log msg \"\\tat
org.onap.usecaseui.server.util.DmaapSubscriber.subscribe(DmaapSubscriber.java:79)\"
for logger json-file: write
/mnt/containers/36441e295ace461c6c1e5a2a60f8004272c056619d381846349ce47c8900385a/36441e295ace461c6c1e5a2a60f8004272c056619d381846349ce47c8900385a-json.log:
no space left on device"
Jun 2 06:25:01 olc-bjng-2 dockerd[9856]: time="2018-06-02T06:25:01.636429653Z"
level=error msg="Failed to log msg \"\\tat
org.onap.usecaseui.server.util.DmaapSubscriber.run(DmaapSubscriber.java:136)\"
for logger json-file: write
/mnt/containers/36441e295ace461c6c1e5a2a60f8004272c056619d381846349ce47c8900385a/36441e295ace461c6c1e5a2a60f8004272c056619d381846349ce47c8900385a-json.log:
no space left on device"
Jun 2 06:25:01 olc-bjng-2 dockerd[9856]: time="2018-06-02T06:25:01.636802049Z"
level=error msg="Failed to log msg \"\\tat
org.onap.usecaseui.server.UsecaseuiServerApplication.main(UsecaseuiServerApplication.java:44)\"
for logger json-file: write
/mnt/containers/36441e295ace461c6c1e5a2a60f8004272c056619d381846349ce47c8900385a/36441e295ace461c6c1e5a2a60f8004272c056619d381846349ce47c8900385a-json.log:
no space left on device"
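Since the excerpts above show only a handful of distinct dockerd messages, one way to see which of them accounts for the bulk of a multi-GB syslog is to tally entries by level and the first few words of the message. The Python sketch below is only an illustration: it assumes each entry sits on a single line in the real file (the wrapping above comes from the mail client) and follows the level=... msg="..." layout seen in these excerpts. The function names and the three-word prefix are arbitrary choices.

```python
import re
from collections import Counter

# Matches the tail of a dockerd entry as seen in the excerpts: level=... msg="..."
ENTRY = re.compile(r'level=(?P<level>\w+) msg="(?P<msg>.*)"\s*$')


def signature(line, words=3):
    """Return (level, first `words` words of msg), or None if the line does not match."""
    match = ENTRY.search(line)
    if match is None:
        return None
    prefix = " ".join(match.group("msg").split()[:words])
    return (match.group("level"), prefix)


def tally(lines):
    """Count entries per signature; non-matching lines are ignored."""
    counts = Counter()
    for line in lines:
        sig = signature(line)
        if sig is not None:
            counts[sig] += 1
    return counts


def tally_file(path):
    """Stream a syslog file line by line so a very large file is never loaded into memory."""
    with open(path, errors="replace") as handle:
        return tally(handle)
```

For example, `tally_file("/var/log/syslog.1").most_common(5)` would show the five most frequent message types. If ("error", "Failed to log") dominates, the growth comes from dockerd reporting each log line it could not write to the container's json-file log.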