Re: Logging seems to be working, but no logs are collected

Tim Dudgeon Tue, 31 Oct 2017 11:29:22 -0700

On 31/10/2017 18:15, Rich Megginson wrote:

Very strange. It would appear that fluentd was not able to keep up with the log rate to the journal, to such an extent that the fluentd current cursor position was rotated away . . .

That would be strange - the nodes (4 of them) and fluentd have been running for about 3 days and have been 99.9% idle over that period.

You can "reset" fluentd by shutting it down, then removing that cursor file.

Do you mean shut down the pods? Won't the daemon set immediately re-create them?

That will tell fluentd to start reading from the tail of the journal, but NOTE - THAT WILL LOSE ALL RECORDS CURRENTLY IN THE JOURNAL. If you want to try to recover everything in the journal, then oc set env ds/logging-fluentd JOURNAL_READ_FROM_HEAD=true - but note that this may take several hours until you have recent records in Elasticsearch, depending on what the log rate to the journal is and how fast fluentd can keep up.
If you go the JOURNAL_READ_FROM_HEAD=true route, setting the env should trigger a redeployment of fluentd, so you should not have to restart/relabel.
oc label node --all --overwrite logging-infra-fluentd-
... wait for oc get pods to report no logging-fluentd pods ...
rm -f /var/log/journal.pos
oc label node --all --overwrite logging-infra-fluentd=true
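
The "wait for no logging-fluentd pods" step in the sequence above can be scripted with a small polling helper. This is only a sketch and not part of the original procedure: wait_until_empty is a hypothetical helper name, and the "logging" namespace and "component=fluentd" label selector used in the usage note are assumptions about a default 3.6 logging deployment that may not match your cluster.

```shell
#!/bin/sh
# Hypothetical helper: poll a command until it succeeds and prints nothing.
# Usage: wait_until_empty TRIES DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 once the command exits 0 with empty stdout, 1 after TRIES polls.
wait_until_empty() {
    tries=$1
    delay=$2
    shift 2
    while [ "$tries" -gt 0 ]; do
        # stderr is discarded so informational messages do not count as
        # output; a failing command is treated as "not empty yet".
        if out=$("$@" 2>/dev/null) && [ -z "$out" ]; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1
}
```

Between the two oc label commands it could be used as, for example, `wait_until_empty 60 5 oc get pods -n logging -l component=fluentd -o name`, which polls every 5 seconds for up to 5 minutes.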

Then, monitor fluentd like this:
https://github.com/openshift/origin-aggregated-logging/blob/master/hack/testing/entrypoint.sh#L56
and monitor the journald log rate (number of logs/minute) like this:
https://github.com/openshift/origin-aggregated-logging/blob/master/hack/testing/entrypoint.sh#L70
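
As a rough illustration of what measuring the journal log rate can look like, the sketch below counts entries per minute from journalctl -o short-iso output. It is not the method used by the linked entrypoint.sh; count_per_minute is a hypothetical helper, and it assumes every entry line starts with an ISO 8601 timestamp, as the short-iso output format does.

```shell
#!/bin/sh
# Hypothetical helper: read journalctl -o short-iso lines on stdin and
# print "<YYYY-MM-DDTHH:MM> <count>" for each minute, in time order.
count_per_minute() {
    awk '
        # Count only lines whose first field starts with a four-digit year;
        # this skips headers such as "-- Logs begin at ..." and any
        # continuation lines of multi-line messages.
        $1 ~ /^[0-9][0-9][0-9][0-9]-/ { count[substr($1, 1, 16)]++ }
        END { for (minute in count) print minute, count[minute] }
    ' | sort
}
```

A possible invocation on a node is `journalctl -o short-iso --since=-10min | count_per_minute`.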

Will try that. This is just a test system so I'm not concerned about keeping the logfile data, but I might try both approaches to gain experience.


Thanks for your help.


On 10/31/2017 11:57 AM, Tim Dudgeon wrote:

$ sudo docker info | grep -i log
WARNING: Usage of loopback devices is strongly discouraged for production use. Use `--storage-opt dm.thinpooldev` to specify a custom block storage device.
Logging Driver: journald
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled

$ journalctl -r -n 1 --show-cursor
-- Logs begin at Sun 2017-10-29 03:04:42 UTC, end at Tue 2017-10-31 17:54:37 UTC. --
Oct 31 17:54:37 worker-1.openstacklocal dockerd-current[6135]: {"type":"response","@timestamp":"2017-10-31T17:54:37Z","tags":[],"pid":8,"
-- cursor: s=f746c7090d724f5ab0ece0d13683fc53;i=a54f2;b=93b6daa912044dd9ae9f05521c603efc;m=55116ad995;t=55cdb72d7c92d;x=5a16032caedc4423
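
The t= field of a cursor like the one above decodes as the entry's realtime timestamp in hexadecimal microseconds since the Unix epoch, which makes it possible to see what point in time a cursor refers to. journald treats cursors as opaque strings, so the sketch below relies on the field layout seen in this output rather than on a documented format; cursor_realtime_usec is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the t= field of a journald cursor as decimal
# microseconds since the Unix epoch. Assumes t= holds a hexadecimal value.
# Returns 1 if the cursor string has no t= field.
cursor_realtime_usec() {
    # Split the cursor on ';' and keep only the value of the t= field.
    hex=$(printf '%s\n' "$1" | tr ';' '\n' | sed -n 's/^t=//p')
    [ -n "$hex" ] || return 1
    # POSIX shell arithmetic accepts 0x-prefixed hexadecimal constants.
    echo "$((0x$hex))"
}
```

For the cursor shown above this gives 1509472477890861, that is 1509472477 seconds after the epoch, which matches the 17:54:37 UTC timestamp of the entry. Running the same function on the contents of /var/log/journal.pos shows the time of the last entry fluentd recorded.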
On 31/10/2017 17:31, Rich Megginson wrote:
# docker info | grep -i log

# journalctl -r -n 1 --show-cursor


On 10/31/2017 11:12 AM, Tim Dudgeon wrote:
Thanks. Those links are useful.
It looks to me like it's a problem at the fluentd level. This is what I see on one of the fluentd pods:
sh-4.2# cat /var/log/es-containers.log.pos
cat: /var/log/es-containers.log.pos: No such file or directory
sh-4.2# cat /var/log/journal.pos
s=52fdd277f90749b0a442c78739b1efa7;i=50d69;b=2a3f1736a1a1486d83f95db719fdc281;m=5465b53fd1;t=55cdac4738846;x=85596f3f5f5a27e4sh-4.2#
sh-4.2# journalctl -c `cat /var/log/journal.pos`
No journal files were found.
-- No entries --
Which might sort of explain why everything is running but no logs are being processed.
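
One thing worth ruling out when journalctl reports "No journal files were found" is that it is simply not looking where the journal is stored: journald keeps persistent journals under /var/log/journal and volatile ones under /run/log/journal. A minimal sketch for checking which of the two exists follows. find_journal_dir is a hypothetical helper, and its optional root prefix is there only to make it easy to try against a scratch directory.

```shell
#!/bin/sh
# Hypothetical helper: print the first journal directory that exists,
# preferring persistent storage over volatile storage.
# Usage: find_journal_dir [ROOT_PREFIX]
# Returns 1 if neither directory is present.
find_journal_dir() {
    root=${1:-}
    for dir in /var/log/journal /run/log/journal; do
        if [ -d "$root$dir" ]; then
            printf '%s\n' "$root$dir"
            return 0
        fi
    done
    return 1
}
```

Inside the pod, the result can be passed to journalctl explicitly, for example `journalctl -D "$(find_journal_dir)" -n 1 --show-cursor`.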
This is based on a centos7 image with only the necessary openshift packages installed and then openshift installed using ansible. The logging setup in the inventory file is this:
openshift_hosted_logging_deployer_version=v3.6.0
openshift_hosted_logging_deploy=true
openshift_hosted_logging_storage_kind=nfs
openshift_hosted_logging_storage_access_modes=['ReadWriteOnce']
openshift_hosted_logging_storage_nfs_directory=/exports
openshift_hosted_logging_storage_nfs_options='*(rw,root_squash)'
openshift_hosted_logging_storage_volume_name=logging
openshift_hosted_logging_storage_volume_size=10Gi
openshift_hosted_logging_storage_labels={'storage': 'logging'}


Tim


On 31/10/2017 16:37, Jeff Cantrill wrote:
Please provide additional information, logs, etc or post the output of [1] someplace for review. Additionally, consider reviewing [2].
[1] https://github.com/openshift/origin-aggregated-logging/blob/master/hack/logging-dump.sh
[2] https://github.com/openshift/origin-aggregated-logging/blob/master/docs/checking-efk-health.md
On Tue, Oct 31, 2017 at 11:47 AM, Tim Dudgeon <tdudgeon...@gmail.com> wrote:
    Hi All,

    I've deployed logging using the ansible installer (v3.6.0) for a
    fairly simple openshift setup and everything appears to be running:

    NAME                                       READY   STATUS    RESTARTS   AGE
    logging-curator-1-gvh73                    1/1     Running   24         3d
    logging-es-data-master-xz0e7a0c-1-deploy   0/1     Error     0          3d
    logging-es-data-master-xz0e7a0c-4-deploy   0/1     Error     0          3d
    logging-es-data-master-xz0e7a0c-5-deploy   0/1     Error     0          3d
    logging-es-data-master-xz0e7a0c-7-t4xpf    1/1     Running   0          3d
    logging-fluentd-4rm2w                      1/1     Running   0          3d
    logging-fluentd-8h944                      1/1     Running   0          3d
    logging-fluentd-n00bn                      1/1     Running   0          3d
    logging-fluentd-vt8hh                      1/1     Running   0          3d
    logging-kibana-1-g7l4z                     2/2     Running   0          3d

    (the failed pods were related to getting elasticsearch running,
    but that was resolved).

    The problem is that I don't see any logs in Kibana. When I look
    in the fluentd pod logs I see lots of stuff like this:

    2017-10-31 13:53:15 +0000 [warn]: no patterns matched tag="journal.system"
    2017-10-31 13:58:02 +0000 [warn]: no patterns matched tag="kubernetes.journal.container"
    2017-10-31 14:02:18 +0000 [warn]: no patterns matched tag="journal.system"
    2017-10-31 14:07:15 +0000 [warn]: no patterns matched tag="journal.system"
    2017-10-31 14:11:20 +0000 [warn]: no patterns matched tag="journal.system"
    2017-10-31 14:15:16 +0000 [warn]: no patterns matched tag="journal.system"
    2017-10-31 14:19:58 +0000 [warn]: no patterns matched tag="journal.system"

    Is this the cause, and if so what is wrong?
    If not how to debug this?
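
To see which tags fluentd is failing to route, and how often, the warnings can be tallied per tag. This is a minimal sketch: tally_unmatched_tags is a hypothetical helper, and it counts every tag="..." occurrence in its input, so it assumes the input consists mostly of warnings like the ones quoted above.

```shell
#!/bin/sh
# Hypothetical helper: read fluentd log text on stdin and print
# "<tag> <count>" for every tag="..." occurrence, sorted by tag.
# Matching on the tag alone means it also works when a warning has been
# wrapped so that the tag sits on a line of its own.
tally_unmatched_tags() {
    sed -n 's/.*tag="\([^"]*\)".*/\1/p' |
        awk '{ count[$0]++ } END { for (tag in count) print tag, count[tag] }' |
        sort
}
```

It could be fed from one of the pods listed earlier, for example `oc logs logging-fluentd-4rm2w | tally_unmatched_tags`.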

    Tim



    _______________________________________________
    users mailing list
    users@lists.openshift.redhat.com
    http://lists.openshift.redhat.com/openshiftmm/listinfo/users




--
Jeff Cantrill
Senior Software Engineer, Red Hat Engineering
OpenShift Integration Services
Red Hat, Inc.
*Office*: 703-748-4420 | 866-546-8970 ext. 8162420
jcant...@redhat.com
http://www.redhat.com
