Re: Issues with logging and metrics on Origin 3.7

2018-01-09 Thread Eric Wolinetz
On Mon, Jan 8, 2018 at 12:04 PM, Tim Dudgeon <tdudgeon...@gmail.com> wrote:

> Ah, so that makes more sense.
>
> So can I define the persistence properties (e.g. using NFS) in the
> inventory file, specify 'openshift_metrics_install_metrics=false', and
> then run the byo/config.yml playbook? Will that create the PVs but not
> deploy metrics? Then I can later run
> byo/openshift-cluster/openshift-metrics.yml to actually deploy metrics.
>

Correct!
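
For reference, a minimal sketch of what that two-stage flow could look like
(the variable names follow the 3.7 advanced-install docs; the playbook paths
assume a release-3.7 checkout of openshift-ansible):

# inventory: keep the NFS storage variables defined, but leave metrics uninstalled
openshift_metrics_install_metrics=false
openshift_metrics_storage_kind=nfs
openshift_metrics_storage_nfs_directory=/exports
openshift_metrics_storage_volume_name=metrics
openshift_metrics_storage_volume_size=10Gi
openshift_metrics_storage_labels={'storage': 'metrics'}

# 1) the cluster install creates the PVs; metrics itself is skipped
$ ansible-playbook openshift-ansible/playbooks/byo/config.yml

# 2) later, flip openshift_metrics_install_metrics=true and deploy only metrics
$ ansible-playbook openshift-ansible/playbooks/byo/openshift-cluster/openshift-metrics.yml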


> The reason I'm doing this in 2 stages is that I sometimes hit 'Unable to
> allocate memory' problems when trying to deploy everything with
> byo/config.yml (possibly due to the 'forks' setting in ansible.cfg).
>
>
>
> On 08/01/18 17:49, Eric Wolinetz wrote:
>
> I think the issue you're seeing stems from the fact that the logging and
> metrics playbooks do not create their own PVs. That is handled by the
> cluster install playbook.
> The logging and metrics playbooks only create the PVCs that their objects
> may require (unless ephemeral storage is configured).
>
> I admit the naming of the variables makes that confusing; however, it is
> described in our docs under the advanced install section, which uses the
> cluster playbook:
> https://docs.openshift.com/container-platform/3.7/install_config/install/advanced_install.html#advanced-install-cluster-metrics
>
> On Mon, Jan 8, 2018 at 11:22 AM, Tim Dudgeon <tdudgeon...@gmail.com>
> wrote:
>
>> On 08/01/18 16:51, Luke Meyer wrote:
>>
>>
>>
>> On Thu, Jan 4, 2018 at 10:39 AM, Tim Dudgeon <tdudgeon...@gmail.com>
>> wrote:
>>
>>> I'm hitting a number of issues with installing logging and metrics on
>>> Origin 3.7.
>>> This is using CentOS 7 hosts, the release-3.7 branch of openshift-ansible
>>> and NFS for persistent storage.
>>>
>>> I first do a minimal deploy with logging and metrics turned off.
>>> This goes fine. On the NFS server I see various volumes exported under
>>> /exports for logging, metrics and prometheus, even though these are not
>>> deployed, but that's fine; they are there if they become needed.
>>> As expected, there are no PVs related to metrics and logging.
>>>
>>> So I try to install metrics. I add this to the inventory file:
>>>
>>> openshift_metrics_install_metrics=true
>>> openshift_metrics_storage_kind=nfs
>>> openshift_metrics_storage_access_modes=['ReadWriteOnce']
>>> openshift_metrics_storage_nfs_directory=/exports
>>> openshift_metrics_storage_nfs_options='*(rw,root_squash)'
>>> openshift_metrics_storage_volume_name=metrics
>>> openshift_metrics_storage_volume_size=10Gi
>>> openshift_metrics_storage_labels={'storage': 'metrics'}
>>>
>>> and run:
>>>
>>> ansible-playbook openshift-ansible/playbooks/byo/openshift-cluster/openshift-metrics.yml
>>>
>>> All seems to install OK, but metrics can't start, and it turns out that
>>> no PV is created, so the PVC needed by Cassandra can't be satisfied.
>>> So I manually create the PV using this definition:
>>>
>>> apiVersion: v1
>>> kind: PersistentVolume
>>> metadata:
>>>   name: metrics-pv
>>>   labels:
>>>     storage: metrics
>>> spec:
>>>   capacity:
>>>     storage: 10Gi
>>>   accessModes:
>>>   - ReadWriteOnce
>>>   persistentVolumeReclaimPolicy: Recycle
>>>   nfs:
>>>     path: /exports/metrics
>>>     server: nfsserver
>>>
>>> Now the PVC is satisfied and metrics can be started (though pods may
>>> need to be bounced because they have timed out).
>>>
>>> ISSUE 1: why does the metrics PV not get created?
>>>
>>>
>>> So now on to trying to install logging. The approach is similar. Add
>>> this to the inventory file:
>>>
>>> openshift_logging_install_logging=true
>>> openshift_logging_storage_kind=nfs
>>> openshift_logging_storage_access_modes=['ReadWriteOnce']
>>> openshift_logging_storage_nfs_directory=/exports
>>> openshift_logging_storage_nfs_options='*(rw,root_squash)'
>>> openshift_logging_storage_volume_name=logging
>>> openshift_logging_storage_volume_size=10Gi
>>> openshift_logging_storage_labels={'storage': 'logging'}
>>>
>>> and run:
>>> ansible-playbook openshift-ansible/playbooks/byo/openshift-cluster/openshift-logging.yml
>>>
>>> Logging installs fine, and is running fine. Kibana shows logs.
>>> But look at wha

Re: Can I exclude one project or one container to Origin-Aggregated-Logging system?

2017-06-05 Thread Eric Wolinetz
On Tue, May 30, 2017 at 2:55 PM, Office ME2Digtial e. U. <
al...@me2digital.eu> wrote:

> Hi Eric.
>
> Eric Wolinetz wrote on Tue, 30 May 2017 11:47:32 -0500:
>
> > On Tue, May 30, 2017 at 10:46 AM, Aleksandar Lazic
> > <al...@me2digital.eu> wrote:
> >
> > > Hi.
> > >
> > > AFAIK there is no option for this.
> > >
> > > Best regards
> > > Aleks
> > >
> > > "Stéphane Klein" <cont...@stephane-klein.info> schrieb am
> > > 30.05.2017:
> > >> Hi,
> > >>
> > >> I just read the origin-aggregated-logging
> > >> <https://github.com/openshift/origin-aggregated-logging>
> > >> documentation and I didn't find whether I can exclude one project or
> > >> one container from the logging system.
> > >>
> > >
> > You can update your Fluentd configmap to drop the records so that they
> > aren't sent to ES.
> >
> > In the fluent.conf section you can add a stanza like the following.
> >
> > Please note the "**_" before and "_**" after the project names; this
> > is needed to correctly match the record pattern.
> >
> > ...
> > <match **_project-name_**>
> >   @type null
> > </match>
> > ## filters
> > ...
> >
> > You can also specify multiple projects on this match if you so desire
> > by separating the patterns with spaces:
> > <match **_project-one_** **_project-two_**>
>
> Ah you are referring to
> http://docs.fluentd.org/v0.12/articles/config-file#2-ldquomatchrdquo-tell-fluentd-what-to-do
>
>
Correct. Depending on your version of OpenShift logging that has been
deployed, you should be able to edit the fluent.conf file section within
the logging-fluentd configmap.
$ oc edit configmap/logging-fluentd
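
As a rough sketch of where that edit lands (the project name "myproject" and
the surrounding context are placeholders; the exact layout of fluent.conf
varies between logging image versions):

  fluent.conf: |
    ## sources
    ...
    <match **_myproject_**>
      @type null
    </match>
    ## filters
    ...

Depending on the version, the logging-fluentd pods may need to be restarted
after the edit before they pick up the new configuration.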


> thanks
>
> > >> Is it possible with container labels, or another system?
> > >>
> > >> Best regards,
> > >> Stéphane
> > >> --
> > >> Stéphane Klein <cont...@stephane-klein.info>
> > >> blog: http://stephane-klein.info
> > >> cv : http://cv.stephane-klein.info
> > >> Twitter: http://twitter.com/klein_stephane
>
> --
> Best Regards
> Aleksandar Lazic - ME2Digital e. U.
> https://me2digital.online/


Re: Can I exclude one project or one container to Origin-Aggregated-Logging system?

2017-05-30 Thread Eric Wolinetz
On Tue, May 30, 2017 at 10:46 AM, Aleksandar Lazic 
wrote:

> Hi.
>
> AFAIK there is no option for this.
>
> Best regards
> Aleks
>
> "Stéphane Klein"  schrieb am 30.05.2017:
>
>> Hi,
>>
>> I just read the origin-aggregated-logging documentation
>> and I didn't find whether I can exclude one project or one container from
>> the logging system.
>>
>
You can update your Fluentd configmap to drop the records so that they
aren't sent to ES.

In the fluent.conf section you can add a stanza like the following.

Please note the "**_" before and "_**" after the project names; this is
needed to correctly match the record pattern.

...
<match **_project-name_**>
  @type null
</match>
## filters
...

You can also specify multiple projects on this match if you so desire by
separating the patterns with spaces:
<match **_project-one_** **_project-two_**>


>> Is it possible with container labels, or another system?
>>
>> Best regards,
>> Stéphane
>> --
>> Stéphane Klein 
>> blog: http://stephane-klein.info
>> cv : http://cv.stephane-klein.info
>> Twitter: http://twitter.com/klein_stephane
>>


Re: Kibana Logs Empty

2016-08-16 Thread Eric Wolinetz
Realized I never replied-all... Re-adding users_list

On Mon, Aug 15, 2016 at 10:58 AM, Eric Wolinetz <ewoli...@redhat.com> wrote:

> Fluentd tries to connect to both "logging-es" | "logging-es-ops" in the
> logging namespace (if you're using the ops deployment) and "kubernetes" in
> the default namespace.  I think in this case it is having trouble
> connecting to the kubernetes service to look up metadata for your
> containers.
>
>
> On Mon, Aug 15, 2016 at 10:54 AM, Frank Liauw <fr...@vsee.com> wrote:
>
>> Oh stupid me; I was confused by my own namespaces; was looking at the
>> wrong namespace, thinking that's the one with pods that have an active log
>> stream. The logs are ingested fine, thanks for your assistance! :)
>>
>> On the possible DNS issue of fluentd on one of my nodes, what hostname is
>> fluentd trying to reach when starting up? We did perform some network
>> changes to this particular node to aid public routing, but as far as the
>> routing table is concerned, it should not have made a difference for local
>> traffic.
>>
>> Normal functioning node without public routing changes
>>
>> [root@node1 network-scripts]# route -n
>> Kernel IP routing table
>> Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
>> 0.0.0.0         10.10.0.5       0.0.0.0         UG    100    0        0 ens160
>> 10.1.0.0        0.0.0.0         255.255.0.0     U     0      0        0 tun0
>> 10.10.0.0       0.0.0.0         255.255.0.0     U     100    0        0 ens160
>> 172.30.0.0      0.0.0.0         255.255.0.0     U     0      0        0 tun0
>>
>> Malfunctioning node with public routing changes
>>
>> [root@node2 network-scripts]# route -n
>> Kernel IP routing table
>> Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
>> 0.0.0.0         199.27.105.1    0.0.0.0         UG    100    0        0 ens192
>> 10.0.0.0        10.10.0.5       255.0.0.0       UG    100    0        0 ens160
>> 10.1.0.0        0.0.0.0         255.255.0.0     U     0      0        0 tun0
>> 10.10.0.0       0.0.0.0         255.255.0.0     U     100    0        0 ens160
>> 172.30.0.0      0.0.0.0         255.255.0.0     U     0      0        0 tun0
>> 199.27.105.0    0.0.0.0         255.255.255.128 U     100    0        0 ens192
>>
>> Frank
>> Systems Engineer
>>
>> VSee: fr...@vsee.com <http://vsee.com/u/tmd4RB> | Cell: +65 9338 0035
>>
>> Join me on VSee for Free <http://vsee.com/u/tmd4RB>
>>
>>
>>
>>
>> On Mon, Aug 15, 2016 at 11:23 PM, Eric Wolinetz <ewoli...@redhat.com>
>> wrote:
>>
>>> Correct, the way Fluentd pulls in the logs for your other containers is
>>> the same pipeline used for collecting logs for the Kibana pod shown below.
>>>
>>> Going back to your ES logs, can you verify the date portion of a
>>> microsvc index line?
>>> We can then update the time range in the upper-right corner of Kibana to
>>> change from the last hour to something like the last month (something that
>>> would encompass the date for the index).
>>>
>>>
>>> On Mon, Aug 15, 2016 at 10:15 AM, Frank Liauw <fr...@vsee.com> wrote:
>>>
>>>> Screencap is as follows:
>>>>
>>>>
>>>> The query is as simple as it gets, *. I see my namespaces / projects as
>>>> indexes.
>>>>
>>>> I see logs for logging project just fine:
>>>>
>>>>
>>>>
>>>> Fluentd is not ingesting the logs for pods in my namespaces. I have yet to
>>>> pull apart how Fluentd does that, though there's no reason why logs for my
>>>> other pods aren't getting indexed while the Kibana logs are, if they are both
>>>> ingested by Fluentd, assuming that Kibana logs use the same pipeline as all
>>>> other pod logs.
>>>>
>>>> Frank
>>>> Systems Engineer
>>>>
>>>> VSee: fr...@vsee.com <http://vsee.com/u/tmd4RB> | Cell: +65 9338 0035
>>>>
>>>> Join me on VSee for Free <http://vsee.com/u/tmd4RB>
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Aug 15, 2016 at 10:59 PM, Eric Wolinetz <ewoli...@redhat.com>
>>>> wrote:
>>>>
>>>>> Can you either send a screencap of your Kibana console? Or describe
>>>>> how you are accessing Kibana a

Re: logging-es errors: shards failed

2016-07-15 Thread Eric Wolinetz
The logging-ops instance will contain the logs from /var/log/messages* and
the "default", "openshift" and "openshift-infra" name spaces only.

On Fri, Jul 15, 2016 at 3:28 PM, Alex Wauck <alexwa...@exosite.com> wrote:

> I also tried to fetch the logs from our logging-ops ES instance.  That
> also met with failure.  Searching for "kubernetes_namespace_name: logging"
> there led to "No results found".
>
> On Fri, Jul 15, 2016 at 2:48 PM, Peter Portante <pport...@redhat.com>
> wrote:
>
>> Well, we don't send ES logs to itself.  I think you can create a
>> feedback loop that breaks the whole thing down.
>> -peter
>>
>> On Fri, Jul 15, 2016 at 3:39 PM, Luke Meyer <lme...@redhat.com> wrote:
>> > They surely do. Although it would probably be easiest here to just get
>> them
>> > from `oc logs` against the ES pod, especially if we can't trust ES
>> storage.
>> >
>> > On Fri, Jul 15, 2016 at 3:26 PM, Peter Portante <pport...@redhat.com>
>> wrote:
>> >>
>> >> Eric, Luke,
>> >>
>> >> Do the logs from the ES instance itself flow into that ES instance?
>> >>
>> >> -peter
>> >>
>> >> On Fri, Jul 15, 2016 at 12:14 PM, Alex Wauck <alexwa...@exosite.com>
>> >> wrote:
>> >> > I'm not sure that I can.  I clicked the "Archive" link for the
>> >> > logging-es
>> >> > pod and then changed the query in Kibana to
>> "kubernetes_container_name:
>> >> > logging-es-cycd8veb && kubernetes_namespace_name: logging".  I got no
>> >> > results, instead getting this error:
>> >> >
>> >> > Index:
>> unrelated-project.92c37428-11f6-11e6-9c83-020b5091df01.2016.07.12
>> >> > Shard: 2 Reason: EsRejectedExecutionException[rejected execution
>> (queue
>> >> > capacity 1000) on
>> >> >
>> >> >
>> org.elasticsearch.search.action.SearchServiceTransportAction$23@6b1f2699]
>> >> > Index:
>> unrelated-project.92c37428-11f6-11e6-9c83-020b5091df01.2016.07.14
>> >> > Shard: 2 Reason: EsRejectedExecutionException[rejected execution
>> (queue
>> >> > capacity 1000) on
>> >> >
>> >> >
>> org.elasticsearch.search.action.SearchServiceTransportAction$23@66b9a5fb]
>> >> > Index:
>> unrelated-project.92c37428-11f6-11e6-9c83-020b5091df01.2016.07.15
>> >> > Shard: 2 Reason: EsRejectedExecutionException[rejected execution
>> (queue
>> >> > capacity 1000) on
>> >> >
>> org.elasticsearch.search.action.SearchServiceTransportAction$23@512820e]
>> >> > Index:
>> unrelated-project.f38ac6ff-3e42-11e6-ab71-020b5091df01.2016.06.29
>> >> > Shard: 2 Reason: EsRejectedExecutionException[rejected execution
>> (queue
>> >> > capacity 1000) on
>> >> >
>> >> >
>> org.elasticsearch.search.action.SearchServiceTransportAction$23@3dce96b9]
>> >> > Index:
>> unrelated-project.f38ac6ff-3e42-11e6-ab71-020b5091df01.2016.06.30
>> >> > Shard: 2 Reason: EsRejectedExecutionException[rejected execution
>> (queue
>> >> > capacity 1000) on
>> >> >
>> >> >
>> org.elasticsearch.search.action.SearchServiceTransportAction$23@2f774477]
>> >> >
>> >> > When I initially clicked the "Archive" link, I saw a lot of messages
>> >> > with
>> >> > the kubernetes_container_name "logging-fluentd", which is not what I
>> >> > expected to see.
>> >> >
>> >> >
>> >> > On Fri, Jul 15, 2016 at 10:44 AM, Peter Portante <
>> pport...@redhat.com>
>> >> > wrote:
>> >> >>
>> >> >> Can you go back further in the logs to the point where the errors
>> >> >> started?
>> >> >>
>> >> >> I am thinking about possible Java HEAP issues, or possibly ES
>> >> >> restarting for some reason.
>> >> >>
>> >> >> -peter
>> >> >>
>> >> >> On Fri, Jul 15, 2016 at 11:37 AM, Lukáš Vlček <lvl...@redhat.com>
>> >> >> wrote:
>> >> >> > Also looking at this.
>> >> >> > Alex, is it possible to investigate if you were having some kind
>> of
>> >> >> > network 

Re: logging-es errors: shards failed

2016-07-15 Thread Eric Wolinetz
Adding Lukas and Peter

On Fri, Jul 15, 2016 at 8:07 AM, Luke Meyer  wrote:

> I believe the "queue capacity" there is the number of parallel searches
> that can be queued while the existing search workers operate. It sounds
> like it has plenty of capacity there and it has a different reason for
> rejecting the query. I would guess the data requested is missing given it
> couldn't fetch shards it expected to.
>
> The number of shards is a multiple (for redundancy) of the number of
> indices, and there is an index created per project per day. So even for a
> small cluster this doesn't sound out of line.
>
> Can you give a little more information about your logging deployment? Have
> you deployed multiple ES nodes for redundancy, and what are you using for
> storage? Could you attach full ES logs? How many OpenShift nodes and
> projects do you have? Any history of events that might have resulted in
> lost data?
>
> On Thu, Jul 14, 2016 at 4:06 PM, Alex Wauck  wrote:
>
>> When doing searches in Kibana, I get error messages similar to "Courier
>> Fetch: 919 of 2020 shards failed".  Deeper inspection reveals errors like
>> this: "EsRejectedExecutionException[rejected execution (queue capacity
>> 1000) on
>> org.elasticsearch.search.action.SearchServiceTransportAction$23@14522b8e
>> ]".
>>
>> A bit of investigation lead me to conclude that our Elasticsearch server
>> was not sufficiently powerful, but I spun up a new one with four times the
>> CPU and RAM of the original one, but the queue capacity is still only
>> 1000.  Also, 2020 seems like a really ridiculous number of shards.  Any
>> idea what's going on here?
>>
>> --
>>
>> Alex Wauck // DevOps Engineer
>>
>> *E X O S I T E*
>> *www.exosite.com *
>>
>> Making Machines More Human.
>>
>>


Re: ENABLE_OPS_CLUSTER

2016-06-15 Thread Eric Wolinetz
On Wed, Jun 15, 2016 at 4:01 PM, Srinivas Naga Kotaru (skotaru) <
skot...@cisco.com> wrote:

> Hi
>
>
>
> While deploying the EFK stack, I didn't toggle ENABLE_OPS_CLUSTER to true.
> The default value is "false".
>
>
>
> Now my EFK stack is fully installed and working fine. Is there any way we
> can enable OPS logs without deleting the whole stack and recreating it?
>

If you do not need physical separation of your operations logs and your
application logs, you can leave ENABLE_OPS_CLUSTER as false.
Setting it to true doesn't add any extra logs; it just creates a second
Elasticsearch cluster (the ops cluster) and an Ops Kibana instance to serve up
the logs within the Elasticsearch ops cluster, and it tells Fluentd to send the
operations logs it is processing to this new cluster instead.

To be honest, I would recommend reinstalling with ENABLE_OPS_CLUSTER=true
and tricking Fluentd into reprocessing all your logs as if it were a new
installation.  You are missing the ops templates for the different
components, which will come in handy, especially when you later want to scale
up the number of ES nodes for a cluster.

Also, you get the added benefit that your operations logs aren't in the same
ES cluster as your application logs (the main benefit of this deployment
option).

You can trick Fluentd into reprocessing logs on its node by
1. Stop Fluentd on that node
2. Delete the "/var/log/es-containers.log.pos" and "/var/log/node.log.pos"
files on that node
3. Start Fluentd on that node again, it will act as if it had not processed
any log files yet
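
A minimal sketch of those steps, assuming Fluentd runs as a daemonset selected
by the logging-infra-fluentd node label (as in the deployer-based install):

# 1. stop Fluentd on that node
$ oc label node <node> logging-infra-fluentd=false --overwrite
# 2. clear the position files
$ ssh <node> 'rm -f /var/log/es-containers.log.pos /var/log/node.log.pos'
# 3. start Fluentd again; it will re-read the logs from the beginning
$ oc label node <node> logging-infra-fluentd=true --overwrite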


>
>
>
>
> --
>
> *Srinivas Kotaru*
>


Re: OpenShift Origin: Kibana show indexes with additional string

2016-06-03 Thread Eric Wolinetz
Hi Den,

This is to address a security bug: if a user created a project and then
deleted it, another user who recreated a project with the same name would be
able to view the previous project's logs.

Due to the nature of Kibana and how our ACLs are enforced, we now include
the project's UUID as part of the project's index name.

It was described, albeit hidden, in the manual upgrade steps of Aggregated
logging here:
https://docs.openshift.org/latest/install_config/upgrading/manual_upgrades.html#manual-upgrading-efk-logging-stack
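
If you want to map one of those index patterns back to its project, the UUID
in the index name should be the project's metadata.uid; a quick (sketch) way
to check, using the first project from the list below as an example:

$ oc get project dev-proj1 -o yaml | grep uid
  uid: 1dd68e1e-1e8c-11e6-baf9-064081126234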

On Fri, Jun 3, 2016 at 1:26 AM, Den Cowboy  wrote:

> I've set up aggregated logging on OpenShift Origin.
> It works fine but my Kibana shows my projects in the following way:
>
>
>- .all
>- .operations.*
>- dev-proj1.1dd68e1e-1e8c-11e6-baf9-064081126234.*
>- dev-proj2.3cdda625-1da8-11e6-baf9-064081126234.*
>- dev-proj3.5e94ef86-1cf0-11e6-baf9-064081126234.*
>- dev-proj4.6cfe3919-28c8-11e6-8b8f-064081126234.*
>- dev-proj5.728abf75-019c-11e6-8b8f-064081126234.*
>
>
> While it was shown this way (when we were using origin 1.1)
>
>
>- .all
>- .operations.*
>- dev-proj1.*
>- dev-proj2.*
>- dev-proj3.*
>- dev-proj4.*
>- dev-proj5.*
>
>
> What's the reason for this behaviour?
>
> Thanks in advance.
>


Re: aggregate logging: kibana error

2016-04-20 Thread Eric Wolinetz
Hi Sebastian,

Your Elasticsearch instance does not seem to have started up completely
within the pod you showed logs for.  Kibana will fail to start up if it is
unable to reach its Elasticsearch instance after a certain period of time.

Can you send some more of your Elasticsearch logs?  It looks like it's
currently recovering/initializing.  Do you see any different ERROR messages
within there?

Your Fluentd errors look to be something else. What does the following
look like?
$ oc describe pod -l component=fluentd

On Wed, Apr 20, 2016 at 2:54 AM, Sebastian Wieseler <
sebast...@myrepublic.com.sg> wrote:

> Dear community,
> I followed the guide
> https://docs.openshift.org/latest/install_config/aggregate_logging.html
>
> NAME                     READY     STATUS    RESTARTS   AGE
> logging-kibana-1-uwob1   1/2       Error     12         43m
>
>
> $ oc logs logging-kibana-1-uwob1  -c kibana
> {"name":"Kibana","hostname":"logging-kibana-1-uwob1","pid":7,"level":50,"err":{"message":"Request
> Timeout after 5000ms","name":"Error","stack":"Error: Request Timeout after
> 5000ms\nat null.
> (/opt/app-root/src/src/node_modules/elasticsearch/src/lib/transport.js:282:15)\n
>   at Timer.listOnTimeout [as ontimeout]
> (timers.js:112:15)"},"msg":"","time":"2016-04-20T07:16:15.760Z","v":0}
> {"name":"Kibana","hostname":"logging-kibana-1-uwob1","pid":7,"level":60,"err":{"message":"Request
> Timeout after 5000ms","name":"Error","stack":"Error: Request Timeout after
> 5000ms\nat null.
> (/opt/app-root/src/src/node_modules/elasticsearch/src/lib/transport.js:282:15)\n
>   at Timer.listOnTimeout [as ontimeout]
> (timers.js:112:15)"},"msg":"","time":"2016-04-20T07:16:15.762Z","v":0}
> [root@MRNZ-TS8-OC-MASTER-01 glusterfs]# oc logs
> logging-kibana-1-uwob1  -c kibana
> {"name":"Kibana","hostname":"logging-kibana-1-uwob1","pid":7,"level":50,"err":{"message":"Request
> Timeout after 5000ms","name":"Error","stack":"Error: Request Timeout after
> 5000ms\nat null.
> (/opt/app-root/src/src/node_modules/elasticsearch/src/lib/transport.js:282:15)\n
>   at Timer.listOnTimeout [as ontimeout]
> (timers.js:112:15)"},"msg":"","time":"2016-04-20T07:38:40.789Z","v":0}
> {"name":"Kibana","hostname":"logging-kibana-1-uwob1","pid":7,"level":60,"err":{"message":"Request
> Timeout after 5000ms","name":"Error","stack":"Error: Request Timeout after
> 5000ms\nat null.
> (/opt/app-root/src/src/node_modules/elasticsearch/src/lib/transport.js:282:15)\n
>   at Timer.listOnTimeout [as ontimeout]
> (timers.js:112:15)"},"msg":"","time":"2016-04-20T07:38:40.790Z","v":0}
>
>
> Elastic search pod is running, but the log shows:
> [2016-04-20
> 06:57:03,910][ERROR][io.fabric8.elasticsearch.plugin.acl.DynamicACLFilter]
> [Baphomet] Exception encountered when seeding initial ACL
> org.elasticsearch.cluster.block.ClusterBlockException: blocked by:
> [SERVICE_UNAVAILABLE/1/state not recovered / initialized];
> at
> org.elasticsearch.cluster.block.ClusterBlocks.globalBlockedException(ClusterBlocks.java:151)
>
>   at 
> org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction.checkGlobalBlock(TransportShardSingleOperationAction.java:103)
> at
> org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.(TransportShardSingleOperationAction.java:132)
> at
> org.elasticsearch.action.support.single.shard.TransportShardSingleOperationAction$AsyncSingleAction.(TransportShardSingleOperationAction.java:116)
>
> Fluent pod is running too, but the log shows:
> 2016-04-20 07:47:18 + [error]: fluentd main process died unexpectedly.
> restarting.
> 2016-04-20 07:47:48 + [error]: unexpected error error="getaddrinfo:
> Name or service not known"
>   2016-04-20 07:47:48 + [error]: /usr/share/ruby/net/http.rb:878:in
> `initialize'
>   2016-04-20 07:47:48 + [error]: /usr/share/ruby/net/http.rb:878:in
> `open'
>   2016-04-20 07:47:48 + [error]: /usr/share/ruby/net/http.rb:878:in
> `block in connect'
>   2016-04-20 07:47:48 + [error]: /usr/share/ruby/timeout.rb:52:in
> `timeout'
>   2016-04-20 07:47:48 + [error]: /usr/share/ruby/net/http.rb:877:in
> `connect'
>   2016-04-20 07:47:48 + [error]: /usr/share/ruby/net/http.rb:862:in
> `do_start'
>   2016-04-20 07:47:48 + [error]: /usr/share/ruby/net/http.rb:851:in
> `start'
>   2016-04-20 07:47:48 + [error]:
> /opt/app-root/src/gems/rest-client-1.8.0/lib/restclient/request.rb:413:in
> `transmit'
>   2016-04-20 07:47:48 + [error]:
> /opt/app-root/src/gems/rest-client-1.8.0/lib/restclient/request.rb:176:in
> `execute'
>   2016-04-20 07:47:48 + [error]:
> /opt/app-root/src/gems/rest-client-1.8.0/lib/restclient/request.rb:41:in
> `execute'
>   2016-04-20 07:47:48 + [error]:
> /opt/app-root/src/gems/rest-client-1.8.0/lib/restclient/resource.rb:51:in
> `get'
>   2016-04-20 07:47:48 + [error]:
> /opt/app-root/src/gems/kubeclient-1.1.2/lib/kubeclient/common.rb:310:in
> 

Re: Aggregating container logs using Kibana

2016-04-14 Thread Eric Wolinetz
Just a heads up, the latest deployer image on Dockerhub has an updated
fluentd template that already contains the change for Fluentd to run in the
privileged security context.
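
A quick way to check whether the template already in your project carries that
change (a sketch; adjust the template name if yours differs):

$ oc get template logging-fluentd-template -o yaml | grep -B1 "privileged: true"
      securityContext:
        privileged: true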

On Wed, Apr 13, 2016 at 11:24 AM, Eric Wolinetz <ewoli...@redhat.com> wrote:

>
>
> On Wed, Apr 13, 2016 at 3:16 AM, Lorenz Vanthillo <
> lorenz.vanthi...@outlook.com> wrote:
>
>> I saw on https://github.com/openshift/origin/issues/8358:
>>
>>
>> $ oc debug pod/logging-fluentd-80xzt -- cat /proc/self/attr/current
>> Debugging with pod/debug-logging-fluentd-80xzt, original command: <image entrypoint>
>> Waiting for pod to start ...
>> system_u:system_r:svirt_lxc_net_t:s0:c216,c576
>>
>> Removing debug pod ...
>>
>>
>> Yup. The problem was what I thought: it's being run under the
>> svirt_lxc_net_t SELinux type, which doesn't have access to var_log_t. If
>> you don't want to disable SELinux, you'll need to follow the instructions
>> for creating a new SELinux type that I posted above.
>>
>> So I understand what's wrong, but I don't see why the workaround (changing
>> the service account permissions from anyuid to privileged) isn't working
>> for me, and I don't want to create a new SELinux type.
>>
>
> Sorry about that, we had missed a step.  You'll need to delete your
> daemonset, edit your logging-fluentd-template to add a property to your
> container spec and recreate your daemonset to let it properly run as
> privileged to escape the SELinux enforcing.
>
> $ oc delete daemonset logging-fluentd
>
> $ oc edit template/logging-fluentd-template
>
>
> # Please edit the object below. Lines beginning with a '#' will be ignored,
> # and an empty file will abort the edit. If an error occurs while saving
> this file will be
> # reopened with the relevant failures.
> #
> apiVersion: v1
> kind: Template
> labels:
>   component: fluentd
> . . .
> objects:
> - apiVersion: extensions/v1beta1
>   kind: DaemonSet
> . . .
>   spec:
>     selector:
>       matchLabels:
>         component: fluentd
>         provider: openshift
>     template:
>       metadata:
>         labels:
>           component: fluentd
>           provider: openshift
>         name: fluentd-elasticsearch
>       spec:
>         containers:
>         . . .
>           name: fluentd-elasticsearch
>
> # insert below here
>           securityContext:
>             privileged: true
> # insert above here
>
>           resources:
>             limits:
>               cpu: 100m
> . . .
>
> $ oc process logging-fluentd-template | oc create -f -
>
>
>> --
>> From: lorenz.vanthi...@outlook.com
>> To: ewoli...@redhat.com
>> CC: users@lists.openshift.redhat.com
>> Subject: RE: Aggregating container logs using Kibana
>> Date: Wed, 13 Apr 2016 09:30:48 +0200
>>
>>
>> Fixed the issue with nodeselectormismatching:
>> So now I have 3 fluentd pods on my 2 normal nodes and my infranode:
>> But still the same permission issue:
>> NAME                          READY     STATUS      RESTARTS   AGE
>> logging-curator-1-j7mz0       1/1       Running     0          17m
>> logging-deployer-39qcz        0/1       Completed   0          47m
>> logging-es-605u5g7g-1-36owl   1/1       Running     0          17m
>> logging-fluentd-4uqx1         1/1       Running     0          46m
>> logging-fluentd-dez5r         1/1       Running     0          2m
>> logging-fluentd-m50nj         1/1       Running     0          46m
>> logging-kibana-1-wfog2        2/2       Running     0          16m
>>
>> --
>> From: lorenz.vanthi...@outlook.com
>> To: ewoli...@redhat.com
>> CC: users@lists.openshift.redhat.com
>> Subject: RE: Aggregating container logs using Kibana
>> Date: Wed, 13 Apr 2016 09:21:47 +0200
>>
>> Hi Eric,
>>
>> Thanks for your reply and the follow up of this issue.
>> I've created a new origin 1.1.6 cluster (2 days ago) but still have the
>> same issue:
>> My environment is one master (with node) non schedulable, 2 'normal'
>> nodes and one infra node.
>> I still got the permission denied error (the documentation is up to date, so I
>> didn't even have to perform the workaround manually).
>> - system:serviceaccount:logging:aggregated-logging-fluentd is in scc
>> privileged by default.
>>
>> The logging-deployer-template creates services and 2 pods of fluentd (on
>> the normal nodes).
>> The pods appear after performing this command:
>>
>> oc label nodes --all logging-infra-fluentd=true
>>
>>

Re: Aggregating container logs using Kibana

2016-04-13 Thread Eric Wolinetz
On Wed, Apr 13, 2016 at 3:16 AM, Lorenz Vanthillo <
lorenz.vanthi...@outlook.com> wrote:

> I saw on https://github.com/openshift/origin/issues/8358:
>
>
> $ oc debug pod/logging-fluentd-80xzt -- cat /proc/self/attr/current
> Debugging with pod/debug-logging-fluentd-80xzt, original command: <image entrypoint>
> Waiting for pod to start ...
> system_u:system_r:svirt_lxc_net_t:s0:c216,c576
>
> Removing debug pod ...
>
>
> Yup. The problem was what I thought: it's being run under the
> svirt_lxc_net_t SELinux type, which doesn't have access to var_log_t. If
> you don't want to disable SELinux, you'll need to follow the instructions
> for creating a new SELinux type that I posted above.
>
> So I understand what's wrong, but I don't see why the workaround (changing
> the service account permissions from anyuid to privileged) isn't working
> for me, and I don't want to create a new SELinux type.
>

Sorry about that, we had missed a step.  You'll need to delete your
daemonset, edit your logging-fluentd-template to add a property to your
container spec and recreate your daemonset to let it properly run as
privileged to escape the SELinux enforcing.

$ oc delete daemonset logging-fluentd

$ oc edit template/logging-fluentd-template


# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving
this file will be
# reopened with the relevant failures.
#
apiVersion: v1
kind: Template
labels:
  component: fluentd
. . .
objects:
- apiVersion: extensions/v1beta1
  kind: DaemonSet
. . .
  spec:
    selector:
      matchLabels:
        component: fluentd
        provider: openshift
    template:
      metadata:
        labels:
          component: fluentd
          provider: openshift
        name: fluentd-elasticsearch
      spec:
        containers:
        . . .
          name: fluentd-elasticsearch

# insert below here
          securityContext:
            privileged: true
# insert above here

          resources:
            limits:
              cpu: 100m
. . .

$ oc process logging-fluentd-template | oc create -f -
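
To double-check that the recreated daemonset pods actually run privileged,
something like the following should show the flag (a sketch):

$ oc get pod -l component=fluentd -o yaml | grep -A1 securityContext
      securityContext:
        privileged: true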


> --
> From: lorenz.vanthi...@outlook.com
> To: ewoli...@redhat.com
> CC: users@lists.openshift.redhat.com
> Subject: RE: Aggregating container logs using Kibana
> Date: Wed, 13 Apr 2016 09:30:48 +0200
>
>
> Fixed the issue with nodeselectormismatching:
> So now I have 3 fluentd pods on my 2 normal nodes and my infranode:
> But still the same permission issue:
> NAME                          READY     STATUS      RESTARTS   AGE
> logging-curator-1-j7mz0       1/1       Running     0          17m
> logging-deployer-39qcz        0/1       Completed   0          47m
> logging-es-605u5g7g-1-36owl   1/1       Running     0          17m
> logging-fluentd-4uqx1         1/1       Running     0          46m
> logging-fluentd-dez5r         1/1       Running     0          2m
> logging-fluentd-m50nj         1/1       Running     0          46m
> logging-kibana-1-wfog2        2/2       Running     0          16m
>
> --
> From: lorenz.vanthi...@outlook.com
> To: ewoli...@redhat.com
> CC: users@lists.openshift.redhat.com
> Subject: RE: Aggregating container logs using Kibana
> Date: Wed, 13 Apr 2016 09:21:47 +0200
>
> Hi Eric,
>
> Thanks for your reply and the follow up of this issue.
> I've created a new origin 1.1.6 cluster (2 days ago) but still have the
> same issue:
> My environment is one master (with node) non schedulable, 2 'normal' nodes
> and one infra node.
> I still got the permission denied error (the documentation is up to date, so I
> didn't even have to perform the workaround manually).
> - system:serviceaccount:logging:aggregated-logging-fluentd is in scc
> privileged by default.
>
> The logging-deployer-template creates services and 2 pods of fluentd (on
> the normal nodes).
> The pods appear after performing this command:
>
> oc label nodes --all logging-infra-fluentd=true
>
> So my nodes got that label, including the unschedulable node on my master. So
> it's normal that it failed there, but why it fails on my infra node I don't
> know. (I defined in my master-config that projects go by default on the
> other 2 nodes; maybe that's why, but I don't know if it's relevant to my
> issue.)
> I also don't really understand why 'oc process logging-support-template |
> oc create -f -' is only cited in the troubleshooting part.
> Still the error: [error]: unexpected error error_class=Errno::EACCES
> error=#
>
> oc get is
> NAME                    DOCKER REPO                                         TAGS            UPDATED
> logging-auth-proxy      docker.io/openshift/origin-logging-auth-proxy      latest,v0.0.1   4 minutes ago
> logging-curator         docker.io/openshift/origin-logging-curator         latest          4 minutes ago
> logging-elasticsearch   docker.io/openshift/origin-logging-elasticsearch   latest          4 minutes ago
> logging-fluentd         docker.io/openshift/origin-logging-fluentd         latest          4 minutes ago
> 

Re: Set up logging: Kibana

2016-03-21 Thread Eric Wolinetz
It looks like it's failing to process the fluentd template for some reason.

Can you send the output from 'oc get template logging-fluentd-template -o
yaml' ?

On Mon, Mar 21, 2016 at 7:31 AM, Den Cowboy  wrote:

> Some logs: the pod fails (after executing oc process
> logging-deployer-template -n openshift ...
>
> (Re-)Creating deployed objects
> No resources found
> + oc process logging-support-pre-template
> + oc create -f -
> serviceaccount "aggregated-logging-kibana" created
> serviceaccount "aggregated-logging-elasticsearch" created
> serviceaccount "aggregated-logging-fluentd" created
> serviceaccount "aggregated-logging-curator" created
> service "logging-es" created
> service "logging-es-cluster" created
> service "logging-es-ops" created
> service "logging-es-ops-cluster" created
> service "logging-kibana" created
> service "logging-kibana-ops" created
> + oc delete dc,rc,pod --selector logging-infra=curator
> No resources found
> + oc delete dc,rc,pod --selector logging-infra=kibana
> No resources found
> + oc delete dc,rc,pod --selector logging-infra=fluentd
> No resources found
> + oc delete dc,rc,pod --selector logging-infra=elasticsearch
> No resources found
> + (( n=0 ))
> + (( n<1 ))
> + oc process logging-es-template
> + oc create -f -
> deploymentconfig "logging-es-6jldefop" created
> + (( n++ ))
> + (( n<1 ))
> + oc process logging-fluentd-template
> + oc create -f -
> json: cannot unmarshal object into Go value of type string
>
> --
> From: dencow...@hotmail.com
> To: users@lists.openshift.redhat.com
> Subject: Set up logging: Kibana
> Date: Mon, 21 Mar 2016 12:22:01 +
>
>
> I try to set up the logging system of Kibana:
> https://docs.openshift.org/latest/install_config/aggregate_logging.html
>
> I'm able to perform the steps till
>
> oc process logging-deployer-template -n openshift \
>    -v KIBANA_HOSTNAME=kibana.example.com,ES_CLUSTER_SIZE=1,PUBLIC_MASTER_URL=https://localhost:8443 \
>    | oc create -f -
>
> This creates a deployer pod and it creates some services + 2 pods (logging-es
> and logging-es-cluster).
> Then I perform:
> oc process logging-support-template | oc create -f -
>
> This creates the following:
> oauthclient "kibana-proxy" created
> route "kibana" created
> route "kibana-ops" created
> imagestream "logging-auth-proxy" created
> imagestream "logging-elasticsearch" created
> imagestream "logging-fluentd" created
> imagestream "logging-kibana" created
> imagestream "logging-curator" created
>
> But I seem to be missing deploymentconfigs? I'm unable to scale my fluentd or
> Kibana.
>
> Ps: the documentation is also a bit confusing at:
> $ oc policy add-role-to-user edit \
> system:serviceaccount:default:logging-deployer
>
> Because it's using project default instead of project logging (like in the 
> other steps).
> https://docs.openshift.org/latest/install_config/aggregate_logging.html
>
>