Re: [openstack-dev] [vitrage] I have some problems with Prometheus alarms in vitrage.

2018-11-29 Thread Won
Hi,

I checked that both of the methods you proposed work well.
After I added the 'should_delete_outdated_entities' function to the
InstanceDriver class, it took about 10 minutes to clear the old instance.
And I added the two lines you mentioned to nova-cpu.conf, so the Vitrage
collector now receives notifications correctly.

Thank you for your help.

Best regards,
Won

On Thu, Nov 22, 2018 at 9:35 PM, Ifat Afek wrote:

> Hi,
>
> A deleted instance should be removed from Vitrage in one of two ways:
> 1. By reacting to a notification from Nova
> 2. If no notification is received, then after a while the instance vertex
> in Vitrage is considered "outdated" and is deleted
>
> Regarding #1, it is clear from your logs that you don't get notifications
> from Nova on the second compute.
> Do you have a nova-cpu.conf on one of your nodes, in addition to nova.conf?
> If so, please make the same change in that file as well:
>
> notification_topics = notifications,vitrage_notifications
>
> notification_driver = messagingv2
>
> And please make sure to restart the nova-compute service on that node.
>
> Regarding #2, as a second-best solution, the instances should be deleted
> from the graph after not being updated for a while.
> I realized that we have a bug in this area and I will push a fix to gerrit
> later today. In the meantime, you can add the following function to the
> InstanceDriver class:
>
> @staticmethod
> def should_delete_outdated_entities():
>     return True
>
> Let me know if it solved your problem,
> Ifat
>
>
> On Wed, Nov 21, 2018 at 1:50 PM Won  wrote:
>
>> I attached four log files.
>> I collected the logs from about 17:14 to 17:42. I created an instance named
>> 'deltesting3' at 17:17. Seven minutes later, at 17:24, the entity graph
>> showed deltesting3, and the vitrage-collector and vitrage-graph log entries
>> appeared.
>> When creating an instance on the ubuntu server, it appears immediately in
>> the entity graph and logs, but when creating an instance on compute1 (the
>> multi-node compute), it appears about 5~10 minutes later.
>> I deleted an instance of 'deltesting3' around 17:26.
>>
>>
>>> After ~20 minutes, only Apigateway was left. Does that make sense? Did you
>>> delete the instances on ubuntu, in addition to deltesting?
>>>
>>
>> I only deleted 'deltesting'. After that, only the logs for 'apigateway'
>> and 'kube-master' were collected, but the other instances were working
>> well. I don't know why only two instances appear in the log.
>> In the Nov 19 log, 'apigateway' and 'kube-master' were collected
>> continuously at short intervals, but the other instances were sometimes
>> collected only at long intervals.
>>
>> In any case, I would expect to see the instances deleted from the graph
>>> at this stage, since they were not returned by get_all.
>>> Can you please send me the log of vitrage-graph at the same time (Nov
>>> 15, 16:35-17:10)?
>>>
>>
>> Information about 'deltesting3', which has already been deleted, continues
>> to be collected by vitrage-graph.service.
>>
>> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [openstack-dev] [vitrage] I have some problems with Prometheus alarms in vitrage.

2018-11-22 Thread Ifat Afek
Hi,

A deleted instance should be removed from Vitrage in one of two ways:
1. By reacting to a notification from Nova
2. If no notification is received, then after a while the instance vertex
in Vitrage is considered "outdated" and is deleted

Regarding #1, it is clear from your logs that you don't get notifications
from Nova on the second compute.
Do you have a nova-cpu.conf on one of your nodes, in addition to nova.conf?
If so, please make the same change in that file as well:

notification_topics = notifications,vitrage_notifications

notification_driver = messagingv2

And please make sure to restart the nova-compute service on that node.
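
For reference, this is roughly how the file should end up looking (a sketch;
the [DEFAULT] placement shown here is the classic one, and on newer releases
the equivalent options live under [oslo_messaging_notifications] instead):

[DEFAULT]
# ... existing options ...
notification_topics = notifications,vitrage_notifications
notification_driver = messagingv2

# Equivalent non-deprecated form on newer releases:
# [oslo_messaging_notifications]
# topics = notifications,vitrage_notifications
# driver = messagingv2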

Regarding #2, as a second-best solution, the instances should be deleted
from the graph after not being updated for a while.
I realized that we have a bug in this area and I will push a fix to gerrit
later today. In the meantime, you can add the following function to the
InstanceDriver class:

@staticmethod
def should_delete_outdated_entities():
    return True

Let me know if it solved your problem,
Ifat


On Wed, Nov 21, 2018 at 1:50 PM Won  wrote:

> I attached four log files.
> I collected the logs from about 17:14 to 17:42. I created an instance named
> 'deltesting3' at 17:17. Seven minutes later, at 17:24, the entity graph
> showed deltesting3, and the vitrage-collector and vitrage-graph log entries
> appeared.
> When creating an instance on the ubuntu server, it appears immediately in
> the entity graph and logs, but when creating an instance on compute1 (the
> multi-node compute), it appears about 5~10 minutes later.
> I deleted an instance of 'deltesting3' around 17:26.
>
>
>> After ~20 minutes, only Apigateway was left. Does that make sense? Did you
>> delete the instances on ubuntu, in addition to deltesting?
>>
>
> I only deleted 'deltesting'. After that, only the logs for 'apigateway'
> and 'kube-master' were collected, but the other instances were working
> well. I don't know why only two instances appear in the log.
> In the Nov 19 log, 'apigateway' and 'kube-master' were collected
> continuously at short intervals, but the other instances were sometimes
> collected only at long intervals.
>
> In any case, I would expect to see the instances deleted from the graph at
>> this stage, since they were not returned by get_all.
>> Can you please send me the log of vitrage-graph at the same time (Nov 15,
>> 16:35-17:10)?
>>
>
> Information about 'deltesting3', which has already been deleted, continues
> to be collected by vitrage-graph.service.
>
>
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [vitrage] I have some problems with Prometheus alarms in vitrage.

2018-11-15 Thread Ifat Afek
On Thu, Nov 15, 2018 at 10:28 AM Won  wrote:

> Looking at the logs, I see two issues:
>> 1. On ubuntu server, you get a notification about the vm deletion, while
>> on compute1 you don't get it.
>> Please make sure that Nova sends notifications to 'vitrage_notifications'
>> - it should be configured in /etc/nova/nova.conf.
>> 2. Once every 10 minutes (by default), the nova.instance datasource queries
>> all instances. The deleted vm is supposed to be deleted in Vitrage at this
>> stage, even if the notification was lost.
>> Please check in your collector log for a message of "novaclient.v2.client
>> [-] RESP BODY" before and after the deletion, and send me its content.
>
>
> I attached two log files. I created a VM on compute1, which is a compute
> node, and deleted it a few minutes later. The logs cover 30 minutes from
> the VM creation.
> The first is the vitrage-collector log, grepped for the instance name.
> The second is the novaclient.v2.client [-] RESP BODY log.
> After I deleted the VM, no log of the instance appeared in the collector
> log no matter how long I waited.
>
> I added the following to nova.conf on the compute1 node (attached file
> 'compute_node_local_conf.txt'):
> notification_topics = notifications,vitrage_notifications
> notification_driver = messagingv2
> vif_plugging_timeout = 300
> notify_on_state_change = vm_and_task_state
> instance_usage_audit_period = hour
> instance_usage_audit = True
>

Hi,

From the collector log RESP BODY messages I understand that in the
beginning there were the following servers:
compute1: deltesting
ubuntu: Apigateway, KubeMaster and others

After ~20 minutes, only Apigateway was left. Does that make sense? Did you
delete the instances on ubuntu, in addition to deltesting?
In any case, I would expect to see the instances deleted from the graph at
this stage, since they were not returned by get_all.
Can you please send me the log of vitrage-graph at the same time (Nov 15,
16:35-17:10)?

There is still the question of why we don't see a notification from Nova,
but let's try to solve the issues one by one.

Thanks,
Ifat
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [vitrage] I have some problems with Prometheus alarms in vitrage.

2018-11-08 Thread Ifat Afek
Hi,

We solved the timestamp bug. There are two patches for master [1] and
stable/rocky [2].
I'll check the other issues next week.

Regards,
Ifat

[1] https://review.openstack.org/#/c/616468/
[2] https://review.openstack.org/#/c/616469/


On Wed, Oct 31, 2018 at 10:59 AM Won  wrote:

>
 [image: image.png]
 The timestamp is recorded correctly in the logs (vitrage-graph, collector,
 etc.), but in vitrage-dashboard it is shown as 2001-01-01.
 However, the timestamp seems to be recognized correctly internally, because
 the alarm can be resolved and is recorded correctly in the log.

>>>
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [vitrage] I have some problems with Prometheus alarms in vitrage.

2018-11-01 Thread Ifat Afek
Hi,

On Wed, Oct 31, 2018 at 11:00 AM Won  wrote:

> Hi,
>
>>
>> This is strange. I would expect your original definition to work as well,
>> since the alarm key in Vitrage is defined by a combination of the alert
>> name and the instance. We will check it again.
>> BTW,  we solved a different bug related to Prometheus alarms not being
>> cleared [1]. Could it be related?
>>
>
> Using the original definition, no matter how different the instances are,
> alarms with the same alert name are recognized as the same alarm in Vitrage.
> I also tried installing the rocky version and the master version on a new
> server and retesting, but the problem was not solved. The latest bugfix
> seems unrelated.
>

Ok. We will check this issue. For now your workaround is ok, right?


> Does the wrong timestamp appear if you run 'vitrage alarm list' cli
>> command? please try running 'vitrage alarm list --debug' and send me the
>> output.
>>
>
> I have attached 'vitrage-alarm-list.txt.'
>

I believe that you attached the wrong file. It seems like another log of
vitrage-graph.


>
>
>> Please send me vitrage-collector.log and vitrage-graph.log from the time
>> that the problematic vm was created and deleted. Please also create and
>> delete a vm on your 'ubuntu' server, so I can check the differences in the
>> log.
>>
>
> I have attached 'vitrage_log_on_compute1.zip' and
> 'vitrage_log_on_ubuntu.zip' files.
> When creating a VM on compute1, a vitrage-collector log entry appeared, but
> no log entry appeared when it was deleted.
>

Looking at the logs, I see two issues:
1. On ubuntu server, you get a notification about the vm deletion, while on
compute1 you don't get it.
Please make sure that Nova sends notifications to 'vitrage_notifications' -
it should be configured in /etc/nova/nova.conf.

2. Once every 10 minutes (by default), the nova.instance datasource queries
all instances. The deleted vm is supposed to be deleted in Vitrage at this
stage, even if the notification was lost.
Please check in your collector log for a message of "novaclient.v2.client
[-] RESP BODY" before and after the deletion, and send me its content.
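
If you want the periodic snapshot to run more often while you debug, you can
shorten the interval in /etc/vitrage/vitrage.conf. A sketch - please
double-check the exact option name against your Vitrage version, I'm quoting
it from memory:

[datasources]
# default is 600 seconds (10 minutes); lower it temporarily for debugging
snapshots_interval = 120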

Br,
Ifat
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [vitrage] I have some problems with Prometheus alarms in vitrage.

2018-10-31 Thread Won
Hi,

>
> This is strange. I would expect your original definition to work as well,
> since the alarm key in Vitrage is defined by a combination of the alert
> name and the instance. We will check it again.
> BTW,  we solved a different bug related to Prometheus alarms not being
> cleared [1]. Could it be related?
>

Using the original definition, no matter how different the instances are,
alarms with the same alert name are recognized as the same alarm in Vitrage.
I also tried installing the rocky version and the master version on a new
server and retesting, but the problem was not solved. The latest bugfix
seems unrelated.
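
As an illustration of the kind of change that avoids the collision (a
hypothetical rule, not the exact one I used - the job names are placeholders),
giving each job its own alert name keeps the alarm keys distinct:

groups:
- name: alert.rules
  rules:
  # hypothetical example: one alert name per job, so each target produces a
  # distinctly named alarm in Vitrage
  - alert: ApigatewayDown
    expr: up{job="apigateway"} == 0
    for: 60s
    labels:
      severity: warning
    annotations:
      summary: 'Apigateway instance {{ $labels.instance }} down'
  - alert: KubeMasterDown
    expr: up{job="kube-master"} == 0
    for: 60s
    labels:
      severity: warning
    annotations:
      summary: 'Kube-master instance {{ $labels.instance }} down'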

Does the wrong timestamp appear if you run 'vitrage alarm list' cli
> command? please try running 'vitrage alarm list --debug' and send me the
> output.
>

I have attached 'vitrage-alarm-list.txt.'


> Please send me vitrage-collector.log and vitrage-graph.log from the time
> that the problematic vm was created and deleted. Please also create and
> delete a vm on your 'ubuntu' server, so I can check the differences in the
> log.
>

I have attached 'vitrage_log_on_compute1.zip' and
'vitrage_log_on_ubuntu.zip' files.
When creating a VM on compute1, a vitrage-collector log entry appeared, but
no log entry appeared when it was deleted.

Br,
Won



On Tue, Oct 30, 2018 at 1:28 AM, Ifat Afek wrote:

> Hi,
>
> On Fri, Oct 26, 2018 at 10:34 AM Won  wrote:
>
>>
>> I solved the problem of the Prometheus alarm not being updated.
>> Alarms with the same Prometheus alert name are recognized as the same
>> alarm in Vitrage.
>>
>> --- alert.rules.yml
>> groups:
>> - name: alert.rules
>>   rules:
>>   - alert: InstanceDown
>>     expr: up == 0
>>     for: 60s
>>     labels:
>>       severity: warning
>>     annotations:
>>       description: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 30 seconds.'
>>       summary: Instance {{ $labels.instance }} down
>> --
>> This is the content of the alert.rules.yml file before I modified it.
>> This yml file generates an alarm when cAdvisor stops (instance down). An
>> alarm is triggered for whichever instance is down, but all of the alarms
>> have the same name, 'InstanceDown'. Vitrage recognizes all of these alarms
>> as the same alarm. Thus, until all 'InstanceDown' alarms were cleared, the
>> 'InstanceDown' alarm was considered unresolved and the alarm did not
>> disappear.
>>
>
> This is strange. I would expect your original definition to work as well,
> since the alarm key in Vitrage is defined by a combination of the alert
> name and the instance. We will check it again.
> BTW,  we solved a different bug related to Prometheus alarms not being
> cleared [1]. Could it be related?
>
>
>> Can you please show me where you saw the 2001 timestamp? I didn't find it
>>> in the log.
>>>
>>
>> [image: image.png]
>> The timestamp is recorded correctly in the logs (vitrage-graph, collector,
>> etc.), but in vitrage-dashboard it is shown as 2001-01-01.
>> However, the timestamp seems to be recognized correctly internally, because
>> the alarm can be resolved and is recorded correctly in the log.
>>
>
> Does the wrong timestamp appear if you run 'vitrage alarm list' cli
> command? please try running 'vitrage alarm list --debug' and send me the
> output.
>
>
>> [image: image.png]
>> The host named ubuntu is my main server. I installed OpenStack all-in-one
>> on this server, and I installed a compute node on the host named compute1.
>> When I create a new VM in nova (compute1), it immediately appears in the
>> entity graph, but it does not disappear from the entity graph when I delete
>> the VM. No matter how long I wait, it doesn't disappear.
>> After I execute the 'vitrage-purge-data' command and reboot OpenStack (run
>> the reboot command on the OpenStack server, host name ubuntu), it
>> disappears. Executing 'vitrage-purge-data' alone does not work; it needs a
>> reboot for the VM to disappear.
>> When I create a new VM in nova (ubuntu), there is no problem.
>>
> Please send me vitrage-collector.log and vitrage-graph.log from the time
> that the problematic vm was created and deleted. Please also create and
> delete a vm on your 'ubuntu' server, so I can check the differences in the
> log.
>
>> I implemented a web service with a microservice architecture and applied
>> RCA to it. The attached picture shows the structure of the web service I
>> implemented. I wonder what data I can receive, and what I can do, when I
>> link Vitrage with Kubernetes.
>> As far as I know, the Vitrage graph does not present information about
>> containers or pods inside the VMs. If that is correct, I would like to make
>> pod-level information appear on the entity graph.
>>
>> I followed the steps in (
>> https://docs.openstack.org/vitrage/latest/contributor/k8s_datasource.html).
>> I attached the vitrage.conf file and the kubeconfig file. The contents of
>> the kubeconfig file are copied from the contents of the admin.conf file on
>> the master node.
>> I want to check that my settings are right and connected, but I don't know
>> how. It would be very much appreciated if you let me know how.

Re: [openstack-dev] [vitrage] I have some problems with Prometheus alarms in vitrage.

2018-10-29 Thread Ifat Afek
Hi,

On Fri, Oct 26, 2018 at 10:34 AM Won  wrote:

>
> I solved the problem of the Prometheus alarm not being updated.
> Alarms with the same Prometheus alert name are recognized as the same
> alarm in Vitrage.
>
> --- alert.rules.yml
> groups:
> - name: alert.rules
>   rules:
>   - alert: InstanceDown
>     expr: up == 0
>     for: 60s
>     labels:
>       severity: warning
>     annotations:
>       description: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 30 seconds.'
>       summary: Instance {{ $labels.instance }} down
> --
> This is the content of the alert.rules.yml file before I modified it.
> This yml file generates an alarm when cAdvisor stops (instance down). An
> alarm is triggered for whichever instance is down, but all of the alarms
> have the same name, 'InstanceDown'. Vitrage recognizes all of these alarms
> as the same alarm. Thus, until all 'InstanceDown' alarms were cleared, the
> 'InstanceDown' alarm was considered unresolved and the alarm did not
> disappear.
>

This is strange. I would expect your original definition to work as well,
since the alarm key in Vitrage is defined by a combination of the alert
name and the instance. We will check it again.
BTW,  we solved a different bug related to Prometheus alarms not being
cleared [1]. Could it be related?


> Can you please show me where you saw the 2001 timestamp? I didn't find it
>> in the log.
>>
>
> [image: image.png]
> The timestamp is recorded correctly in the logs (vitrage-graph, collector,
> etc.), but in vitrage-dashboard it is shown as 2001-01-01.
> However, the timestamp seems to be recognized correctly internally, because
> the alarm can be resolved and is recorded correctly in the log.
>

Does the wrong timestamp appear if you run 'vitrage alarm list' cli
command? please try running 'vitrage alarm list --debug' and send me the
output.


> [image: image.png]
> The host named ubuntu is my main server. I installed OpenStack all-in-one
> on this server, and I installed a compute node on the host named compute1.
> When I create a new VM in nova (compute1), it immediately appears in the
> entity graph, but it does not disappear from the entity graph when I delete
> the VM. No matter how long I wait, it doesn't disappear.
> After I execute the 'vitrage-purge-data' command and reboot OpenStack (run
> the reboot command on the OpenStack server, host name ubuntu), it
> disappears. Executing 'vitrage-purge-data' alone does not work; it needs a
> reboot for the VM to disappear.
> When I create a new VM in nova (ubuntu), there is no problem.
>
Please send me vitrage-collector.log and vitrage-graph.log from the time
that the problematic vm was created and deleted. Please also create and
delete a vm on your 'ubuntu' server, so I can check the differences in the
log.

> I implemented a web service with a microservice architecture and applied
> RCA to it. The attached picture shows the structure of the web service I
> implemented. I wonder what data I can receive, and what I can do, when I
> link Vitrage with Kubernetes.
> As far as I know, the Vitrage graph does not present information about
> containers or pods inside the VMs. If that is correct, I would like to make
> pod-level information appear on the entity graph.
>
> I followed the steps in (
> https://docs.openstack.org/vitrage/latest/contributor/k8s_datasource.html).
> I attached the vitrage.conf file and the kubeconfig file. The contents of
> the kubeconfig file are copied from the contents of the admin.conf file on
> the master node.
> I want to check that my settings are right and connected, but I don't know
> how. It would be very much appreciated if you let me know how.
>
Unfortunately, Vitrage does not hold pod and container information at the
moment. We discussed the option of adding it in the Stein release, but I'm not
sure we will get to do it.

Br,
Ifat

[1] https://review.openstack.org/#/c/611258/
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [vitrage] I have some problems with Prometheus alarms in vitrage.

2018-10-10 Thread Ifat Afek
Hi Won,

On Wed, Oct 10, 2018 at 11:58 AM Won  wrote:

>
> My Prometheus version is 2.3.2 and my Alertmanager version is 0.15.2, and I
> attached the files (vitrage-collector and vitrage-graph logs, the apache
> log, prometheus.yml, alertmanager.yml, the alarm rule file, etc.).
> I think the reason a resolved alarm does not disappear is a timestamp
> problem with the alarm.
>
> -gray alarm info
> severity:PAGE
> vitrage id: c6a94386-3879-499e-9da0-2a5b9d3294b8  ,
> e2c5eae9-dba9-4f64-960b-b964f1c01dfe , 3d3c903e-fe09-4a6f-941f-1a2adb09feca
> , 8c6e7906-9e66-404f-967f-40037a6afc83 ,
> e291662b-115d-42b5-8863-da8243dd06b4 , 8abd2a2f-c830-453c-a9d0-55db2bf72d46
> --
>
> The alarms marked with the blue circle are already resolved. However, they
> do not disappear from the entity graph and the alarm list.
> There were seven more gray alarms in the active alarms in the top
> screenshot, just like in the entity graph. They only disappeared after I
> deleted the gray alarms from the vitrage-alarms table in the DB, or changed
> their end timestamp value to a time earlier than the current time.
>

I checked the files that you sent, and it appears that the connection
between Prometheus and Vitrage works well. I can see in the vitrage-graph log
that Prometheus notified on both the alert firing and alert resolved statuses.
I still don't understand why the alarms were not removed from Vitrage,
though. Can you please send me the output of 'vitrage topology show' CLI
command?
Also, did you happen to restart vitrage-graph or vitrage-collector during
your tests?


> From the log, it seems that the first problem is that the timestamp value
> in Vitrage comes out as 2001-01-01, even though the start value in the
> Prometheus alert information is correct.
> When the alarm is resolved, the end timestamp value is not updated, so the
> alarm does not disappear from the alarm list.
>

Can you please show me where you saw the 2001 timestamp? I didn't find it
in the log.


> The second problem is that even if the timestamp problem is solved, the
> entity graph problem will not be solved. The gray alarm information is not
> in the vitrage-collector log, but it is in the vitrage-graph and apache
> logs.
> I want to know how to forcefully delete an entity from the vitrage graph.
>

You shouldn't do it :-) there is no API for deleting entities, and messing
with the database may cause unexpected results.
The only thing that you can safely do is to stop all Vitrage services,
execute the 'vitrage-purge-data' command, and start the services again. This
will cause the entity graph to be rebuilt.
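
A sketch of the safe sequence (the unit names are just examples - use
whatever your deployment actually runs, e.g. devstack units):

# stop the Vitrage services (unit names are assumptions, adjust as needed)
sudo systemctl stop vitrage-graph.service vitrage-collector.service
# purge the stored graph data, then start the services again
vitrage-purge-data
sudo systemctl start vitrage-collector.service vitrage-graph.service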


> Regarding the multi nodes, I mean 1 control node (pc1) & 1 compute node
> (pc2), so one OpenStack deployment.
>
> The test VM in the picture is an instance on the compute node that has
> already been deleted. I waited for hours and checked nova.conf, but it was
> not removed.
> This did not occur in the queens version; in the rocky version, in a
> multi-node environment, there seems to be a bug with VMs created on the
> compute node. The same situation occurred in multi-node environments that
> were configured with different PCs.
>

Let me make sure I understand the problem.
When you create a new vm in Nova, does it immediately appear in the entity
graph?
When you delete a vm, does it remain? Does it remain in a multi-node
environment but get deleted in a single-node environment?

Br,
Ifat
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [vitrage] I have some problems with Prometheus alarms in vitrage.

2018-10-04 Thread Ifat Afek
Hi,

Can you please give us some more details about your scenario with
Prometheus? Please try and give as many details as possible, so we can try
to reproduce the bug.


What do you mean by “if the alarm is resolved, the alarm manager makes a
silence, or removes the alarm rule from Prometheus"? These are different
cases. Does none of them work in your environment?

Which Prometheus and Alertmanager versions are you using?

 Please try to change the Vitrage loglevel to DEBUG (set “debug = true” in
/etc/vitrage/vitrage.conf) and send me the Vitrage collector, graph and api
logs.

Regarding the multi nodes, I'm not sure I understand your configuration. Do
you mean there is more than one OpenStack and Nova? more than one host?
more than one vm?

Basically, vms are deleted from Vitrage in two cases:
1. After each periodic call to get_all of nova.instance datasource. By
default this is done once in 10 minutes.
2. Immediately, if you have the following configuration in
/etc/nova/nova.conf:
notification_topics = notifications,vitrage_notifications

So, please check your nova.conf and also whether the vms are deleted after
10 minutes.

Thanks,
Ifat


On Thu, Oct 4, 2018 at 7:12 AM Won  wrote:

> Thank you for your reply Ifat.
>
> The alertmanager.yml file already contains 'send_resolved: true'.
> However, the alarm does not disappear from the alarm list or the entity
> graph even if the alarm is resolved, the Alertmanager creates a silence, or
> the alarm rule is removed from Prometheus.
> The only way to remove alarms is to manually remove them from the db. Is
> there any other way to remove an alarm?
> Entities (VMs) that run on multiple nodes in the rocky version show similar
> symptoms: entities created on the multi-node compute do not disappear from
> the entity graph even after deletion.
> Is this a bug in the rocky version?
>
> Best Regards,
> Won
>
>
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [vitrage] I have some problems with Prometheus alarms in vitrage.

2018-10-03 Thread Won
Thank you for your reply Ifat.

The alertmanager.yml file already contains 'send_resolved: true'.
However, the alarm does not disappear from the alarm list or the entity
graph even if the alarm is resolved, the Alertmanager creates a silence, or
the alarm rule is removed from Prometheus.
The only way to remove alarms is to manually remove them from the db. Is
there any other way to remove an alarm?
Entities (VMs) that run on multiple nodes in the rocky version show similar
symptoms: entities created on the multi-node compute do not disappear from
the entity graph even after deletion.
Is this a bug in the rocky version?

Best Regards,
Won

On Wed, Oct 3, 2018 at 5:46 PM, Ifat Afek wrote:

> Hi,
>
> In the alertmanager.yml file you should have a receiver for Vitrage.
> Please verify that it includes "send_resolved: true". This is required for
> Prometheus to notify Vitrage when an alarm is resolved.
>
> The full Vitrage receiver definition should be:
>
> - name: **
>   webhook_configs:
>   - url: **  # example: 'http://127.0.0.1:8999/v1/event'
>     send_resolved: true
>     http_config:
>       basic_auth:
>         username: **
>         password: **
>
> Hope it helps,
> Ifat
>
>
> On Tue, Oct 2, 2018 at 7:51 AM Won  wrote:
>
>> I have some problems with Prometheus alarms in Vitrage.
>> I receive the list of alarms from the Prometheus Alertmanager well, but an
>> alarm does not disappear when the problem (alarm) is resolved. An alarm
>> that has appeared once in the alarm list and the entity graph never
>> disappears from Vitrage. An alarm sent by Zabbix disappears when the alarm
>> is resolved, so I wonder how to clear a Prometheus alarm from Vitrage and
>> how to have the alarm updated automatically, like with Zabbix.
>> Thank you.
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe:
>> openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [vitrage] I have some problems with Prometheus alarms in vitrage.

2018-10-03 Thread Ifat Afek
Hi,

In the alertmanager.yml file you should have a receiver for Vitrage. Please
verify that it includes "send_resolved: true". This is required for
Prometheus to notify Vitrage when an alarm is resolved.

The full Vitrage receiver definition should be:

- name: **
  webhook_configs:
  - url: **  # example: 'http://127.0.0.1:8999/v1/event'
    send_resolved: true
    http_config:
      basic_auth:
        username: **
        password: **
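
For completeness, a minimal sketch of how this receiver plugs into the
routing tree in alertmanager.yml (the receiver name, URL and credentials are
placeholders for your real values):

route:
  receiver: vitrage        # route everything to the Vitrage webhook

receivers:
- name: vitrage
  webhook_configs:
  - url: 'http://127.0.0.1:8999/v1/event'
    send_resolved: true
    http_config:
      basic_auth:
        username: admin    # placeholder
        password: secret   # placeholder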

Hope it helps,
Ifat


On Tue, Oct 2, 2018 at 7:51 AM Won  wrote:

> I have some problems with Prometheus alarms in Vitrage.
> I receive the list of alarms from the Prometheus Alertmanager well, but an
> alarm does not disappear when the problem (alarm) is resolved. An alarm
> that has appeared once in the alarm list and the entity graph never
> disappears from Vitrage. An alarm sent by Zabbix disappears when the alarm
> is resolved, so I wonder how to clear a Prometheus alarm from Vitrage and
> how to have the alarm updated automatically, like with Zabbix.
> Thank you.
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev