Re: [openstack-dev] [vitrage] I have some problems with Prometheus alarms in vitrage.
Hi,

I checked that both of the methods you proposed work well. After I added the 'should_delete_outdated_entities' function to InstanceDriver, it took about 10 minutes to clear the old instance. And I added the two lines you mentioned to nova-cpu.conf, so the Vitrage collector now gets notifications correctly. Thank you for your help.

Best regards,
Won

On Thu, Nov 22, 2018 at 9:35 PM, Ifat Afek wrote:

> Hi,
>
> A deleted instance should be removed from Vitrage in one of two ways:
> 1. By reacting to a notification from Nova
> 2. If no notification is received, then after a while the instance vertex
> in Vitrage is considered "outdated" and is deleted
>
> Regarding #1, it is clear from your logs that you don't get notifications
> from Nova on the second compute.
> Do you have on one of your nodes, in addition to nova.conf, also a
> nova-cpu.conf? If so, please make the same change in this file:
>
> notification_topics = notifications,vitrage_notifications
> notification_driver = messagingv2
>
> And please make sure to restart the nova compute service on that node.
>
> Regarding #2, as a second-best solution, the instances should be deleted
> from the graph after not being updated for a while.
> I realized that we have a bug in this area and I will push a fix to gerrit
> later today. In the meantime, you can add the following function to the
> InstanceDriver class:
>
> @staticmethod
> def should_delete_outdated_entities():
>     return True
>
> Let me know if it solved your problem,
> Ifat
>
> On Wed, Nov 21, 2018 at 1:50 PM Won wrote:
>
>> I attached four log files.
>> I collected the logs from about 17:14 to 17:42. I created an instance
>> named 'deltesting3' at 17:17. Seven minutes later, at 17:24, the entity
>> graph showed deltesting3 and the vitrage-collector and vitrage-graph logs
>> appeared.
>> When creating an instance on the ubuntu server, it appears immediately in
>> the entity graph and logs, but when creating an instance on compute1 (the
>> multi-node compute host), it appears about 5-10 minutes later.
>> I deleted the 'deltesting3' instance around 17:26.
>>
>>> After ~20 minutes, there was only Apigateway. Does it make sense? Did you
>>> delete the instances on ubuntu, in addition to deltesting?
>>
>> I only deleted 'deltesting'. After that, only the logs from 'apigateway'
>> and 'kube-master' were collected. But the other instances were working
>> well. I don't know why only two instances are collected in the log.
>> In the Nov 19 log, 'apigateway' and 'kube-master' were continuously
>> collected at short intervals, but the other instances were sometimes
>> collected only at long intervals.
>>
>>> In any case, I would expect to see the instances deleted from the graph
>>> at this stage, since they were not returned by get_all.
>>> Can you please send me the log of vitrage-graph at the same time (Nov
>>> 15, 16:35-17:10)?
>>
>> Information about 'deltesting3', which has already been deleted, continues
>> to be collected in vitrage-graph.service.
Re: [openstack-dev] [vitrage] I have some problems with Prometheus alarms in vitrage.
Hi,

A deleted instance should be removed from Vitrage in one of two ways:
1. By reacting to a notification from Nova
2. If no notification is received, then after a while the instance vertex in Vitrage is considered "outdated" and is deleted

Regarding #1, it is clear from your logs that you don't get notifications from Nova on the second compute.
Do you have on one of your nodes, in addition to nova.conf, also a nova-cpu.conf? If so, please make the same change in this file:

notification_topics = notifications,vitrage_notifications
notification_driver = messagingv2

And please make sure to restart the nova compute service on that node.

Regarding #2, as a second-best solution, the instances should be deleted from the graph after not being updated for a while.
I realized that we have a bug in this area and I will push a fix to gerrit later today. In the meantime, you can add the following function to the InstanceDriver class:

@staticmethod
def should_delete_outdated_entities():
    return True

Let me know if it solved your problem,
Ifat

On Wed, Nov 21, 2018 at 1:50 PM Won wrote:

> I attached four log files.
> I collected the logs from about 17:14 to 17:42. I created an instance named
> 'deltesting3' at 17:17. Seven minutes later, at 17:24, the entity graph
> showed deltesting3 and the vitrage-collector and vitrage-graph logs
> appeared.
> When creating an instance on the ubuntu server, it appears immediately in
> the entity graph and logs, but when creating an instance on compute1 (the
> multi-node compute host), it appears about 5-10 minutes later.
> I deleted the 'deltesting3' instance around 17:26.
>
>> After ~20 minutes, there was only Apigateway. Does it make sense? Did you
>> delete the instances on ubuntu, in addition to deltesting?
>
> I only deleted 'deltesting'. After that, only the logs from 'apigateway'
> and 'kube-master' were collected. But the other instances were working
> well. I don't know why only two instances are collected in the log.
> In the Nov 19 log, 'apigateway' and 'kube-master' were continuously
> collected at short intervals, but the other instances were sometimes
> collected only at long intervals.
>
>> In any case, I would expect to see the instances deleted from the graph
>> at this stage, since they were not returned by get_all.
>> Can you please send me the log of vitrage-graph at the same time (Nov 15,
>> 16:35-17:10)?
>
> Information about 'deltesting3', which has already been deleted, continues
> to be collected in vitrage-graph.service.
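For reference, this is roughly where those two options would sit in nova-cpu.conf. Placing them under [DEFAULT] is an assumption based on the legacy option names used above; newer releases expose the same settings under [oslo_messaging_notifications] as driver/topics:

[DEFAULT]
# Send Nova notifications to both the default topic and the Vitrage topic
notification_topics = notifications,vitrage_notifications
notification_driver = messagingv2

# Equivalent on newer releases (use one form, not both):
# [oslo_messaging_notifications]
# driver = messagingv2
# topics = notifications,vitrage_notifications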
Re: [openstack-dev] [vitrage] I have some problems with Prometheus alarms in vitrage.
On Thu, Nov 15, 2018 at 10:28 AM Won wrote:

>> Looking at the logs, I see two issues:
>> 1. On the ubuntu server, you get a notification about the vm deletion,
>> while on compute1 you don't get it.
>> Please make sure that Nova sends notifications to 'vitrage_notifications'
>> - it should be configured in /etc/nova/nova.conf.
>> 2. Once in 10 minutes (by default) the nova.instance datasource queries
>> all instances. The deleted vm is supposed to be deleted in Vitrage at this
>> stage, even if the notification was lost.
>> Please check in your collector log for a message of "novaclient.v2.client
>> [-] RESP BODY" before and after the deletion, and send me its content.
>
> I attached two log files. I created a VM on compute1, which is a compute
> node, and deleted it a few minutes later. The logs cover 30 minutes from
> VM creation.
> The first is the vitrage-collector log, grepped for the instance name.
> The second is the "novaclient.v2.client [-] RESP BODY" log.
> After I deleted the VM, no log of the instance appeared in the collector
> log, no matter how long I waited.
>
> I added the following to nova.conf on the compute1 node (attached file
> 'compute_node_local_conf.txt'):
> notification_topics = notifications,vitrage_notifications
> notification_driver = messagingv2
> vif_plugging_timeout = 300
> notify_on_state_change = vm_and_task_state
> instance_usage_audit_period = hour
> instance_usage_audit = True

Hi,

From the collector log RESP BODY messages I understand that in the beginning there were the following servers:
compute1: deltesting
ubuntu: Apigateway, KubeMaster and others

After ~20 minutes, there was only Apigateway. Does it make sense? Did you delete the instances on ubuntu, in addition to deltesting?

In any case, I would expect to see the instances deleted from the graph at this stage, since they were not returned by get_all.
Can you please send me the log of vitrage-graph at the same time (Nov 15, 16:35-17:10)?

There is still the question of why we don't see a notification from Nova, but let's try to solve the issues one by one.

Thanks,
Ifat
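As a side note, if the 10-minute default polling is too slow while testing, the snapshot interval can be lowered in vitrage.conf. A minimal sketch; the section and option name here are an assumption taken from the datasource configuration docs, so verify them against your Vitrage version:

[datasources]
# Interval, in seconds, between full get_all snapshots (default is 600)
snapshots_interval = 120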
Re: [openstack-dev] [vitrage] I have some problems with Prometheus alarms in vitrage.
Hi,

We solved the timestamp bug. There are two patches, for master [1] and stable/rocky [2]. I'll check the other issues next week.

Regards,
Ifat

[1] https://review.openstack.org/#/c/616468/
[2] https://review.openstack.org/#/c/616469/

On Wed, Oct 31, 2018 at 10:59 AM Won wrote:

> [image: image.png]
> The timestamp is recorded correctly in the logs (vitrage-graph, collector,
> etc.), but in vitrage-dashboard it is shown as 2001-01-01.
> However, it seems that the timestamp is recognized correctly internally,
> because the alarm can be resolved and is recorded correctly in the log.
Re: [openstack-dev] [vitrage] I have some problems with Prometheus alarms in vitrage.
Hi,

On Wed, Oct 31, 2018 at 11:00 AM Won wrote:

>> This is strange. I would expect your original definition to work as well,
>> since the alarm key in Vitrage is defined by a combination of the alert
>> name and the instance. We will check it again.
>> BTW, we solved a different bug related to Prometheus alarms not being
>> cleared [1]. Could it be related?
>
> Using the original definition, no matter how different the instances are,
> the alarm names are recognized as the same alarm in vitrage.
> And I tried to install the rocky version and the master version on a new
> server and retest, but the problem was not solved. The latest bugfix seems
> irrelevant.

Ok. We will check this issue. For now your workaround is ok, right?

>> Does the wrong timestamp appear if you run the 'vitrage alarm list' cli
>> command? Please try running 'vitrage alarm list --debug' and send me the
>> output.
>
> I have attached 'vitrage-alarm-list.txt'.

I believe that you attached the wrong file. It seems like another log of vitrage-graph.

>> Please send me vitrage-collector.log and vitrage-graph.log from the time
>> that the problematic vm was created and deleted. Please also create and
>> delete a vm on your 'ubuntu' server, so I can check the differences in the
>> log.
>
> I have attached the 'vitrage_log_on_compute1.zip' and
> 'vitrage_log_on_ubuntu.zip' files.
> When creating a vm on compute1, a vitrage-collector log occurred, but no
> log occurred when it was removed.

Looking at the logs, I see two issues:
1. On the ubuntu server, you get a notification about the vm deletion, while on compute1 you don't get it.
Please make sure that Nova sends notifications to 'vitrage_notifications' - it should be configured in /etc/nova/nova.conf.
2. Once in 10 minutes (by default) the nova.instance datasource queries all instances. The deleted vm is supposed to be deleted in Vitrage at this stage, even if the notification was lost.
Please check in your collector log for a message of "novaclient.v2.client [-] RESP BODY" before and after the deletion, and send me its content.

Br,
Ifat
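A quick way to pull those messages out of the collector log; the log path below is an assumption and varies by deployment (a devstack setup typically logs to journald instead):

grep "RESP BODY" /var/log/vitrage/vitrage-collector.log | tail -n 20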
Re: [openstack-dev] [vitrage] I have some problems with Prometheus alarms in vitrage.
Hi,

> This is strange. I would expect your original definition to work as well,
> since the alarm key in Vitrage is defined by a combination of the alert
> name and the instance. We will check it again.
> BTW, we solved a different bug related to Prometheus alarms not being
> cleared [1]. Could it be related?

Using the original definition, no matter how different the instances are, the alarm names are recognized as the same alarm in vitrage.
And I tried to install the rocky version and the master version on a new server and retest, but the problem was not solved. The latest bugfix seems irrelevant.

> Does the wrong timestamp appear if you run the 'vitrage alarm list' cli
> command? Please try running 'vitrage alarm list --debug' and send me the
> output.

I have attached 'vitrage-alarm-list.txt'.

> Please send me vitrage-collector.log and vitrage-graph.log from the time
> that the problematic vm was created and deleted. Please also create and
> delete a vm on your 'ubuntu' server, so I can check the differences in the
> log.

I have attached the 'vitrage_log_on_compute1.zip' and 'vitrage_log_on_ubuntu.zip' files.
When creating a vm on compute1, a vitrage-collector log occurred, but no log occurred when it was removed.

Br,
Won

On Tue, Oct 30, 2018 at 1:28 AM, Ifat Afek wrote:

> Hi,
>
> On Fri, Oct 26, 2018 at 10:34 AM Won wrote:
>
>> I solved the problem of the Prometheus alarm not being updated.
>> Alarms with the same Prometheus alarm name are recognized as the same
>> alarm in vitrage.
>>
>> --- alert.rules.yml
>> groups:
>> - name: alert.rules
>>   rules:
>>   - alert: InstanceDown
>>     expr: up == 0
>>     for: 60s
>>     labels:
>>       severity: warning
>>     annotations:
>>       description: '{{ $labels.instance }} of job {{ $labels.job }} has
>>         been down for more than 30 seconds.'
>>       summary: Instance {{ $labels.instance }} down
>> --
>> This is the contents of the alert.rules.yml file before I modified it.
>> This is a yml file that generates an alarm when cAdvisor stops (instance
>> down). An alarm is triggered depending on which instance is down, but all
>> alarms have the same name as 'instance down'. Vitrage recognizes all of
>> these alarms as the same alarm. Thus, until all 'instance down' alarms
>> were cleared, the 'instance down' alarm was considered unresolved and the
>> alarm was not removed.
>
> This is strange. I would expect your original definition to work as well,
> since the alarm key in Vitrage is defined by a combination of the alert
> name and the instance. We will check it again.
> BTW, we solved a different bug related to Prometheus alarms not being
> cleared [1]. Could it be related?
>
>>> Can you please show me where you saw the 2001 timestamp? I didn't find
>>> it in the log.
>>
>> [image: image.png]
>> The timestamp is recorded correctly in the logs (vitrage-graph, collector,
>> etc.), but in vitrage-dashboard it is shown as 2001-01-01.
>> However, it seems that the timestamp is recognized correctly internally,
>> because the alarm can be resolved and is recorded correctly in the log.
>
> Does the wrong timestamp appear if you run the 'vitrage alarm list' cli
> command? Please try running 'vitrage alarm list --debug' and send me the
> output.
>
>> [image: image.png]
>> Host name ubuntu is my main server. I installed OpenStack all-in-one on
>> this server, and I installed a compute node on the host named compute1.
>> When I create a new vm in nova (compute1), it immediately appears in the
>> entity graph. But it does not disappear from the entity graph when I
>> delete the vm. No matter how long I wait, it doesn't disappear.
>> After I execute the 'vitrage-purge-data' command and reboot OpenStack
>> (by executing the reboot command on the OpenStack server, host name
>> ubuntu), it disappears. Executing 'vitrage-purge-data' alone does not
>> work; it needs a reboot to disappear.
>> When I create a new vm in nova (ubuntu), there is no problem.
>
> Please send me vitrage-collector.log and vitrage-graph.log from the time
> that the problematic vm was created and deleted. Please also create and
> delete a vm on your 'ubuntu' server, so I can check the differences in the
> log.
>
>> I implemented a web service with a microservice architecture and applied
>> RCA to it. The attached picture shows the structure of the web service I
>> have implemented. I wonder what data I would receive and what I could do
>> when I link vitrage with kubernetes.
>> As far as I know, the vitrage graph does not present information about
>> containers or pods inside the vm. If that is correct, I would like to make
>> pod-level information appear on the entity graph.
>>
>> I followed the steps at
>> https://docs.openstack.org/vitrage/latest/contributor/k8s_datasource.html.
>> I attached the vitrage.conf file and the kubeconfig file. The contents of
>> the kubeconfig file were copied from the admin.conf file on the master
>> node.
>> I want to check that my settings are right and connected, but I don't
>> know how. It would be very much appreciated if you let me know how.
Re: [openstack-dev] [vitrage] I have some problems with Prometheus alarms in vitrage.
Hi,

On Fri, Oct 26, 2018 at 10:34 AM Won wrote:

> I solved the problem of the Prometheus alarm not being updated.
> Alarms with the same Prometheus alarm name are recognized as the same
> alarm in vitrage.
>
> --- alert.rules.yml
> groups:
> - name: alert.rules
>   rules:
>   - alert: InstanceDown
>     expr: up == 0
>     for: 60s
>     labels:
>       severity: warning
>     annotations:
>       description: '{{ $labels.instance }} of job {{ $labels.job }} has
>         been down for more than 30 seconds.'
>       summary: Instance {{ $labels.instance }} down
> --
> This is the contents of the alert.rules.yml file before I modified it.
> This is a yml file that generates an alarm when cAdvisor stops (instance
> down). An alarm is triggered depending on which instance is down, but all
> alarms have the same name as 'instance down'. Vitrage recognizes all of
> these alarms as the same alarm. Thus, until all 'instance down' alarms were
> cleared, the 'instance down' alarm was considered unresolved and the alarm
> was not removed.

This is strange. I would expect your original definition to work as well, since the alarm key in Vitrage is defined by a combination of the alert name and the instance. We will check it again.
BTW, we solved a different bug related to Prometheus alarms not being cleared [1]. Could it be related?

>> Can you please show me where you saw the 2001 timestamp? I didn't find it
>> in the log.
>
> [image: image.png]
> The timestamp is recorded correctly in the logs (vitrage-graph, collector,
> etc.), but in vitrage-dashboard it is shown as 2001-01-01.
> However, it seems that the timestamp is recognized correctly internally,
> because the alarm can be resolved and is recorded correctly in the log.

Does the wrong timestamp appear if you run the 'vitrage alarm list' cli command? Please try running 'vitrage alarm list --debug' and send me the output.

> [image: image.png]
> Host name ubuntu is my main server. I installed OpenStack all-in-one on
> this server, and I installed a compute node on the host named compute1.
> When I create a new vm in nova (compute1), it immediately appears in the
> entity graph. But it does not disappear from the entity graph when I delete
> the vm. No matter how long I wait, it doesn't disappear.
> After I execute the 'vitrage-purge-data' command and reboot OpenStack
> (by executing the reboot command on the OpenStack server, host name
> ubuntu), it disappears. Executing 'vitrage-purge-data' alone does not work;
> it needs a reboot to disappear.
> When I create a new vm in nova (ubuntu), there is no problem.

Please send me vitrage-collector.log and vitrage-graph.log from the time that the problematic vm was created and deleted. Please also create and delete a vm on your 'ubuntu' server, so I can check the differences in the log.

> I implemented a web service with a microservice architecture and applied
> RCA to it. The attached picture shows the structure of the web service I
> have implemented. I wonder what data I would receive and what I could do
> when I link vitrage with kubernetes.
> As far as I know, the vitrage graph does not present information about
> containers or pods inside the vm. If that is correct, I would like to make
> pod-level information appear on the entity graph.
>
> I followed the steps at
> https://docs.openstack.org/vitrage/latest/contributor/k8s_datasource.html.
> I attached the vitrage.conf file and the kubeconfig file. The contents of
> the kubeconfig file were copied from the admin.conf file on the master
> node.
> I want to check that my settings are right and connected, but I don't know
> how. It would be very much appreciated if you let me know how.

Unfortunately, Vitrage does not hold pods and containers information at the moment. We discussed the option of adding it in the Stein release, but I'm not sure we will get to do it.

Br,
Ifat

[1] https://review.openstack.org/#/c/611258/
Re: [openstack-dev] [vitrage] I have some problems with Prometheus alarms in vitrage.
Hi Won,

On Wed, Oct 10, 2018 at 11:58 AM Won wrote:

> My prometheus version is 2.3.2 and alertmanager version is 0.15.2, and I
> attached files (vitrage collector and graph logs, the apache log,
> prometheus.yml, alertmanager.yml, the alarm rule file, etc.).
> I think the problem of a resolved alarm not disappearing is a timestamp
> problem of the alarm.
>
> -gray alarm info
> severity: PAGE
> vitrage id: c6a94386-3879-499e-9da0-2a5b9d3294b8,
> e2c5eae9-dba9-4f64-960b-b964f1c01dfe, 3d3c903e-fe09-4a6f-941f-1a2adb09feca,
> 8c6e7906-9e66-404f-967f-40037a6afc83, e291662b-115d-42b5-8863-da8243dd06b4,
> 8abd2a2f-c830-453c-a9d0-55db2bf72d46
> --
>
> The alarms marked with the blue circle are already resolved. However, they
> do not disappear from the entity graph and the alarm list.
> There were seven more gray alarms in the top screenshot, in the active
> alarms as well as in the entity graph. They disappeared only after I
> deleted the gray alarms from the vitrage-alarms table in the DB, or changed
> the end timestamp value to an earlier time than the current time.

I checked the files that you sent, and it appears that the connection between Prometheus and Vitrage works well. I see in the vitrage-graph log that Prometheus notified both on alert firing and on alert resolved statuses. I still don't understand why the alarms were not removed from Vitrage, though.
Can you please send me the output of the 'vitrage topology show' CLI command? Also, did you happen to restart vitrage-graph or vitrage-collector during your tests?

> In the log, it seems that the first problem is that the timestamp value in
> vitrage comes out as 2001-01-01, even though the starting value in the
> Prometheus alarm information has the correct value.
> When the alarm is resolved, the end timestamp value is not updated, so the
> alarm does not disappear from the alarm list.

Can you please show me where you saw the 2001 timestamp? I didn't find it in the log.

> The second problem is that even if the timestamp problem is solved, the
> entity graph problem will not be solved. The gray alarm information is not
> in the vitrage-collector log, but it is in the vitrage-graph and apache
> logs.
> I want to know how to forcefully delete an entity from the vitrage graph.

You shouldn't do it :-) There is no API for deleting entities, and messing with the database may cause unexpected results. The only thing that you can safely do is to stop all Vitrage services, execute the 'vitrage-purge-data' command, and start the services again. This will cause rebuilding of the entity graph.

> Regarding the multi nodes, I mean 1 control node (pc1) and 1 compute node
> (pc2). So one openstack.
>
> The test VM in the picture is an instance on the compute node that has
> already been deleted. I waited for hours and checked nova.conf, but it was
> not removed.
> This did not occur in the queens version; in the rocky version, in a
> multi-node environment, there seems to be a bug in VM creation on the
> multi node.
> The same situation occurred in multi-node environments that were configured
> with different PCs.

Let me make sure I understand the problem. When you create a new vm in Nova, does it immediately appear in the entity graph? When you delete a vm, does it remain? Does it remain in a multi-node environment and get deleted in a single-node environment?

Br,
Ifat
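A rough sketch of that stop/purge/start sequence; the systemd unit names below are an assumption (they differ between devstack and packaged installs), so substitute the service names used in your deployment:

sudo systemctl stop devstack@vitrage-graph devstack@vitrage-collector devstack@vitrage-api
vitrage-purge-data
sudo systemctl start devstack@vitrage-graph devstack@vitrage-collector devstack@vitrage-api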
Re: [openstack-dev] [vitrage] I have some problems with Prometheus alarms in vitrage.
Hi,

Can you please give us some more details about your scenario with Prometheus? Please try to give as many details as possible, so we can try to reproduce the bug.
What do you mean by "if the alarm is resolved, the alarm manager makes a silence, or removes the alarm rule from Prometheus"? These are different cases. None of them works in your environment?
Which Prometheus and Alertmanager versions are you using?
Please try to change the Vitrage log level to DEBUG (set "debug = true" in /etc/vitrage/vitrage.conf) and send me the Vitrage collector, graph and api logs.

Regarding the multi nodes, I'm not sure I understand your configuration. Do you mean there is more than one OpenStack and Nova? More than one host? More than one vm?
Basically, vms are deleted from Vitrage in two cases:
1. After each periodic call to get_all of the nova.instance datasource. By default this is done once in 10 minutes.
2. Immediately, if you have the following configuration in /etc/nova/nova.conf:
notification_topics = notifications,vitrage_notifications

So, please check your nova.conf and also whether the vms are deleted after 10 minutes.

Thanks,
Ifat

On Thu, Oct 4, 2018 at 7:12 AM Won wrote:

> Thank you for your reply Ifat.
>
> The alertmanager.yml file already contains 'send_resolved: true'.
> However, the alarm does not disappear from the alarm list and the entity
> graph even if the alarm is resolved, the alarm manager makes a silence, or
> the alarm rule is removed from Prometheus.
> The only way to remove alarms is to manually remove them from the db. Is
> there any other way to remove the alarm?
> Entities (vms) that run on multi nodes in the rocky version have similar
> symptoms. There was a symptom that the entities created on the multi-node
> would not disappear from the entity graph even after deletion.
> Is this a bug in the rocky version?
>
> Best Regards,
> Won
Re: [openstack-dev] [vitrage] I have some problems with Prometheus alarms in vitrage.
Thank you for your reply Ifat.

The alertmanager.yml file already contains 'send_resolved: true'.
However, the alarm does not disappear from the alarm list and the entity graph even if the alarm is resolved, the alarm manager makes a silence, or the alarm rule is removed from Prometheus.
The only way to remove alarms is to manually remove them from the db. Is there any other way to remove the alarm?
Entities (vms) that run on multi nodes in the rocky version have similar symptoms. There was a symptom that the entities created on the multi-node would not disappear from the entity graph even after deletion.
Is this a bug in the rocky version?

Best Regards,
Won

On Wed, Oct 3, 2018 at 5:46 PM, Ifat Afek wrote:

> Hi,
>
> In the alertmanager.yml file you should have a receiver for Vitrage.
> Please verify that it includes "send_resolved: true". This is required for
> Prometheus to notify Vitrage when an alarm is resolved.
>
> The full Vitrage receiver definition should be:
>
> - name: **
>   webhook_configs:
>   - url: **  # example: 'http://127.0.0.1:8999/v1/event'
>     send_resolved: true
>     http_config:
>       basic_auth:
>         username: **
>         password: **
>
> Hope it helps,
> Ifat
>
> On Tue, Oct 2, 2018 at 7:51 AM Won wrote:
>
>> I have some problems with Prometheus alarms in vitrage.
>> I receive the list of alarms from the Prometheus alertmanager well, but an
>> alarm does not disappear when the problem (alarm) is resolved. An alarm
>> that has appeared once, in both the alarm list and the entity graph, never
>> disappears from vitrage. An alarm sent by zabbix disappears when the alarm
>> is resolved. I would like to know how to clear a Prometheus alarm from
>> vitrage and how to update alarms automatically, as with zabbix.
>> Thank you.
Re: [openstack-dev] [vitrage] I have some problems with Prometheus alarms in vitrage.
Hi,

In the alertmanager.yml file you should have a receiver for Vitrage. Please verify that it includes "send_resolved: true". This is required for Prometheus to notify Vitrage when an alarm is resolved.

The full Vitrage receiver definition should be:

- name: **
  webhook_configs:
  - url: **  # example: 'http://127.0.0.1:8999/v1/event'
    send_resolved: true
    http_config:
      basic_auth:
        username: **
        password: **

Hope it helps,
Ifat

On Tue, Oct 2, 2018 at 7:51 AM Won wrote:

> I have some problems with Prometheus alarms in vitrage.
> I receive the list of alarms from the Prometheus alertmanager well, but an
> alarm does not disappear when the problem (alarm) is resolved. An alarm
> that has appeared once, in both the alarm list and the entity graph, never
> disappears from vitrage. An alarm sent by zabbix disappears when the alarm
> is resolved. I would like to know how to clear a Prometheus alarm from
> vitrage and how to update alarms automatically, as with zabbix.
> Thank you.
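For completeness, a minimal sketch of how such a receiver might be wired into the route section of alertmanager.yml; the receiver name and URL are placeholders, and the http_config/basic_auth block shown above is only needed if the Vitrage event API requires authentication:

route:
  receiver: 'vitrage'
  group_by: ['alertname', 'instance']

receivers:
- name: 'vitrage'
  webhook_configs:
  - url: 'http://127.0.0.1:8999/v1/event'
    send_resolved: true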