gushizone opened a new issue #7149:
URL: https://github.com/apache/skywalking/issues/7149
Please answer these questions before submitting your issue.
- Why do you submit this issue?
- [x] Question or discussion
- [x] Bug
- [ ] Requirement
- [x] Feature or performance improvement
___
### Question
- What do you want to know?
Why do I never see an event from the SkyWalking agent?
How can I be sure that the agent reports the event normally?
___
### Bug
- Which version of SkyWalking, OS, and JRE?
skywalking 8.6.0
- What happened?
I want to use k8s events to trigger an alarm, but it doesn't work properly.
The console of `skywalking-kubernetes-event-exporter` prints the k8s
events, but the alarms are not triggered.
What I have been able to determine:
- `service_resp_time_rule` works normally.
- Alarms from k8s events do appear sometimes, but very rarely, and I
cannot find any pattern.
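One way to separate "the alarm was never produced" from "the alarm was produced but not delivered" might be to log everything that reaches the webhook endpoint. The following is only a minimal sketch: it assumes the OAP sends an HTTP POST whose body is a JSON array of alarm objects, and the field names `ruleName`, `name`, and `alarmMessage` are assumptions that should be checked against the alarm documentation for the version in use. `summarize_alarms`, `AlarmHandler`, and `serve` are hypothetical names.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize_alarms(body: bytes) -> list:
    """Return one 'ruleName | name | alarmMessage' line per alarm in the payload."""
    alarms = json.loads(body.decode("utf-8"))
    # Field names are assumptions; .get() keeps the sketch tolerant of differences.
    return [
        f"{a.get('ruleName', '?')} | {a.get('name', '?')} | {a.get('alarmMessage', '?')}"
        for a in alarms
    ]


class AlarmHandler(BaseHTTPRequestHandler):
    """Prints a summary of every POSTed alarm batch and answers 200."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        for line in summarize_alarms(self.rfile.read(length)):
            print(line)
        self.send_response(200)
        self.end_headers()


def serve(host: str = "127.0.0.1", port: int = 8080) -> None:
    """Listen on the host and port used by the webhook URL; the path is not checked."""
    HTTPServer((host, port), AlarmHandler).serve_forever()
```

Calling `serve()` in place of the real receiver would show whether any event rule ever posts to the webhook, independently of what the real receiver does with the payload.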
`alarm.default.alarm-settings`
```yaml
rules:
  service_resp_time_rule:
    metrics-name: service_resp_time
    op: ">"
    threshold: 500
    period: 2
    count: 1
    message: Response time of service {name} is more than 500ms in 1 minutes of last 2 minutes.
  # agent event
  start_event_rule:
    metrics-name: Start
    threshold: 1
    op: ">="
    period: 1
    count: 1
    message: Service instance [{name}] has been Start
  shutdown_event_rule:
    metrics-name: Shutdown
    threshold: 1
    op: ">="
    period: 1
    count: 1
    message: Service instance [{name}] has been Shutdown
  # oap event
  alarm_event_rule:
    metrics-name: Alarm
    threshold: 1
    op: ">="
    period: 1
    count: 1
    message: Service instance [{name}] has been Alarm
  # k8s event
  killing_event_rule:
    metrics-name: Killing
    threshold: 1
    op: ">="
    period: 1
    count: 1
    message: Service instance [{name}] has been killing
  ceated_event_rule:
    metrics-name: Created
    threshold: 1
    op: ">="
    period: 1
    count: 1
    message: Service instance [{name}] has been Created
  started_event_rule:
    metrics-name: Started
    threshold: 1
    op: ">="
    period: 1
    count: 1
    message: Service instance [{name}] has been Started
  unhealthy_event_rule:
    metrics-name: Unhealthy
    threshold: 1
    op: ">="
    period: 1
    count: 1
    message: Service instance [{name}] has been unhealthy for 1 minutes
  # event statistics
  event_total_rule:
    metrics-name: event_total
    threshold: 1
    op: ">="
    period: 1
    count: 1
    message: Service instance [{name}] has been event for 1 minutes
webhooks:
  - http://127.0.0.1:8080/skw/alarm
```
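For reference, this is how I understand the `op`, `threshold`, `period`, and `count` fields to interact: `period` is a window in minutes, and the rule fires when at least `count` of the per-minute values inside that window satisfy `op threshold`. The sketch below is a simplified model under that assumption, not the OAP implementation; `rule_fires` is a hypothetical helper, and only the two operators used in the settings above are modelled.

```python
import operator

# Only the operators that appear in the settings above.
OPS = {">": operator.gt, ">=": operator.ge}


def rule_fires(per_minute_values, op, threshold, period, count):
    """Return True if at least `count` of the last `period` values match.

    `per_minute_values` is ordered oldest to newest; None marks a minute
    for which no value was recorded.
    """
    window = per_minute_values[-period:]
    matches = sum(1 for v in window if v is not None and OPS[op](v, threshold))
    return matches >= count


# Event rules above: period 1, count 1, threshold 1, op ">=".
print(rule_fires([0, 0, 1], ">=", 1, period=1, count=1))   # event counted in the latest minute
print(rule_fires([1, 0], ">=", 1, period=1, count=1))      # event counted one minute earlier
print(rule_fires([600, 100], ">", 500, period=2, count=1)) # service_resp_time_rule shape
```

Under this model, a rule with `period: 1` only looks at the most recent minute, so an event attributed to any earlier minute would not match.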
`skywalking-kubernetes-event-exporter` console log
```shell
INFO
{"uuid":"2ab7c04e-148d-48dc-993b-4115534cf251","source":{},"name":"NoPods","message":"No
matching pods found","startTime":1623308503000,"endTime":1624259215000}
INFO
{"uuid":"5670a802-8f62-4949-a761-62b8e931879b","source":{},"name":"SuccessfulCreate","message":"Created
pod:
eai-engine-667b546458-nd9bd","startTime":1624258252000,"endTime":1624258252000}
INFO
{"uuid":"e4480ce5-5ee2-4ec2-b11e-ef9fc9bdfd67","source":{},"name":"ScalingReplicaSet","message":"Scaled
up replica set eai-engine-667b546458 to
1","startTime":1624258252000,"endTime":1624258252000}
INFO
{"uuid":"c9b0ae2f-9084-4189-a01a-5121d237c6f0","source":{},"name":"SuccessfulCreate","message":"Created
pod:
skywalking-event-exporter-866fc9bb87-95g8s","startTime":1624259277000,"endTime":1624259277000}
INFO
{"uuid":"23464532-d641-4bc5-80cf-674ee2341b1f","source":{},"name":"ScalingReplicaSet","message":"Scaled
up replica set skywalking-event-exporter-866fc9bb87 to
1","startTime":1624259277000,"endTime":1624259277000}
INFO
{"uuid":"1e284d4c-b7b9-4ea6-9832-2bf335c59807","source":{},"name":"ScalingReplicaSet","message":"Scaled
down replica set skywalking-event-exporter-8699d78495 to
0","startTime":1624259286000,"endTime":1624259286000}
INFO
{"uuid":"db9c5a06-c9c4-4f63-a3c8-1fbbaf3d22ff","source":{},"name":"SuccessfulDelete","message":"Deleted
pod:
skywalking-event-exporter-8699d78495-bz5qk","startTime":1624259286000,"endTime":1624259286000}
INFO
{"uuid":"3a628a9f-f751-4b7e-9357-be691520e40b","source":{"service":"eai-engine-svc","serviceInstance":"eai-engine-78468f4454-bl2dm"},"name":"TaintManagerEviction","message":"Cancelling
deletion of Pod
dev/eai-engine-78468f4454-bl2dm","startTime":1623738628000,"endTime":1624259058000}
INFO
{"uuid":"6194854b-7da7-4d7f-9f5e-1d89f97f7be2","source":{"service":"skywalk-server","serviceInstance":"skywalking-oap-5f6ff674f7-44m6l"},"name":"Unhealthy","type":1,"message":"Readiness
probe failed: dial tcp 10.244.14.106:12800: i/o
timeout","startTime":1624258442000,"endTime":1624258925000}
INFO
{"uuid":"e0608fc6-b0b3-4208-a2cf-1fac3c931889","source":{"service":"***-auth-server-svc","serviceInstance":"***-auth-server-6d4cc4d69-zgrvm"},"name":"Unhealthy","type":1,"message":"Readiness
probe failed: HTTP probe failed with statuscode:
503","startTime":1623147974000,"endTime":1624259214000}
INFO
{"uuid":"80671479-19ea-4930-8fa3-5e13b5412024","source":{"service":"***-task-svc","serviceInstance":"***-task-86d58c5b44-vwgvn"},"name":"Unhealthy","type":1,"message":"Liveness
probe failed: Get http://10.244.14.109:9009/taskManager/actuator/health:
net/http: request canceled (Client.Timeout exceeded while awaiting
headers)","startTime":1624258381000,"endTime":1624258381000}
INFO
{"uuid":"21059e00-6435-4834-a072-d423664d55ea","source":{"service":"skw-ui","serviceInstance":"skywalking-ui-65c8b5b58c-vsljh"},"name":"TaintManagerEviction","message":"Cancelling
deletion of Pod
dev/skywalking-ui-65c8b5b58c-vsljh","startTime":1624259058000,"endTime":1624259058000}
INFO
{"uuid":"be998334-505d-4cb2-8cbe-c411854fcaaf","source":{"service":"skywalk-server","serviceInstance":"skywalking-oap-5f6ff674f7-44m6l"},"name":"Pulled","message":"Container
image \"apache/skywalking-oap-server:8.6.0-es7\" already present on
machine","startTime":1624242921000,"endTime":1624258546000}
INFO
{"uuid":"31319b10-2ae6-47a2-925c-6bd83c24aedc","source":{"service":"skywalk-server","serviceInstance":"skywalking-oap-5f6ff674f7-44m6l"},"name":"Unhealthy","type":1,"message":"Liveness
probe failed: dial tcp 10.244.14.106:12800: connect: connection
refused","startTime":1624242955000,"endTime":1624258575000}
INFO
{"uuid":"5a6a2d38-dfff-4c6f-852d-9acbc2c607a5","source":{"service":"***-auth-server-svc","serviceInstance":"***-auth-server-6d4cc4d69-zgrvm"},"name":"Started","message":"Started
container ***-auth","startTime":1623060063000,"endTime":1624259062000}
INFO
{"uuid":"2e10b7ce-42f6-45a2-bcd9-f014f09a79ac","source":{"service":"***-auth-server-svc","serviceInstance":"***-auth-server-6d4cc4d69-zgrvm"},"name":"Unhealthy","type":1,"message":"Liveness
probe failed: HTTP probe failed with statuscode:
503","startTime":1623147996000,"endTime":1624259196000}
INFO
{"uuid":"e84237c7-5ad7-4764-a619-493b787bfba3","source":{"service":"***-task-svc","serviceInstance":"***-task-86d58c5b44-vwgvn"},"name":"Unhealthy","type":1,"message":"Liveness
probe failed: HTTP probe failed with statuscode:
503","startTime":1624259031000,"endTime":1624259121000}
INFO
{"uuid":"55225d94-4ad5-4711-bf19-781fe79c8359","source":{"service":"***-alarm-svc","serviceInstance":"***-alarm-5bf76c99f8-g2vgz"},"name":"Unhealthy","type":1,"message":"Liveness
probe failed: HTTP probe failed with statuscode:
503","startTime":1623060743000,"endTime":1624258613000}
INFO
{"uuid":"1744068a-135a-477e-b3dd-5050894ac8a9","source":{"service":"skywalk-server","serviceInstance":"skywalking-oap-5f6ff674f7-44m6l"},"name":"Started","message":"Started
container oap","startTime":1624242923000,"endTime":1624258550000}
INFO
{"uuid":"50c02b6e-a944-454f-b444-d1176fb0c9c6","source":{"service":"***-alarm-svc","serviceInstance":"***-alarm-5bf76c99f8-g2vgz"},"name":"Unhealthy","type":1,"message":"Readiness
probe failed: Get http://10.244.8.160:9012/actuator/health: net/http: request
canceled (Client.Timeout exceeded while awaiting
headers)","startTime":1623059514000,"endTime":1624259154000}
INFO
{"uuid":"2fea99fa-3703-414c-bcd2-b8976ce06405","source":{"service":"***-auth-server-svc","serviceInstance":"***-auth-server-6d4cc4d69-zgrvm"},"name":"Unhealthy","type":1,"message":"Readiness
probe failed: Get http://10.244.8.171:8081/actuator/health: net/http: request
canceled (Client.Timeout exceeded while awaiting
headers)","startTime":1624258684000,"endTime":1624259154000}
INFO
{"uuid":"aefb021e-17e2-45fc-84e3-1b79b425a869","source":{"service":"***-gateway-svc","serviceInstance":"***-gateway-67d9846769-bzfx2"},"name":"TaintManagerEviction","message":"Cancelling
deletion of Pod
dev/***-gateway-67d9846769-bzfx2","startTime":1623738628000,"endTime":1624259058000}
INFO
{"uuid":"9dd3f0b2-13fa-4051-a53c-1d58d9a404fd","source":{"service":"***-notify-svc","serviceInstance":"***-notify-c979fcbc8-khgkb"},"name":"Unhealthy","type":1,"message":"Readiness
probe failed: Get http://10.244.14.35:9092/actuator/health: net/http: request
canceled (Client.Timeout exceeded while awaiting
headers)","startTime":1624258378000,"endTime":1624258378000}
INFO
{"uuid":"d50ab810-d04b-4c80-a6ff-9f987611982a","source":{"service":"***-alarm-svc","serviceInstance":"***-alarm-5bf76c99f8-g2vgz"},"name":"Unhealthy","type":1,"message":"Liveness
probe failed: Get http://10.244.8.160:9012/actuator/health: net/http: request
canceled (Client.Timeout exceeded while awaiting
headers)","startTime":1623059522000,"endTime":1624258382000}
INFO
{"uuid":"450b5e48-a3fa-4f4c-bf07-8cf93e26238e","source":{"service":"skywalk-server","serviceInstance":"skywalking-oap-5f6ff674f7-44m6l"},"name":"Unhealthy","type":1,"message":"Liveness
probe failed: dial tcp 10.244.14.106:12800: i/o
timeout","startTime":1624258442000,"endTime":1624259058000}
INFO
{"uuid":"4ec5c6b9-d391-461b-b4af-4554a715ee03","source":{"service":"***-auth-server-svc","serviceInstance":"***-auth-server-6d4cc4d69-zgrvm"},"name":"Created","message":"Created
container ***-auth","startTime":1623060063000,"endTime":1624259062000}
INFO
{"uuid":"18aa6dc9-bdf2-4a43-8a89-bca9c91f3e0d","source":{"service":"***-auth-server-svc","serviceInstance":"***-auth-server-6d4cc4d69-zgrvm"},"name":"Unhealthy","type":1,"message":"Readiness
probe failed: Get http://10.244.8.171:8081/actuator/health: dial tcp
10.244.8.171:8081: connect: connection
refused","startTime":1623060074000,"endTime":1624259104000}
INFO
{"uuid":"9fc82041-3b2b-40f7-9270-c1dfc5fca97f","source":{"service":"***-auth-server-svc","serviceInstance":"***-auth-server-6d4cc4d69-zgrvm"},"name":"Killing","message":"Container
***-auth failed liveness probe, will be
restarted","startTime":1624259056000,"endTime":1624259056000}
INFO
{"uuid":"325d249f-8d7f-43b1-a385-0d77ae1f3a86","source":{"service":"***-file-svc","serviceInstance":"***-file-587d99fc98-x9lq4"},"name":"Unhealthy","type":1,"message":"Readiness
probe failed: HTTP probe failed with statuscode:
503","startTime":1623770160000,"endTime":1624259120000}
INFO
{"uuid":"d3edd2c8-dd19-441a-9eb1-253d0118f5b4","source":{"service":"***-auth-server-svc","serviceInstance":"***-auth-server-6d4cc4d69-zgrvm"},"name":"Pulled","message":"Container
image
\"10.180.4.204:30151/repository/yonyou-auto-architect-docker-repository/***-auth-server:e435507\"
already present on machine","startTime":1624259062000,"endTime":1624259062000}
INFO
{"uuid":"8f31ea81-a5d2-4141-96f3-431642b0bc4f","source":{"service":"***-gateway-admin-svc","serviceInstance":"***-gateway-admin-5686845c44-lwdff"},"name":"Unhealthy","type":1,"message":"Readiness
probe failed: Get http://10.244.15.177:9005/actuator/health: net/http: request
canceled (Client.Timeout exceeded while awaiting
headers)","startTime":1623913007000,"endTime":1624258382000}
INFO
{"uuid":"a921c1da-e076-4a45-b438-015a1880f05e","source":{},"name":"Killing","message":"Container
eai-engine failed liveness probe, will be
restarted","startTime":1624258395000,"endTime":1624258395000}
INFO
{"uuid":"706ed425-c36e-4117-9aa3-3e6368cc3967","source":{},"name":"Started","message":"Started
container eai-engine","startTime":1624258287000,"endTime":1624258404000}
INFO
{"uuid":"f2a29a4a-56f4-4f04-b87a-abb1b4482a2c","source":{},"name":"Pulled","message":"Container
image
\"10.180.4.204:30151/repository/yonyou-auto-architect-docker-repository/eai-engine:info-eai-1.3.7-185-gc9ce93b\"
already present on machine","startTime":1624258399000,"endTime":1624258399000}
INFO
{"uuid":"01e780d1-68c7-495a-805c-9fb96a0e7ff4","source":{},"name":"Scheduled","message":"Successfully
assigned dev/eai-engine-667b546458-nd9bd to
node208","startTime":-6795364578871,"endTime":-6795364578871}
INFO
{"uuid":"d8cc09de-3dfb-4e9a-aaf9-ad76085a3497","source":{},"name":"Unhealthy","type":1,"message":"Liveness
probe failed: Get http://10.244.17.50:9088/infoeai/actuator/health: dial tcp
10.244.17.50:9088: connect: connection
refused","startTime":1624258365000,"endTime":1624258395000}
INFO
{"uuid":"3b1e4787-9609-46f1-97fb-bab5383c9d5b","source":{},"name":"Pulled","message":"Successfully
pulled image
\"10.180.4.204:30151/repository/yonyou-auto-architect-docker-repository/eai-engine:info-eai-1.3.7-185-gc9ce93b\"","startTime":1624258283000,"endTime":1624258283000}
INFO
{"uuid":"5edea087-b979-445b-adfa-702dc62262a3","source":{},"name":"Pulling","message":"Pulling
image
\"10.180.4.204:30151/repository/yonyou-auto-architect-docker-repository/eai-engine:info-eai-1.3.7-185-gc9ce93b\"","startTime":1624258256000,"endTime":1624258256000}
INFO
{"uuid":"bb83ad2c-e153-4110-b60d-132a5fba8309","source":{},"name":"Unhealthy","type":1,"message":"Readiness
probe failed: Get http://10.244.17.50:9088/infoeai/actuator/health: dial tcp
10.244.17.50:9088: connect: connection
refused","startTime":1624258302000,"endTime":1624259162000}
INFO
{"uuid":"9acb0f05-6183-4fd4-9673-c144e43e29ff","source":{},"name":"Created","message":"Created
container eai-engine","startTime":1624258287000,"endTime":1624258403000}
INFO
{"uuid":"6ddd9e64-2c68-4bcb-b040-d58aa2b14fa2","source":{},"name":"Pulled","message":"Container
image \"kezhenxu94/skywalking-kubernetes-event-exporter:2f5f8b5\" already
present on machine","startTime":1624259283000,"endTime":1624259283000}
INFO
{"uuid":"d42eaae8-d0a9-4018-8700-b400cdecce90","source":{},"name":"Pulled","message":"Successfully
pulled image
\"10.180.4.204:30151/repository/yonyou-auto-architect-docker-repository/***-gateway-admin:3d5e943\"","startTime":1623308143000,"endTime":1624258399000}
INFO
{"uuid":"8405f95f-536b-40c0-8e24-2770839646c9","source":{},"name":"Unhealthy","type":1,"message":"Readiness
probe failed: Get http://10.244.17.250:9005/actuator/health: dial tcp
10.244.17.250:9005: connect: connection
refused","startTime":1624258385000,"endTime":1624258997000}
INFO
{"uuid":"5a407259-e878-4271-8ef3-c1a77d1ca1f7","source":{},"name":"Killing","message":"Container
second-***-gateway-admin failed liveness probe, will be
restarted","startTime":1624258375000,"endTime":1624258375000}
INFO
{"uuid":"49be4db7-c882-49f0-9c11-b1f341cd78f1","source":{},"name":"Unhealthy","type":1,"message":"Liveness
probe failed: Get http://10.244.17.250:9005/actuator/health: dial tcp
10.244.17.250:9005: connect: connection
refused","startTime":1624258503000,"endTime":1624258503000}
INFO
{"uuid":"128d20f3-3363-48d5-b00a-be1939b13ed5","source":{},"name":"Unhealthy","type":1,"message":"Liveness
probe failed: Get http://10.244.17.250:9005/actuator/health: net/http: request
canceled (Client.Timeout exceeded while awaiting
headers)","startTime":1623852815000,"endTime":1624258375000}
INFO
{"uuid":"c3572bd8-e167-43f6-858f-8a43f359dbbb","source":{},"name":"Created","message":"Created
container
second-***-gateway-admin","startTime":1623308145000,"endTime":1624258402000}
INFO
{"uuid":"f7734a93-dd14-4399-9f80-3c5af29029e1","source":{},"name":"Started","message":"Started
container
skywalking-event-exporter","startTime":1624259285000,"endTime":1624259285000}
INFO
{"uuid":"1a2b45a8-3857-4504-9c81-d60357b02837","source":{},"name":"Created","message":"Created
container
skywalking-event-exporter","startTime":1624259284000,"endTime":1624259284000}
INFO
{"uuid":"a177d286-3f25-45c0-a08e-951381c466ff","source":{},"name":"BackOff","type":1,"message":"Back-off
restarting failed container","startTime":1624259225000,"endTime":1624259280000}
INFO
{"uuid":"9e1f3f1a-10fa-453a-87f5-fdeff8d39d47","source":{},"name":"Scheduled","message":"Successfully
assigned dev/skywalking-event-exporter-866fc9bb87-95g8s to
node209","startTime":-6795364578871,"endTime":-6795364578871}
INFO
{"uuid":"96f3bf87-9e64-44fb-9475-efb541e0b6e2","source":{},"name":"Pulling","message":"Pulling
image
\"10.180.4.204:30151/repository/yonyou-auto-architect-docker-repository/***-gateway-admin:3d5e943\"","startTime":1623308143000,"endTime":1624258398000}
INFO
{"uuid":"bd20ab65-d969-4ce5-8912-e36ee38ebbc5","source":{},"name":"Unhealthy","type":1,"message":"Readiness
probe failed: Get http://10.244.17.250:9005/actuator/health: net/http: request
canceled (Client.Timeout exceeded while awaiting
headers)","startTime":1623852823000,"endTime":1624258677000}
INFO
{"uuid":"f347845e-c7bf-4470-9277-06b8c4592c42","source":{},"name":"Started","message":"Started
container
second-***-gateway-admin","startTime":1623308146000,"endTime":1624258404000}
INFO
{"uuid":"a2339afc-5861-4cc9-84cd-1ff74311e727","source":{},"name":"Killing","message":"Stopping
container
skywalking-event-exporter","startTime":1624259287000,"endTime":1624259287000}
INFO
{"uuid":"2ab7c04e-148d-48dc-993b-4115534cf251","source":{},"name":"NoPods","message":"No
matching pods found","startTime":1623308503000,"endTime":1624259515000}
```
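In the log above, some events name a service and instance in `source`, while others carry an empty `"source":{}`. A small script can tally events by name and by whether `source.service` is set, which may help when comparing against the alarms that do appear. This is only a sketch: `tally_events` is a hypothetical helper, and it assumes each JSON object has been re-joined onto a single line, since the wrapped console output above is not valid JSON line by line. The two sample entries are taken from the log with their line breaks re-joined by spaces.

```python
import json
from collections import Counter


def tally_events(lines):
    """Count events by (name, has_service) from one-JSON-object-per-line input."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip the bare "INFO" level lines
        event = json.loads(line)
        has_service = bool(event.get("source", {}).get("service"))
        counts[(event["name"], has_service)] += 1
    return counts


SAMPLE = [
    'INFO',
    '{"uuid":"9fc82041-3b2b-40f7-9270-c1dfc5fca97f","source":{"service":"***-auth-server-svc","serviceInstance":"***-auth-server-6d4cc4d69-zgrvm"},"name":"Killing","message":"Container ***-auth failed liveness probe, will be restarted","startTime":1624259056000,"endTime":1624259056000}',
    'INFO',
    '{"uuid":"a921c1da-e076-4a45-b438-015a1880f05e","source":{},"name":"Killing","message":"Container eai-engine failed liveness probe, will be restarted","startTime":1624258395000,"endTime":1624258395000}',
]

for (name, has_service), n in sorted(tally_events(SAMPLE).items()):
    print(f"{name:12} service set: {has_service!s:5} count: {n}")
```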
___
### Requirement or improvement
- Please describe your requirements or improvement suggestions.
The alarm query could support a time-range filter, just like the
trace query.