wu-sheng commented on code in PR #614:
URL: https://github.com/apache/skywalking-website/pull/614#discussion_r1241374523

##########
content/blog/2023-06-25-intruducing-continuous-profiling-skywalking-with-ebpf/index.md:
##########
@@ -0,0 +1,248 @@
+---
+title: "Activating Automatical Performance Analysis -- Continuous Profiling"
+date: 2023-06-25
+author: "Han Liu"
+description: "Introduce and demonstrate how SkyWalking implements eBPF-based process monitoring with few manual engagements. The profiling could be automatically activated driven by the preset conditions."
+tags:
+- eBPF
+- Profiling
+- Tracing
+---
+
+# Background
+
+In previous articles, We have discussed how to use SkyWalking and eBPF for performance problem detection within [processes](/blog/2022-07-05-pinpoint-service-mesh-critical-performance-impact-by-using-ebpf) and [networks](blog/diagnose-service-mesh-network-performance-with-ebpf).
+However, there are still two outstanding issues:

Review Comment:
```suggestion
They are good methods to locate issues, but still there are some challenges:
```

##########
content/blog/2023-06-25-intruducing-continuous-profiling-skywalking-with-ebpf/index.md:
##########
@@ -0,0 +1,248 @@
+---
+title: "Activating Automatical Performance Analysis -- Continuous Profiling"
+date: 2023-06-25
+author: "Han Liu"
+description: "Introduce and demonstrate how SkyWalking implements eBPF-based process monitoring with few manual engagements. The profiling could be automatically activated driven by the preset conditions."
+tags:
+- eBPF
+- Profiling
+- Tracing
+---
+
+# Background
+
+In previous articles, We have discussed how to use SkyWalking and eBPF for performance problem detection within [processes](/blog/2022-07-05-pinpoint-service-mesh-critical-performance-impact-by-using-ebpf) and [networks](blog/diagnose-service-mesh-network-performance-with-ebpf).
+However, there are still two outstanding issues:
+
+1. **The timing of the task initiation**: It's always challenging to address the processes that require performance monitoring when problems occur.
+Typically, manual engagement is required to identify processes and the types of performance analysis necessary, which cause extra time during the crash recovery.
+The root cause locating and the time of crash recovery conflict with each other from time to time.
+In the real case, rebooting would be the first choice of recovery, meanwhile, it destroys the site of crashing.
+2. **Resource consumption of tasks**: The difficulties to determine the profiling scope. Wider profiling causes more resources than it should.
+We need a method to manage resource consumption and understand which processes necessitate performance analysis.
+3. **Engineer capabilities**: On-call is usually covered by the whole team, which have junior and senior engineers, even senior engineers have their understanding limitation of the complex distributed system,
+it is nearly impossible to understand the whole system by a single one person.
+
+The **Continuous Profiling** is a new created mechanism to resolve the above issues.
+
+# Mechanism
+
+As profiling tasks consume a significant amount of system resources, can we find alternative ways to monitor processes that use fewer system resources? The answer is yes.

Review Comment:
```suggestion
# Automate Profiling

As profiling is resource costing and high experience required, how about introducing a method to narrow the scope and automate the profiling driven by polices creates by senior SRE engineer?
```

##########
content/blog/2023-06-25-intruducing-continuous-profiling-skywalking-with-ebpf/index.md:
##########
@@ -0,0 +1,248 @@
+---
+title: "Activating Automatical Performance Analysis -- Continuous Profiling"
+date: 2023-06-25
+author: "Han Liu"
+description: "Introduce and demonstrate how SkyWalking implements eBPF-based process monitoring with few manual engagements. The profiling could be automatically activated driven by the preset conditions."
+tags:
+- eBPF
+- Profiling
+- Tracing
+---
+
+# Background
+
+In previous articles, We have discussed how to use SkyWalking and eBPF for performance problem detection within [processes](/blog/2022-07-05-pinpoint-service-mesh-critical-performance-impact-by-using-ebpf) and [networks](blog/diagnose-service-mesh-network-performance-with-ebpf).
+However, there are still two outstanding issues:
+
+1. **The timing of the task initiation**: It's always challenging to address the processes that require performance monitoring when problems occur.
+Typically, manual engagement is required to identify processes and the types of performance analysis necessary, which cause extra time during the crash recovery.
+The root cause locating and the time of crash recovery conflict with each other from time to time.
+In the real case, rebooting would be the first choice of recovery, meanwhile, it destroys the site of crashing.
+2. **Resource consumption of tasks**: The difficulties to determine the profiling scope. Wider profiling causes more resources than it should.
+We need a method to manage resource consumption and understand which processes necessitate performance analysis.
+3. **Engineer capabilities**: On-call is usually covered by the whole team, which have junior and senior engineers, even senior engineers have their understanding limitation of the complex distributed system,
+it is nearly impossible to understand the whole system by a single one person.
+
+The **Continuous Profiling** is a new created mechanism to resolve the above issues.
+
+# Mechanism
+
+As profiling tasks consume a significant amount of system resources, can we find alternative ways to monitor processes that use fewer system resources? The answer is yes.
+Currently, SkyWalking supports establishing policy rules for specific services to be monitored by the eBPF Agent in a low-energy manner, and run profiling when necessary automatically.

Review Comment:
```suggestion
So, in 9.5.0, SkyWalking first introduced preset policy rules for specific services to be monitored by the eBPF Agent in a low-energy manner, and run profiling when necessary automatically.
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
