(replies inline)

On Mon, 19 Mar 2018, Samuel Van Oort wrote:
> Late to the party because I was heads-down on Pipeline bugs a lot of
> Friday, but this is a subject near-and-dear to my heart, and in the past
> I've discussed which metrics might be interesting, since surfacing them was
> an explicit intent of my Bismuth work (the Pipeline Graph Analysis APIs).
> Some of these are things I'd wanted to make a weekend project of (including
> surfacing the existing workflow-cps performance metrics). Long reply is
> long!

Thanks for taking the time to respond, Sam. Suffice it to say, there isn't a
world in which I wouldn't use statsd for this :)

My current thinking is to incorporate the Metrics plugin
(https://plugins.jenkins.io/metrics) to provide the appropriate interfaces,
and if that works out, I would have no qualms with it becoming a dependency
of Pipeline itself. I still need to research how much Dropwizard baggage
that might unnecessarily pull into Jenkins.

As for your inline comments: I don't think there's any problem with
collecting as much telemetry as you and the other Pipeline developers see
fit. My list was mostly what *I* think I need in order to demonstrate
success with Pipeline for Jenkins Essentials, and to understand how Jenkins
Essentials is being used so we can guide our future roadmap.

Cheers

> We should aim to implement metrics using the existing Metrics interface,
> because then they can be fairly easily exported in a variety of ways -- I
> use a Graphite Metrics reporter that couples to another metric
> aggregator/store for the Pipeline Scalability Lab (some may know it as
> "Hydra"). Other *cough* proprietary systems may already consume this
> format of data. I would not be surprised if a StatsD reporter is pretty
> easy to hack together using https://github.com/ReadyTalk/metrics-statsd,
> and you get a lot of goodies "for free".
>
> The one catch for implementing metrics is that we want to be cautious about
> adding too much overhead to the execution process.
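For anyone curious what feeding Dropwizard-style metrics into StatsD involves: the wire format is trivially simple, which is part of why a reporter is "easy to hack together". The sketch below is a toy, stdlib-only approximation -- the class and metric names are made up, and a real reporter would wrap a Dropwizard MetricRegistry and ship these lines over UDP rather than printing them:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/**
 * Toy sketch of mapping counter/timer metrics onto the StatsD wire format.
 * Class name and metric names are hypothetical; a real implementation would
 * wrap a Dropwizard MetricRegistry and send the rendered lines over UDP.
 */
public class StatsdSketch {
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    /** Increment a named counter (thread-safe via LongAdder). */
    public void increment(String name) {
        counters.computeIfAbsent(name, k -> new LongAdder()).increment();
    }

    /** Render a counter as a StatsD counter line: <name>:<value>|c */
    public String counterLine(String name) {
        LongAdder value = counters.getOrDefault(name, new LongAdder());
        return name + ":" + value.sum() + "|c";
    }

    /** Render a duration as a StatsD timing line: <name>:<millis>|ms */
    public static String timingLine(String name, long millis) {
        return name + ":" + millis + "|ms";
    }

    public static void main(String[] args) {
        StatsdSketch metrics = new StatsdSketch();
        metrics.increment("jenkins.pipeline.runs");
        metrics.increment("jenkins.pipeline.runs");
        System.out.println(metrics.counterLine("jenkins.pipeline.runs"));
        System.out.println(timingLine("jenkins.pipeline.run.duration", 4200));
    }
}
```

The `|c` and `|ms` suffixes are the standard StatsD counter and timing types, which is why hooking an existing metrics-statsd bridge gets the "goodies for free".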
> As far as specific metrics:
>
> > distinct built-in step invocations (i.e. not counting Global Variable
> > invocations)
>
> This can't be measured easily from the flow graph, due to the potential to
> create multiple block structures for one step. It COULD be added easily
> via a new registered StepListener API in workflow-api (and implemented in
> workflow-cps), though. I think it's valuable.
>
> > configured Declarative Pipelines, configured Scripted Pipelines
>
> We can get all Pipelines (flavor-agnostic) by iterating over WorkflowJob
> items. I'm not sure how we'd tell Scripted vs. Declarative apart -- maybe
> by registering a Listener extension point of some sort? I see value here.
>
> I'd *also* like to have a breakdown of which Pipelines have been run in
> the last, say, week and month, by type (easy to do by looking at the most
> recent build). That way we know not just which were created, but which are
> in active use.
>
> > Pipeline executions
>
> Rates and counts can be achieved with the existing Metrics Timer type. I'd
> like to see that broken down by Scripted vs. Declarative as well.
>
> > * Global Shared Pipelines configured
> > * Folder-level Shared Pipelines configured
>
> Do you mean Shared Library use? One metric I'd be interested in is how
> many shared libraries are used *per-pipeline* -- easy to measure from the
> count of LoadedScripts, I believe (correct me if there's something I'm
> missing here, Jesse).
>
> > Agents used per-Pipeline
>
> I think it should be possible to do this easily via flow graph analysis,
> looking for WorkspaceActionImpl -- nodes and labels are available. We
> might want to count both total node *uses* (open/close of node blocks) and
> distinct nodes used.
>
> This is best triggered as a post-build analysis using the RunTrigger --
> that way it's just a quick iteration over the Pipeline.
>
> > Runtime duration per step invocation
>
> This is one of the MOST useful metrics, I think.
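To make the StepListener idea concrete, here's a rough, self-contained sketch of invocation counting. The `onStepStart` hook and the class are hypothetical stand-ins for the proposed listener API, and the string key stands in for what `StepDescriptor.getFunctionName()` would return (e.g. `sh`, `echo`, `node`):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/**
 * Hypothetical sketch of counting built-in step invocations. In a real build
 * of this idea, onStepStart would be driven by the proposed StepListener API
 * in workflow-api, and the key would come from StepDescriptor.getFunctionName().
 */
public class StepInvocationCounter {
    private final Map<String, LongAdder> invocations = new ConcurrentHashMap<>();

    /** Would be called once per step start by the (hypothetical) StepListener. */
    public void onStepStart(String functionName) {
        invocations.computeIfAbsent(functionName, k -> new LongAdder()).increment();
    }

    /** Total invocations recorded for one step type. */
    public long count(String functionName) {
        LongAdder c = invocations.get(functionName);
        return c == null ? 0L : c.sum();
    }
}
```

Counting at step start (rather than from the flow graph afterwards) sidesteps the multiple-blocks-per-step problem Sam mentions, since each invocation fires the hook exactly once.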
> I already have an implementation used in the Scalability Lab that does this
> on a per-FlowNode basis using the GraphListener (rather than per-step).
> This is part of a small utility plugin for metrics used in the Scalability
> Lab (not hosted currently, since it's not general-use).
>
> Doing this per-step is somewhat more complex -- for many steps it's
> trivial, but for a Retry step, for example, there's no logical way to do it
> because you get multiple blocks. Blocks in general are undefined -- do you
> count the block *contents*, just the start, just the end, or the start+end
> nodes? Also remember that with FlowNodes, Groovy logic counts against the
> step time. Usually that shouldn't be a huge issue unless the Groovy is
> complex.
>
> If that's too noisy, there might be ways to insert Listeners for the Step
> itself (more complex though) -- I think using the FlowNodes is good enough
> for now and gives us a solid first-order approximation that is useful 99%
> of the time.
>
> I would also like to extend this by breaking it down into separate metrics
> per step type, i.e. runtime for sh, runtime for echo, for 'node', etc.
> This is easier than you'd think, since you can fetch the StepDescriptor and
> call getFunctionName to get a unique metric key for the step. This is far
> more useful to us than just average step timings, because it helps spot
> performance regressions in the field.
>
> Other aggregates of interest: total time spent in each step type for the
> pipeline, and counts of FlowNodes by step per pipeline. This will show
> whether we're spending (for example) a LOT of time running
> readFile/writeFile/dir steps due to some sudden bottleneck in the remoting
> interaction, and also reveal which step types are used most often. Knowing
> which steps are used heavily helps me know which deserve extra priority
> for bugfixes, features, and optimizations.
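A stdlib-only sketch of that per-step-type aggregation (a guess at the shape, not the Scalability Lab plugin's actual code). The start/end timestamps would come from FlowNodes observed by a GraphListener, and the key mimics the `getFunctionName()` metric-key idea:

```java
import java.util.LongSummaryStatistics;
import java.util.Map;
import java.util.TreeMap;

/**
 * Hypothetical sketch: aggregate step durations per step type for one
 * Pipeline run. Timestamps would come from FlowNode timing data walked by
 * a GraphListener; the key stands in for StepDescriptor.getFunctionName().
 */
public class StepTimeAggregator {
    private final Map<String, LongSummaryStatistics> byStep = new TreeMap<>();

    /** Record one completed occurrence of a step (one start/end FlowNode pair). */
    public void record(String functionName, long startMillis, long endMillis) {
        byStep.computeIfAbsent(functionName, k -> new LongSummaryStatistics())
              .accept(endMillis - startMillis);
    }

    /** Total milliseconds spent in the given step type for this Pipeline. */
    public long totalMillis(String functionName) {
        LongSummaryStatistics s = byStep.get(functionName);
        return s == null ? 0L : s.getSum();
    }

    /** How many times the given step type ran (the FlowNode count by step). */
    public long count(String functionName) {
        LongSummaryStatistics s = byStep.get(functionName);
        return s == null ? 0L : s.getCount();
    }
}
```

The total-time and count views in one structure are exactly the two aggregates mentioned above: total time per step type exposes bottlenecks like slow readFile/writeFile/dir calls, and the counts show which steps are used most.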
> It actually *sounds* far more complicated than it really is -- this would
> be a pretty trivial afternoon project, I think.
>
> > Runtime duration per Pipeline
>
> I already have an implementation of this in the same plugin as above. It's
> exposed as a Dropwizard histogram as well, so you get rates plus aggregate
> times with median, mean, etc.
>
> *Other desired metrics:* I think we want FlowNodes created as a rate per
> unit time (again, I already have an implementation in the same plugin).
>
> If we could find a way, I'd really like to have a counter of how many
> elements of GroovyCPS logic are run and how many function calls are made
> (for off-master execution you obviously wouldn't get this data). This is
> useful for measuring the real complexity of a user's Groovy -- even better
> than Liam's cyclomatic complexity metric, because it directly tracks
> runtime operations, not just code structure. I have notions of how we'd
> accomplish this.
>
> On Friday, March 16, 2018 at 6:55:58 PM UTC-4, Andrew Bayer wrote:
> >
> > It's a normal step - what I'm talking about is counting Pipelines
> > containing one or more script blocks, i.e., what percentage of total
> > Declarative Pipelines use script blocks, which I think is a more useful
> > metric than just how many script block invocations there are.
> >
> > A.
> >
> > On Fri, Mar 16, 2018 at 5:32 PM R. Tyler Croy <[email protected]> wrote:
> >
> >> (replies inline)
> >>
> >> On Fri, 16 Mar 2018, Andrew Bayer wrote:
> >>
> >> > If we're going to be tracking step invocations anyway, it'd be
> >> > interesting to count the number of Declarative Pipelines with a
> >> > script block, maybe?
> >>
> >> I kind of assumed that if we were incrementing a counter on step
> >> invocations, then script{} would be collected already by the machinery,
> >> e.g. isn't it "just" a step?
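As a toy illustration of Andrew's "percentage of Declarative Pipelines with script blocks" metric: real detection should hook the Declarative parser or the step machinery (since script{} is indeed just a step), but a crude regex over Jenkinsfile text shows the shape of the computation. Everything here -- class, method names, the regex heuristic -- is hypothetical:

```java
import java.util.List;
import java.util.regex.Pattern;

/**
 * Toy sketch of the "percentage of Declarative Pipelines using script blocks"
 * metric. A regex over Jenkinsfile text is only a rough heuristic (it can
 * match comments or strings); a real implementation would inspect the parsed
 * model or count script-step invocations per Pipeline.
 */
public class ScriptBlockRatio {
    private static final Pattern SCRIPT_BLOCK = Pattern.compile("\\bscript\\s*\\{");

    /** Heuristic: does this Pipeline definition contain a script {} block? */
    public static boolean usesScriptBlock(String jenkinsfile) {
        return SCRIPT_BLOCK.matcher(jenkinsfile).find();
    }

    /** Percentage of the given Declarative definitions containing script {}. */
    public static double percentUsingScript(List<String> jenkinsfiles) {
        if (jenkinsfiles.isEmpty()) return 0.0;
        long n = jenkinsfiles.stream()
                             .filter(ScriptBlockRatio::usesScriptBlock)
                             .count();
        return 100.0 * n / jenkinsfiles.size();
    }
}
```

The point of the ratio (rather than a raw invocation counter) is that one Pipeline with fifty script blocks shouldn't look like fifty Pipelines escaping the Declarative model.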
> >> If it's a special snowflake then I'll make sure to include it in my
> >> design.
> >>
> >> A few more which come to mind now that I'm thinking about Script:
> >>
> >> * Count of stages per Pipeline
> >> * Count of Pipelines with the Groovy sandbox disabled
> >> * Time spent in script{} blocks
> >>
> >> Thanks for the ideas abayer!
> >>
> >> Cheers
> >> - R. Tyler Croy
> >>
> >> ------------------------------------------------------
> >> Code: <https://github.com/rtyler>
> >> Chatter: <https://twitter.com/agentdero>
> >> xmpp: [email protected]
> >>
> >> % gpg --keyserver keys.gnupg.net --recv-key 1426C7DC3F51E16F
> >> ------------------------------------------------------
> >>
> >> --
> >> You received this message because you are subscribed to the Google
> >> Groups "Jenkins Developers" group.
> >> To unsubscribe from this group and stop receiving emails from it, send
> >> an email to [email protected].
> >> To view this discussion on the web visit
> >> https://groups.google.com/d/msgid/jenkinsci-dev/20180316213201.ckenekkqcbgtsuzx%40blackberry.coupleofllamas.com
> >> For more options, visit https://groups.google.com/d/optout.
>
> --
> You received this message because you are subscribed to the Google Groups
> "Jenkins Developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/jenkinsci-dev/9842785f-49ee-43fc-ab61-d9e7b45dc3db%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

- R. Tyler Croy

------------------------------------------------------
Code: <https://github.com/rtyler>
Chatter: <https://twitter.com/agentdero>
xmpp: [email protected]

% gpg --keyserver keys.gnupg.net --recv-key 1426C7DC3F51E16F
------------------------------------------------------

--
You received this message because you are subscribed to the Google Groups
"Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/jenkinsci-dev/20180319223149.lcps35ksuqiwocax%40blackberry.coupleofllamas.com.
For more options, visit https://groups.google.com/d/optout.