On 16 Feb 2018 at 15:51, "R. Tyler Croy" <[email protected]> wrote:


One of the necessary details, in my opinion, to make Jenkins Essentials [0]
successful is providing near-real-time error telemetry. Coupled with the
"Evergreen" distribution system [1], error telemetry "post-deploy" will be
absolutely crucial to determine whether or not we have just pushed out bad
code worthy of reverting.

I currently define "error telemetry" to include:

 * Uncaught exceptions which cause the Evil Jenkins 500 page
 * Logged ERROR messages, with or without exceptions
 * Logged WARN messages, with or without exceptions


Totally agreed, automated reporting is a must.

Shouldn't the Evergreen client send feedback too? For example, if it
triggered a Jenkins restart and never heard back from it afterwards?

How about also a less automated /form/ in the Jenkins UI itself, to be used
by a human when something is clearly wrong but did not cause logs or
outages? Along the same lines, there should probably be a clear web UI
somewhere for the case where everything went wrong.

General thought/note: this will probably require some setup to prevent
attackers from triggering an auto-revert by sending bad reports to the
telemetry endpoint.
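
One possible way to make forged reports harder, sketched here only as an
illustration: each instance signs its report with a shared secret and the
telemetry endpoint checks the signature before counting the report. The class
name ReportSigner, the per-instance secret, and the payload format are all
assumptions; nothing like this is specified for Evergreen in this thread.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical helper: HMAC-SHA256 signing and verification of a report payload. */
class ReportSigner {
    /** Computes the HMAC-SHA256 of the payload using the instance's secret. */
    static byte[] sign(byte[] secret, String payload) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            return mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable or key invalid", e);
        }
    }

    /** Recomputes the signature on the receiving side and compares it with the one sent. */
    static boolean verify(byte[] secret, String payload, byte[] signature) {
        // MessageDigest.isEqual avoids a hand-rolled byte-by-byte comparison.
        return MessageDigest.isEqual(sign(secret, payload), signature);
    }
}
```

A signature only shows that the sender knows the secret. A compromised or
misbehaving instance could still send misleading reports, so limiting how much
weight any single instance carries would presumably still be needed.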


This list is by no means set in stone, and it is expected that there's going to
be some "noise" in the system, so tooling upstream of this error telemetry
won't be looking for the presence of errors but rather tracking patterns over
time [2].
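
As a rough illustration of "tracking patterns over time" rather than reacting
to individual errors, the sketch below flags an observed error count only when
it sits well above a historical baseline. The class name, the idea of
per-interval error counts, and the multiplier k are hypothetical; a mean plus
k standard deviations threshold is just one simple choice, not something
proposed in this thread.

```java
/** Hypothetical check: is the observed error count unusual compared to past intervals? */
class ErrorRateBaseline {
    /**
     * Returns true when observed exceeds mean + k * stddev of the baseline counts.
     * baseline holds error counts from earlier intervals (for example, per hour).
     */
    static boolean isAnomalous(double[] baseline, double observed, double k) {
        if (baseline.length == 0) {
            throw new IllegalArgumentException("baseline must not be empty");
        }
        double sum = 0;
        for (double v : baseline) {
            sum += v;
        }
        double mean = sum / baseline.length;

        double squared = 0;
        for (double v : baseline) {
            squared += (v - mean) * (v - mean);
        }
        // Population standard deviation of the baseline counts.
        double stddev = Math.sqrt(squared / baseline.length);

        return observed > mean + k * stddev;
    }
}
```

With a steady background of errors, a small bump stays below the threshold
while a large jump after a deploy crosses it, which is the kind of signal that
could feed a revert decision.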


The big challenge that we have, for which I wanted feedback, is *how* we can
acquire this error telemetry.


My first prototype in this area was a plugin which integrates with the
Sentry [3] error reporting service: https://github.com/jenkinsci/sentry-plugin
This approach basically spins up a background busy-waiting thread which loops
over all the loggers in the JVM, and adds the SentryHandler to loggers. Not the
prettiest solution but it mostly works. There is an opportunity to miss
logged errors before the SentryHandler is added, but it's hard to quantify how
serious a gap that might be.
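
The handler-attaching loop described above could look roughly like the sketch
below, written against java.util.logging. It is not the sentry-plugin's code:
TelemetryHandler is a hypothetical in-memory stand-in for SentryHandler, and
LoggerSweeper performs a single pass, where the plugin as described repeats
the pass from a background thread. WARNING is used as the cut-off on the
assumption that WARN and ERROR map to the WARNING and SEVERE levels.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogManager;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Hypothetical stand-in for SentryHandler: keeps WARNING and above in memory. */
class TelemetryHandler extends Handler {
    final List<LogRecord> captured = Collections.synchronizedList(new ArrayList<>());

    TelemetryHandler() {
        setLevel(Level.WARNING); // isLoggable() rejects records below this level
    }

    @Override
    public void publish(LogRecord record) {
        if (isLoggable(record)) {
            captured.add(record);
        }
    }

    @Override
    public void flush() {}

    @Override
    public void close() {}
}

/** Hypothetical helper: one sweep over the loggers registered right now. */
class LoggerSweeper {
    /** Adds the handler to every registered logger lacking it; returns how many gained it. */
    static int attachToAll(Handler handler) {
        int attached = 0;
        LogManager manager = LogManager.getLogManager();
        for (String name : Collections.list(manager.getLoggerNames())) {
            Logger logger = manager.getLogger(name);
            if (logger == null) {
                continue; // the logger was garbage collected since the names were listed
            }
            if (!Arrays.asList(logger.getHandlers()).contains(handler)) {
                logger.addHandler(handler);
                attached++;
            }
        }
        return attached;
    }
}
```

Two caveats follow from how java.util.logging works. Loggers created after a
sweep are not covered until the next sweep, which is the gap mentioned above.
Also, a record is passed to the handlers of each ancestor logger that has
parent handlers enabled, so a handler attached at every level can see the same
record more than once and would need to deduplicate.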

I am not /thrilled/ with this approach, but it meets a very important
criterion in that it's non-invasive to core and other plugins and can simply
be installed in a Jenkins instance in order to work.


I wanted to ask for more thoughts on alternative approaches, if they exist,
which would enable the collection of the error telemetry discussed above.
I'm sure there's something I'm missing.




[0] https://github.com/jenkinsci/jep/tree/master/jep/300
[1] https://github.com/jenkinsci/jep/tree/master/jep/300#auto-update
[2] For example: https://itmonitor.zenoss.com/is-your-performance-normal-how-do-you-know/
[3] https://sentry.io


Cheers
- R. Tyler Croy

------------------------------------------------------
     Code: <https://github.com/rtyler>
  Chatter: <https://twitter.com/agentdero>
     xmpp: [email protected]

  % gpg --keyserver keys.gnupg.net --recv-key 1426C7DC3F51E16F
------------------------------------------------------

--
You received this message because you are subscribed to the Google Groups
"Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/jenkinsci-dev/20180216145116.yizslgftmjgnhwmn%40blackberry.coupleofllamas.com.
For more options, visit https://groups.google.com/d/optout.
