Yes, agreed.  Should have included the caveat, "this is a developer's point
of view."  But my point of view is definitely not at the expense of
"appreciating" the operations side as well.  Your point that the provenance
tooling mitigates the need for verbose logging of individual flowfiles is
spot on.  Before the provenance tracking, logging the history of a flowfile
(via its uuid) across the entire flow was critical for both developers and
especially operations.  Now, not so much.

My comments were definitely made in the context of having the provenance
repository to track the life of a flowfile event.  Having the provenance
event database clearly shifts the utility of the log files toward the
developer side and away from the operations side.  Perhaps identifying what
useful operations-based logging still remains would be a good thing.

And since we're talking about provenance events, it would likely make sense
as well to provide the tooling to enable "grepping" of the provenance event
logs, so that unix style commands can be piped together in the same way
you'd do with plain text logs.  Perhaps a command line provenance client
would be useful to the operations toolkit and would further reduce the
(ab)use of the log files?
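
Separately, on the stack trace suppression idea from my earlier message
(quoted below), here is a rough Java sketch of what I had in mind.  It is
not an existing logback or NiFi feature.  The class name
StackTraceSuppressor, its method names, and the definition of "similar"
(same exception class plus identical stack frames) are all assumptions
made for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of NiFi or logback. */
final class StackTraceSuppressor {

    /** Per-fingerprint counter for the current time window. */
    private static final class Window {
        long startMillis;
        int count;
    }

    private final int maxTracesPerWindow;
    private final long windowMillis;
    // Grows with the number of distinct fingerprints; a real version
    // would want to evict old entries.
    private final Map<String, Window> windows = new HashMap<>();

    StackTraceSuppressor(int maxTracesPerWindow, long windowMillis) {
        this.maxTracesPerWindow = maxTracesPerWindow;
        this.windowMillis = windowMillis;
    }

    /** Assumption: "similar" means same exception class and identical frames. */
    static String fingerprint(Throwable t) {
        return t.getClass().getName() + ":" + Arrays.hashCode(t.getStackTrace());
    }

    /**
     * Returns true if the full stack trace should be logged, false if the
     * caller should log only a one-line summary.
     */
    synchronized boolean shouldLogTrace(Throwable t, long nowMillis) {
        Window w = windows.computeIfAbsent(fingerprint(t), key -> {
            Window fresh = new Window();
            fresh.startMillis = nowMillis;
            return fresh;
        });
        // Start a new window once the old one has expired.
        if (nowMillis - w.startMillis >= windowMillis) {
            w.startMillis = nowMillis;
            w.count = 0;
        }
        w.count++;
        return w.count <= maxTracesPerWindow;
    }
}
```

A caller would check shouldLogTrace(e, System.currentTimeMillis()) before
logging: pass the exception to the logger when it returns true, and log
only e.toString() otherwise.  The first trace in each period is logged in
full, and the repeats shrink to a single line.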

Adam



On Tue, May 26, 2015 at 3:12 PM, Joe Witt <[email protected]> wrote:

> It is important to keep in mind the logs have served two worlds:
> 1) The needs of a developer who wants to know when their stuff broke
> and needs context to figure out why
> 2) The needs of an operations person who does at times need the
> minutiae of starts/stops/statuses and how objects flowed through the
> system or what given processors did even when it isn't error or warn
> worthy.
>
> I believe the comments reflect a developer centric view at the expense
> of an appreciation for what operations folks have expressed needs for.
> Just as the developer doesn't care about all the items mentioned the
> operations person doesn't want those details drowned out by endless
> streams of stack traces.  Having said this...  Our usage of the logs
> and their value have diminished extensively thanks to the rise in
> capability and value of the provenance functions.  The entire notion
> of logging at this point could be reviewed.
>
> On Tue, May 26, 2015 at 2:58 PM, Adam Taft <[email protected]> wrote:
> > Sorry in advance for not cross posting this on github, but mine is more a
> > discussion oriented comment, not feedback on the pull request.
> >
> > Just to play devil's advocate a little...  I have found the default NiFi
> > logging configuration to be almost always logging the wrong things.  I most
> > definitely always want stack traces to be logged, because it's sometimes
> > very hard to recreate the state by which the stacktrace was thrown.  And
> > it's a pain to have to change the logging level (even at runtime) just to
> > hope that the stacktrace happens again.
> >
> > Whereas I almost never want all the existing framework INFO messages
> > logged.  I don't care anything about how many wali objects were garbage
> > collected, what the current state of each processor is, or if a heartbeat
> > was sent or received from a cluster configuration.  The things that one
> > must ignore when grepping a log often hinder the ability to find the
> > thing you are looking for.
> >
> > A stacktrace, by definition, is more than a DEBUG level event.  It is at
> > minimum a WARN if not ERROR condition in almost all standard cases.  And
> > you definitely can't argue that infrequent stack traces are going to bloat
> > the log files any more than the existing default logging configuration, which
> > is already very verbose.
> >
> > In short:  I want to know when things go wrong, not when they go right.
> >
> > If there's anything that might be done to help with stacktrace verbosity,
> > it would be to suppress the stacktrace if it was seen x number of times in
> > the same period.  Thus, the first stacktrace would be logged, but all similar
> > stacks would be suppressed/minimized.  This strategy is used in a few
> > logging implementations (unsure if logback has direct support for this or
> > not).
> >
> > Hopefully, this discussion could lead to a more balanced default logback
> > configuration, one with more signal and less noise.
> >
> > Recommend:
> >   -  always log stacktraces
> >   -  change default org.apache.nifi.* to WARN level
> >
> >
> > Adam
> >
> >
> >
> >
> >
> > On Tue, May 26, 2015 at 11:14 AM, joewitt <[email protected]> wrote:
> >
> >> Github user joewitt commented on the pull request:
> >>
> >>
> >> https://github.com/apache/incubator-nifi/pull/59#issuecomment-105560569
> >>
> >>     Brian,
> >>
> >>     The issue with this proposal is that it takes away the administrator's
> >> ability to control whether stack traces are provided or not.  The reason we
> >> used 'isDebugEnabled' was because that was the use case in which the stack
> >> trace was necessary (because they were debugging).   Given that you can
> >> edit logback config on the fly this seems like a reasonable approach.  But
> >> taking away their ability to control this as the effect of this proposal
> >> means they're getting the stack traces whether they want them or not.  This
> >> can cause very excessive and noisy output in the logs.
> >>
> >>     Do you have an alternative proposal for how we can retain the
> >> flexibility of letting the administrator toggle the stack traces?
> >>
> >>     Thanks
> >>     Joe
> >>
> >>
> >>
>
