Re: Review Request 20240: Follow-up KAFKA-1352: Standardize stack trace printing in logs

Guozhang Wang Thu, 24 Apr 2014 18:59:26 -0700


> On April 24, 2014, 6:53 p.m., Joel Koshy wrote:
> > Here are my initial thoughts on the conventions (and for this kind of RB -
> > i.e., which affects our coding conventions it would be great to have as many
> > opinions as possible).
> > 
> > - Single line log seems reasonable. The exception for produce/fetch requests
> >   (details=true) has its pros/cons. It will make it slightly harder to debug
> >   since each request can span multiple lines; also the log size will grow a
> >   little. One possibility (just a thought) would be to do all our request
> >   logging using a binary format. Although that would be somewhat
> >   inconvenient to parse (we will need a tool) it will at least reduce the
> >   log size on disk significantly. SREs don't like verbose (large) public
> >   access logs, but details=true is important for debugging. Hence the
> >   suggestion to consider a binary format.
> > 
> > - <quote>No stack trace below WARN</quote>: agreed, although I'm making it
> >   clearer to make sure we are on the same page: <quote-paraphrased>No stack
> >   trace for INFO, DEBUG, TRACE"</quote-paraphrased>
> > 
> > - <quote>For WARN: (a) Print stack trace when calling library functions
> >   (i.e. non-kafka functions) or we captured Throwable which could be various
> >   exception types. (b) Print e.toSting (e.Class.Name + ":" + e.Message) when
> >   we captured Throwable but we could be sure about possible exception types.
> >   (c) Only print e.Message otherwise.</quote>
> >                
> >   You *really* need to make the especially the second half of both (a) and
> >   (b) clearer and precise. I think you mean the following (can you confirm?)
> > 
> >   <quote-paraphrased>
> > 
> >   (a-i) - If we call a non-kafka function and catch a Throwable and emit a
> >   WARN and we have no clue what the underlying exception type is then log
> >   the stack trace.
> > 
> >   (a-ii)- If we call a non-kafka function and catch a Throwable and emit a
> >   WARN and we do know what the underlying exception type is then just use
> >   throwable.toString
> > 
> >   (a-iii) - If we call a kafka function use throwable.Message (or toString).
> > 
> >   </quote-paraphrased>
> > 
> >   So what if we have some helper function (in Kafka) that wraps some
> >   non-kafka function (and does not catch throwables). So if a caller of the
> >   helper function catches throwables then per the above guideline it would
> >   only log toString. However, if we were to get rid of the helper function
> >   and call that non-kafka function directly we would log the stack trace.
> >   (This is a bit contrived, but my point is that I'm not sure the convention
> >   is complete/reasonable.). Also for (a-ii) and we do know it could be (say)
> >   an IOException? That in itself could be of various specific subtypes. At
> >   what point do we say "we know what the underlying exception type is"?
> > 
> > - <quote>3. For ERROR, always print stack trace.</quote>
> >   Makes sense.
> > 
> > So it would be useful (if possible) to have some guideline on when to use
> > ERROR vs WARN since that essentially dictates whether we have a stack-trace
> > or not (per your convention above). It is probably obvious but I said it
> > anyway :) Mainly because if we go with the above convention it means the
> > difference between having a stack trace when calling a Kafka function or not
> > having one. So if we use WARN in a location which can be reached via
> > multiple exception paths then we may want to know where the exception came 
> > from.
> > That is possible only with a stack trace or a unique exception.getMessage
> > for each of the places the original exception was thrown from. So then that
> > becomes a significant point in code reviews: i.e., ensure that exception
> > messages are being set properly (and ideally uniquely) everywhere.
> > 
> > <quote>1. Avoid capture Throwable as less as possible.</quote>
> > 
> > This is a double-negative. Taken literally it suggests "Capture throwable as
> > much as possible". However, the ensuing comments seem to indicate otherwise.
> > I'm confused.
> > 
> > <quote>3. When log with WARN or below, do not include the stack trace; when
> > log with ERROR, only log the stack trace if we are catching Throwable or any
> > runtime exceptions.</quote>
> > 
> > How does the above tie in with (3) further up - i.e., always print stack
> > trace for ERROR?
> > 
> > So I'll stop here and let you clarify all of this.
> >


Thanks Joel and Jun for the comments. Let me rephrase my coding convention a 
little bit here. 

We actually do have a coding convention for logging levels between WARN and 
ERROR:

http://kafka.apache.org/coding-guide.html

Quote: "WARN and ERROR indicate something that is bad. Use WARN if you aren't 
totally sure it is bad, and ERROR if you are."

Since for most of the cases we are quite sure if it is bad or not, I would like 
to propose the following:

For WARN/ERROR level:
1) if this behavior or exception will cause the module to be mal-functional or 
in a bad state, use ERROR;
2) otherwise use WARN".

For try/catch usage:
1) if we know all the possible exception types, enumerate them in catch;
2) if 1) is not possible, catch Exception;
3) only catch Throwable if we do not want this try-block to throw "anything".

For logging exception content:
1) if we do not know where the exception can be thrown AND we care "where", log 
stack trace (e.g. we probably do not need to log stacktrace for Socket.open)
2) otherwise, if we catch Exception or Throwable, log e.toString
3) otherwise, log e.getMessage

So, we may not be necessarily always print stack trace for ERROR.

For this JIRA, I would like to focus on removing unnecessarily verbose 
according to the final conventions we discussed and agreed upon, and file a 
separate JIRA for thorough cleaning.


- Guozhang


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20240/#review41347
-----------------------------------------------------------


On April 11, 2014, 12:05 a.m., Guozhang Wang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/20240/
> -----------------------------------------------------------
> 
> (Updated April 11, 2014, 12:05 a.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1352
>     https://issues.apache.org/jira/browse/KAFKA-1352
> 
> 
> Repository: kafka
> 
> 
> Description
> -------
> 
> Standardize stack trace printing in logs:
> 
> This this the criterion I followed whenever we need to log with a thrown 
> exception:
> 
> 1. No stack trace below WARN
> 
> 2. For WARN,
> 
> a) Print stack trace when calling library functions (i.e. non-kafka 
> functions) or we captured Throwable which could be various exception types.
> 
> b) Print e.toSting (e.Class.Name + ":" + e.Message) when we captured 
> Throwable but we could be sure about possible exception types.
> 
> c) Only print e.Message otherwise.
> 
> 3. For ERROR, always print stack trace.
> 
> Also there are some mis-use of swallow, which should only be used when "we do 
> not throw more exceptions but just log in with stack trace in whole", but not 
> "when we really do not care if it throw any exceptions".
> 
> -------------
> 
> Moving forward, I would like to suggest some exception handling manners:
> 
> 1. Only capture Throwable when 1) calling a library function whose throwable 
> exceptions are not defined clearly, or 2) if we REALLY want to "swallow" any 
> exceptions thrown. In the second case, we should not any exceptions thrown.
> 
> 2. For our own scala classes, specify possible checked kafka exceptions in 
> the comments as often as possible (for java classes do that in signatures). 
> And then whenever possible, case-enumerate all possible exceptions and just 
> log their error messages.
> 
> 3. Only re-throw the same/another exception after logging when the function 
> needs a return value and the thrown exception cause it to not able to return 
> anything. And in this case, throw a specific exception and capture it the 
> caller without further double-logging.
> 
> 
> Diffs
> -----
> 
>   core/src/main/scala/kafka/client/ClientUtils.scala 
> fc9e08423a4127e1d64be1e62def567ea9eb80a3 
>   core/src/main/scala/kafka/consumer/ConsumerFetcherManager.scala 
> b9e2bea7b442a19bcebd1b350d39541a8c9dd068 
>   core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala 
> 1dde4fcdd7004af798e9eac8dde289575e99fd11 
>   core/src/main/scala/kafka/controller/ControllerChannelManager.scala 
> c95c650cffbeed27e837e7c2d628f9026feb2c17 
>   core/src/main/scala/kafka/controller/KafkaController.scala 
> 933de9dd324c7086efe6aa610335ef370d9e9c12 
>   core/src/main/scala/kafka/network/SocketServer.scala 
> 4976d9c3a66bc965f5870a0736e21c7b32650bab 
>   core/src/main/scala/kafka/producer/Producer.scala 
> 4798481d573bbdce0ba39035c50f4c4411ad0469 
>   core/src/main/scala/kafka/producer/SyncProducer.scala 
> 489f0077512d9a69be81649c490274964290fa40 
>   core/src/main/scala/kafka/producer/async/DefaultEventHandler.scala 
> d8ac915de31a26d7aa67760d69373975cacd0c9d 
>   core/src/main/scala/kafka/producer/async/ProducerSendThread.scala 
> 42e9c741c2dcef756416832f11d37678cb7710ee 
>   core/src/main/scala/kafka/server/AbstractFetcherThread.scala 
> 3b15254f32252cf824d7a292889ac7662d73ada1 
>   core/src/main/scala/kafka/server/KafkaApis.scala 
> d96229e2d4aa7006b0dbd81055ce5a2459d8758c 
>   core/src/main/scala/kafka/utils/Utils.scala 
> 6bfbac16e2f8d68b8c711a0336c698aa6f610ae8 
> 
> Diff: https://reviews.apache.org/r/20240/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Guozhang Wang
> 
>

Re: Review Request 20240: Follow-up KAFKA-1352: Standardize stack trace printing in logs

Reply via email to