Re: Proposal to add message IDs to vdsm.log

Anthony Liguori Thu, 06 Oct 2011 10:55:35 -0700

On 10/06/2011 12:18 PM, Keith Robertson wrote:
> On 10/06/2011 12:14 PM, Saggi Mizrahi wrote:
>> On Thu 06 Oct 2011 05:22:02 PM IST, Keith Robertson wrote:
>>> Message IDs do require some discipline on the part of the developer
>>> to ensure uniqueness. They also force the developer to think about
>>> whether or not the text that is being emitted is actually worthy of
>>> [INFO|ERROR|WARN|EXCEPTION|CRITICAL] instead of the the lowly DEBUG.
>>>
>>> As to how it is done, I have seen this done a few different ways but
>>> the most popular seems to be to assign a range of numbers to a
>>> particular component/package. In this manner, a developer can quickly
>>> scan that package and choose the next higher number. Some projects,
>>> like my last one, were fairly monolithic in design and we just used
>>> an up-counter for the whole source tree (a little harder but not
>>> impossible with grep and sort).
>>>
>>> All solutions generally involve a central document/registry where the
>>> IDs are listed and *can* be documented (generally the documentation
>>> evolves over time). For VDSM, I would probably recommend a simple
>>> up-counter. Take a look at how JBoss is breaking up their product.
>>> http://community.jboss.org/wiki/LoggingIds
>>>
>>> As for gettext, message IDs don't really affect gettext. gettext is a
>>> vehicle that is used to enable i18n. Message IDs are typically coded
>>> as symbols directly in the source while gettext handles the
>>> internationalization of all things that are not substitution
>>> variables (you don't want to translate them). I've attached the
>>> makefile that I wrote for the rhevm-iso-uploader to show the flow for
>>> using gettext in a project.
>>>
>>> Here is an example of a logging statement that would work with
>>> gettext and a message ID:
>>> // Assume the log format is this...
>>> FORMAT = '%(asctime)s %(lineno)s %(levelname)s(%(messageid)s):
>>> %(message)s'
>>> // An invocation
>>> logger.warning(_("A log message with a substitution string, %s, that
>>> doesn't get translated.") % "woweee", extra={ 'messageid' : 'VDSM1000'})
>>> // The result
>>> 2011-10-06 11:13:09,458 14 WARNING(VDSM1000): A log message with a
>>> substitution string, woweee, that doesn't get translated.
>>>
>>> Things to notice:
>>> 1. The substitution variable "woweee" doesn't get translated or
>>> picked up by gettext because it isn't surrounded by _(...) which is
>>> what you want.
>>> 2. The message ID (i.e. VDSM1000) doesn't get translated either for
>>> the same reason.
>>>
>>> Cheers,
>>> Keith
>>>
>>>
>>>
>>>
>>>
>>>
>>> On 10/06/2011 10:02 AM, Saggi Mizrahi wrote:
>>>> On Wed 05 Oct 2011 08:24:13 PM IST, Keith Robertson wrote:
>>>>> Hi,
>>>>>
>>>>> I would like to propose enhancing the VDSM source to emit message
>>>>> IDs in
>>>>> log messages with levels greater than DEBUG (i.e. INFO, ERROR, WARN,
>>>>> EXCEPTION, and CRITICAL). Messages IDs have been used successfully in
>>>>> many products and they can aid in debugging, documentation, root cause
>>>>> analysis, and internationalization.
>>>>>
>>>>> I have scanned the source and have determined that there are
>>>>> roughly 525
>>>>> calls to log.[error|warn|info|critical|exception] and I am
>>>>> volunteering
>>>>> to go through and add IDs to them all but, before I do I want to get
>>>>> buy-in from the project maintainers. While adding IDs to the
>>>>> messages I
>>>>> would also enable them for i18n.
>>>>>
>>>>> I have written a rough plan for adding messages IDs to VDSM below and
>>>>> I've provided some examples of products/projects that use message IDs.
>>>>>
>>>>> Examples of Projects that Use Message IDs:
>>>>> - JBoss 7 is adopting Message IDs for all of its logging. See the
>>>>> annotations (e.g. @Message(id = 12000, ) in the following piece of
>>>>> source:
>>>>> https://github.com/jbossas/jboss-as/blob/master/process-controller/src/main/java/org/jboss/as/process/ProcessLogger.java
>>>>>
>>>>>
>>>>> - Almost all IBM products use message IDs and they do a fairly good
>>>>> job
>>>>> of documenting them. Google has indexed them all and you can simply
>>>>> type
>>>>> them in to a search and get a description of the problem that is
>>>>> associated with the ID.
>>>>> http://publib.boulder.ibm.com/infocenter/tivihelp/v42r1/topic/com.ibm.omegamon.stor.doc/GC23-9705-0484.htm
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Example of a Log Statement with a Message ID in Python:
>>>>> FORMAT = '%(asctime)s %(lineno)s %(levelname)s(%(messageid)s):
>>>>> %(message)s'
>>>>> logging.basicConfig(format=FORMAT)
>>>>> logger = logging.getLogger()
>>>>> logger.warning('NFS problem. Unable to mount.', extra={ 'messageid' :
>>>>> 'VDSM1000'})
>>>>>
>>>>> // Produces
>>>>> 2011-10-05 13:24:43,180 11 WARNING(VDSM1000): NFS problem. Unable
>>>>> to mount.
>>>>>
>>>>>
>>>>> Example of a Log Statement with a Message ID in Python with i18n:
>>>>> logger.warning(_('NFS problem. Unable to mount.'), extra={
>>>>> 'messageid' :
>>>>> 'VDSM1000'})
>>>>>
>>>>> Example of an i18n Log Message:
>>>>> Chinese: 2011年10月5日13:24:43,18011警告（VDSM1000）：NFS问题。无法
>>>>> 安 装。
>>>>> English: 2011-10-05 13:24:43,180 11 WARNING(VDSM1000): NFS problem.
>>>>> Unable to mount.
>>>>>
>>>>>
>>>>>
>>>>> A Rough Plan for adding Message IDs to VDSM:
>>>>>
>>>>> Step 0: Decide on a logging format of the ID itself:
>>>>> - JBoss is using "JBAS#####". I am thinking that VDSM could
>>>>> probably do
>>>>> something similar "VDSM####".
>>>>>
>>>>> Step 1: Decide where to list and document all of the IDs.
>>>>> - It could be as simple as a page on VDSM's fedorahosted wiki.
>>>>>
>>>>> Step 2: Pin the format of the log messages:
>>>>> - Currently VDSM allows a user configurable Python log format. We
>>>>> would
>>>>> need to pin the format so that it includes a substitution variable for
>>>>> the message ID.
>>>>> Example: '%(asctime)s %(lineno)s %(levelname)s(%(messageid)s):
>>>>> %(message)s'
>>>>>
>>>>> Step 4: Plow through the VDSM source and add message IDs.
>>>>>
>>>>>
>>>>> Cheers,
>>>>> Keith
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> vdsm-devel mailing list
>>>>> vdsm-devel@lists.fedorahosted.org
>>>>> https://fedorahosted.org/mailman/listinfo/vdsm-devel
>>>>
>>>> Seems like a nice idea. But how does it affect developers. How do
>>>> they choose mIDs? How do you prevent collisions? How are they
>>>> different from simple gettext?
>>>
>>
>> As I never used this myself so I can't really imagine the all benefits
>> of using such a system, at least enough to warrant all the tedious
>> work. If you could just list out some of the use cases it'll be most
>> helpful.
>>
>> For instance I assume grepping for an ID is easier then grepping for a
>> message especially in multiple versions where the message might have
>> changed.
>
> Sure.  As far as the tedious part, yes, adding message IDs to an
> existing source tree is a bit of a PITA but the 525 invocations in VDSM
> isn't all that bad.  I've seen much worse.  Plus, once it's done the
> effort is much smaller (new lines of code only).
>
> As for the value (I've tried to itemize a few common reasons for doing it):
>
> Value: Message IDs allow you to use gettext globalize your messages. If
> you just used gettext and pulled the strings out of your code for
> globalization then you might end up with something like this in vdsm.log...
>      错误：问题安装驱动器
> the English translation (according to google) is:
>      ERROR: Problem mounting drive
> Unless you can read Chinese, then the message above is completely
> unhelpful to a support engineer.  The converse is true also.  Consider a
> non-english speaking person trying to decipher "ERROR: Problem mounting
> drive".  With a message ID; however, you get this...
>      错误（VDSM1000）：安装驱动问题


Is it possible to just hash the main string before internationalization?

So basically:

fprintf(f, "[%s] ERROR(VDSM%04x) %s", date_str, crc32(message), message);

Or the equivalent Python.

> Value: Message IDs allow other applications to perform actions based on
> what is in your logs.  On the mainframe, this is extremely common.
> You'll have one program watching the logs for a particular ID(s).  When
> the program sees the ID it will look in a dictionary and decide what (if
> anything) needs to be done.  On linux, logwatch, is frequently used for
> this purpose and message IDs make this process much simpler because the
> program doesn't have to write complicated REGEXs to look for arbitrary
> strings it can just scan for an ID.
>
> Value: Message IDs generally have the same meaning across releases of
> the product.  This means that the developer can change the text of the
> log message without worrying about breaking someone's automation.  Example:
>      Version 1 of the product- ERROR(VDSM1000): Problem mounting drive.
>      Version 2 of the product- ERROR(VDSM1000): unable to contact
> 127.0.0.1 for mounting. Try blah blah.

Does this mean that logging messages are an ABI that needs to be maintained 
over 
time?  That's probably very difficult to do right.

Regards,

Anthony Liguori

>
> Value: If you do it right search engines will index your message IDs and
> take users to a page where they can hopefully get more detail on the
> issue and potentially "self solve" the problem without calling support.
> Put this into a search engine... "KDFAP0020S"
>
> Value: Message IDs allow you to easily describe flows.  You might be
> able to say... "if you saw ID123 then look farther back for ID124 and if
> you see ID124 then the problem is _probably_ XYZ."
>
> I am probably not doing enough justice to the value of message IDs but,
> I will say that lots of very large companies use them in complex
> software applications to solve these and other issues.
>
> Cheers,
> Keith
>
>
>
>
>
>
>
>
>
>
>
>
> _______________________________________________
> vdsm-devel mailing list
> vdsm-devel@lists.fedorahosted.org
> https://fedorahosted.org/mailman/listinfo/vdsm-devel

_______________________________________________
vdsm-devel mailing list
vdsm-devel@lists.fedorahosted.org
https://fedorahosted.org/mailman/listinfo/vdsm-devel

Re: Proposal to add message IDs to vdsm.log

Reply via email to