On Thu, 12 Oct 2000, Chris Lonvick wrote:

> >  The code set is seven-bit ASCII in an eight-bit field.  These are
> >  the ASCII codes as defined in "USA Standard Code for Information
> >  Interchange" [2] only using codes 32 through 126.
> >
> >I strongly suggest not making the same mistake that was made in RFC 822
> >when email was standardized.  I see no reason to limit the syslog
> >messages to a seven bit code set, but I do see a number of problems.
> >Why should Norwegian, French, Swedish and almost everyone but the
> >English speaking peoples in the world not conform to the syslog
> >standard when we send syslog messages containing our languages'
> >characters?
> 
> That's a good point.  We are, however, writing the "observed" 
> behaviour of the syslog protocol.  Can you or anyone else provide 
> examples of messages that have been observed in the wild that do 
> contain characters as you describe?  I'll be glad to re-qualify
> the character set if we can document that messages have been
> observed that do contain them.

Usually only kernels, daemons and system utilities log through
syslog. Most of them use English, since it is well suited to this
kind of message.
However, there are exceptions to this behaviour.
Potentially, any program with i18n (and/or l10n) support can log
messages in a different code set, and there are many such programs.

If we choose to support different character sets, we should also
consider adopting a Unicode encoding.
Simply using an ISO 8859-X based 8 bit character set can turn out to
be restrictive.
In fact, many features of many languages are not covered by the
ISO 8859-X standards (e.g. some diphthongs, ideographic characters,
ligatures, and so on).

Adopting Unicode also has drawbacks (e.g. syslog message interpreters
may have difficulty parsing the data).
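
To make both points concrete, here is a rough sketch in Python (my
choice for illustration only; nothing in syslog or swatch uses it).
The sample strings and the helper name fits_latin1 are made up. It
shows that ISO 8859-1 cannot represent an ideographic string at all,
and that the same accented word has a different byte length in
ISO 8859-1 and in UTF-8, which is the kind of thing a byte-oriented
interpreter would have to cope with.

```python
# Hypothetical log text; not taken from any real syslog message.
text = "café"

latin1 = text.encode("iso-8859-1")  # one byte per character
utf8 = text.encode("utf-8")         # "é" is encoded as two bytes


def fits_latin1(s):
    """Return True if every character of s exists in ISO 8859-1."""
    try:
        s.encode("iso-8859-1")
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    print(len(latin1), len(utf8))
    print(fits_latin1(text), fits_latin1("日本語"))
```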

> Let's also be sensitive to the fact that there are syslog message 
> interpreters out there.  If we do find additional characters in the
> wild, can we confirm that they are acceptable to the better known
> packages?  I'm thinking of swatch; are there others that we should
> consider?  Could anyone on the list take a look through swatch and
> see if it has any constraints about the character set that it will
> accept?  I really don't want to make any changes to the acceptable
> behaviour of syslog that will have bad effects on good programs
> like swatch.

The swatch configuration file uses 'watchfor regex' and 'ignore regex'
statements to determine what types of expression patterns to look for.
Since swatch is a Perl script, a regex can use the \nnn notation,
where nnn is a string of octal digits, to match the character whose
code is nnn (similarly, \xnn with hexadecimal digits).

So swatch should handle ISO 8859-X based 8 bit character sets.
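
As an illustration of the escape notation, here is a small sketch. It
uses Python's re module rather than Perl or swatch itself, on the
assumption that the \nnn and \xnn escapes behave the same way for
single bytes; I have not run this against swatch. The log line and
the helper name matches are invented for the example.

```python
import re

# Hypothetical message; 0xE9 is "é" in ISO 8859-1.
line = b"su: authentication failed for user andr\xe9"

octal_pat = re.compile(rb"andr\351")  # octal escape for byte 0xE9
hex_pat = re.compile(rb"andr\xe9")    # hexadecimal escape for the same byte


def matches(pattern, data):
    """Return True if the compiled byte pattern occurs anywhere in data."""
    return pattern.search(data) is not None


if __name__ == "__main__":
    print(matches(octal_pat, line), matches(hex_pat, line))
```

Note that the same pattern would not match if the message had been
sent as UTF-8, because "é" is then two bytes (0xC3 0xA9) instead of
one.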

ciao
alfonso

--
Alfonso De Gregorio,            [EMAIL PROTECTED]
