On Thu, 12 Oct 2000, Chris Lonvick wrote:
> > The code set is seven-bit ASCII in an eight-bit field. These are
> > the ASCII codes as defined in "USA Standard Code for Information
> > Interchange" [2] only using codes 32 through 126.
> >
> >I strongly suggest not making the same mistake as was made in RFC 822
> >when email was standardized. I see no reason to limit syslog
> >messages to a seven-bit code set, but I do see a number of problems.
> >Why should Norwegian, French, Swedish and almost everyone but the
> >English-speaking peoples of the world not conform to the syslog
> >standard when we send syslog messages containing our languages'
> >characters?
>
> That's a good point. We are, however, writing the "observed"
> behaviour of the syslog protocol. Can you or anyone else provide
> examples of messages that have been observed in the wild that do
> contain characters as you describe? I'll be glad to re-qualify
> the character set if we can document that messages have been
> observed that do contain them.
Usually only kernels, daemons and system utilities log through
syslog. A large part of them use English, since it is a convenient
vehicle for this kind of message.
However, there are exceptions to this behaviour.
Potentially, every program with i18n (and/or l10n) support can log
messages in a different code set, and there are many such programs.
If we choose to support different character sets, we should also
consider adopting a Unicode encoding.
Simply using an ISO 8859-X based 8-bit character set can turn out to
be restrictive.
In fact, many features of many languages are not covered by the
ISO 8859-X standards (e.g. some diphthongs, ideographic characters,
ligatures, and so on).
Adopting Unicode also has drawbacks (i.e. difficulties for syslog
message interpreters in parsing the data).
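
As a rough illustration of this trade-off, here is a minimal Python
sketch. It is not taken from any syslog implementation, and the sample
word and the helper name try_encode are assumptions chosen only for the
demonstration. It shows a ligature that ISO 8859-1 cannot represent, and
how the same text becomes multi-byte in UTF-8, which is what a
byte-oriented message interpreter would have to cope with.

```python
# Illustrative only: the sample text is made up, not an observed log message.
sample = "c\u0153ur"  # "coeur" written with the ligature U+0153


def try_encode(text, codec):
    """Return the encoded bytes, or None if the codec cannot represent text."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        return None


latin1 = try_encode(sample, "iso-8859-1")  # the ligature is not in ISO 8859-1
utf8 = try_encode(sample, "utf-8")         # the ligature takes two bytes here

print("ISO 8859-1:", latin1)
print("UTF-8:", utf8, "-", len(sample), "characters,", len(utf8), "bytes")
```

The character count and the byte count differ under UTF-8, so a parser
that assumes one byte per character would miscount field widths.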
> Let's also be sensitive to the fact that there are syslog message
> interpreters out there. If we do find additional characters in the
> wild, can we confirm that they are acceptable to the better known
> packages? I'm thinking of swatch; are there others that we should
> consider? Could anyone on the list take a look through swatch and
> see if it has any constraints about the character set that it will
> accept? I really don't want to make any changes to the acceptable
> behaviour of syslog that will have bad effects on good programs
> like swatch.
The swatch configuration file uses 'watchfor regex' and 'ignore regex'
statements to determine what expression patterns to look for.
Since swatch is a Perl script, a regex can use the \nnn notation,
where nnn is a string of octal digits, to match the character
whose code is nnn (similarly, \xnn with hexadecimal digits).
So swatch should handle ISO 8859-X based 8-bit character sets.
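
A minimal sketch of the same idea, written in Python rather than Perl
purely for illustration (Python's re module accepts the same \nnn and
\xnn escapes in byte patterns). The log line and the patterns are made
up, and this is not swatch's own code.

```python
import re

# Hypothetical log line encoded as ISO 8859-1: U+00E9 becomes the byte 0xE9.
line = "login refus\u00e9 for root".encode("iso-8859-1")

# The same 8-bit character written with an octal and a hexadecimal escape.
octal_pat = re.compile(rb"refus\351")  # octal 351 == 0xE9
hex_pat = re.compile(rb"refus\xe9")

octal_hit = octal_pat.search(line) is not None
hex_hit = hex_pat.search(line) is not None

print("octal escape matched:", octal_hit)
print("hex escape matched:", hex_hit)
```

Both patterns match because the comparison is done on raw bytes, which
is why an 8-bit single-byte code set is easy for a regex-based
interpreter to handle.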
ciao
alfonso
--
Alfonso De Gregorio, [EMAIL PROTECTED]