Chris:

I think having an SD-ID with [enc="utf-8" lang="English"] may be a good
approach. If different languages use UTF-8 encoding, then "lang=" can
distinguish them.

I also want to clarify: are you suggesting that if the message is in
ASCII, an SD-ID will not be required, but for all other encodings an
SD-ID will be required?
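
To make the question concrete, here is a minimal sketch of that rule in
Python. The function names are hypothetical, and the element layout is
simply the bracket notation used in this thread, not a settled
syslog-protocol syntax.

```python
import re

# Matches name="value" pairs such as enc="utf-8" (notation from this thread).
_PARAM = re.compile(r'(\w+)="([^"]*)"')


def build_encoding_element(text, enc, lang):
    """Return an [enc=... lang=...] element, or "" when the text is pure ASCII.

    Assumption: ASCII messages need no element; every other message does.
    """
    if text.isascii():
        return ""
    return f'[enc="{enc}" lang="{lang}"]'


def parse_encoding_element(element):
    """Return the name/value pairs found in an element as a dict."""
    return dict(_PARAM.findall(element))
```

A receiver that finds no such element would then treat the message as
ASCII (or as an unknown legacy encoding), and one that finds it can read
"enc" and "lang" independently.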

Note that most other encoding methods already imply the language used.
For example, Chinese has several encoding methods: Traditional Chinese,
used in Taiwan and Hong Kong, is Big5, and Simplified Chinese, used in
Mainland China, is GBK. So if the message is in Traditional Chinese
characters, it will be shown as [enc="Big5", lang="Traditional
Chinese"], which is a little redundant. Big5 also includes all English
characters, so a message can be a mix of Chinese and English.
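
As a small illustration of the mixed case, the following Python sketch
encodes one message containing both English and Traditional Chinese text
with the standard "big5" codec. The sample string and the helper name
can_encode are my own, chosen only for the example.

```python
def can_encode(text, codec):
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# English and Traditional Chinese in the same message.
mixed = "syslog 訊息"
big5_bytes = mixed.encode("big5")

# The ASCII part keeps its one-byte form; each Chinese character takes two bytes.
ascii_prefix = big5_bytes[:7]
round_trip = big5_bytes.decode("big5")
```

So "enc" alone already tells the receiver how to decode such a message,
while "lang" cannot say which parts are English and which are Chinese.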



Regards,
 
Sheran

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Chris Lonvick
(clonvick)
Sent: Tuesday, November 29, 2005 10:22 AM
To: Rainer Gerhards
Cc: [EMAIL PROTECTED]
Subject: RE: [Syslog] #5 - character encoding (was: Consensus?)

Hi Rainer,

Why don't we look at it from the other direction?  We could state that
any encoding is acceptable - for ease-of-use/migration with existing
syslog implementations.  It is RECOMMENDED that UTF-8 be used.  When it
is used, an SD-ID element will be REQUIRED.  e.g. - [enc="utf-8"
lang="en"]

Thoughts?

All:  Let's discuss this and close this issue.

Thanks,
Chris

On Tue, 29 Nov 2005, Rainer Gerhards wrote:

> Chris & WG,
>
>>> #5 Character encoding in MSG: due to my proof-of-concept
>>>   implementation, I have raised the (ugly) question if we need
>>>   to allow encodings other than UTF-8. Please note that this
>>>   question arises from needs introduced by e.g. POSIX. So we
>>>   can't easily argue them away by wishful thinking ;)
>>>
>>> Not even discussed yet.
>>
>> I haven't reviewed that yet.  However, I'll note that allowing 
>> different encoding can be accomplished in the future as long as we 
>> establish a default encoding and a way to identify it in our current 
>> work.
>
> I have read a little in the mailing archive. Please note that in 2000
> there was consensus that the MSG part may contain encodings other than
> US-ASCII. Follow this thread:
>
> http://www.syslog.cc/ietf/autoarc/msg00127.html
>
> This discussion led to RFC 3164 saying "other encodings MAY be used".
> While this was observed behaviour, we still need to be aware that the
> POSIX (and glibc) API places the restriction on us that we simply do
> not know the character encoding used by the application. As such, no
> *nix syslogd can be programmed to be compliant with syslog-protocol if
> we demand UTF-8 exclusively.
>
> I propose that we RECOMMEND UTF-8, which MUST start with the Unicode
> Byte Order Mark (BOM) if used. If the MSG part does not start with the
> BOM, it may be any encoding just as in RFC 3164. I do not see any
> alternative to this.
>
> Rainer
>
> _______________________________________________
> Syslog mailing list
> [email protected]
> https://www1.ietf.org/mailman/listinfo/syslog
>
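
For comparison, the BOM-based proposal quoted above can be sketched in a
few lines of Python. This is only an illustration under the assumption
that the receiver gets the MSG part as raw bytes; the function name
split_bom is hypothetical.

```python
import codecs
from typing import Tuple


def split_bom(msg: bytes) -> Tuple[bool, bytes]:
    """Return (is_utf8, payload) for the raw MSG part.

    A leading UTF-8 BOM marks the payload as UTF-8 and is stripped.
    Without it the encoding is unknown, as in RFC 3164, and the bytes
    are returned unchanged.
    """
    if msg.startswith(codecs.BOM_UTF8):
        return True, msg[len(codecs.BOM_UTF8):]
    return False, msg
```

With this approach the receiver needs no extra element to recognize
UTF-8, but it learns nothing about the encoding of any other message.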

