Chris,

> Let's use this email as an example.  :)  There is no indication that I'm
> using US-ASCII encoding or that I'm writing in English.

I think there actually is. If I am right, the SMTP RFCs require mail text to be 
US-ASCII. You can include 8-bit data only via MIME and/or escape characters. 
For example, Müller and Möller might cause problems in some mailers (but I 
guess my mail system will encode them as =<hexval>). Dropping messages with 
octets > 127 in the subject is a common spam protection setting...
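
To illustrate the =<hexval> escaping, here is a minimal sketch in Python using 
the standard quopri and email.header modules. The choice of ISO-8859-1 as the 
charset and the helper name has_8bit_octets are my assumptions for the example; 
a real mailer may pick a different charset or transfer encoding.

```python
import quopri
from email.header import Header

name = "Müller"

# Body text: quoted-printable turns each 8-bit octet into =<hexval>.
# Assumption: the text is ISO-8859-1, so "ü" is the single octet 0xFC.
body_qp = quopri.encodestring(name.encode("iso-8859-1")).decode("ascii")

# Header fields (e.g. Subject) use a MIME encoded-word that also names the charset.
subject_encoded = Header(name, "iso-8859-1").encode()


def has_8bit_octets(raw: bytes) -> bool:
    """Hypothetical filter check: True if any octet is greater than 127."""
    return any(octet > 127 for octet in raw)


print(body_qp)
print(subject_encoded)
print(has_8bit_octets(name.encode("iso-8859-1")))    # raw 8-bit subject
print(has_8bit_octets(subject_encoded.encode("ascii")))  # MIME-encoded subject
```

The encoded forms contain only 7-bit characters, so they pass a filter of that 
kind, while the raw 8-bit form does not.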

> However, you're able to receive this and read it.  Similarly, you could
> write an email in German and send it to me.  I would still be able to
> receive it but I'd have a difficult time parsing the meaning.
> 
> I'm suggesting that same approach for the transmission of the syslog
> content.  If I really wanted you to know what encoding and language I'm
> using in an email, I would specify a MIME header.  syslog senders will
> continue to pump out whatever encoding and language they've been using
> and receivers will continue to do their best to parse them.  If a vendor
> wants to get very specific about that, then they will have to use an
> SD-ID to identify the contents of the message.

Here I agree with you. What I was saying is that only IF the header says it is 
US-ASCII should we assume it actually is. If there is no "enc" SD-ID, then we 
do not know what it is but can assume ... whatever we assume. 
Let me phrase it this way:

If the message contains

[enc="us-ascii" lang="en"]

then the receiver can honestly expect it to be US-ASCII. But if it does not 
contain any "enc", the receiver does not know for sure and may assume anything 
it finds useful (may be ASCII, may not).
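
A minimal sketch of that receiver logic, in Python. The function names, the 
regular expression and the latin-1 fallback are assumptions made up for this 
illustration; nothing here is taken from the draft, and the fallback stands in 
for "whatever the receiver finds useful".

```python
import re
from typing import Optional, Tuple

# Matches an enc="..." parameter inside a structured-data element.
ENC_PARAM = re.compile(r'\benc="([^"]*)"')


def declared_encoding(structured_data: str) -> Optional[str]:
    """Return the declared encoding, or None if no "enc" is present."""
    match = ENC_PARAM.search(structured_data)
    return match.group(1).lower() if match else None


def decode_message(
    raw: bytes, structured_data: str, fallback: str = "latin-1"
) -> Tuple[str, bool]:
    """Decode raw message octets; the bool says whether the encoding was declared."""
    enc = declared_encoding(structured_data)
    if enc is not None:
        # The sender told us, so decode strictly. This raises UnicodeDecodeError
        # if the content does not match, or LookupError for an unknown name.
        return raw.decode(enc), True
    # Nothing declared: we do not know. Use a local policy (an assumption here).
    return raw.decode(fallback, errors="replace"), False


print(decode_message(b"hello", '[enc="us-ascii" lang="en"]'))
print(decode_message(b"M\xfcller", ""))
```

The point is only that "no enc" is a distinct state from "enc=us-ascii": the 
first call trusts the declaration, the second falls back to a guess and says so.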

Does this clarify things? I somehow have the impression that we mean the same 
thing and I am simply not managing to convey what I intend ;)

Rainer

> 
> Mit Aufrichtigkeit,
> Chris
> 
> 
> 
> 
> On Wed, 30 Nov 2005, Rainer Gerhards wrote:
> 
> > Andrew,
> >
> >>> Hi Rainer,
> >>>
> >>> Why don't we look at it from the other direction?  We could state that
> >>> any encoding is acceptable - for ease-of-use/migration with existing
> >>> syslog implementations.  It is RECOMMENDED that UTF-8 be used.  When it
> >>> is used, an SD-ID element will be REQUIRED.  e.g. -
> >>> [enc="utf-8" lang="en"]
> >>
> >> I like that idea too.
> >>
> >> So, if no SD-ID encoding element is specified, then we must assume
> >> US-ASCII and deal with it accordingly??
> >
> > I think not. If it is not present, we know that we do not know it. If
> > it is US-ASCII, I would expect something like
> >
> > [enc="us-ascii" lang="en"]
> >
> > Of course, we could also say if it is non-present, we can assume
> > US-ASCII. But then we would need to introduce
> >
> > [enc="unknown"]
> >
> > for the (common) case where we simply do not know it (again: think
> > POSIX). I find this somewhat confusing.
> >
> > Rainer
> >
> 

