Chris,

> Let's use this email as an example. :) There is no indication that I'm
> using US-ASCII encoding or that I'm writing in English.

I think there actually is. If I am right, the SMTP RFCs require mail text to be US-ASCII. Only via MIME and/or escape characters can you include 8-bit data. For example, Müller and Möller might create problems in some mailers (but I guess my mail system will encode them as =<hexval>). Dropping messages with octets > 127 in the subject is a common spam protection setting...

> However, you're able to receive this and read it. Similarly, you could write
> an email in German and send it to me. I would still be able to receive it
> but I'd have a difficult time parsing the meaning.
>
> I'm suggesting that same approach for the transmission of the syslog
> content. If I really wanted you to know what encoding and language I'm
> using in an email, I would specify a mime header. syslog senders will
> continue to pump out whatever encoding and language they've been using
> and receivers will continue to do their best to parse them. If a vendor
> wants to get very specific about that, then they will have to use an SD-ID
> to identify the contents of the message.

Here I agree with you. What I was saying is that only IF the header says it is US-ASCII should we assume it actually is. If there is no "enc" SD-ID, then we do not know what it is but can assume ... whatever we assume.

Let me phrase it this way: If the message contains

[enc="us-ascii" lang="en"]

then the receiver can honestly expect it to be US-ASCII. But if it does not contain any "enc", the receiver does not know exactly and can assume anything it finds useful (may be ASCII, may not).

Does this clarify? I somehow have the impression we mean the same thing and I simply do not manage to convey what I intend to ;)

Rainer

>
> Mit Aufrichtigkeit,
> Chris
>
>
> On Wed, 30 Nov 2005, Rainer Gerhards wrote:
>
> > Andrew,
> >
> >>> Hi Rainer,
> >>>
> >>> Why don't we look at it from the other direction?
> >>> We could state that any encoding is acceptable - for ease-of-use/migration
> >>> with existing syslog implementations. It is RECOMMENDED that UTF-8 be used.
> >>> When it is used, an SD-ID element will be REQUIRED. e.g. -
> >>> [enc="utf-8" lang="en"]
> >>
> >> I like that idea too.
> >>
> >> So, if no SD-ID encoding element is specified, then we must assume
> >> US-ASCII and deal with it accordingly??
> >
> > I think not. If it is not present, we know that we do not know it. If
> > it is US-ASCII, I would expect something like
> >
> > [enc="us-ascii" lang="en"]
> >
> > Of course, we could also say if it is non-present, we can assume
> > US-ASCII. But then we would need to introduce
> >
> > [enc="unknown"]
> >
> > for the (common) case where we simply do not know it (again: think
> > POSIX). I find this somewhat confusing.
> >
> > Rainer
> >
_______________________________________________
Syslog mailing list
[email protected]
https://www1.ietf.org/mailman/listinfo/syslog
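
For illustration only, the receiver-side rule argued for in this message (trust "enc" when it is present, otherwise treat the encoding as unknown and apply a local policy) could be sketched roughly as below. This is a sketch under assumptions, not anything agreed on the list: the function names `declared_encoding` and `decode_content` are hypothetical, the element syntax is just the `[enc="..." lang="..."]` form used as an example in this thread, and the UTF-8-then-Latin-1 fallback is one arbitrary choice a receiver might make.

```python
import re

# Assumption: elements look like the examples in this thread,
# e.g. [enc="us-ascii" lang="en"]. This is not a full syslog parser.
_ELEMENT = re.compile(r'\[([^\]]*)\]')
_PARAM = re.compile(r'(\w+)="([^"]*)"')


def declared_encoding(message):
    """Return the declared "enc" value in lowercase, or None if absent.

    None means "we know that we do not know it"; it is deliberately
    not mapped to US-ASCII or to a literal "unknown" value.
    """
    for element in _ELEMENT.finditer(message):
        params = dict(_PARAM.findall(element.group(1)))
        if "enc" in params:
            return params["enc"].lower()
    return None


def decode_content(raw, enc):
    """Decode message bytes according to the declared encoding, if any."""
    if enc is not None:
        # The sender made a claim, so hold it to that claim (strict decode).
        # An unrecognised encoding name raises LookupError.
        return raw.decode(enc)
    # No claim was made: the receiver assumes whatever it finds useful.
    # Hypothetical policy: try UTF-8, then fall back to Latin-1,
    # which accepts every byte value.
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("latin-1")
```

With this sketch, a message carrying `[enc="us-ascii" lang="en"]` followed by an octet > 127 fails loudly with `UnicodeDecodeError`, while the same octets in a message without any "enc" are decoded on a best-effort basis.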
