Re: [Syslog] #5 - character encoding (was: Consensus?)

Tom Petch Thu, 01 Dec 2005 03:11:25 -0800

Rainer

I think I detect an approach I do not agree with, in this and perhaps other
issues.


You seem to be saying that the (eg POSIX) syslogd must emit perfect syslog
messages and is responsible for anything that is wrong with them no matter what
it received from the application (I exaggerate slightly).

I would say that if the application passes incomprehensible garbage, something
criminal or illegal, then it is the application that is at fault; syslogd can
only be held responsible if it produces messages that are invalid for the parts
over which it has control, eg header syntax.

So if syslogd has no idea what the transfer encoding is because the rest of the
system does not tell it, then syslogd cannot be held responsible for the absence
of a field saying what the transfer encoding actually is.  Or put differently,
if our RFC specifies what the application MUST or SHOULD do, as well as what
syslogd must do, then that is ok with me.
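
To illustrate the receiver's side of this, here is a minimal Python sketch.  It
assumes the [enc="..." lang="..."] element proposed further down this thread;
the regular expression, the function name decode_msg and the lenient fallback
are illustrative assumptions, not anything the draft defines.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern for the element proposed in this thread, e.g.
# [enc="utf-8" lang="en"]; the final syntax is still under discussion.
ENC_ELEMENT = re.compile(rb'\[enc="([^"]*)"(?: lang="([^"]*)")?\]')


def lenient(raw: bytes) -> str:
    """Decode without losing octets: non-ASCII bytes become \\xNN escapes."""
    return raw.decode("ascii", errors="backslashreplace")


def decode_msg(raw: bytes) -> Tuple[str, Optional[str]]:
    """Return (text, encoding); encoding is None when it is not known."""
    match = ENC_ELEMENT.search(raw)
    if match is None:
        # No indicator from the sender: the receiver knows that it does
        # not know, so it reports the encoding as unknown.
        return lenient(raw), None
    enc = match.group(1).decode("ascii", errors="replace")
    body = raw[match.end():].lstrip(b" ")
    try:
        return body.decode(enc, errors="replace"), enc
    except LookupError:
        # The sender named an encoding this receiver cannot handle;
        # treat the message the same way as one with no indicator.
        return lenient(raw), None
```

With this sketch, a message that carries no "enc" element is passed through
with its encoding reported as unknown, rather than being silently treated as
US-ASCII.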

What syslogd would be held responsible for, IMO, is allowing characters that
have a special meaning in the syntax (eg NUL as end of message) to appear
unescaped (or not otherwise encoded).  Whether we have such problems depends on
the resolution of other issues; I am not saying that we have them at present.
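
As a rough illustration of what that responsibility might look like, here is a
minimal Python sketch.  The \xNN escape convention, the function name
escape_msg and the choice of NUL as the only special octet are assumptions
made for the example; which octets are special depends on the issues still
open.

```python
BACKSLASH = 0x5C


def escape_msg(raw: bytes, special: bytes = b"\x00") -> bytes:
    """Escape octets that would otherwise be read as message syntax.

    The backslash itself is escaped too, so the result stays unambiguous.
    """
    out = bytearray()
    for octet in raw:
        if octet in special or octet == BACKSLASH:
            out += b"\\x%02x" % octet   # e.g. NUL becomes the text \x00
        else:
            out.append(octet)           # everything else passes unchanged
    return bytes(out)
```

Whatever the application passes in, the output then contains no raw NUL, while
the rest of the application's content is left exactly as received.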

Tom Petch

----- Original Message -----
From: "Rainer Gerhards" <[EMAIL PROTECTED]>
To: "Chris Lonvick" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Wednesday, November 30, 2005 2:48 PM
Subject: RE: [Syslog] #5 - character encoding (was: Consensus?)


Chris,

I fully agree - thanks ;)

Rainer

> -----Original Message-----
> From: Chris Lonvick [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, November 30, 2005 2:39 PM
> To: Rainer Gerhards
> Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
> Subject: RE: [Syslog] #5 - character encoding (was: Consensus?)
>
> Hi Rainer,
>
> I believe that we are saying the same thing.  :)
>
> If there is no indicator of encoding or language then a receiver will not
> know what it is receiving - just like receivers don't know what they are
> receiving today.  They MAY make an assumption that it is something in
> US-ASCII (but may be disappointed).
>
> If there is an indicator of the encoding and language then the receiver
> will know exactly what it is.  Having an indicator should be RECOMMENDED
> but not REQUIRED for ease of migration.
>
> Is that what we're all saying?
>
> Thanks,
> Chris
>
>
>
> On Wed, 30 Nov 2005, Rainer Gerhards wrote:
>
> > Chris,
> >
> >> Let's use this email as an example.  :)  There is no indication that I'm
> >> using US-ASCII encoding or that I'm writing in English.
> >
> > I think there actually is. If I am right, the SMTP RFCs require mail
> > text to be US-ASCII. Only via MIME and/or escape characters can you
> > include 8-bit data. For example Müller and Möller might create some
> > problems in some mailers (but I guess my mail system will encode them
> > with =<hexval>). Dropping messages with octets > 127 in the subject is
> > a common spam protection setting...
> >
> >> However, you're able to receive this and read it.  Similarly, you could
> >> write an email in German and send it to me.  I would still be able to
> >> receive it but I'd have a difficult time parsing the meaning.
> >>
> >> I'm suggesting that same approach for the transmission of the syslog
> >> content.  If I really wanted you to know what encoding and language I'm
> >> using in an email, I would specify a MIME header.  syslog senders will
> >> continue to pump out whatever encoding and language they've been using
> >> and receivers will continue to do their best to parse them.  If a vendor
> >> wants to get very specific about that, then they will have to use an
> >> SD-ID to identify the contents of the message.
> >
> > Here I agree with you. What I was saying is that IF the header says it
> > is US-ASCII, only then should we assume it actually is. If there is no
> > "enc" SD-ID, then we do not know what it is but can assume ... whatever
> > we assume. Let me phrase it that way:
> >
> > If the message contains
> >
> > [enc="us-ascii" lang="en"]
> >
> > then the receiver can honestly expect it to be US-ASCII. But if it does
> > not contain any "enc", the receiver does not know exactly and may assume
> > anything it finds useful (may be ASCII, may not).
> >
> > Does this clarify? I somehow have the impression we mean the same thing
> > and I simply do not manage to convey what I intend to ;)
> >
> > Rainer
> >
> >>
> >> Mit Aufrichtigkeit,
> >> Chris
> >>
> >>
> >>
> >>
> >> On Wed, 30 Nov 2005, Rainer Gerhards wrote:
> >>
> >>> Andrew,
> >>>
> >>>>> Hi Rainer,
> >>>>>
> >>>>> Why don't we look at it from the other direction?  We could state
> >>>>> that any encoding is acceptable - for ease-of-use/migration with
> >>>>> existing syslog implementations.  It is RECOMMENDED that UTF-8 be
> >>>>> used.  When it is used, an SD-ID element will be REQUIRED.  e.g. -
> >>>>> [enc="utf-8" lang="en"]
> >>>>
> >>>> I like that idea too.
> >>>>
> >>>> So, if no SD-ID encoding element is specified, then we must assume
> >>>> US-ASCII and deal with it accordingly??
> >>>
> >>> I think not. If it is not present, we know that we do not know it. If
> >>> it is US-ASCII, I would expect something like
> >>>
> >>> [enc="us-ascii" lang="en"]
> >>>
> >>> Of course, we could also say that if it is not present, we can assume
> >>> US-ASCII. But then we would need to introduce
> >>>
> >>> [enc="unknown"]
> >>>
> >>> for the (common) case where we simply do not know it (again: think
> >>> POSIX). I find this somewhat confusing.
> >>>
> >>> Rainer
> >>>
> >>
> >
>

_______________________________________________
Syslog mailing list
[email protected]
https://www1.ietf.org/mailman/listinfo/syslog

