RE: [Syslog] New direction and proposed charter

2005-11-22 Thread Rainer Gerhards
WG,

I have completed the promised testing. I used various syslogds on Linux,
BSD and Windows platforms. The list obviously is not complete, but I
think I got a fair enough sample of what is deployed.

The good news is that by putting the PRI part in front of the message
all of them were able to put the otherwise syslog-protocol-15 formatted
message into the right bins. The format recorded was also acceptable,
the non-recognized hostname was replaced by the sender address and the
local date was added. This is within the typical current user experience
when different syslogds are being used. Please note that some (e.g. BSD
syslogd) never pull the hostname from the message but always use the
sender's address.

The only culprit that I came across is associated with NUL octets. Many
syslogds cannot handle them. The message is only recorded up to the
first NUL inside the message; no further interpretation happens.
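
The truncation described above is what one would expect from code that
treats the received datagram as a NUL-terminated C string. A minimal
Python sketch that uses ctypes to mimic C string semantics (the sample
payload is made up for illustration and was not part of the testing):

```python
import ctypes

# Hypothetical datagram payload with an embedded NUL octet.
payload = b"<34>su: login failed\x00 for user root"

# create_string_buffer keeps every byte, like a raw receive buffer.
buf = ctypes.create_string_buffer(payload)

# .value reads the buffer as a C string, so it stops at the first NUL.
as_c_string = buf.value

# .raw returns the whole buffer; slice off the terminator ctypes appends.
all_bytes = buf.raw[:len(payload)]
```

Here as_c_string holds only the part before the NUL, while all_bytes
still contains the complete payload.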

We have had discussion on this topic previously. As a reminder, it was
said that excluding the NUL would be a crappy little rule that could
open the door for many more CLRs. I still tend to think this is true and
the problem exposed is acceptable.

I try to sum up yesterday's discussion and my current position on it:

Once again, I think David's comments on the charter are in the right
direction
(http://www.mail-archive.com/syslog%40lists.ietf.org/msg00143.html). It
calls for some compatibility but puts emphasis on newer development. I
suggest we accept the wording.

We have had several comments on the field order in syslog-protocol.
Based on them, I propose the following format:

<PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID [SD-ID]s MSG

With (optional) SD-IDs for
- Extended Facility
- TRUNCATE 
- MSGID
- [Language Identification]
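
To make the proposed field order concrete, here is a rough Python sketch
of a receiver-side parser. It is only an illustration under several
assumptions that the proposal does not fix: PRI is wrapped in angle
brackets as in RFC 3164, fields are separated by single spaces, SD
elements are written in square brackets without escaping, and the
function name and the sample message are hypothetical.

```python
import re

# Assumed header layout: <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID
HEADER_RE = re.compile(
    r"<(?P<pri>\d{1,3})>(?P<version>\d+) "
    r"(?P<timestamp>\S+) (?P<hostname>\S+) "
    r"(?P<app_name>\S+) (?P<procid>\S+) ?"
)
# Naive SD element matcher: no support for an escaped ']' inside values.
SD_RE = re.compile(r"\[([^\]]*)\]")


def parse_message(line):
    """Split one message in the proposed field order into a dict."""
    header = HEADER_RE.match(line)
    if header is None:
        raise ValueError("header does not match the proposed format")
    fields = header.groupdict()
    pri = int(fields.pop("pri"))
    # PRI = facility * 8 + severity, as in RFC 3164.
    fields["facility"], fields["severity"] = pri >> 3, pri & 7

    # Consume zero or more bracketed SD elements; the rest is MSG.
    pos = header.end()
    fields["sd"] = []
    while True:
        element = SD_RE.match(line, pos)
        if element is None:
            break
        fields["sd"].append(element.group(1))
        pos = element.end()
    fields["msg"] = line[pos:].lstrip(" ")
    return fields


example = parse_message(
    '<165>1 2005-11-22T10:00:00Z host.domain.com MyApp 1234 '
    '[msgid id="STARTED_UP"] The application has started'
)
```

With this input, example["sd"] holds the single MSGID element and
example["msg"] holds the free text after it.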

Please note that I moved MSGID to the optional SD-ID part and instead
re-introduced APP-NAME and PROCID as formal fields. I have done this
because these two were designed as a replacement for TAG, whereas MSGID
was meant to be something totally different. If we prefer to have only
one field, I recommend naming it TAG and using the same semantics that
RFC 3164 describes.

I support Chris's proposal to leave the current size limitations exactly
as they are in syslog-protocol. Everything else would cause the n-th
re-iteration of the message size issue (side note: exactly the same
discussion seems to have started on the NETCONF WG for event
notifications right now, so this is an ever-popular topic).

I have included an Extended Facility to provide finer-grain facility
control within the message. This was brought up by Anton and others and
does not seem to hurt anything if in SD-ID.

I have included the Language Identification. I don't object to it, but I
question its usefulness.

If we follow the message definition above, we can probably recycle most of
the text in syslog-protocol, just shuffle it around a little. This has
two advantages:

- we do not lose what we have already discussed
- the work can progress rather quickly

Also, syslog-transport-udp most probably needs no change at all, or at
most a minor one.

Modifying existing syslog receivers should not be very hard with the new
definitions. The only major issue I see from the implementor's point of
view is UTF-8 decoding. But that is more of a storage problem. It is of
course possible to receive the message and store it as UTF-8. I do not
think this would cause non-compliance to the spec - or would it? If it
would, UTF-8 would most probably be a major drawback when it comes to
implementor acceptance. I am also a bit concerned about the NUL
character, which can only be handled in one of two ways with existing
syslog code base:

- implement a byte-counted string class and do not use the C RTL
- replace NUL with an escape sequence upon reception (e.g. 00)
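
The second option can be sketched in a few lines of Python. The escape
token is an assumption (both 00 and \0 are under discussion), and the
function name is made up for illustration. Python bytes objects are
already length-counted, so they also stand in for the first option: the
embedded NUL does not shorten the data.

```python
def escape_nul(data: bytes, token: bytes = b"\\0") -> bytes:
    """Replace every NUL octet with an escape token upon reception.

    Not reversible as written: an input that already contains the
    token looks the same as an escaped NUL afterwards.
    """
    return data.replace(b"\x00", token)


received = b"login failed\x00 for user root"
stored = escape_nul(received)
```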

I guess most implementations would take the latter route. If we consider
this to be acceptable, the majority of syslogds should be fairly easy to
upgrade.

If we can agree on these points, I would volunteer to implement the
resulting document in C code so that we might see if there were any
hidden problems. I would try to apply as minimal changes as possible.

Suggestion for progressing:

- we need more comments from other list members!
- we should re-charter
- we should reach rough consensus on the new format
- once done, I can update syslog-protocol to it
- I (and others?) can do test implementations in
  parallel to the review
- discussion can show if (hopefully minor) adjustments
  need to be made
- the goal should still be to finish this work
  (including AD approval) by the next IETF meeting

Rainer

 -Original Message-
 From: Chris Lonvick [mailto:[EMAIL PROTECTED] 
 Sent: Monday, November 21, 2005 9:58 PM
 To: Rainer Gerhards
 Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
 Subject: RE: [Syslog] New direction and proposed charter
 
 Hi Rainer and all,
 
 On Mon, 21 Nov 2005, Rainer Gerhards wrote:
 
  Chris  WG
 
 
  From the meeting, it sounds like 

RE: [Syslog] New direction and proposed charter

2005-11-22 Thread Anton Okmianski (aokmians)
Darren:

  WG,
  <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID [SD-ID]s MSG
 
 I would put the SD-IDs after the message.
 
 The SD-IDs and detailed bits of meaning to the MSG and 
 without the MSG, are irrelevant.  The exception being a 
 language marker.

I would prefer SD-ID where it is in the example.  I would also reiterate the
suggestion of having MSGID in the header, which a number of people supported.
Those two combined are arguably more important than the MSG part itself.  For
example (in abbreviated syntax):

host.domain.com MyApp proc1234 STARTED_UP [ip=1.1.1.1]: The application has
started

I could live without the MSG here if it got truncated. The MSGID and SD-ID are 
much more important in this case.  BTW, the possible truncation of text and its 
variability (possible substitutable variables) is another reason why MSGID is 
so useful. It makes it easier for intelligent receivers to do things like event 
correlation. 
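
As a small illustration of that point, the Python sketch below groups
events by MSGID and, for comparison, by free text. The event list is
invented sample data, not output from any real syslogd.

```python
from collections import Counter

# Hypothetical parsed events: (MSGID, possibly truncated or variable text).
events = [
    ("STARTED_UP", "The application has started"),
    ("STARTED_UP", "The applic"),  # truncated in transit
    ("LOGIN_FAILED", "Login failed for user alice"),
    ("LOGIN_FAILED", "Login failed for user bob"),
]

# Grouping by MSGID is stable under truncation and variable substitution.
by_msgid = Counter(msgid for msgid, _ in events)

# Grouping by text treats every variant as a different event.
by_text = Counter(text for _, text in events)
```

Grouping by MSGID yields two event types with two occurrences each,
whereas grouping by text yields four unrelated entries.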

Thanks,
Anton. 

 
  - replace NUL with an escape sequence upon reception (e.g. 00)
 
 Why not \0 ?
 
 Darren
 
 ___
 Syslog mailing list
 Syslog@lists.ietf.org
 https://www1.ietf.org/mailman/listinfo/syslog
 



RE: [Syslog] New direction and proposed charter

2005-11-22 Thread Rainer Gerhards
  If we go for framing, we must use byte-counting, because we have not
  ruled out any sequence. If we go for octet-stuffing, we must define an
  escape mechanism. Any of this would be helpful for plain tcp syslog, but
  that is definitely a big departure from current syslog. Please note that
  currently many syslogds do octet-stuffing and the message TRAILER is LF.
 
 That's unfortunate :(

I agree, but that's the way it is in current (non-standard)
implementations.

 
 In nearly all IETF protocols, the message trailer or EOL 
 marker is CR-LF.

If we go for a very simplistic tcp transport, there is nothing that
prevents us from changing it to CR-LF. That would also be compatible with
existing receivers, as LF thankfully comes after CR...
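
A rough Python sketch of why this stays compatible: a receiver that
splits the stream on LF still finds every message boundary when the
sender uses CR-LF, and only needs to drop the trailing CR. The function
name is hypothetical, and the sketch assumes that LF never occurs inside
a message (no octet-stuffing is shown).

```python
def split_frames(buffer: bytes):
    """Split a TCP receive buffer into complete messages.

    Returns (messages, remainder), where remainder is the unfinished
    tail that has not yet been terminated by LF.
    """
    *complete, remainder = buffer.split(b"\n")
    # Tolerate CR-LF senders by stripping one trailing CR per message.
    messages = [m[:-1] if m.endswith(b"\r") else m for m in complete]
    return messages, remainder


messages, remainder = split_frames(b"<13>first\r\n<13>second\n<13>par")
```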

Rainer



[Syslog] Message format

2005-11-22 Thread Andrew Ross

WG,

Sorry for joining in the discussion late. I've only just found some time to
reply.

My thoughts below...

The new format looks great.

<PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID [SD-ID]s MSG

Replace all received null characters with either 00 or /0. My preference
is 00.

Keep MSGID in the header as a required field

SD-IDs should come before the MSG. Otherwise encoding issues and MSG
delimiter will become a problem.

Store all messages written to disk in UTF-8 format. This allows any received
encoding to be stored safely without loss or corruption.

My preference is to enforce UTF-8 for data encoding on the wire. This allows
US-ASCII to be used for the first 128 code points and Unicode mappings into
UTF-8 for all other international characters. Trying to switch encodings for
each message based on the SD-ID language or local setting will be a parsing
nightmare. As far as I know, all modern systems are now capable of sending
in US-ASCII or mapping their own language into UTF-8. Can anyone think of a
good reason not to enforce UTF-8?
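
The compatibility argument can be checked quickly in Python: pure
US-ASCII text has the same bytes in UTF-8, and other characters become
multi-byte sequences that decode back without loss. The sample strings
are made up for illustration.

```python
ascii_msg = "login failed"
# For pure US-ASCII text, the ASCII and UTF-8 encodings are identical.
same_bytes = ascii_msg.encode("ascii") == ascii_msg.encode("utf-8")

intl_msg = "Anmeldung für Müller fehlgeschlagen"
# Each "ü" takes two bytes in UTF-8; everything else takes one.
encoded = intl_msg.encode("utf-8")
round_trip = encoded.decode("utf-8")
```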

I believe the above format would be easy to implement in both a sender and
receiver. Mandating that the disk storage format is UTF-8 would also help
reporting and parsing of all languages and character sets. 

Mapping over UDP should be limited to a single message per packet.

When mapping over plain TCP I believe we should limit the total message size
to 65507 bytes (to keep it compatible with UDP) and delimit each message
stream with an LF, or CRLF. Either delimiter would work for me.
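
For reference, the 65507 figure is the largest UDP payload that fits in
one IPv4 datagram when the IP header carries no options. A short Python
sketch of the arithmetic and the resulting size check (the constant and
function names are illustrative only):

```python
MAX_IP_DATAGRAM = 65535  # limit of the 16-bit IPv4 total-length field
IPV4_HEADER = 20         # minimum IPv4 header, no options
UDP_HEADER = 8

MAX_UDP_PAYLOAD = MAX_IP_DATAGRAM - IPV4_HEADER - UDP_HEADER


def fits_in_one_datagram(message: bytes) -> bool:
    """True if the message respects the proposed common size limit."""
    return len(message) <= MAX_UDP_PAYLOAD
```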

Rainer, keep up your good work and persistence on the drafts. I believe the
new format will solve a lot of problems.

Cheers

Andrew






Re: [Syslog] New direction and proposed charter

2005-11-22 Thread Darren Reed
   WG,
   <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID [SD-ID]s MSG
  
  I would put the SD-IDs after the message.
 
 This raises the question of what terminates the MSG part ;)

Using the above syntax, how do you distinguish [] at the start
of the message from actual SD-ID data?

I think what's missing from the above, is a ':' and the syntax should
be:

<PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID [SD-ID]: MSG

The protocol document needs to outlaw ':' being in any field before
the MSG.
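
The ambiguity and the effect of the ':' marker can be sketched in Python.
The sketch only looks at the tail of a message after PROCID, assumes that
SD elements are bracketed without escaping, and relies on the rule above
that ':' never occurs before the MSG. Function names and sample strings
are hypothetical.

```python
import re

SD_RE = re.compile(r"\[[^\]]*\]")


def split_without_marker(tail):
    """Treat every leading [..] group as SD data; the rest is MSG."""
    sd, pos = [], 0
    while True:
        element = SD_RE.match(tail, pos)
        if element is None:
            break
        sd.append(element.group(0))
        pos = element.end()
    return sd, tail[pos:].lstrip(" ")


def split_with_colon(tail):
    """Everything before the first ':' is SD data; the rest is MSG."""
    sd_part, marker, msg = tail.partition(":")
    if not marker:
        raise ValueError("no ':' marker found")
    return SD_RE.findall(sd_part), msg.lstrip(" ")


# The sender meant "[debug] starting" to be the MSG in both cases.
ambiguous = split_without_marker('[ip="1.1.1.1"][debug] starting')
marked = split_with_colon('[ip="1.1.1.1"]: [debug] starting')
```

Without the marker, "[debug]" is swallowed as a second SD element; with
the marker, it stays part of the MSG.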

If you mark VERSION, PROCID and SD-ID data as all being optional
then the format comes back to being very close to what's in use today.

 That would
 mean we would need to introduce byte-counting, at least I think so.

Well, without the ':' to say where the MSG starts, I'd have argued
"How do you tell where SD-ID ends and MSG starts?" vs there just being
a string of bad SD-ID data following some good SD-ID data.

As for "but the SD has important information and the MSG does not",
that's simply a matter of how you structure the message.

   - replace NUL with an escape sequence upon reception (e.g. 00)
  
  Why not \0 ?
 
 That's another good choice.

It's also how data gets escaped, in general, in Internet stuff.

 That was my main message. Is it better to live
 with that or introduce a CLR on not allowing NUL?

I'd like to see NUL outlawed from messages.

Darren
