WG,

Sorry for joining in the discussion late. I've only just found some time to
reply.

My thoughts below...

The new format looks great.

<PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID [SD-ID]s MSG

Replace all received null characters with either <00> or /0. My preference
is <00>.
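
As a rough illustration of the <00> option, here is a minimal Python sketch. The function name escape_nuls and the choice to work on raw bytes are my own assumptions for illustration, not anything taken from the drafts.

```python
def escape_nuls(raw: bytes, marker: bytes = b"<00>") -> bytes:
    """Replace every NUL byte in a received message with a visible marker.

    The default marker is the preferred "<00>" form; the function name
    and signature are hypothetical.
    """
    return raw.replace(b"\x00", marker)


received = b"login failed\x00user=root"
print(escape_nuls(received))
```

A receiver would apply this before any further parsing or storage, so that later stages never see an embedded NUL.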

Keep MSGID in the header as a required field.

SD-IDs should come before the MSG. Otherwise, encoding issues and the MSG
delimiter will become a problem.
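
To show why having the SD-IDs ahead of the MSG keeps parsing simple, here is a hedged Python sketch of a receiver for the format above. It assumes single-space field separators, structured data elements wrapped in square brackets with no escaped "]" inside them, and that everything after the last element is MSG. The names HEADER_RE, SD_ELEMENT_RE and parse are hypothetical, and none of these details are settled by the drafts.

```python
import re

# Assumed layout: <PRI>VERSION then five space-separated header fields.
HEADER_RE = re.compile(
    r"<(?P<pri>\d{1,3})>(?P<version>\d+) "
    r"(?P<timestamp>\S+) (?P<hostname>\S+) (?P<app_name>\S+) "
    r"(?P<procid>\S+) (?P<msgid>\S+) "
)
# Assumption: an SD element is "[...]" with no escaped "]" inside.
SD_ELEMENT_RE = re.compile(r"\[[^\]]*\]")


def parse(line: str) -> dict:
    m = HEADER_RE.match(line)
    if m is None:
        raise ValueError("malformed header")
    fields = m.groupdict()
    pos = m.end()
    elements = []
    # Consume consecutive SD elements; the first non-match ends the list.
    while True:
        sd = SD_ELEMENT_RE.match(line, pos)
        if sd is None:
            break
        elements.append(sd.group(0))
        pos = sd.end()
    fields["sd"] = elements
    # Skip one separating space if present; the rest is free-form MSG.
    fields["msg"] = line[pos + 1:] if line[pos:pos + 1] == " " else line[pos:]
    return fields


sample = ('<34>1 2005-11-28T10:00:00Z host1 app 123 ID47 '
          '[origin@1 ip="10.0.0.1"] disk [almost] full')
print(parse(sample))
```

Because the parser stops at the first thing that is not an SD element, the MSG can contain brackets or any other bytes without confusing it. One open point in this sketch: with no SD elements at all, a MSG that itself starts with "[" would be misread, so some placeholder for "no structured data" would likely be needed.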

Store all messages written to disk in UTF-8 format. This allows any received
encoding to be stored safely without loss or corruption.

My preference is to enforce UTF-8 for data encoding on the wire. The 128
US-ASCII characters (code points 0 through 127) encode to the same single
bytes in UTF-8, and all other international characters map from Unicode
into multi-byte UTF-8 sequences. Trying to switch encodings for each message
based on the SD-ID language or locale setting will be a parsing nightmare.
As far as I know, all modern systems are now capable of sending in US-ASCII
or mapping their own language into UTF-8. Can anyone think of a good reason
not to enforce UTF-8?
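
A small Python sketch of the point about UTF-8 on the wire: pure US-ASCUI text is not changed by the encoding, while other characters become multi-byte sequences that decode back without loss. The helper names to_wire and from_wire are hypothetical.

```python
def to_wire(text: str) -> bytes:
    """Encode a message for transmission, always as UTF-8."""
    return text.encode("utf-8")


def from_wire(data: bytes) -> str:
    """Decode a received message; raises UnicodeDecodeError if not UTF-8."""
    return data.decode("utf-8")


ascii_msg = "disk full on /dev/sda1"
intl_msg = "Festplatte voll: größer als erwartet"

# US-ASCII text produces identical bytes under both encodings.
assert to_wire(ascii_msg) == ascii_msg.encode("ascii")
# Non-ASCII text round-trips through UTF-8 unchanged.
assert from_wire(to_wire(intl_msg)) == intl_msg
# A non-ASCII character such as "ö" takes more than one byte.
assert len(to_wire("ö")) == 2
```

With a single mandated encoding the receiver needs only one decode step, rather than a per-message lookup of which encoding applies.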

I believe the above format would be easy to implement in both a sender and
receiver. Mandating that the disk storage format is UTF-8 would also help
reporting and parsing of all languages and character sets. 

Mapping over UDP should be limited to a single message per packet.

When mapping over plain TCP, I believe we should limit the total message size
to 65507 bytes (the largest payload a UDP datagram can carry over IPv4, which
keeps it compatible with UDP) and delimit each message in the stream with an
LF or CRLF. Either delimiter would work for me.

Rainer, keep up your good work and persistence on the drafts. I believe the
new format will solve a lot of problems.

Cheers

Andrew




_______________________________________________
Syslog mailing list
Syslog@lists.ietf.org
https://www1.ietf.org/mailman/listinfo/syslog
