Hi all,

I am taking up Chris' call for syslog internationalization.

If we look at languages with a huge character set (e.g. Japanese,
Chinese and Korean), we obviously need to encode these characters with
more than a single byte (octet, to be precise). Depending on the
encoding, between 2 and 5 bytes are needed for a single character.
Obviously, we have some issues with the encoding, as this is not
printable in a US ANSI sense. But let's postpone that discussion. I
would just like to look at the message *size*.

It is fair to say that we need at least twice as many bytes as with US
ANSI. Thus, the usable message size drops dramatically, to around 500
characters. If we look at the actual encoding, it can get even worse:
one approach (though probably not a clever one, I have not really
thought this out yet) could be to use base64 encoding on 8-bit character
streams. That would fit nicely with "traditional" and current RFC
interop. BUT with base64, messages get even longer, and as such the
usable "character message size" (payload) would probably be reduced to
a point where it is simply unusable (read: too short to do something
useful).
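
A rough sketch of this arithmetic in Python. It assumes a 1024-octet
packet limit (as in RFC 3164) and ignores the octets taken by the syslog
header, so the real numbers would be somewhat lower. The names LIMIT and
max_chars are made up for this illustration.

```python
import base64

LIMIT = 1024  # assumed packet limit in octets (RFC 3164); header ignored


def max_chars(bytes_per_char, limit=LIMIT, use_base64=False):
    """Upper bound on the characters that fit into `limit` octets."""
    # base64 emits 4 output octets for every 3 input octets,
    # so only 3/4 of the budget is left for the raw encoded text.
    raw_budget = (limit // 4) * 3 if use_base64 else limit
    return raw_budget // bytes_per_char


print(max_chars(1))                   # single-octet text: 1024
print(max_chars(2))                   # 2 octets per character: 512
print(max_chars(2, use_base64=True))  # same, base64-wrapped: 384
print(max_chars(3, use_base64=True))  # 3 octets per character: 256

# A concrete case: three Japanese characters in UTF-8.
raw = "日本語".encode("utf-8")
print(len(raw), len(base64.b64encode(raw)))  # 9 octets raw, 12 in base64
```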

Thus it would make sense to allow for larger syslog messages, BUT

- we run into interop issues with existing syslog implementations / RFCs
- this raises UDP fragmentation concerns which can mess up the whole
syslog message

I have no good answer on how this could be solved.

The only idea I have in mind is that we introduce a new, larger max
size that the message MUST NOT exceed and recommend that the size SHOULD
be less than the current limit. Then, we could suggest that an
implementor SHOULD add a user-configurable option to allow large syslog
packets - and probably also an option NOT to disregard "oversize"
packets (according to the existing RFCs/IDs).
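
One possible reading of this on the receiving side, as a minimal Python
sketch. The value of EXTENDED_MAX, the allow_oversize option and the
function name are hypothetical; nothing here is taken from an existing
RFC or implementation, and LEGACY_MAX assumes the 1024-octet limit.

```python
LEGACY_MAX = 1024    # assumed current limit in octets
EXTENDED_MAX = 4096  # hypothetical new hard limit (MUST NOT exceed)


def accept_packet(packet: bytes, allow_oversize: bool) -> bool:
    """Decide whether a receiver keeps an incoming syslog packet."""
    if len(packet) > EXTENDED_MAX:
        return False           # beyond the new hard limit: always dropped
    if len(packet) > LEGACY_MAX:
        return allow_oversize  # "oversize": kept only if the user opted in
    return True                # within the current limit: always accepted


print(accept_packet(b"x" * 2000, allow_oversize=False))  # False
print(accept_packet(b"x" * 2000, allow_oversize=True))   # True
```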

In this scenario, a syslog client SHOULD also be configurable to emit
messages with the current max length - this would most probably mean
message truncation. An alternative to truncation would be to introduce
some fragmentation at the syslog level, but I do not "feel" really good
about this.
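
If a client does truncate, it should at least not cut a multi-byte
character in half. A small Python sketch of that, assuming UTF-8 as the
encoding (which has not been decided here); truncate_utf8 is a made-up
helper name.

```python
def truncate_utf8(message: str, max_octets: int) -> bytes:
    """Encode as UTF-8 and cut to max_octets on a character boundary."""
    data = message.encode("utf-8")
    if len(data) <= max_octets:
        return data
    # A plain slice may end inside a multi-byte sequence. Decoding with
    # errors="ignore" drops that incomplete tail, so re-encoding yields
    # only whole characters.
    return data[:max_octets].decode("utf-8", errors="ignore").encode("utf-8")


# "日本語" is 9 octets in UTF-8; a budget of 7 keeps only two characters.
print(truncate_utf8("日本語", 7).decode("utf-8"))  # 日本
```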

I think the length restriction on syslog messages - and how to solve it
- severely impacts the possible message encoding and other parts of
"international syslog". As such, I think it makes sense to sort out
these things before proceeding into any payload issues.

Comments are highly appreciated. Hopefully I am overlooking something
obvious.

Rainer Gerhards

