Hi Andrew,

> Can anyone come up with a way that the initial message header
> can be read in any character set and still make sense?
I strongly think the header should NOT be changed. In my view, i18n should apply to the payload only. Besides that, US-ASCII is always present in foreign character sets. I know this for sure for European languages as well as Japanese, Korean and Chinese. With European alphabets, some characters may be reassigned, but these are seldom-used ones like "~" and "^". This should be no problem at all as far as the header is concerned.

> If the header contained the message encoding method then we
> could choose to encode it in any method we wanted. UUE,
> Base64, UTF-8 etc. Final syslog server that writes the data
> to file or displays messages would simply decode data into
> users desired character set. Relays could just pass the data
> unaltered.

Yes, I agree that this would solve the problem of the potential need (?) for different encodings. However, I would stay away from the header and tend to define a payload format that introduces a "payload header" which then

A) designates this as an i18n enabled message
B) specifies the charset
C) has another payload section which then includes the actual message

Of course, all of this needs to be backwards compatible. We can't risk breaking existing implementations...

> Bearing in mind that the syslog messages were originally
> intended to be read by an operator at some stage. It is no
> use if they remain encoded at the end point. :)

That is a beauty of UTF-7: it remains at least partly readable. The actual storage is not part of the protocol; I think it should be left to the syslogd implementation.

> Maybe after the <PRI> code, we use a couple of bytes to
> indicate the message encoding type. Better still, make it the
> first two bytes that determine the whole message encoding
> system. FF 01 = uue, FF 02=UTF-8 etc. Avoid using any 00
> bytes so as not to confuse C strings.

I strongly object to this - it would break all current RFCs/IDs ;)

> My 2c worth on TCP vs UDP.
> UDP is great for a last minute "I am dying" type message from
> a device. TCP would require ACK back and a connection setup
> if not already connected. If we go for something like BEEP
> with the extra handshaking we are in for a longer wait before
> sending the message.
>
> Question is, how many "I am dying" type messages do we get vs
> normal "user logged in" type messages? Probably not enough to
> warrant staying with UDP.

I posted a syslog-like protocol draft that details TCP transport. It also nicely shows a number of shortcomings this approach has. I'll dig out the URL and post it here for those interested.

In general, I (now ;)) think that RFC3195 is the right way... We just need to get the small libs up and running (I have to admit that my ambitions in this regard have been severely hit by other work). It is the right long-term solution. If you really think about a reliable protocol, you'll end up with mtr's point that if you do it properly, you end up with a protocol very similar to BEEP ;)

Anyhow, I wouldn't object to participating if there is some support for an interim solution or at least a documentation of the current state. As Andrew and others pointed out, there *are* a good number of existing syslog/tcp applications out there, and at least 4 of them (Pix, Kiwi, syslog-ng and WinSyslog) are interoperable.

Looking forward to comments!

Rainer
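
For illustration only, here is a rough sketch of how a receiver might handle the "payload header" idea (points A to C above). The `@i18n:<charset>@` marker, the function name `decode_payload` and the fallback rules are all made up for this example; nothing like this is defined in any RFC or draft. The syslog header itself is not touched, and a message without the marker is treated as a plain legacy message, which is the backwards-compatible case.

```python
import re

# Hypothetical payload header: "@i18n:<charset>@" at the start of the payload,
# followed by the actual message bytes. Marker and layout are assumptions
# made for this sketch only.
_I18N_HEADER = re.compile(rb"^@i18n:([A-Za-z0-9._-]+)@ ?")


def decode_payload(payload: bytes) -> str:
    """Decode the payload part of a syslog message (everything after the header)."""
    match = _I18N_HEADER.match(payload)
    if match is None:
        # (A) no marker: legacy message, treat as US-ASCII as before
        return payload.decode("ascii", errors="replace")

    charset = match.group(1).decode("ascii")   # (B) the declared charset
    body = payload[match.end():]               # (C) the actual message
    try:
        return body.decode(charset, errors="replace")
    except LookupError:
        # Charset unknown to this receiver: fall back to ASCII with replacements
        return body.decode("ascii", errors="replace")
```

A relay would not need any of this; it could pass the payload through unaltered, and only the final syslogd would call something like `decode_payload`.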
