> Note that this only adds the parsing, the rest of the current behaviour
> of stays the same. I have another diff in the pipeline for allowing the
> hostname in the message.
I object to this process.

You want to add parsing code as a fait accompli. With no justification.
Then later on, spring on us the code that uses it. What if we disagree
with the code that uses it? Will you delete this parsing code which
nothing uses?

> - Timestamp: is easy to interpret, since it's a strict format.
>   No changes here.

I believe "timestamp missing" is not strictly permitted. But this was
common for a while, and in OpenBSD it is the default message format.
This is due to the desire to make syslog_r(3) signal/re-entrant safe
on top of sendsyslog(2). Then there is no good way of creating a
timestamp string in the sender libc context. A timestamp is easily
re-inserted by the receiving syslogd.

> +	for (i = 0; i < HOST_NAME_MAX; i++) {

Unlike MAXHOSTNAMELEN/NI_MAXHOST, HOST_NAME_MAX storage does not include
a NUL. You might have the loop right. Be careful.

> +	 * fqdn: [[:alnum:]-.]

That is not totally correct. Hostnames very often also contain '_' in
the middle positions. Early RFCs said no, but around 1990-ish Vixie in
particular had to face reality. I was involved in that conversation, it
seems so long ago.

Your pattern is also incorrect in other ways, such as ".." is not legal,
hostnames cannot start or end with '-', etc. The current accepted rules
are encoded in the undocumented libc function __res_hnok in
lib/libc/net/res_comp.c

I don't know if false identification of broken hostnames is bad or not.
I guess it depends what will happen with this information later on [ie.
the part of your proposal that is being kept a secret]. Will an
incorrect parse become dangerous? I don't know. Because you are keeping
a secret.

I also don't know what happens when I use a program called foo.com,
which will look like a hostname obviously.

There is absolutely zero strictness for messages programs can send --
any program can send ANY GARBAGE it wants -- so I'm confused as to the
purpose of this proposed layer.

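To make the HOST_NAME_MAX and character-class points above concrete, here
is a rough sketch. It is not taken from the diff and it is not the
__res_hnok code; it only approximates the rules described above
(alphanumerics at the edges of each label, '-' and '_' only in the middle,
no empty labels). The names copy_hostname, is_border and is_middle are
made up for illustration, and the HOST_NAME_MAX fallback is an assumption
for systems that do not define it.

```c
#include <limits.h>
#include <stddef.h>

#ifndef HOST_NAME_MAX
#define HOST_NAME_MAX 255	/* assumed fallback, same as _POSIX_HOST_NAME_MAX */
#endif

/* Letters and digits are allowed anywhere in a label. */
static int
is_border(int c)
{
	return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
	    (c >= '0' && c <= '9');
}

/* '-' and '_' are tolerated only in the middle of a label. */
static int
is_middle(int c)
{
	return is_border(c) || c == '-' || c == '_';
}

/*
 * Hypothetical helper: copy the leading space-delimited token of msg
 * into dst if it looks like a hostname.  dst needs HOST_NAME_MAX + 1
 * bytes because HOST_NAME_MAX does not count the terminating NUL.
 * Returns the token length, or -1 (dst is then unspecified).
 */
int
copy_hostname(char dst[HOST_NAME_MAX + 1], const char *msg)
{
	size_t i, len = 0;

	while (msg[len] != '\0' && msg[len] != ' ') {
		if (len == HOST_NAME_MAX)
			return -1;	/* too long to be a hostname */
		len++;
	}
	if (len == 0)
		return -1;

	for (i = 0; i < len; i++) {
		int c = (unsigned char)msg[i];
		int first = (i == 0 || msg[i - 1] == '.');
		int last = (i == len - 1 || msg[i + 1] == '.');

		if (c == '.') {
			/* no leading dot, no empty label such as ".." */
			if (first)
				return -1;
		} else if (first || last) {
			if (!is_border(c))
				return -1;
		} else if (!is_middle(c))
			return -1;
		dst[i] = (char)c;
	}
	dst[len] = '\0';
	return (int)len;
}
```

Even with these stricter rules, a message whose first token is foo.com is
accepted, so syntax alone cannot tell a hostname from a program name.
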
So I am confused why there is an attempt to parse messages that are
indistinguishable from garbage.

I will also bring up log4j. People thought the software only did
logging. As in receive messages, and store or redirect them.

Similarly in syslogd, we expect this layer to do the ABSOLUTE MINIMUM,
because this daemon code is always running in a critical position and we
cannot accept it having a hole. syslogd was intentionally simple.

Why does syslogd now need to parse messages, when it hasn't parsed
messages since its introduction in 4.3BSD?

But you didn't tell us your proposal, you kept it a secret.

Your diff is not OK.