Hi all,
I'm just updating my parsing code to work with the RFC 3164 format and have come across a few issues. I was wondering if anyone else has run into these or could offer a solution.

1). Solaris 8 sends messages in the following format:

<38>Apr 2 11:41:25 sshd[12345]: [ID 54321 auth.info] Accepted password

It is missing the hostname field in the header, but has a valid PID. What do we do here? Do we assume the host address is the socket peer address and insert it? Or do we assume the hostname is "sshd[12345]:" and the PID is "[ID", which is obviously not correct (to a human)? Do we just mark the packet as having an invalid header, treat everything after the priority field as message text, and not try to extract details from it? Or do we make an exception for Solaris for being "nearly" right and try to 'fix it' by looking for a Unix PID[1234] type sequence and inserting the socket peer address?

2). When parsing the hostname, assuming we allow for IPv6 addresses, how do we determine a valid host name or address? As I understand it, an IPv6 address looks like:

FEDC:BA98:7654:3210:FEDC:BA98:7654:3210 (39 characters)

How do I know it is a valid address and not some Unix type PID code ending in a colon? Are the [] brackets generally accepted as being the container for the PID? The overhead involved in parsing a message and ensuring it actually complies with the RFC and contains valid data is very high. Does anyone have a set of tests to ensure the data is 100% valid?

3). How do we decide which year the RFC date is referring to? We have to take the year into account on the collector in order to ensure the date is really valid. Feb 29 is only valid in leap years, for example. In my program the RFC date stamp is converted to a real date with a year so that it can be validated and used elsewhere in the program. At the moment I am using a "closest year" approach based on the current month and the month reported by the message. Has anyone else got any bright ideas?
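To make the three questions concrete, here is a rough Python sketch of one possible set of checks. It is not a definitive answer to any of them. Everything in it is an assumption made for illustration: the names classify_token, infer_year and parse_header are hypothetical, the tag and host-name patterns are loose guesses rather than anything RFC 3164 specifies, and falling back to the socket peer address is just one of the options listed in 1). The year is picked as the candidate date closest to the collector's current date, which is one way to read the "closest year" idea in 3).

```python
import ipaddress
import re
from datetime import date

MONTHS = {m: i for i, m in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

# Assumed shape of a Unix-style tag with a PID, e.g. "sshd[12345]:".
TAG_WITH_PID = re.compile(r"^[A-Za-z0-9_./-]+\[\d+\]:?$")
# Loose host-name shape: dot-separated labels of letters, digits, hyphens.
# Note that a bare program name such as "sshd" also fits this shape.
HOSTNAME = re.compile(r"^[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*$")
# PRI, "Mmm d hh:mm:ss", one candidate host token, then the rest.
HEADER = re.compile(
    r"^<(\d{1,3})>([A-Z][a-z]{2}) +(\d{1,2}) (\d{2}:\d{2}:\d{2}) (\S+) ?(.*)$")


def classify_token(token):
    """Guess what the token after the timestamp is."""
    if TAG_WITH_PID.match(token):
        return "tag"
    try:
        ipaddress.ip_address(token)  # accepts both IPv4 and IPv6 text forms
        return "address"
    except ValueError:
        pass
    if HOSTNAME.match(token):
        return "hostname"
    return "unknown"


def infer_year(month, day, today):
    """Return the year that puts month/day closest to today, or None."""
    best = None
    for year in (today.year - 1, today.year, today.year + 1):
        try:
            candidate = date(year, month, day)
        except ValueError:  # e.g. Feb 29 in a non-leap year
            continue
        if best is None or abs(candidate - today) < abs(best - today):
            best = candidate
    return best.year if best else None


def parse_header(line, peer_addr, today):
    """Split a message; use peer_addr when no plausible host is present."""
    m = HEADER.match(line)
    if m is None:
        return None
    pri, mon, day, clock, token, rest = m.groups()
    month = MONTHS.get(mon)
    year = infer_year(month, int(day), today) if month else None
    if year is None:
        return None
    if classify_token(token) in ("address", "hostname"):
        host, message = token, rest
    else:
        # Tag or unknown token: keep it as part of the message text.
        host, message = peer_addr, f"{token} {rest}".strip()
    return {"pri": int(pri), "year": year, "time": clock,
            "host": host, "message": message}
```

With the Solaris sample from 1) and a peer address of 10.0.0.5, parse_header reports the peer address as the host and leaves "sshd[12345]:" at the start of the message text. The ordering inside classify_token matters: the tag check runs first so that a token ending in "]:" is never considered as an address, and the IPv6 check is delegated to the standard ipaddress module rather than a hand-written pattern.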
Any feedback would be greatly appreciated.

Cheers,
Andrew
