OK, enough bickering about the fine points of how timestamps
should go over the wire. Here's a specific straw-man
proposal for a fully cooked protocol. The basic idea is to
use the XML-style structure of unalog, but with the more
compact name=value representation of ULM.
Data elements are taken pretty much straight from ULM, with
some extensions. Required data elements are:
* LVL=0..99 - ULM uses specific strings, but those are not
really levels so much as a combination of level and
facility. This is a pure level, with mappings to the
existing syslog and SNMP levels spelled out in the
spec.
* HOST=string - unique identifier for the host creating
the log message
* PROG=string - name of the program generating the
message. Probably need some standard "special" names
like "kern". Can also add the facility concept here by
doing things like MAIL/MTA/sendmail,
MAIL/ACCESS/popper, KERN/VM, KERN/FS, though we need to
define some of the base "facilities".
* DATE=YYYYMMDDhhmmss.fraction (unlike ULM, this must be in
UTC).
Optional data elements are:
* TYPE=string - event type such as AUTH.FAIL,
AUTH.SUCCESS, PROG.FAIL, PROG.START, etc. -
extends/subsumes ULM's STAT
* TZ=[+-]hhmm - extension to ULM to show timezone of
originating process
* LANG=<ISO 639 2-letter language code>
* DUR=seconds.decimal - duration of event in seconds
* SRC.ADDR=(IPv4 dotted notation, IPv6 16HEX, others?) -
ULM uses .IP, which seems restrictive to IP-only.
* SRC.FQDN=string
* SRC.NAME=string - some identifier of the source system
other than address or FQDN.
* SRC.USR=string (user name or similar)
* SRC.MAIL=string (e-mail address)
* DST.ADDR, DST.FQDN, DST.NAME, DST.USR, DST.MAIL - dest
instead of source
* REL.ADDR, REL.FQDN, REL.NAME, REL.USR, REL.MAIL -
relay/proxy instead of source/dest
* VOL, VOL.SENT, VOL.RCVD, CNT, CNT.SENT, CNT.RCVD -
volume in bytes and count (articles, files, events,
etc.)
* PROG.FILE, PROG.LINE - name (and line number) of the
program source file from which the message was
generated (useful for messages like "out of memory",
"can't fork", "assert failed", etc.).
* TTY=string - tty or other description of user's
physical connection to the host
* DOC=string - name of an accessed document, such as an
FTP file, a newsgroup, or a URL.
* PROT=string - protocol used such as ESMTP, SSH2, etc.
* CMD=string - an issued command
* MSG=string - free form message text
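To make the element list concrete, here is a rough Python sketch of parsing a flat name=value record and checking the four required elements. The quoting rule (double quotes around values with spaces) is an assumption taken from the MSG="..." usage in the examples, the DATE layout assumes the 14-digit form seen there (e.g. 19991025105522), and the names parse_pairs, validate, and parse_date are made up for illustration.

```python
import re
from datetime import datetime, timezone

# A value is either a double-quoted string or a run of non-space characters.
# This quoting rule is an assumption; the proposal does not define one.
PAIR_RE = re.compile(r'([A-Z][A-Z0-9.]*)=("[^"]*"|\S+)')

REQUIRED = ("LVL", "HOST", "PROG", "DATE")


def parse_pairs(text):
    """Return a dict of the name=value pairs found in text."""
    fields = {}
    for name, value in PAIR_RE.findall(text):
        if len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1]  # strip the surrounding quotes
        fields[name] = value
    return fields


def validate(fields):
    """Return a list of problems; an empty list means the record looks OK."""
    problems = [f"missing {name}" for name in REQUIRED if name not in fields]
    lvl = fields.get("LVL")
    if lvl is not None:
        if not (lvl.isascii() and lvl.isdigit() and 0 <= int(lvl) <= 99):
            problems.append("LVL must be an integer 0..99")
    return problems


def parse_date(value):
    """Parse DATE as YYYYMMDDhhmmss[.fraction], interpreted as UTC."""
    whole, _, frac = value.partition(".")
    stamp = datetime.strptime(whole, "%Y%m%d%H%M%S")
    stamp = stamp.replace(tzinfo=timezone.utc)
    if frac:
        # Keep at most microsecond precision from the fraction.
        stamp = stamp.replace(microsecond=int(frac[:6].ljust(6, "0")))
    return stamp
```

A receiver could call validate() on each fully expanded record and reject or flag anything that comes back non-empty.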
Structure:
Since we're talking variables here, it would be nice to be
able to provide context in the message stream. This is done
by putting end-messages in <M name=value ...> and then
wrapping contexts within <CNTXT name=value ...>name=value,
...</CNTXT>:
<CNTXT HOST=myhost.somewhere.com>
  <M LVL=22 PROG=sendmail DATE=19991025105522
     FAC=MAIL TYPE=PROG.START DOC=/var/log/whatever
     MSG="processing queue">
  <CNTXT LVL=80 PROG=tripwire DATE=19991025105519
         FAC=AUDIT TYPE=AUTH.FAIL MSG="Expected mode 0444,
         saw mode 0644">
    <M DOC=/a/b/c/d>
    <M DOC=/a/b/c/e>
    <M DOC=/a/b/c/f>
    <M DOC=/a/b/c/g>
  </CNTXT>
</CNTXT>
As you can see, adding this bit of syntactic sugar can
decrease the overall byte count quite a bit and makes up for
the extra bytes needed to support the flexibility of this
scheme. Additionally, by assuming that message streams not
beginning with < are <M>, we can whittle down the number of
bytes needed to xmit simple messages over UDP (though
obviously still much higher than existing syslog). The
reference implementation can also be made to process
existing syslog format and convert to this format.
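The "no leading < means <M>" shortcut is easy to handle on the receiving side. A minimal Python sketch, where the function name normalize is hypothetical and a datagram is assumed to carry exactly one message:

```python
def normalize(datagram: str) -> str:
    """Wrap a bare name=value message in <M ...>; leave tagged input alone."""
    text = datagram.strip()
    if text.startswith("<"):
        return text
    return f"<M {text}>"
```

After this step the rest of the receiver only ever has to deal with tagged input.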
One drawback to such structure is that it makes the logs
kind of hard for humans to follow, but that problem can be
solved by having an "expansion" script/program. The example
above would expand to:
HOST=myhost.somewhere.com LVL=22 PROG=sendmail
DATE=19991025105522 FAC=MAIL TYPE=PROG.START
DOC=/var/log/whatever MSG="processing queue"
HOST=myhost.somewhere.com LVL=80 PROG=tripwire
DATE=19991025105519 FAC=AUDIT TYPE=AUTH.FAIL
MSG="Expected mode 0444, saw mode 0644"
DOC=/a/b/c/d
HOST=myhost.somewhere.com LVL=80 PROG=tripwire
DATE=19991025105519 FAC=AUDIT TYPE=AUTH.FAIL
MSG="Expected mode 0444, saw mode 0644"
DOC=/a/b/c/e
HOST=myhost.somewhere.com LVL=80 PROG=tripwire
DATE=19991025105519 FAC=AUDIT TYPE=AUTH.FAIL
MSG="Expected mode 0444, saw mode 0644"
DOC=/a/b/c/f
HOST=myhost.somewhere.com LVL=80 PROG=tripwire
DATE=19991025105519 FAC=AUDIT TYPE=AUTH.FAIL
MSG="Expected mode 0444, saw mode 0644"
DOC=/a/b/c/g
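For what it's worth, the expansion step might look something like the Python sketch below. It is only an illustration under a few assumptions: tags are limited to <CNTXT ...>, </CNTXT>, and <M ...>; inner values override outer ones with the same name; quoted values that wrap across lines are folded to single spaces. The names expand and format_record are made up here.

```python
import re

# An opening <CNTXT ...> or <M ...> tag, or a closing </CNTXT>. Quoted
# values are matched explicitly so they may contain '>' or newlines.
TAG_RE = re.compile(r'<(/?)(CNTXT|M)\b((?:"[^"]*"|[^">])*)>')
PAIR_RE = re.compile(r'([A-Z][A-Z0-9.]*)=("[^"]*"|\S+)')


def parse_pairs(text):
    """Return a dict of name=value pairs, unquoting and unwrapping values."""
    fields = {}
    for name, value in PAIR_RE.findall(text):
        if len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1]
        fields[name] = " ".join(value.split())  # fold wrapped lines
    return fields


def expand(stream):
    """Yield one flat dict per <M>, merged with every enclosing <CNTXT>."""
    stack = [{}]  # stack[-1] holds the fields inherited at this depth
    for closing, tag, attrs in TAG_RE.findall(stream):
        if tag == "CNTXT" and closing:
            if len(stack) == 1:
                raise ValueError("</CNTXT> without matching <CNTXT>")
            stack.pop()
        elif tag == "CNTXT":
            stack.append({**stack[-1], **parse_pairs(attrs)})
        elif not closing:
            yield {**stack[-1], **parse_pairs(attrs)}
    if len(stack) != 1:
        raise ValueError("unclosed <CNTXT>")


def format_record(fields):
    """Render a flat record as one line, quoting values that contain spaces."""
    return " ".join(
        f'{k}="{v}"' if " " in v else f"{k}={v}" for k, v in fields.items()
    )
```

Feeding the structured example above through expand() and printing each result with format_record() gives one line per <M>, with the HOST and tripwire context repeated on each.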
Authentication and encryption:
Most existing secure syslog implementations have extended
the protocol with network-level hop-by-hop authentication
and encryption. What I'm after, however, is something in
the data stream itself so that the data can pass through
un-trusted third parties and so that third parties can
independently verify that logs stored on disk/tape/whatever
have not been modified.
To do this, I introduce <CRYPT> and <AUTH>. Auth works
something like this:
<AUTH SIGNER>stuff to be signed</AUTH
SIG.TYPE=md5, SIG.VALUE=j43kj3248>
Where the signature value is encoded in base-64.
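As a rough illustration of the encoding only, the Python sketch below computes a base-64 SIG.VALUE and wraps a payload in <AUTH>. It assumes SIG.VALUE is simply the digest of the enclosed bytes, which the proposal does not actually spell out; an unkeyed digest can be recomputed by anyone, so a real signature would presumably also involve a key. The function names are hypothetical.

```python
import base64
import hashlib
import hmac


def sig_value(payload: bytes, sig_type: str = "md5") -> str:
    """Base-64 digest of payload (treating the digest as SIG.VALUE is an assumption)."""
    digest = hashlib.new(sig_type, payload).digest()
    return base64.b64encode(digest).decode("ascii")


def wrap_auth(payload: str, signer: str, sig_type: str = "md5") -> str:
    """Wrap payload in the <AUTH ...>...</AUTH ...> form shown above."""
    value = sig_value(payload.encode("utf-8"), sig_type)
    return (
        f"<AUTH {signer}>{payload}"
        f"</AUTH SIG.TYPE={sig_type}, SIG.VALUE={value}>"
    )


def verify(payload: bytes, sig_type: str, value: str) -> bool:
    """Recompute the digest and compare it to the transmitted SIG.VALUE."""
    return hmac.compare_digest(sig_value(payload, sig_type), value)
```

Since the value rides in the closing tag, a receiver can hash the payload as it streams past and check it once </AUTH ...> arrives.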
Crypt with static previously exchanged keys looks like:
<CRYPT ENCRYPTOR=[EMAIL PROTECTED]
RECIPIENT=[EMAIL PROTECTED] KEY=STATIC
CIPHER=static>
base-64 representation of stream
encrypted in blowfish using previously
exchanged key in previously exchanged
cipher
</CRYPT>
Crypt with public-key (which also provides sender and
recipient authentication) looks like this:
<CRYPT ENCRYPTOR=[EMAIL PROTECTED]
RECIPIENT=[EMAIL PROTECTED] TYPE=RSA
CIPHER=blowfish KEY=jfkadj43098kjer>
base-64 representation of stream
encrypted in blowfish using key
specified above, itself encrypted using
[EMAIL PROTECTED]'s private RSA key and
[EMAIL PROTECTED]'s public RSA key
</CRYPT>
You could also leave out the ENCRYPTOR to get the "anyone
could have written it but only this guy can read it"
situation (which can be fixed with an extra <AUTH>). Or
leave out the RECIPIENT to get "any of a small number of
people I've sent my public key to can read it but only I
could have sent it".
The reference implementation will also need code to process
the authentication and encryption records. Perhaps the
"expand" function mentioned above can automagically decrypt
and stick in AUTH=NONE, AUTH=<signer>, and AUTH=!<signer>
(for failed authentication).
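The AUTH= annotation itself is tiny. A Python sketch, assuming the expander already knows who (if anyone) signed the record and whether the check passed; auth_status and annotate are made-up names:

```python
def auth_status(signer, verified):
    """Return the AUTH= value: NONE, <signer>, or !<signer>."""
    if signer is None:
        return "NONE"
    return signer if verified else "!" + signer


def annotate(record, signer=None, verified=False):
    """Return a copy of an expanded record with the AUTH element added."""
    out = dict(record)
    out["AUTH"] = auth_status(signer, verified)
    return out
```

For instance, annotate({"LVL": "22"}, signer="loghost1", verified=False) yields a record with AUTH=!loghost1, where "loghost1" is just a placeholder signer name.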
--
Chris Calabrese
Internet Infrastructure and Security
Merck-Medco Managed Care, L.L.C.