syslog-auth

John Kelsey Wed, 01 Nov 2000 21:02:31 -0800
-----BEGIN PGP SIGNED MESSAGE-----

At 12:33 PM 10/27/00 -0500, Chris M. Lonvick wrote:
>At 05:24 AM 10/26/00 -0400, John Kelsey wrote:

Sorry it's taken me a few days to get back to this.

Let me make some points up front that I think will make the
rest of the note clearer:

1.  I'm thinking it will make sense to put a sort of
``version number'' in front of everything else in the
authentication block.  That will specify such things as:

a.  The source of the reboot ID (e.g., clock, random,
counter).

b.  Whether there's a specific destination counter in the
message, as well as the global counter.

c.  Whether there's any encryption being used (if we open
that can of worms; there's nothing about encryption that
*requires* reliable delivery.)

d.  Any other details we want to include.

The only thing we can't do with this is specify which MAC
we're using, since the only thing authenticating the value
is that MAC.  But any other options we might want to specify
can be put into this.  I'm visualizing this as a one- or
two-byte field.

Among other things, this lets me get rid of that horrible
kludge of using the high order couple of bits of H to encode
the source of the H value.

2.  What should the message format look like?

I'm visualizing something like this for the authentication
block:

V = version (8 to 16 bits)
H = reboot session ID (32 to 128 bits)
L = message counter (32 to 48 bits)
(optional) L_{local} = counter for messages to this dest only.
M = MAC (64 to 128 bits)

Now, I'm curious about how important it is to keep the size
of the messages down.  There are some tradeoffs:

a.  V (version and options) will just need to be whatever
length it needs to be, to cover all the options we need.
Right now, I think I need one byte, but we may eventually
need two.

b.  H (reboot session ID) can be fairly small (e.g., 32 to
64 bits) when we're setting it with a reboot counter or a
clock, because we know that we'll never repeat a value by
accident.  When we set H randomly or pseudorandomly, H has
to be much bigger.  If we imagine a lifetime limit of 2^{32}
reboot sessions for a machine, and we want a probability of
less than 2^{-32} that there will ever be a pair of random H
values that will collide, we need about a 96-bit value for
H.  It seems reasonable to me that the length of H will be
determined by some bits in V, which specify whether H comes
from a clock, a reboot counter, a persistent counter kept on
the device, or is being randomly generated.

c.  L (message counter) needs to be big enough to handle the
maximum imaginable number of successive records in one
reboot session.  We can always just start a new reboot
session whenever L gets too big, but that won't work well
for devices that rely on a reboot counter for H.  I think L
should be 48 bits, but we can probably let it go down to 32
bits.  But I'd like feedback on this.  (My concern is that
there will be some devices that generate millions of log
messages per day, and almost never reboot.)

d.  M  (MAC) needs to be big enough that an attacker has a
negligible chance of guessing the right MAC for a fake or
changed message.  A blind guess at a 64-bit M succeeds with
probability 2^{-64}, so it won't have much security impact
at all to let M be 64 bits wide.
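
The sizing arguments in (b) and (c) can be checked with a little
arithmetic.  The sketch below is only an illustration: the function
names are made up for this note, the collision estimate uses the simple
union bound (number of pairs of H values divided by 2^bits), and the
rate of ten million messages per day is an assumed figure standing in
for "millions of log messages per day".

```python
def min_random_id_bits(sessions: int, log2_inverse_prob: int) -> int:
    """Smallest bit length for a random H such that the union bound on any
    collision among `sessions` values is at most 2**-log2_inverse_prob."""
    pairs = sessions * (sessions - 1) // 2   # pairs of H values that could collide
    target = pairs << log2_inverse_prob      # need 2**bits >= pairs * 2**log2_inverse_prob
    return max(target - 1, 0).bit_length()   # smallest bits with 2**bits >= target


def days_until_wrap(counter_bits: int, messages_per_day: int) -> float:
    """Days until a counter of `counter_bits` bits runs out at a steady rate."""
    return (2 ** counter_bits) / messages_per_day


h_bits = min_random_id_bits(2 ** 32, 32)
h_bits_rounded = -(-h_bits // 8) * 8         # round up to a whole number of bytes
print("random H needs", h_bits, "bits, or", h_bits_rounded, "in whole bytes")

assumed_rate = 10_000_000                    # assumption: messages per day
for l_bits in (32, 48):
    print(l_bits, "bit L lasts", round(days_until_wrap(l_bits, assumed_rate)), "days")
```

Under these assumptions the bound lands one bit short of the 96-bit
figure above, which is what you get after rounding up to whole bytes.
A 32-bit L lasts a bit over a year at that rate, while a 48-bit L lasts
tens of thousands of years.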

The other question is how these values are encoded.  Like,
should we be doing ASCII hex values, or maybe doing some
kind of base-64 encoding or something?  So far, the examples
have been in hex, but if we're worried about conserving
bandwidth, we can cut the encoded size by about a third (four
characters per three bytes, instead of two per byte) by encoding
it in base 64, while still sticking to printable characters.
Also, we don't necessarily need the delimiters to appear in
the clear; where each field starts and stops in the
authentication block can be determined by the options
selected in V.  I don't know how worthwhile all this stuff
is to save bandwidth.
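
To put a number on the hex versus base-64 question, here is a small
sketch that packs one possible authentication block and compares the
two printable encodings.  The field widths (1-byte V, 64-bit H, 48-bit
L, 64-bit M), the big-endian layout, and the name pack_auth_block are
all assumptions picked from the ranges above, not a settled format, and
the MAC is an all-zero placeholder rather than a real MAC.

```python
import base64


def pack_auth_block(version: int, session_id: int, counter: int, mac: bytes,
                    h_bytes: int = 8, l_bytes: int = 6) -> bytes:
    """Concatenate V, H, L and M with no delimiters (hypothetical layout)."""
    return (version.to_bytes(1, "big")
            + session_id.to_bytes(h_bytes, "big")
            + counter.to_bytes(l_bytes, "big")
            + mac)


placeholder_mac = bytes(8)   # stand-in for a 64-bit MAC value
block = pack_auth_block(1, 0x0123456789ABCDEF, 42, placeholder_mac)

hex_text = block.hex()
b64_text = base64.b64encode(block).decode("ascii")

print(len(block), "raw bytes")
print(len(hex_text), "characters as hex")
print(len(b64_text), "characters as base 64")
```

With these widths the block is 23 bytes, which comes to 46 characters
in hex and 32 in base 64 (including padding).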

3.  We need to make at least a few decisions with respect to
key management to design this system.  For example, I think
a minimal requirement is to have a different shared secret
between the device and each syslog server that's expected to
receive messages from it.  The alternative is to have each
device have one shared secret, and every server that
receives messages from it must also know that secret.  The
problem with that is that if I compromise one of those
servers, I can fool all the other servers into thinking
they're receiving messages from the device.

There are ways to derive these shared keys on the fly on the
device.  Suppose the device has a single device master key,
K_{device}.  Suppose we also have a MAC, MAC_{Key}(Message)
which gives us an output the size of a key.  If we have an
unambiguous way to identify all destinations, we can have

K_{dest_1} = MAC_{K_{device}}(dest_1 ID)
K_{dest_2} = MAC_{K_{device}}(dest_2 ID)
etc.

This doesn't help the syslog servers to keep track of keys
at all, but it does help the devices, since they can derive
the key for each destination on the fly.  It also helps with
key management, since the owner of the device knows
K_{device}, and so can derive the key shared with each
destination, and can share those keys with the destination
in whatever way works best.
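
A minimal sketch of that derivation, under some assumptions: HMAC-SHA-256
from the Python standard library is used here only as a convenient MAC
whose output can serve directly as a key, destinations are identified by
their hostnames as byte strings, and derive_dest_key is a made-up name.
None of these choices is part of the scheme itself.

```python
import hashlib
import hmac


def derive_dest_key(device_key: bytes, dest_id: bytes) -> bytes:
    """K_{dest} = MAC_{K_{device}}(dest ID), with HMAC-SHA-256 as the MAC."""
    return hmac.new(device_key, dest_id, hashlib.sha256).digest()


# Example master key only; a real K_{device} would be a random secret.
k_device = b"example device master key"

k_dest_1 = derive_dest_key(k_device, b"ss01.example.org")
k_dest_2 = derive_dest_key(k_device, b"sm13.example.org")

print(k_dest_1.hex())
print(k_dest_2.hex())
```

The device only has to store K_{device}; each destination key is
recomputed when needed, and a server that learns its own key learns
nothing useful about the keys of the other servers as long as the MAC
is secure.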

Also, we probably need to specify at least some algorithms
we'll use, and the designation of the algorithm is at least
conceptually part of the key.  We need to leave the
flexibility for people to fill in their own algorithm
choices, here.  For now, we can assume hmac-md5, since it's
fairly computationally cheap and it's still considered
secure.  We may want to steal algorithm identifiers from
IPSec or TLS.  These would be stored with the key, as a
prefix.

4.  Storage security and transmission security aren't the
same thing.  That is, what we're doing here is intended to
authenticate that the messages received were sent when and
by whom we think they were sent.  This crypto doesn't do
anything to ensure that, say, the syslog server didn't
rewrite the whole set of received messages.  (It has the
secret key used for MACing them, so it can make up messages
if it's taken over by an attacker.)  We can talk about
storage security, too; it's fairly easy in a networked
environment to do the hash-chain-based secure logging.

Okay, now, on to some comments from Chris:
...
>Let's also take a look at what happens if the rules on the
>system generating the messages are (in order):

>  Critical-severity messages go to ss01.example.org
>  Sendmail-facility messages go to sm13.example.org
>  User-facility messages go to he77.example.org
>then their wrappers won't denote sequence unless
>reassembled from their respective final destinations.  This
>could get even messier since each of the receivers may have
>additional rules for the disposition of the messages.  For
>example, sm13.example.org may have the following rules:

>  sendmail.debug messages are displayed on the console
>  sendmail.alert and sendmail.emergency messages are forwarded elsewhere
>  sendmail.* are written to a file

>Honestly, I'm still OK with all of this.  The records in
>the file should still have an increasing sequence number as
>long as they still have the same H.  I could see where there
>may be clusters of monotonically sequenced messages in a
>single file.  The rest would have skips between them but
>they will still show increments.  Those gaps would either
>represent messages with different Pri, Facility or Severity
>values that went to other destinations, or messages that
>were not received.  If needed, I could go through all files
>and assemble the true picture of the sequence of all events
>from a single device except for those messages that were
>somehow dropped.

Right.  With minimal changes, the scheme as described so far
can give us these properties once we have a unique H:

a.  Replay resistance
b.  Message sequence authentication
c.  Origin authentication
d.  Destination authentication (*)
e.  Offline detection of any lost messages
f.  Offline synchronization of events (*)

(*) means we have to add something to ensure this, but it's
not going to have any important impact.
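
As an illustration of property (e), the sketch below merges the counter
values that several servers received under the same H and reports the
gaps.  It assumes the MACs have already been verified and that all the
records share one reboot session; the function name and the sample data
are invented for the example.

```python
from typing import Iterable, List, Mapping


def missing_counters(received: Mapping[str, Iterable[int]]) -> List[int]:
    """Return counter values that no server saw, between the lowest and
    highest values that were seen for one reboot session."""
    seen = set()
    for counters in received.values():
        seen.update(counters)
    if not seen:
        return []
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


# Hypothetical L values collected from three servers for a single H.
by_server = {
    "ss01.example.org": [1, 4],
    "sm13.example.org": [2, 6],
    "he77.example.org": [3],
}
lost = missing_counters(by_server)
print("lost messages:", lost)
```

Note that this only finds gaps inside the observed range.  Messages
lost after the last one received cannot be detected this way.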

>There may be an alternative to these disjointed files.  The
>simplistic answer would be to keep a unique sequence number
>for each pri value. That would be 192 counters for robust
>systems.  Going back to the prior example, we could observe
>this on the wire:
>     <14>message1              +L(<14>.1)+H+MAC
>     <22> blah blah blah       +L(<22>.1)+H+MAC
>     <14>message2              +L(<14>.2)+H+MAC
>     <18> aieeee...            +L(<18>.1)+H+MAC
>     <14>message3              +L(<14>.3)+H+MAC

Hmmm.  I'm showing my ignorance here; why 192 pri values?
(As opposed to 17, or 2^{20}, or...)  Is that some kind of
limit to what's allowed?  And I am guessing from context
that a pri value is a destination for some messages.  Right?

So, if 192 is the limit, then in the worst case, we require
1152 bytes of counter data, assuming 6 bytes of counter per
destination.  Right?
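
For reference, the figure of 192 matches the usual BSD syslog encoding,
in which the number in angle brackets is facility * 8 + severity, with
24 facilities (0 to 23) and 8 severities (0 to 7).  The sketch below
just checks that arithmetic and the 6-bytes-per-counter estimate; the
function name is made up for the example.

```python
FACILITIES = 24   # facility codes 0..23
SEVERITIES = 8    # severity codes 0..7


def decode_pri(pri: int) -> tuple:
    """Split a PRI value into (facility, severity)."""
    return divmod(pri, SEVERITIES)


pri_values = FACILITIES * SEVERITIES
counter_bytes = pri_values * 6            # one 48-bit counter per PRI value

print(pri_values, "possible PRI values")
print(counter_bytes, "bytes of counter state in the worst case")
for pri in (14, 22, 18):
    print(pri, "->", decode_pri(pri))
```

Read this way, <14> is facility 1 (user) at severity 6, while <22> and
<18> are both facility 2 (mail), at severities 6 and 2.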

...
>Would it be acceptable to have the following:
>Each device MUST have a single counter (let's call it
>system-L) that MUST be included in each message.  They MAY
>have any additional counters that they feel would be
>appropriate and they MAY designate them in the message.  It
>may be a policy decision on our example device above to have
>an individual counter for sendmail but not for any of the
>other facilities. That would allow something like the
>following:
>      (internal message)
>     <14>message1              +systemL(2)+H+MAC
>     <22> blah blah blah       +systemL(3)+L(<sendmail>.1)+H+MAC
>     <14>message2              +systemL(4)+H+MAC
>     <18> aieeee...            +systemL(5)+L(<sendmail>.2)+H+MAC
>      (internal message)
>     <14>message3              +systemL(7)+H+MAC

Yes, I think this is reasonable.  It needs to be easy for
the receiver to determine whether it's supposed to be
getting a counter specific to it or not.  That can be put
into V, though once a device has started keeping a counter
for a given receiver, it must maintain that counter for the
rest of the reboot session.
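
One way to picture the bookkeeping on the device is sketched below.  It
assumes the set of destinations that get their own counter is fixed when
the reboot session starts, which is one simple way to satisfy the rule
that a counter, once started, is kept for the rest of the session.  The
class and method names are hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple


class SessionCounters:
    """Global counter (system-L) plus optional per-destination counters."""

    def __init__(self, dests_with_local_counter: Iterable[str]) -> None:
        self.system_l = 0
        # Fixed for the whole reboot session.
        self.local: Dict[str, int] = {d: 0 for d in dests_with_local_counter}

    def next_counters(self, dest: str) -> Tuple[int, Optional[int]]:
        """Return (system-L, L_{local} or None) for the next message to dest."""
        self.system_l += 1
        if dest in self.local:
            self.local[dest] += 1
            return self.system_l, self.local[dest]
        return self.system_l, None


counters = SessionCounters(["sm13.example.org"])
sent = [
    counters.next_counters("he77.example.org"),
    counters.next_counters("sm13.example.org"),
    counters.next_counters("he77.example.org"),
    counters.next_counters("sm13.example.org"),
]
print(sent)
```

The receiver at sm13 sees its own counter go 1, 2 with no gaps, while
the global counter it sees skips the values that went to he77.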

>Along these lines, it may be possible for an implementor to
>assign counters for Facilities, Severities or combinations -
>various PRIs.  It would still be allowable for an
>implementor to have a counter for each PRI.

>John:  I'd like to hear your thoughts on this.  Would there
>be any other mechanisms that you could suggest to simplify
>this?

This all makes sense to me.  At worst (for devices with lots
of syslog servers and very little RAM), we keep track of
only one counter, and detecting which messages were lost
requires collecting the records from all those syslog
servers.  The device can be configured to provide specific
counters for any subset of servers it's sending log messages
to, or to none.

>Thanks,
>Chris

 --John Kelsey, Counterpane Internet Security, [EMAIL PROTECTED]
PGP Fingerprint: 5D91 6F57 2646 83F9  6D7F 9C87 886D 88AF


-----BEGIN PGP SIGNATURE-----
Version: PGPfreeware 6.5.1 Int. for non-commercial use
<http://www.pgpinternational.com>
Comment: foo

iQCVAwUBOgCCxyZv+/Ry/LrBAQG7FwQAjKHQRh/uZQhmYO1ZpWQTj+MOoGHL1u2I
BbOV3Vzy1IymsAPmjl0CECD89r5kM7SMqcZ/0F+VGV98OvK2O5pzMQw5vvJtFD4A
QLiwmPghRvIyUuKxI7tFbFpR2A/WildY+mT0IOpCoP5iSEbxCX1NtTe2zoD2/MS1
7gkzqfFf3f0=
=c1V6
-----END PGP SIGNATURE-----