-----BEGIN PGP SIGNED MESSAGE-----

Guys,

I have been thinking about the problems of adding crypto to
UDP-based syslog, in a way that we can support even on
devices with very few resources.  I think I know how to do
this in a simple way that will require little overhead and
will add a great deal of security.  I wanted to bounce this
off the list, because while I understand the cryptography for
doing all this, I have a lot less background with the
networking stuff.  I'd really appreciate comments.

When a person or program is looking at a bunch of syslog
messages, they need to know exactly what has been verified
about the messages.  It seems to me that one big problem
with Alex' method with using hash chains comes from
this--there's no way to tell whether a message that's been
received among a bunch of other messages is being replayed,
or is simply surrounded on both sides by lost messages.
(Though he wasn't proposing his scheme for anything but a
quick fix.)  If I'm looking at a set of syslog messages, I
need an unambiguous statement about what I can trust about
these messages.  Were they all sent in the sequence I'm
seeing them?  Do I see some indication of missed messages?
Do I have any idea when these messages were sent?

Schemes with reliable delivery ought to be able to handle
this really well; I haven't looked at the use of crypto in
BEEP, but from my very brief reading of their proposal, a
person or program reading a log file received at a syslog
server will generally know that the messages weren't
replayed, and when they were sent, will know the right
sequence for the messages, and will know whether there were
any blocks of time during which the connection went away and
there were no messages sent/received.  (Is this right, or am
I missing something?)

My impression is that unreliable, one-way delivery is still
an important case to be able to handle.  For example, a
secure syslog scheme won't be a drop-in replacement for all
current applications of syslog over a network while it
requires two-way delivery or reliable delivery, right?

Anyway, there are two basic ideas behind the way I think the
problem should be solved:

a.  Each time the sending machine (``device,'' I guess)
reboots or otherwise loses its current state, we call that a
new ``reboot session.''  So, if a machine crashes and starts
back up, it's in a new reboot session.  Since we have
unreliable message delivery, we can't count on a message
informing the recipient (``log server'') that we've
rebooted.  It turns out to be really useful to give each
reboot session its own unique ID, which I'll call H.

b.  Each message within a reboot session is numbered.  That
is, we keep (say) a 32-bit counter, and increment the
counter for each message.  I'll call that counter L.

We include H and L in every message, along with the MAC used
to verify that the message hasn't been altered in transit.

Now, there are a couple neat things this does for us right
away:

a.  When we receive a bunch of messages with the same H and
different L, we can put those messages into the order in
which they were sent, and we can see any ``gaps''
representing messages that didn't arrive.  There's no way
for an attacker to splice in messages from previous reboot
sessions into this reboot session's messages, or confuse us
about the order of these messages, becase he can't change H
or L without messing up the MAC.

b.  If we can figure out a way to make sure those H values
come out in some sequence (like clock values or reboot
counters), we can put every message we ever receive in the
right sequence, and we may be able to notice gaps of whole
reboot sessions.  If we can store old H values previously
received on the syslog server from this machine, we can
*always* detect replay attempts.

That means that this stuff can give us everything we can get
from cryptography in this application, as far as
authentication goes.  (We can also use the same H and L
values and shared keys to encrypt messages securely.)

Note that we don't really care if we see the first message
of a new reboot session.  All we care about is that messages
with the same H value are always from the same reboot
session, and within the reboot session, the messages were
sent in the order of their L values.

Now, this raises only one problem:  Not every device has any
persistent state (like a clock, reboot counter, or flash
memory) across reboot sessions.  The question is then how to
get a unique H value on those machines.  If we had a random
number generator on the machine, we could just generate H
randomly; a 128-bit H would more-or-less guarantee, in
practice, that we'd never see two reboot sessions with the
same H value from the same device.  But a device with no
clock, persistent memory or reboot counter is unlikely to
have a good random number generator, and a crappy one (like
an LCG based on a 32 bit seed) won't work for this
application.

My general solution to this is to use an initial value of
H==0, which is defined to be a special ``null'' reboot
session identifier, and then to keep a running hash of
messages sent, and eventually, to generate a new H from it.
Thus, most messages from these very low-end devices will
have a unique reboot session ID, but the first few messages
from each reboot session will have H==0, and so won't be
able to be placed in a known sequence with certainty.

To do this, we start with some assumption about how many
messages, N, it takes before we are all but certain that
this sequence of messages is unique among reboot sessions.
We then take the hash of those N messages, and use it as the
reboot session ID.   Included in this hash may be any number
of optional inputs taken from the device's internal state.
If there's an internal clock which is reset each time the
system reboots, then hashing in the value of that internal
clock along with each message is a good idea.

The idea here is to use the messages themselves to generate
H.  There's no authentication going on from this, but using
a hash function here ensures that if there's even one bit
changed in one message, we get a different H.

The whole proposal thus looks like this:

a.  At reboot time, if we can, we generate a
guaranteed-unique H.  The first couple bits in H specify
what other properties it has, such as whether H includes a
timestamp, reboot counter, etc.  There's nothing about H
that requires any secrecy or unpredictability, we just want
to make sure it's always unique.

If we can't generate a guaranteed-unique H, we set H = 0,
and start accumulating the running hash of messages for
the first N messages.

In either case, we set L = 0.

b.  Each time we send a message, we increment L.  The
message we send is

message-text,H,L,M

where M = the MAC of (H,L,message-text).

If H == 0, then we also keep a running hash of messages sent
so far.

If H == 0 and L == N, we compute the final hash value for
all those messages we've hashed together, and use it for our
new H value.  We keep using the same L value.

c.  When we receive a message, we verify the MAC, and then
look at H and L to see whether they're consistent with what
we've been receiving lately.  Whenever H changes, we
consider that a discontinuity--we know how to arrange
messages with the same H value, but not necessarily how to
arrange messages with different H values.

The first couple bits of H signal what kind of reboot
session ID this is: 00 means it's random, 01 means it's
based on a clock, 10 means it's based on some kind of
counter.

Comments?

 --John Kelsey, Counterpane Internet Security, [EMAIL PROTECTED]
PGP Fingerprint: 5D91 6F57 2646 83F9  6D7F 9C87 886D 88AF


-----BEGIN PGP SIGNATURE-----
Version: PGPfreeware 6.5.1 Int. for non-commercial use
<http://www.pgpinternational.com>
Comment: foo

iQCVAwUBOff4JCZv+/Ry/LrBAQHpIwP9GCErX1Y+UE0yLMTUJj02nN2o3DUUGIRj
Lt16amsWwnGlSa4cvT4zyCGOMT7iS1peXilD3cXAlLABR0kvayobPp+zprjPoWJY
LtTttfl3Zfb/AxFOPC+1VNflJgxt0nS9Gx1qde6WZcZNlRA6tseaGagUO1wAq6to
LsIaHk1cAXU=
=K5aV
-----END PGP SIGNATURE-----


Reply via email to