Re: Simple authentication of syslog messages for embedded network devices

John Kelsey Mon, 16 Oct 2000 09:48:21 -0700
-----BEGIN PGP SIGNED MESSAGE-----

>Date: Wed, 11 Oct 2000 01:19:14 -0400
>From: Alex Brown <[EMAIL PROTECTED]>
>Subject: Re: Simple authentication of syslog messages for
>embeddednetwork devices  

>Chris Lonvick wrote:

...
>>- The thought of using the chain works well in a system
>>  with sequenced delivery.  However UDP won't ever provide
>>  that so we may see a behavior of the following:

[ Illustration of a problem with hash chain deleted to save
  space. --JMK]

>>  There would be nothing to confirm the order of C
>>  relative to D.  The indication would be that there were
>>  gaps between B and F and that both C and D were received
>>  in that time period however, it could not be reliably
>>  said that C was generated before D.  Your note about the
>>  sequence number could change this behavior and I'd like
>>  to see that explored.

>Yes, there's no way to sequence single-packet segments.
>This is not intended to provide reliable transport or even
>reliable characterization of corruption.  It will detect
>bogus packets and sequence corruption; that's about it.

So, I'm really puzzled about why you've stuck with the hash
chain idea.  It gives you almost none of what you're trying
to accomplish, here.  I think I can design something simple
that *does* do most of what you want, using sequence
numbers or hash chains.  (But in that case, the hash-chain
version is more complex.)

Let's consider some of the security properties here:

a.  Detecting alterations of messages.

Putting a MAC into the log message takes care of this
nicely.

b.  Detecting missing messages.

Putting a MAC into the log message does us no good here.
The hash chain scheme that Alex uses detects *breaks* in
sequences of messages, but doesn't tell us anything about
how many messages were omitted.

c.  Detecting inserted messages.

Putting a MAC into the log message does no good here,
either.  Alex's hash chain scheme detects discontinuities in
the message stream, but can't distinguish between a couple
of dropped packets and an inserted message.

d.  Detecting replayed old messages.

When we receive a message from some sender, the MAC
convinces us that the message really came from the sender
and hasn't been altered since.  But it doesn't tell us
*when* the message came from the sender.  An attacker can
record messages today, and replay them to us in a month.

There are a couple of possible ways to fix this:

(i)  Add some kind of sequencing information, like a hash
chain or sequence number, so that it's obvious that we're
receiving old log messages.  This adds some overhead to the
sender and receiver, but it prevents replay attacks if
it's done properly.

(ii) Keep track of log messages we've stored, and never
store a duplicate message.  This only detects replay attacks
if we keep every log message ever generated by this sender,
so that we recognize every duplicate message. Otherwise, it
just increases the difficulty of replay attacks, by forcing
the attacker to wait longer between when he records the
messages and when he replays them.

Now, Alex is using a hash chain in his scheme, but it
doesn't detect replays.  An attacker can record messages
from this sender today, and then replay them tomorrow, and
the receiver will never be able to distinguish this from
long streams of valid log messages with an occasional
dropped packet.

If the sender can be assumed to have a clock, reboot
counter, or any persistent storage, replays are  easy to
resist.  If not, then it's harder to resist, but it can
still be made easy to detect that something strange is going
on.  But the current scheme in Alex's draft doesn't do this.

I'd like to recommend making some changes to Alex's scheme
to resist replays whenever possible, and I'd like to also
add some recommendations to the document for implementors,
to resist replay attacks online and detect them offline.
It's *critical* that an attacker not be able to replay a
totally misleading sequence of log messages from a previous
day, and get it added to the log in a way that's believed by
the sysadmin reading it to be cryptographically
authenticated.  This is probably worse than leaving the
sysadmin knowing his log messages are unreliable.

e.  Resisting flooding attacks.

We'd like to be able to resist attacks where an attacker
floods us with bogus messages to fill up our memory, so that
we either lose previously stored messages, or can't accept
future ones.  Once we've added a MAC to the messages,
flooding is possible only by replaying old messages.  The
attacker records a bunch of messages, and then floods the
receiver by replaying them in a loop, over and over again.

If the receiver can detect replays, it can just discard
these messages.  If not, it may still keep track of the MACs
of a few old messages from various days, and use this to
at least detect flooding attacks.

Since the current hash-chain based scheme doesn't let us
discard replayed messages, it doesn't protect us from
flooding attacks.

>> Is this going to be limited to md5 or are other hash
>> algorithms going to be accepted?

>Again, that's up to the implementor.  This is just an
>indication of a general method.

The MAC used ought to be considered part of the key shared
between the sender and receiver.  We're already securely
distributing that, and there's nothing to be gained by
allowing additional changes of algorithm on the fly, since
the sender and receiver can't negotiate a new algorithm or
anything.

I mean, is there any circumstance where we'd *want* to
distribute a key for use with hmac-md5, and then change over
to using hmac-sha1 *on the fly*?  I can't see where this
would ever be useful.

Even worse, the receiver *can't* count on message text
to tell it which MAC to use, since the only thing
authenticating that message text is the MAC.  It's like
trying to lift yourself up by your own bootstraps.

There are all kinds of weird-looking attacks that come from
trying to allow the message being MACed to specify the kind
of MAC using it.  Just think of the key as being
specifically an hmac-md5 key, or hmac-sha-256, or whatever,
and all those attacks evaporate for free.

...
>The MAC is computed over message text only, and is
>completely independent of the IP header.  NAT translation
>will not affect it.

The MAC should also be computed over the previous message's
MAC, in the current scheme, right?  I've understood this to
be how the scheme works as proposed, right?  (Otherwise, the
hash chain adds no security at all.)

...
>>Some of these questions come from parts of the discussions
>>in RFCs 2082 - RIP-2 MD5 Authentication - 2328 - OSPF
>>Version 2 - and 2385 - Protection of BGP Sessions via the
>>TCP MD5 Signature Option.
>>ftp://ftp.isi.edu/in-notes/rfc2082.txt
>>ftp://ftp.isi.edu/in-notes/rfc2328.txt
>>ftp://ftp.isi.edu/in-notes/rfc2385.txt
>>The RIPv2 messages are somewhat similar to this case of
>>trying to authenticate the messages without verification
>>of receipt.  From these works, I'd ask if it would be
>>better to use a "chain" or to have a "sequence number"?

>Well, these discussions are all about much more serious
>protocols and security problems.  John Kelsey wrote to me to
>recommend adding a higher resolution wall clock value at the
>source and each forwarder, but I think tiny but managed
>devices may not have clocks or even a boot count.

Yeah, I made some suggestions based on some possibly
incorrect assumptions, and then got busy with work and
moving, and lost track of the discussion.  But the basic
idea of what I proposed will work, and will provide better
security in just about every instance I can think of than
the one in the current draft.  It shouldn't be substantially
more complex than the hash chain based scheme.

>He
>recommended a sequence number from boot as an alternative,
>noting that an initial value can be used as a shared secret
>to validate the whole sequence, so that no secret needs to
>be hashed with each message. (This is the approach of
>[PEO95];  "PEO" stands for "Primer Estado Oculto", "first
>state hidden".)  Unfortunately, the device's reboot may not
>be observed by the log server, so the sequence's origin may
>be unknown and therefore the whole sequence is
>unauthenticated.

It's easy to avoid this happening.  I'll show a simple
scheme for doing this below.

>So I think it's simplest just to
>authenticate each message with a shared secret, and use the
>previous MAC only as a chain value and nonce in the message.
>This means that the authentication only verifies that the
>sender is known, and that nothing was sent or lost between
>the previous message and the current one.  A segment can
>thus begin anywhere and end anywhere.

So, it looks to me like we can get exactly this property in
the worst case with a sequence number, and in the general
case, we can do much better.

Suppose each message has three fields added for
authentication:

H,L,M

where

H = high 64 bits of sequence number
L = low 64 bits of sequence number
M = 128 bit MAC

The sender and receiver share a permanent secret value, K.

When the sender reboots, it must initialize H and L.  It
does this as follows:

a.  If the sender has a clock, a reboot counter, or
persistent memory, the high bit of H is always set to one.
If it has a clock, then the low 63 bits of H are set
according to the current timestamp.  If it has persistent
memory, then the previous value of H is read from memory,
incremented, written back to memory, and used.  If it has a
reboot counter, the low 63 bits of H are set to the value of
that counter.

b.  If the sender doesn't have a clock or persistent memory,
the high bit of H is always set to 0.  The remaining bits of
H may be set to a random value (if any changing or
unpredictable values are available to the sender), or may be
set to all binary zeros.

c.  L is always initialized to 0.

When a message is generated:

a.  We compute M = HMAC-MD5(K,(H,L,text-message)), where the
hexidecimal representations of H and L are used.

b.  We include in the message an authentication block,
consisting of H,L,M.

c.  We increment L.


When a message is received:

a.  The receiver computes M' = HMAC-MD5(K,(H,L,text-message)).

b.  If M = M', the receiver examines H to see if it has
changed from the previous message seen from this sender.

c.  If not, then the receiver checks L to see if it has
already been seen or is too old to be accepted.  Based on
this, the message is accepted or rejected.  (That is,
accepted or rejected by the crypto; the receiver may decide
to drop or store these messages.)

d.  If H is new, and its high bit is set to one, then the
new H is checked to see if it is a successor to the old H.
If so, the new H is accepted as genuine, and (after some
short window to allow out-of-order messages to arrive), no
more messages with the old H will be accepted.

e.  If H is new and its high bit is set to zero, then the
new H is accepted.

At log analysis time:

a.  The messages are sorted on H, then L.

b.  Gaps in the messages are easy to identify.

c.  Changes of H can be scrutinized offline for signs of
replay attacks, when the high bit of H is zero.  When the
high bit of H is one, replayed H values are a sign that
either the sender has been compromised or has malfunctioned
in some way.

The result of all this will be that:

a.  When we have a clock, persistent memory, or a reboot
counter on the sender machine, there's no way to replay log
messages.

b.  Otherwise, we make replays harder to do and easier to
catch, but can't totally prevent replays from previous
days' log traffic.  However, it will be obvious from the
logs that either the sender machine has been rebooting a lot
lately, or we're under a replay attack.  And there are some
tricks for detecting these offline automatically.

Contrast this with the MAC chaining scheme:

a.  It's always possible to replay old log messages.

b.  There's no way to tell whether a message whose hash
chain doesn't match the previous received message is being
replayed to insert it into the log, or whether the
intervening message simply didn't make it to the receiver.

c.  There's no information about how many messages were
missed in a gap, and no additional information to use to
determine whether a replay is being carried out.

For what it's worth, it's also possible to do nearly the
same thing with hash chains.  This is a bit more
complicated, but it will work pretty well, and will even
provide slightly more replay resistance in the case of
sender machines with no clock, reboot counter, persistent
memory, or internal state with which to generate a single
random number.  I may write such a scheme up and post it
later; it's not all that complicated, since we're mostly
using all the crypto primitives for what they were designed
for, instead of having to bend them into some odd shape to
fit our application.

>Thank you, Chris.  I'll reflect your comments in the final
>writeup, which will be in standard form.  As you know,
>no-one is currently supporting my work on this draft, so
>I'll have to keep it simple.  Let's pay more attention to
>the syslog replacement problem.

Maybe some of these recommendations of mine will be more
useful for replacing syslog, especially the more complicated
hash chain stuff.  But replacing the hash chain scheme
you're currently using with a sequence number scheme seems
to me to make the whole thing much stronger in nearly all
cases, with very little additional work.

>Alex
>Alex Brown <[EMAIL PROTECTED]> http://www.msg.com/~abrown +1 617 504
>8761  

 --John Kelsey, Counterpane Internet Security, [EMAIL PROTECTED]
PGP Fingerprint: 5D91 6F57 2646 83F9  6D7F 9C87 886D 88AF


-----BEGIN PGP SIGNATURE-----
Version: PGPfreeware 6.5.1 Int. for non-commercial use
<http://www.pgpinternational.com>
Comment: foo

iQCVAwUBOesxByZv+/Ry/LrBAQH4dAP9F+WTBDspsz0ksEBucnaoDcszSvfArkCh
0zEa3nvkB3iyVgOc/XgdOut6kyDpPAyyhxWEHLspKHmdulTLKhSWZapjrxWtanoT
wDCSdEUzAvCXp3XYZcVZ/hnrtK3rhYpVIo1p+i4sReebAezsrI19j9wX7TUpEXEC
Vy0wPqAdakk=
=+p4x
-----END PGP SIGNATURE-----
Re: Simple authentication of syslog messages for embedded network devices

Reply via email to