Re: A hideously long description of syslog-auth

John Kelsey Wed, 06 Dec 2000 10:14:50 -0800
-----BEGIN PGP SIGNED MESSAGE-----

At 03:59 PM 12/4/00 -0800, you wrote:
>Just a few comments. Some are a little bogus, as you answer
>the question later in the document (like the 4.3b comment),
>but I left it there so you know that perhaps it would be
>better to rearrange or at least add a footnote forward
>referencing the discussion.

Thanks.  I know this is badly arranged--I originally
intended this to be a much smaller document.

>Throughout - Wouldn't it be better to use the terminology
>from the syslog-syslog-02 draft? Device, collector, relay?
>And reserve "sender" and "receiver" to have their more usual
>meanings (i.e., the machine putting a message on the wire or
>taking it off the wire, respectively)? Otherwise, a sentence
>like
>> A message with the Replay Vulnerable flag not set comes with
>> a promise from the sender:
>becomes very ambiguous. Who is doing the promising?

Good point.

>0.1 - You missed a design decision that I think ought to be
>explicitly stated: The packets are going in the clear over
>UDP. Almost all of what you address here is already handled
>in syslog-reliable. Only the "storage MAC" is unique to
>syslog-auth and not syslog-reliable. I.e., a great deal of
>what this addresses really is reliability (in the sense of
>getting the message safely to the destination without
>alteration) rather than authentication (in the sense of
>knowing who created the message). I worry a bit about having
>two syslog protocols proposed with so much overlap.

Hmmm.  I see your point, but S-A doesn't really do anything
for reliability in the sense of making sure your packets
arrive, it just tells you whether they arrived or not, and
guarantees that they weren't changed in transit and
originated from the sender you expect them to have
originated from.  The thing that makes this so complicated
isn't really the duplication of TCP in things like sequence
numbers, it's dealing with huge numbers of options for the
devices, relays, and collectors, and propagating verified
statements from a chain of relays to the final collector.

How does syslog-reliable handle forwarded information from
an old-style device?  My impression is that it doesn't do
anything with it, because that's not the way you're supposed
to use this stuff.  And I'm inclined to think that this is
exactly right.  Our goal is clearly for people to move to
syslog-reliable wherever possible, since that gives superior
guarantees (e.g., instead of just noting where the missing
messages are, we either have all the messages in sequence,
or have evidence of a denial-of-service attack of some
kind), and since we can then use two-way communications for
key management and such.  But I wanted to make sure that
syslog-auth would be something that could be dropped in with
a minimum of hassle, and which would provide a clear
statement of guarantees about what could be said about the
logs stored on the concentrator.  And syslog-auth over TCP,
or over some other reliable delivery mechanism, would (I
think) provide the same kind of guarantees that
syslog-reliable would.

Anyway, I am writing up syslog-sign (I'll think of a better
name) to deal with the whole totally in-band storage
authentication scheme I described before.  It's about ten
times simpler, since I don't have to worry about
transmission security at all.

>2.2 - Base64 is defined in the basic MIME RFCs.

Thanks.

>2.2.4 - Define "destination"?  Do you mean collector? Or just one
>hop?  

Just one hop.  My impression is that the device can't know
either:

a.  Whether the machine it's sending messages to will
forward them or store them.

b.  What criteria the machine it's sending to will use to
decide where to forward messages.

This makes detection of gaps in the messages much harder.

>4.3b  - 96 bits is probably *way* overkill. My guess would
>be that 2^-32 is far, far smaller than the probability that
>someone missets the clock on a superincreasing counter and
>generates duplicates that way.

Hmmm.  If the superincreasing session ID decreases, a
collector or relay receiving messages with that session ID
can legitimately just discard the messages.  (This is the
only way they can be certain of the ability to resist
flooding attacks.)  So this would be kind of a nasty failure
mode. This is another place where the relays add a lot of
complexity.  A collector can decide whether to discard
replayed messages or keep them based on whether its disk is
getting too full.  A relay can't know anything about
collectors that will eventually get this message (the
machine the relay's sending to may also be a relay, so it
doesn't even know where the message will end up).  If the
relay detects replayed session IDs, it pretty much has to
discard the replayed messages.

>Consider that it's only a
>collision if you actually *store* all the old session
>numbers; consider how much memory it would take to even list
>a sparse bit array with 2^96 entries in it. (I haven't done
>this, but I expect it's rather large if you assume you have
>(say) 2^32 bits set out of 2^96.)

There are two sides to this.

- From the perspective of a collector or relay trying to
detect repeated pseudorandom session IDs, it can keep a
sorted list of old session IDs and do a binary search on
them whenever it sees a new one, or it can keep the whole
list in a hash table that's about twice the size of the list
(using the first four bytes of the session ID as the key),
and chain collisions.  Or any number of other tricks from
Knuth.  Under normal circumstances, I'd be surprised to see
a machine keep the same key and keep running for more than
ten years, and I'd be surprised to see a machine reboot more
than ten times a day.  That would result in a total of about
36525 reboot session IDs in its lifetime.  If each is
stored on disk at the collector, it requires under a
megabyte to store a hash table that's twice as big as the
list.

- From the perspective of an attacker trying to find repeated
session IDs, the attacker can reasonably keep a hash table
just like what I described above, but also has to store some
useful sequence of messages to splice in.  (If the device
ever generates a session ID that's been used before, the
attacker can splice in messages from the previous session in
real time.)
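
To make the collector-side bookkeeping concrete, here is a rough
Python sketch of the hash-table approach and the storage estimate.
The class and method names are made up for illustration and do not
come from the draft; Python's built-in set stands in for the
hand-rolled hash table with chained collisions.

```python
RSID_BYTES = 12  # 96-bit reboot session ID, as discussed above


class SeenSessionIDs:
    """Remembers every pseudorandom session ID seen under one key.

    Hypothetical helper, not part of syslog-auth itself.
    """

    def __init__(self):
        self._seen = set()  # a Python set is a hash table

    def check_and_record(self, rsid: bytes) -> bool:
        """Return True if rsid is new, False if it repeats an old one."""
        if len(rsid) != RSID_BYTES:
            raise ValueError("session ID must be 96 bits")
        if rsid in self._seen:
            return False
        self._seen.add(rsid)
        return True


# Storage estimate from the text: ten reboots a day for ten years.
lifetime_ids = int(10 * 365.25 * 10)         # 36525 session IDs
table_bytes = 2 * lifetime_ids * RSID_BYTES  # table twice the list size
```

With these numbers table_bytes comes to 876600, which is where the
"under a megabyte" figure comes from.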

The pseudorandom RSID might need to be a little smaller, but
I hate to make it much smaller, since that will decide what
anyone is allowed to use it for in the future.

>4.3.2 - I think the wording you're looking for is that the
>PRNG needs to be cryptographically secure.

That's close.  I need a seed that has something like 96 bits
of entropy in it, in the sense that I should expect to have
to wait until I've seen on the order of 2^{48} independently
generated seeds, before I see a pair that are equal.
There's no need for cryptographically strong mechanisms to
expand the seed or condense it.  There's not even any need
for the seed to be hard to guess, given other information
like the time of reboot or what's going on on the device's
local network.  The only thing that's required is that we
get a unique RSID.
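
As a sanity check on the 2^{48} figure, the usual birthday
approximation can be evaluated directly.  This is only a sketch:
the function name is hypothetical, and it treats the seeds as
uniformly random 96-bit values, which is the idealized case.

```python
import math


def collision_probability(num_seeds: float, bits: int) -> float:
    """Birthday approximation: P ~= 1 - exp(-n^2 / 2^(bits + 1))."""
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for tiny x.
    return -math.expm1(-(num_seeds ** 2) / 2.0 ** (bits + 1))


# Around 2^48 independent seeds, a repeat becomes reasonably likely.
p_at_2_48 = collision_probability(2.0 ** 48, 96)

# A device that reboots ten times a day for ten years uses 36525 seeds.
p_lifetime = collision_probability(36525, 96)
```

p_at_2_48 comes out near 0.39, while p_lifetime is far below
anything we would ever expect to observe.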

>4.3.2.f - Regarding "N", if the device just always
>generates the same single message on every reboot, why would
>one be concerned about message replay? Also, any device with
>that little capability is unlikely to be break-into-able, I
>would expect?

I'm not worried about someone breaking into the device.  I'm
worried about their being able to confuse the logs by
replaying old messages in a way that can't be detected.

Suppose the device ever sends only one message, saying ``I'm
okay'' once every hour.  The collector may have some use for
these messages.  Maybe the device is a motion sensor in a
wiring closet, and if it ever sends an ``I'm not okay''
message, it means there really may be something funny going
on.  It would be nice if syslog-auth could make sure that
when you see ``I'm okay'' from this device in the logs every
hour for the last two weeks, that actually meant that once
an hour for the last two weeks, the message was sent by this
device.  If the reboot session ID doesn't ever get a unique
value, then the attacker can always resend old ``I'm okay''
messages, and there will be no way to distinguish them from
fresh messages.
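
A small sketch of why the unique session ID matters here.  The MAC
input layout below (session ID, then a 4-byte sequence number, then
the text) and the use of HMAC-SHA1 are assumptions made for
illustration; they are not the syslog-auth wire format.

```python
import hashlib
import hmac


def tag(key: bytes, rsid: bytes, seq: int, text: bytes) -> bytes:
    """Hypothetical MAC over (session ID, sequence number, text)."""
    data = rsid + seq.to_bytes(4, "big") + text
    return hmac.new(key, data, hashlib.sha1).digest()


key = b"key shared by device and collector"
msg = b"I'm okay"

session_1 = b"\x00" * 11 + b"\x01"
session_2 = b"\x00" * 11 + b"\x02"

# Distinct session IDs: the same text at the same sequence number
# gets a different tag after each reboot, so an old message stands out.
t_old = tag(key, session_1, 1, msg)
t_new = tag(key, session_2, 1, msg)

# Reused session ID: the tag recorded in the earlier session is
# byte-for-byte what the device would send now, so a replay is
# indistinguishable from a fresh message.
t_reused = tag(key, session_1, 1, msg)
```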

>4.4.8 - What is the advantage of having a longer key-id
>than MAC? Why not 64 bits for each? [OK, you answer this
>later, but you might want to consider a footnote mentioning
>that you answer this later. :-]

Good point.

>5.0 - "When a message is received from an old-style
>forwarder..."  How does it know? Can I de-authenticate a
>message by mucking with the source IP address as it goes
>past me? If an old-style formatter changes nothing but the
>IP headers, what is the vulnerability to continuing to trust
>the authentication blocks? Otherwise, the IP headers should
>be included in the MAC hash for the forwarders.

I can receive a message that's got a valid authentication
block, but whose key ID I don't recognize.  In that case, I
can't make *any* inferences from information in that
authentication block.  (The authentication block could just
as easily have been generated by an attacker.)

Also, I put a flag in there somewhere in the fine print that
specifies that the sender (device or relay) thought it was
sending to an old-style receiver.  Though I can't seem to
think of a case where this would actually be useful, now
that I'm looking at it.  Hmmm.

Anyway, the basic question a relay needs to answer about a
message it's forwarding over syslog-auth is ``Do I have some
reason to trust that this message hasn't been altered or
replayed in transit to me?''  And it will embed its answer
in a flag in the forwarding block, so that later relays and
the final collector will know the answer.

>5.2.2.a - Again, how does it know? Or do you mean "from a
>machine for which the receiver shares no key?"

That's the effective meaning.  Sorry about the confusion.

>5.2.3.b - Replay resistance - Does each forwarder check
>that the previous forwarder is not replaying, or that the
>original sender is not replaying? I.e., if you have
>Device->Relay1->Relay2->Collector, how does Relay2 check
>that the message isn't being replayed? Are you expecting
>both Relay1 *and* Relay2 to remember all the messages it has
>seen from any device? What's the benefit of having the
>relays check for replays vs having the collector check?
>Remembering that by the time it gets to Relay2, it may be
>out of order, so you can't just keep track of the highest
>sequence number you've seen for each reboot ID.  [This last
>bit you answer later, so you might want again to mention you
>answer it later.]

Each relay checks to make sure that the incoming message
isn't being replayed, from the device or the previous relay.

Here's the idea I'm trying to get across:

(Device,Relay1) share a key and some context.
(Relay1,Relay2) share a key and some context.
(Relay2,Collector) share a key and some context.

There's no reason to expect Device to share context with
Collector or Relay2, or for Relay1 and Collector to share
any context.  There's no reason to even imagine that Device
has any clue about the very existence of Relay2 or the
Collector.

Let's imagine that all these machines are using
superincreasing session IDs, because it makes the
explanation easier.  (This will all still work with
pseudorandom session IDs.)

Relay1 knows the highest session ID it's seen from Device so
far, and the currently active session ID, if any.  It shares
a key with Device.  Suppose someone starts replaying old
messages from Device to Relay1.  If Relay1 discards these
messages, then Relay2 and Collector never see them, and so
don't have to worry about detecting replays.  Relay1 knows
whether or not they're replays, and has all the information
needed to discard them.  If Relay1 forwards them along
anyway, then Relay2 or the Collector have to dig into the
innermost authentication block to try to detect replays.
This can be done in principle, though it does require some
additional overhead for the Collector.  It also allows a
replay flooding attack to add a bunch of work for Relay1,
Relay2, and Collector.  (All those flooded messages have to
be authenticated, verified, and transmitted.)

Note also that Relay1 doesn't know anything about the
replay-detection capacity of Relay2 or Collector.  It's
possible that Relay1 is the last chance to filter out these
replayed messages.
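
Roughly, the check Relay1 makes on messages from Device could look
like the sketch below.  It assumes the MAC has already been
verified, uses integers for the superincreasing session ID, and
simplifies by requiring strictly increasing sequence numbers within
a session (no window for out-of-order arrival).  All names are
hypothetical.

```python
class ReplayFilter:
    """Per-device replay state kept by a relay (illustrative only)."""

    def __init__(self):
        self.highest_session = -1  # highest session ID seen so far
        self.last_seq = -1         # last sequence number in that session

    def accept(self, session_id: int, seq: int) -> bool:
        """Return True to forward the message, False to discard it."""
        if session_id > self.highest_session:
            # A new, higher session ID: the device has rebooted.
            self.highest_session = session_id
            self.last_seq = seq
            return True
        if session_id == self.highest_session and seq > self.last_seq:
            # Next message in the currently active session.
            self.last_seq = seq
            return True
        # Old session ID or old sequence number: treat as a replay.
        return False


relay1 = ReplayFilter()
results = [
    relay1.accept(5, 1),  # first message of session 5
    relay1.accept(5, 2),  # next message
    relay1.accept(5, 1),  # replay within the session
    relay1.accept(4, 9),  # replay from an older session
    relay1.accept(6, 1),  # device rebooted, new session
]
```

Discarding at this point means Relay2 and Collector never have to
authenticate, verify, or transmit the replayed traffic.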

Now, it's possible to specify that the Device MUST make all
messages replay-resistant, and that the Collector MUST do
all replay detection on the original message text and
authentication block.  Maybe this would make more sense.  It
seems like this is a tradeoff between simplicity of the
specification and a moderately nice feature.

Comments?

>Re trimming hashes - Might it not be better to XOR the
>hashes ("folding" them, so to speak) rather than truncating
>the high bits and thus losing their significance. It's not
>obvious to me that if md5 is collision-resistant, that only
>the bottom X bits of md5 is necessarily collision-resistant
>as well.

Well, there's a pretty simple argument for this.  Suppose
there is a subtle property of md5 that makes the low-order
96 bits slightly predictable when you choose the right kind
of input, say inputs that are 500 bits long and end in 300
zero bits.  Suppose this additional predictability makes it
take only 2^{32} of these special inputs before we expect to
see a 96-bit collision.  The obvious question to ask is:
what would this mean for the security of md5?

So now, we can use this to attack md5.  We generate a huge
set of collisions in the low-order 96 bits.  This special
set of inputs has to be making some 96-bit outputs very
unlikely, and others very likely.  With 2^{32} outputs, we
get about 2^{63} pairs of outputs.  If we expect one
collision in that set of pairs with probability 1/2, another
way of saying that is that we think that the probability of
any given pair colliding in this set is about 2^{-64}.

Let's suppose that the remaining 32 bits of md5 are
effectively random for these special inputs.  This is the
best situation possible, in terms of making collisions
harder to find.

If we could find 2^{32} pairs of md5 outputs that collide in
their low 96 bits, then we would expect to have one of those
pairs collide in their upper 32 bits, as well.  How many
special inputs will we need to expect about 2^{32} colliding
pairs?

We want N such that

(N^2/2) * 2^{-64} = 2^{32}

Solving, we find N = 2^{48.5}, or roughly 2^{49}.

So, for the cost of 2^{49} inputs, and some work with huge
sorted lists, we could find a collision in md5.  What all
this means is that if there's a property that makes it much
easier to find collisions in some subset of md5's output
bits than we'd expect from the birthday attack, then we can
use that to make it easier to find collisions in all of md5.
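
The arithmetic above is easy to check numerically.  The variable
names are just for illustration; the inputs are the hypothetical
figures from this argument, not measured properties of md5.

```python
import math

# Per-pair chance of a collision in the low 96 bits, given the
# hypothetical weakness described in the text.
pair_collision_prob = 2.0 ** -64

# Number of low-96-bit colliding pairs needed before we expect one
# of them to also match in the remaining 32 bits.
needed_pairs = 2.0 ** 32

# N inputs give about N^2 / 2 pairs, so solve
#   (N^2 / 2) * pair_collision_prob = needed_pairs
n = math.sqrt(2 * needed_pairs / pair_collision_prob)
log2_n = math.log2(n)  # 48.5

# For comparison, a generic birthday attack on a 128-bit hash needs
# on the order of 2^64 inputs.
log2_generic = 128 / 2
```

So the hypothetical weakness would cut the work from about 2^{64}
inputs to about 2^{48.5}.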

If the flaw in the subset of md5's outputs is very small,
then so is the resulting flaw in md5; as the flaw in the
subset of the bits gets worse, so does the corresponding
flaw in the whole hash function.

This is why I'm not too worried about using the subset of
output bits as a sort of shortened hash, useful in this
special context.  The reboot session ID also has a really
nice property--the pseudorandom session IDs are *never*
chosen by an attacker to exploit some property of the hash
function; instead, they're chosen in a way intended to
approximate a random selection of a 96-bit number.

>Re knowing how far back to keep - Calculate the diameter of
>your network in hops, multiply each by 255, and look at the
>message number that many seconds old. Assuming your IP stack
>actually implements the hop count in seconds as well as
>hops, you should have a hard time getting a message older
>than that unless many of the relays delayed it for a long
>time. If you allow a relay to hold a message for an
>arbitrarily long time, there's no guaranteed upper limit,
>and hence no good way of giving a "reasonable" upper limit.

Right.  The big potential problem seems to me to exist if
the relay holds big messages back for a long time, but
happily sends along smaller messages right away.  If many
relays along a path do this, the message could end up
arriving outside its replay window.

>Appendix A: It's not that hard to test, if you have the
>source to the code. You simply put an "if" at the front of
>the hash function that says "If the key is YADDA1 or YADDA2,
>return a hash of 123". Then don't use YADDA1 or YADDA2 in
>your real configuration. :-)

Well, the test would have to deal with intelligent handling
of colliding key IDs.  Which would mean trying each matching
key ID on a message, until one worked.  It's obvious because
of this that we wouldn't want the key ID to be, say, 8 bits,
since a collector or a relay that handled a couple dozen
different devices' logs would suffer a performance hit.
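
Handling colliding key IDs by trying each candidate key might look
something like this sketch.  The table layout, the 8-byte MAC
length, and the use of truncated HMAC-SHA1 are all assumptions for
illustration, not details taken from the draft.

```python
import hashlib
import hmac
from collections import defaultdict

MAC_BYTES = 8  # assumed MAC length for this sketch

# key ID -> every key registered under that ID (usually just one)
keys_by_id = defaultdict(list)


def register(key_id: bytes, key: bytes) -> None:
    keys_by_id[key_id].append(key)


def make_mac(key: bytes, message: bytes) -> bytes:
    return hmac.new(key, message, hashlib.sha1).digest()[:MAC_BYTES]


def verify(key_id: bytes, message: bytes, mac: bytes):
    """Try each key filed under key_id; return the one that works."""
    if len(mac) != MAC_BYTES:
        return None
    for key in keys_by_id.get(key_id, []):
        if hmac.compare_digest(make_mac(key, message), mac):
            return key
    return None  # unknown key ID, or no candidate key verified
```

The loop costs one MAC computation per key sharing the ID, which is
why a very short key ID would hurt a collector that handles many
devices.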

>It just seems weird to me to
>design for a possibility that is so unlikely to ever happen
>that you can't even do it on purpose. And when it *does*
>happen and gets handled improperly, the first likely
>response is "funny, messages from that new router aren't
>getting authenticated. I must have mistyped the password. I
>better change it." :-)

Actually, my plan was to design it so that the probability
of this bad event ever happening is so astronomically low,
that in practice we will be totally shocked if it occurs.
This is exactly the same practice that people use for
ciphers or hash functions or signatures.  SHA1 hashes from
almost arbitrarily long messages down to 160 bits.  It's
obvious that any message you choose is overwhelmingly likely
to have a huge number of messages that hash to the same
value.  But the probability that a randomly-selected pair of
messages will accidentally hash to the same value is so low,
we never, ever expect it to happen in practice.

>Darren New / Senior MTS & Free Radical / Invisible Worlds Inc.
>San Diego, CA, USA (PST).  Cryptokeys on demand.
>There is no "P" on the end of "Winnie the Pooh."

 --John Kelsey, [EMAIL PROTECTED]
  ``Slavery's most important legacy may be a painful insight
  into human nature and into the terrible consequences of
  unbridled power.'' --Thomas Sowell, _Race and Culture_


-----BEGIN PGP SIGNATURE-----
Version: PGPfreeware 6.5.1 Int. for non-commercial use
<http://www.pgpinternational.com>
Comment: foo

iQCVAwUBOiyskCZv+/Ry/LrBAQEn4wP/VIruz9fLxlZrvRnJp5e/yH81RQke1ma6
H56Ax/ufKsB979E+68UI7xJom5RCgRVOjv7I3X3zJEwY2LSxDi1CUu4NQezSAOmZ
8pDJg8mD2r5xmT0oiyb6qY6bI5xeqF563vR5vKJknC7m7TSSLaO3r/HwMNAtOxw+
5gc0hSfwxRI=
=dOYd
-----END PGP SIGNATURE-----