-----BEGIN PGP SIGNED MESSAGE-----

Guys,

I've been working on the syslog-sign draft, particularly
dealing with the problem of (more-or-less) real-time
signature verification.  I wanted to summarize some of what
I've worked out here, and see if I can get any useful
comments.


1.      Review

Syslog-sign sends the original syslog messages along
unaltered, but then sends an occasional syslog message which
carries a signature on the previous N normal syslog messages
sent.  This message is called a signature block, and it has
the following properties:

a.  We can put the received signature blocks back into the
order in which they were sent, and detect replays and gaps.

b.  We can use each received signature block to verify the
contents, location, and freshness of its corresponding
sequence of messages.

c.  Missing messages are noticed, but they don't cause any
problems verifying the remaining messages verified by the
signature block.

d.  There is a tunable redundancy parameter, which
determines how many times each signature block (and
certificate block) is resent.

e.  We also send certificate blocks at the beginning of the
message, but I'll be able to pretty-much ignore those blocks
in this message.

f.  I'm also going to ignore a few other complicating
factors, because they don't have much impact on this
analysis.

By convention, I'll refer to normal syslog messages as
``messages,'' and other kinds of syslog-sign messages as
``blocks.''  Note, however, that signature blocks and
certificate blocks are themselves encapsulated in apparently
normal syslog messages.

2.      Offline Verification

Offline verification of logs using syslog-sign is really
easy, assuming we have enough disk space.  During normal
operations, we simply collect all syslog messages from a
given sender into the same sequential file.  To build a
verified log file offline, we use three stages:

a.  In the first stage, we go through the raw log file and
read each message in sequence.  The certificate blocks and
signature blocks are each put into a file in sorted order,
sorted on their session IDs and sequence numbers.  We thus
create a sorted certificate block file, and a sorted
signature block file.  Any duplicates (by sequence number)
are discarded.  (For simplicity, we will assume here that
only one session ID is being processed at a time.) Each
normal message is hashed, its hash value is concatenated,
and the result is stored in a hash+message file, sorted by
hash value.

b.  In the second stage, we go through the certificate block
file, and try to build the full self-signed certificate for
this session ID.  If this fails, we can't verify anything
about the log file (which is why we resend those certificate
blocks many times).  Assuming this succeeds, we verify the
certificate, and extract the public key from it.

c.  In the third stage, we build the authenticated log file.
We go through the signature blocks in order; for each
signature block, we note the message sequence numbers
covered, and then each message's hash value.  We look up
each hash value in the hash+message file, and insert it into
the authenticated log file.  The format used is

(message_sequence_number, message)

where message_sequence_number is a 48-bit integer.  The
authenticated log file will show all gaps in the arrived
messages.  (There's no way to tell whether a given message
was totally lost, or merely arrived corrupted in some way.)

When the sorted signature block file has been used up, we
have built the full authenticated log file.  This file will
show every message we received whose contents were able to
be verified, and gaps for every message which either wasn't
received or wasn't able to be verified.

It's easy enough to see that this is O(N lg N) in the length
of the raw message file, since most of the raw message file
will be standard syslog messages, and they must be hashed
and put into a sorted file.  It's possible to do this a
little faster by trading space for time, building a hash
table for the messages, and using the first R bits of the
hash as the key for the hash table.  (This is one of the few
times when we mean the same thing by ``hash function'' in
terms of both cryptography and a searching algorithm.)

Flooding attacks are a threat against this system--anyone
can simply flood us with plausible-looking syslog messages,
and we'll have to store them for later.

3.      Online Verification

We can also build an authenticated log file in
near-real-time.

3.1.    Variables

Let:

U = the maximum time elapsed between when the first message
verified by a signature block is sent, and when that
signature block itself is sent.

V = the maximum difference in message arrival delay between
any pair of messages sent at most U seconds apart.

W = the maximum rate of messages sent per second

X = the maximum rate of signature blocks sent per second.

3.2.    Simple Algorithm

We have a simple algorithm to run on the receiver side.  Let
us assume for now that the certificate blocks have all been
received and verified.  We now must process signature
blocks and message blocks.  (There will be duplicate
certificate blocks sent from time to time, but these can be
immediately detected and discarded.)

3.2.1.  Data Structures

HM is a queue of messages and their hashes.  It can be
efficiently accessed by hash value, but it must also be able
to keep track of when messages roll off the queue. HM[table]
must be able to hold at least W*(U+V) messages and their
corresponding hashes.  Adding, rolling off, and seeking by
hash value must all be efficient.  The best way I can see to
build this is to keep a queue of messages and corresponding
hashes, and to store the messages' positions in the queue in
a hash table keyed on the first R bits of the hash value.
The entries in the hash table are stored in linked lists, so
that we can reasonably efficiently delete entries.  (It
might be smarter to use a binary tree, and rebalance it as
needed; I haven't spent enough time to be sure.)

NH is a queue of message sequence numbers, stored with
either a message hash or a message.  Each signature block
received adds N (sequence number, message hash) pairs.
The best way to build this seems to be as a queue containing
sequence numbers, hash values, and message strings.  The
message strings start out as null strings.

3.2.2.  Message Arrival

When a message arrives, it is classified as either a
signature block or a normal message.  If it's a normal
message, we do the following:

a.  Hash the message.
b.  Add the message and its hash to the HM queue.
c.  Handle any message that must be rolled off the queue.

3.2.3.  Signature Block Arrival

When a signature block arrives, it is processed as follows:

a.  Check to see if the session ID and sequence number are
valid.

b.  Check to see if we've seen this sequence number before.

c.  If not, verify the signature on the signature block.

d.  If the signature verifies, add the N (sequence number,
message hash) pairs from the signature block to the NH
queue.  For each added pair, look for a message with the
corresponding hash in the HM queue.  If so, add the message
to the data.

e.  Check to see if any (sequence number, message hash)
pairs roll off the NH queue.  If they do, check to see if
they all have found their messages.  For any that haven't,
seek in the HM queue for their hashes.  If the messages
aren't found, then that sequence number is left blank in the
authenticated output stream.  If a message has been found,
then the (sequence number, message) pair is written to the
authenticated output stream.

It looks to me like this is more-or-less linear in the
number of messages we process.  Each arriving signature
block requires a couple table lookups and one signature
verification, and then does N additions to the NH table, and
N lookups in the HM table.  Each message arrival does one
addition to the HM table and one hash computation.  Since
each signature block ought to correspond to N normal
messages, that looks like it will be O(N) if the lookups
into the HM table are linear.

Again, we're vulnerable to flooding attacks.  An attacker
can give us a gazillion plausible-looking messages, and fill
up our input queue.  That has to somehow be detected, though
I don't know what the best way to deal with it is.

Comments?

- --John Kelsey, [EMAIL PROTECTED]


-----BEGIN PGP SIGNATURE-----
Version: PGPfreeware 6.5.1 Int. for non-commercial use
<http://www.pgpinternational.com>
Comment: foo

iQCVAwUBOn8ouSZv+/Ry/LrBAQHyKwP/dmsRM0e3Xy9yRrb/pRXq1HsMWJnM0eU4
+72ScGSMQyPVawUlFySBgCqnW6m2vpD3BkG8w6uBSmCQLXkalHHNdkIsWXp42/Th
joQkfnfY8CXvIsAm+BxZ3FvKz5tiqFe5AmraRvxsDuJBtcXxTF0IYn4TULrG63ge
t3H2OG3T/Qc=
=n/Kr
-----END PGP SIGNATURE-----

Reply via email to