> A 64-bit binary microsecond counter comes out like this:
 > 
 >    * Resolution - excellent
 >    * Bandwidth - 8 bytes
 >    * Ease of processing off the wire - poor on processors not capable of
 >      doing 64-bit processing, especially if different endianness
 >      than net transmission

Endianness might be a problem. But changing the byte order requires
fewer CPU cycles than converting from an internal integer
representation to a string representation (I suppose most hosts, even
"simple" components like routers, switches, etc., use such an integer
format). Even if the source clock only has second resolution,
multiplying by 1,000,000 would do the trick. A small sketch of both
steps follows.
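
Here is a minimal sketch of that idea (the function names are mine,
not from any draft): the 64-bit microsecond counter is assembled from
eight network-order bytes with shifts, so host endianness never
matters, and a seconds-only clock is simply scaled up.

/* Sketch only; illustrative names. */
#include <stdint.h>
#include <time.h>

/* Byte-order fix-up: shift-and-or works on any host endianness. */
static uint64_t usec_from_wire(const unsigned char buf[8])
{
    uint64_t t = 0;
    for (int i = 0; i < 8; i++)
        t = (t << 8) | buf[i];
    return t;
}

/* A sender with only second resolution just scales up. */
static uint64_t usec_from_seconds(time_t secs)
{
    return (uint64_t)secs * 1000000u;
}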

(We could be tricky and go for 1/1,048,576 as the sub-second
resolution; three simple instructions, one copy and two shifts, would
do the conversion. One possible layout is sketched below.)
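
Assuming, purely for illustration, that the high 44 bits carry the
seconds and the low 20 bits the 1/1,048,576 fraction, packing and
unpacking need only shifts and a mask, no division by 1,000,000:

/* Sketch of one possible layout; nothing here is taken from a draft. */
#include <stdint.h>

#define FRAC_BITS 20   /* low 20 bits = fraction in units of 2^-20 s */

static uint64_t pack_ts(uint64_t secs, uint32_t frac)
{
    return (secs << FRAC_BITS) | (frac & ((1u << FRAC_BITS) - 1));
}

static uint64_t ts_seconds(uint64_t ts)
{
    return ts >> FRAC_BITS;
}

static uint32_t ts_fraction(uint64_t ts)
{
    return (uint32_t)(ts & ((1u << FRAC_BITS) - 1));
}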

 >    * Ease of processing to get human readable - poor on processors not
 >      capable of 64-bit processing
 >    * Ease of doing date arithmetic - poor on processors not capable of
 >      64-bit processing

I assume two things:
1) By the time this protocol is in use, most CPUs on systems
    working as log-hosts will be 64-bit processors, especially in those
    environments where security and performance are important. Most simple
    components still running 32-bit processors will only be logging
    messages, not processing them.
2) The path most log messages will take is
    logging component -> log-host -> log-processing -> human viewer
                      a)          b)                c)
    With a string format, the timestamp will be converted from integer
    to string format at a), from string format to integer format at b),
    and (perhaps) back to string format at c).
    With a 64-bit integer format, the conversions at a) and b) are not
    necessary, and even conversion to and from 32-bit (at a) and b)) is
    much cheaper than converting to and from string formats (see the
    sketch after this list).
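
To make the cost concrete, here is a rough sketch (illustrative names,
not from any draft) of the two extra conversions the string format
forces onto the path above; the binary format replaces both with a
copy plus, at most, a byte swap.

#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>

/* step a), string variant: integer -> ASCII on the logging component */
static int ts_to_ascii(uint64_t usec, char *buf, size_t len)
{
    return snprintf(buf, len, "%" PRIu64, usec);
}

/* step b), string variant: ASCII -> integer again on the log-host */
static uint64_t ts_from_ascii(const char *buf)
{
    return strtoull(buf, NULL, 10);
}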

 > YYYYMMDDHHMMSS.fraction
 > 
 >    * Resolution - variable

Reading from a variable-length field requires more processing than
reading from a fixed-length one; see the small sketch below.
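
A tiny illustration of that point (the eight-byte width and the space
delimiter are assumptions for the example, not taken from any draft):
a fixed field is located by its offset alone, a variable one has to be
scanned for its terminator first.

#include <stddef.h>
#include <string.h>

static size_t ts_field_len_fixed(void)
{
    return 8;   /* known in advance, no scan needed */
}

static size_t ts_field_len_variable(const char *msg, size_t msglen)
{
    const char *stop = memchr(msg, ' ', msglen);   /* scan byte by byte */
    return stop ? (size_t)(stop - msg) : msglen;
}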

 >    * Bandwidth - 15 bytes (including stop byte) to do second
 >      resolution, more to do higher resolution
 >    * Ease of processing off the wire - no processing needed

That depends. It only holds if the log message is written directly to
a file that will be read by humans without further processing. In all
other cases, the binary format wins.

 >    * Ease of processing to get human readable - simple string
 >      manipulation to get to something mktime() can deal with
 >    * Ease of doing date arithmetic - easy after feeding to mktime()

mktime() still needs many more CPU cycles than a copy and two shifts,
or a single multiply; see the sketch below.
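
For comparison, here is roughly what "feeding to mktime()" involves
for the YYYYMMDDHHMMSS format (a sketch; note also that mktime()
interprets the fields as local time, so a UTC stamp needs extra
handling):

#include <stdio.h>
#include <time.h>

static time_t parse_yyyymmddhhmmss(const char *s)   /* e.g. "20011015123456" */
{
    struct tm tm = {0};
    if (sscanf(s, "%4d%2d%2d%2d%2d%2d",
               &tm.tm_year, &tm.tm_mon, &tm.tm_mday,
               &tm.tm_hour, &tm.tm_min, &tm.tm_sec) != 6)
        return (time_t)-1;
    tm.tm_year -= 1900;   /* struct tm counts years from 1900 */
    tm.tm_mon  -= 1;      /* and months from 0 */
    tm.tm_isdst = -1;
    return mktime(&tm);   /* calendar arithmetic on every call */
}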

 > seconds-since-the-epoch-in-ascii.fraction
 > 
 >    * Resolution - variable
 >    * Bandwidth - under 12 bytes (including stop byte) to do second
 >      resolution in the foreseeable future, more to do higher resolution
 >    * Ease of processing off the wire - no processing needed
 >    * Ease of processing to get human readable - easily feed to gmtime()
 >    * Ease of doing date arithmetic - easy after feeding to gmtime()

 > Sounds like this last one wins if you ask me.

It all depends on what we want the protocol to be. If we would like
to have human-readable messages at every stage, I would go with the
second format. But I'm more concerned about a protocol that is fast to
process and secure. I'm not interested in reading lots of messages I
never or rarely need to look at; I leave the job of sorting out the
unimportant messages to the computer. Even better, I would like some
standard API that allows me to specify to the applications/components
what messages I'm really interested in. This way, I could avoid
filtering out all those unnecessary/uninteresting messages.

If I really cared about network bandwidth, I would go with a 32-bit
second-resolution counter. We can always avoid the overflow problems
if we allow the <start-of-the-epoch> point to be chosen locally by the
admins (sketched below). I don't think that sub-second resolution is
really needed right now, but it can be included at little cost, so why
not?
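
A minimal sketch of that variant, assuming a site-wide, admin-chosen
epoch that both sender and receiver are configured with (the
LOCAL_EPOCH value here is just an example, not part of any draft):

#include <stdint.h>
#include <time.h>

#define LOCAL_EPOCH 978307200   /* e.g. 2001-01-01 00:00:00 UTC, site policy */

static uint32_t to_wire_seconds(time_t now)
{
    return (uint32_t)(now - LOCAL_EPOCH);   /* 32 bits on the wire */
}

static time_t from_wire_seconds(uint32_t wire)
{
    return (time_t)wire + LOCAL_EPOCH;      /* back to an absolute time */
}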

 > Furthermore, timezones are just a bad idea altogether because:
 > 
 > 1. Nobody really cares about what the timezone was on the original
 >      machine.  They only care about the timezone they're in when
 >      they're reading the logs.
 > 2. Dealing with multiple machines having different timezones but
 >      their logs going to the same file is too much of a pain.
 >
 > Just use UTC.

If everything on my network did this, it would be okay. But dates are
often kept in the local timezone, so everything that logs has to
convert from the local format to UTC and back; see the sketch below.
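
For what it's worth, a short sketch of the UTC side: time_t is already
UTC-based, so stamping in UTC is just gmtime() plus strftime(); the
painful direction is mapping local-time strings back again.

#include <stdio.h>
#include <time.h>

static void utc_stamp(char *buf, size_t len)
{
    time_t now = time(NULL);          /* seconds since the epoch, UTC-based */
    struct tm *utc = gmtime(&now);    /* no local timezone involved */
    if (utc != NULL)
        strftime(buf, len, "%Y-%m-%d %H:%M:%S UTC", utc);
    else
        buf[0] = '\0';
}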

        Klaus

-- 
Klaus Moeller            |                    mailto:[EMAIL PROTECTED]
DFN-CERT GmbH            |
Vogt-Koelln-Str. 30      |                      Phone: +49(40)42883-2262
D-22527 Hamburg          |                        FAX: +49(40)42883-2241
Germany                  |       PGP-Key: finger [EMAIL PROTECTED]
