Rainer:

Good draft!  I have a laundry list of suggestions/questions below.  Feel
free to respond to them in different emails at your convenience.  The
issues are in more or less section order.

1. First, I have to say I don't understand the backward compatibility
aspects. An RFC 3164 compliant syslog collector or relay is supposed to
accept ANY messages as stated in section 4.2 of that RFC.  Can you
explain in practical terms what backward-compatibility we are trying to
achieve?  I think it is an important question we need to settle as this
affects the whole draft.

I assume the new draft RFC will provide hard requirements for syslog in
contrast to the informational nature of RFC 3164.  As such, this RFC
cannot be fully backward compatible with RFC 3164 which allowed free
form messages.  If we make various selected RFC 3164 aspects required in
the draft (like old timestamp), we are essentially putting a stamp of
approval on a previously informational recommendation and making it a
requirement instead of obsoleting it.  How are we ever going to obsolete
the old format this way?

2. Section 2 refers to "machines" and "devices" which is misleading.  I
think we need to talk about "applications". After all a sender and
collector can both be on the same machine.

3. Section 4. HOSTNAME.  I think "." and "-" characters are allowed in
FQDN (except no trailing "-") per RFC 1123. Also, the limit of 64
characters is inappropriate. It should be 255 per same RFC.
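
To make the RFC 1123 limits concrete, here is a minimal Python sketch of
the kind of HOSTNAME check I have in mind.  The function name is made up
for illustration, and the 63-character per-label limit is the usual DNS
limit, which I am assuming here rather than taking from the draft.

```python
import re

# One DNS label: letters, digits and "-", never starting or ending with
# "-", at most 63 characters (assumed per-label limit).
_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")

def is_valid_fqdn(name: str) -> bool:
    """Return True if `name` looks like an RFC 1123 style host name."""
    # Overall length limit of 255 characters, as suggested above.
    if not name or len(name) > 255:
        return False
    # "." separates labels; every label must match on its own.
    return all(_LABEL.fullmatch(label) for label in name.split("."))
```

With this, "log-host.example.com" passes while "bad-.example.com" does
not.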

4. Section 4. The time-secfrac field should be specified as 1*4DIGIT.
This is the only number of digits that would be allowed given the 32
character limit you specified for TIMESTAMP field. This just makes it
more explicit and actually removes the need to specify the length of the
TIMESTAMP field.

5. Section 4. MSG. I think the character set specified here is not
consistent with specifying that UTF-8 is supported.  A UTF-8 character can
consist of multiple bytes and each byte can be any 8-bit value. Also
you refer to "PRINTABLE" in the comment, which is not defined anywhere.
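
A quick Python illustration of the multi-byte point (just a sketch; the
sample character is my own choice): any MSG character set limited to
7-bit values cannot admit the bytes below.

```python
# A non-ASCII character becomes several bytes in UTF-8, and each of those
# bytes has its high bit set, i.e. lies outside the 7-bit ASCII range.
text = "\u4e2d"  # one Chinese character (U+4E2D)
encoded = text.encode("utf-8")

print(len(text), len(encoded))          # 1 3
print([hex(b) for b in encoded])        # ['0xe4', '0xb8', '0xad']
print(all(b >= 0x80 for b in encoded))  # True
```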

6. Section 4.1. PRI field.  First, I support Albert's proposal of a new
format which increases the number of facilities and provides a format
that is easier to handle.  I just don't know why stop at 999 facilities
and not allow say 2bln (signed 32-bit).  An alternative (less optimal
for performance) is defining a structured content parameter "facility"
or "channel" and assuming new syslog collectors/relays will use it in
configurations.
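
For context, RFC 3164 packs facility and severity into the single PRI
number as facility * 8 + severity.  A minimal Python sketch of that
legacy encoding (the function names are mine, not from any RFC):

```python
from typing import Tuple

def encode_pri(facility: int, severity: int) -> str:
    """Build an RFC 3164 style PRI part, e.g. "<165>"."""
    if not 0 <= facility <= 23:
        raise ValueError("RFC 3164 defines facilities 0-23 only")
    if not 0 <= severity <= 7:
        raise ValueError("severity must be in 0-7")
    return "<{}>".format(facility * 8 + severity)

def decode_pri(pri: str) -> Tuple[int, int]:
    """Split a PRI part back into (facility, severity)."""
    value = int(pri.strip("<>"))
    return divmod(value, 8)

print(encode_pri(20, 5))    # <165>  (local4, notice)
print(decode_pri("<165>"))  # (20, 5)
```

Any format that carries the facility separately, whether a wider field
or a structured parameter, would not need this arithmetic at all.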

7. Section 4.1. PRI field.  I think naming facilities 16-23 "local" is
misleading.  In fact, remote logging uses those almost exclusively. So,
how are they local? I'd call them "custom facility 1,2,..." or something
like that.

8. Section 4.1. Note 1.  I think here and in many other places in this
draft RFC we should avoid using language such as "...have been seen...",
etc.  This is not intended to be an informational RFC like 3164. I think
it would be more appropriate to be talking about what SHOULD or MUST
happen instead of what has been seen to happen.

9. Section 4.2. Here and thereafter you use the term "visible (printing)
characters".  Although you clarify everywhere the specific character
range, I think this term is imprecise.  A Chinese character encoded in
UTF-8 will be visible if you have the right viewer and not visible if
you don't.  Maybe you should refer to "non-control characters" instead.
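
As a sketch of what "non-control character" could mean in practice, the
Unicode general category gives a viewer-independent test.  This assumes
we would define "control" as category Cc; the function name is made up.

```python
import unicodedata

def is_non_control(ch: str) -> bool:
    """True unless `ch` is in Unicode general category Cc (control)."""
    return unicodedata.category(ch) != "Cc"

print(is_non_control("A"))       # True
print(is_non_control("\u4e2d"))  # True  (Chinese character)
print(is_non_control("\n"))      # False (line feed is a control)
```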

10.  Section 4.2 Last 3 sentences. Again you mention "has usually been
seen".  Do we actually want to recommend the use of one IP or the other
or at least the consistent use of one?

11. Section 4.2.1. In the note, you mention "single syntax".  In fact,
use of second fractions is optional. Yes, technically it is one ABNF
syntax. But then so is the RFC 3339 which you claim provides "multiple
syntaxes".

12. Section 4.2.1. My feeling is we should not support the old timestamp
format in this RFC.  If some collector wants to support it, they can be
both RFC3164 and RFC.new compatible, right?  Why give more prominence to
the old legacy timestamp which we know is bad?

13. Section 4.2.1. Bullet point talking about time-secfrac should
mention that performance considerations are another condition for the
recommendation, not just availability of clock accuracy.

14. Section 4.2.2.  Again, I am not convinced that supporting the legacy
of just the hostname instead of FQDN is a good enough reason to keep it.
We may still want just the hostname option for local logging though.

15. Section 4.2.2.  Do we want to make a recommendation as to what is
preferred hostname-only or IP?

16. Section 4.2.2. Where we mention IPv6 RFC 2373, we should mention
specifically the section on "Textual representation" of that RFC -
section 2.

17. Section 4.2.3.  We never say what the purpose of the TAG field is
nor give any guidance to what should be put there. This field of the
syslog specification is, to me, very strange.  I understand the legacy,
but see my concerns about backward-compatibility.  The fact that no
spaces are allowed is not optimal.  Recommendation of a trailing ":" can
only mislead casual observers into believing it is used as a separator
character while it is not.  Then, what's the purpose of this
recommendation?

18. Section 4.2.3.  We never explain what's the difference between
static and dynamic portions of TAG.  The last sentence talks about use
of "consistent tag value", but I don't understand what it means.
Consistent between what and what? This needs clarification.

19.  Section 4.3. The phrase "traditionally and most frequently used"
should be replaced with SHOULD, MUST or RECOMMENDED I think.

20. Section 4.3. The last two paragraphs are talking about some "code
sets".  I think if we are talking about UTF-8, we are talking about
*one* code set -- UNICODE -- and one encoding -- UTF-8.  I thought
UNICODE and UTF-8 obsoleted all that code set business, or am I wrong?

21. Section 4.4. TRAILER. This says that some receivers may require a
trailer.  Aren't we supposed to specify here what compatible receiver is
allowed to require and what not? Why are we allowing this?  I think
nobody should require trailer and we should drop this from format.

22. Section 4.5. Sentence "..locally defined facility (local4)...".
Again, I am confused by term "locally-defined".

23. Section 5 & beyond. Why is there a need to specify structured data
*anywhere* within the message?  I thought we would designate a special
field like TAG for the structured data.  This way we won't need a
special sequence to identify it.  Also, I think allowing it everywhere
gives too much unnecessary freedom. Harder to evolve protocol later.

24. Section 5.1.  Like with the MSG, I think the character set of
parameter value is any non-control character with some characters being
escaped. We are supporting UTF-8 within the parameter values, right?

25. Section 5.1. I think the fact that each structured data item which
has a different IANA dictionary needs to be in a different block is
somewhat cumbersome and limiting. For example, if I want to put the
msgid parameter in all of my messages regardless of use of
fragmentation, then when I do use fragmentation, do I have to put this
parameter twice?

26. Section 5.1.  I think the dictionary identifier can be made into just
another key-value parameter. This would be more consistent with
providing a general mechanism of key-value pairs and the idea of using []
brackets to group related tags.

27. Section 5.1. Can the SD-ID be optional for experimental parameters?
This way I don't have to put "x-cisco" in front of all tags.  I don't
see any value in this.  We can just assume experimental tags.  If a
given vendor needs to identify his tags they can do this with their own
parameters like "vendor", "product", "version", or whatever else the
vendor wants.  Vendor tags are for vendor use only, right? General
syslog collector won't use them anyway, correct?

28. Section 5.1. I would also consider the following approach which
eliminates dictionaries.  If we only need parameter namespace so we can
avoid conflicts between current & future syslog RFCs and vendor
parameters, then we can just define some prefix for current and future
syslog protocol parameters. For example "sys.msgno", "sys.fragcount",
etc.  Then, we will control the tags in this namespace using IANA or
RFCs.  If some vendor wants to re-use the "sys.msgno" tag because the
definition of the tag suits them for a different use case, then they
don't need to duplicate it.
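
A minimal Python sketch of how a collector might use such a prefix,
assuming parameters have already been parsed into a dictionary.  The
"sys." prefix is the one proposed above; the function name and the
"product" parameter are hypothetical.

```python
SYS_PREFIX = "sys."  # hypothetical reserved prefix, as proposed above

def split_params(params):
    """Separate reserved (standard) parameters from vendor-defined ones."""
    standard = {k: v for k, v in params.items() if k.startswith(SYS_PREFIX)}
    vendor = {k: v for k, v in params.items() if not k.startswith(SYS_PREFIX)}
    return standard, vendor

standard, vendor = split_params({"sys.msgno": "17", "product": "csrc"})
print(standard)  # {'sys.msgno': '17'}
print(vendor)    # {'product': 'csrc'}
```

No dictionary identifier is needed: the name alone tells the collector
which parameters are under IANA/RFC control.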

29. Section 5.1. I think we should require a space character after each
structured data block closing bracket.  This will make it more readable
while eliminating the ambiguity as to whether or not the space is part
of the message.  Even your examples will look nicer. I think we can make
the space optional between two structured blocks of data.

30. Section 6. Paragraph 3 calls for not using fragmentation when a message
can fit in a single message.  I think, in general, we assume the use of
fragmentation *only* for splitting long messages.  We had some
discussion on this a long time ago, but I don't remember the conclusion.
The other use case for fragmentation (or better named multi-part
messages in this case) is when the message is inherently multi-line.
For example, a stack trace:

LockConflictException
 at com.cisco.csrc.db.LockTable.obtainUpdateLock(LockTable.java:199)
 at
com.cisco.csrc.db.indexes.OidIndex.obtainUpdateLock(OidIndex.java:448)
 at
com.cisco.csrc.db.PObjectImpl.obtainUpdateLock(PObjectImpl.java:1184)

How can I send such message with current draft?  I would have to come up
with some new parameters likely.  I think this needs to be standardized.
The distinction here is that the original message is not a single line.
Rather the original message is a multi-part message with each part being
a separate line.

To handle the above we need to differentiate the case when message does
not need to be assembled.

31. Section 6.  Again we had discussion on this before... It would be
useful if message parts could be sent before the total length of the
message is known.  We have one message in our system which is about 2000
lines long. It dumps all kinds of properties on crash.  It would be nice
if I could send parts of this message without knowing the total message
part count.  Otherwise, I would need to assemble the whole message
before sending it.  This can be problematic if I am crashing due to out
of memory condition, for example.  To address this, we simply need to
state that the recount parameter is optional in all fragments except for the
last message.  This will designate the end of the fragmented message.
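
Here is a rough Python sketch of what I mean, under my own assumptions:
the parameter names "msgid", "partno" and "count" are placeholders and
not the names used in the draft.  The sender buffers only one line at a
time, and only the final part carries the count.

```python
from typing import Dict, Iterable, Iterator, List, Optional

def fragment(lines: Iterable[str], msgid: str) -> Iterator[Dict[str, object]]:
    """Yield one part per line; only the last part carries "count"."""
    it = iter(lines)
    try:
        pending = next(it)
    except StopIteration:
        return  # nothing to send
    partno = 1
    for line in it:
        # Another line exists, so `pending` is not the last part.
        yield {"msgid": msgid, "partno": partno, "text": pending}
        pending = line
        partno += 1
    # The final part announces the total and so marks the end.
    yield {"msgid": msgid, "partno": partno, "count": partno, "text": pending}

def reassemble(parts: Iterable[Dict[str, object]]) -> Optional[List[str]]:
    """Return the lines in order, or None while parts are still missing."""
    seen = {}
    total = None
    for part in parts:
        seen[part["partno"]] = part["text"]
        if "count" in part:
            total = part["count"]
    if total is None or any(n not in seen for n in range(1, total + 1)):
        return None
    return [seen[n] for n in range(1, total + 1)]
```

The receiver tolerates reordering and knows the message is complete only
once the counted part and all earlier part numbers have arrived.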

32. Section 6.2.  The above suggestions would mean that you can't sign
the whole message, only parts.  You suggest that signing all parts is
not as safe as signing the whole message.  Why?  We know exactly the
message to which each part belongs and this information is signed,
right?

33. General. What do we do with non-conforming messages?  Do we want to
recommend that collectors/relay agent fire some diagnostic message which
embeds the offending message?

34. Do we want to introduce more standard parameters? Good candidates
are "facility" and "severity".  Yes, this will duplicate information,
but we can make them optional.  At least this will overcome the problem
of syslog servers only storing the message and not the PRI field which
leads to then not knowing what facility or severity the message had if
you store multiple facilities/severities in the same log file.

I did not review section 7 and beyond yet.  It seems a lot of it is
identical to old RFC.

Thanks,
Anton.


> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Rainer Gerhards
> Sent: Wednesday, January 21, 2004 3:53 AM
> To: [EMAIL PROTECTED]
> Subject: syslog-protocol-01 posted & comments
>
>
> Hi WG,
>
> as you may have seen, the draft editor has posted protocol-01:
>
> http://www.ietf.org/internet-drafts/draft-ietf-syslog-protocol-01.txt
>
> First things first: this was a "quick" edit (while not as
> quick as I hoped... ;)). My main objective was to get out
> some text as quickly as possible. There are probably some
> typos and some other minor inconsistencies. Also, some of the
> descriptions may not be as good as they should be. As the
> format issue was quite controversial, I try to save a little
> bit work by providing ONE POSSIBLE text to handle it. But
> further discussion may go into a different direction. So I
> tried to make it as understandable as possible while not
> putting total finishing efforts into it. Once we have decided
> the final direction, I will either revamp totally or polish
> the current text.
>
> To the content:
>
> #1
> As said on the list, I used Anton's non-XML proposal.
> Weighing all the arguments received, I really think we do
> not actually need XML in syslog, even though syslog is no
> longer just for human review but also for automated
> processes (this in answer to David's question).
>
> After finishing the text, I am far more convinced that the
> simple tagging approach is not only sufficient for
> transport, but it also is a good solution for syslog in
> general. In the long term, it can also be used to define
> payload dictionaries, which may be very useful (should we
> manage to do this;)).
>
> Regarding integrating this into XML-based systems, I think a
> mapping of what I described now and XML is fairly easy, at
> least as long as you assume that the message is originally
> generated by a syslog device and then brought over to some XML system.
>
> #2
> I would also like to draw your attention to the fragmentation
> that I now described. There are some specific implications in
> regard to syslog-sign. I would appreciate if those deeply
> involved in -sign could cross-check that this format could
> actually do the job for -sign.
>
> #3
> The section on fragmentation is currently missing a
> recommendation to use a reliable transport. This will be
> moved in once we have a general consensus.
>
> I appreciate comments on these points as well as -protocol-01
> in general. I am prepared to do a quick re-edit.
>
> Thanks,
> Rainer
>
>
>


