RE: [Syslog] Sec 6.1: Truncation

2006-01-20 Thread Anton Okmianski (aokmians)
I think the suggestion from me and Tom (if I interpret his email correctly) is 
to state that messages can be truncated at the end at an arbitrary point.  We 
would also note that this may result in an invalid UTF-8 encoding, or in a 
changed character.  

I don't think preserving the last UTF-8 character in full even warrants a 
SHOULD. Even if every character that survives truncation is valid, getting 
only some of them may produce the wrong word anyway. 

Anton.  

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Darren Reed
 Sent: Friday, January 20, 2006 8:57 AM
 To: Tom Petch
 Cc: [EMAIL PROTECTED]
 Subject: Re: [Syslog] Sec 6.1: Truncation
 
 
 Is the truncation of a message on a UTF-8 boundary rather 
 than within an extended character something that syslog 
 daemons SHOULD do rather than MUST do ?  (To use the RFC words.)
 
 Darren
 
 ___
 Syslog mailing list
 Syslog@lists.ietf.org
 https://www1.ietf.org/mailman/listinfo/syslog
 



Re: [Syslog] Sec 6.1: Truncation

2006-01-20 Thread Tom Petch
I don't have a strong view on SHOULD v MUST v neither, happy with any of them.
My point was that this is UTF-8, which I see as an unfamiliar technology for
some, perhaps for many.  I would expect the idea that truncating a message can
change its meaning to be obvious to all.  The idea that truncation can lead to
a change within a character (from base + diacritic mark to base) or to an
illegal string (UTF-8 says three octets follow and there are none) is less
familiar, and so worth pointing out.

So, if we truncate any UTF-8 string, then I would like to see a warning of what
the consequences might be. I think it unrealistic to ask for truncation at the
boundary of a composite character, I suspect it unrealistic to ask for more
than a SHOULD to truncate at a UTF-8 boundary, and perhaps even a SHOULD is too
much, but as I said, I don't have a strong view on that aspect of truncation.
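
A minimal Python sketch of the illegal-string case may help here. The sample
text and the cut positions are assumptions chosen for illustration, so that one
cut lands inside a three-octet character; nothing below comes from the draft.

```python
# Sample text (an assumption): U+00EF is 2 octets in UTF-8, U+20AC is 3 octets.
text = "na\u00efve \u20ac5"
data = text.encode("utf-8")          # 11 octets in total


def cut(octets: bytes, limit: int) -> bytes:
    """Arbitrary truncation at the end: keep only the first `limit` octets."""
    return octets[:limit]


def is_valid_utf8(octets: bytes) -> bool:
    """True if the octets decode as UTF-8 without error."""
    try:
        octets.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Cutting after 7 octets ends on a character boundary.
on_boundary = cut(data, 7)
# Cutting after 8 octets keeps only the lead octet of the 3-octet character:
# the lead octet promises two more octets and there are none.
mid_character = cut(data, 8)
```

A receiver that only decodes strictly would reject `mid_character`, while
`on_boundary` decodes but is still a shorter, different message.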

Tom Petch







Re: [Syslog] Sec 6.1: Truncation

2006-01-19 Thread Tom Petch

- Original Message -
From: Rainer Gerhards [EMAIL PROTECTED]
To: Tom Petch [EMAIL PROTECTED]; Darren Reed
[EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Wednesday, January 18, 2006 9:32 AM
Subject: RE: [Syslog] Sec 6.1: Truncation


Tom,

I agree there are some issues with truncation, but I think they are
inherent. We have specified that the message should be truncated at the
end of the message. In the text I proposed, I wanted to make sure that
the message ends with a technically-complete UTF-8 sequence. Based on
Anton's comment, I have to admit I am unsure if there is really benefit
in this. Anyhow, even if there is, I think we should not try to preserve
the proper meaning. If the message is truncated, the end of it is in
doubt. This might also mean a few characters at the end might be wrongly
interpreted due to truncated control characters. I think we should
document it and live with it (but it was important to bring this issue
up so that it can be documented).

Any comments?

Thanks,
Rainer

Rainer,

Yes, I had in mind only to add a sentence which, assuming we end up with
truncation in mid text, points out the two issues: that truncating on an octet
boundary may result in incomplete UTF-8 encodings, and that truncating on the
boundary of a UTF-8 encoding may result in an incomplete composite character (a
brief foray into the UCS website suggests that that is the appropriate term).

Tom Petch
<snip>




Re: [Syslog] Sec 6.1: Truncation

2006-01-17 Thread Tom Petch
- Original Message -
From: Darren Reed [EMAIL PROTECTED]
To: Tom Petch [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Monday, January 16, 2006 10:51 PM
Subject: Re: [Syslog] Sec 6.1: Truncation


  Truncation of UTF-8 is actually slightly worse than has been described.
 
  It is possible to determine from the UTF-8 octets where one coded
  character ends and another begins.  But because Unicode contains
  combining characters, with no limit on how many of these there can
  be, and these modify the meaning of previous or later coded characters,
  it is not possible to determine where one 'symbol' ends.  So truncation
  at a UTF-8 boundary could subtly change the meaning of a message,
  even breach security.  Not something we can guard against
  but should mention.

 The above seems a little confused to me.  How can there be a problem
 if a message is truncated on the boundary of a complex character?

 Darren

I lack the precise terminology.  Unicode includes base characters and modifying
characters, such as diacritic marks, as well as characters that combine the two.
Where the combination exists as a single code point, no problem.  Where it does
not, then what the user would see as a single character is actually sent as
several code points, each separately encoded in UTF-8.  It is fairly easy for a
truncating relay to work out the boundaries of the UTF-8 encodings and so ensure
that each encoded code point is either kept whole or removed whole.  It is much
harder, probably impossible, to work out where any modifying characters belong,
whether they should be removed or left in.  And the character 'o' with a
diacritic mark is not the same as that character without that diacritic mark,
so removing trailing modifying characters changes the meaning, which could be a
security exposure.
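
A short Python illustration of the point about modifying characters. It assumes
'o' followed by U+0308 COMBINING DIAERESIS as the example pair; the variable
names are made up for this sketch.

```python
import unicodedata

precomposed = "\u00f6"   # 'o' with diaeresis as a single code point
combined = "o\u0308"     # 'o' followed by U+0308 COMBINING DIAERESIS

# NFC normalization maps the two-code-point form to the single code point,
# i.e. a user sees the same symbol either way.
same_symbol = unicodedata.normalize("NFC", combined) == precomposed

octets = combined.encode("utf-8")   # 1 octet for 'o', 2 octets for U+0308
cut = octets[:1]                    # truncation on a UTF-8 boundary
survivor = cut.decode("utf-8")      # decodes cleanly, but is now a plain 'o'

# The code point that was dropped is a combining mark (non-zero combining class).
dropped_is_mark = unicodedata.combining(combined[1]) != 0
```

The truncated octets are perfectly valid UTF-8, which is why a check at the
encoding level cannot notice that the symbol changed.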
Tom Petch




Re: [Syslog] Sec 6.1: Truncation

2006-01-16 Thread Darren Reed
 Truncation of UTF-8 is actually slightly worse than has been described.
 
 It is possible to determine from the UTF-8 octets where one coded
 character ends and another begins.  But because Unicode contains
 combining characters, with no limit on how many of these there can
 be, and these modify the meaning of previous or later coded characters,
 it is not possible to determine where one 'symbol' ends.  So truncation
 at a UTF-8 boundary could subtly change the meaning of a message,
 even breach security.  Not something we can guard against
 but should mention.

The above seems a little confused to me.  How can there be a problem
if a message is truncated on the boundary of a complex character?

Darren



Re: [Syslog] Sec 6.1: Truncation

2006-01-13 Thread Tom Petch
Truncation of UTF-8 is actually slightly worse than has been described.

It is possible to determine from the UTF-8 octets where one coded character ends
and another begins.  But because Unicode contains combining characters, with no
limit on how many of these there can be, and these modify the meaning of
previous or later coded characters, it is not possible to determine where one
'symbol' ends.  So truncation at a UTF-8 boundary could subtly change the
meaning of a message, even breach security.  Not something we can guard against
but should mention.

Tom Petch

- Original Message -
From: Rainer Gerhards [EMAIL PROTECTED]
To: Anton Okmianski (aokmians) [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Wednesday, January 11, 2006 11:30 AM
Subject: RE: [Syslog] Sec 6.1: Truncation


Anton and all,

I have now changed section 6.1 to:

###
6.1.  Message Length

   Syslog message size limits are dictated by the syslog transport
   mapping in use.  There is no upper limit per se.  Each transport
   mapping MUST define the minimum required message length support.  Any
   syslog transport mapping MUST support messages of up to and including
   480 octets in length.

   Any syslog receiver MUST be able to accept messages of up to and
   including 480 octets in length.  All receiver implementations SHOULD
   be able to accept messages of up to and including 2048 octets in
   length.  Receivers MAY receive messages larger than 2048 octets in
   length.  If a receiver receives a message with a length larger than
   it supports, the receiver MAY discard the message or truncate the
   payload.

   If a receiver truncates messages, the truncation MUST occur at the
   end of the message.  UTF-8 encoding and STRUCTURED-DATA MUST be kept
   valid during truncation.  SD-ELEMENTs MUST NOT partly be truncated.
   If an SD-ELEMENT is to be truncated, the whole SD-ELEMENT MUST be
   deleted.  If the last SD-ELEMENT of a message is deleted, the
   STRUCTURED-DATA field MUST be changed to NILVALUE.
###

I have explicitly stated that there is no intrinsic upper size limit. I
did this, because we had so much confusion/misunderstanding on that fact
in the past. I've also added some details on truncation. The rest is as
suggested by Anton :)

Please review and comment.

Rainer


RE: [Syslog] Sec 6.1: Truncation

2006-01-12 Thread Anton Okmianski (aokmians)
Bazsi:

I agree truncation does not solve the issue - that's why I don't want to 
over-design it.  

Splitting is an option I'd leave to the application layer running above syslog.
It is not precluded. It is just a matter of another RFC with extra sd-elements. 

If we do fragmentation in syslog transport/protocol, we are re-inventing IP and 
TCP. We went down the path of defining special sd-elements for this once, but 
decided that in most cases increasing the max acceptable message size on 
receivers and using an appropriate transport is a better solution for many 
deployments.  

I'd propose we do truncation as raw octet cut-over, but define an optional 
sd-element for msg-length (in octets?), which people can use to know if there 
was truncation. However, this would mean we would allow a proxy to legitimately 
send malformed messages. Well, a receiver has to handle those somehow coming 
from the original sender anyway. It is not ideal, whichever way you slice it. 
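
A rough Python sketch of how such a length element could let a receiver notice
a raw octet cut. Everything concrete in it is an assumption made for
illustration: the SD-ID `msgLen@32473`, its `octets` parameter, and the
simplification that the element sits directly in front of MSG with no syslog
header. No such element is defined in the draft.

```python
import re
from typing import Optional

# Hypothetical element: [msgLen@32473 octets="N"] followed by one space and MSG.
LEN_ELEMENT = re.compile(rb'\[msgLen@32473 octets="(\d+)"\] ')


def tag_with_length(msg: bytes) -> bytes:
    """Sender side: prefix MSG with the hypothetical length element."""
    return b'[msgLen@32473 octets="%d"] ' % len(msg) + msg


def looks_truncated(received: bytes) -> Optional[bool]:
    """Receiver side: compare the declared MSG length with what arrived.

    Returns None when the element is missing or was itself cut off,
    because then there is nothing to compare against.
    """
    match = LEN_ELEMENT.match(received)
    if match is None:
        return None
    declared = int(match.group(1))
    return len(received) - match.end() < declared


full = tag_with_length("Link down".encode("utf-8"))
```

With this, `looks_truncated(full)` is False and `looks_truncated(full[:-3])` is
True. The check works on raw octets on purpose, since the cut may have left an
incomplete UTF-8 sequence at the end.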

Anton.  

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Balazs Scheidler
 Sent: Thursday, January 12, 2006 5:07 AM
 To: Rainer Gerhards
 Cc: [EMAIL PROTECTED]
 Subject: RE: [Syslog] Sec 6.1: Truncation
 
 On Thu, 2006-01-12 at 10:45 +0100, Rainer Gerhards wrote:
  Anton and all,
  
  You have good points and I have to admit I am still thinking what is
  the best way. I would appreciate if some other WG members could
  express their thoughts...
 
 My pragmatic view is that overly long messages should be 
 split instead of truncated. Of course splitting rules are 
 similar to truncating rules in a sense, but the question of 
 generating the syslog header also comes up, e.g.
 
 <16>Jun 12 13:45:54 host app[12345]: This is a too long message
 
 should become:
 
 <16>Jun 12 13:45:54 host app[12345]: This is a too...
 <16>Jun 12 13:45:54 host app[12345]: long message
 
 This way we don't lose information while still limiting the 
 message size. Of course this will still confuse log analysis 
 applications but it can be solved simply by lifting the 
 message size limit if that is configurable in the syslog application.
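
The splitting idea can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: the function name, the "..." marker and
the word-boundary rule are guesses at what the example intends, the limit is
counted in characters rather than octets, and the PRI is written as <16> on the
assumption that the archive dropped the angle brackets.

```python
import textwrap
from typing import List


def split_message(header: str, text: str, max_len: int) -> List[str]:
    """Split `text` into several messages that all repeat `header`.

    Every part except the last ends in "..." to show that more follows.
    Lengths are counted in characters, which matches octets only for ASCII.
    """
    marker = "..."
    room = max_len - len(header) - len(marker)
    if room <= 0:
        raise ValueError("max_len leaves no room for any message text")
    # textwrap.wrap breaks at whitespace, so words stay intact where possible.
    pieces = textwrap.wrap(text, width=room)
    parts = []
    for index, piece in enumerate(pieces):
        last = index == len(pieces) - 1
        parts.append(header + piece + ("" if last else marker))
    return parts


header = "<16>Jun 12 13:45:54 host app[12345]: "
parts = split_message(header, "This is a too long message", len(header) + 19)
```

Here `parts` holds the two messages from the example, and a text that already
fits comes back as a single message with no marker.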
 
 Maybe we should indicate in an SD-ID that message truncation 
 happened so there would be no ambiguity.
 
 Another question is whether we allow relays to modify the SD-ID 
 part of the message, or whether that must be done by the sender alone.
 
 --
 Bazsi
 
 
 
 



Re: [Syslog] Sec 6.1: Truncation

2006-01-11 Thread Darren Reed
 Anton and all,
 
 I have now changed section 6.1 to:
 
 ###
 6.1.  Message Length

..

Well written and very sensible.

Darren 



Re: [Syslog] Sec 6.1: Truncation

2006-01-11 Thread Balazs Scheidler
On Wed, 2006-01-11 at 22:38 +1100, Darren Reed wrote:
  Anton and all,
  
  I have now changed section 6.1 to:
  
  ###
  6.1.  Message Length
 
 ..
 
 Well written and very sensible.

I like it too :)

-- 
Bazsi




RE: [Syslog] Sec 6.1: Truncation

2006-01-11 Thread Anton Okmianski (aokmians)
Rainer: 

Thanks for the update. Comments below. 

 I have now changed section 6.1 to:
 
 ###
 6.1.  Message Length
 
 Syslog message size limits are dictated by the syslog transport
 mapping in use.  There is no upper limit per se.  

These two sentences are contradictory.  I'd remove the last one.  The maximum 
limit can be dictated by a transport mapping, like in the case of UDP. 

 Each transport mapping MUST define the minimum required message
 length support.  Any syslog transport mapping MUST support messages
 of up to and including 480 octets in length.
 
 Any syslog receiver MUST be able to accept messages of up to and
 including 480 octets in length.  All receiver implementations SHOULD
 be able to accept messages of up to and including 2048 octets in
 length.  Receivers MAY receive messages larger than 2048 octets in
 length.  If a receiver receives a message with a length larger than
 it supports, the receiver MAY discard the message or truncate the
 payload.
 
 If a receiver truncates messages, the truncation MUST occur at the
 end of the message.  UTF-8 encoding and STRUCTURED-DATA MUST be kept
 valid during truncation.  

You need to be clear what you mean by keeping the UTF-8 encoding. Do you mean 
that octets should not be truncated in a way which corrupts the last character 
(which may have multiple octets)?

It is probably possible to detect such corruption by looking at the first bit 
of the last octet and making sure it is not 1, if I recall UTF-8 encoding 
correctly. If it is 1, drop the last octet.  Check the new last one and do the 
same until you find one with first bit set to 0. 

It seems that to ensure this, the receiver would need to be pretty smart. I 
wonder if that is a problem.  Another question is whether validation like 
this is more appropriate at a higher layer, where every UTF-8 character may be 
validated anyway. 
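
For what it is worth, here is a small Python sketch of truncating on a UTF-8
boundary. It is not taken from the draft, and the function name is made up. It
relies on the UTF-8 layout: single-octet characters have the first bit 0,
continuation octets have the form 10xxxxxx, and lead octets of multi-octet
characters start with 11. Testing only the first bit would therefore also strip
complete multi-octet characters at the end, so the sketch looks for
continuation octets instead. It assumes the message was valid UTF-8 before the
cut and that the limit is not negative.

```python
def truncate_utf8(octets: bytes, limit: int) -> bytes:
    """Keep at most `limit` octets without splitting a multi-octet character.

    Assumes `octets` is valid UTF-8 and `limit` >= 0.
    """
    if len(octets) <= limit:
        return octets
    end = limit
    # octets[end] is the first octet that would be dropped.  If it is a
    # continuation octet (10xxxxxx), the cut falls inside a character, so
    # step back until `end` points at the octet that starts that character.
    while end > 0 and (octets[end] & 0xC0) == 0x80:
        end -= 1
    return octets[:end]


# Sample data (an assumption): U+00EF takes 2 octets, U+20AC takes 3 octets.
sample = "na\u00efve \u20ac5".encode("utf-8")
```

For `sample`, a limit of 8 or 9 falls inside the three-octet character and the
result is the first 7 octets; a limit of 10 keeps that character whole. This
says nothing about combining characters, which are a separate problem.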

 SD-ELEMENTs MUST NOT partly be truncated.
 If an SD-ELEMENT is to be truncated, the whole SD-ELEMENT MUST be
 deleted.  If the last SD-ELEMENT of a message is deleted, the
 STRUCTURED-DATA field MUST be changed to NILVALUE.
 ###

I thought the last train of thought was to do a dumb cutover of octets at the 
end. Darren mentioned this is what you will likely get at the application 
layer. Proposed rules (although simpler than before) would still demand quite a 
bit of handling for messages that exceed the max size supported by receiver.  I 
now wonder if implementors would really bother to implement all that logic for 
the case of messages of sizes they are not configured to handle.  

After all the trouble of validating and fixing a message which exceeds the 
normative size for the receiver, all you'd get is a truncated message, which 
will be a well-formed syslog message after truncation, but may not be 
well-formed as far as the consuming application is concerned.  

What do you guys think?

Thanks,
Anton. 

 

RE: [Syslog] Sec 6.1: Truncation

2006-01-09 Thread Rainer Gerhards
 Rainer:
 
 I agree - this is better than a convoluted rule. 
 
 I think we only have any business in defining truncation for 
 relays.  For collectors, we have tried to stay away from 
 describing how messages are stored.  
 
 For relays, I think it would be useful to state that relay 
 can't just drop arbitrary message parts. Your statements 
 about some parts ... are lost may be interpreted that way. 

Actually, this was what I meant ;) [I saw a number of use cases where it
would make sense to strip some known-not-so-relevant SD-IDs], but ...
 
 I would recommend that we state that any truncation must 
 happen at the end of the message, which I think is what 
 truncation means to a lot of people anyway. This would 
 prevent an implementation which prefers to throw out 
 STRUCTURED-DATA before the MSG content.  A consistent 
 behavior is useful for interop and, in particular, may help 
 in dealing with security issues.
  ^^^
... this is more important. I now agree with your point.

As a side-note, we had the idea that relay operations may become a
separate document, so I would prefer not to dig too deep into relay
behaviour. To specify what you recommend, this is not necessary, so this
is not really a discussion topic here.

Rainer 
 
 Thanks,
 Anton. 
 

RE: [Syslog] Sec 6.1: Truncation

2006-01-09 Thread Anton Okmianski (aokmians)
Rainer:

I agree - this is better than a convoluted rule. 

I think we only have any business in defining truncation for relays.  For 
collectors, we have tried to stay away from describing how messages are stored. 
 

For relays, I think it would be useful to state that a relay can't just drop 
arbitrary message parts. Your statement about "some parts ... are lost" may be 
interpreted that way. 

I would recommend that we state that any truncation must happen at the end of 
the message, which I think is what truncation means to a lot of people anyway. 
This would prevent an implementation which prefers to throw out STRUCTURED-DATA 
before the MSG content.  A consistent behavior is useful for interop and, in 
particular, may help in dealing with security issues. 

Thanks,
Anton. 

 -Original Message-
 From: Rainer Gerhards [mailto:[EMAIL PROTECTED] 
 Sent: Monday, January 09, 2006 3:21 AM
 To: Anton Okmianski (aokmians)
 Subject: RE: [Syslog] Sec 6.1: Truncation
 
 Anton, Darren,
 
 I agree that the truncation rule is probably not really 
 useful, even confusing. I think it is hard to predict for any 
 potential message if the more interesting content is in 
 STRUCTURED-DATA or in the MSG part.
 For example, with our current SD-IDs, I'd prefer to truncate 
 them instead of MSG. Obviously, the case is different for 
 your LINKDOWN sample. I also agree with Darren that 
 truncation probably happens on the transport layer, the 
 application may not even see the full message.
 
 My conclusion, however, is slightly different: I recommend 
 now that we remove truncation rules from -protocol. We should 
 just say that truncation might happen and that in this case 
 some parts of the message are lost - which parts is at the 
 discretion of the receiver.
 
 Rainer
 
  -Original Message-
  From: [EMAIL PROTECTED]
  [mailto:[EMAIL PROTECTED] On Behalf Of Anton Okmianski 
  (aokmians)
  Sent: Friday, January 06, 2006 9:48 PM
  To: [EMAIL PROTECTED]
  Subject: [Syslog] Sec 6.1: Truncation
  
  Rainer and all:
  
  I started reading draft #16. Since we are revisiting everything... I
  am not very comfortable with the current truncation rules.
  
  Receivers SHOULD follow this order of preference when it comes to
  truncation:
  
   1) No truncation
   2) Truncation by dropping SD-ELEMENTs
   3) If 2) not sufficient, truncate MSG
  
  I don't think that this is a good recommendation.  I would assume that
  in many cases people would try to tokenize more important data into
  structured data and use message text as a secondary user-friendly
  description. For example, for LINK_DOWN message, I would probably
  encode link ID in the structured elements as this is something that
  should be readily available for receivers. The MSGID could be
  LINK_DOWN and the MSG text may simply be Link down.  If you
  truncate the structured data, it makes it harder.
  
  I also think, in general it is useful to put more important data
  first, which is another reason for putting more valuable data into
  structured data in a more compact way.
  
  Additionally, structured data can be used to provide message length or
  digest, which can help receiver to determine if message was truncated.
  
  Also, I think this statement is very convoluted:
  
  Please note that it is possible that the MSG field is truncated
  without dropping any SD-PARAMS.  This is the case if a message with an
  empty STRUCTURED-DATA field must be truncated.
  
  I think I understand what you are driving at, but I don't see it as
  adding any requirements or clarification.
  
  This sentence is not clear although I know what you are trying to say:
  
  The limits below are minimum maximum lengths, not maximum length.
  
  I propose replacing the entire section 6.1 with this text:
  
  Syslog message limits are dictated by the syslog transport mapping in
  use. Each transport mapping MUST define the minimum required message
  length support. Any syslog transport mapping MUST support messages of
  up to and including 480 octets in length.
  
  Any syslog receiver MUST be able to accept messages of up to and
  including 480 octets in length.  All receiver implementations SHOULD
  be able to accept messages of up to and including 2048 octets in
  length. Receivers MAY receive messages larger than 2048 octets in
  length. If a receiver receives a message with a length larger than it
  supports, the receiver MAY discard the message or truncate the
  payload.
  
  If truncation is performed by the receiver, it MUST first truncate the
  MSG field as necessary to meet the supported length limit. If
  truncation of the entire MSG field is not sufficient, then
  additionally, the STRUCTURED-DATA field MUST be truncated by removing
  one or more SD-ELEMENT fields. A minimum number of SD-ELEMENT fields
  MUST be truncated starting from the end as necessary to meet the
  supported length limit. SD-ELEMENT field

Re: [Syslog] Sec 6.1: Truncation

2006-01-07 Thread Darren Reed
 Rainer and all:
..
 Receivers SHOULD follow this order of preference when it comes to truncation:
 
  1) No truncation
  2) Truncation by dropping SD-ELEMENTs
  3) If 2) not sufficient, truncate MSG

 I don't think that this is a good recommendation.

Is this likely to be how people implement truncation?

I'm more inclined to believe that truncation will happen when the
incoming message is too big for a buffer: you read in data until you
reach the end of the buffer, and drop whatever comes after that.

The above requirement forces everyone to accept all maximum length
messages before deciding one is too big.
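
A sketch of that buffer-driven behaviour in Python, to show why it never needs
to hold the whole message. The function name, the 512-octet drain size and the
idea of one message per stream are assumptions made for the illustration, and
an in-memory stream stands in for the transport.

```python
import io
from typing import BinaryIO, Tuple


def read_truncated(stream: BinaryIO, buf_size: int) -> Tuple[bytes, int]:
    """Keep the first `buf_size` octets of a message and discard the rest.

    Returns the kept octets and the number of octets that were dropped.
    Assumes the stream carries exactly one message and ends after it.
    """
    kept = stream.read(buf_size)
    dropped = 0
    while True:
        extra = stream.read(512)   # drain in small pieces, never stored
        if not extra:
            break
        dropped += len(extra)
    return kept, dropped


# A 3000-octet message arriving at a receiver with a 2048-octet buffer.
kept, dropped = read_truncated(io.BytesIO(b"x" * 3000), 2048)
```

Nothing here looks at SD-ELEMENTs or UTF-8; the cut simply lands wherever the
buffer ends, and memory use is bounded by the buffer size rather than by the
message size.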

 I also think, in general it is useful to put more important data first,

Agreed.

Darren
