Hi Chris, (re: CCing IDR & GROW)
On 28 Dec 2012, at 12:28, Chris Hall wrote:

> Rob Shakir wrote (on Thu 27-Dec-2012 at 18:44):
>>
>> Any comments very welcome (to me or grow@).
>
> I'm afraid I still don't get it :-( What am I missing ?
>
> UPDATE Message Length errors are Critical because they (1) "result in
> cases whereby the NLRI attribute cannot be correctly extracted".
>
> The implication is that a failure to extract all NLRI is Critical. Is
> that a requirement ?

If the NLRI cannot be determined, then this is a Critical error, yes. I
left the wording relatively open on whether this is *all* NLRI, as I am
not sure that the requirements draft should specify direct solutions to
specific issues, for example how to handle cases where MP_REACH_NLRI and
MP_UNREACH_NLRI are in the same message [this is a case that I do not
believe is forbidden by RFC 2858 - if the working group could clarify
whether this is something that we feel the draft needs to handle or can
explicitly be omitted, then that would be appreciated].

>
> Later:
>
> (2) "All errors whereby the contained NLRI can be
> extracted are referred to as Non-Critical".
>
> And that includes:
>
> (3) "where the length of all path attributes contained
> within the UPDATE does not correspond to the
> total path attribute length."
>
> That is, at least, more explicit than
> draft-ietf-idr-error-handling-03, which glosses over (3).
>
> But if (3) is non-critical then there is some chance that some NLRI
> will not be extracted, which appears to violate (1) and (2).

Disclaimer: As I am sure that my previous comments have made clear, I do
not maintain a code base for a BGP daemon/implementation - so please
feel free to correct my logic below.

I do not believe that (3) implies that the NLRI cannot be correctly found.
If the total path attribute length is incorrect, then we can still
extract the individual attributes - we just find that there is not
enough data to fill the overall length we were told and/or we have too
much attribute data compared to the total attribute length. In the case
where the NLRI attribute itself has a length error, then this is a
Critical error (based on the "Errors parsing the NLRI attribute of an
UPDATE message" definition of Critical error), and a similar Critical
error occurs where the Total Path Attributes + Withdrawn Routes lengths
do not match the total UPDATE message length.

Either way -- again, I would say that this is something that we need to
put text together for in draft-ietf-idr-error-handling rather than in
the requirements document (this sounds like a solution, and the
requirement does not have a SHOULD or MUST here, it is an "it is
expected that…" comment).

> Then (4) "In order to maximise the number of cases whereby the NLRI
> attributes [plural, now, BTW] can be reliably extracted from a
> received message...". Ah. So it is not a Critical Error if "the NLRI
> attribute cannot be correctly extracted".

No - it is a Critical error if we cannot extract the NLRI. This
recommendation is to give an increased chance that the NLRI can be
extracted as per the IDR error handling draft. This then (by virtue of
resulting in the NLRI being extracted) minimises the number of cases
that result in a Critical error. The plural here reflects the existence
of more than one type of NLRI attribute.

> For me the requirement remains "conflicted". On the one hand it seems
> to say that it is a Critical Error if the NLRI cannot be extracted and
> parsed. On the other it seems to say it's OK if you cannot extract
> some NLRI.

If you'll forgive me for removing a significant proportion of your
message, I think that we need to take another step back here.

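As an illustration of the length checks discussed above, the following
is a minimal Python sketch of one possible reading of the
Critical/Non-Critical split. It is not text from either draft. The
names classify_update and prefixes_ok are hypothetical, only the IPv4
NLRI and Withdrawn Routes fields are checked, the contents of
MP_REACH_NLRI and MP_UNREACH_NLRI are not parsed, and treating a framing
error in a non-NLRI attribute as Non-Critical is an assumption drawn
from this thread.

```python
NLRI_ATTRS = {14, 15}    # MP_REACH_NLRI, MP_UNREACH_NLRI type codes
EXTENDED_LENGTH = 0x10   # attribute flag: length field is two octets


def prefixes_ok(data: bytes) -> bool:
    """True if data is a well-formed run of IPv4 <length, prefix> pairs."""
    i = 0
    while i < len(data):
        bits = data[i]
        nbytes = (bits + 7) // 8
        if bits > 32 or i + 1 + nbytes > len(data):
            return False
        i += 1 + nbytes
    return True


def classify_update(body: bytes) -> str:
    """Classify an UPDATE body (the message without its 19-octet header)."""
    if len(body) < 4:
        return "critical"
    wd_len = int.from_bytes(body[0:2], "big")
    if 2 + wd_len + 2 > len(body):
        return "critical"
    attr_start = 2 + wd_len + 2
    attr_len = int.from_bytes(body[attr_start - 2:attr_start], "big")
    attr_end = attr_start + attr_len
    if attr_end > len(body):
        return "critical"  # the NLRI field cannot be located at all

    # The NLRI is delimited by the two length fields, so it can be
    # checked even if the attributes in between turn out to be misframed.
    if not prefixes_ok(body[2:2 + wd_len]) or not prefixes_ok(body[attr_end:]):
        return "critical"

    i = attr_start
    while i < attr_end:
        remaining = attr_end - i
        if remaining < 3:
            return "non-critical"  # stray octets, too short for an attribute
        flags, type_code = body[i], body[i + 1]
        header = 4 if flags & EXTENDED_LENGTH else 3
        fits = header <= remaining
        if fits:
            length = int.from_bytes(body[i + 2:i + header], "big")
            fits = header + length <= remaining
        if not fits:
            # Attribute overruns the Total Path Attribute Length.
            return "critical" if type_code in NLRI_ATTRS else "non-critical"
        i += header + length
    return "ok"
```

In this sketch an ORIGIN attribute that overruns the Total Path
Attribute Length yields "non-critical" because the trailing NLRI is
still delimited, whereas the same overrun in MP_REACH_NLRI yields
"critical". How confident an implementation can be in those delimiters
is exactly the open question in this thread.
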
It seems to me that the key question that you are highlighting is "What
level of confidence do we need to have before we declare that the NLRI
cannot be extracted?" -- do you agree?

From an operator perspective, I would like to compromise *certainty* in
favour of *robustness*. You are right, we are compromising correctness
here: we might end up withdrawing an incorrect NLRI and impacting
service operation for that prefix. However, it is somewhat preferable to
me to withdraw a subset of the NLRI incorrectly, rather than impact all
NLRI in one single action. We clearly need to provide some bounds on how
much we compromise the certainty (and live within the realms of
possibility, such that we are not just taking a shot in the dark). This
is what the definitions of Critical and Non-Critical within the document
are intended to provide. Once again I will refer to the requirement that
there is a balance between correctness and robustness, rather than a
locally risk-averse approach that results in harmful wider behaviour.

Is it acceptable that we leave this as guidance within the requirements?
If not, please could you suggest how the definitions of
Critical/Non-Critical could be altered to address your concerns?

I would also appreciate further input from IDR as to whether this is a
sufficient requirement from GROW to allow a solution document to be
written.

Many thanks,
r.

_______________________________________________
GROW mailing list
https://www.ietf.org/mailman/listinfo/grow
