[quagga-dev 14846] Re: [PATCH 00/89] Cumulus Take-3 Electric Bugaloooooooooooooooo

Paul Jakma Tue, 08 Mar 2016 04:57:46 -0800

A quick comment, just on some patches in there that I'm getting a senseof deja vu on mostly. ;) Non-exhaustive.


On Fri, 11 Dec 2015, Donald Sharp wrote:

 ospfd: Ensure deletion is clean
 ospfd: Ensure that OSPF can send out multiple packets per thread
 ospfd: Limit the number of interfaces serviced at one time

The first, I don't understand the point of - where is it possible thatoi can be valid, but not have an address? (What is the problem?)

The latter two need more justification. Note that if we start batchingpacket writes, it becomes more difficult to interleave anything (otherthings, or prioritising some kind of packets).

Again, what's the motivation for this? What was the original problem?Was it hellos being blocked? Maybe then it'd be simpler to just puthellos at the top of the queue instead, or just send them direct fromthe timer?


To expand on the motivation thing:

The commit message should tell the story of why anyone should care aboutthis patch, and not dismiss it as churn. It should sell the patch:


 "Here's this terrible problem the original code has! <ideally, back
  this up with clear examples and even measurements if available>

  Here's the high-level solution!

  Here's a description of the implementation of that solution, how it
  was intended at least!

  (ideally) Here's the evidence that it works! <examples and
  measurements, where possible, ideally>"

The more complex a patch, the more important it is that commit messagefollows a structured:


 "Problem

  Solution

  Implementation

  Results"

form, and the more important it is to give details.

Giving a description of the "Implementation" is particularly useful tosignal to reviewers "I've actually re-read my patch carefully, sothere's a better chance that I've already caught any silly mistakes", aswell as provide a view of what the contributor /thought/ theyimplemented - so reviewers can better catch any other silly mistakesthat got through.

Yeah, I know, everyone hates writing commit messages - doesn't the codespeak for itself? However, if we want to achieve throughput in gettingpatches in, then we have to start offloading work from reviewers andonto contributors. So, instead of:


"One line vague commit message"

And reviewers having to pick through a non-trivial patch and trying tofigure it out for themselves, contributors instead need to startspending more time providing the "sales pitch" (problem, solution) andthe "map" ("here's exactly how the solution was intended to beimplemented") so that the reviewer doesn't have to spend as much timementally constructing the same.


*Contributors are only shooting themselves in the foot when they give
 vague and/or terse messages.*

 zebra: Add internal support for route tags
 bgpd, lib, ospfd, zebra: Add ability to read/write tag value
 bgpd, vtysh: Add support for route tags
 ospfd, vtysh: Add support for Route tags
 ripd, ripngd: add support for route tags
 zebra: add support for route tags

Should this start by adding Piotr's patch first and then building on?Also, does this reflect the comments on that patch set when submittedagain last year?

 ospfd: 16.0 rfc2328 compliance


What was the motivation for this?

RFC3137 and 6987 both allow for routers to treat "infinite" cost linksas infinite, and the intent of the stub-router support definitely is towholly prevent any infinite cost link being used for transit. Quagga isaggressive on that, but not incorrect in terms of those RFCs.

Was the motivation keeping the router accessible, even if the managementIP was also OSPF routed?

The commit message has slightly condescending language that probablycould have been avoided ("pretty simple huh"), and also is confused:


"   In this case, the router will continue
    to forward any traffic sent to it while these "max-metric" LSAs are
    propagated through the network; at which point, the router can be
    taken out of service."

This is irrelevant to that patch. The existing code allows for exactlythis, because it allows the local router to still calculate paths out,and so continue to route any packets received out to and via othernon-max-metric routers.

Further, the spf_next stage of SPF is used to construct just the transitSPF tree. In this, the link in a router-LSA is only used to calculatethe cost to *leave* a vertex V to reach the next candidate W. I.e.,ignoring a max-metric link out of V should not affect reachability to Vitself (i.e. it's router-ID). So, ignoring max-metric links was a verydeliberately chosen way of implementing the "don't transit astub-router" part of RFC3137 (and still in 6987) while still keeping thestub-router itself reachable.

The proposed patch opens up the possibility that ospfd /would/ be ableto start using max-metric routers for transit routes - which, to me,definitely is *not* what the stub-router RFCs intend.

I've queried this patch a few times now, and not heard an explanationback as to what the problem is that it's trying to solve. So, can wehave that explanation and discussion, or can you ensure this patch isnot re-submitted? :)

(E.g., it could be that what you really want is to set the metric to0xfffe... or that something went wrong elsewhere in adding the router -depends on what problem Cumulus saw that led to that patch; if therewasn't an observed problem, then I wonder if it was due to someone beingover-hasty in thinking they saw incorrect code, but not understandingSPF.. (which is not exactly trivial - not just in the code, but evenjust the RFC)).

 bgpd: Add maxmed command

MED is a terminally broken and unfixable "metric" (note: it fails tomeet any normal definition of metric), which can easily causepathological behaviours in iBGP. What BGP operator in their right mind,once fully informed, would want to allow any neighbour ASes to injectsuch a broken and dangerous metric into their iBGP?

My view then would be that MED needs to be deprecated from BGP - asleast its classic use. So should we really be encouraging further use ofit?

bgpd: cluster-id length equality for multipath

We had a discussion on that one already. The sentiment is really good,and I'd even lift the cluster-list check up a lot higher, but it's a bitdangerous on its own.

I've had a draft on something similar years ago, but I didn't think Icould put it into Quagga...


regards,
--
Paul Jakma      [email protected]  @pjakma Key ID: 64A2FF6A
Fortune:
Force it!!!
If it breaks, well, it wasn't working anyway...
No, don't force it, get a bigger hammer.

_______________________________________________
Quagga-dev mailing list
[email protected]
https://lists.quagga.net/mailman/listinfo/quagga-dev

[quagga-dev 14846] Re: [PATCH 00/89] Cumulus Take-3 Electric Bugaloooooooooooooooo

Reply via email to