Re: RPKI invalid logs?

2021-02-20 Thread Job Snijders via NANOG
Dear Hank,

On Sat, Feb 20, 2021 at 07:37:08PM +0200, Hank Nussbacher wrote:
> Is there a place where one can examine RPKI invalid logs for a specific date
> & time 

I have set up a publicly accessible archiver instance in Dallas, and one
in Amsterdam which capture and archive data every 20 minutes.

Please visit http://www.rpkiviews.org/ for access to downloadable archives.

> or even better logs showing those that dropped RPKI invalid
> announcements?

You can extract the rpki-client.json file from the archive for the
timestamp you are interested in, and pass it as cache file to
https://github.com/job/rpki-ov-checker, and via STDIN feed it a list of
Prefix + OriginAS combos (sourced from MRT data or your internal
administration / expectations).
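
For readers who want to see what such a check boils down to, here is a
minimal Python sketch of origin validation in the style of RFC 6811. It is
not the rpki-ov-checker code. The JSON layout (a "roas" list with "asn",
"prefix" and "maxLength" fields), the sample data and the "prefix origin"
input line format are assumptions made for illustration.

```python
import ipaddress
import json

# Hypothetical excerpt; a real rpki-client.json holds many more entries.
SAMPLE_JSON = """
{"roas": [
  {"asn": 64496, "prefix": "192.0.2.0/24", "maxLength": 24, "ta": "example"},
  {"asn": "AS64497", "prefix": "2001:db8::/32", "maxLength": 48, "ta": "example"}
]}
"""


def load_vrps(json_text):
    """Parse the JSON into (network, max_length, asn) tuples."""
    vrps = []
    for roa in json.loads(json_text)["roas"]:
        # Assumption: "asn" may be an integer or a string such as "AS64497".
        asn = int(str(roa["asn"]).upper().replace("AS", ""))
        vrps.append((ipaddress.ip_network(roa["prefix"]), int(roa["maxLength"]), asn))
    return vrps


def validate(prefix, origin_asn, vrps):
    """Return 'valid', 'invalid' or 'not-found' for one Prefix + OriginAS combo."""
    net = ipaddress.ip_network(prefix)
    covered = False
    for vrp_net, max_len, vrp_asn in vrps:
        # Skip entries of the other address family or that do not cover the prefix.
        if vrp_net.version != net.version or not net.subnet_of(vrp_net):
            continue
        covered = True
        # AS 0 entries never make an announcement valid.
        if vrp_asn != 0 and vrp_asn == origin_asn and net.prefixlen <= max_len:
            return "valid"
    return "invalid" if covered else "not-found"


def check_lines(lines, vrps):
    """Validate lines of the assumed form '192.0.2.0/24 AS64496'."""
    results = []
    for line in lines:
        if not line.strip():
            continue
        prefix, origin = line.split()
        asn = int(origin.upper().replace("AS", ""))
        results.append((prefix, asn, validate(prefix, asn, vrps)))
    return results


if __name__ == "__main__":
    combos = ["192.0.2.0/24 AS64496", "192.0.2.0/25 AS64496", "198.51.100.0/24 AS64496"]
    for row in check_lines(combos, load_vrps(SAMPLE_JSON)):
        print(*row)
```

Running the same list of combos against archives from two different
timestamps should show when a given announcement became invalid.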

If you like this service, please consider making a server in Israel
available to rpkiviews.org. All that is required is a
POSIX.1-ish-compliant server (BSD, Linux, or UNIX), and about 6
terabytes of storage (should be good for next 3 years), and a globally
unique publicly reachable IP address. You pick the hostname.

Kind regards,

Job


Re: Famous operational issues

2021-02-16 Thread Job Snijders via NANOG
On Tue, Feb 16, 2021 at 01:37:35PM -0600, John Kristoff wrote:
> I'd like to start a thread about the most famous and widespread Internet
> operational issues, outages or implementation incompatibilities you
> have seen.
> 
> Which examples would make up your top three?

This was a fantastic outage, one could really feel the tremors into the
far corners of the BGP default-free zone:

https://labs.ripe.net/Members/erik/ripe-ncc-and-duke-university-bgp-experiment/

The experiment triggered a bug in some Cisco router models: affected
Ciscos would corrupt this specific BGP announcement ** ON OUTBOUND **.
Any peers of such Ciscos receiving this BGP update, would (according to
then current RFCs) consider the BGP UPDATE corrupted, and would
subsequently tear down the BGP sessions with the Ciscos. Because the
corruption was not detected by the Ciscos themselves, whenever the
sessions would come back online again they'd reannounce the corrupted
update, causing a session tear down. Bounce ... Bounce ... Bounce ... at
global scale in both IBGP and EBGP! :-)

Luckily the industry took these and many other lessons to heart: in
2015 the IETF published RFC 7606 ("Revised Error Handling for BGP UPDATE
Messages") which specifies far more robust behaviour for BGP speakers.
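
To make the difference concrete, here is a toy Python model of the bounce
loop. It is a sketch under simplifying assumptions, not router code: the
prefixes are documentation examples, each round stands for one attempt by
the peer to send its routes, and "treat-as-withdraw" is reduced to dropping
the one affected route.

```python
def simulate(rounds, revised_error_handling):
    """Count session resets while a peer keeps sending one corrupted UPDATE."""
    resets = 0
    rib = set()  # routes currently accepted from the peer
    for _ in range(rounds):
        # Each round the peer sends one well-formed and one corrupted UPDATE.
        updates = [("198.51.100.0/24", True), ("192.0.2.0/24", False)]
        for prefix, well_formed in updates:
            if well_formed:
                rib.add(prefix)
            elif revised_error_handling:
                # Revised handling: withdraw the affected route, keep the session.
                rib.discard(prefix)
            else:
                # Old handling: tear down the session and lose every route.
                rib.clear()
                resets += 1
                break
    return resets, rib


if __name__ == "__main__":
    print("old handling:    ", simulate(5, revised_error_handling=False))
    print("revised handling:", simulate(5, revised_error_handling=True))
```

With the old handling every round ends in a reset and an empty table; with
the revised handling the session stays up and only the corrupted route is
missing.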

Kind regards,

Job


Re: Problems with newish IP block assignment issues from ARIN

2021-02-08 Thread Job Snijders via NANOG
On Mon, Feb 08, 2021 at 04:02:14PM -0500, Justin Wilson (Lists) wrote:
> I enabled 134.195.47.1 on one of our routers.

Cool! I noticed the following: from many NLNOG RING nodes I can reach
that IP address, but not from 195.66.134.42:

deepmedia01.ring.nlnog.net:~$ mtr -z -w -r 134.195.47.1
Start: 2021-02-08T21:19:32+
HOST: deepmedia01.ring.nlnog.net                 Loss%   Snt   Last   Avg  Best  Wrst StDev
  1. AS39022  vlan100.ccr-1.gs.as39022.net        0.0%    10    0.5   0.5   0.4   0.5   0.1
  2. AS???    speed-ix.he.net                     0.0%    10    0.8   1.0   0.7   2.5   0.5
  3. AS6939   100ge16-1.core1.lon2.he.net         0.0%    10    6.8   7.0   6.7   8.1   0.5
  4. AS6939   100ge4-1.core1.nyc4.he.net          0.0%    10   83.7  77.7  72.5  93.8   8.4
  5. AS6939   ve951.core2.nyc4.he.net             0.0%    10   73.0  73.0  72.6  74.9   0.7
  6. AS6939   100ge0-31.core2.cmh1.he.net         0.0%    10   85.7  86.4  85.6  88.7   1.1
  7. AS6939   100ge9-2.core1.ind1.he.net          0.0%    10   93.4  93.4  93.2  94.6   0.4
  8. AS6939   184.105.30.134                      0.0%    10   93.0  93.1  92.9  93.3   0.1
  9. AS???    ???                                100.0    10    0.0   0.0   0.0   0.0   0.0

Do you have a BGP route for 195.66.134.0/23 on the router with
134.195.47.1 ?

Do you have a traceroute towards 195.66.134.42?

Kind regards,

Job


Re: Problems with newish IP block assignment issues from ARIN

2021-02-08 Thread Job Snijders via NANOG
Dear Justin,

On Mon, Feb 08, 2021 at 03:14:47PM -0500, Justin Wilson (Lists) wrote:
> It acts like the IP block was blacklisted at some point and got on
> some bad lists but I don’t want ti limit myself to that theory.
> I have opened up a ticket with ARIN asking for any guidance. Has
> anyone ran into this with new space assigned? Any tools, sites, etc. I
> can use to do further troubleshooting.  

Here are some useful tools:

ping.pe
example: http://ping.pe/www.openbsd.org

https://ring.nlnog.net/
good introduction here: 
https://labs.ripe.net/Members/martin_pels_3/10-years-of-nlnog-ring

https://atlas.ripe.net/

> The block in question is 134.195.44.0/22. 

Is there any specific IP address in the range that should always respond
to ICMP Echo Requests? This will help others see if they can reach you
or not.

> It has been RPKI certified and has IRR entries.

Indeed, nice :-) http://irrexplorer.nlnog.net/search/134.195.44.0/22

Kind regards,

Job


Re: Issues with NANOG mailing list operations and subscriptions

2021-01-18 Thread Job Snijders
Hi Sean, Will, group,

On Sun, Jan 17, 2021 at 03:01:22PM -0800, William Herrin wrote:
> On Sun, Jan 17, 2021 at 1:37 PM Sean Donelan  wrote:
> > Some people think its funny to ghost subscribe email addresses, and
> > the NANOG mailing list auomation doesn't catch them in the verification
> > process.
> 
> Hi Sean,
> 
> How is that possible? This is exactly what a correctly implemented
> confirmed opt-in procedure is designed to prevent.

Someone was kind enough to share a plausible theory outlining a
potentially fatal flaw in the current verification process. Some
auto-responders are (by accident) able to respond in such a way they end
up positively passing the automated verification process. 

I've shared some suggestions with the volunteers & nanog staff (who run
this mailing list) to improve the subscription verification process.
I am optimistic that in the near future we'll be able to prevent this type of
situation from happening again.

Kind regards,

Job

ps. I'm one of the volunteers who support the NANOG staff in a technical
capacity to operate this mailman instance.


Re: what is the policy about sharing email offlist?

2021-01-18 Thread Job Snijders
Dear all,

On Mon, Jan 18, 2021 at 11:17:06AM -0700, Anne P. Mitchell, Esq. wrote:
> Either Alexandria Ocasio-Cortez' office is on the NANOG list or
> someone is forwarding NANOG email to AOC's press office (in which case
> either spoofed as the original sender or AOC's office sends an ack to
> every email address it can find)..as I received this auto-ack in
> response to my email to to the list.
> 
> Anyone have any insight into this?

The NANOG Mailing-list usage guidelines prohibit auto-response postings
to the list or to other users: https://www.nanog.org/resources/usage-guidelines/

The mailing list admin team (adm...@nanog.org) can take action to
prevent auto-responders from decreasing the signal-to-noise ratio. As multiple
users have now made a report about an auto-responder, I'm sure the issue
will be resolved soon.

> And either way, what is the policy about forwarding list email to
> someone who is not on the list?

The NANOG mailing list is a so-called 'public' mailing list. Emails sent
to the list are distributed to all subscribed email addresses, and in
addition to this distribution, list postings are archived at a publicly
accessible location for anyone preferring to use a web browser:
https://mailman.nanog.org/pipermail/nanog/

It is possible for users to distribute NANOG mailing list postings
through other means, such as additional forwarding and archiving
(example: https://marc.info/?l=nanog)

The NANOG mailing list AUP does not prohibit forwarding of emails which
were sent to the NANOG list to people not on the list.

Kind regards,

Job


Fw: [lacnog] Update on LACNIC's IRR: Near-Real-Time Mirroring Now Available

2020-11-24 Thread Job Snijders
FYI

- Forwarded message from "Carlos M. Martinez"  -

Date: Mon, 23 Nov 2020 16:18:44 -0300
From: "Carlos M. Martinez" 
To: Latin America and Caribbean Region Network Operators Group

Subject: [lacnog] Update on LACNIC's IRR: Near-Real-Time Mirroring Now
Available

Colleagues,

We are pleased to inform you that LACNIC's Internet Routing Registry (IRR) which
came into production earlier this year now supports Near-Real-Time Mirroring
(NRTM) using IRRd Version 4.

The mirroring policy is an open policy and those organizations that wish to
mirror the information contained in LACNIC's IRR can do so.

To set up a mirror, you will need the following information:

NOC: n...@lacnic.net
Dump: ftp://irr.lacnic.net/lacnic.db.gz / http://irr.lacnic.net/lacnic.db.gz
CURRENT SERIAL: ftp://irr.lacnic.net/LACNIC.CURRENTSERIAL
http://irr.lacnic.net/LACNIC.CURRENTSERIAL
NRTM Host: irr.lacnic.net
NRTM Port: 43

When LACNIC enables NRTM in the coming days, other IRRs such as RADB and NTT
will begin mirroring the LACNIC source.

We would also like to thank the DashCare team (https://dashcare.nl/), Job
Snijders (NTT) and the RADB team for their support.

Should you have any questions or doubts, please don't hesitate to contact us
at tecnolo...@lacnic.net/

The LACNIC Team

___
LACNOG mailing list
lac...@lacnic.net
https://mail.lacnic.net/mailman/listinfo/lacnog
Cancelar suscripcion: https://mail.lacnic.net/mailman/options/lacnog


- End forwarded message -


Re: inspecting RPKI data: console.rpki-client.org

2020-11-20 Thread Job Snijders
On Fri, Nov 20, 2020 at 12:02:04PM -0500, Tom Beecher wrote:
> In before snark of "OMG "http" links to RPKI info HURF BLURF!"

But Tom, that is exactly the whole point of the RPKI :-)

It's funny, but true! You really can safely use the RPKI data from the
console website in your own production environment, even after it has
been transported via mere HTTP - provided you have the TAL files to
build the chain of trust.

This also applies to the console's HTML itself: if you have the
TAL files + rpki-client + rsync + the openssl cli utility + ksh + perl;
you can generate any of the pages yourself and thus confirm their
authenticity and integrity.

Of course I don't expect anyone to jump through those hoops, but the
source code is here: https://github.com/job/console.rpki-client.org

I'll concede HTTPS does provide some privacy while looking at these
gorgeous ASN.1 data structures ;-)

Kind regards,

Job


inspecting RPKI data: console.rpki-client.org

2020-11-20 Thread Job Snijders
Dear all,

I'd like to introduce another tool to inspect RPKI data... the
rpki-client console! Comes with an authentic 90s look & feel :-)

The Frontpage - http://console.rpki-client.org/
---
On the front page you can see stdout + stderr of the most recent
rpki-client run. The log shows which publication points were contacted
and prints any issues encountered with specific RPKI files.

Those of us publishing RPKI data should keep an eye out not to show up
in this type of log with warnings or errors. For example:

rpki-client: cc.rg.net/rpki/RGnet-cc/1opByAd8x8R2F-SzstgaYzVXK8Q.mft: mft expired on Oct 12 17:58:45 2020 GMT

However, the above line might be the result of some kind of experiment someone 
is conducting :-)

The RPKI distributed database currently is more than 120,000 (!)
certificate/roa/manifest files, and only a handful of files have some
kind of completeness or expiration date issue. Good job everyone! :-)

The ASN specific pages - http://console.rpki-client.org/AS2914.html
---
You can substitute the 'AS2914' portion in the URL for any ASN to see
which .roa files reference the given ASN. Another example, here one can
see all ROAs which authorize AS 8283 as origin: 
https://console.rpki-client.org/AS8283.html
If you encounter a HTTP 404 error, no ROAs reference the ASN. 

On the 'per ASN page' you can click the .roa files on the left
side to inspect the ROA. Each object in the RPKI has a unique Subject
Key Identifier (SKI). An example of a SKI is this hexadecimal identifier
'06:96:B3:F7:CC:AD:55:45:A5:3A:64:32:31:2B:7F:E1:2B:7A:15:22' which
maps to a filename like 
'rpki.apnic.net/member_repository/A91A4C60/B526FF74D84111E9A4521413C4F9AE02/12F0D72E7BC111EA8503D815C4F9AE02.roa'

Yeah... compared to DNS names mapping to IPv6 addresses, in the RPKI
neither the path name nor the SKI are easy to remember :-)

The console can show that .roa file in human readable format, just
append .html: 
http://console.rpki-client.org/rpki.apnic.net/member_repository/A91A4C60/B526FF74D84111E9A4521413C4F9AE02/12F0D72E7BC111EA8503D815C4F9AE02.roa.html

Every object in the RPKI is subordinate to another object (all objects
are signed by a parent certificate, except the Trust Anchors). The
parent is identified by the Authority Key Identifier (AKI). So one
object's AKI is another object's SKI! If you click the AKI, the console
brings you to the parent object, from where you can continue to explore
other objects related to parent.
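
The AKI-to-SKI relationship can be sketched as a walk up a tree. The
Python below is only an illustration: the identifiers and file names are
made up, and marking the Trust Anchor with an AKI of None is a
simplification chosen for this example.

```python
# Hypothetical objects keyed by SKI; each entry names its parent via "aki".
OBJECTS = {
    "AA:01": {"name": "ta.cer", "aki": None},
    "BB:02": {"name": "member-ca.cer", "aki": "AA:01"},
    "CC:03": {"name": "example.roa", "aki": "BB:02"},
}


def chain_to_trust_anchor(ski, objects):
    """Follow AKI -> SKI links upward and return the visited object names."""
    path = []
    seen = set()
    while ski is not None:
        if ski in seen:
            raise ValueError("loop in AKI/SKI references")
        seen.add(ski)
        obj = objects[ski]  # one object's AKI is another object's SKI
        path.append(obj["name"])
        ski = obj["aki"]
    return path


if __name__ == "__main__":
    print(" -> ".join(chain_to_trust_anchor("CC:03", OBJECTS)))
```

Clicking the AKI in the console corresponds to one iteration of the loop.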

Certificates point to Manifests, and .mft files contain the 'directory
indexes' of the RPKI: 
http://console.rpki-client.org/rpki.apnic.net/member_repository/A91A4C60/B526FF74D84111E9A4521413C4F9AE02/nvnkN242ZTJ1x5Y1mNa0W3CvgJk.mft.html
From the manifest overview you can jump to the parent, click the
referenced .roa, .cer or .crl files.

All directories on the webserver are 'open', except the root. This
allows you to explore this RPKI cache by browsing through the filesystem
directly, example: 
http://console.rpki-client.org/rpki.apnic.net/member_repository/

Final notes
---
The rpki-client console provides a view on *validated* RPKI data. First
rpki-client runs and prunes bad files, then all HTML is generated. The
console provides a view on the data as used in production Internet
routers. Please note: the console's rendering is delayed by a bit over
an hour compared to the real thing.

Another entry point: you can use your browser's 'find on page' function
to search through everything on this humongous page:
http://console.rpki-client.org/roas.html

The RPKI is a very intricate collection of references, I hope this console
offers another useful perspective on the tree-like structures. Enjoy!

Kind regards,

Job


Re: Newbie Questions: How-to remove spurious IRR records (and keep them out for good)?

2020-11-02 Thread Job Snijders
Dear Pirawat,

On Mon, Oct 26, 2020 at 08:13:19PM +0700, Pirawat WATANAPONGSE wrote:
> I am seeking advice concerning someone else announcing IRR records on
> resources belonging to me.

Change is underway in the IRR ecosystem! The situation we are all used
to is that it is rather cumbersome to get IRR databases to remove IRR
objects. The IRR database operator may not trust your request for object
removal, or is busy doing other things. There was no industry-wide
automated process for IRR object removal.

With the introduction of "RIPE-731" 
(https://www.ripe.net/publications/docs/ripe-731)
in the RIPE region, the "RIPE-NONAUTH" database has slowly been
shrinking. The RIPE-NONAUTH database exclusively contains IRR objects
covering non-RIPE space. More and more people create RPKI ROAs, which
in turn provide automated evidence of whether objects in RIPE-NONAUTH are
valid or not valid. If an object is found to be invalid, it is deleted.

While RIPE-731 addressed the issue of stale objects in the RIPE-NONAUTH
database, of course it did not change anything for non-RIPE databases.
Most non-RIPE databases use software called "IRRd" (the likes of NTTCOM,
RADB, TC, etc). The IRRd software is the main entrypoint into the IRR
system, and recently IRRd v4.1.0 was released which can automatically
delete RPKI invalid IRR route objects.

A youtube video from last week with some information on IRRdv4 can be
seen here: https://www.youtube.com/watch?v=V9fsU0mNcA4

NTT has not yet upgraded from 4.0.0 -> 4.1.0, we are working on that.
RADB is also investigating a migration path. LACNIC & ARIN already are
on the v4 train.

The moment NTT and RADB have deployed 4.1.0 at rr.ntt.net and
whois.radb.net there will be an industry-wide fully automated IRR
cleanup process running which accomplishes two things:

- stale/rogue/erroneous objects (conflicting with RPKI) are deleted
- new objects which are in conflict with RPKI ROAs cannot be created
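
The two rules above can be sketched in a few lines of Python. This is not
IRRd's code. The dictionary layouts for route objects and ROAs, the sample
prefixes and ASNs, and the function names are assumptions made for
illustration.

```python
import ipaddress


def is_rpki_invalid(prefix, origin, roas):
    """True when at least one ROA covers the prefix and none authorizes it."""
    net = ipaddress.ip_network(prefix)
    covering = [
        roa for roa in roas
        if ipaddress.ip_network(roa["prefix"]).version == net.version
        and net.subnet_of(ipaddress.ip_network(roa["prefix"]))
    ]
    if not covering:
        return False  # no ROA covers this prefix, so RPKI has no opinion
    return not any(
        roa["asn"] == origin and net.prefixlen <= roa["maxLength"]
        for roa in covering
    )


def cleanup(route_objects, roas):
    """Rule 1: keep only the objects that do not conflict with a ROA."""
    return [o for o in route_objects
            if not is_rpki_invalid(o["route"], o["origin"], roas)]


def try_create(route_objects, new_object, roas):
    """Rule 2: refuse to add an object that conflicts with a ROA."""
    if is_rpki_invalid(new_object["route"], new_object["origin"], roas):
        return False
    route_objects.append(new_object)
    return True


if __name__ == "__main__":
    roas = [{"prefix": "192.0.2.0/24", "maxLength": 24, "asn": 64496}]
    database = [
        {"route": "192.0.2.0/24", "origin": 64496},     # matches the ROA
        {"route": "192.0.2.0/24", "origin": 64511},     # rogue origin
        {"route": "198.51.100.0/24", "origin": 64500},  # no covering ROA
    ]
    database = cleanup(database, roas)
    print(database)
    print(try_create(database, {"route": "192.0.2.0/25", "origin": 64496}, roas))
```

Running the cleanup function every time the ROA set changes is what makes
the process continuous rather than a one-off purge.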

Using RPKI to clean up the IRR is a continuous process: this mechanism
helps clean up the past, but also going forward ensures that IRR does
not contain new information which is in conflict with published
cryptographically signed RPKI ROAs.

This 2018 video outlines a strategy for migrating to an improved
state of internet routing security: https://www.youtube.com/watch?v=3BAwBClazWc
https://nlnog.net/static/nlnogday2018/9_routing_security_roadmap_nlnog_2018_snijders.pdf
Reality is now nearly synced up to all slides of the deck :-)

Kind regards,

Job


Re: plea for comcast/sprint handoff debug help

2020-11-02 Thread Job Snijders
On Mon, Nov 02, 2020 at 09:13:16AM +0100, Tim Bruijnzeels wrote:
> On the other hand, the fallback exposes a Malicious-in-the-Middle
> replay attack surface for 100% of the prefixes published using RRDP,
> 100% of the time. This allows attackers to prevent changes in ROAs to
> be seen.

This is a mischaracterization of what is going on. The implication of
what you say here is that RPKI cannot work reliably over RSYNC, which is
factually incorrect and an injustice to all existing RSYNC based
deployment. Your view on the security model seems to ignore the
existence of RPKI manifests and the use of CRLs, which exist exactly to
mitigate replays.

Up until 2 weeks ago Routinator indeed was not correctly validating RPKI
data, fortunately this has now been fixed:
https://mailman.nanog.org/pipermail/nanog/2020-October/210318.html

Old data can also be replayed via the RRDP protocol, because just
like RSYNC, the RRDP protocol does not have authentication. When RPKI
data is transported from Publication Point (PP) to Relying Party (RP), the RP
cannot assume there was an unbroken 'chain of custody' and therefore has
to validate all the RPKI signatures.

For example, if a CDN is used to distribute RRDP data, the CDN is the
MITM (that is literally what CDNs are: reverse proxies, in the middle).
The CDN could accidentally serve up old (cached) content or misserve
current content (swap 2 filenames with each other).

> This is a tradeoff. I think that protecting against replay should be
> considered more important here, given the numbers and time to fix
> HTTPS issue.

The 'replay' issue you perceive is also present in RRDP. The RPKI is a
*deployed* system on the Internet and it is important for Routinator to
remain interoperable with other non-nlnetlabs implementations.

Routinator not falling back to rsync does *not* offer a security
advantage, but does negatively impact our industry's ability to migrate
to RRDP. We are in 'phase 0' as described in Section 3 of
https://tools.ietf.org/html/draft-sidrops-bruijnzeels-deprecate-rsync

Regards,

Job


RPKI over RSYNC vs RRDP (Was: plea for comcast/sprint handoff debug help)

2020-10-30 Thread Job Snijders
On Fri, Oct 30, 2020 at 12:47:44PM +0100, Alex Band wrote:
> > On 30 Oct 2020, at 01:10, Randy Bush  wrote:
> > i'll see your blog post and raise you a peer reviewed academic paper
> > and two rfcs :)
> 
> For the readers wondering what is going on here: there is a reason
> there is only a vague mention to two RFCs instead of the specific
> paragraph where it says that Relying Party software must fall back to
> rsync immediately if RRDP is temporarily unavailable. That is because
> this section doesn’t exist.

*skeptical face* Alex, you got it backwards: the section that does not
exist, is to *not* fall back to rsync. But on the other hand, there are
ample RFC sections which outline rsync is the mandatory-to-implement
protocol. Starts at RFC 6481 Section 3: "The publication repository
MUST be available using rsync".

Even the RRDP RFC itself (RFC 8182) describes that RSYNC and RRDP
*co-exist*. I think this co-existence was factored into both the design
of RPKIoverRSYNC and subsequently RPKIoverRRDP. An rsync publication
point does not become invalid because of the demise of a
once-upon-a-time valid RRDP publication point.

Only a few weeks ago a large NIR (IDNIC) disabled their RRDP service
because somehow the RSYNC and RRDP repositories were out-of-sync with
each other. The RRDP service remained disabled for a number of days
until they repaired their RPKI Certificate Authority service.

I suppose that during this time, Routinator was unable to receive any
updates related to the IDNIC CA (pinned to RRDP because of a
successful fetch prior to the partial IDNIC RPKI outage). This in turn
deprived the IDNIC subordinate Resource Holders of the ability to update
their Route Origin Authorization attestations (from Routinator's
perspective).

Given that RRDP is an *optional* protocol in the RPKI stack, it doesn't
make sense to me to strictly pin fetching operations to RRDP: Over time
(months, years), a CA could enable / disable / enable / disable RRDP
service, while listing the RRDP URI as a valid SIA, amongst other valid
SIAs.

An analogy to DNS: A website operator may add AAAA records to indicate
IPv6 reachability, but over time may also remove the AAAA record if
there (temporarily) is some kind of issue with the IPv6 service. The
Internet operations community of course encourages everyone to add AAAA
records, and IPv6 Happy Eyeballs was a concept that for a long time even
*favored* IPv6 over IPv4 to help improve IPv6 adoption, but a dual-stack
browser will always try to benefit from the redundancy that exists
through the two address families.

RSYNC and RRDP should be viewed in a similar context as v4 vs v6, but
unlike with IPv4 and IPv6, I am convinced that RSYNC can be deprecated
in the span of 3 or 4 years, the draft-sidrops-bruijnzeels-deprecate-rsync
document is helping towards that goal! 

> Be that as it may, operators can rest assured that if consensus goes
> against our logic, we will change our design.

Please change the implementation a little bit (0.8.1). I think it is too
soon for the internet wide 'rsync to RRDP' migration project to be
declared complete and successful, and this actually hampers the
transition to RRDP.

Pinning to RRDP *forever* violates the principle-of-least-astonishment
in a world where draft-sidrops-bruijnzeels-deprecate-rsync-00 was
published only as recently as November 2019. That draft now is a working
group document, and it will probably take another 1 or 2 years before it
is published as RFC.

Section 5 of 'draft-deprecate-rsync' says RRDP *SHOULD* be used when it
is available. Thus it logically follows, when it is not available, the
lowest common denominator is to be used: rsync. After all, the Issuing
CA put an RSYNC URI in the 'Subject Information Access' (SIA). Who knows
better than the CA?

The ability to publish routing intentions, and for others to honor the
intentions of the CA is what RPKI is all about. When the CA says
delegated RPKI data is available at both an RSYNC URI and an RRDP URI,
both are valid network entrypoints to the publication point. The
resource holder's X.509 signature even is on those 'reference to there'
directions (URIs)! :-)

If I can make a small suggestion: make 0.8.1 fall back to rsync after
waiting an hour or so, (meanwhile polling to see if the RRDP service
restores). This way the network operator takes advantage of both
transport protocols, whichever is available, with a clear preference to
try RRDP first, then eventually rsync.
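
A rough Python sketch of that suggestion follows. It is not Routinator
code. The class name, the 3600 second default and the 'wait' state are
assumptions for illustration, and the caller is assumed to pass in the
current time and the result of its latest RRDP attempt.

```python
class TransportSelector:
    """Prefer RRDP; fall back to rsync once RRDP has failed for long enough."""

    def __init__(self, fallback_after_seconds=3600):
        self.fallback_after = fallback_after_seconds
        self.rrdp_failing_since = None  # time of the first failure in a row

    def choose(self, rrdp_ok, now):
        """Return 'rrdp', 'rsync' or 'wait' for this polling round."""
        if rrdp_ok:
            # RRDP works (again): forget any earlier failures.
            self.rrdp_failing_since = None
            return "rrdp"
        if self.rrdp_failing_since is None:
            self.rrdp_failing_since = now
        if now - self.rrdp_failing_since >= self.fallback_after:
            return "rsync"
        return "wait"  # keep the cached data and poll RRDP again later


if __name__ == "__main__":
    selector = TransportSelector()
    for now, ok in [(0, True), (600, False), (1800, False), (4200, False), (4800, True)]:
        print(now, selector.choose(ok, now))
```

Because RRDP is retried on every round, the selector returns to RRDP as
soon as the service is restored.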

RPKI was designed in such a way that it can be transported even over
printed paper, usb stick, bluetooth, vinyl, rsync, and also https (as
rrdp). Because RPKI data is signed using the X.509 framework, the
transportation method really is irrelevant. IP holders can publish RPKI
data via horse + cart, and still make productive use of it!

Routinator's behavior is not RFC compliant, and has tangible effects in
the default-free zone.

Regards,

Job


Re: plea for comcast/sprint handoff debug help

2020-10-30 Thread Job Snijders
On Thu, Oct 29, 2020 at 09:14:16PM +0100, Alex Band wrote:
> In fact, we argue that it's actually a bad idea to do so:
> 
> https://blog.nlnetlabs.nl/why-routinator-doesnt-fall-back-to-rsync/
>
> We're interested to hear views on this from both an operational and
> security perspective.

I don't see a compelling reason to not use rsync when RRDP is
unavailable.

Quoting from the blog post:

"While this isn’t threatening the integrity of the RPKI – all data
is cryptographically signed making it really difficult to forge data
– it is possible to withhold information or replay old data."

RRDP does not solve the issue of withholding data or replaying old data.
The RRDP protocol /also/ is unauthenticated, just like rsync. The RRDP
protocol basically is rsync wrapped in XML over HTTPS.

Withholding of information is detected through verification of RPKI
manifests (something Routinator didn't verify up until last week!),
and replaying of old data is addressed by checking validity dates and
CRLs (something Routinator also didn't do until last week!).
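
As an illustration of those two checks, here is a simplified Python
sketch. It assumes the manifest has already been parsed and its signature
verified, it leaves out the CRL lookup entirely, and the dictionary layout
and file names are made up for this example.

```python
import hashlib
from datetime import datetime, timezone


def check_manifest(manifest, fetched_files, now):
    """Return a list of problems; an empty list means nothing was detected."""
    problems = []
    # Replay detection: the manifest must be inside its validity window.
    if not (manifest["this_update"] <= now <= manifest["next_update"]):
        problems.append("manifest is not current (possible replay)")
    # Withholding detection: every listed file must be present and match.
    for name, expected_sha256 in manifest["files"].items():
        data = fetched_files.get(name)
        if data is None:
            problems.append(f"{name}: listed on manifest but missing")
        elif hashlib.sha256(data).hexdigest() != expected_sha256:
            problems.append(f"{name}: hash mismatch")
    return problems


if __name__ == "__main__":
    roa = b"hypothetical ROA bytes"
    manifest = {
        "this_update": datetime(2020, 10, 29, tzinfo=timezone.utc),
        "next_update": datetime(2020, 10, 30, tzinfo=timezone.utc),
        "files": {"example.roa": hashlib.sha256(roa).hexdigest()},
    }
    now = datetime(2020, 10, 29, 12, 0, tzinfo=timezone.utc)
    print(check_manifest(manifest, {"example.roa": roa}, now))
    print(check_manifest(manifest, {}, now))
```

Neither check depends on whether the files arrived over rsync or RRDP.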

Of course I see advantages to this industry mainly using RRDP, but those
are not security advantages. The big migration towards RRDP can happen
somewhere in the next few years.

The arguments brought forward in the blog post don't make sense to me.
The '150,000' number in the blog post seems a number pulled from thin
air.

Regards,

Job


Recommendation to update RPKI validators

2020-10-29 Thread Job Snijders
Hi all,

About eight months ago I discovered a number of issues in the validation
procedure of most RPKI validator software (including the RIPE NCC
Validator, Routinator, and OctoRPKI). The impact of improper
verification of Manifests (and associated aspects of the X.509 system)
in the RPKI can have rather dramatic effects in today's Internet routing
landscape. When handling a manifest, make sure everything is accounted
for!

The mitigation guidance at present is very simple: just make sure all
deployed RPKI validators are updated to the latest version.

Going forward I hope our industry as a whole will be able to respond
faster to issues of this type. A write-up with examples and details is
available here: http://sobornost.net/~job/manifest_handling_issue.txt

Thank you to all involved who helped fix & progress this issue.

Kind regards,

Job


Re: IRR Explorer Error/Issue

2020-10-07 Thread Job Snijders
Dear Kevin,

I am the maintainer of NLNOG's IRRexplorer and can help.

On Wed, Oct 07, 2020 at 08:37:00PM +, Kevin McCormick wrote:
> There seems to an issue with IRR Explorer.
>
> I check the following prefix and I get the message, “The server
> encountered an internal error and was unable to complete your request.
> Either the server is overloaded or there is an error in the
> application.”
> 
> http://irrexplorer.nlnog.net/search/216.71.119.0/24
> 
> I am also seeing ARIN records are not updating on IRR Explorer.
> 
> That prefix should show the ASN under ARIN for the ARIN IRR route that
> is registered.
> 
> We are also advertising the prefix and which being advertised by
> Hurricane, but IRR Explorer does not see the route advertised by BGP.
> 
> The website has been like this since yesterday when I first checked.

Thanks for the report! I took a look and restarted the routing sources
database replication process. The server appears to be able to serve
requests again.

A new version of irrexplorer is in development (by programmers more
skilled than me!) - hopefully later this year we can all enjoy it.

Kind regards,

Job


Re: CIDR cleanup

2020-10-02 Thread Job Snijders
On Fri, Oct 02, 2020 at 03:39:00AM -0700, Randy Bush wrote:
> > Marco Marzetti (PCCW) wrote an even faster compression tool!
> > https://github.com/lamehost/aggregate-prefixes
> > 
> > Both these python implementations are meant as replacements for ISC's
> > vintage 'aggregate' Unix utility, with the notable difference that they
> > also support IPv6.
> 
> ok, i gotta ask.  has someone tested to see if they all produce the same
> result givem the same input?  i do not mean to imply they do not.  i
> just have to wonder.

Yes, of course. Marco and I collaborated on the tool's regression
testing.

job@bench $ aggregate6 < dfz_ipv4 | md5
066bfea49c4c20fed7d86d355044764a
job@bench $ aggregate-prefixes < dfz_ipv4 | md5
066bfea49c4c20fed7d86d355044764a

job@bench $ aggregate6 < dfz_ipv6 | md5
1193796d41cc47f32230da281e3ad419
job@bench $ aggregate-prefixes < dfz_ipv6 | md5
1193796d41cc47f32230da281e3ad419

Kind regards,

Job


Re: CIDR cleanup

2020-10-02 Thread Job Snijders
On Thu, Oct 01, 2020 at 02:15:01PM -0300, Marcos Manoni wrote:
> Check https://github.com/job/aggregate6 (thank you, Job)

Marco Marzetti (PCCW) wrote an even faster compression tool!

https://github.com/lamehost/aggregate-prefixes

Both these python implementations are meant as replacements for ISC's
vintage 'aggregate' Unix utility, with the notable difference that they
also support IPv6.

Example:

job@bench ~$ pip3 install aggregate-prefixes

job@bench ~$ wc -l dfz_ipv4
810607
job@bench ~$ cat dfz_ipv4 | time aggregate-prefixes - | wc -l
141645
1m40.17s real 1m37.39s user 0m01.60s system

Compressing the whole IPv4 DFZ prefix list takes only 100 seconds.
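
For small lists, or to get a feel for what aggregation does, Python's
standard library can do the same kind of compression. The sketch below
uses ipaddress.collapse_addresses and is not how aggregate6 or
aggregate-prefixes are implemented; the function name and sample prefixes
are chosen for illustration, and no claim is made that the output matches
those tools on every input.

```python
import ipaddress


def aggregate(prefixes):
    """Merge adjacent prefixes and drop more-specifics, per address family."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    result = []
    for version in (4, 6):
        # collapse_addresses refuses mixed IPv4/IPv6 input, so split first.
        same_family = [n for n in nets if n.version == version]
        result.extend(str(n) for n in ipaddress.collapse_addresses(same_family))
    return result


if __name__ == "__main__":
    sample = [
        "192.0.2.0/25", "192.0.2.128/25",      # two halves of one /24
        "198.51.100.0/24", "198.51.100.0/25",  # a more-specific of a covered /24
        "2001:db8::/33", "2001:db8:8000::/33",
    ]
    for prefix in aggregate(sample):
        print(prefix)
```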

Kind regards,

Job


Re: SPAM for nanog@ senders

2020-09-21 Thread Job Snijders
Dear Łukasz, others,

Can you please send any suspicious emails (including headers) to
the mailing list admin team at ge...@nanog.org?

We'll try to figure out if it happens through an existing subscription.

Kind regards,

Job
(hat: NANOG geeks)

On Mon, Sep 21, 2020 at 12:51:44PM +0200, Octolus Development wrote:
> I did yeah, annoying.
> 
> 
> Best Regards,
> Octolus
> On 9/21/2020 12:50:54 PM, Łukasz Bromirski  wrote:
> NANOGers,
> 
> Have you got email from 'dating.supp...@csvwebsupport.com’ immediately
> after you post to nanog@? First time I thought it’s coincidence, but
> today when I got it, it’s hardly one ;)
> 
> Topic is '[#WHB-257-41491]: Re: XX’ where  is subject taken
> from last e-mail.
> 
> I understand there’s need to connect people in hard, COVID times,
> but I doubt automated spam sender has good intentions with that regard ;)
> 
> So.. somebody is scrapping this list to feed their spamming lists :/
> 
> —
> ./


Re: how would draft-ymbk-opsawg-finding-geofeeds work in noam

2020-09-16 Thread Job Snijders
On Tue, Sep 15, 2020 at 01:52:05PM -0700, Randy Bush wrote:
> perchance is RDAP implemented by all RIRs?

Yes, but in 5 slightly different ways :-)

Kind regards,

Job


Re: Centurylink having a bad morning?

2020-08-30 Thread Job Snijders
I believe from this moment forward things are converging back to normal.

Kind regards,

Job


Re: TCP and UDP Port 0 - Should an ISP or ITP Block it?

2020-08-25 Thread Job Snijders
On Tue, Aug 25, 2020 at 08:27:24AM -0400, K. Scott Helms wrote:
> Comcast is blocking it.  From the table on that page.
> 
> "Port 0 is a reserved port, which means it should not be used by
> applications. Network abuse has prompted the need to block this port."

The 'Transport' column seems to indicate that TCP port 0 is blocked, but
not that UDP port 0 is blocked. I believe there are comcast people on
this mailing list, it would be interesting to hear what the
considerations were to block one but not the other.

> "What about UDP IP fragmentation?"
> 
> I'm not sure I follow this.  The IP packet will be fragmented with UDP
> inside it.  When the IP packet gets put together the UDP PDU will have
> a port number.  It's possible that some packet analyzers or network
> gear will improperly "see" a partial UDP flow as port 0 but that's a
> mischaracterization of the flow.

You are absolutely right. There is no layer-4 header in a non-initial
fragment. When netflow/ipfix traffic analyzer tools display 'port 0', it
may simply be because the data structures used offer no other way to
label such traffic. "mischaracterization" is a fitting word :-)
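
To illustrate, here is a small Python sketch of how a flow exporter could
tell the two cases apart for IPv4. It is not taken from any particular
netflow/ipfix implementation. The function names are made up, and the
packet builder exists only to produce test input.

```python
import struct


def udp_ports(ip_packet):
    """Return (src_port, dst_port), or None for a non-initial fragment."""
    header_len = (ip_packet[0] & 0x0F) * 4  # IHL is counted in 32-bit words
    flags_and_offset, = struct.unpack_from("!H", ip_packet, 6)
    fragment_offset = flags_and_offset & 0x1FFF  # lower 13 bits
    if ip_packet[9] != 17:
        raise ValueError("not a UDP packet")
    if fragment_offset != 0:
        # The bytes after the IP header are payload here, not a UDP header.
        return None
    return struct.unpack_from("!HH", ip_packet, header_len)


def build_ipv4_udp(flags_and_offset, payload):
    """Test helper: minimal IPv4 header (checksum left at 0) plus payload."""
    header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + len(payload), 0, flags_and_offset, 64, 17, 0,
        bytes([192, 0, 2, 1]), bytes([198, 51, 100, 1]),
    )
    return header + payload


if __name__ == "__main__":
    udp_header = struct.pack("!HHHH", 53, 40000, 2008, 0)
    first = build_ipv4_udp(0x2000, udp_header + b"\x00" * 8)  # MF set, offset 0
    later = build_ipv4_udp(185, b"\x00" * 16)                 # offset 185 * 8 bytes
    print(udp_ports(first))
    print(udp_ports(later))
```

An exporter with no way to store "no port" for the second case ends up
writing 0 in the port fields.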

Kind regards,

Job


Re: TCP and UDP Port 0 - Should an ISP or ITP Block it?

2020-08-25 Thread Job Snijders
On Tue, Aug 25, 2020 at 07:27:33AM -0400, K. Scott Helms wrote:
> I think a fairly easy thing to do is see what other large retail ISPs
> have done.  Comcast, as an example, lists all of the ports they block
> and 0 is blocked.  I do recommend that port 0 be blocked by all of the
> ISPs I work with and frankly Comcast's list is a pretty good one to
> use in general, though you will get some pushback on things like SMTP.
> 
> https://www.xfinity.com/support/articles/list-of-blocked-ports

I may be reading the table incorrectly, but it seems to me Comcast is
*not* blocking UDP port 0 according to the above URL?

> Transit providers are a little bit different, but then again port 0 is
> also different since AFAIK it's never had a legitimate use case.  It's
> always been a reserved port.  I'd personally block it if I ran a
> transit, but I'd be more willing to open it up for one of my large
> customers (in a limited way) than I would on the retail side.
> 
> https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.xhtml

What about UDP IP fragmentation?

Kind regards,

Job


Re: RPKI TAs

2020-08-03 Thread Job Snijders
On Mon, Aug 03, 2020 at 08:17:55AM -0500, John Kristoff wrote:
> On Sun, 2 Aug 2020 18:52:11 +
> Randy Bush  wrote:
> 
> > not to mention the ARIN stupidity
> 
> Notwithstanding the RPA, downloading ARIN's TAL is straightforward:
> 
> As documented here:
> 
>   
> 
> One can wget, curl, or whatever this:
> 
>   

I dunno, 'straightforward' to me would mean the ARIN TA is installed by
default when you install a RPKI Cache Validator implementation, all
without requiring lawyers well-versed in both your native language AND
in the American legal system.

I can do DNSSEC, RPKI ROV, Signify, Web PKIs like TLS - all without
kludges. Here is a video (10 min) where I show how you can bootstrap a
system from 0 to 100 without relying party agreements:
https://www.youtube.com/watch?v=oBwAQep7Q7o

The highlight of the video is when I access ARIN's website over HTTPS,
after having resolved their webserver's IP address with a DNSSEC
validating recursor... to discover I need to get a lawyer to download a
.tal file which exists to protect *ARIN* members. Shouldn't ARIN members
demand that the process is as frictionless as possible? (both the new
and old RPA are the opposite of frictionless).

ARIN members (the RPKI users) depend on network operators both inside
and outside the ARIN region to honor their ROAs. The internet is global.
The ARIN ROA's will not be honored if the ARIN .tal file is missing. The
ARIN .tal file is missing because it cannot be included in open source
software without making things very awkward.

It is an insane situation. ARIN resource holders using ARIN's RPKI TA
are measurably *less* protected than their RIPE, APNIC, LACNIC and
AFRINIC counterparts.

Get this:

When you transfer your IP space away from ARIN, to *ANY* other RIR,
you'll derive *MORE* benefits from your RPKI ROA signing efforts. You
don't even need to renumber out of your space to improve your routing
security posture!

I believe ARIN's policy to institute a significant legal barrier to RPKI
infrastructure negatively impacts ARIN's own members.

Imagine having to sign a contract with DigiCert to obtain the public key
to be able to visit https://paypal.com. Ha-ha-ha-ha... folly. It would
be bad for business.

Kind regards,

Job


Re: Issue with Noction IRP default setting (Was: BGP route hijack by AS10990)

2020-08-03 Thread Job Snijders
Dear Ryan,

I have come to believe this is a Noction IRP specific issue.

On Sat, Aug 01, 2020 at 01:29:59PM -0700, Ryan Hamel wrote:
> I disagree on the fact that it is not fair to the BGP implementation
> ecosystem, to enforce a single piece of software to activate the
> no-export community by default

I am not exaggerating when I say that *ONLY* the name of this software
is mentioned when incidents like this happen. Other route manipulation
tools either use different (safer) technologies and/or mark routes with
NO_EXPORT.

Every few weeks I am in phone calls with new people who happened to
originate hijacks which existed for traffic engineering purposes, and
without fail it is always the same software from the same company that
originated the rogue routes.

It seems more efficient if the software were to ship with improved
default settings than me explaining the problem ad nauseam to every new
engineer after they unsuspectingly stepped into this trap.

Not extremely dangerous by default, is it really too much to ask?

> Also, wasn't it you that said Cisco routers had a bug in ignoring
> NO_EXPORT? Would you go on a rant with Cisco, even if Noction add that
> enabled checkbox by default?

Cisco and Noction are separate companies; regardless of what Noction
does, the Cisco implementations are expected to conform to their own
documentation and the BGP-4 specifications.

1/ Without NO_EXPORT set by default, route manipulation software is
   very dangerous.

2/ Even if NO_EXPORT is set, software defects happen from time to time
   and the existence of fake more-specific routes in a specific routing
   domain can have dire consequences (as has been demonstrated time
   after time).

Not setting NO_EXPORT as a default is setting your customers up for
failure. If your car's seatbelt accidentally breaks, it wouldn't
logically follow to also remove the airbags.
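
As an illustration of what "safe by default" could look like, here is a
small sketch. NO_EXPORT is the RFC 1997 well-known community 65535:65281;
everything else (the function names, the idea of splitting a prefix into
two halves) is a made-up stand-in for whatever a route manipulation tool
does internally, and confederations are ignored for simplicity.

```python
import ipaddress

NO_EXPORT = (65535, 65281)  # RFC 1997 well-known community 0xFFFFFF01

def make_te_routes(prefix, set_no_export=True):
    """Generate two more-specifics for traffic engineering.
    Hypothetical helper: NO_EXPORT is attached unless explicitly disabled."""
    net = ipaddress.ip_network(prefix)
    communities = frozenset({NO_EXPORT}) if set_no_export else frozenset()
    return [(str(sub), communities) for sub in net.subnets(prefixlen_diff=1)]

def may_export(communities, session_type):
    """A route carrying NO_EXPORT must not be sent to EBGP neighbors."""
    return not (session_type == "ebgp" and NO_EXPORT in communities)
```

With the default left alone, the generated more-specifics stay inside
the AS even when the operator's outbound filters have a gap; an operator
has to opt out deliberately to get the dangerous behavior.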

> Why are you not on your soap box about BIRD, FRrouting, OpenBGPd,
> Cisco, Juniper, etc... about how they can possibly allow every day
> screw ups to happen

It is interesting you mention these names, as all of them in recent
years went through a process to revisit some unsafe default behavior
and address it. These companies have far larger userbases, so if they
can do it, anyone can do it!

For the longest time many BGP implementations - BY DEFAULT - would
propagate any and all routes from EBGP peers to all other IBGP and EBGP
peers. The community identified this to be a root cause for many
incidents, and eventually came up with a change to the BGP-4
specification which codifies that the default should be safe instead of
dangerous. https://tools.ietf.org/html/rfc8212

- BIRD introduced support for RFC 8212 in BIRD 2 and higher
- FRRouting changed the defaults in 7.4 and higher
- Cisco IOS XR had RFC 8212 right from the start
- OpenBGPD changed its default behavior in version 6.4
- Juniper is still working on this, in the meantime a SLAX script can be
  used to emulate RFC 8212 behavior: 
https://github.com/packetsource/rfc8212-junos
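
The RFC 8212 behavior can be summarized in a few lines. This is only a
sketch of the rule, not how any of the implementations above code it:
on an EBGP session with no explicitly configured policy, nothing is
accepted and nothing is advertised. The function name `apply_policy` is
an assumption made for this example.

```python
def apply_policy(routes, session_type, policy=None):
    """Filter routes for one direction (import or export) of a session.

    RFC 8212 style default: an EBGP session without an explicit policy
    passes no routes. IBGP sessions are not affected by this rule.
    """
    if policy is None:
        return [] if session_type == "ebgp" else list(routes)
    return [route for route in routes if policy(route)]
```

The same function serves for both directions: call it once with the
import policy on received routes and once with the export policy on
routes about to be advertised.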

It is well understood how default settings strongly shape the success or
failure of deployments. This is no different.

Kind regards,

Job


Re: BGP route hijack by AS10990

2020-08-03 Thread Job Snijders
On Mon, Aug 03, 2020 at 02:36:25PM +0200, Alex Band wrote:
> According to the information I received from the community[1], you
> should read PR1461602 and PR1309944 before deploying.
> 
> [1] https://rpki.readthedocs.io/en/latest/rpki/router-support.html

My take on PR1461602 is that it can be ignored, as it appears to only
manifest itself in a mostly cosmetic way: initial RTR session
establishment takes multiple minutes, but once RTR sessions are up
things work smoothly.

Under no circumstances should you enable RPKI ROV functionality on boxes
that suffer from PR1309944. That one is a real showstopper.

Kind regards,

Job


Issue with Noction IRP default setting (Was: BGP route hijack by AS10990)

2020-08-01 Thread Job Snijders
On Sat, Aug 01, 2020 at 06:50:55AM -0700, Ca By wrote:
> I am not normally supporting a heavy hand in regulation, but i think it is
> fair to say Noction and similar BGP optimizers are unsafe at any speed and
> the FTC or similar should ban them in the USA. They harm consumers and are
> a risk to national security / critical infrastructure
> 
> Noction and similar could have set basic defaults (no-export, only create
> /25 bogus routes to limit scope), but they have been clear that their greed
> to suck up traffic does not benefit from these defaults and they wont do
> it.

Following a large scale BGP incident in March 2015, noction made it
possible to optionally set the well-known NO_EXPORT community on route
advertisements originated by IRP instances.

"In order to further reduce the likelihood of these problems
occurring in the future, we will be adding a feature within Noction
IRP to give an option to tag all the more specific prefixes that it
generates with the BGP NO_EXPORT community. This will not be enabled
by default [snip]"
https://www.noction.com/blog/route-optimizers
Mar 27, 2015

Because NO_EXPORT is not set in the default configuration, there are
probably, if not certainly, many unsuspecting network engineers who end
up deploying this software without ever even considering changing that
one setting in the configuration.

Fast forward a few years and a few incidents, on the topic of default
settings, following the Cloudflare/DQE/Verizon incident:

"We do have no export community support and have done for many
years. The use of more specifics is also optional. Neither replaces
the need for filters."
https://twitter.com/noction/status/1143177562191011840
Jun 24, 2019

Community members responded:

"Noction have been facilitating Internet outages for years and
years and the best thing they can say in response is that it is
technically possible to use their product responsibly, they just
don't ship it that way."
https://twitter.com/PowerDNS_Bert/status/1143252745257979905
June 24, 2019

Last year Noction stated:

"Nobody found this leak pleasant."
https://www.noction.com/news/incident-response
June 26, 2019

Sentiment we all can agree with, change is needed!

As far as I know, Noction IRP is the ONLY commercially available
off-the-shelf BGP route manipulation software which - as default - does
NOT set the BGP well-known NO_EXPORT community on the product's route
advertisements. This is a product design decision which causes
collateral damage.

I would like to urge Noction to reconsider their position. Seek to
migrate the existing users to use NO_EXPORT, and release a new version
of the IRP software which sets NO_EXPORT BY DEFAULT on all generated
routes.

Kind regards,

Job


Re: BGP route hijack by AS10990

2020-07-31 Thread Job Snijders
On Fri, Jul 31, 2020 at 03:34:47PM +0200, Mark Tinka wrote:
> On 31/Jul/20 03:57, Aftab Siddiqui wrote:
> > Not a single prefix was signed, what I saw. May be good reason for
> > Rogers, Charter, TWC etc to do that now. It would have stopped the
> > propagation at Telia.
>
> If none of the prefixes had a ROA, no amount of Telia's shiny new "we
> drop invalids" machine would have helped, as we saw with this incident.

Could it be ... we didn't see any RPKI Invalids through Telia *because*
they are rejecting RPKI invalids?

As far as I know the BGP Polluter software does not have a configuration
setting to only ruin the day of operators without ROAs. :-)

I think the system worked as designed: without RPKI ROV @ Telia the
damage might have been worse.

Kind regards,

Job


Re: BGP route hijack by AS10990

2020-07-30 Thread Job Snijders
On Thu, Jul 30, 2020 at 07:09:07PM +0200, Patrick Schultz wrote:
> so, bgp optimizers... again?

We should stop calling them 'optimizers'... perhaps "BGP Polluters"?

Kind regards,

Job


Re: Hurricane Electric has reached 0 RPKI INVALIDs in our routing table

2020-06-17 Thread Job Snijders
Dear Jon, group,

On Wed, Jun 17, 2020 at 10:25:14AM -0400, Jon Lewis wrote:
> On Mon, 15 Jun 2020, Mike Leber via NANOG wrote:
> 
> > I'm pleased to announce Hurricane Electric has completed our RPKI
> > INVALID filtering project and we now have 0 RPKI INVALIDs in our routing
> > table.
> > 
> > Hurricane Electric has 29021 BGP sessions with 22109 prefix filters with
> > 7191 networks directly and 8239 networks including Internet exchanges.
> 
> The flip side of this though is that every time an IP space owner publishes
> an ROA for an aggregate IP block and overlooks the fact that they have
> customers BGP originating a subnet of the aggregate with an ASN not
> permitted by an ROA, HE has "less than a full table".  :(

Do you remember the old BSD paradigm? ... "less is more" 

I think it applies here. We are now in a time where a *smaller* routing
table entry list count is preferable to a 'full' table, because the
fullest table is likely to also include problematic BGP routing
information.

It is important to recognise that RPKI ROA creation is an *OPTIONAL*
protection mechanism. If you create ROAs, you indeed can harm your
network, but at the same time, if you create the ROAs correctly, you
will gain massive benefits.

RPKI ROA creation is a big hammer. Everyone needs to think carefully
about each ROA they create and if it will positively or negatively
impact their network. NTT spent *months* creating ROAs for all the
prefixes, researching for each BGP announcement if the ROA would be good
or bad. We now have virtually all our space covered by ROAs, it's nice.

> i.e. I'm questioning whether the system is mature enough and properly used
> widely enough for dropping RPKI invalids to be a good idea?

Yes. "We made an impossible bird, and it was able to fly". :-)

The global deployment of RPKI ROV in the BGP Default-Free Zone already
is a fact, we made it work! All carriers that keep the Internet
connected together, and care about preventing routing incidents - are
committed to this effort. Thousands of people are now involved at this
point. 

What now remains.. is polishing away some of the sharp edges
[1][2][3][4], and bikeshedding about some of the colors :-)

The below links are like an 'à la carte menu', anyone can engage in
discussions about RPKI at any level they feel comfortable with. Many
people are looking for feedback and input through different forums on
what and how to build it. Pick a platform you enjoy engaging on and
participate (and stick around on this mailing list, all good)! :)

Kind regards,

Job

[1]: https://www.youtube.com/watch?v=oBwAQep7Q7o
[2]: https://mailarchive.ietf.org/arch/msg/sidrops/ayCQbKvJZmE5TGq9IxL9qUM-zQ4/
[3]: https://github.com/RIPE-NCC/rpki-validator-3/issues/158
[4]: https://twitter.com/routinator3000/status/1255439035553779713


Re: Mikrotik RPKI Testing

2020-06-17 Thread Job Snijders
Dear all,

> I noticed that Mikrotik has added RPKI into their very much beta v7
> branch. I would like to ask those of you that know RPKI well to check
> it out and offer Mikrotik feedback on what they've done
> right\wrong\broken. 

Our hero Massimiliano Stucchi in Switzerland started doing the legwork.
He is sharing the test results here:

http://as58280.net/en/articles/RPKI-on-Mikrotik

Enjoy!

Kind regards,

Job


Re: Reactive RPKI ROV (Was: Hurricane Electric has reached 0 RPKI INVALIDs)

2020-06-17 Thread Job Snijders
Dear Baldur,

On Wed, Jun 17, 2020 at 01:42:36PM +0200, Baldur Norddahl wrote:
> Lets say someone makes an announcement that creates a RPKI invalid and
> it is determined to be a mistake. They then go back and add ROA
> objects to fix the problem. Will this reactive RPKI approach then
> continue to block the route because filters were already generated
> and pushed out to routers? Or in other words, if the system can insert
> the filter in less than 60 seconds, how long does it take to get rid
> of the filter again when someone publishes a valid ROA?

What you describe here is what I'd call a "Garbage Collection" process.
Garbage collection has to happen periodically.

Probably not slower than once an hour. See the following link for an
attempt to document that type of aspect of RPKI ROV deployments:
https://tools.ietf.org/html/draft-ietf-sidrops-rpki-rov-timing-00.html
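
A garbage collection pass could look roughly like the sketch below. It
assumes the denylist is kept as a set of (prefix, origin ASN) pairs and
that some `validate` callable returns the current RFC 6811 state for a
pair; both are assumptions for illustration, not a description of how
any operator actually does it.

```python
def garbage_collect(denylist, validate):
    """Re-check every denied (prefix, origin_asn) pair against current RPKI data.

    Returns (keep, removed): entries still 'invalid' stay on the denylist,
    everything that became 'valid' or 'not-found' is released.
    """
    keep = {entry for entry in denylist if validate(*entry) == "invalid"}
    return keep, denylist - keep
```

Running this from a periodic job (at least hourly, per the above) bounds
how long a route stays blocked after the ROA has been fixed.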

Maybe HE can comment on their current timers?

Kind regards,

Job


Reactive RPKI ROV (Was: Hurricane Electric has reached 0 RPKI INVALIDs)

2020-06-16 Thread Job Snijders
Dear Mike, Ytti, others,

First of all and most importantly: congratulations Mike! I thank you and
your team for having constructed a great mechanism that helps honor the
routing intentions everyone publishes in the RPKI.

On Tue, Jun 16, 2020 at 09:08:41AM +0300, Saku Ytti wrote:
> On Tue, 16 Jun 2020 at 07:51, Mike Leber via NANOG  wrote:
> > These prefix filters are updated automatically both through a system
> > of daily updates and real time updates to prevent RPKI INVALID
> > routes from being carried in our routing table.
> 
> What does real time mean in this context? Does it mean exactly 0s leak
> of INVALID, or 99% less than 30s? Or how do you define it?

My measurement (samplesize = 1) appears to indicate it took less than a
minute between AS 6939 receiving (and accepting) an RPKI invalid route
announcement, and that same route announcement being removed from the AS
6939 routing tables. Subsequently BGP withdraw messages were sent (for
that RPKI invalid route via 6939) to all their peers, which took a few more
minutes to be processed and converge in the global routing system.

I think it is important for the community to understand that the
mechanism 6939 currently uses, is a different approach to what other
network operators are doing.

Most RPKI ROV deployments have set it up in such a way that a-priori all
EBGP routers are primed with a full set of VRPs. Feeding the routers the
VRPs through the RPKI-To-Router (RTR) protocol allows those BGP speakers
to reject an RPKI invalid route - before - installing it in the Loc-RIB.
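
For readers who have not looked at the procedure itself, the RFC 6811
decision a router makes with those VRPs fits in a few lines. This is a
simplified sketch (a linear scan instead of a trie, and VRPs modeled as
plain tuples, which is my own choice for the example):

```python
import ipaddress

def validate(prefix, origin_asn, vrps):
    """RFC 6811 origin validation.

    vrps: iterable of (vrp_prefix, max_length, asn) tuples.
    Returns 'valid', 'invalid' or 'not-found'.
    """
    route = ipaddress.ip_network(prefix)
    covered = False
    for vrp_prefix, max_length, asn in vrps:
        vrp = ipaddress.ip_network(vrp_prefix)
        if vrp.version != route.version or not route.subnet_of(vrp):
            continue                      # this VRP does not cover the route
        covered = True
        if route.prefixlen <= max_length and asn == origin_asn:
            return "valid"                # one matching VRP is enough
    return "invalid" if covered else "not-found"
```

A route is only 'invalid' when at least one VRP covers it and none
matches; prefixes without any covering VRP remain 'not-found' and are
not rejected by an "invalid = reject" policy.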

At the same time, we should recognize and praise anyone who managed to
deploy a reactive mechanism due to the lack of RTR support on a device.

The "route collector -> script -> add prefix list to denylist" approach
cannot be avoided if you have gear in the network that does not support
RPKI OV as specced out in RFC 6811. 

The reactive mechanism must be viewed in context of other protection
mechanisms that are deployed such as Peerlock, Maximum Prefix Limits,
and IRR+RPKI+WHOIS based explicit allowlists, all of which 6939 has
done. I actually had to jump through some hoops in the IRR system to
trick 6939 into accepting my RPKI invalid route announcement. :-)

Since it is with words that we construct the magic of our reality, let's
assign a name specific to this engineering effort:

Reactive RPKI ROV
=================

Reactive RPKI ROV means that a network operator has set up an
RPKI-capable route collector which peers with all BGP nodes that do not
support RPKI. The route collector logs all RPKI invalid route
announcements it receives, and these messages can be used as input to an
automated process to update prefix-list filters on the BGP node that
received the RPKI invalid route announcement. The free OpenBGPD or BIRD
software can be used as such route collectors. As is evident from my
'samplesize=1' study, that whole process can be completed in under one
minute.
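
To give an idea of the "script" step, here is a rough sketch of turning
collector log entries into per-session deny entries. The log record
layout (node, neighbor, prefix, origin, state) is invented for this
example; it is not the log format of OpenBGPD or BIRD, and it says
nothing about how 6939 built their system.

```python
from collections import defaultdict

def build_denylists(collector_log):
    """Group RPKI invalid announcements by the session that received them.

    collector_log: iterable of dicts with the (assumed) keys
    'node', 'neighbor', 'prefix', 'origin' and 'state'.
    Returns {(node, neighbor): sorted list of prefixes to deny}.
    """
    deny = defaultdict(set)
    for entry in collector_log:
        if entry["state"] == "invalid":
            deny[(entry["node"], entry["neighbor"])].add(entry["prefix"])
    return {session: sorted(prefixes) for session, prefixes in deny.items()}
```

Note that a prefix-only filter is blunt: it also blocks a later valid
announcement of the same prefix on that session until the entry is
cleaned up again, which is why the periodic re-check matters.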

The alternative to the "Reactive RPKI ROV" approach is what we've
already done for years: emailing a NOC and request manual intervention
to block a problematic route. At the best of times the 'calling the NOC'
approach takes hours. As such, Reactive RPKI ROV is obviously far
preferable to manual approaches.

It would be awesome if the community openly shares notes on how to
construct Reactive RPKI ROV deployments to improve routing for everyone.
Maybe at some point some open source software pops up somewhere to make
it easier for everyone? The future is bright, I'm optimistic we tame the
Default-Free Zone beast :)

So Mike, please consider submitting a presentation proposal to one of the
network operator groups to outline in as much detail as possible how you
did it. I'd love to learn from your experience!

> So my definition of real time here would be 99% <5min.

I think it should be 99% <1 min, because that's how high 6939 set the bar :-)

Kind regards,

Job


academic paper on Peerlock BGP protection mechanism

2020-06-15 Thread Job Snijders
Dear colleagues,

An interesting paper has been made available as pre-print: "Peerlock:
Flexsealing BGP" by Tyler McDaniel, Jared M. Smith, and Max Schuchard
from the University of Tennessee.

The paper probably is the most formal description of Peerlock so far.
They even conducted active measurements to reverse-engineer what the
state of Peerlock deployment is in the global Internet routing system.

Abstract: https://arxiv.org/abs/2006.06576
PDF: https://arxiv.org/pdf/2006.06576.pdf

Recommended reading!

Kind regards,

Job


Re: Partial vs Full tables

2020-06-10 Thread Job Snijders
On Tue, Jun 9, 2020, at 08:04, Mark Tinka wrote:
> On 5/Jun/20 18:49, Saku Ytti wrote:
> > The comparison isn't between full or default, the comparison is
> > between static default or dynamic default. Of course with any default
> > scenario there are more failure modes you cannot route around. But if
> > you need default, you should not want to use dynamic default.
> 
> I've found this to be easier to do if your network is reasonably
> "centralized", i.e., there is one or two (or small handful) of "entry
> and exit" points.
> 
> With a stretchy, relatively flat network that neither has a definite
> entry nor exit point, it's a bit difficult to decide which failure mode
> should take the default route away.

A strong case to take the default away is when the PE the customer is connected 
to has become entirely isolated from the rest of the network. This can happen 
as a result of multiple fiber-cuts, or the classic "oops, all the diverse 
fibers went through that one duct". 

One trick is to have each PE originating a default which depends on a route 
that comes from another PE (any other PE). This way a PE that for whatever 
reason has become entirely disconnected from the Autonomous System will cease 
advertising default. Make PEs with an odd-numbered loopback address depend on 
"ROUTE A" and PEs with an even-numbered loopback depend on "ROUTE B" - where A 
is originated only by even-numbered PEs and B is only originated by 
odd-numbered PEs. More advanced sharding strategies can be imagined, many 
additional failure cases too.
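
The odd/even trick can be sketched as follows. The route names are the
placeholders from the paragraph above; the function names are made up,
and a real deployment would express this in router policy rather than in
a script.

```python
import ipaddress

ROUTE_A = "ROUTE A"  # originated only by even-numbered PEs
ROUTE_B = "ROUTE B"  # originated only by odd-numbered PEs

def pe_roles(loopback):
    """Which helper route a PE originates and which one it depends on."""
    odd = int(ipaddress.ip_address(loopback)) % 2 == 1
    return {
        "originates": ROUTE_B if odd else ROUTE_A,
        "depends_on": ROUTE_A if odd else ROUTE_B,
    }

def should_originate_default(loopback, routes_from_other_pes):
    """Advertise default to customers only while the dependency is visible."""
    return pe_roles(loopback)["depends_on"] in routes_from_other_pes
```

Because a PE never originates the route it depends on, an isolated PE
cannot satisfy its own condition and stops advertising default.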

Back to basics: as Ytti suggested earlier in the thread, it might be more 
sensible to generate your own default route based on a 'stable anchor prefix' 
coming from the ISP rather than accepting the default your ISP originates 
towards you.

As an example: any NTT customers requesting to receive a default-route from AS 
2914, will - in addition to 0.0.0.0/0 - also receive a route announcement for 
129.250.0.0/16 (2001:418::/32 in IPv6), and if any customer loses visibility on 
129.250.0.0/16 via the direct Customer<>NTT sessions, one probably doesn't want 
to point default in that direction.

- If you originate defaults to your customers: try to make it so that the 
default is withdrawn if the node has become isolated.

- If you want to point default at a service provider: anchor it to a stable 
prefix rather than their 0.0.0.0/0 route.

The above two suggestions may seem at odds with each other :-)
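
For the customer side, the anchoring logic is about as simple as it
gets. In this sketch the data structure and the function name are
assumptions of mine; the NTT anchor prefix is the one mentioned above,
the second upstream and its anchor are invented.

```python
def default_next_hops(upstreams):
    """Return the upstreams it is still sensible to point default at.

    upstreams: {name: {"anchor": prefix, "received": set of prefixes seen
    on the direct session}}. The upstream's own 0.0.0.0/0 is ignored on
    purpose; only visibility of the anchor prefix counts.
    """
    return [name for name, u in upstreams.items() if u["anchor"] in u["received"]]

upstreams = {
    "ntt": {"anchor": "129.250.0.0/16", "received": {"0.0.0.0/0", "129.250.0.0/16"}},
    "other": {"anchor": "198.51.100.0/24", "received": {"0.0.0.0/0"}},  # anchor lost
}
usable = default_next_hops(upstreams)  # only "ntt" qualifies
```

On most platforms the same effect is achieved with a conditional static
or generated route tied to the anchor prefix, without any scripting.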

Kind regards,

Job


Re: Update your ARIN IRR data access methods (was: Fwd: [arin-announce] New Internet Routing Registry Release)

2020-06-10 Thread Job Snijders
Dear John, group,

On Wed, Jun 10, 2020 at 06:51:53PM +, John Curran wrote:
> ARIN has released its updated IRR system - if you are relying on
> ARIN’s IRR data, please refer to details below and update access
> methods accordingly.

Ack - NTT has done so.

The 'rr.ntt.net' instance now carries a copy of ARIN-NONAUTH and the
'ARIN' source has been fully reseeded from the published database dumps
at ftp://ftp.arin.net/.

Operators (anywhere on the planet) with a local NRTM cache of ARIN's IRR
will want to make sure the output of the following debugging command
produces output similar to the below.

$ echo '!j-*' | nc rr.ntt.net 43 | egrep "ARIN:|ARIN-NONAUTH"
ARIN:Y:41907-41925
ARIN-NONAUTH:Y:51746-51746
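
If you want to automate that check, the output is easy to parse. The
sketch below only handles text in the shape shown above
(SOURCE:FLAG:FIRST-LAST) and keeps the middle field as an opaque flag;
the function names are mine.

```python
def parse_journal_status(text):
    """Parse '!j' style lines such as 'ARIN:Y:41907-41925' into a dict."""
    status = {}
    for line in text.splitlines():
        parts = line.strip().split(":")
        if len(parts) != 3 or "-" not in parts[2]:
            continue                      # skip framing lines and noise
        source, flag, serials = parts
        first, last = serials.split("-", 1)
        if first.isdigit() and last.isdigit():
            status[source] = {"flag": flag, "first": int(first), "last": int(last)}
    return status

def missing_sources(status, wanted):
    """List the wanted IRR sources that the mirror did not report."""
    return [source for source in wanted if source not in status]
```

Feeding it the nc output and asking for ['ARIN', 'ARIN-NONAUTH'] should
return an empty list on a correctly updated mirror.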

Operators should verify whether auxiliary tooling is aware of this new
IRR source label. An example: in PeeringDB this new source has to be
added: https://github.com/peeringdb/peeringdb/pull/742

Another thing to confirm: any place where one would pass on 'ARIN' as
argument to bgpq4's '-S' command line option may need to review whether
'ARIN' should be changed to 'ARIN,ARIN-NONAUTH' to minimize the
potential for disruption.

Please relay my congratulations to the team on this important milestone!
This was wonderfully unexciting :-)

Kind regards,

Job


Re: "Is BGP safe yet?" test

2020-04-20 Thread Job Snijders
On Mon, Apr 20, 2020, at 21:54, Amir Herzberg wrote:
> Randy said, > From a practical standpoint, this doesn't actually tell 
> the whole truth
> > 
> > indeed. route origin validation, while a good thing, does not make
> > bgp safe from attack. this marketing fantasy is being propagated;
> > but is BS.
> > 
> > > origin validation was designed to reduce the massive number of problems
> > > caused by fat-fingered configuration errors by operators. it will not
> > even get all of those; but it will greatly improve things.
> > 
> > but it provides almost zero protection against malicious attack. the
> > attacker merely has to prepend (in the formal, not cisco display) the
> > 'correct' origin AS to their malicious announcement. 
> 
> Randy, I agree of course, that supporting ROV is far from sufficient to 
> ensure BGP security. However, I disagree that this is `zero protection' 
> since the effectiveness of the attack may be much reduced when the 
> attacker has to prepend. Note also that if one combines ASPA, the 
> protection would be even better. The simulation results in our 
> SIGCOMM'2016 give some idea of these benefits (imprecise, of course).

Folks, https://gph.is/1iwqrDk ;-)

I think you can best capitalise on Origin Validation when OV is combined with 
other techniques such as AS_PATH filters (based on Peerlock or ASPA) or in some 
cases direct peering: 
https://www.slideshare.net/apnic/improving-the-peering-business-case-with-rpki
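
A toy example of why the combination matters: Origin Validation only
looks at the last ASN in the AS_PATH, so a forged origin passes it, while
an AS_PATH filter can still catch the announcement. The Peerlock rule
below is a simplified reading of the idea (a protected ASN may only show
up in paths received from that ASN itself or from neighbors it has
authorized); the data layout and names are assumptions for illustration.

```python
def origin_of(as_path):
    """The origin AS is the last element of the AS_PATH."""
    return as_path[-1]

def peerlock_accept(as_path, neighbor_asn, protected):
    """protected: {protected_asn: set of neighbor ASNs allowed to relay it}."""
    for asn in as_path:
        if asn in protected and neighbor_asn != asn \
                and neighbor_asn not in protected[asn]:
            return False
    return True

# AS 64666 announces a prefix with AS 64500 forged as the origin.
forged_path = [64666, 64500]
protected = {64500: {64501}}          # only 64501 may relay routes via 64500
ov_sees = origin_of(forged_path)      # 64500, so a ROA for 64500 matches
accepted = peerlock_accept(forged_path, 64666, protected)
```

Here `ov_sees` is the legitimate origin, so OV alone would call the
route valid, yet `accepted` is False because 64666 is not an authorized
relay for 64500.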

In local scope (IP traffic that will only travel a few milliseconds) I expect 
to see a substantial increase of robustness as more and more networks deploy OV 
(and peer directly with networks that matter to them!), however the long paths 
will remain comparatively more vulnerable. Much like any multi-company 
logistics system spanning the globe.

In the span of just two years we went from "you only need to overcome a single 
obstacle to insert bad routing information in the system", to a situation where 
more and more things need to go wrong before rogue announcements are seen 
universally. This is incredible progress on a scale most people did not 
imagine possible. Are we there yet? No, but RPKI OV is a critical prerequisite 
to further progress.

From the last few days it seems to me we still have work ahead of us: folks 
need to receive training from their peers so they themselves can make informed 
decisions about RPKI. We should more openly compare notes about software 
defects and how to work around them. We should talk about (privately) how to 
manoeuvre OV deployment projects along in large corporations, most companies 
don't have a manual for how to deploy something like RPKI OV :-)

One of the best sources of documentation on RPKI is 
https://rpki.readthedocs.io/ - the docs are actively maintained to capture all 
common operational questions that pop-up over time.

Kind regards,

Job


RPKI OV implementation in route-map

2020-04-01 Thread Job Snijders
Dear Mark, group,

On Tue, Mar 31, 2020 at 03:50:23PM +0200, Mark Tinka wrote:
> On 31/Mar/20 15:21, Dorian Kim wrote:
> > Unfortunately we don’t have any testing done or experience with RPKI
> > on XE or Classic boxes as we don’t have any deployed outside of OOB
> > infrastructure.
>
> Cherish your blessings, and for the time being, keep them that way :-).

Since it was a quiet day in early April, Ben and I whipped up something
to generate config in industry standard format to mimic the RFC 6811
RPKI based BGP Origin Validation procedure. It uses the 'route-map'
configuration construct found in some older BGP implementations.

https://github.com/job/rpki-ov-route-map

We didn't test this in production, but I reckon you can upload the
generated output into the router's 'running-config' using an hourly
crontab, TFTP, RANCID, and expect(1). Here is an example config to
copy+paste. If we don't hear back from you we'll assume success. 

(warning: large text file)

https://raw.githubusercontent.com/job/rpki-ov-route-map/master/example-route-map-configuration.txt

After applying the above you can reference 'rpki-ov' at each of your
EBGP peers as ingress policy: "neighbor x.x.x.x route-map rpki-ov in".

Be careful though, performance may not be as good as a native RPKI OV
implementation!

Cheers,

Job & Ben


NTT/AS2914 enabled RPKI OV 'invalid = reject' EBGP policies

2020-03-25 Thread Job Snijders
Dear group,

Exciting news! Today NTT's Global IP Network (AS 2914) enabled RPKI
based BGP Origin Validation on virtually all EBGP sessions, both
customer and peering edge. This change positively impacts the Internet
routing system.

The use of RPKI technology is a critical component in our efforts to
improve Internet routing stability and reduce the negative impact of
misconfigurations or malicious attacks. RPKI Invalid route announcements
are now rejected in NTT EBGP ingress policies. A nice side effect:
peerlock AS_PATH filters are incredibly effective when combined with
RPKI OV.

For NTT, this is the result of a multiyear project, which included
outreach, education, collaboration with industry partners, and
production of open source software shared among colleagues in the
industry.

Shout out to Louis & team (Cloudflare) for the open source GoRTR
software and the OpenBSD project for rpki-client(8).

I hope some take this news as encouragement to consider RPKI OV
"invalid == reject"-policies as safe to deploy in their own BGP
environments too. :-)

If you have questions, feel free to reach out to me directly or the
NTT NOC at .

Kind regards,

Job


Re: interesting troubleshooting

2020-03-20 Thread Job Snijders
On Fri, Mar 20, 2020 at 05:57:19PM -0400, Jared Mauch wrote:
> You also need to watch out to ensure you’re not on some L2VPN type
> product that bumps up against a barrier.  I know it’s a stressful time
> for many networks and systems people as traffic shifts. 

A few years ago we did a presentation about what can happen if hashing
for load balancing purposes doesn't work well (be it either IP or L2VPN
traffic). I think some of the information is still relevant as there
really isn't much difference between the problem existing in the
underlay network's implementation of algorithms or the properties of the
envelope that encompasses the overlay network packet.
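
A toy model shows the effect. Real routers use vendor-specific hash
functions over whatever header fields they can see; SHA-256 over a
5-tuple is just a stand-in here, and the function name is made up.

```python
import hashlib

def pick_link(flow, n_links):
    """Map a flow tuple (src, dst, proto, sport, dport) to one of n_links."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_links

# Many ordinary flows: varying source ports spread over the bundle.
web_links = {pick_link(("198.51.100.10", "203.0.113.20", 6, sport, 443), 4)
             for sport in range(1024, 2024)}

# One tunnel: every packet has the same outer header, so the same link.
tunnel = ("198.51.100.10", "203.0.113.20", 50, 0, 0)
tunnel_links = {pick_link(tunnel, 4) for _ in range(1000)}
```

However much traffic rides inside the tunnel, `tunnel_links` contains a
single member, which is exactly how one fat overlay flow ends up
saturating one member link while the others sit idle.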

video of younger job + jeff: https://www.youtube.com/watch?v=cXSwoKu9zOg
slides: 
https://archive.nanog.org/meetings/nanog57/presentations/Tuesday/tues.general.SnijdersWheeler.MACaddresses.14.pdf

Kind regards,

Job


Re: interesting troubleshooting

2020-03-20 Thread Job Snijders
On Fri, Mar 20, 2020 at 05:33:31PM -0400, Nimrod Levy wrote:
> With the increase in remote workers and VPN traffic that won't hash across
> multiple paths, I thought this anecdote might help someone else track down
> a problem that might not be so obvious.

Do we know which VPN technologies specifically are harder to hash in a
meaningful way for load balancing purposes than others?

If the outcome of this troubleshooting is a list of recommendations
about which VPN approaches to use, and which ones to avoid (because of
the issue you described), that'll be a great outcome.

Kind regards,

Job


Re: Need help removing a old/outdated/incorrect proxy route object

2020-03-17 Thread Job Snijders
I can help! Will follow-up off list.

For future reference: db-ad...@rr.ntt.net is also a good place to direct any 
questions about NTT's IRR service "NTTCOM"

Kind regards,

Job

On Tue, Mar 17, 2020, at 20:54, Sadiq Saif wrote:
> Hi all,
> 
> I am looking for help with removal of a old/outdated/incorrect proxy 
> route object for one of my prefixes, 192.195.251.0/24.
> 
> The object in question:
> route:  192.195.251.0/24
> descr:  Proxy-registered route object
> origin: AS135091
> remarks:This is a HGC customer route-object
> remarks:which is being exported under this origin AS.
> remarks:
> remarks:This route object was created because no existing
> remarks:route object with the same origin was found.
> remarks:
> remarks:Please contact r...@hutchcity.com if you have any
> remarks:questions regarding this object.
> notify: r...@hutchcity.com
> mnt-by: MAINT-AS9304
> changed:r...@hutchcity.com 20171209
> source: NTTCOM
> 
> I reached out to the address on file already but the mail server there 
> is not reachable. Additionally I have no recollection of ever having 
> used services from any of the AS mentioned.
> 
> The correct and only origin for that should be AS393949 as is in the 
> ARIN IRR route object and also the ROA.
> 
> Can somebody help me with this?
> 
> Thanks in advance.
> 
> -- 
>   Sadiq Saif/AS393949
>   https://sadiqsaif.com/
>


Re: AT&T is suspending broadband data caps for home internet customers due to coronavirus

2020-03-17 Thread Job Snijders
On Tue, Mar 17, 2020, at 19:38, Dan White wrote:
> By "ahead of us", I'm hoping to glean some operational experience from
> European, or networks in larger cities with a more impactful lock
> down.

It is all fairly new here too. Some of the things that have come to mind so far:

- the supply chain for components (linecards / fabric cards) may be hampered, 
shipments are slowed down, probably due to staffing issues at each hop.

- for buildout projects which require a small crew to assemble/construct/lift 
(heavy) things, you may no longer be able to form such crews. One might have to 
entertain the notion that all physical work has to fit the capabilities of a 
single person

- Flying your own staff around to do physical work is no longer a responsible 
option

- Availability of remote hands is reduced (or in some places even entirely 
unavailable)

I'm sure this list will continue to grow as we learn more about how things used 
to work and what no longer works.

Kind regards,

Job


Re: RADB account deletions

2020-03-03 Thread Job Snijders
On Tue, Mar 03, 2020 at 11:22:35AM -0700, Clinton Work wrote:
> It looks like the former Allstream RADB account (MAINT-AS15290) and
> all associated route objects were removed from RADB today.   The
> deletion mainly impacts Canadian route objects registered by the
> former Allstream (now Zayo).   Q9 Networks MAINT-AS12188 was deleted
> at the same time.   

For those interested, I think this is roughly what was deleted:

snapshot/20200302-0001/whois.radb.net$ awk 'BEGIN {RS=""; FS="\n"} 
/MAINT-AS15290/ {print $0 "\n"}' radb.db > MAINT-AS15290.2020.03.02.txt

file @ http://instituut.net/~job/MAINT-AS15290.2020.03.02.txt
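
For readers less familiar with awk's paragraph mode (RS=""), the same
extraction can be sketched in Python. This is only an illustration of the
technique, not what was actually run: the function name and the sample
objects below are made up, and the match is a plain substring test where awk
uses a regular expression.

```python
import re


def objects_matching(db_text, needle):
    """Return the blank-line-separated objects in db_text that contain needle."""
    # Normalise line endings, then split on one or more blank lines,
    # mirroring awk's paragraph mode (RS="").
    normalised = db_text.replace("\r\n", "\n")
    blocks = [block.strip("\n") for block in re.split(r"\n\s*\n", normalised)]
    return [block for block in blocks if block and needle in block]


# Hypothetical sample data using documentation prefixes and private-use ASNs.
sample = """route:      192.0.2.0/24
origin:     AS64500
mnt-by:     MAINT-AS15290
source:     RADB

route:      198.51.100.0/24
origin:     AS64501
mnt-by:     MAINT-EXAMPLE
source:     RADB
"""

matches = objects_matching(sample, "MAINT-AS15290")
print("\n\n".join(matches))
```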

I'm sure RADB staff can restore the objects as appropriate if they are contacted.

Kind regards,

Job


Re: akamai yesterday - what in the world was that

2020-02-11 Thread Job Snijders
> Any word on what the update was for? It caused quite a jump in traffic on our 
> network.

On twitter "68 GB" was trending
https://twitter.com/search?q=%2268%20GB%22=trend_click

Kind regards,

Job


Re: new tool: rpki-ov-checker

2020-02-06 Thread Job Snijders
Oops, I see a fat typo slipped in - the correct URL is
https://github.com/job/rpki-ov-checker :-)

Kind regards,

Job

On Thu, Feb 6, 2020 at 20:35 Job Snijders  wrote:

> Dear ops,
>
> I wrote a simple tool to figure out what kind of invalid a rpki invalid
> is, this can aid people in understanding the impact of "invalid ==
> reject" routing policies. Only "invalid_unreachable" routes present
> an operational issue in my opinion, IP addresses covered by "notfound"
> or "valid" less specific routes will still be reachable.
>
> You pass it a file name (or via stdin) with one prefix and origin ASN
> per line (white space separated) representing your full BGP RIB, and
> then you can grep specific for the task at hand to extract the info you
> need:
>
> $ rpki-ov-checker full_rib | fgrep -f customer_prefixes | grep invalid |
> sort -R | head
> invalid_covered_by_notfound 123.101.0.0/21 4809 covering route:
> 123.101.0.0/16 4134
> invalid_covered_by_valid 46.3.74.0/24 134121 covering route: 46.3.0.0/16
> 207636
> invalid_unreachable 83.231.209.0/24 3949
> invalid_unreachable 124.30.247.0/24 9583
> invalid_covered_by_valid 125.21.232.0/24 9730 covering route:
> 125.21.0.0/16 9498
> invalid_unreachable 120.29.92.0/24 17639
> invalid_unreachable 31.40.164.0/24 200872
> invalid_covered_by_notfound 45.12.139.0/24 40676 covering route:
> 45.12.136.0/22 35913
> invalid_covered_by_valid 122.160.178.0/24 24560 covering route:
> 122.160.0.0/16 24560
> invalid_covered_by_valid 61.90.251.0/24 21734 covering route:
> 61.90.192.0/18 7470
>
> NTT is using this to figure out who we need to help fix their ROA or
> correct their BGP announcements.
>
> Get the goods at https://githqub.com/job/rpki-ov-checker
>
> Enjoy!
>
> Kind regards,
>
> Job
>


new tool: rpki-ov-checker

2020-02-06 Thread Job Snijders
Dear ops,

I wrote a simple tool to figure out what kind of invalid an RPKI invalid
is. This can aid people in understanding the impact of "invalid ==
reject" routing policies. In my opinion, only "invalid_unreachable"
routes present an operational issue; IP addresses covered by "notfound"
or "valid" less-specific routes will still be reachable.

You pass it a file name (or feed it via stdin) with one prefix and origin
ASN per line (whitespace separated) representing your full BGP RIB, and
then you can grep specifically for the task at hand to extract the info
you need:

$ rpki-ov-checker full_rib | fgrep -f customer_prefixes | grep invalid | sort -R | head
invalid_covered_by_notfound 123.101.0.0/21 4809 covering route: 123.101.0.0/16 4134
invalid_covered_by_valid 46.3.74.0/24 134121 covering route: 46.3.0.0/16 207636
invalid_unreachable 83.231.209.0/24 3949
invalid_unreachable 124.30.247.0/24 9583
invalid_covered_by_valid 125.21.232.0/24 9730 covering route: 125.21.0.0/16 9498
invalid_unreachable 120.29.92.0/24 17639
invalid_unreachable 31.40.164.0/24 200872
invalid_covered_by_notfound 45.12.139.0/24 40676 covering route: 45.12.136.0/22 35913
invalid_covered_by_valid 122.160.178.0/24 24560 covering route: 122.160.0.0/16 24560
invalid_covered_by_valid 61.90.251.0/24 21734 covering route: 61.90.192.0/18 7470

NTT is using this to figure out who we need to help fix their ROA or
correct their BGP announcements.
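
The categorisation described above can be sketched roughly as follows in
Python. This is a simplified illustration and not the tool's actual code: the
function name, the input format (routes that already carry a validation
state), and the choice to report the most specific non-invalid covering route
are all assumptions made for the example. The sample routes and their states
are taken from the output shown above.

```python
import ipaddress


def classify_invalids(rib):
    """Classify invalid routes by what would still cover them after rejection.

    rib: iterable of (prefix, origin_asn, state) with state being one of
    "valid", "invalid" or "notfound" (assumed to be computed elsewhere).
    Yields (label, prefix, origin_asn, covering_route_or_None).
    """
    routes = [(ipaddress.ip_network(p), asn, state) for p, asn, state in rib]
    # Routes that survive an "invalid == reject" policy.
    usable = [route for route in routes if route[2] != "invalid"]
    for net, asn, state in routes:
        if state != "invalid":
            continue
        covers = [
            route for route in usable
            if route[0].version == net.version
            and route[0].prefixlen < net.prefixlen
            and net.subnet_of(route[0])
        ]
        if not covers:
            yield ("invalid_unreachable", str(net), asn, None)
            continue
        # Assumption: forwarding follows the most specific remaining route.
        best = max(covers, key=lambda route: route[0].prefixlen)
        yield (f"invalid_covered_by_{best[2]}", str(net), asn,
               (str(best[0]), best[1]))


rib = [
    ("123.101.0.0/16", 4134, "notfound"),
    ("123.101.0.0/21", 4809, "invalid"),
    ("46.3.0.0/16", 207636, "valid"),
    ("46.3.74.0/24", 134121, "invalid"),
    ("83.231.209.0/24", 3949, "invalid"),
]
results = list(classify_invalids(rib))
for label, prefix, asn, cover in results:
    print(label, prefix, asn, cover or "")
```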

Get the goods at https://githqub.com/job/rpki-ov-checker

Enjoy!

Kind regards,

Job


Re: Microsoft mail delivery issue

2020-01-31 Thread Job Snijders
Dear Paul,

I recommend subscribing and reaching out to the “mailop” mailing list. You
may not see replies from the big mail operators in the archives, but I
suspect a lot of relevant people pay attention to this specific list.

https://chilli.nosignal.org/cgi-bin/mailman/listinfo/mailop

Kind regards,

Job

On Fri, Jan 31, 2020 at 21:43 Paul Kelly - Blacknight 
wrote:

> Hi There,
>
>
>
> If there are any Microsoft mail admins on the list can they please contact
> me ASAP. We’re having deliverability problems into you and all the usual
> tools don’t help with fixing problems, only diagnosing them after the fact.
>
>
>
> Thanks,
>
>
> Paul
>
>
>
> Paul Kelly
>
> CTO
>
> Blacknight Internet Solutions Limited
>
> Cloud Hosting, Colocation, Dedicated servers, IP Transit Services
>
> ISO 27001:2013 Certified
>
> Tel: +353(0)599183072
>
> Lo-call: 1850 929 929
>
> DDI: +353 (0) 59 9183091
>
> Skype: flamegrill
>
>
>
>
>
> e-mail: p...@blacknight.com
>
> web: http://www.blacknight.com
>
>
>
>
>
> Blacknight Internet Solutions Ltd, Unit 12A,Barrowside Business Park,
> Sleaty Road, Graiguecullen, Carlow, Ireland. Company No.: 370845
>
>
>


Re: Rogue objects in routing databases

2020-01-24 Thread Job Snijders
Hi!

This came up on our radar somewhere in the last 24 hours too. It indeed
does look very curious. Thank you for your analysis and report.

NTT is taking steps to figure out what is behind this. Our current
working theories are that perhaps the IRR maintainer account was
compromised, that some kind of automation script has gone rogue, or
that there is adversarial intent and this is stage setting.

I'm not sure we will be able to report our findings back to this group,
but we are actively investigating.

Kind regards,

Job

On Sat, Jan 25, 2020 at 12:06:51AM +0100, Florian Brandstetter wrote:
> It appears that there is currently an influx of rogue route
> objects created within the NTTCOM and RaDB IRR databases, in
> connection to Quadranet (AS8100) and China Mobile
> International (CMI).
> 
> Examples of affected networks are:
> 
> 193.30.32.0/23
> 45.129.92.0/23
> 45.129.94.0/24
> 
> Networks, which have seemingly no affiliation with
> Quadranet, nor China Mobile International (CMI), which
> merely appears to be an upstream of Quadranet and hence
> creates the route objects in an automated manner.
> 
> Another person has already reached out to Quadranet to find
> out the root cause of the creation of these objects. Their
> support gave an ETA of 24-72 hours.
> 
> The route objects are all identical:
> 
> route:  193.30.32.0/23
> descr:  CMI  (Customer Route)
> origin: AS8100
> mnt-by: MAINT-AS58453
> changed:qas_supp...@cmi.chinamobile.com 20200117
> source: RADB
> 
> There appears to be a correlation with the affected
> networks, a fair share of them is part of AS-SBAG, which in
> turn is part of AS-VMHAUS, which in turn is part of AS-
> QUADRANET and could yield the importing of these prefixes.
> AS-VMHAUS appears to be a customer of Quadranet, listed
> within AS-QUADRANET-CUSTOMER-ASSET.
> 
> These networks do however have no direct connection to
> Quadranet, and are not affiliated with Quadranet, nor are
> currently connected to Quadranet, which, entirely ignoring
> that the `origin` points to Quadranet, makes the route
> object illicit.
> 
> Basically this has given AS8100, whether that be
> legitimately Quadranet, or somebody impersonating/spinning
> up a rogue AS8100, theoretical control over a massive amount
> of prefixes, as these can be advertised without restrictions
> and very likely reach a fairly high percentage of global
> visibility.
> 
> --
> Florian Brandstetter
> President & Founder
> SquareFlow Network LTD.
> 


Re: Dual Homed BGP

2020-01-24 Thread Job Snijders
Dear Brian,

On Fri, 24 Jan 2020 at 17:40, Brian  wrote:

> Hello all. I am having a hard time trying to articulate why a Dual Home
> ISP should have full tables. My understanding has always been that full
> tables when dual homed allow much more control. Especially in helping to
> prevent Async routes.
>

The advantage of receiving full routing tables from both providers is that
in cases where one of the two providers is not yet fully converged, your
routers will use the other provider for those missing destinations. This
may happen during maintenance or router boot-up in your upstream’s network.

Another advantage of receiving full routes is that you can manipulate
LOCAL_PREF per destination, or compose routing policy based on per-route
attributes such as BGP communities your upstreams set. It can happen that a
provider is great for 99% of destinations, except a few - without full
tables such granular traffic-engineering can be cumbersome.
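
A small Python sketch may help make both advantages concrete. It is a toy
model under stated assumptions, not a BGP implementation: the Route class,
the upstream names "A" and "B", the LOCAL_PREF values and the prefixes are
all hypothetical, and real best-path selection involves many more steps.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    upstream: str
    local_pref: int = 100


def best_route(routes, destination):
    """Pick the route used to forward towards destination (toy model)."""
    addr = ipaddress.ip_address(destination)
    candidates = [r for r in routes if addr in ipaddress.ip_network(r.prefix)]
    if not candidates:
        return None
    # Longest prefix wins first; LOCAL_PREF then decides between upstreams.
    return max(
        candidates,
        key=lambda r: (ipaddress.ip_network(r.prefix).prefixlen, r.local_pref),
    )


routes = [
    # Default policy in this example: prefer upstream A (150) over B (100).
    Route("203.0.113.0/24", "A", 150),
    Route("203.0.113.0/24", "B", 100),
    # Per-destination exception: B performs better here, so raise LOCAL_PREF.
    Route("198.51.100.0/24", "A", 150),
    Route("198.51.100.0/24", "B", 200),
    # A has not converged yet for this prefix; only B offers it.
    Route("192.0.2.0/24", "B", 100),
]

for destination in ("203.0.113.7", "198.51.100.7", "192.0.2.7"):
    print(destination, "->", best_route(routes, destination).upstream)
```

With only a default route from each provider, neither the per-destination
exception nor the automatic fallback for the missing prefix would be possible.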

Virtually all internet routing is asymmetric, I wouldn’t consider that an
issue.

> Am I crazy?
>

I dropped out of university, never completed my psychology studies, I fear
I am unqualified to answer this question. ;-)

Kind regards,

Job


Re: Wikipedia drops support for old Android smartphones; mandates TLSv1.2 to read

2019-12-31 Thread Job Snijders
On Tue, Dec 31, 2019 at 17:26 Seth Mattinen  wrote:

> On 12/31/19 8:10 AM, joel jaeggli wrote:
> > Argumentation on the basis of a tu quoque fallacy doesn't really add
> > much to the discussion. Deprecating potentially dangerous and definitely
> > obsolete protocols does not make you a hypocrite.
>
>
> Then how about privilege?
>
> If someone is living in a less-privileged situation (oppressive regime,
> state controlled ISP, extreme poverty, whatever) there's also a good
> chance that such people may not be able to acquire newer/updated technology
> easily, perhaps not even legally at great risk. I will disagree with
> anyone's assertion that people in such conditions deserve to be
> disenfranchised.



I’m not entirely sure an argument based on privilege applies cleanly here.
There are freely supported (open source) TLS 1.2 / TLS 1.3 implementations
available for download - at no cost - that run on commodity hardware, even
as old as i386 cpu chips.

Kind regards,

Job


Re: Holiday route leak

2019-12-30 Thread Job Snijders
Dear all,

On Fri, Dec 27, 2019 at 04:06:24PM -0500, Christopher Morrow wrote:
> If there are AS46844 folk listening around their eggnog ... it'd be
> nice if you would stop leaking prefixes: https://imgur.com/a/Js0YvP2
> 
> this from the current view at: https://bgp.he.net/AS15169#_graph6
> 
> I believe at least: 2620:0:1000::/40
> 
> was leaking around your noction filters.
>
> It is also possible that AS11878 should check their in/out filtering
> as well, since that's the path I see in the he.net data...
> 
> thanks!
> -chris
> 
> it looks like this is a noction box doing some internal TE things and
> leaking around filters...though normally that appears as a subnet, not
> an exact route match, so perhaps not this time?

Can anyone offer ground-truth confirmation that the Noction IRP software
actually supports IPv6?

Kind regards,

Job


Re: Starting to Drop Invalids for Customers

2019-12-10 Thread Job Snijders
Dear Arturo, group,

On Tue, Dec 10, 2019 at 20:51 Arturo Servin  wrote:

>
> Invalid according to RPKI or IRR? Or both?
>

In this context the use of the word “invalid” refers to the result of the
validation procedure described in RFC 6811 - which is to match received BGP
updates to the RPKI and attach one of “valid”, “invalid”, or “not-found”.
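
As a rough illustration of that procedure, here is a minimal Python sketch of
origin validation. It is a simplification written for this explanation, not
code from any validator: the function name and the tuple layout for validated
ROA payloads (prefix, maxLength, ASN) are assumptions, and the prefixes and
ASNs are documentation values.

```python
import ipaddress


def origin_validation(prefix, origin_asn, vrps):
    """Return "valid", "invalid" or "not-found" for a route.

    vrps: iterable of (vrp_prefix, max_length, asn) tuples derived from ROAs.
    """
    route = ipaddress.ip_network(prefix)
    covered = False
    for vrp_prefix, max_length, asn in vrps:
        vrp = ipaddress.ip_network(vrp_prefix)
        if vrp.version != route.version or not route.subnet_of(vrp):
            continue  # this payload does not cover the route at all
        covered = True
        # A match needs the right origin AND a prefix length within maxLength.
        # AS 0 is treated as never matching any route.
        if route.prefixlen <= max_length and asn == origin_asn and asn != 0:
            return "valid"
    # Covered by at least one payload but matched by none: invalid.
    return "invalid" if covered else "not-found"


vrps = [("192.0.2.0/24", 24, 64500)]
print(origin_validation("192.0.2.0/24", 64500, vrps))     # matching origin
print(origin_validation("192.0.2.0/24", 64501, vrps))     # wrong origin
print(origin_validation("192.0.2.0/25", 64500, vrps))     # too specific
print(origin_validation("198.51.100.0/24", 64500, vrps))  # no covering payload
```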

In IRR, the challenge has always been that “route:” objects describe a
state of the network that may exist, but the semantics of “route:” objects
don’t allow extrapolation towards what should definitely *not* exist in the
BGP Default-Free Zone.

RPKI ROAs (compared to IRR objects) carry different meaning: the existence
of a ROA (both by definition and common implementation) supersedes other
data sources (IRR, LOAs, or comments in whois records, etc), and as such
can be used on any type of EBGP session for validation of the received
Internet routing information.

Kind regards,

Job



Re: Comcast & NTT packet loss today

2019-12-03 Thread Job Snijders
Hi all,

We are following up off-list!

This may be a good moment to mention that the excellent people at the NTT
NOC are always available at n...@ntt.net, or the phone numbers listed in
PeeringDB. :-)

Kind regards,

Job

On Tue, Dec 3, 2019 at 23:19 Ben Cannon  wrote:

> We’re trying to figure out whether this is an isolated or wider incident,
> looking like one of our customer’s flows is fragging between NTT and
> Comcast.
>
>  5  ae-0.a02.snjsca04.us.bb.gin.ntt.net (129.250.2.3)  5.571 ms  13.026 ms  9.514 ms
>  6  ae-0.comcast.snjsca04.us.bb.gin.ntt.net (129.250.66.34)  119.688 ms  117.368 ms  111.902 ms
>  7  be-12578-cr01.9greatoaks.ca.ibone.comcast.net (68.86.88.17)  118.845 ms  117.035 ms  114.762 ms
>  8  * be-7922-ar01.hayward.ca.sfba.comcast.net (68.86.94.154)  120.025 ms  114.310 ms
>  9  * be-397-rar01.fairfield.ca.sfba.comcast.net (96.108.99.10)  119.178 ms *
> 10  162.151.79.66 (162.151.79.66)  123.215 ms  121.184 ms  124.125 ms
>
> If anyone from either entity would like to contact me off list for
> troubleshooting please feel free.
>
>
> -Ben Cannon
> CEO 6x7 Networks & 6x7 Telecom, LLC
> b...@6by7.net
>
>
>
>


A new open source RPKI CA solution: NLnet Labs' Krill

2019-12-03 Thread Job Snijders
Dear fellow network operators,

It appears Santa brought presents early this year! I'd like to draw
attention to the below forwarded message and provide my take on it.

Some of you represent organisations that interact with multiple RIRs,
and have concluded it can be challenging to figure out the RPKI ROA
provisioning process for each individual RIR and integrate those
different processes with your internal business process.

Every RIR provides their members with what is called a 'hosted' RPKI
service. The 'hosted' RPKI service means the RIRs offer web interfaces
which operators use to create & publish RPKI ROAs. However, the devil is
in the details: concepts such as 'who holds the private keys?' or the API
specification differ from RIR to RIR. In this context the differences
aren't necessarily good or bad, they are just different.

For many operators the RIR hosted model is excellent, but ... there also
is a class of users who would perhaps benefit from something more
'unified', and this is where Krill comes in!

The use case where Krill really shines is that you can ask your RIR to
delegate your resources to your Krill instance, and then build your
tooling to interact with just Krill (instead of building RIR-specific
software)!

To me the very existence of Krill is a sign of a maturing RPKI
ecosystem. If I stare deeply into my crystal ball I can already see the
rise of third-party hosted RPKI solutions for provisioning & monitoring
RPKI objects, or integrations with IPAM systems such as 6connect. I
believe these would be positive developments for the operational
Internet community.

In short: if RPKI is on your company's roadmap, give Krill a spin! :)

get the goods: https://github.com/NLnetLabs/krill
documentation: https://rpki.readthedocs.io/en/latest/krill/

Kind regards,

Job

- Forwarded message from Alex Band  -

Date: Tue, 3 Dec 2019 12:33:51 +0100
From: Alex Band 
To: r...@nlnetlabs.nl
Subject: [RPKI] Krill 0.4.0 'The Krill Factor' released and running in
production

Dear mailing list,

We are incredibly proud to introduce Krill 0.4.0 'The Krill Factor'. This
release is the culmination of one and a half years of designing, building,
testing and documenting our RPKI Certificate Authority (CA) and
Publication Server solution.

The first three releases of Krill were meant to test the implementation.
With Krill 0.4.0 'The Krill Factor', we are confident that the software
can be used reliably with all five Regional Internet Registries (RIRs) and
its Route Origin Authorisations (ROAs) are correctly validated by all
Relying Party software implementations. As a result, NLnet Labs is now
running Krill in production under the RIPE NCC parent CA.

With Krill 0.4.0 'The Krill Factor', operators can now generate and
publish RPKI cryptographic material themselves to authorise their BGP
announcements. It supports running RPKI under all five RIRs simultaneously
and transparently, so if you have IP address space in multiple regions you
can manage it as a single pool. Krill can also delegate to child
organisations or customers who, in turn, run their own CA. The built-in
publication server lets operators publish certificates and ROAs from their
own infrastructure. Alternatively, you can use a third party which offers
RPKI publication as a service. In short, all essential functions to run
RPKI yourself using Krill are now available.

Krill can be managed using a Command Line Interface (CLI), as well as an
Application Programming Interface (API). An optional web-based user
interface is currently being developed as a separate project, named
Lagosta. With Krill 0.4.0 'The Krill Factor' data storage and the API are
now stable, allowing for seamless updates going forward. This release
serves as a starting point for further development throughout 2020 and
beyond, where we will work on features such as high availability and
support for just-in-time authorisations integrated tightly with internal
routing management.

Starting with Krill 0.4.0 and Routinator 0.6.0 we are offering commercial
support for our RPKI software solutions, in case this is a requirement for
your organisation or if you want to support the future development of the
software. The service-level agreement (SLA) contract and security policy
is on par with our DNS software NSD and Unbound. End of support for the
software will be publicly announced two years in advance. Krill is
licensed under the Mozilla Public License 2.0. Routinator and all
libraries that are built to support the RPKI toolset are licensed under
the BSD 3-Clause License.

Once again, we would like to extend our gratitude to NIC.br, the RIPE NCC
Community Projects Fund, the Dutch National Cyber Security Centre and the
Mozilla Open Source Support Fund for financially supporting the
development of Krill, as well as our Relying Party software package
Routinator. In addition, our thanks go out to DigitalOcean for offering
their cloud infrastructure for our automated test 

Re: SP 800-189 (Draft), Resilient Interdomain Traffic Exchange

2019-10-28 Thread Job Snijders
Dear Douglas,

Thanks for sharing the link. This is an impressive effort!

Can you share with the group what the best way is to share feedback to
effect changes in the document?

Is there a difference between just emailing you or are there official
channels to be considered?

Kind regards,

Job

On Mon, Oct 28, 2019 at 16:04 Montgomery, Douglas C. (Fed) via NANOG <
nanog@nanog.org> wrote:

> https://csrc.nist.gov/publications/detail/sp/800-189/draft
>
> This document provides technical guidance and recommendations for
> technologies that improve the security and robustness of interdomain
> traffic exchange. Technologies recommended in this document for securing
> the interdomain routing control traffic include Resource Public Key
> Infrastructure (RPKI), BGP origin validation (BGP-OV), and prefix
> filtering. Additionally, technologies recommended for mitigating DoS and
> DDoS attacks include prevention of IP address spoofing using source address
> validation with access control lists (ACLs) and unicast Reverse Path
> Forwarding (uRPF). Other technologies such as remotely triggered black hole
> (RTBH) filtering, flow specification (Flowspec), and response rate limiting
> (RRL) are also recommended as part of the overall security mechanisms.
>
>
>
> dougm
>
> --
>
> Doug Montgomery, Manager Internet  & Scalable Systems Research @ NIST
>
>
>


Re: Anyone from NTT America here?

2019-10-23 Thread Job Snijders
Dear Stephen,

I’ll work with you off-list to investigate! :-)

Kind regards,

Job
NTT / AS 2914

On Wed, Oct 23, 2019 at 14:23 Ross Tajvar  wrote:

> What was the source/destination?
>
> On Wed, Oct 23, 2019, 2:10 PM Stephen Satchell  wrote:
>
>> Routing loop
>>
>> >  11.|-- 129.250.24.196    0.0%     1   28.9  28.9  28.9  28.9   0.0
>> >  12.|-- 129.250.130.254   0.0%     1   29.0  29.0  29.0  29.0   0.0
>> >  13.|-- 129.250.130.253   0.0%     1   29.4  29.4  29.4  29.4   0.0
>> >  14.|-- 129.250.130.254   0.0%     1   29.6  29.6  29.6  29.6   0.0
>> >  15.|-- 129.250.130.253   0.0%     1   28.5  28.5  28.5  28.5   0.0
>> >  16.|-- 129.250.130.254   0.0%     1   29.0  29.0  29.0  29.0   0.0
>> >  17.|-- 129.250.130.253   0.0%     1   28.6  28.6  28.6  28.6   0.0
>> >  18.|-- 129.250.130.254   0.0%     1   27.9  27.9  27.9  27.9   0.0
>> >  19.|-- 129.250.130.253   0.0%     1   28.4  28.4  28.4  28.4   0.0
>> >  20.|-- 129.250.130.254   0.0%     1   27.9  27.9  27.9  27.9   0.0
>> >  21.|-- 129.250.130.253   0.0%     1   28.2  28.2  28.2  28.2   0.0
>> >  22.|-- 129.250.130.254   0.0%     1   29.0  29.0  29.0  29.0   0.0
>> >  23.|-- 129.250.130.253   0.0%     1   27.9  27.9  27.9  27.9   0.0
>> >  24.|-- 129.250.130.254   0.0%     1   28.6  28.6  28.6  28.6   0.0
>> >  25.|-- 129.250.130.253   0.0%     1   28.7  28.7  28.7  28.7   0.0
>>
>


Re: IPv6 Thought Experiment

2019-10-02 Thread Job Snijders
It appears in your thought experiment, a stick is dressed up like a carrot.

I’m not a fan of deploying purely punitive strategies to promote adoption;
technologies should stand on their own and be able to convince the
potential users based on their merit, not based on penalties.


Re: Elad Cohen (was: Re: Cogent sales reps who actually respond)

2019-09-18 Thread Job Snijders
It would be good to see some receipts, offered by the selling party.


Re: new BGP hijack & visibility tool “BGPalerter”

2019-08-15 Thread Job Snijders
Hi Ryan, Alarig,

> On 14/08/2019 19:06, Ryan Hamel wrote:
> > I appreciate the effort and the intent behind this project, but why
> > should the community contribute to an open source project on GitHub
> > that is mainly powered by a closed source binary?
>
On Wed, Aug 14, 2019 at 07:13:47PM +0200, Alarig Le Lay wrote:
> You can build it yourself, see
> https://github.com/nttgin/BGPalerter#more-information-for-developers
> 
> I think that the binaries are here for those that don’t want to install
> all the build-chain.

Indeed, the binary files in the github repository in the 'bin/'
directory are merely provided as a convenience service so interested
people don't need to compile the software themselves in order to run
tests. This project is 100% open source.

At some point in the future ready-made binaries should move to a
different place, for example perhaps we can distribute packages through
the PPA mechanism for debian/ubuntu. It would be cool if we get to the
point where one can install the software by simply issuing a command
like "apt install bgpalerter". Help with packaging is most welcome! :-)

Kind regards,

Job


new BGP hijack & visibility tool “BGPalerter”

2019-08-14 Thread Job Snijders
Dear NANOG,

Recently NTT investigated how to best monitor the visibility of our own and
our subsidiaries’ IP resources in the BGP Default-Free Zone. We were
specifically looking at how to get near real-time alerts funneled into an
actionable pipeline for our NOC & Operations department when BGP hijacks
happen.

Previously we relied on a commercial “BGP Monitoring as a Service”
offering, but with the advent of RIPE NCC’s “RIS Live” streaming API [1] we
saw greater potential for a self-hosted approach designed specifically for
custom integrations with various business processes. We decided to write
our own tool “BGPalerter” and share the source code with the Internet
community.

BGPalerter allows operators to specify in great detail how to distribute
meaningful information from the firehose from various BGP data sources (we
call them “connectors”), through data processors (called “monitors”),
finally outputted through “reports” into whatever mechanism is appropriate
(Slack, IRC, email, or a call to your ticketing system’s API).
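
The connector, monitor and report stages can be pictured with a short Python
sketch. To be clear, this is not BGPalerter's code or configuration format:
every function name, the dictionary layout of an update, the peer labels and
the prefixes are hypothetical, and the connector here replays a static list
where a real one would consume a live stream.

```python
import ipaddress


def static_connector(updates):
    """Connector: yield BGP updates as dicts (a real one would stream them)."""
    yield from updates


def hijack_monitor(updates, expected):
    """Monitor: flag our prefixes (or more-specifics) seen with another origin.

    expected: mapping of monitored prefix -> expected origin ASN.
    """
    monitored = {ipaddress.ip_network(p): asn for p, asn in expected.items()}
    for update in updates:
        announced = ipaddress.ip_network(update["prefix"])
        for net, asn in monitored.items():
            same_family = announced.version == net.version
            if same_family and announced.subnet_of(net) and update["origin"] != asn:
                yield {
                    "prefix": update["prefix"],
                    "expected": asn,
                    "seen": update["origin"],
                    "peer": update["peer"],
                }


def text_report(alerts, send):
    """Report: turn alerts into messages and hand them to a delivery callback."""
    for alert in alerts:
        send(
            f"possible hijack of {alert['prefix']}: expected "
            f"AS{alert['expected']}, saw AS{alert['seen']} via {alert['peer']}"
        )


updates = [
    {"prefix": "192.0.2.0/24", "origin": 64500, "peer": "peer-1"},
    {"prefix": "192.0.2.128/25", "origin": 64666, "peer": "peer-2"},
    {"prefix": "198.51.100.0/24", "origin": 64501, "peer": "peer-1"},
]
expected = {"192.0.2.0/24": 64500}

messages = []
text_report(hijack_monitor(static_connector(updates), expected), messages.append)
print("\n".join(messages))
```

Swapping `messages.append` for a function that posts to a chat channel or a
ticketing API changes the delivery mechanism without touching the monitor.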

The source code is available on Github, under a liberal open source license
to foster community collaboration:

https://github.com/nttgin/BGPalerter

If you wish to contribute to the project, please use Github’s “issues” or
“pull request” features. Any help is welcome! We’d love suggestions for new
features, updates to the documentation, help with setting up a CI
regression testing pipeline, or packaging for common platforms.

Kind regards,

Job & Massimo
NTT Ltd

[1]: https://ris-live.ripe.net/


Re: RPKI adoption

2019-08-14 Thread Job Snijders
Dear all,

On Wed, Aug 14, 2019 at 10:36:44AM +, John Curran wrote:
> On 14 Aug 2019, at 2:26 AM, Matthew Petach  wrote:
> > ...
> > Now, at the risk of bringing down the ire of the community on my
> > head...ARIN could consider tying the elements together, at least for
> > ARIN members.  Add the RPKI terms into the RSA document.  You need
> > IP number resources, congratulations, once you sign the RSA, you're
> > covered for RPKI purposes as well.
> 
> Matthew - 
> 
>   Yes indeed - this is one of several potential improvements that we’re 
> also investigating. 

I've attempted to produce a humorous world map chart to help clarify
there is a degree of asymmetry our community may need to consider:


http://instituut.net/~job/screenshots/e079d90a-3047-4034-8e7c-9caf6e387f1a.png

The ARIN members (mostly located in the red area) would like all
not-ARIN-members (located in the blue area, the rest of the world) to
use and honor their ROAs published through ARIN's RPKI service.

If not for the purpose of facilitating BGP Origin Validation on as many
of the Internet's routers as possible to protect one's IP resources, why
else would anyone publish RPKI ROAs through their RIR?

In other words: ARIN members want something (something very reasonable!)
from "the rest of the world", but in order to accomplish that
'something', unfortunately "the rest" needs to agree to the ARIN RPA.
This has proven to be somewhat of an adoption barrier.

It would be fantastic if "the rest" were not required to do any such
thing and the ARIN RPKI TAL could be distributed without restrictions or
limitations.

I would love to see any solution that removes all potential friction for
"the rest of the world", even if that shifts some additional burden to
ARIN members themselves; because it's ARIN members that want something
from the world, less so the other way around.

On Wed, Aug 14, 2019 at 4:42 AM John Curran  wrote:
> Interestingly enough, those same indemnification clauses are in the
> registration services agreement that they already signed but
> apparently they were not an issue at all when requesting IP address
> space or receiving a transfer.

Your observation (if correct) indeed is very interesting, and perhaps
demonstrates that RPKI business is something between ARIN and ARIN's
members, and less so between ARIN and all other potential relying
parties on this planet. Or phrased differently: perhaps only ARIN
members should be the ones incurring the cost and burden of reviewing
and accepting ARIN's agreements.

I'd like to express my appreciation to ARIN's staff & ARIN's Board of
Trustees for dedicating their time and resources to research how to
improve in this context.

Kind regards,

Job

ps. Of course this map is an oversimplification of the situation,
apologies for any inaccuracies.


Re: 44/8

2019-07-18 Thread Job Snijders
On Fri, Jul 19, 2019 at 3:16 AM Adam Korab  wrote:
>
> On 07/18/2019 at 23:08, Job Snijders wrote:
> > A potential upside is that hamnet operators maybe have access to some RPKI
> > services now!
>
> OK, I'll bite... how do you mean?

Ah, let me clarify, I didn't mean this as a tongue-in-cheek remark.

Previously no RIR "managed" the space in the conventional sense of the
word. In the case of 44.0.0.0/8, the consequences seemed to be that
none of the RIRs were in a position to provide RPKI services (ROAs)
for 44.0.0.0/8 or any more specific block within that /8.

I saw that the IANA registry was updated
https://www.iana.org/assignments/ipv4-address-space/ipv4-address-space.xhtml
it now shows "Administered by ARIN". My interpretation is that now a
pathway exists towards ARIN facilitating the creation of RPKI ROAs
which cover (parts of) 44.0.0.0/8.

In order to get RPKI services in the context of ARIN, it appears an RSA
or LRSA needs to exist. I suspect an LRSA-style agreement was
instantiated, opening the door for RPKI services.

Kind regards,

Job


Re: 44/8

2019-07-18 Thread Job Snijders
A potential upside is that hamnet operators maybe have access to some RPKI
services now!


Re: Performance metrics used in commercial BGP route optimizers

2019-07-16 Thread Job Snijders
On Tue, Jul 16, 2019 at 01:24:11PM -0500, Mike Hammett wrote:
> All of the same tragedy can happen without BGP optimizers, and does. 

I disagree. You are skipping over a crucial distinction we should make
between common 'route leaks' (incorrect propagation of valid routing
information), and the poison that is 'bgp optimiser hijacks'
(propagating of invalid/nonexistent routing information).

In the first case, a simple leak of existing real routing information,
you'll often see that the outcomes of the leak have a longer AS_PATH,
and that the leaking ASN has an actual path towards the destination. In
the best case the leaked routes are ignored because they don't become
the best path, in the worst case anyone using those leaked paths suffers
from congestion.

In the second case, leaked routes that came from a so-called 'bgp
optimiser', during the leak there is no forwarding path to the actual
destination. The packets circulate in a loop and never arrive at the
intended destination. This is hard downtime for the affected prefixes.
We also often see that the AS_PATH is entirely fabricated by "BGP
optimisers", further increasing the risk of the hijacked route
announcements being used.

> BGP optimizers only harm the global Internet when route filters don't
> do their job. (Un)Fortunately, many other things also harm the global
> Internet when route filters don't do their job. Things other than BGP
> optimizers harm the global Internet more frequently via the same
> vector (lack of proper route filters). 
> 
> A given set of bugs are unlikely to affect both Optimizer edge egress
> filters and upstream ingress filters. If so, the Internet as a whole
> has much graver things to worry about. 

I believe it is a fallacy to state that "because other things can harm
the Internet" it would somehow become OK to use a BGP optimiser. It
is not; it is extremely dangerous for those networks whose prefixes are
being 'optimised' (née hijacked).

Every day we see negative effects as a result of "bgp optimizers". We
also have observed that some of the 'bgp optimizers' have consciously
chosen to not apply even the most basic of harm reduction methods, see
https://twitter.com/JobSnijders/status/1143205986787831819

We can't stop people from deploying this type of software, the Internet
simply doesn't provide that kind of regulatory environment, but one
should be fully aware of the terrible risks involved when doing so.
Networks should be cognizant of peers they suspect are using such
software to steer traffic.

Kind regards,

Job


Re: Performance metrics used in commercial BGP route optimizers

2019-07-16 Thread Job Snijders
On Tue, Jul 16, 2019 at 6:10 PM Ryan Hamel  wrote:
>
> Nowhere near the number as an engineer fat fingering a route.

How are you able to make that assertion?

> There are ISPs that accept routes all the way to /32 or /128, for traffic 
> engineering with ease, and/or RTBH.

This strikes me as a bit of a red herring. Aren't the damaging effects
of "BGP optimisers" *amplified* (not caused!) by ISPs who accept "all
routes"? An ISP accepting incorrect routing information still is a
step below entities actively generating and distributing incorrect
routing information.

Kind regards,

Job


Re: Performance metrics used in commercial BGP route optimizers

2019-07-16 Thread Job Snijders
On Tue, Jul 16, 2019 at 3:33 PM Mike Hammett  wrote:

> More like do whatever you want in your own house as long as you don't
> infringe upon others.
>

That's where the rub is; when using "BGP optimisers" to influence public
Internet routing, you cannot guarantee you won't infringe upon others.


> The argument against route optimizers (assuming appropriate ingress\egress
> filters) is a religious one and should be treated as such.
>

The argument against "BGP optimizers" is that we *cannot* assume
appropriate ingress or egress filters. It appears to me to be a fallacy to
suggest a line of reasoning à la "if you do things correctly, things won't
go wrong". Clearly we've observed many times over that things *do* go wrong.

Some examples: almost every year one of the major BGP vendors has a serious
bug related to the NO_EXPORT functionality in some release. Also, we
routinely observe software defects that cause a device to behave
differently (read 'leak') from how the operator had explicitly
configured the device. These are facts, not religious statements.

Perhaps in a bug-free world there is room for dangerous activities, but
there is no such thing as bug-free. And I haven't even covered the human
error angle. We must robustly architect our networks to mitigate or dampen
the negative effects of issues at all layers of the stack.

I consider it wholly inappropriate to write off the countless hours spent
dealing with fallout from "BGP optimizers" and the significant financial
damages we've sustained as "religious arguments".

Kind regards,

Job


Re: Level3/CenturyLink IRR Contact

2019-07-08 Thread Job Snijders
I will ping you off list with contact details.

Kind regards,

Job

On Mon, Jul 8, 2019 at 6:20 PM Joe Nelson  wrote:
>
> Does anyone know who to contact to have old information removed from 
> Level3/CenturyLink's IRR.  My ASN still shows in their registry with stale 
> information from an old customer of theirs but I can't seem to find anyone at 
> CenturyLink that even knows what an IRR is so I'm just going in circles.  I'd 
> like to just have the stale info removed so when I add my info to Merit, 
> there isn't a conflict.
>
> Thanks,
>
> Joe


Re: CloudFlare issues?

2019-07-04 Thread Job Snijders
> Anyway, you can now enjoy https://rpki.net/s/rpki-test even more! :-)

my apologies, I fumbled the ball on typing in that URL, I intended to
point here: https://www.ripe.net/s/rpki-test


Re: CloudFlare issues?

2019-07-04 Thread Job Snijders
On Thu, Jul 4, 2019 at 8:46 PM Francois Lecavalier
 wrote:

> It's been close to 3 hours now since I dropped them - radio silence.

I am going to assume that "radio silence" for you means that your
network is fully functional and none of your customers have raised
issues! :-)

> Whoever fears implementing RPKI/ROA/ROV, simply don't.  It's very easy to 
> implement, validate and troubleshoot.

Thank you for sharing your report. I believe it is good to share rpki
stories with each other, not just to celebrate the deployment of an
exciting technology, but also to help provide debugging information
ahead of time should there be issues between provider A and B due to a
ROA misconfiguration. Announcing to the public that one has deployed
RPKI - in this stage of the lifecycle of the tech - probably is a
productive action to consider.

Anyway, you can now enjoy https://rpki.net/s/rpki-test even more! :-)

Kind regards,

Job


Re: CloudFlare issues?

2019-07-04 Thread Job Snijders
Dear Francois,

On Thu, Jul 04, 2019 at 03:22:23PM +, Francois Lecavalier wrote:
> Following that Verizon debacle I got onboard with ROV, after a couple
> research I stopped my choice on the drum roll CloudFlare GoRTR
> (https://github.com/cloudflare/gortr).  If you trust them enough they
> provide an updated JSON every 15 minutes of the global RIR aggregate.

At this point in time I think the ideal deployment model is to perform
the validation within your administrative domain and run your own
validators. You can combine routinator with gortr, or use cloudflare's
octorpki software https://github.com/cloudflare/cfrpki

> I'll see down the road if we'll fetch them ourselves but at least it
> got us up and running in less than an hour.  It was also easy for us
> to deploy as the routers and the servers are on the same PoP directly
> connected, so we don't need the whole encryption recipe they provide
> for mass distribution.

yeah, that is true!

> But I also have a question for all the ROA folks out there.  So far we
> are not taking any action other than lowering the local-pref - we want
> to make sure this is stable before we start denying prefixes.  So the
> question, is it safe as of this date to : 1.Accept valid, 2. Accept
> unknown, 3. Reject invalid?  Have any large network who implemented it
> dealt with unreachable destinations?  I'm wondering as I haven't found
> any blog mentioning anything in this regard and CloudFlare docs only
> shows example for valid and invalid, but nothing for unknown.

I believe at this point in time it is safe to accept valid and unknown
(combined with an IRR filter), and reject RPKI invalid BGP announcements
at your EBGP borders. Large examples of other organisations who already
are rejecting invalid announcements are AT&T, Nordunet, DE-CIX, YYCIX,
XS4ALL, MSK-IX, INEX, France-IX, Seacomm, Workonline, KPN International,
and hundreds of others.
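
The "accept valid, accept unknown, reject invalid" policy discussed here can be sketched in a few lines. This is only a minimal illustration of RFC 6811-style origin validation, assuming a hand-written list of VRPs (validated ROA payloads); the `VRPS` table, the function names, and the documentation prefixes/ASNs are made up for the example and are not output from any real validator.

```python
import ipaddress

# Hypothetical VRPs: (prefix, max length, origin AS). Documentation
# prefixes and ASNs only; this is not real ROA data.
VRPS = [
    ("192.0.2.0/24", 24, 64496),
    ("2001:db8::/32", 48, 64497),
]


def validate(prefix, origin_as, vrps=VRPS):
    """Return 'valid', 'invalid' or 'unknown' (RFC 6811 calls the last NotFound)."""
    route = ipaddress.ip_network(prefix)
    covered = False
    for vrp_prefix, max_len, vrp_as in vrps:
        vrp_net = ipaddress.ip_network(vrp_prefix)
        if route.version != vrp_net.version or not route.subnet_of(vrp_net):
            continue  # this VRP does not cover the route
        covered = True
        if route.prefixlen <= max_len and origin_as == vrp_as:
            return "valid"
    # Covered by at least one VRP but matched by none: invalid.
    return "invalid" if covered else "unknown"


def accept(prefix, origin_as):
    """Border policy from this thread: accept valid and unknown, reject invalid."""
    return validate(prefix, origin_as) != "invalid"
```

Note that an "unknown" route is one that no VRP covers at all, which is why it is still accepted (ideally in combination with an IRR-based filter, as mentioned above).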

You can run an analysis yourself to see how traffic would be impacted in
your network using pmacct or Kentik, see this post for more info:
https://mailman.nanog.org/pipermail/nanog/2019-February/099522.html

> My assumption is that 1.Accept valid, 2. Accept unknown, 3. Reject
> invalid shouldn't break anything.

Correct! Let us know how it went :-)

Kind regards,

Job


BGP filtering study resources (Was: CloudFlare issues?)

2019-06-25 Thread Job Snijders
Dear Stephen,

On Tue, Jun 25, 2019 at 07:04:12AM -0700, Stephen Satchell wrote:
> On 6/25/19 2:25 AM, Katie Holly wrote:
> > Disclaimer: As much as I dislike Cloudflare (I used to complain
> > about them a lot on Twitter), this is something I am absolutely
> > agreeing with them. Verizon failed to do the most basic of network
> > security, and it will happen again, and again, and again...
> 
> I used to be a quality control engineer in my career, so I have a
> question to ask from the perspective of a QC guy:  what is the Best
> Practice for minimizing, if not totally preventing, this sort of
> problem?  Is there a "cookbook" answer to this?
> 
> (I only run edge networks now, and don't have BGP to worry about.  If
> my current $dayjob goes away -- they all do -- I might have to get
> back into the BGP game, so this is not an idle query.)
> 
> Somehow "just be careful and clueful" isn't the right answer.

Here are some resources which maybe can serve as a starting point for
anyone interested in the problem space:

presentation: Architecting robust routing policies
pdf: 
https://ripe77.ripe.net/presentations/59-RIPE77_Snijders_Routing_Policy_Architecture.pdf
video: 
https://ripe77.ripe.net/archive/video/Job_Snijders-B._BGP_Policy_Update-20181017-140440.mp4

presentation: Practical Everyday BGP filtering "Peerlocking"
pdf: http://instituut.net/~job/NANOG67_NTT_peerlocking_JobSnijders.pdf
video: https://www.youtube.com/watch?v=CSLpWBrHy10

RFC 8212 ("EBGP default deny") and why we should ask our vendors like
Cisco IOS, IOS XE, NX-OS, Juniper, Arista, Brocade, etc... to be
compliant with this RFC:
slides 2-14: 
http://largebgpcommunities.net/presentations/ITNOG3-Job_Snijders_Recent_BGP_Innovations.pdf
skip to the rfc8212 part: https://youtu.be/V6Wsq66-f40?t=854
compliance tracker: http://github.com/bgp/RFC8212

The NLNOG Day in Fall 2018 has a wealth of RPKI related presentations
and testimonies: https://nlnog.net/nlnog-day-2018/

Finally, there is the NLNOG BGP Filter Guide: http://bgpfilterguide.nlnog.net/
If you spot errors or have suggestions, please submit them via github
https://github.com/nlnog/bgpfilterguide

Please let me or the group know should you require further information,
I love talking about this topic ;-)

Kind regards,

Job


Re: CloudFlare issues?

2019-06-24 Thread Job Snijders
On Mon, Jun 24, 2019 at 08:18:27AM -0400, Tom Paseka via NANOG wrote:
> a Verizon downstream BGP customer is leaking the full table, and some more
> specific from us and many other providers.

It appears that one of the implicated ASNs, AS 33154 "DQE Communications
LLC" is listed as customer on Noction's website:
https://www.noction.com/clients/dqe

I suspect AS 33154's customer AS 396531 turned up a new circuit with
Verizon, but didn't have routing policies to prevent sending routes from
33154 to 701 and vice versa, or their router didn't have support for RFC
8212.

I'd like to point everyone to an op-ed I wrote on the topic of "BGP
optimizers": https://seclists.org/nanog/2017/Aug/318

So in summary, I believe the following happened:

- 33154 generated fake more-specifics, which are not visible in the DFZ
- 33154 announced those fake more-specifics to at least one customer (396531)
- this customer (396531) propagated them to another upstream provider (701)
- it appears that 701 did not apply sufficient prefix filtering or a
  maximum-prefix limit
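
As a rough illustration of what "fake more-specifics, which are not visible in the DFZ" means in practice, the sketch below flags received prefixes that fall inside a prefix from a reference table but are not in that table themselves. The function name and the sample prefixes are hypothetical, and a real check would need a full reference view of the DFZ.

```python
import ipaddress


def unexpected_more_specifics(dfz_prefixes, received_prefixes):
    """Return received prefixes that are strictly more specific than a
    reference (DFZ) prefix but are not themselves in the reference table."""
    dfz = [ipaddress.ip_network(p) for p in dfz_prefixes]
    dfz_set = set(dfz)
    flagged = []
    for p in received_prefixes:
        net = ipaddress.ip_network(p)
        if net in dfz_set:
            continue  # exact match with the reference table, nothing odd
        if any(net.version == d.version and net.subnet_of(d) for d in dfz):
            flagged.append(p)
    return flagged
```

For example, with a reference table containing only `192.0.2.0/24`, receiving `192.0.2.0/25` and `192.0.2.128/25` from a customer would flag both.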

While it is easy to point at the alleged BGP optimizer as the root
cause, I do think we now have observed a cascading catastrophic failure
both in process and technologies. Here are some recommendations that all
of us can apply, that may have helped dampen the negative effects:

- deploy RPKI based BGP Origin validation (with invalid == reject)
- apply maximum prefix limits on all EBGP sessions
- ask your router vendor to comply with RFC 8212 ('default deny')
- turn off your 'BGP optimizers'
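
Of the recommendations above, the maximum-prefix limit is the simplest to picture. The toy model below is only a sketch of the idea (tear the session down once a neighbor sends more prefixes than agreed); the `Session` class is hypothetical and does not mirror any vendor's implementation or restart behaviour.

```python
class Session:
    """Toy EBGP session that shuts itself down past a prefix limit."""

    def __init__(self, max_prefixes):
        self.max_prefixes = max_prefixes
        self.prefixes = set()
        self.established = True

    def receive(self, prefix):
        """Process one announced prefix; return True if it was kept."""
        if not self.established:
            return False
        self.prefixes.add(prefix)
        if len(self.prefixes) > self.max_prefixes:
            # Limit exceeded: drop everything learned and close the session.
            self.established = False
            self.prefixes.clear()
            return False
        return True
```

A customer session expected to carry a handful of prefixes would trip such a limit long before a leaked full table is accepted.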

I suspect we, collectively, suffered significant financial damage in
this incident.

Kind regards,

Job


Re: Traffic ratio of an ISP

2019-06-20 Thread Job Snijders
On Thu, Jun 20, 2019 at 4:21 PM Steller, Anthony J
 wrote:
> because it really don’t matter in the whole scheme of things.

Indeed, it doesn't matter. The "traffic ratio" field in PeeringDB
probably should be deprecated: there is no formal definition, nor
are there any operational consequences to changing the contents of
that field. The contents of the field are entirely arbitrary.

If the traffic ratio is relevant (I am not saying it is or isn't),
such traffic ratios probably should be viewed exclusively in the
context of specific ASN pairings. Maybe between you and me we'll see
the dominant traffic direction being one way, and with another ASN
pairing we see the opposite. There is no telling other than through
observation, and any such observations are unlikely to be shared with the
general public.

Kind regards,

Job


Re: provider email maintenance standard

2019-06-17 Thread Job Snijders
Dear Matt,

See this URL instead:
https://github.com/jda/maintnote-std/blob/master/standard.md

NTT / AS 2914’s NOC follows this process to keep customers and partners
informed about maintenances.

Kind regards,

Job

On Mon, Jun 17, 2019 at 15:32 Matt Harris  wrote:

> On Mon, Jun 17, 2019 at 8:27 AM Martin Pels 
> wrote:
>
>> https://www.facebook.com/groups/maintnote/
>>
>
> You may want to explain what you're linking to, since that url points to
> context which is locked to those of us who are not authorized to view it.
> It's very helpful when dropping a link to provide some context, and
> especially if the link points to content which is locked behind an
> authentication page.
>
>


Re: someone is using my AS number

2019-06-15 Thread Job Snijders
On Sat, Jun 15, 2019 at 4:45 PM Owen DeLong  wrote:
> > On Jun 15, 2019, at 5:43 AM, Job Snijders  wrote:
> >> On Sat, Jun 15, 2019 at 2:38 PM Owen DeLong  wrote:
> >
owen> >> What I heard you say is: “I’m not going to offer a solution
to your problem, but you shouldn’t use the one you have that currently
works because some things my friends and I are doing react poorly to
it and you may suffer some consequences as a result.”
> >
job> > I have no idea how you would arrive at such a contrived convoluted
job> > interpretation. I'm sorry I can't help further your understanding of
job> > how modern day Internet routing works.

owen> I was pointing out that while you told the guy not to use a tool
that’s been working for him, you didn’t actually answer his question,
nor did you offer any useful alternative.

Your summary of this thread is somewhat incomplete. I'll try myself:

OP started with - "help, my ASN was used without my permission, what
do I do?" - to which NANOG answered "let us know your ASN and we'll
use our rolodex". Awesome, the community tried to help Philip Lavine.

Then came a follow-up (general context) question from Joe Abley: "what
actually can go wrong when the AS_PATH is modified for traffic
engineering purposes?", to which three factually correct answers were
provided:

1/ it may not help you achieve your traffic engineering goal (you
can't know if as-path loop avoidance is enabled or not)
2/ it makes security incident attribution processes harder because
poisoned AS_PATH contain fabricated information
3/ it can lead to hard outages because of interaction with EBGP
routing security filters (such as peer-lock)

Again a productive mail exchange, Joe Abley asked a good question and
the resulting public discussion hopefully helped others learn
something.

Next up: Warren offered in a separate subthread "sometimes it seems
AS_PATH poisoning is the only solution for traffic-engineering, what
else can we do". To which I add: "we should keep in mind that this
'only solution' may result in hard outages", (I assume hard outages
are considered worse than the state of things without traffic
engineering). If BGP communities and telephone requests are not
available, and AS_PATH poisoning seems to be the "only solution",
well, then that is the only "solution" (but poisoning caveats still
apply). There probably is no answer to Warren's question, at least I
couldn't provide one because communities & phone were taken away.

So, you turned something I intended as a simple addition to Warren's
message (a point that hadn't yet been mentioned), into a vague
statement about "Job and his friends". EBGP AS_PATH filters
("peerlock-style") have existed in many forms, since long before I
even had a job in this sector. It is absolutely unclear to me what you
are trying to achieve.

Kind regards,

Job


Re: someone is using my AS number

2019-06-15 Thread Job Snijders
On Sat, Jun 15, 2019 at 09:31:03AM -0400, Jon Lewis wrote:
> On Sat, 15 Jun 2019, Job Snijders wrote:
> > There is no signal from the remote ASN (the one that receives the
> > route announcement) to the Originator ASN about the remote ASN's
> > loop detection policies. Therefore, you can't know what the
> > remote side will do ahead of time. The only recourse left at that
> > point is active probing (trial & error). Trial and error, where the
> > 'error' state may be a hard outage, means that the method is
> > unreliable.
> 
> How does as-path poisoning failing (i.e. the AS you wanted to ignore a
> route accepts it) cause a hard outage?  

Formatting warning, what follows is an ASCII art diagram:


         +---------------+
         |  "the rest"   |
         +---+-------+---+
             |       |
        +----+-+   +-+----+
        |      |   |      |
   +----+ 2914 +---+ 7018 |
   |    |      |   |      |
   |    +---+--+   +--+---+
   |        |         |
   |        | +-----+ |
   |        | |     | |
   |        +-+ NSP +-+
   |          |     |
   |          +--+--+
   |             |
   |         +---+---+
   |         |       |
   +---------+ ISP A |
             |       |
             +-------+

In the above the ISP called "ISP A" is multihomed to "NTT" and an entity
called "NSP"; the NSP is multi-homed to both NTT and AT&T. I attempted
to make this a realistic scenario.

In the above situation the ISP A entity might want to force certain
traffic over the NTT link, and instead of using BGP communities they use
BGP AS_PATH poisoning.

The moment they mangle the AS_PATH on their announcement and insert 2914
in their announcement towards NSP, the following can happen:

When ISP A wants to poison the path, ISP A may expect the following
paths to be visible from the AT&T and NTT routers:

AS_PATH                       | footnotes
7018_NSP_ISPA_2914_ISPA       | 1
2914_7018_NSP_ISPA_2914_ISPA  | 1
7018_2914_NSP_ISPA_2914_ISPA  | 2
2914_NSP_ISPA_2914_ISPA       | 2
NSP_ISPA_2914_ISPA            | 3
7018_2914_ISPA                | 4
2914_ISPA                     | 4

footnotes:
1) rejected on AT&T routers due to peerlock (2914 is seen in the AS_PATH)
2) rejected by NTT routers due to as-path loop detection, thus never
   propagated to AT&T. Neither NTT nor AT&T will ever use this path.
3) potentially rejected by NSP due to presence of an upstream ASN in
   AS_PATH, thus neither NTT nor AT&T will ever see this path.
4) accepted by both AT&T and NTT. Note that this effectively is
   ISP A single homing

In both scenarios it was ISP A's goal to receive less traffic over the
NSP-ISP A link, and the moment they deployed this policy, they'll think
it was successful, because traffic comes in via the NTT-ISP A link.

Now imagine (weeks after doing the AS_PATH poisoning), the link between
2914-ISP A is taken down (maintenance, outage, or whatever) - at that
moment ISP A will discover that their AS_PATH mangling resulted in a
hard outage. There is no failover to the paths via NSP. In fact, the
NSP may not even have accepted the routes in the first place because
many NSPs reject their upstream's ASNs when seen in routes received from
their downstreams.
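
The footnotes and the failure scenario above can be replayed with a small model. This is a simplified sketch, not how any of these networks actually configure their routers: the ASNs 64500 and 64501 stand in for "NSP" and "ISP A", the peerlock rule is reduced to "a protected ASN may only appear in paths learned directly from that ASN", and NSP is assumed to accept and propagate the poisoned announcement (the best case for ISP A).

```python
NTT, ATT = 2914, 7018
NSP, ISP_A = 64500, 64501  # hypothetical ASNs from the documentation range

# Assumed peerlock-style rule: these ASNs may only appear in a path that is
# learned directly from that ASN itself.
PROTECTED = {NTT, ATT}


def accepts(local_as, neighbor_as, as_path):
    """Would local_as accept as_path when it is announced by neighbor_as?"""
    if local_as in as_path:
        return False  # AS_PATH loop detection
    for asn in PROTECTED - {local_as}:
        if asn in as_path and neighbor_as != asn:
            return False  # peerlock: protected ASN seen via a third party
    return True


def routes_at_att(direct_link_up):
    """Paths AT&T ends up with for ISP A's prefix in this scenario."""
    # Poisoned announcement, as propagated by NSP.
    candidates = [(NSP, [NSP, ISP_A, NTT, ISP_A])]
    # Clean announcement via NTT, only while the NTT-ISP A link is up.
    if direct_link_up and accepts(NTT, ISP_A, [ISP_A]):
        candidates.append((NTT, [NTT, ISP_A]))
    return [path for neighbor, path in candidates if accepts(ATT, neighbor, path)]
```

With the direct link up, `routes_at_att(True)` leaves AT&T with only the path via NTT; with the link down, `routes_at_att(False)` returns an empty list, which is the hard outage described above. NTT itself also rejects the poisoned path from NSP because its own ASN is in it.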

In this thread, there are some hints of anecdata about when this trick
works 'as intended', but what I'm trying to point out is that there is no
shortage of examples where it leads to a problematic situation. In this
thread we seem to have some unclarity about what 'reliable' or 'unreliable'
means. AT&T will never proactively notify ISP A about changes to their
AS_PATH filters, so what works today may be entirely broken tomorrow.

I'm not disputing that AS_PATH poisoning can be used to accomplish
traffic engineering objectives, but it is similar to relying on linux
operating system 0-days to obtain root access on a server. Sure,
sometimes it may work, but I hope we can agree that for business
purposes exploits are not a reliable or recommended way to achieve your
goals.

> When used for TE, a failure just means a route/path you wanted some
> remote network to ignore is not ignored and might be used.  i.e. Your
> TE may not work as desired, but the packets will still get to you,
> just not necessarily via the path you wanted them to take.

It depends! Keep in mind that traffic engineering based on AS_PATH
mangling is exploiting a property of the default as-path loop detection
behaviour in BGP implementations. This means we're relying on a second
order effect. The loop detection exists to stop the propagation of
loops. This means that such paths won't be considered even at a lower
priority, but are just rejected. The moment paths are rejected at any
point anywhere in the BGP graph, you risk unreachability.

I hope this helped clarify a bit.

Kind regards,

Job


Re: someone is using my AS number

2019-06-15 Thread Job Snijders
On Sat, Jun 15, 2019 at 05:32:21AM -0700, Owen DeLong wrote:
> > What is the principal harm of doing this? Honest question. I'm not 
> > advocating for anything, just curious.
> > 
> > Excellent question.
> > 
> > 1/ We can’t really count on the loop detection to work that way at
> > the “jacked” side. So if this is innocent traffic engineering, it is
> > unreliable at best.
> 
> Why not? 

There is no signal from the remote ASN (the one that receives the route
announcement) to the Originator ASN about the remote ASN's loop
detection policies. Therefore, you can't know what the remote side
will do ahead of time. The only recourse left at that point is active
probing (trial & error). Trial and error, where the 'error' state may be
a hard outage, means that the method is unreliable.

> Since this TE method is unlikely to be used to control propagation
> to/through a stub ASN, it ought to be pretty reliable for the intended
> purpose.

To all other people - AS_PATH poisoning, as a method to perform traffic
engineering, is *not* reliable and can lead to hard outages.

Regards,

Job


Re: someone is using my AS number

2019-06-15 Thread Job Snijders
On Sat, Jun 15, 2019 at 2:38 PM Owen DeLong  wrote:
> Job,
>
> Permit me to apply some reflective listening to your statement:
>
> What I heard you say is: “I’m not going to offer a solution to your problem, 
> but you shouldn’t use the one you have that currently works because some 
> things my friends and I are doing react poorly to it and you may suffer some 
> consequences as a result.”

I have no idea how you would arrive at such a contrived convoluted
interpretation. I'm sorry I can't help further your understanding of
how modern day Internet routing works.

Kind regards,

Job


Re: someone is using my AS number

2019-06-13 Thread Job Snijders
On Thu, Jun 13, 2019 at 11:18 Warren Kumari  wrote:

> On Thu, Jun 13, 2019 at 9:59 AM Joe Abley  wrote:
> >
> > Hey Joe,
> >
> > On 12 Jun 2019, at 12:37, Joe Provo  wrote:
> >
> > > On Wed, Jun 12, 2019 at 04:10:00PM +, David Guo via NANOG wrote:
> > >> Send abuse complaint to the upstreams
> > >
> > > ...and then name & shame publicly. AS-path forgery "for TE" was
> > > never a good idea. Sharing the affected prefix[es]/path[s] would
> > > be good.
> >
> > I realise lots of people dislike AS_PATH stuffing with other peoples' AS
> numbers and treat it as a form of hijacking.
> >
>
> Actually, I've been meaning to start a thread on this for a while.
>
> I have an anycast prefix - at one location I'm a customer of a
> customer of ISP_X &  ISP_Y & ISP_Z. Because ISP_X prefers customer
> routes, any time a packet touches ISP_X, it goes to this location,
> even though it is (severely) suboptimal -- things would be better if
> ISP_X didn't accept this route in this location.
>
> Now, the obvious answer of "well, just ask your provider in this
> location to not announce it to ISP_X. That's what communities / the
> telephone were invented for!" doesn't work for various (entirely
> non-technical) reasons...
>
> Other than doing path-poisoning can anyone think of a way to
> accomplish what I want? (modulo the "just become a direct customer
> instead of being a customer of a customer" or "disable that site", or
> "convince the AS upstream of you to deploy communities / filters").
> While icky, sometimes stuffing other people's AS in the path seems to
> be the only solution...



Given the prevalence of peerlock-style filters at the transit-free club,
poisoning the path may result in a large outage for your prefix rather than
a clever optimization. Poisoning paths is bad for all parties involved.

Kind regards,

Job


Re: someone is using my AS number

2019-06-13 Thread Job Snijders
Hi Joe,

On Thu, Jun 13, 2019 at 9:59 Joe Abley  wrote:

> Hey Joe,
>
> On 12 Jun 2019, at 12:37, Joe Provo  wrote:
>
> > On Wed, Jun 12, 2019 at 04:10:00PM +, David Guo via NANOG wrote:
> >> Send abuse complaint to the upstreams
> >
> > ...and then name & shame publicly. AS-path forgery "for TE" was
> > never a good idea. Sharing the affected prefix[es]/path[s] would
> > be good.
>
> I realise lots of people dislike AS_PATH stuffing with other peoples' AS
> numbers and treat it as a form of hijacking.
>
> However, there's an argument that AS_PATH is really just a loop-avoidance
> mechanism, not some kind of AS-granular traceroute for prefix propagation.
> In that sense, stuffing 9327 into a prefix as a mechanism to stop that
> prefix being accepted by AS 9327 seems almost reasonable. (I assume this is
> the kind of TE you are talking about.)
>
> What is the principal harm of doing this? Honest question. I'm not
> advocating for anything, just curious.



Excellent question.

1/ We can’t really count on the loop detection to work that way at the
“jacked” side. So if this is innocent traffic engineering, it is unreliable
at best.

2/ Attribution. The moment you stuff AS 2914 anywhere in the path, we may
get blamed for anything that happens through the IP addresses for that
route. In a way, the ASNs in the AS_PATH attribute are an
inter-organizational escalation flowchart.

Kind regards,

Job


Re: someone is using my AS number

2019-06-12 Thread Job Snijders
Indeed, I do not see this in our current version of the
Default-Free Zone, so there may not be a problem for us to solve at
this moment.

I think your reaching out to NANOG or other operator forums is the
correct action. Someone is bound to know someone who knows someone who
can help.

Kind regards,

Job

On Wed, Jun 12, 2019 at 6:06 PM Töma Gavrichenkov  wrote:
>
> Our records show this happened yesterday and lasted before 2019-06-11
> 20:24:00, for 2.5 hours total. Maybe that was just by accident.
>
> I'm sort of confused why you're speaking of some ISPs in India. The
> incident was more or less local to Finland, wasn't it?
>
> --
> Töma


Re: someone is using my AS number

2019-06-12 Thread Job Snijders
Can you share more details? Perhaps we can put the human social network to
good use.

Other than that this is annoying - are you operationally impacted right now?

Kind regards,

Job

On Wed, Jun 12, 2019 at 12:24 Filip Hruska  wrote:

> I would contact upstreams of the upstream then. This is quite a serious
> offence and they should help you.
>
> Regards,
> Filip
>
>
> On 12 June 2019 6:20:42 pm GMT+02:00, Philip Lavine <
> source_ro...@yahoo.com> wrote:
>>
>> yeah I did they are some MSP in India. No help.
>>
>> On Wednesday, June 12, 2019, 9:15:51 AM PDT, Filip Hruska 
>> wrote:
>>
>>
>> Contact the offending upstreams.
>>
>> Filip
>>
>> On 12 June 2019 6:05:58 pm GMT+02:00, Philip Lavine via NANOG <
>> nanog@nanog.org> wrote:
>>
>> What is the procedure to have another party to cease and desist in using
>> my AS number?
>>
>> Thx
>>
>>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>


Re: Networks enforcing RPKI validation

2019-06-07 Thread Job Snijders
Dear Eric,

If you don't mind me showering you with some study resources... here we
go!

On Fri, Jun 07, 2019 at 10:58:48AM -0400, Eric Dugas wrote:
> I was wondering if there was a list of networks that enforce RPKI
> validation and dropping invalids.

The last list that was compiled is available here
https://blog.benjojo.co.uk/post/state-of-rpki-in-2018

I expect that by now the list has doubled. We received many anecdotal
reports since then from people having deployed Origin Validation in
their networks. Perhaps if we ask Ben Cartwright-Cox nicely enough he can
run a new report for Q2 2019 :-)

> The shortlist I know is: AT&T (since February of this year)

Which is awesome! AT&T's deployment has definitely lowered the barrier
to deployment for others.

> and of course NTT because of Job

Point of clarification: NTT is not there yet, but we are on our way.
NTT does not yet apply RFC 6811 Origin Validation on its EBGP sessions
and does not yet reject RPKI Invalid BGP announcements.

However, NTT does use RPKI data in its filter generation process, more
information on that topic can be found here:
https://blog.apnic.net/2018/08/01/treating-rpki-roas-as-irr-route6-objects/
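
To give a feel for what "using RPKI data in filter generation" can look like, here is a rough sketch that treats each VRP as if it were a route object for its origin AS. It is not NTT's actual tooling: the function name, the tuple layout, and the decision to honour the ROA's max length in the filter are all assumptions made for the example.

```python
import ipaddress


def vrps_to_filter(vrps, customer_asns):
    """Turn VRPs whose origin AS is in customer_asns into filter entries.

    Each entry is (prefix, min_len, max_len): accept the prefix itself and
    more-specifics up to the ROA's max length (an assumption of this sketch).
    """
    entries = set()
    for prefix, max_len, origin_as in vrps:
        if origin_as in customer_asns:
            net = ipaddress.ip_network(prefix)
            entries.add((str(net), net.prefixlen, max_len))
    return sorted(entries)
```

Here `customer_asns` would typically come from expanding the customer's AS-SET, and the resulting entries would be merged with those generated from IRR route objects.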

The next step will be to use RPKI data to ignore conflicting IRR data,
this way the IRR will be harder to abuse in facilitating
misconfigurations or hijacks. An example of that type of use of RPKI
data can be found here https://ripe78.ripe.net/archives/video/119/
slides: 
https://ripe78.ripe.net/presentations/137-db_wg_ripe78_prop2018-06_snijders.pdf

After that, we'll also use RPKI data to strengthen our EBGP filters in a
similar way to how AT&T does it. I hope that we'll be done Q1 2020 - but
don't hold me to that date! We move at telco speed sometimes ;-)

An overview of where the industry was and where we're heading can be
found in "Routing Security Roadmap" presentation at
https://nlnog.net/nlnog-day-2018/

Finally - here is a quick and easy browser based tool to attempt to
figure out if the network you are connected to performs RPKI based BGP
Origin Validation (and is default-free) https://ripe.net/s/rpki-test

Kind regards,

Job


Re: Cisco Crosswork Network Insights - or how to destroy a useful service

2019-05-15 Thread Job Snijders
On Wed, May 15, 2019 at 11:52:16AM +, Mann, Jason via NANOG wrote:
> Is BGPmon going away?

Yes, see
https://bgpmon.net/wp-content/uploads/2019/01/BGPMon.net-EOL-EOS-faq.pdf

Kind regards,

Job


Re: Cisco Crosswork Network Insights - or how to destroy a useful service

2019-05-15 Thread Job Snijders
On Wed, May 15, 2019 at 11:37:57AM +0100, Carlos Friaças wrote:
> It relies *exclusively* on "RIPE RIS Live", or does it also use other
> sources?

The first useful version will rely exclusively on the "RIS Live"
interface. In a later stage we can consider adding something like the
NLNOG Looking Glass data source.

Kind regards,

Job


Re: Cisco Crosswork Network Insights - or how to destroy a useful service

2019-05-15 Thread Job Snijders
Hi,

I recognise the issue you describe, and I'd like to share with you that
we're going down another road. Nowadays, RIPE NCC offers a streaming API
("RIS Live") which has the data needed to analyse and correlate BGP
UPDATES seen in the wild to business rules you as operator define.

NTT folks are working on https://github.com/nlnog/bgpalerter/ - which
relies on "RIPE RIS Live". This software should become a competitive
replacement for current BGP monitoring tools. Stay tuned, the software
will become more useful over the course of the next few weeks.
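
To illustrate what "correlating BGP UPDATES to business rules" could
look like, here is a minimal Python sketch. It is not how bgpalerter
works internally. The prefix, the ASNs and the EXPECTED_ORIGINS table
are made up, and the message layout ("ris_subscribe", "path",
"announcements", "prefixes") is my reading of the RIS Live
documentation, so please verify it before relying on it. The sketch
only builds the subscription message and checks decoded updates; the
WebSocket handling is left out.

```python
import ipaddress
import json

# Hypothetical business rule: prefix -> origin ASNs the operator expects.
EXPECTED_ORIGINS = {
    ipaddress.ip_network("192.0.2.0/24"): {64500},
}


def subscription(prefix):
    """Build a subscribe message (layout assumed from the RIS Live docs)."""
    return json.dumps({
        "type": "ris_subscribe",
        "data": {"prefix": prefix, "moreSpecific": True, "type": "UPDATE"},
    })


def check_update(data):
    """Return alert strings for the "data" part of one decoded UPDATE."""
    alerts = []
    path = data.get("path") or []
    # Skip withdrawals-only messages and paths ending in an AS_SET (a list).
    if not path or not isinstance(path[-1], int):
        return alerts
    origin = path[-1]
    for announcement in data.get("announcements", []):
        for prefix in announcement.get("prefixes", []):
            net = ipaddress.ip_network(prefix)
            for expected, allowed in EXPECTED_ORIGINS.items():
                same_family = net.version == expected.version
                if same_family and net.subnet_of(expected) and origin not in allowed:
                    alerts.append(
                        f"{prefix} originated by AS{origin}, "
                        f"expected one of {sorted(allowed)}"
                    )
    return alerts
```

Feeding check_update() the "data" field of each received message would
give a list of alerts that can be mailed or logged.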

Kind regards,

Job


Re: Seeking Feedback on Mitigation of New BGP-driven Attack

2019-05-10 Thread Job Snijders
Dear Jared,

This was a very interesting read. Thank you for sharing it with us. The
paper contained new information for me. I hope I summarize it correctly:
by combining AS_PATH poisoning and botnets, the botnet’s firing power can
be more precisely aimed at a specific target.

Can you clarify what the definition of a “link” is? Is it the logical
interconnection between two ASNs (many pairs of ASNs interconnect in many
places), or is it a reference to a specific physical interconnection
between two routers, each in a different ASN?

The paper mentions that if the top 20 transit-free (“tier-1”) networks
protect each other against poisoning, the Maestro attack is drastically
reduced in effectiveness. I have good news: amongst this set of networks
there already is a widely deployed anti-poisoning mechanism, sometimes
referred to as “Peerlock”. https://www.youtube.com/watch?v=CSLpWBrHy10 /
https://www.nanog.org/sites/default/files/Snijders_Everyday_Practical_Bgp.pdf
. I think this paper suggests the Peerlock practice should be promoted
more, and perhaps automated.

Kind regards,

Job

On Fri, 10 May 2019 at 15:27, Jared Smith  wrote:

> Hello,
>
> Our research lab at the University of Tennessee (volsec.org) has recently
> completed
> a study on channeling link-flooding attack (transit link DDoS) flows
> via BGP poisoning: the Maestro attack. We are seeking feedback on
> mitigation (see below). A brief summary from the abstract:
>
> "Executed from a compromised or malicious Autonomous System (AS),
> Maestro advertises specific-prefix routes poisoned for selected ASes
> to collapse inbound traffic paths onto a single target link. A greedy
> heuristic fed by publicly available AS relationship data iteratively
> builds the set of ASes to poison. Given a compromised BGP speaker with
> advantageous positioning relative to the target link in the Internet
> topology, an adversary can expect to enhance flow density by more than 30%.
> For a large botnet (e.g., Mirai), the bottom line result is augmenting a
> DDoS by more than a million additional infected hosts. Interestingly, the
> size of the adversary-controlled AS plays little role in this
> amplification effect. Devastating attacks on core links can be executed by
> small, resource-limited ASes."
>
> We are seeking feedback from operators on the attack and the proposed
> mitigations we have identified. While we have worked with our campus BGP
> operators, we are reaching out to the broader community for
> additional insights.
>
> Other than general notes/comments, we have two specific questions that we
> would
> like to include feedback for in the final paper soon to be submitted:
>
> 1) Do you already filter poisoned/path prepend advertisements? This would
> mitigate the attack.
>
> 2) After seeing this attack, would you consider adding poison filtering or
> some other Day mitigation?
>
> The preprint is available at: tiny.utk.edu/maestro. See Section 7 on
> defenses.
>
> Please reply with any thoughts. Thank you in advance for comments,
> insight, and general feedback.
>
> Best,
> Tyler McDaniel, Jared Smith, and Max Schuchard
> UT Computer Security Lab
> volsec.org
>


Re: Routing issues to AWS environment.

2019-05-09 Thread Job Snijders
Dear Nick,

I sympathize with your plight; network debugging can be quite a test of
character at times.

I am snipping some text as I can't comment on specific details in
this case, but you do raise two excellent questions which I can maybe
help with.

On Thu, May 09, 2019 at 03:05:43PM +, Nick Ellermann wrote:
> Is ignoring AS prepending common?

It is not common, but yes it does happen. Some cloud providers and CDNs
have broken away from the traditional BGP best path selection and use
SDN controllers to steer traffic. I don't know if that is in play here
or not.

> Given my example issue, what direction would you normally take? 

Your issue reminds me of an issue I encountered some years ago. A member
of the Dutch community reported that seemingly random pairs of IP
addresses could not reach each other across an Internet Exchange fabric.
It drove this person crazy because none of the involved parties could
find anything wrong within their domain. The debugging process was hard
because the person had to ask others for pingsweeps and traceroutes,
would get information back without timestamps, and didn't have the
ability to alter source and destination ports on packets sent for
debugging. It turned out to be a faulty linecard that, under specific
circumstances, would hash traffic into a blackhole. It took WEEKS to
find this.

So, I identified a need for a more advanced debugging platform - one
that wouldn't require human-to-human interaction to help operators debug
things, in other words it seemed to make sense to stand up linux shell
servers in lots of networks and share access with each other. This
project is the NLNOG RING and I'd recommend you to participate.

An introduction can be found here
https://www.youtube.com/watch?v=TlElSBBVFLw and a nice use case video is
available here https://www.youtube.com/watch?v=mDIq8xc2QcQ

NTT, Amazon, and many others are part of it, and I assume that you have
SSH access to the problematic destination so I hope you can use tcpdump
there to verify if you can or can't receive packets coming from NLNOG
RING nodes.

You mentioned that altering your announcements (deaggregating,
prepending) resolves the issue, this strongly suggests that something
somewhere is broken and it is a matter of triangulating until you've
found the shortest path that exhibits the problem. Perhaps you can find
something like "Between these two nodes, when I use source port X,
protocol Y, destport Z, traffic doesn't arrive".
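
A rough Python sketch of how such a triangulation could be organised:
enumerate the flow tuples to probe, record which ones arrive (for
example by watching tcpdump on the destination), and summarise the
losses. The node names and addresses are placeholders, and sending the
probes themselves is deliberately left out.

```python
from itertools import product


def probe_plan(sources, dest, protocols, sports, dports):
    """Enumerate every (source, dest, protocol, sport, dport) flow to test."""
    return [(src, dest, proto, sp, dp)
            for src, proto, sp, dp in product(sources, protocols, sports, dports)]


def failing_flows(results):
    """results maps a flow tuple to True (arrived) or False (lost)."""
    return sorted(flow for flow, arrived in results.items() if not arrived)


def loss_by_source(results):
    """Fraction of lost probes per source node."""
    sent, lost = {}, {}
    for (src, *_rest), arrived in results.items():
        sent[src] = sent.get(src, 0) + 1
        lost[src] = lost.get(src, 0) + (0 if arrived else 1)
    return {src: lost[src] / sent[src] for src in sent}
```

If only some port combinations from some sources fail, that points
towards a hashing problem (LAG or ECMP member) rather than a routing
policy problem.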

Website: https://ring.nlnog.net/

There also is an IRC channel where people perhaps can help you make the
best use of this tool.

Kind regards,

Job


Re: Routing issues to AWS environment.

2019-05-09 Thread Job Snijders
Hi Chuck,

On Thu, May 09, 2019 at 06:34:21AM -0400, Chuck Church wrote:
> Are you sure the problem isn’t NTT? My buddy’s WISP peers with Spirit
> and had a boatload of problems with random packet loss affecting
> initially just SIP and RTP (both UDP). Spirit was blaming NTT.
> Problems went away when Spirit stopped peering with NTT yesterday.
> Path is through Telia now to their main SIP trunk provider.

I don't know the specifics of what you reference, but in a large
geographically dispersed network like NTT's backbone, I can assure you
there will always be something down somewhere. Issues can take on many
forms: sometimes it is a customer specific issue related to a single
interface, sometimes something larger is going on.

It is quite rare that the whole network is on fire, so in the general
case it is good to investigate and consider each and every report about
potential issues separately.

The excellent people at the NTT NOC are always available at n...@ntt.net
or the phone numbers listed in PeeringDB.

Kind regards,

Job


Re: NTP for ASBRs?

2019-05-08 Thread Job Snijders
Dear Lars,

On Wed, May 08, 2019 at 09:56:33AM +0200, Lars Prehn wrote:
> do you NTP sync your AS boundary routers?

yes

> If so, what are incentives for doing so? Are there incentives, e.g.
> security considerations, not to do it?

The major advantage of NTP syncing your routers is that it allows you to
more effectively correlate any log messages that these devices emit to
log messages other devices generated.

Did two events happen at separate times, or was it perhaps the same
event at the same time? The incentive is ease of troubleshooting.

On this topic, I strongly recommend operating all devices in the
Etc/UTC timezone; this makes coordination with external entities much
easier.
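
A small Python illustration of why this matters: two log lines from
devices in different local timezones look hours apart until both are
normalised to UTC. The timestamps and offsets are invented for the
example.

```python
from datetime import datetime, timedelta, timezone


def to_utc(local_time, utc_offset_hours):
    """Interpret a local log timestamp with a fixed UTC offset, return UTC."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    parsed = datetime.strptime(local_time, "%Y-%m-%d %H:%M:%S")
    return parsed.replace(tzinfo=tz).astimezone(timezone.utc)


# Router A logs in UTC+2, router B logs in UTC-4 (both hypothetical).
event_a = to_utc("2019-05-08 09:56:33", 2)
event_b = to_utc("2019-05-08 03:56:33", -4)
same_moment = event_a == event_b
```

With every device already logging in UTC this conversion step, and the
mistakes that come with it, simply disappears.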

Kind regards,

Job


Re: NTP question

2019-05-01 Thread Job Snijders
Dear Mehmet,

On Wed, May 01, 2019 at 03:22:57PM -0400, Mehmet Akcin wrote:
> I am trying to buy a GPS based NTP server like this one
> 
> https://timemachinescorp.com/product/gps-time-server-tm1000a/
> 
> but I will be placing this inside a data center, do these need an
> actual view of a sky to be able to get signal or will they work fine
> inside a data center building? 

This will *not* work if the antenna is placed *inside* the datacenter.

The trick is to order a spot on the roof of the datacenter, have the
facility staff place the antenna there, and run a cable to the NTP
server in your rack.

What the MRC / NRC for this service will be depends on the facility.

Kind regards,

Job


Re: Packetstream - how does this not violate just about every provider's ToS?

2019-04-24 Thread Job Snijders
Dear Anne,

On Wed, Apr 24, 2019 at 11:07:51PM -0600, Anne P. Mitchell, Esq. wrote:
> How can this not be a violation of the ToS of just about every major 
> provider? 

Can you perhaps cite ToS excerpts from one or more major providers to
support your assertion?

> Anne P. Mitchell, 
> Attorney at Law
> GDPR, CCPA (CA) & CCDPA (CO) Compliance Consultant
> Author: Section 6 of the CAN-SPAM Act of 2003 (the Federal anti-spam law)
> Legislative Consultant
> CEO/President, Institute for Social Internet Public Policy
> Board of Directors, Denver Internet Exchange
> Board of Directors, Asilomar Microcomputer Workshop
> Legal Counsel: The CyberGreen Institute
> Legal Counsel: The Earth Law Center
> California Bar Association
> Cal. Bar Cyberspace Law Committee
> Colorado Cyber Committee
> Ret. Professor of Law, Lincoln Law School of San Jose
> Ret. Chair, Asilomar Microcomputer Workshop

Are you listing all the above because you are presenting a formal
position supported by all these organisations about ToS? Can you for
instance clarify how signing off as a director of the Denver Internet
Exchange shapes the context of your ToS message?

Or, perhaps you are listing the above for some kind of self-marketing
purposes? If that is the case, please note that it is fairly uncommon to
use the NANOG mailing list to distribute resumes. I know numerous
websites dedicated to the dissemination of work histories, perhaps you
can use those instead of an operational mailing list?

Regards,

Job

ps. RFC 3676 section 4.3


Re: SOLVED (was Re: request for help: 192.139.135.0/24)

2019-04-03 Thread Job Snijders
Hi all,

On Wed, Apr 03, 2019 at 10:59:18AM -0400, Jay Borkenhagen wrote:
> I urge folks facing similar problems to publish RPKI ROAs for their IP
> resources. [snip] the verifiable statements in RPKI ROAs can be
> attributed to you as the actual resource holder, thus helping folks
> base their response actions on your intent.
> 
> If you are not facing similar problems today, you could be tomorrow:
> so publish your ROAs now!

Jay is touching upon a very important aspect here: without the RPKI ROA
it would've taken NTT significantly more effort to decide whether
removal of the erroneous IRR route object would've been appropriate or
not. We consider RPKI ROAs a higher source of truth, so drawing
conclusions when faced with unvalidated IRR data is a breeze.

RPKI ROAs can be instrumental in resolving issues of administrative
nature. Keep in mind that ROAs are not just for BGP Origin Validation
but serve other useful purposes too. Publish your ROAs today!

Kind regards,

Job

ps. Usual caveats apply to IP resources managed through ARIN; the ARIN
TAL is not as well distributed as RPKI TALs from other RIRs; this
essentially has led to a degradation of the quality of ARIN's RPKI
service. This policy proposal may help address operational issues:
https://www.arin.net/participate/policy/drafts/2019_4/


Re: request for help: 192.139.135.0/24

2019-04-02 Thread Job Snijders
Ack for NTT

On Mon, Apr 1, 2019 at 21:36 Christopher Morrow 
wrote:

> (from offline chat and pokery)
>
> It looks like 701/1239/3356 are permitting 4837 to announce this prefix
> because:
> $ whois -h whois.radb.net 192.139.135.0
> route:  192.139.135.0/24
> descr:  managedway company
> origin: AS53292
> mnt-by: MAINT-AS53292
> changed:rsand...@managedway.com 20181128  #23:11:53Z
> source: RADB
>
> route: 192.139.135.0/24
> descr: GLENQCY1
> origin:AS271
> mnt-by:BELL-RC
> changed:   con...@in.bell.ca 19930820
> source:BELL
>
> route:  192.139.135.0/24
> descr:  CMI IP Transit
> origin: AS4808
> admin-c:MAINT-CMI-INT-HK
> tech-c: MAINT-CMI-INT-HK
> mnt-by: MAINT-CMI-INT-HK
> changed:qas_supp...@cmi.chinamobile.com 20160525
> source: NTTCOM
>
> mntner: MAINT-CMI-INT-HK
> descr:  China Mobile International Limited
> country:HK
> admin-c:CMIL1-AP
> upd-to: qas_supp...@cmi.chinamobile.com
> auth:   # Filtered
> mnt-by: MAINT-CMI-INT-HK
> referral-by:APNIC-HM
> last-modified:  2017-11-22T09:00:43Z
> source: APNIC
>
>
> There is some less-than-great management of the associated IRR data.
> It'd be in the best interest of  (Metro Wireless) to start
> asking the various IRR's:
>   bell - con...@in.bell.ca ?
>   radb -
>   nttcom - job?
>   apnic -
>
> to remove the objects in question.
> I'm curious why NTT's still holding this record since there's a competing
> ROA?
>
> On Mon, Apr 1, 2019 at 1:27 PM Jay Borkenhagen  wrote:
> >
> > [No attempts at 01-April humor will be attempted in this message.]
> >
> >
> > Seeking help from routing engineers around the 'net:
> >
> >
> > ARIN documents that 192.139.135.0/24 has been allocated to Metro
> > Wireless International:
> >
> >  https://whois.arin.net/rest/net/NET-192-139-135-0-1
> >
> > Further, the party to whom 192.139.135.0/24 has been allocated has
> > published a ROA in ARIN's hosted RPKI asserting that bgp announcements
> > for that prefix are valid only when originating in AS63251.  To view
> > this, go to your favorite RPKI vantage point that uses ARIN's TAL.  If
> > you don't yet have a favorite, feel free to telnet to
> > route-server.ip.att.net and run:
> >
> >  show validation database record 192.139.135.0/24
> >
> >
> > Unfortunately, as may be seen at route-views, etc, most of the
> > Internet now prefers an invalid path that's mis-originated in as4808:
> >
> >
> >      Network          Next Hop         Path
> >  *   192.139.135.0    208.51.134.254   3549 3356 4837 4808 i
> >  *                    194.85.40.15     3267 3356 4837 4808 i
> >  *                    193.0.0.56       1273 4837 4808 i
> >  *                    37.139.139.0     57866 6762 4837 4808 i
> >  *                    12.0.1.63        7018 1299 53292 63251 ?
> >  *                    140.192.8.16     54728 20130 6939 4837 4808 i
> >  *                    91.218.184.60    49788 1299 53292 63251 ?
> >  *                    203.181.248.168  7660 2516 4837 4808 i
> >  *                    154.11.12.212    852 4837 4808 i
> >  *                    134.222.87.1     286 1299 53292 63251 ?
> >  *                    209.124.176.223  101 101 3356 4837 4808 i
> >  *                    137.39.3.55      701 4837 4808 i
> >  *                    94.142.247.3     8283 1239 4837 4808 i
> >  *                    162.251.163.2    53767 3257 1299 53292 63251 ?
> >  *                    212.66.96.126    20912 1267 3356 4837 4808 i
> >  *                    198.58.198.255   1403 6461 4837 4808 i
> >  *                    198.58.198.254   1403 6461 4837 4808 i
> >  *>                   202.232.0.2      2497 4837 4808 i
> >  *                    203.62.252.83    1221 4637 4837 4808 i
> >  *                    132.198.255.253  1351 6939 4837 4808 i
> >  *                    206.24.210.80    3561 209 4837 4808 i
> >  *                    195.208.112.161  3277 39710 9002 3356 4837 4808 i
> >  *                    217.192.89.50    3303 4837 4808 i
> >  *                    173.205.57.234   53364 3257 1299 53292 63251 ?
> >  *                    207.172.6.20     6079 3356 4837 4808 i
> >  *                    207.172.6.1      6079 3356 4837 4808 i
> >  *                    208.74.64.40     19214 174 4837 4837 4808 i
> >  *                    144.228.241.130  1239 4837 4808 i
> >  *                    162.250.137.254  4901 6079 3356 4837 4808 i
> >  *                    114.31.199.1     4826 1299 53292 63251 i
> >  *                    64.71.137.241    6939 4837 4808 i
> >
> >
> > Please help the Metro Wireless International folks get this cleared up
> > so their 192.139.135.0/24 can once again be usable.  In particular,
> > help is sought from 4837 and their transit providers:
> >
> >  1239
> >  701
> >  3356
> >
> > (Yes, I am trying to reach folks at those networks in other ways, too.)
> >
> >
> > Thanks.
> >

Re: Was wrong Re: Did IPv6 between HE and Google ever get resolved?

2019-03-29 Thread Job Snijders
A careful observer will note multiple fractures/rifts in the ipv6
default-free zone. It’s not as meshed as ipv4, unfortunately.

Kind regards,

Job


Re: Advertisement of Equinix Chicago IX Subnet

2019-03-28 Thread Job Snijders
On Wed, Mar 27, 2019 at 09:36:20PM +, Graham Johnston wrote:
> This afternoon at around 12:17 central time today we began learning
> the subnet for the Equinix IX in Chicago via a transit provider; we
> are on the IX as well. The subnet in question is 208.115.136.0/23.
> Using stat.ripe.net I can see that this subnet is also being learned
> by others, see the snip below. On our network this caused a nasty
> routing loop until we figured out what was wrong. My current best
> understanding is that because the route was learned via eBGP it
> trumped the OSPF learned route. As soon as I filtered the
> advertisement from my transit provider everything returned to normal.
> What am I doing that isn’t best practices that would have prevented
> this?

There are two pieces that help prevent this type of failure:

1/ Equinix should have created a RPKI ROA for 208.115.136.0/23, with an
   Origin ASN of 0 or one of their own ASNs, and a Max Length of 23.

2/ You should implement RPKI based BGP Origin Validation in your network
   and honor those ROAs.
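
To make the effect of such a ROA concrete, here is a short Python
sketch of the origin validation procedure as I understand it from
RFC 6811. It is an illustration only, not a replacement for a real
validator; the origin ASNs 64496 and 64500 are placeholders.

```python
import ipaddress


def validate(prefix, origin_asn, roas):
    """Classify a route as 'valid', 'invalid' or 'not-found'.

    roas is an iterable of (roa_prefix, max_length, asn) tuples.
    """
    route = ipaddress.ip_network(prefix)
    covered = False
    for roa_prefix, max_length, asn in roas:
        roa_net = ipaddress.ip_network(roa_prefix)
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue  # this ROA does not cover the route
        covered = True
        # AS 0 can never match a real origin, so it never yields 'valid'.
        if asn != 0 and asn == origin_asn and route.prefixlen <= max_length:
            return "valid"
    return "invalid" if covered else "not-found"


ixp_roas = [("208.115.136.0/23", 23, 0)]
leaked = validate("208.115.136.0/23", 64500, ixp_roas)
```

With the AS 0 ROA in place, any announcement of the peering LAN or a
more-specific of it evaluates to 'invalid', and a network that rejects
invalids would not have installed the leaked route.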

Kind regards,

Job


Re: Advertisement of Equinix Chicago IX Subnet

2019-03-28 Thread Job Snijders
On Thu, Mar 28, 2019 at 02:59:43PM +0100, Niels Bakker wrote:
> * christopher.morrell.na...@gmail.com (Christopher Morrell) [Thu 28 Mar 2019, 
> 14:35 CET]:
> > I've been bit by this in the past at two different exchanges. I too
> > have a policy applied to deny IXP LANs from upstreams and peers. It
> > would be nice if there was a list of all IXP LANs somewhere that we
> > could generically add to all upstream and peers.
> 
> I like Nick Hilliard's posted solution much better than creating
> static bogon lists that people will eventually forget about.

IXPs can use RPKI ROAs to signal to the world what their intentions are!
IXPs could either create a ROA with an Origin ASN of '0' to suggest to
the world that the peering lan prefix should never be visible in the
DFZ, or they can specify their own services ASN and simply not announce
the prefix. In either case IXPs should carefully specify the Max Length
value to be the same as the Prefix Length value of the peering lan
prefix.

Kind regards,

Job


Re: well-known Anycast prefixes

2019-03-21 Thread Job Snijders
On Thu, Mar 21, 2019 at 06:59:18PM +0300, Frank Habicht wrote:
> On 20/03/2019 21:05, James Shank wrote:
> > I'm not clear on the use cases, though.  What are the imagined use cases?
> > 
> > It might make sense to solve 'a method to request hot potato routing'
> > as a separate problem.  (Along the lines of Damian's point.)
> 
> my personal reason/motivation is this:
> Years ago I noticed that my traffic to the "I" DNS root server was
> traversing 4 continents. That's from Tanzania, East Africa.
> Not having a local instance (back then), we naturally sent the traffic
> to an upstream. That upstream happens to be in that club of those who
> don't have transit providers (which probably doesn't really matter, but
> means a "global" network).

Luckily there are other root servers too! :)

> My Theory :
> So just because one I-root instance was hosted at a customer (or
> customer's customer), that got higher local-pref and now packets take
> the long way from Africa via Europe, NorthAmerica to Asia and that
> customer in Thailand. While closer I-root instances would obviously be
> along the way, just not from a paying customer, "only" from peering.
> 
> I don't know whether or not to blame that "carrier" for intentionally(?)
> carrying the traffic that far - presumably the $ they got for that from
> the I-root host in Thailand was worth it, and not enough customers
> complained enough about the latency?
> 
> But I think it would be worthwhile to give them an option and produce a
> mechanism of knowing what's anycasted.
> 
> Maybe (thinking of it) a solution for really well-known prefixes
> available at many instances/locations (like DNS root) would be to have
> their fixed set of direct transits at all the "global" nodes and
> everywhere else to tell peers to not advertise this to upstreams.

In all instances of what you mention you need cooperation from the
network which is routing in a (from your perspective) suboptimal way.

Either the customer of that upstream should use BGP communities to
localize the announcement, or the upstream themselves need to change
their routing policy to set 'same LOCAL_PREF everywhere' for some
prefixes. Of course any input channel into routing policy can be a
vector of abuse.

Even if you equalize the LOCAL_PREF attribute across your network edge,
you still have other tie breakers such as AS_PATH length. It is not
clear to me how a list of well-known anycast addresses, in practice,
would help swing the pendulum. In all cases you need cooperation from a
lot of networks, and the outcome is not clearly defined because we don't
have a true inter-domain 'shortest latency path' metric.
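
A toy Python model of just the first two decision steps shows the
point. Real BGP implementations have many more tie breakers; the route
names, ASNs and LOCAL_PREF values below are invented.

```python
def best_path(routes):
    """Pick the route with the highest LOCAL_PREF, then the shortest AS_PATH."""
    return min(routes, key=lambda r: (-r["local_pref"], len(r["as_path"])))


routes = [
    # Learned from a paying customer far away: high LOCAL_PREF, long path.
    {"name": "customer-far", "local_pref": 200, "as_path": [64500, 64501, 64502]},
    # Learned from a nearby peer: lower LOCAL_PREF, short path.
    {"name": "peer-near", "local_pref": 100, "as_path": [64510]},
]

default_choice = best_path(routes)["name"]
equalized = [dict(route, local_pref=100) for route in routes]
equalized_choice = best_path(equalized)["name"]
```

With the default policy the far-away customer route wins; with equal
LOCAL_PREF the shorter AS_PATH wins, which may or may not be the
lower-latency instance.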

Kind regards,

Job


Re: FB? / AS 200020 leak

2019-03-14 Thread Job Snijders
Hi,

On Thu, Mar 14, 2019 at 02:04:39PM +, Jeroen Wunnink wrote:
> The route-leak was something different that seems to have mainly hit
> west-Europe between 16:52 UTC to 17:08 UTC. There’s a few people in
> the *NOG communities still digging at the complete details of that
> right now, but it currently points to have originated from AS200020,
> impacting a few large upstreams for a short period of time.

Here are some details of prefixes affected (courtesy of Doug Madory).
The percentage at the beginning is the share of peering sources that
saw each prefix leaked. The last column is the AS_PATH.

March 14th, 2019 - 16:43 UTC was the start of the BGP leak incident.

The leak was very serious in terms of negative impact: it affected many
West European access providers (for instance, AS 1136 has over 50% of
the Dutch access market).

Kind regards,

Job

70.6% 92.68.0.0/14 KPN B.V. NL ... 200020 1136
70.6% 92.64.0.0/14 KPN B.V. Amsterdam Provincie Noord-Holland NL ... 200020 1136
70.4% 93.154.64.0/18 KPN B.V. NL ... 200020 1136
70.4% 93.154.0.0/18 KPN B.V. NL ... 200020 1136
70.4% 86.88.0.0/13 KPN B.V. NL ... 200020 1136
70.4% 86.80.0.0/13 KPN B.V. NL ... 200020 1136
70.4% 84.84.0.0/14 KPN B.V. NL ... 200020 1136
70.4% 84.80.0.0/14 KPN B.V. NL ... 200020 1136
70.4% 81.206.0.0/15 KPN B.V. NL ... 200020 1136
70.4% 81.204.0.0/15 KPN B.V. NL ... 200020 1136
70.4% 80.61.0.0/16 Customers NL ... 200020 1136
70.4% 80.60.0.0/16 Customers NL ... 200020 1136
70.4% 77.62.0.0/15 KPN B.V. NL ... 200020 1136
70.4% 77.60.0.0/15 KPN B.V. NL ... 200020 1136
70.4% 77.170.0.0/15 KPN B.V. NL ... 200020 1136
70.4% 77.168.0.0/15 KPN B.V. NL ... 200020 1136
70.4% 77.164.0.0/14 KPN B.V. NL ... 200020 1136
70.4% 77.160.0.0/14 KPN B.V. NL ... 200020 1136
70.4% 62.12.0.0/20 KPN B.V. Amsterdam Provincie Noord-Holland NL ... 200020 1136
70.4% 46.145.0.0/16 KPN B.V. NL ... 200020 1136
70.4% 46.144.0.0/16 KPN B.V. NL ... 200020 1136
70.4% 31.161.0.0/16 KPN B.V. NL ... 200020 1136
70.4% 31.160.0.0/16 KPN B.V. NL ... 200020 1136
70.4% 145.7.128.0/17 KPN B.V. NL ... 200020 1136
70.4% 145.133.0.0/16 KPN B.V. NL ... 200020 1136
70.4% 145.132.0.0/16 KPN B.V. NL ... 200020 1136
70.1% 89.200.64.0/18 KPN Mobile The Netherlands B.V. NL ... 200020 1136
70.1% 89.200.0.0/18 KPN Mobile The Netherlands B.V. NL ... 200020 1136
70.1% 83.232.32.0/19 KPN Mobile The Netherlands B.V. NL ... 200020 1136
70.1% 83.232.128.0/17 KPN B.V. NL ... 200020 1136
70.1% 83.232.0.0/19 KPN Mobile The Netherlands B.V. NL ... 200020 1136
70.1% 83.232.0.0/17 KPN B.V. Amsterdam Provincie Noord-Holland NL ... 200020 1136
70.1% 82.171.96.0/19 Customers NL ... 200020 1136
70.1% 82.171.64.0/19 Customers NL ... 200020 1136
70.1% 82.171.32.0/19 Customers NL ... 200020 1136
70.1% 82.171.192.0/18 Customers NL ... 200020 1136
70.1% 82.171.128.0/18 Customers NL ... 200020 1136
70.1% 82.171.0.0/19 Customers NL ... 200020 1136
70.1% 82.170.128.0/17 Customers NL ... 200020 1136
70.1% 82.170.0.0/17 Customers NL ... 200020 1136
70.1% 82.169.96.0/20 Customers NL ... 200020 1136
70.1% 82.169.80.0/20 Customers NL ... 200020 1136
70.1% 82.169.64.0/20 Customers NL ... 200020 1136
70.1% 82.169.32.0/19 Customers NL ... 200020 1136
70.1% 82.169.224.0/19 Customers NL ... 200020 1136
70.1% 82.169.192.0/19 Customers NL ... 200020 1136
70.1% 82.169.176.0/20 Customers NL ... 200020 1136
70.1% 82.169.160.0/20 Customers NL ... 200020 1136
70.1% 82.169.144.0/20 Customers NL ... 200020 1136
70.1% 82.169.128.0/20 Customers NL ... 200020 1136
70.1% 82.169.112.0/20 Customers NL ... 200020 1136
70.1% 82.169.0.0/19 Customers NL ... 200020 1136
70.1% 82.168.64.0/18 Customers NL ... 200020 1136
70.1% 82.168.240.0/20 Customers NL ... 200020 1136
70.1% 82.168.224.0/20 Customers NL ... 200020 1136
70.1% 82.168.208.0/20 Customers NL ... 200020 1136
70.1% 82.168.192.0/20 Customers NL ... 200020 1136
70.1% 82.168.160.0/19 Customers NL ... 200020 1136
70.1% 82.168.128.0/19 Customers NL ... 200020 1136
70.1% 82.168.0.0/18 Customers NL ... 200020 1136
70.1% 82.136.224.0/19 KPN B.V. NL ... 200020 1136
70.1% 82.136.192.0/19 KPN B.V. NL ... 200020 1136
70.1% 80.60.224.0/20 Customers Amsterdam Provincie Noord-Holland NL ... 200020 1136
70.1% 80.60.224.0/19 Customers Amsterdam Provincie Noord-Holland NL ... 200020 1136
70.1% 77.173.128.0/17 Customers NL ... 200020 1136
70.1% 77.173.0.0/17 Customers NL ... 200020 1136
70.1% 77.172.128.0/17 Customers NL ... 200020 1136
70.1% 77.172.0.0/17 Customers NL ... 200020 1136
70.1% 62.41.128.0/17 KPN B.V. NL ... 200020 1136
70.1% 62.41.0.0/17 KPN B.V. NL ... 200020 1136
70.1% 62.25.32.0/19 KPN B.V. NL ... 200020 1136
70.1% 62.25.0.0/19 KPN B.V. NL ... 200020 1136
70.1% 62.21.192.0/18 KPN B.V. Amsterdam Provincie Noord-Holland NL ... 200020 1136
70.1% 62.21.128.0/18 KPN B.V. NL ... 200020 1136
70.1% 62.207.128.0/17 KPN B.V. NL ... 200020 1136
70.1% 62.207.0.0/17 KPN B.V. Amsterdam Provincie Noord-Holland NL ... 200020 1136
70.1% 62.133.96.0/19 KPN Mobile The Netherlands B.V. NL ... 

Re: Best practices for BGP Communities

2019-03-05 Thread Job Snijders
On Wed, Mar 6, 2019 at 8:32 Smith, Courtney 
wrote:

> On 3/5/19, 6:04 PM, "NANOG on behalf of Job Snijders"
>  j...@instituut.net> wrote:
>
> On Sun, Mar 03, 2019 at 08:42:02PM -0500, Joshua Miller wrote:
> > A while back I read somewhere that transit providers shouldn't delete
> > communities unless the communities have a specific impact to their
> > network, but my google-fu is failing me and I can't find any sources.
> >
> > Is this still the case? Does anyone have a source for the practice of
> > leaving unknown communities alone or deleting them?
>
> https://tools.ietf.org/html/rfc7454#section-11
>
>
> Remember policies between two peers may not be same as customer policies.
>
> Example:  Customers_of_transit_X >>> Transit X >>> Peer_A >>
> Customers_of_Peer_A
>
> Customers_of_Peer_A may use community A:50 to set local pref to 50 in
> Peer_A network.  But that doesn’t mean Customers_of_transit_X can send
> A:50 to set lpref on their routes in Peer_A's network.  Peer_A's policy
> with Transit X likely does not take action on customer communities since
> they are 'peers' not customers.  Transit X can send A:50 to Peer_A but
> nothing would happen.  What's the benefit of Transit X preserving A:50 from
> its customers if it means nothing in Transit X?



OP didn’t specify what kind of BGP communities they were referring to. In
general we can separate communities into two categories: “Informational”
and “Action”. You are right that preserving/propagating “action”
communities (such as in your example) probably isn’t that interesting.
“informational” communities on the other hand can be very valuable.

See https://tools.ietf.org/html/rfc8195 for more information on how the two
types differ.
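
As a rough illustration of treating the two categories differently,
the Python sketch below splits a route's large communities into
"keep and propagate" and "act on, then strip". The ASN and the set of
function identifiers are hypothetical; every operator defines their
own scheme, and RFC 8195 only offers examples.

```python
OUR_ASN = 64496
# Hypothetical function IDs that this network treats as "action".
ACTION_FUNCTIONS = {4, 6}


def split_communities(communities, our_asn=OUR_ASN):
    """Split (global_admin, function, parameter) tuples into two lists.

    Returns (keep, act_on): action communities addressed to us go into
    act_on, everything else (informational or unknown) is kept.
    """
    keep, act_on = [], []
    for community in communities:
        admin, function, _parameter = community
        if admin == our_asn and function in ACTION_FUNCTIONS:
            act_on.append(community)
        else:
            keep.append(community)
    return keep, act_on
```

In this sketch unknown communities from other networks are left alone,
which matches the "don't delete what you don't understand" advice
discussed earlier in the thread.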

Kind regards,

Job

