Thank you for the detailed answer! I see that some points are unclear
because a detailed description of the particular problems is missing.
Unfortunately, I'm writing my thesis in German, but I hope my comments
clear things up. Besides the classification, I will try to find ways to
identify the occurrence of a problem. I have been looking for anomalies
in repository behavior for several months. I hope this helps a lot in
finding reasonable ways to identify them.
The primary goal is to identify the problems, not to find the guilty
party. Finding the culprit wouldn't help the RP/RO react to a problem.
My comments are below.
On 13.01.2014 18:20, Murphy, Sandra wrote:
(Speaking as regular ol' member)
as some of you know, I'm writing my master's thesis about RPKI at
Deutsche Telekom (Rüdiger Volk). In particular, I try to identify the
"problems (attack, misconfiguration, ...)" of using RPKI as a relying
party/resource owner and try to find ways to identify whether such a
"problem" arises (e.g. competing ROAs).
Sounds like fun.
Yes, it's an interesting topic and I've spent a lot of time reading and
understanding all the drafts/RFCs :-)
Furthermore, I want to give
proposals on how to proceed if a "problem" arises. To sum it up, an RP
should use the RPKI as a reliable tool to improve its routing. I hope
this thesis helps to reduce the concerns of some RPs about using RPKI
for securing inter-domain routing.
Advice on how to proceed in the case of errors would be very interesting.
Besides an understanding of which problems can occur, the advice on how
to proceed should be the added value for an RP. I hope this helps
some deployments :-)
The following classification lists the groups of problems I've
identified including a short description. If possible, I used terms
which are used in SIDR drafts/RFCs. It would be great to get some
feedback on this classification. I guess most of you prefer a textual
description, so I tried to represent it in textual form. Additionally,
you can find a jpg attached.
Some comments below:
Classification of "problems"
1. Incorrect representation of RPs/ROs INRs at "RPKI layer": The
initial transformation of RP/Resource Owner INRs as RPKI objects is not
correct.
You mean X issues a certificate for resources Y to some customer/member and
the resources Y weren't actually held by that customer/member?
So this is X's error?
Yes, this would belong to that "problem" category.
In more detail:
I would say there are three (technical) layers which are of interest to
a relying party/resource owner (RO) regarding inter-domain routing:
1: RP/RO layer: How does an RP/RO want to see its own INRs at the
routing layer and the RPKI layer? Is there a correct mapping? This is
kind of a semantic layer...
2: RPKI layer: Describes the permissions to use an INR (in the form of a
cryptographic object).
3: Routing layer: That's the actual inter-domain routing (BGP).
Every layer has its own challenges/problems. "Incorrect representation
of RPs/ROs INRs at RPKI layer" means that there is an incorrect mapping
and therefore a wrong semantic representation.
There could also be a problem with a wrong representation at
the routing layer (e.g. wrong route announcements because of a wrong
router configuration), but the routing layer is not part of my thesis.
I'm focusing on problems regarding the tool RPKI.
Are you working on just the categorization, or are you going to propose methods
of detecting these problems?
I'm going to try to find ways of detecting the identified problems with
the given public information. And if there is no way with the given
information, I would like to propose extensions to the existing
"information structure".
But only theoretically. Unfortunately there is not enough time to
implement it.
2. Incorrect/untrustworthy/suspicious RPKI information
2.1 Object related
2.1.1 Competing Attack: Router certificate: ASN competes with existing
router certificate;
Could you say what you mean by "ASN competes"?
A router might belong to an organization that uses more than one ASN.
The AS migration case is one particular case of that happening.
So I'd say a router might have multiple router certificates with different ASNs.
So I'm not sure what "ASN competes" means.
Ok, I should change the description :). I mean the case where a router
certificate with AS number X exists and another entity comes up with a
router certificate with the same AS number X but doesn't own this AS
number. Of course there are cases in which this is ok, but here I am
talking about an attack, a misconfiguration...
It would be great if a distinction between intentional and
unintentional cases were possible here.
Other Objects: IP-Range competes with existing
objects (In my opinion, certificates can also compete with other certs
because of their X.509 extensions. Of course, at the end the ROA causes
the problem but it could be kind of an early warning system if competing
certificates are identified --> Should a doctor try to take care of the
cause or just try to allay the pain?).
Again, what do you mean by "competes"?
A prefix holder might authorize more than one ASN to advertise the prefix:
It might authorize upstream providers to announce for it.
It might hold and use multiple ASNs on a regular basis.
There might be AS migration taking place.
And there's the case that a provider has a ROA for its aggregate, has issued
prefixes to its customer and allowed the customer to multi-home with that
prefix, which means the customer might be issuing ROAs for a more specific
prefix to a different ASN.
So there are reasons why there might be multiple ROAs for the same
prefix to different ASNs.
Same as above. I mean attacks, misconfigurations... The classification
is explained in detail in my documentation for a better understanding
but this is unfortunately in German.
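As a rough illustration of how candidates for "competing" ROAs could be
found from public data, here is a minimal Python sketch. The input format
(a list of (prefix, origin ASN) pairs) and the function name are
assumptions made for this example only. As noted above, overlapping
prefixes with different ASNs are often legitimate, so the output is just
a list of candidates for closer inspection, not a list of attacks.

```python
import ipaddress
from itertools import combinations


def find_competing_roas(roas):
    """Return pairs of ROAs whose prefixes overlap but whose origin ASNs differ.

    roas: iterable of (prefix_string, origin_asn) tuples (hypothetical format).
    """
    parsed = [(ipaddress.ip_network(prefix), asn) for prefix, asn in roas]
    candidates = []
    for (net_a, asn_a), (net_b, asn_b) in combinations(parsed, 2):
        # IPv4 and IPv6 prefixes can never overlap, so skip mixed pairs.
        if net_a.version != net_b.version:
            continue
        if asn_a != asn_b and net_a.overlaps(net_b):
            candidates.append(((str(net_a), asn_a), (str(net_b), asn_b)))
    return candidates


example_roas = [
    ("192.0.2.0/24", 64500),
    ("192.0.2.128/25", 64501),   # more specific, different ASN
    ("198.51.100.0/24", 64500),
    ("2001:db8::/32", 64502),
]
print(find_competing_roas(example_roas))
```

Whether a candidate pair is a problem still has to be decided with
additional knowledge (multi-homing, AS migration, upstream agreements).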
2.1.2 Whacked Objects: Object transition which results in a route
transition from valid to missing or invalid.
You mean a failure to correctly handle the timing involved in changing the
state of a route advertisement and getting the ROAs in place before the
route advertisement occurs? Or a failure to correctly handle the refresh
of certificates so that ROAs do not expire?
Not only. I mean any action which affects an RPKI object in a "bad" way.
In the suspenders draft it's explained as:
"Any object in the RPKI can become invalid or inaccessible (to RPs)
via various actions by CAs and/or publication point maintainers along
the certificate path from the object's EE certificate to a trust
anchor (TA). Any action that causes an object to become invalid or
inaccessible is termed "whacking"."
I mean that in the end, what matters is the impact the actions have on
inter-domain routing. Therefore I chose this description: "...which
results in a route transition from valid to missing or invalid." I guess
this was the unclear part?
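To make the "route transition" idea concrete, here is a minimal Python
sketch that compares the validity of a set of routes under two snapshots
of validated ROA payloads. The validity rule follows the usual origin
validation logic (a covering entry with matching ASN and sufficient
maximum length means valid; covered but no match means invalid; not
covered means what I call "missing"). The tuple format and the function
names are assumptions for this example.

```python
import ipaddress


def validity(prefix, origin_asn, vrps):
    """Classify a route as 'valid', 'invalid' or 'missing'.

    vrps: iterable of (prefix_string, max_length, asn) tuples (assumed format).
    """
    route = ipaddress.ip_network(prefix)
    covered = False
    for vrp_prefix, max_length, asn in vrps:
        vrp_net = ipaddress.ip_network(vrp_prefix)
        # Only entries of the same address family that cover the route count.
        if vrp_net.version != route.version or not route.subnet_of(vrp_net):
            continue
        covered = True
        if asn == origin_asn and route.prefixlen <= max_length:
            return "valid"
    return "invalid" if covered else "missing"


def whacked_routes(routes, old_vrps, new_vrps):
    """List routes that were valid before and are missing or invalid now."""
    result = []
    for prefix, asn in routes:
        before = validity(prefix, asn, old_vrps)
        after = validity(prefix, asn, new_vrps)
        if before == "valid" and after != "valid":
            result.append((prefix, asn, after))
    return result
```

Such a comparison says nothing about who caused the change or why; it
only reports that a previously valid route lost that state.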
(Again, are you going to work on the categorization, or are you going to
propose means of identifying these errors? I think I ask because I'm not
sure how you would tell the difference between a failure and deliberate
intention.)
Yes, the distinction between failure and deliberate intention is not
easy for a program to detect, if it is possible at all. I think the
intention of the originator is secondary. The RP/RO has to be able to
detect that a problem has arisen in order to handle it in the best way.
2.1.3 Non-compliance: Non-compliance of RPKI objects can cause bad
behavior of RP software. Weak alg./key length could result in a
downgrade attack. (There are syntactical checks, but for the sake of
completeness and because of possible implementation mistakes it's also
included)
Implementation errors are a problem everywhere, but you're hypothesizing
two errors here: a compliance error in issuing an object and an error in
checking objects for compliance. This is probably more likely if both
implementations have the same source.
Or maybe you are talking about an error in issuing a certificate that
causes the implementation that is validating the certificate to fail
(i.e., crash).
As seen in the past, syntactic/semantic inconsistencies in
certificates cause a lot of problems in the "usual" PKI world. This is
also a potential problem for an RP. The RP software has to interpret the
certificates and objects. Checking for compliance with the detailed
standards helps, in my opinion, to reduce such problems. Of course, such
checks are already included in the current three RP software packages.
As I said, it's for the sake of completeness (academic constraint :-))
and I would say it's an important point for reducing failures.
2.1.4 Expiring object: Check if objects are almost expired/forgotten.
Can cause unwanted routing behavior.
That's another timing issue, and somewhat related to the Whacked Objects
case. Right?
Yes. Could be merged with "whacked objects".
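A check for almost expired objects could look roughly like the following
Python sketch. It assumes that the notAfter times have already been
extracted from the objects by some other tool; the input format and the
seven-day warning window are arbitrary choices for this example.

```python
from datetime import datetime, timedelta, timezone


def expiring_objects(objects, now, warn_window=timedelta(days=7)):
    """Report objects that are expired or will expire within warn_window.

    objects: iterable of (name, not_after) with timezone-aware datetimes
    (assumed to be extracted beforehand).
    """
    report = []
    for name, not_after in objects:
        remaining = not_after - now
        if remaining <= timedelta(0):
            report.append((name, "expired"))
        elif remaining <= warn_window:
            report.append((name, "expiring"))
    return report


now = datetime(2014, 1, 14, tzinfo=timezone.utc)
objects = [
    ("a.roa", now + timedelta(days=90)),
    ("b.roa", now + timedelta(days=2)),
    ("c.cer", now - timedelta(hours=1)),
]
print(expiring_objects(objects, now))
```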
2.2 System related
2.2.1 Replay attack: A whole old dataset could replace a newer one and
could still be valid.
That's one reason for the manifests. If you manage to come up with a
scenario where replay occurred but was undetected by the manifest, it would
be very interesting.
For example, it is possible when there are changes in the repository and
the (now) old manifest has to be put on the CRL for as long as it has
not expired. An attacker could use the old dataset for a replay attack,
and until the manifest expires the RP would see a valid dataset. Of
course, the validity is limited by the update period of the manifest,
but even once the old dataset has expired, it depends on the local
policy of a relying party how this situation is handled. I guess most of
them are not very strict.
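One conceivable way for an RP to notice such a replay is to remember what
it saw at the previous sync and compare. The following Python sketch
assumes the RP keeps the manifest number, thisUpdate and nextUpdate of
the last accepted manifest per publication point; the class and function
names are made up for this example and are not taken from any RP
software.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass(frozen=True)
class ManifestInfo:
    manifest_number: int
    this_update: datetime
    next_update: datetime


def manifest_findings(previous: Optional[ManifestInfo],
                      fetched: ManifestInfo,
                      now: datetime) -> List[str]:
    """Compare a fetched manifest with the previously accepted one."""
    findings = []
    if now > fetched.next_update:
        findings.append("stale")
    if previous is not None:
        # A lower number or older thisUpdate than already seen hints at replay.
        if fetched.manifest_number < previous.manifest_number:
            findings.append("number-regressed")
        if fetched.this_update < previous.this_update:
            findings.append("this-update-regressed")
    return findings
```

Without remembered state (previous is None), an old manifest that is
still inside its validity window produces no finding, which is exactly
the window described above.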
2.2.2 Short lifetime attack: Recurring ROA sets with short lifetimes
could overload the RP software because of its cryptographic checks.
I think that might depend on whether the RP software is set up to re-sync with
the repositories on a periodic basis or on events like expiration.
Another interesting question is whether short lifetimes would cause churn in
the routing space.
I agree, it would depend on the configuration of the local cache.
Furthermore, it would depend on the tier of the RPKI hierarchy in which
the objects are located, the hardware the RP software runs on, ...
At this point, I'm not sure how CPU-intensive the checks are, but the
fact of the matter is that the local cache processes the objects in the
repository. So it's at least theoretically possible.
2.2.3 Incomplete amount of objects: Incompleteness of RPKI objects could
affect the global routing behavior.
Are you talking about objects that are somehow deleted from the repository?
That should be something the manifest should detect, so again, if you come
up with a scenario, that would be interesting.
Are you talking about the "missing" case of route validity? Different RPs
might respond differently to that, and the difference could have interesting
effects on routing - but that's true of any difference in local routing policy.
Are you talking about the "clueless customer" case - a provider produces a
ROA for its aggregate before there are ROAs for its customers who are
advertising more specifics. This is indeed a recognized case - there's
guidance in the origin-ops (see page 5) draft with obligations (MUST)
to providers about this. Which is not to say it won't occur.
I mean the first case. It would at least be possible through a replay
attack, and because of the missing
authentication/confidentiality/integrity, a MITM attack is possible. An
attacker could withhold the whole dataset belonging to a CA certificate
(the CA certificate including all data regarding this certificate on the
repository). As I understand the data structure, an RP wouldn't
recognize that case?!
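For the simpler case in which single objects are deleted or altered but
the manifest itself is obtained, the comparison could be sketched in
Python as follows. Manifests list file names together with SHA-256
hashes; the dictionary formats and the function name are assumptions for
this example. This check only covers objects listed on a manifest the RP
actually received, so it does not by itself address the case above in
which the whole dataset, including the manifest, is withheld.

```python
import hashlib


def compare_with_manifest(manifest_entries, fetched_files):
    """Compare the manifest's file list with what was actually fetched.

    manifest_entries: dict mapping file name -> hex SHA-256 digest
    fetched_files:    dict mapping file name -> file content as bytes
    """
    missing = sorted(n for n in manifest_entries if n not in fetched_files)
    unlisted = sorted(n for n in fetched_files if n not in manifest_entries)
    mismatched = sorted(
        name for name, digest in manifest_entries.items()
        if name in fetched_files
        and hashlib.sha256(fetched_files[name]).hexdigest() != digest
    )
    return {"missing": missing, "mismatched": mismatched, "unlisted": unlisted}
```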
2.2.4 Repository availability: A DoS attack could affect the availability
of the repository.
Anything that affects the availability of the repository would be a problem -
DoS attack, power outage, site unreachable, etc., or even internal RP issues.
Certainly there are operational means to ameliorate the impact. Which is not to
say that it is not a problem. If your focus is on the effect on RPs, there's
no way for the RP to know the cause, so coming up with recommendations of how
to proceed could be tricky.
The availability of the repository is important for an RP to get
updates, and unavailability could result in expired objects. Identifying
the reason (attack, ...) is not my primary intention. Identifying that
the problem has arisen is important in order to avoid, e.g., expired
objects.
--Sandy, speaking as regular ol' member
Kind regards
Demian