(Speaking as regular ol' member)
>as some of you know, I'm writing my master thesis about RPKI at
>Deutsche Telekom (Rüdiger Volk). Especially I try to identify the
>"problems (attack, misconfiguration, ...)" of using RPKI as a relying
>party/resource owner and try to find ways to identify if such a
>"problem" arises (i.e. competing ROAs).

Sounds like fun.

>Furthermore I want to give
>proposals on how to proceed if a "problem" arises. To sum it up, a RP
>should use the RPKI as a reliable tool to improve it's routing. I hope
>this thesis helps to reduce the concerns of some RPs of using RPKI for
>securing inter-domain routing.

Advice on how to proceed in the case of errors would be very interesting.

>The following classification lists the groups of problems I've
>identified including a short description. If possible, I used terms
>which are used in SIDR drafts/RFCs. It would be great to get some
>feedback to this classification. I guess most of you prefer textual
>description, so I tried to represent it in textual form. Additionally,
>you can find a jpg attached.

Some comments below:

>Classification of "problems"
>1. Incorrect representation of RPs/ROs INRs at "RPKI layer": The
>initial transformation of RP/Resource Owner INRs as RPKI objects is not
>correct.

You mean X issues a certificate for resources Y to some customer/member
and the resources Y weren't actually held by that customer/member? So
this is X's error?

Are you working on just the categorization, or are you going to propose
methods of detecting these problems?

>2. Incorrect/untrustworthy/suspicious RPKI information
>2.1 Object related
>2.1.1 Competing Attack: Router certificate: ASN competes with existing
>router certificate;

Could you say what you mean by "ASN competes"? A router might belong to
an organization that uses more than one ASN. The AS migration case is
one particular case of that happening. So I'd say a router might have
multiple router certificates with different ASNs. So I'm not sure what
"ASN competes" means.
> Other Objects: IP-Range competes with existing
>objects (In my opinion, certificates can also compete with other certs
>because of their X.509 extensions. Of course, at the end the ROA causes
>the problem but it could be kind of an early warning system if competing
>certificates are identified --> Should a doctor try to take care of the
>cause or just try to allay the pain?).

Again, what do you mean by "competes"? A prefix holder might authorize
more than one ASN to advertise the prefix: It might authorize upstream
providers to announce for it. It might hold and use multiple ASNs on a
regular basis. There might be AS migration taking place. And there's
the case that a provider has a ROA for its aggregate, has issued
prefixes to its customer and allowed the customer to multi-home with
that prefix, which means the customer might be issuing ROAs for a more
specific prefix to a different ASN.

So there are reasons why there might be multiple ROAs for the same
prefix to different ASNs.

>2.1.2 Whacked Objects: Object transition which results in a route
>transition from valid to missing or invalid.

You mean a failure to correctly handle the timing involved in changing
the state of a route advertisement and getting the ROAs in place before
the route advertisement occurs? Or a failure to correctly handle the
refresh of certificates so that ROAs do not expire?

(Again, are you going to work on the categorization, or are you going to
propose means of identifying these errors? I think I ask because I'm
not sure how you would tell the difference between a failure and
deliberate intention.)

>2.1.3 Non-compliance: Non-compliance of RPKI objects can cause bad
>behavior of RP software. Weak alg./key length could result in a
>downgrade attack.
>(There are syntactical checks, but for the sake of
>completeness and because of possible implementation mistakes it's also
>included)

Implementation errors are a problem everywhere, but you're hypothesizing
two errors here: a compliance error in issuing an object and an error in
checking objects for compliance. This is probably more likely if both
implementations have the same source.

Or maybe you are talking about an error in issuing a certificate that
causes the implementation validating the certificate to fail (i.e.,
crash).

>2.1.4 Expiring object: Check if objects are almost expired/forgotten.
>Can cause unwanted routing behavior.

That's another timing issue, and somewhat related to the Whacked Objects
case. Right?

>2.2 System related
>2.2.1 Replay attack: A whole old dataset could replace a newer one and
>could be still valid.

That's one reason for the manifests. If you manage to come up with a
scenario where replay occurred but was undetected by the manifest, it
would be very interesting.

>2.2.2 Short lifetime attack: Recurring ROA sets with short lifetimes
>could overload the RP software because of it's cryptographic checks.

I think that might depend on whether the RP software is set up to
re-sync with the repositories on a periodic basis or on events like
expiration.

Another interesting question is whether short lifetimes would cause
churn in the routing space.

>2.2.3 Incomplete amount of objects: Incompleteness of RPKI objects could
>affect the global routing behavior.

Are you talking about objects that are somehow deleted from the
repository? That should be something the manifest should detect, so
again, if you come up with a scenario, that would be interesting.

Are you talking about the "missing" case of route validity? Different
RPs might respond differently to that, and the difference could have
interesting effects on routing - but that's true of any difference in
local routing policy.
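
For readers who want the validity states spelled out, here is a minimal
sketch of origin validation along the lines of the RFC 6811 matching
rules. It is an illustration only, not how any particular RP software
works: the names (VRP, validate), the prefixes and the ASNs are all
made up for the example, and "not-found" is what this thread calls
"missing".

```python
from dataclasses import dataclass
from ipaddress import ip_network


@dataclass(frozen=True)
class VRP:
    """Hypothetical validated ROA payload: prefix, max length, origin ASN."""
    prefix: str
    max_length: int
    asn: int


def validate(route_prefix, origin_asn, vrps):
    """Return 'valid', 'invalid' or 'not-found' for one announced route."""
    route = ip_network(route_prefix)
    covered = False
    for vrp in vrps:
        roa_net = ip_network(vrp.prefix)
        # subnet_of() raises TypeError across address families, so skip those.
        if route.version != roa_net.version:
            continue
        if not route.subnet_of(roa_net):
            continue
        covered = True  # at least one ROA prefix covers this route
        # A match needs the right origin and a length within max_length;
        # AS 0 never matches any origin.
        if (vrp.asn != 0 and vrp.asn == origin_asn
                and route.prefixlen <= vrp.max_length):
            return "valid"
    return "invalid" if covered else "not-found"


# Assumed example data: a provider ROA for its aggregate plus a ROA for a
# multi-homed customer's more specific prefix with a different ASN.
vrps = [
    VRP("203.0.113.0/24", 24, 64496),
    VRP("203.0.113.0/25", 25, 64497),
]

print(validate("203.0.113.0/24", 64496, vrps))    # provider aggregate
print(validate("203.0.113.0/25", 64497, vrps))    # customer with its own ROA
print(validate("203.0.113.128/25", 64498, vrps))  # more specific, no ROA of its own
print(validate("198.51.100.0/24", 64499, vrps))   # nothing covers this prefix
```

In this sketch two ROAs for overlapping prefixes with different ASNs
coexist without either route becoming invalid, while a more specific
announcement that is covered only by the aggregate's ROA comes out
invalid. What a router does with each state is left out on purpose,
since that is local policy.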

Are you talking about the "clueless customer" case - a provider
produces a ROA for its aggregate before there are ROAs for its customers
who are advertising more specifics? This is indeed a recognized case -
there's guidance in the origin-ops draft (see page 5) with obligations
(MUST) to providers about this. Which is not to say it won't occur.

>2.2.4 repository availability: A DoS attack could affect the availability
>of the Repository.

Anything that affects the availability of the repository would be a
problem - DoS attack, power outage, site unreachable, etc., or even
internal RP issues. Certainly there are operational means to ameliorate
the impact. Which is not to say that it is not a problem. If your
focus is on the effect on RPs, there's no way for the RP to know the
cause, so coming up with recommendations of how to proceed could be
tricky.

--Sandy, speaking as regular ol' member

_______________________________________________
sidr mailing list
https://www.ietf.org/mailman/listinfo/sidr
