On Thu, Aug 8, 2019 at 1:52 AM Han Zhou <zhou...@gmail.com> wrote: > > > > On Wed, Aug 7, 2019 at 8:12 AM Dumitru Ceara <dce...@redhat.com> wrote: >> >> On Mon, Aug 5, 2019 at 5:34 PM Han Zhou <zhou...@gmail.com> wrote: >> > >> > >> > >> > On Mon, Aug 5, 2019 at 1:06 AM Dumitru Ceara <dce...@redhat.com> wrote: >> > > >> > > On Mon, Aug 5, 2019 at 1:41 AM Han Zhou <zhou...@gmail.com> wrote: >> > > > >> > > > >> > > > >> > > > On Mon, Jul 29, 2019 at 2:30 AM Dumitru Ceara <dce...@redhat.com> >> > > > wrote: >> > > > > >> > > > > Add a restriction on the target protocol addresses to match the >> > > > > configured subnets. All other ARP/ND packets are invalid in this >> > > > > context. >> > > > > >> > > > > One exception is for ARP replies that are received for route >> > > > > next-hops >> > > > > that are only reachable via a port but can't be directly resolved >> > > > > through route lookups. Such support was introduced by commit: >> > > > > >> > > > > 6b785fd8fe29 ("ovn-util: Allow /32 IP addresses for router ports.") >> > > > > >> > > > > Reported-at: https://bugzilla.redhat.com/1729846 >> > > > > Reported-by: Haidong Li <ha...@redhat.com> >> > > > > CC: Han Zhou <zhou...@gmail.com> >> > > > > CC: Guru Shetty <g...@ovn.org> >> > > > > Fixes: b068454082f5 ("ovn-northd: Support learning neighbor from ARP >> > > > > request.") >> > > > > Signed-off-by: Dumitru Ceara <dce...@redhat.com> >> > > > >> > > > Hi Dumitru, >> > > >> > > Hi Han, >> > > >> > > Thanks for reviewing this. >> > > >> > > > >> > > > Sorry for my slow response, and thanks a lot for revising the patch >> > > > for a bigger scope of validations. However, the exception of /32 >> > > > address makes me thinking more about this patch. If ARP replies is >> > > > allowed from any nexthop for a LR port with /32, at least ARP request >> > > > for GARP purpose should also be allowed. But before asking for a v3, >> > > > I'd hold on to rethink the purpose of this patch. >> > > >> > > Right, we should allow GARP requests too. If we decide to go ahead >> > > with this patch I'll add a function in v3 to handle all types of ARPs >> > > and call it both for unreachable static route next-hops and attached >> > > subnets. >> > > >> > > > >> > > > The nexthop specific flows are now from static routes. What if OVN >> > > > support dynamical routing protocols in the future? Of course we can >> > > > then take those dynamic nexthops into allowed peers. But then I am >> > > > thinking what is the real benefit of all these restrictions? Why can't >> > > > we just have simpler logic to handle all these situations without >> > > > validation? I think the major benefit of the validation is to avoid >> > > > handling the noise ARP/NDs when multiple subnets shares same L2, but >> > > > most cases it is really not a big deal, right? For the CPU problem >> > > > caused by ARP flooding as mentioned by the bug report, it is a real >> > > > problem, but this patch seems not really helpful for that, because the >> > > > attacker can just trigger the same CPU problem with *valid* packets. >> > > > So I am not sure if the benefit of the change is worth the complexity >> > > > it introduced. Please share your thought and correct me if I missed >> > > > something. >> > > > >> > > > Thanks, >> > > > Han >> > > >> > > I assume the simpler logic to handle all these situations without >> > > validation is to add rate limiting for ARP packets that get punted to >> > > the controller. I agree that this should be implemented too. >> > > >> > > But I think rate limiting all ARP packets regardless of IP addresses >> > > is not enough. In the following scenario and if we would have a way to >> > > rate limit ARP packets: >> > > - Subnet 42.42.42.0/24 configured on the OVN >> > > - "Invalid" ARP packets are injected at high rate in the network for >> > > 41.41.41.41 >> > > - Host 42.42.42.42 sends GARP. >> > > - Rate limiting of ARP packets towards controller at 100pps >> > > >> > > With the current code, ARP packets for 41.41.41.41 will be punted to >> > > controller at a rate of at most 100 per second. But the chances that >> > > the valid 42.42.42.42 GARP is dropped is really high. >> > > >> > > If we instead use the following approach: >> > > a. Punt only useful ARPs to controller. >> > > b. Rate limit ARPs that are sent to controller. >> > > >> > > Then ARP packets outside 42.42.42./24 are never punted to the >> > > controller and don't consume any rate limiting tokens. For the second >> > > case, when an attacker would flood with valid ARP packets we would >> > > have the rate limit in place to protect the controller CPU. >> > > >> > > My commit addresses point "a" above as I think point "b" should be >> > > done in a generic way for all protocol packets that need to reach the >> > > controller. >> > > >> > > For dynamic routing protocols on the other hand I think we should not >> > > install routes with next-hops that are unreachable through other >> > > direct or indirect routes. >> > > >> > > Thanks, >> > > Dumitru >> > >> > I agree that blindly ARP rate limit is not helpful, but a) is not really >> > helpful in this case either. In your example, the attacker can just use >> > any valid IP in 42.42.42.0/24 to send GARP flooding, which would result in >> > exactly same result that a useful GARP from 42.42.42.42 is dropped because >> > of blindly rate limiting all ARPs. To solve the problem properly, the ARP >> > rate limiting must be done per IP. >> >> Ok, ideally ARP rate limiting should be done per IP but it would take >> quite a lot of resources to keep that information per host. >> >> Any idea how to implement that in an efficient way? There are >> scenarios when we don't know beforehand the IPs of the hosts running >> in the network so that we can whitelist them. Also, from what I've >> seen physical routers usually have a single queue for control plane >> protection for ARPs. >> >> Thanks, >> Dumitru > > > Yes, I agree that system resource is a challenge for per IP rate limiting. > And yes there is no way to whitelist (because otherwise ARP is not needed). > We may make some trade-off between the accuracy and efficiency. For example, > we can have separate meter groups for each logical router port for ARP > ratelimiting. Each meter group may have e.g. 64 meters, and then having a > stage to do hash for the IPs, and in the next stage using the hash result > (between 0 - 63) as index to use corresponding meter to do ratelimiting. This > way, flooding to a specific IP will impact only the other IPs that fall into > the same hash bucket on the same router port. But this is just a rough idea > and I believe many more details still need to be figured out for the hashing > part. As a first step maybe we can just do ratelimiting per router port. It > is definitely better than nothing. What do you think?
Hi Han, It sounds like a good idea to have a hashtable for rate limiting groups of ARPs. I will start like you suggested with a single bucket per router port. The question is if we still want to punt only ARPs for configured networks and nexthops like this patch is trying to do. I would add it and then implement rate limiting as a follow up patch/series. Thanks, Dumitru _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev