On Feb 17, 2016, at 11:16 PM, IWAMOTO Toshihiro <[email protected]> wrote:
> I think the code is mostly okay now.
>
Well - that's a start. ;)

>> +        self.events = hub.PriorityQueue(96)
>> +        self._event_get_timeout = 5
>> +        # The sum of the following semaphores must equal the number of
>> +        # entries in the events queue.
>> +        self._events_sem = hub.BoundedSemaphore(32)
>> +        self._sc_events_sem = hub.BoundedSemaphore(32)
>> +        self._pi_events_sem = hub.BoundedSemaphore(32)
>
> I still don't like this code much, but assuming event priorities are
> really important to your application, we might need to bear with this
> additional code and overhead.
>

So - let me tell you where I'm coming from, and we can discuss the need
for priorities.

With the issue that I described with respect to OFPFC_ADD (the potential
for the switch to flood the controller with PacketIn events), we were
finding that the controller could *easily* be swamped with PacketIn
events; so much so that "more important" events were not getting
processed (even with a longer queue).

It occurs to me that, while the controller can be overwhelmed by *any* of
several situations where the switch malfunctions and sends too many
events, this is most *likely* to occur with PacketIn events, if the
controller application is inserting flows in a "reactive" manner. This is
*especially* true when high-volume flows somehow manage to fall upon a
rule that forwards to the controller.

Hence my desire to have PacketIn events be of lower priority; they're the
most likely (in a reactive flow insertion scenario) to overwhelm the
controller.

This does not change the fact, however, that *any* asynchronous or
symmetric message *could* be abused by a malicious switch to overwhelm
the controller; it just makes the more likely "overwhelmed by PacketIn
events" scenario less likely to occur.

In thinking about this, though, I am willing to cut this down to 2
semaphores, and a single PriorityQueue of 128 items: 1 semaphore for
PacketIns, and 1 for all other types of events (which *should* - in
general - be far less numerous in a reactive application).
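To make the 2-semaphore scheme concrete, here's a rough sketch - using the stdlib `queue` and `threading` modules as stand-ins for Ryu's `hub` wrappers, and with illustrative names (`EventQueue`, `send_event`, `get_event`) rather than the actual patch:

```python
import itertools
import queue
import threading

# Lower number = served first by PriorityQueue.
PRIO_OTHER = 0      # errors, barriers, replies - "everything else"
PRIO_PACKET_IN = 1  # PacketIn: most likely to flood a reactive controller


class EventQueue(object):
    """One 128-slot priority queue, partitioned by two bounded semaphores."""

    def __init__(self):
        self.events = queue.PriorityQueue(128)
        # The two semaphores must sum to the queue size, so a flood of
        # PacketIns can never occupy the slots reserved for other events.
        self._pi_sem = threading.BoundedSemaphore(64)
        self._other_sem = threading.BoundedSemaphore(64)
        self._seq = itertools.count()  # FIFO tie-break within a priority

    def send_event(self, ev, packet_in=False):
        sem = self._pi_sem if packet_in else self._other_sem
        sem.acquire()  # the sender blocks once its partition is full
        prio = PRIO_PACKET_IN if packet_in else PRIO_OTHER
        self.events.put((prio, next(self._seq), ev))

    def get_event(self):
        prio, _, ev = self.events.get()
        sem = self._pi_sem if prio == PRIO_PACKET_IN else self._other_sem
        sem.release()  # free the slot in the matching partition
        return ev
```

The point of the partition: even if PacketIns arrive faster than they can be drained, the senders of "everything else" events still have 64 slots reserved, and the PriorityQueue serves those events first.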
> But I wonder, if a _recv_loop blocks, aren't the following messages
> from that switch blocked regardless of event priority?
> Isn't it better to accept as many events as possible without blocking?
>

If the _recv_loop in the Datapath blocks, it will only be for three
reasons, as far as I can tell:

1) The socket recv() receives no data, and the TCP stack does not notify
   it of this happening. That's a possibility, and the reason I added a
   timeout on the recv(). Socket operations in eventlet are all
   non-blocking under the covers, and, without a timeout, shutdown() with
   SHUT_RDWR on a non-blocking socket waiting in select()/epoll() may not
   have the effect you expect.

2) A handler called inside the _recv_loop blocks - which is why I added a
   Timeout object. That said - the Timeout is *not* infallible; CPU-bound
   handlers will ignore it entirely.

3) The _recv_loop is trying to do send_event_to_observers() - and blocks
   in send_event(). In this case, yes, that Datapath is waiting to be
   able to queue further events onto the application's event queue -
   which means, in reality, that the events are being queued in the
   socket's receive queue (and are waiting to be read by recv()).

Accepting as many events as possible would seem a good idea - but what if
you've somehow got a 10Gbps flow generating PacketIns? You're going to
saturate the event queue eventually, or run the controller out of memory
with a dynamically growing queue. I know; I tried making a dynamically
growing event queue in my lab environment. ;)

>> +            try:
>> +                priority, (ev, state) = \
>> +                    self.events.get(timeout=self._event_get_timeout)
>
> You no longer need this timeout?
>

So - "in theory" this timeout is unnecessary. In practice, I think it's
good to have it as a defensive measure. It causes little harm; the get()
waits *up to the timeout* to receive an event, then times out. If it
didn't find an event before the timeout expired, it loops back around.
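Sketched in stdlib terms (with `queue.Empty` standing in for the hub's empty-queue exception, and `event_loop`, `dispatch`, and `is_active` as illustrative names rather than the actual Ryu code), that loop shape is roughly:

```python
import queue


def event_loop(events, dispatch, get_timeout=5.0, is_active=lambda: True):
    # Wake up at least every get_timeout seconds even if no events ever
    # arrive, so the loop always gets a chance to notice a shutdown
    # request (and, under eventlet, to yield to other greenlets).
    while is_active():
        try:
            priority, ev = events.get(timeout=get_timeout)
        except queue.Empty:
            continue  # timed out with nothing queued; re-check is_active()
        dispatch(priority, ev)
```

Without the timeout, a get() with no arriving events would park the loop indefinitely, and a request to stop would go unnoticed until the next event happened to show up.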
In effect, it ensures that there's a regular wakeup of the greenlet that
manages the event loop, and the overhead is low (a quick check for None,
and a return to waiting in get() - thereby allowing other greenlets to be
scheduled). I'd like to leave this in here, for safety's sake.

>> +            except Exception as e:
>> +                if ((isinstance(e, hub.Timeout)) and
>> +                        (e is handler_execution_timeout)):
>
> Have you tested this code?
>
> $ python
> >>> issubclass(eventlet.Timeout, Exception)
> False
> >>> issubclass(eventlet.Timeout, BaseException)
> True
>
> It seems that "except Exception" won't catch the Timeout exception.
>

Apparently not. I'm amenable to removing the Timeout on the handlers. I
was being excessively cautious there; furthermore, the fact that the
Timeouts do no real good with CPU-bound handlers means they have little
worth. I'll re-submit the patch without them; OK?

So - my plan:
1) Cut the number of semaphores to 2 (PacketIns and "everything else").
2) Increase the size of the priority queue to 128, but keep it static.
3) Remove the handler Timeouts, and the associated configuration option.

I'll follow this email up with a patch to do that; hopefully, that'll be
more acceptable still. ;)

Best,
Victor
--
Victor J. Orlikowski <> vjo@[cs.]duke.edu

------------------------------------------------------------------------------
_______________________________________________
Ryu-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/ryu-devel
