On Mar 22, 2013, at 6:32 AM, Tmusic wrote:
> I've been trying some things over the last couple days...
>
> The pypy problem was indeed due to some external modules. debug-pox.py does
> not provide much helpful information. Can I suggest adding a
> traceback.print_exc() when an import fails (around line 80 in boot.py, after:
> print("Module not found:", base_name))? In my case it really showed which
> import failed.
This actually should happen. I thought it was there with debug-pox.py, but can
you try running pox.py --verbose and see if it gives a useful stack trace?
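For reference, the kind of change being suggested is roughly this (just a
sketch; try_import is a hypothetical stand-in for the actual import logic
around that point in boot.py):

import traceback

def try_import (name, base_name):
  # Hypothetical stand-in for boot.py's import logic around line ~80
  try:
    __import__(name, level=0)
    return True
  except ImportError:
    traceback.print_exc()   # shows which nested import actually failed
    print("Module not found:", base_name)
    return False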
> For profiling I tried yappi (https://code.google.com/p/yappi/). Not as handy
> as cProfile with RunSnakeRun, but it works with the threading model and
> provides #calls, total time,... per function. It requires some changes in the
> code, but it's possible to create some wrappers and load it as a POX module.
> Let me know if you're interested in the code :)
Sounds interesting. Do you have it in a github fork or anything?
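In case it's useful to others in the meantime, here's the rough shape of such
a wrapper (a sketch only; it assumes yappi's start()/get_func_stats() API and
a hypothetical ext/yappi_profile.py module name):

# ext/yappi_profile.py -- load with something like: ./pox.py yappi_profile <other components>
import atexit
import yappi

def _dump_stats ():
  yappi.stop()
  # Newer yappi exposes get_func_stats(); older versions use
  # yappi.print_stats() instead, so adjust for your version.
  stats = yappi.get_func_stats()
  stats.sort("ttot")
  stats.print_all()

def launch ():
  # builtins=True also profiles C-level builtins; drop it if it's too noisy
  yappi.start(builtins=True)
  atexit.register(_dump_stats)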
> The second issue is the parsing of flow stats. When I'm getting a
> flow_stat_reply from a switch I'm parsing the statistics for each flow
> contained in the reply. It works for up to about 300 flows, but beyond that
> point links start disconnecting again. I tried to split the calculation into
> different parts (don't process them all in one loop, but fire an event for
> each flow that processes only that flow). So far this has had no measurable
> impact. I'm guessing these events are processed right away, which basically
> goes back to the "one big for loop" scenario. Can this be the case?
> Pypy offers an improvement of going up to about 550 flows, but then the same
> issues arise again.
>
> Further, I was looking at the recoco and revent libraries. What I'd like to
> do is submit the "processing events" with a lower priority, so that the
> packet-in events are processed first. I guess this could resolve the problem?
> Are there features in recoco or revent that could help in implementing this?
> When I print the length of the schedule queue (the cycle function in recoco),
> not all fired events seem to be scheduled as separate tasks. Where does the
> processing queue for the events live?
Right, this would be my suggestion: the OpenFlow event handlers are producers
that fill a work queue, and a consumer in the form of a recoco Task tries to
drain it.
I think recoco could make this somewhat simpler than it is with just a little
work, but I've so rarely hit performance problems that I've never fleshed it
out. In theory the recoco.events module might be a nice way to do this, but I
think it's not as general purpose as it should be (and it has been a long time
since I've used it at all). I've thrown together a quick producer/consumer
example using recoco:
https://gist.github.com/MurphyMc/939fccd335fb3920f993
On my machine, run under CPython, the consumer sometimes gets backlogged but
eventually catches up; under PyPy it pretty much stays caught up all the time.
In general, you'll need some application-specific logic for if/when the
consumer gets really backed up (e.g., throw away really old events, don't
yield and just churn through the backlog while stalling the other Tasks,
temporarily raise the consumer Task's priority, etc.). Or, if the problem is
just that your event production is really bursty but not actually more than
you can handle amortized over time, you can ignore it as in the example.
Some of the things you can play with to tune the example (both knobs are
marked in the sketch after this list) are:
1. Adjust the consumer's priority.
2. Adjust the minimum number of items to consume (batch size) in one scheduling
period (the min(10, ...) in run()).
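To make that concrete, here is a minimal sketch of what the consumer side can
look like. This is not the gist, just an illustration of where the two knobs
live; it assumes recoco's Task/Sleep API and uses a plain deque as the work
queue that your OpenFlow handlers (the producers) append to.

from collections import deque
from pox.lib.recoco import Task, Sleep

work_queue = deque()   # producers: flow_stats handlers append entries here

class FlowStatsConsumer (Task):
  def __init__ (self):
    Task.__init__(self)
    self.priority = 1   # knob #1: raise this to favor the consumer

  def run (self):
    while True:
      # knob #2: batch size -- handle at most 10 items per scheduling period
      for _ in range(min(10, len(work_queue))):
        entry = work_queue.popleft()
        # ... process one flow entry here ...
      # An application-specific backlog policy could also go here, e.g.
      # dropping entries that are too old when len(work_queue) gets huge.
      yield Sleep(0.01)   # give other Tasks (discovery, OpenFlow IO) a turn

def launch ():
  FlowStatsConsumer().start()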
In your case, #1 might not do as much as you'd expect, since priority only
matters when another Task -- e.g., the OpenFlow IO Task -- is actually waiting
to run. I'd expect the OpenFlow task to be mostly idle until a flow_stats
reply arrives and you suddenly have a lot of work to do. #2 (or the
equivalent) is probably more useful. You want to set the batch size high
enough that you're not wasting time rescheduling constantly, but low enough
that discovery doesn't get starved.
A lot of the time I get away with a much simpler approximation: the event
handlers just update some state (e.g., counters, a list of expired flows), and
a pending callDelayed tries to process it, then callDelayed()s itself again,
after a shorter delay if there is still work left to do and a longer one if
there isn't.
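As a rough sketch of that pattern (assuming core.callDelayed() and a
hypothetical 'pending' list that your event handlers append to):

from pox.core import core

pending = []   # hypothetical: entries appended by your event handlers

def _process_pending ():
  # Chew through a bounded chunk of the backlog
  for _ in range(min(50, len(pending))):
    entry = pending.pop(0)
    # ... update counters, expire flows, etc. ...
  # Come back quickly if there's still work, otherwise check in again later
  core.callDelayed(0.1 if pending else 1.0, _process_pending)

# Kick it off once, e.g. from your component's launch()
core.callDelayed(1.0, _process_pending)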
Another possibility may just be to adjust discovery's timeouts. There's
nothing magic about the defaults.
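If I remember right, the discovery component's launch() exposes the link
timeout, so you can set it from the command line; check the launch() signature
in pox/openflow/discovery.py for the exact parameter name, but it's roughly:

./pox.py openflow.discovery --link_timeout=30 <your components...>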
> And finally, I noticed strongly varying performance (about 100 flows more or
> less in the reply before it starts crashing) with exactly the same traffic
> patterns. Could this have something to do with the random generator in the
> recoco scheduler's cycle() function?
Doubtful -- you probably don't have any Tasks now that have a priority other
than 1, so the randomization shouldn't kick in. My first guess is that this is
nondeterminism caused by how Python 2.x is switching between the IO thread and
the cooperative thread. If you used the version of recoco from the debugger
branch (which combines these into a single thread), you might find that it
evens out.
-- Murphy