Yes, I have set ulimit -n to 30000, and the controller is actually calculating the routing path for each PACKET_IN, but I don't think that's a CPU-intensive workload. I have a description of the topology that is loaded by the controller; the controller just searches the in-memory list and sends back a PACKET_OUT or of_flow_mod. (This is a customized version of riplpox, as the original version treats the controller as a big hub.)
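Since ulimit -n only applies to the shell it is run in and to processes started from that shell, it can be worth confirming the limit from inside the Python process itself. Below is a minimal sketch using the standard-library resource module (Unix only). The helper name raise_nofile_limit is made up for illustration and is not part of POX.

```python
import resource

# Exceptions setrlimit may raise. resource.error is its own class on
# Python 2.7 and an alias of OSError on Python 3, so look it up defensively.
_LIMIT_ERRORS = (ValueError, OSError, getattr(resource, "error", OSError))


def raise_nofile_limit(target):
    """Try to raise the soft open-file limit to `target`.

    The request is capped at the hard limit. Returns the (soft, hard)
    pair that is in effect after the attempt.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

    # RLIM_INFINITY means "no limit"; only cap when the hard limit is finite.
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)

    # Nothing to do if the soft limit is already high enough.
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft, hard

    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    except _LIMIT_ERRORS:
        # The OS refused (for example, a system-wide cap); keep the old limit.
        pass
    return resource.getrlimit(resource.RLIMIT_NOFILE)
```

Printing the result of raise_nofile_limit(30000) early in the controller's launcher script would show whether the process really has the limit that was set in the shell.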
I will try to use the POX-based switch.

Best,

Nan

On Sat, Nov 30, 2013 at 5:09 PM, Murphy McCauley <murphy.mccau...@gmail.com> wrote:

> It's certainly possible to overload the controller. Are your switches actually sending it packets or something which requires processing? Does the controller process have high CPU usage?
>
> Maybe you should try recreating my test using the POX-based switch and see what happens for you (since it worked fine for me once I upped the file descriptor limit). For that matter, have you checked your file descriptor limit (e.g., with ulimit -n)?
>
> -- Murphy
>
> On Nov 30, 2013, at 2:04 PM, Nan Zhu <zhunanmcg...@gmail.com> wrote:
>
> > Hi, Murphy,
> >
> > I'm pretty sure that POX is still in the OpenFlow loop. I tested as follows:
> >
> > 1. I added some statements to ensure that POX only processes PACKET_IN after all switches are connected.
> >
> > 2. After POX "freezes", the packet_ins are actually received by the controller.
> >
> > 3. I didn't see any OpenFlow switch disconnect; when a connection is broken, my program should print it to the screen.
> >
> > 4. I observed that some of my software switches establish the connection, but after that they always wait for a while to get the set_config message, etc. I assume that POX is overloaded?
> >
> > I will try to reproduce it and post the log.
> >
> > Best,
> >
> > Nan
> >
> > On Sat, Nov 30, 2013 at 4:53 PM, Murphy McCauley <murphy.mccau...@gmail.com> wrote:
> > I've tested it with many hundred in the past using a modified cbench. It's easier now, since POX contains its own OpenFlow switch.
> > I just ran:
> >
> > #!/bin/bash
> >
> > for I in $(seq 1 300); do
> >   ./pox.py datapaths:softwareswitch &
> > done
> >
> > The controller does stop somewhere around 250 connections, which would be highly suspicious due to its proximity to 256, but in my case, the problem was clearly visible in the log:
> >
> > ERROR:openflow.of_01:Exception reading connection <socket._socketobject object at 0x10e009d00>
> > Traceback (most recent call last):
> >   File "/Users/murphy/proj/pox_dart/pox/openflow/of_01.py", line 905, in run
> >     new_sock = listener.accept()[0]
> >   File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 202, in accept
> > error: [Errno 24] Too many open files
> > ERROR:openflow.of_01:Exception on OpenFlow listener. Aborting.
> >
> > POX has run out of file descriptors and the resulting error has caused the OpenFlow loop to abort. I set it larger using ulimit (500 was the largest I could get without changing the system config), and could now get almost 500 connected.
> >
> > Did you not get this log message? As the FAQ says, posting the log can be very useful. You also don't really describe what you mean by "freeze", though it may be hard to tell. For example, if the OpenFlow loop aborted, POX would appear to freeze as far as OpenFlow was concerned, though timers and other components would continue to run.
> >
> > It's not clear why setting the connection rate to two per second would help if the problem was running out of file descriptors unless this led to some of the switches disconnecting (or being disconnected due to idle)... which might be obvious from reading the log.
> >
> > At any rate, aborting the OpenFlow loop when out of file descriptors is probably just not the right thing to do. I've pushed a fix to dart.
> >
> > -- Murphy
> >
> > On Nov 30, 2013, at 12:31 PM, Nan Zhu <zhunanmcg...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > Does anyone have experimental results on POX's capacity in terms of the number of concurrent connections?
> > >
> > > I wrote some software-defined OpenFlow switches (they partially implement the OpenFlow protocol, everything except the packet counters and queues). Each switch is an in-memory object which can connect to POX and send OpenFlow messages.
> > >
> > > It works well when the number of "OpenFlow switches" is small. However, when I create as many as 320 switches, the POX controller freezes after it has processed ConnectionUp for nearly 250 switches.
> > >
> > > When I limited the connection rate to 2 ConnectionUp events per second, it went back to normal.
> > >
> > > Has anyone had a similar experience?
> > >
> > > Best,
> > >
> > > Nan