Hi Alexis,
I am not sure how OVS uses threads. The changelog mentions some concurrency-related improvements in 2.1.3 and 2.3. I also guess Docker can be configured to limit the resources assigned to each container.
For you, the most important factor is the number of cores used by the controller. What do your CPU and memory consumption look like when you connect all the OVSs?

Regards,
Michal

________________________________________
From: Alexis de Talhouët <[email protected]>
Sent: Tuesday, February 9, 2016 14:44
To: Michal Rehak -X (mirehak - PANTHEON TECHNOLOGIES at Cisco)
Cc: [email protected]
Subject: Re: [openflowplugin-dev] Scalability issues

Hello Michal,

Yes, all the OvS instances I’m running have a unique DPID.

Regarding the thread limit for netty, I’m running the tests on a server that has 28 CPUs.
Is each OvS instance assigned its own thread?

Thanks,
Alexis

> On Feb 9, 2016, at 3:42 AM, Michal Rehak -X (mirehak - PANTHEON TECHNOLOGIES
> at Cisco) <[email protected]> wrote:
>
> Hi Alexis,
> in the Li design, the stats manager is not a standalone app but part of the
> ofPlugin core. You can disable it via RPC.
>
> Just a question regarding your OVS setup: are all your DPIDs unique?
>
> Netty also limits the number of threads it uses. By default
> it uses 2 x cpu_cores_amount. You should have as many cores as possible in
> order to get maximum performance.
>
> Regards,
> Michal
>
> ________________________________________
> From: [email protected]
> <[email protected]> on behalf of Alexis de
> Talhouët <[email protected]>
> Sent: Tuesday, February 9, 2016 00:45
> To: [email protected]
> Subject: [openflowplugin-dev] Scalability issues
>
> Hello openflowplugin-dev,
>
> I’m currently running some scalability tests against the openflowplugin-li plugin,
> stable/lithium.
> Playing with the CSIT job, I was able to connect up to 1090 switches:
> https://git.opendaylight.org/gerrit/#/c/33213/
>
> I’m now running the test against 40 OvS switches, each of them running in a
> docker container.
>
> Connecting around 30 of them works fine, but then adding a new one completely
> breaks ODL: it goes crazy and becomes unresponsive.
> Attached is a snippet of the karaf.log with the log level set to DEBUG for
> org.opendaylight.openflowplugin, so it’s a really big log (~2.5MB).
>
> Here is what I observed based on the log:
> I have 30 switches connected and everything works fine. Then I add a new one:
> - SalRoleServiceImpl starts doing its thing (2016-02-08 23:13:38,534)
> - RpcManagerImpl Registering Openflow RPCs (2016-02-08 23:13:38,546)
> - ConnectionAdapterImpl Hello received (2016-02-08 23:13:40,520)
> - Creation of the transaction chain, …
>
> Then everything starts falling apart with this log:
>> 2016-02-08 23:13:50,021 | DEBUG | ntLoopGroup-11-9 | ConnectionContextImpl
>> | 190 - org.opendaylight.openflowplugin.impl - 0.1.4.SNAPSHOT |
>> disconnecting: node=/172.31.100.9:46736|auxId=0|connection state = RIP
> And then ConnectionContextImpl disconnects the switches one by one, and
> RpcManagerImpl is unregistered.
> Then it goes crazy for a while.
> But all I’ve done is add a new switch.
>
> Finally, at 2016-02-08 23:14:26,666, exceptions are thrown:
>> 2016-02-08 23:14:26,666 | ERROR | lt-dispatcher-85 |
>> LocalThreePhaseCommitCohort | 172 -
>> org.opendaylight.controller.sal-distributed-datastore - 1.2.4.SNAPSHOT |
>> Failed to prepare transaction member-1-chn-5-txn-180 on backend
>> akka.pattern.AskTimeoutException: Ask timed out on
>> [ActorSelection[Anchor(akka://opendaylight-cluster-data/),
>> Path(/user/shardmanager-operational/member-1-shard-inventory-operational#-1518836725)]]
>> after [30000 ms]
> And it goes on like this for a while.
>
> Do you have any input on this?
>
> Could you give some advice on how to scale? (I know disabling
> StatisticManager can help, for instance.)
>
> Am I doing something wrong?
>
> I can provide any information you need regarding the issue I’m facing.
>
> Thanks,
> Alexis
>
>
_______________________________________________
openflowplugin-dev mailing list
[email protected]
https://lists.opendaylight.org/mailman/listinfo/openflowplugin-dev
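
For reference, the default thread rule mentioned in this thread (2 x cpu_cores_amount) can be turned into a quick back-of-the-envelope check. This is only an arithmetic sketch in Python, not ODL or netty code: the function names are hypothetical, and the even spread of switch connections across threads is an assumption that the thread does not confirm.

```python
def default_netty_threads(cpu_cores: int) -> int:
    """Thread count under the '2 x cpu_cores_amount' rule quoted in the thread."""
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    return 2 * cpu_cores


def switches_per_thread(switch_count: int, cpu_cores: int) -> float:
    """Average switch connections per thread.

    Assumes connections are spread evenly across threads; this is an
    illustrative assumption, not something stated in the thread.
    """
    if switch_count < 0:
        raise ValueError("switch_count must not be negative")
    return switch_count / default_netty_threads(cpu_cores)


if __name__ == "__main__":
    cores = 28      # the server described in the thread
    switches = 40   # the OvS containers described in the thread
    print(default_netty_threads(cores))  # 56
    print(switches_per_thread(switches, cores))
```

With 28 cores the rule gives 56 threads, which is more than the 40 switches in the test, so by this estimate alone the thread limit would not be the first suspect. That fits the suggestion to check the controller's CPU and memory consumption.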
