Alexis, thanks very much for sharing this test. Would you mind opening a bug with all this info so we can track it?

> On Feb 18, 2016, at 7:29 AM, Alexis de Talhouët <[email protected]> wrote:
>
> Hi Michal,
>
> ODL memory is capped at 2 GB; the more memory I add, the more OVS I can connect. Regarding CPU, it’s around 10-20% when connecting new OVS, with some peaks to 80%.
>
> After some investigation, here is what I observed:
> Let's say I have 50 switches connected, stat manager disabled. I have one open socket per switch, plus an additional one for the controller.
> Then I connect a new switch (2016-02-18 09:35:08,059), 51 switches… something happens that causes all connections to be dropped (by device?), and then ODL tries to recreate them and goes into a crazy loop where it is never able to re-establish communication, but keeps creating new sockets.
> I suspect something is being garbage collected due to lack of memory, although there are no OOM errors.
>
> Attached are the YourKit Java Profiler analysis for the described scenario and the logs [1].
>
> Thanks,
> Alexis
>
> [1]: https://www.dropbox.com/sh/dgqeqv4j76zwbh3/AACim0za1fUozc7DlYJ4fsMJa?dl=0
>
>> On Feb 9, 2016, at 8:59 AM, Michal Rehak -X (mirehak - PANTHEON TECHNOLOGIES at Cisco) <[email protected]> wrote:
>>
>> Hi Alexis,
>> I am not sure how OVS uses threads - in the changelog there are some concurrency-related improvements in 2.1.3 and 2.3.
>> Also I guess docker can be forced regarding assigned resources.
>>
>> For you the most important thing is the number of cores used by the controller.
>>
>> What do your CPU and memory consumption look like when you connect all the OVSs?
>>
>> Regards,
>> Michal
>>
>> ________________________________________
>> From: Alexis de Talhouët <[email protected]>
>> Sent: Tuesday, February 9, 2016 14:44
>> To: Michal Rehak -X (mirehak - PANTHEON TECHNOLOGIES at Cisco)
>> Cc: [email protected]
>> Subject: Re: [openflowplugin-dev] Scalability issues
>>
>> Hello Michal,
>>
>> Yes, all the OvS instances I’m running have a unique DPID.
>>
>> Regarding the thread limit for netty, I’m running the test on a server that has 28 CPU(s).
>>
>> Is each OvS instance assigned its own thread?
>>
>> Thanks,
>> Alexis
>>
>>
>>> On Feb 9, 2016, at 3:42 AM, Michal Rehak -X (mirehak - PANTHEON TECHNOLOGIES at Cisco) <[email protected]> wrote:
>>>
>>> Hi Alexis,
>>> in the Li-design the stats manager is not a standalone app but part of the core of ofPlugin. You can disable it via rpc.
>>>
>>> Just a question regarding your ovs setup. Are all your DPIDs unique?
>>>
>>> Also there is a limit for netty in the form of the number of threads used. By default it uses 2 x cpu_cores_amount. You should have as many cores as possible in order to get max performance.
>>>
>>> Regards,
>>> Michal
>>>
>>> ________________________________________
>>> From: [email protected] <[email protected]> on behalf of Alexis de Talhouët <[email protected]>
>>> Sent: Tuesday, February 9, 2016 00:45
>>> To: [email protected]
>>> Subject: [openflowplugin-dev] Scalability issues
>>>
>>> Hello openflowplugin-dev,
>>>
>>> I’m currently running some scalability tests against the openflowplugin-li plugin, stable/lithium.
>>> Playing with the CSIT job, I was able to connect up to 1090 switches: https://git.opendaylight.org/gerrit/#/c/33213/
>>>
>>> I’m now running the test against 40 OvS switches, each one of them in a docker container.
>>>
>>> Connecting around 30 of them works fine, but then adding a new one completely breaks ODL; it goes crazy and becomes unresponsive.
>>> Attached is a snippet of the karaf.log with log set to DEBUG for org.opendaylight.openflowplugin, thus it’s a really big log (~2.5MB).
>>>
>>> Here is what I observed based on the log:
>>> I have 30 switches connected, all works fine. Then I add a new one:
>>> - SalRoleServiceImpl starts doing its thing (2016-02-08 23:13:38,534)
>>> - RpcManagerImpl Registering Openflow RPCs (2016-02-08 23:13:38,546)
>>> - ConnectionAdapterImpl Hello received (2016-02-08 23:13:40,520)
>>> - Creation of the transaction chain, …
>>>
>>> Then it all starts falling apart with this log:
>>>> 2016-02-08 23:13:50,021 | DEBUG | ntLoopGroup-11-9 | ConnectionContextImpl | 190 - org.opendaylight.openflowplugin.impl - 0.1.4.SNAPSHOT | disconnecting: node=/172.31.100.9:46736|auxId=0|connection state = RIP
>>> And then ConnectionContextImpl disconnects the switches one by one, and RpcManagerImpl is unregistered.
>>> Then it goes crazy for a while.
>>> But all I’ve done is add a new switch.
>>>
>>> Finally, at 2016-02-08 23:14:26,666, exceptions are thrown:
>>>> 2016-02-08 23:14:26,666 | ERROR | lt-dispatcher-85 | LocalThreePhaseCommitCohort | 172 - org.opendaylight.controller.sal-distributed-datastore - 1.2.4.SNAPSHOT | Failed to prepare transaction member-1-chn-5-txn-180 on backend
>>>> akka.pattern.AskTimeoutException: Ask timed out on [ActorSelection[Anchor(akka://opendaylight-cluster-data/), Path(/user/shardmanager-operational/member-1-shard-inventory-operational#-1518836725)]] after [30000 ms]
>>> And it goes on like that for a while.
>>>
>>> Do you have any input on this?
>>>
>>> Could you give some advice on how to scale? (I know disabling StatisticManager can help, for instance.)
>>>
>>> Am I doing something wrong?
>>>
>>> I can provide any information you need regarding the issue I’m facing.
>>>
>>> Thanks,
>>> Alexis
>
> _______________________________________________
> openflowplugin-dev mailing list
> [email protected]
> https://lists.opendaylight.org/mailman/listinfo/openflowplugin-dev
