Alexis, thanks very much for sharing this test. Would you mind opening a bug 
with all this info so we can track it?


> On Feb 18, 2016, at 7:29 AM, Alexis de Talhouët <[email protected]> 
> wrote:
> 
> Hi Michal,
> 
> ODL memory is capped at 2 GB; the more memory I add, the more OVS instances 
> I can connect. Regarding CPU, it’s around 10-20% when connecting new OVS 
> instances, with some peaks at 80%.
>  
> After some investigation, here is what I observed:
> Let’s say I have 50 switches connected, with the stat manager disabled. I 
> have one open socket per switch, plus an additional one for the controller.
> Then I connect a new switch (2016-02-18 09:35:08,059), making 51 switches. 
> Something happens that causes all connections to be dropped (by the device?), 
> and then ODL tries to recreate them and goes into a loop where it is never 
> able to re-establish communication but keeps creating new sockets.
> I suspect something is being garbage collected due to lack of memory, 
> although there are no OOM errors.
> 
> Attached are the YourKit Java Profiler analysis for the described scenario 
> and the logs [1].
> 
> Thanks,
> Alexis
> 
> [1]: https://www.dropbox.com/sh/dgqeqv4j76zwbh3/AACim0za1fUozc7DlYJ4fsMJa?dl=0
>  
>> On Feb 9, 2016, at 8:59 AM, Michal Rehak -X (mirehak - PANTHEON TECHNOLOGIES 
>> at Cisco) <[email protected]> wrote:
>> 
>> Hi Alexis,
>> I am not sure how OVS uses threads. The changelog mentions some 
>> concurrency-related improvements in 2.1.3 and 2.3.
>> Also, I guess Docker can be constrained in terms of assigned resources.
>> 
>> For you, the most important thing is the number of cores used by the 
>> controller.
>> 
>> What do your CPU and memory consumption look like when you connect all the 
>> OVSs?
>> 
>> Regards,
>> Michal
>> 
>> ________________________________________
>> From: Alexis de Talhouët <[email protected]>
>> Sent: Tuesday, February 9, 2016 14:44
>> To: Michal Rehak -X (mirehak - PANTHEON TECHNOLOGIES at Cisco)
>> Cc: [email protected]
>> Subject: Re: [openflowplugin-dev] Scalability issues
>> 
>> Hello Michal,
>> 
>> Yes, all the OvS instances I’m running have a unique DPID.
>> 
>> Regarding the thread limit for netty, I’m running the tests on a server that 
>> has 28 CPUs.
>> 
>> Is each OvS instance assigned its own thread?
>> 
>> Thanks,
>> Alexis
>> 
>> 
>>> On Feb 9, 2016, at 3:42 AM, Michal Rehak -X (mirehak - PANTHEON 
>>> TECHNOLOGIES at Cisco) <[email protected]> wrote:
>>> 
>>> Hi Alexis,
>>> in the Li design, the stats manager is not a standalone app but part of 
>>> the core of ofPlugin. You can disable it via RPC.
>>> 
>>> Just a question regarding your OVS setup: are all DPIDs unique?
>>> 
>>> Also, netty has a limit on the number of threads it uses. By default 
>>> it uses 2 x cpu_cores_amount. You should have as many cores as possible in 
>>> order to get maximum performance.
>>> 
>>> 
>>> 
>>> Regards,
>>> Michal
>>> 
>>> 
>>> 
>>> ________________________________________
>>> From: [email protected] on behalf of 
>>> Alexis de Talhouët <[email protected]>
>>> Sent: Tuesday, February 9, 2016 00:45
>>> To: [email protected]
>>> Subject: [openflowplugin-dev] Scalability issues
>>> 
>>> Hello openflowplugin-dev,
>>> 
>>> I’m currently running some scalability tests against the openflowplugin-li 
>>> plugin, stable/lithium.
>>> Playing with the CSIT job, I was able to connect up to 1090 switches: 
>>> https://git.opendaylight.org/gerrit/#/c/33213/
>>> 
>>> I’m now running the test against 40 OvS switches, each one of them in a 
>>> Docker container.
>>> 
>>> Connecting around 30 of them works fine, but then adding a new one 
>>> completely breaks ODL; it goes crazy and becomes unresponsive.
>>> Attached is a snippet of the karaf.log with the log level set to DEBUG for 
>>> org.opendaylight.openflowplugin, so it’s a really big log (~2.5MB).
>>> 
>>> Here is what I observed based on the log:
>>> I have 30 switches connected, all works fine. Then I add a new one:
>>> - SalRoleServiceImpl starts doing its thing (2016-02-08 23:13:38,534)
>>> - RpcManagerImpl Registering Openflow RPCs (2016-02-08 23:13:38,546)
>>> - ConnectionAdapterImpl Hello received (2016-02-08 23:13:40,520)
>>> - Creation of the transaction chain, …
>>> 
>>> Then everything starts falling apart with this log:
>>>> 2016-02-08 23:13:50,021 | DEBUG | ntLoopGroup-11-9 | ConnectionContextImpl 
>>>>            | 190 - org.opendaylight.openflowplugin.impl - 0.1.4.SNAPSHOT | 
>>>> disconnecting: node=/172.31.100.9:46736|auxId=0|connection state = RIP
>>> And then ConnectionContextImpl disconnects the switches one by one, and 
>>> RpcManagerImpl is unregistered.
>>> Then it goes crazy for a while.
>>> But all I’ve done is add a new switch.
>>> 
>>> Finally, at 2016-02-08 23:14:26,666, exceptions are thrown:
>>>> 2016-02-08 23:14:26,666 | ERROR | lt-dispatcher-85 | 
>>>> LocalThreePhaseCommitCohort      | 172 - 
>>>> org.opendaylight.controller.sal-distributed-datastore - 1.2.4.SNAPSHOT | 
>>>> Failed to prepare transaction member-1-chn-5-txn-180 on backend
>>>> akka.pattern.AskTimeoutException: Ask timed out on 
>>>> [ActorSelection[Anchor(akka://opendaylight-cluster-data/), 
>>>> Path(/user/shardmanager-operational/member-1-shard-inventory-operational#-1518836725)]]
>>>>  after [30000 ms]
>>> And it goes on for a while.
>>> 
>>> Do you have any input on this?
>>> 
>>> Could you give some advice on how to scale? (I know disabling 
>>> StatisticManager can help, for instance.)
>>> 
>>> Am I doing something wrong?
>>> 
>>> I can provide any information needed regarding the issue I’m facing.
>>> 
>>> Thanks,
>>> Alexis
>>> 
>>> 
>> 
> 
> _______________________________________________
> openflowplugin-dev mailing list
> [email protected]
> https://lists.opendaylight.org/mailman/listinfo/openflowplugin-dev
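
For reference, the netty thread limit mentioned in the thread (2 x cpu_cores_amount by default) can be estimated with a small sketch like the one below. The function name and its parameters are hypothetical and not part of openflowplugin; the optional override is meant to mirror Netty's `io.netty.eventLoopThreads` system property, and whether the controller sets that property is not known from this thread.

```python
import os


def default_netty_threads(cores=None, override=None):
    """Estimate Netty's default event-loop thread count.

    Assumption: Netty uses max(1, 2 * available cores) unless the
    io.netty.eventLoopThreads system property overrides it.
    `cores` and `override` are illustrative parameters only.
    """
    if override is not None:
        # An explicit override replaces the 2 x cores default.
        return max(1, override)
    if cores is None:
        # Fall back to the core count of the machine running this script.
        cores = os.cpu_count() or 1
    return max(1, 2 * cores)


if __name__ == "__main__":
    # The test server in the thread is reported to have 28 CPUs.
    print(default_netty_threads(28))  # prints 56
```

For the 28-CPU server described above this gives 56 threads. Since Netty event loops typically serve many connections each, this figure is only an upper bound on concurrently running I/O threads, not on the number of switches.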
