Alexis, thanks for the bug and the patch, and keep up the good work digging into openflowplugin.
JamO

On 03/04/2016 07:38 AM, Alexis de Talhouët wrote:
> JamO,
>
> Here is the bug: https://bugs.opendaylight.org/show_bug.cgi?id=5464
> Here is the patch in int/test: https://git.opendaylight.org/gerrit/#/c/35813/
> It is still WIP. And yes, I believe we should have a CSIT job running the test.
>
> Thanks,
> Alexis
>
>> On Mar 3, 2016, at 12:41 AM, Jamo Luhrsen <[email protected]> wrote:
>>
>> On 02/19/2016 02:10 PM, Alexis de Talhouët wrote:
>>> So far my results are:
>>>
>>> OVS 2.4.0: ODL configured with 2G of mem -> max is ~50 switches connected
>>> OVS 2.3.1: ODL configured with 256MB of mem -> I currently have 150 switches
>>> connected, can't scale more due to infra limits.
>>
>> Alexis, I think this is probably worth putting a bugzilla up.
>>
>> How much horsepower do you need per docker ovs instance? We need to get this
>> automated in CSIT. Marcus from ovsdb wants to do similar tests with ovsdb.
>>
>> JamO
>>
>>> I will pursue my testing next week.
>>>
>>> Thanks,
>>> Alexis
>>>
>>>> On Feb 19, 2016, at 5:06 PM, Abhijit Kumbhare <[email protected]> wrote:
>>>>
>>>> Interesting. I wonder why that would be?
>>>>
>>>> On Fri, Feb 19, 2016 at 1:19 PM, Alexis de Talhouët <[email protected]> wrote:
>>>>
>>>>     OVS 2.3.x scales fine.
>>>>     OVS 2.4.x doesn't scale well.
>>>>
>>>>     Here is also the docker file for ovs 2.4.1.
>>>>
>>>>> On Feb 19, 2016, at 11:20 AM, Alexis de Talhouët <[email protected]> wrote:
>>>>>
>>>>>> can I use your containers? do you have any scripts/tools to bring
>>>>>> things up/down?
>>>>>
>>>>> Sure, attached is a tar file containing all the scripts / config / dockerfile
>>>>> I'm using to set up docker containers emulating OvS.
>>>>> FYI: it's ovs 2.3.0 and not 2.4.0 anymore.
>>>>>
>>>>> Also, forget about this whole mail thread; something in my private
>>>>> container must be breaking OVS behaviour, I don't know what yet.
>>>>>
>>>>> With the docker file attached here, I can scale to 90+ without any
>>>>> trouble...
>>>>>
>>>>> Thanks,
>>>>> Alexis
>>>>>
>>>>> <ovs_scalability_setup.tar.gz>
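(If you want to poke at this before grabbing the tarball, the setup boils down to a loop like the sketch below. This is my own untested approximation, not Alexis's actual scripts: "ovs-node" and "br0" are placeholder image/bridge names, and it assumes a privileged container with ovs-vsctl available inside.)

    #!/usr/bin/env python
    """Rough sketch: spawn N dockerized OvS instances and point them at ODL.

    Approximation of the attached tarball's workflow; 'ovs-node' is a
    placeholder image name and 'br0' a placeholder bridge name.
    """
    import subprocess

    CONTROLLER = "tcp:192.168.1.159:6633"  # ODL address, taken from the logs below
    NUM_SWITCHES = 50

    def sh(*cmd):
        subprocess.check_call(list(cmd))

    for i in range(NUM_SWITCHES):
        name = "ovs%d" % i
        # one OvS per container; --privileged so openvswitch can use its datapath
        sh("docker", "run", "-d", "--privileged", "--name", name, "ovs-node")
        sh("docker", "exec", name, "ovs-vsctl", "add-br", "br0")
        # give each bridge its own DPID so ODL sees N distinct switches
        sh("docker", "exec", name, "ovs-vsctl", "set", "bridge", "br0",
           "other-config:datapath-id=%016x" % (i + 1))
        sh("docker", "exec", name, "ovs-vsctl", "set-controller", "br0", CONTROLLER)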
>>>>>> On Feb 18, 2016, at 6:07 PM, Jamo Luhrsen <[email protected]> wrote:
>>>>>>
>>>>>> inline...
>>>>>>
>>>>>> On 02/18/2016 02:58 PM, Alexis de Talhouët wrote:
>>>>>>> I'm running OVS 2.4, against stable/lithium, openflowplugin-li.
>>>>>>
>>>>>> so this is one difference between CSIT and your setup, in addition to
>>>>>> the whole containers vs mininet.
>>>>>>
>>>>>>> I never scaled up to 1k; that was in the CSIT job.
>>>>>>> In a real scenario, I scaled to ~400. But that was even before
>>>>>>> clustering came into play in ofp lithium.
>>>>>>>
>>>>>>> I think the logs I sent have trace logging for openflowplugin and
>>>>>>> openflowjava; if that is not the case I can resubmit the logs.
>>>>>>> I removed some of them in openflowjava because it was way too chatty
>>>>>>> (logging the content of all messages between ovs <---> odl).
>>>>>>>
>>>>>>> Unfortunately those IOExceptions happen after the whole thing blows
>>>>>>> up. I was able to narrow down some logs in openflowjava to see the
>>>>>>> first disconnect event. As mentioned in a previous mail (in this
>>>>>>> mail thread), it's the device that is issuing the disconnect:
>>>>>>>
>>>>>>>> 2016-02-18 16:56:30,018 | DEBUG | entLoopGroup-6-3 | OFFrameDecoder | 201 - org.opendaylight.openflowjava.openflow-protocol-impl - 0.6.4.SNAPSHOT | skipping bytebuf - too few bytes for header: 0 < 8
>>>>>>>> 2016-02-18 16:56:30,018 | DEBUG | entLoopGroup-6-3 | OFVersionDetector | 201 - org.opendaylight.openflowjava.openflow-protocol-impl - 0.6.4.SNAPSHOT | not enough data
>>>>>>>> 2016-02-18 16:56:30,018 | DEBUG | entLoopGroup-6-3 | DelegatingInboundHandler | 201 - org.opendaylight.openflowjava.openflow-protocol-impl - 0.6.4.SNAPSHOT | Channel inactive
>>>>>>>> 2016-02-18 16:56:30,018 | DEBUG | entLoopGroup-6-3 | ConnectionAdapterImpl | 201 - org.opendaylight.openflowjava.openflow-protocol-impl - 0.6.4.SNAPSHOT | ConsumeIntern msg on [id: 0x1efab5fb, /172.18.0.49:36983 :> /192.168.1.159:6633]
>>>>>>>> 2016-02-18 16:56:30,018 | DEBUG | entLoopGroup-6-3 | ConnectionAdapterImpl | 201 - org.opendaylight.openflowjava.openflow-protocol-impl - 0.6.4.SNAPSHOT | ConsumeIntern msg - DisconnectEvent
>>>>>>>> 2016-02-18 16:56:30,018 | DEBUG | entLoopGroup-6-3 | ConnectionContextImpl | 205 - org.opendaylight.openflowplugin.impl - 0.1.4.SNAPSHOT | disconnecting: node=/172.18.0.49:36983|auxId=0|connection state = RIP
>>>>>>>
>>>>>>> Those logs come from another run, so they are not in the logs I sent
>>>>>>> earlier, although the behaviour is always the same.
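(Side note: to find that first disconnect quickly in a multi-MB karaf.log, a throwaway script like this is enough. It prints the first matching line per switch address; the patterns are lifted straight from the log excerpts above.)

    #!/usr/bin/env python
    """Print the first disconnect-ish log line per switch from a karaf.log.

    Patterns come from the excerpts in this thread: openflowjava's
    'DisconnectEvent' and openflowplugin's 'disconnecting:' lines.
    """
    import re
    import sys

    PATTERNS = ("ConsumeIntern msg - DisconnectEvent", "disconnecting: node=")
    ADDR = re.compile(r"/(\d+\.\d+\.\d+\.\d+:\d+)")

    seen = set()
    with open(sys.argv[1], errors="replace") as log:
        for line in log:
            if not any(p in line for p in PATTERNS):
                continue
            m = ADDR.search(line)
            key = m.group(1) if m else "<no-addr>"
            if key not in seen:  # only the first event per switch
                seen.add(key)
                print(line.rstrip())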
>>>>>>> Regarding the memory, I don't want to add more than 2G of memory
>>>>>>> because, and I tested it, the more memory I add, the more I can
>>>>>>> scale. But as you pointed out, this issue is not an OOM error. So I
>>>>>>> would rather fail at 2G (fewer docker containers to spawn each run,
>>>>>>> ~50).
>>>>>>
>>>>>> so, maybe reduce your memory then to simplify the reproducing steps.
>>>>>> Since you know that increasing memory allows you to scale further,
>>>>>> but you still hit the problem, let's make it easier to hit. how far
>>>>>> can you go with the max mem set to 500M, if you are only loading
>>>>>> ofp-li?
>>>>>>
>>>>>>> I definitely need some help here, because I can't sort myself out in
>>>>>>> the openflowplugin + openflowjava codebase...
>>>>>>> But I believe I already have Michal's attention :)
>>>>>>
>>>>>> can I use your containers? do you have any scripts/tools to bring
>>>>>> things up/down?
>>>>>> I might be able to try and reproduce myself. I like breaking things :)
>>>>>>
>>>>>> JamO
>>>>>>
>>>>>>> Thanks,
>>>>>>> Alexis
>>>>>>>
>>>>>>>> On Feb 18, 2016, at 5:44 PM, Jamo Luhrsen <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Alexis, don't worry about filing a bug just to give us a common
>>>>>>>> place to work/comment, even if we close it later because of
>>>>>>>> something outside of ODL. Email is fine too.
>>>>>>>>
>>>>>>>> what ovs version do you have in your containers? this test sounds
>>>>>>>> great.
>>>>>>>>
>>>>>>>> Luis is right that if you were scaling well past 1k in the past,
>>>>>>>> but now it falls over at 50, it sounds like a bug.
>>>>>>>>
>>>>>>>> Oh, you can try increasing the jvm max_mem from the default of 2G
>>>>>>>> just as a data point. The fact that you don't get OOMs makes me
>>>>>>>> think memory might not be the final bottleneck.
>>>>>>>>
>>>>>>>> you could enable debug/trace logs in the right modules (need ofp
>>>>>>>> devs to tell us that) for a little more info.
>>>>>>>>
>>>>>>>> I've seen those IOExceptions before and always assumed it was from
>>>>>>>> an OF switch doing a hard RST on its connection.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> JamO
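(For the debug/trace suggestion above: the two loggers that matter in this thread are org.opendaylight.openflowplugin and org.opendaylight.openflowjava. You can flip them live from the karaf console with log:set, or persist them in etc/org.ops4j.pax.logging.cfg. A throwaway sketch of the latter; the KARAF_HOME path is a placeholder for your own distribution directory.)

    #!/usr/bin/env python
    """Persist DEBUG/TRACE loggers for ofp/openflowjava in a karaf distro.

    Appends log4j logger lines to etc/org.ops4j.pax.logging.cfg; the
    distribution path is a placeholder, adjust to your setup.
    """
    import os

    KARAF_HOME = os.path.expanduser("~/distribution-karaf")  # placeholder path
    LOGGERS = {
        "org.opendaylight.openflowplugin": "DEBUG",
        "org.opendaylight.openflowjava": "TRACE",
    }

    cfg = os.path.join(KARAF_HOME, "etc", "org.ops4j.pax.logging.cfg")
    with open(cfg, "a") as f:
        f.write("\n# scalability debugging (this thread)\n")
        for logger, level in LOGGERS.items():
            f.write("log4j.logger.%s=%s\n" % (logger, level))
    print("appended %d loggers to %s" % (len(LOGGERS), cfg))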
>>>>>>>> On 02/18/2016 11:48 AM, Luis Gomez wrote:
>>>>>>>>> If the same test worked 6-8 months ago this seems like a bug, but
>>>>>>>>> please feel free to open it whenever you are sure.
>>>>>>>>>
>>>>>>>>>> On Feb 18, 2016, at 11:45 AM, Alexis de Talhouët <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> Hello Luis,
>>>>>>>>>>
>>>>>>>>>> For sure I'm willing to open a bug, but first I want to make sure
>>>>>>>>>> there is a bug and that I'm not doing something wrong.
>>>>>>>>>> In ODL's infra, there is a test to find the maximum number of
>>>>>>>>>> switches that can be connected to ODL, and this test reaches
>>>>>>>>>> ~500 [0].
>>>>>>>>>> I was able to scale up to 1090 switches [1] using the CSIT job in
>>>>>>>>>> the sandbox.
>>>>>>>>>> I believe the CSIT test is different in that the switches are
>>>>>>>>>> emulated in one mininet VM, whereas I'm connecting OVS instances
>>>>>>>>>> from separate containers.
>>>>>>>>>>
>>>>>>>>>> 6-8 months ago, I was able to perform the same test and scale
>>>>>>>>>> with OVS docker containers up to ~400 before ODL started
>>>>>>>>>> crashing (with some optimization done behind the scenes, i.e.
>>>>>>>>>> ulimit, mem, cpu, GC...).
>>>>>>>>>> Now I'm not able to scale past 100 with the same configuration.
>>>>>>>>>>
>>>>>>>>>> FYI: I just took a quick look at the CSIT test [0] karaf.log; it
>>>>>>>>>> seems the test is actually failing but it is not correctly
>>>>>>>>>> advertised... switch connections are dropped.
>>>>>>>>>> Look for these:
>>>>>>>>>>
>>>>>>>>>> 2016-02-18 07:07:51,741 | WARN | entLoopGroup-6-6 | OFFrameDecoder | 181 - org.opendaylight.openflowjava.openflow-protocol-impl - 0.6.4.SNAPSHOT | Unexpected exception from downstream.
>>>>>>>>>> java.io.IOException: Connection reset by peer
>>>>>>>>>>   at sun.nio.ch.FileDispatcherImpl.read0(Native Method)[:1.7.0_85]
>>>>>>>>>>   at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)[:1.7.0_85]
>>>>>>>>>>   at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)[:1.7.0_85]
>>>>>>>>>>   at sun.nio.ch.IOUtil.read(IOUtil.java:192)[:1.7.0_85]
>>>>>>>>>>   at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:384)[:1.7.0_85]
>>>>>>>>>>   at io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:311)[111:io.netty.buffer:4.0.26.Final]
>>>>>>>>>>   at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:881)[111:io.netty.buffer:4.0.26.Final]
>>>>>>>>>>   at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:241)[109:io.netty.transport:4.0.26.Final]
>>>>>>>>>>   at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:119)[109:io.netty.transport:4.0.26.Final]
>>>>>>>>>>   at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)[109:io.netty.transport:4.0.26.Final]
>>>>>>>>>>   at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)[109:io.netty.transport:4.0.26.Final]
>>>>>>>>>>   at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)[109:io.netty.transport:4.0.26.Final]
>>>>>>>>>>   at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:349)[109:io.netty.transport:4.0.26.Final]
>>>>>>>>>>   at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)[110:io.netty.common:4.0.26.Final]
>>>>>>>>>>   at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)[110:io.netty.common:4.0.26.Final]
>>>>>>>>>>   at java.lang.Thread.run(Thread.java:745)[:1.7.0_85]
>>>>>>>>>>
>>>>>>>>>> [0]: https://jenkins.opendaylight.org/releng/view/openflowplugin/job/openflowplugin-csit-1node-periodic-scalability-daily-only-stable-lithium/
>>>>>>>>>> [1]: https://git.opendaylight.org/gerrit/#/c/33213/
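(That "failing but not advertised" point is worth automating: rather than trusting the suite's verdict, count what actually made it into the operational inventory. A quick sketch against the stock RESTCONF endpoint; the host/port and the default admin/admin credentials are assumptions, adjust to your deployment.)

    #!/usr/bin/env python
    """Count switches actually present in ODL's operational inventory.

    Polls the stock RESTCONF inventory endpoint; host/port and the default
    admin/admin credentials are assumptions.
    """
    import base64
    import json
    import urllib.request

    URL = "http://127.0.0.1:8181/restconf/operational/opendaylight-inventory:nodes"
    AUTH = base64.b64encode(b"admin:admin").decode()

    req = urllib.request.Request(URL, headers={"Authorization": "Basic " + AUTH,
                                               "Accept": "application/json"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        nodes = json.load(resp).get("nodes", {}).get("node", [])

    # OVS/mininet datapaths show up with ids of the form openflow:<dpid>
    openflow_nodes = [n["id"] for n in nodes if n["id"].startswith("openflow:")]
    print("connected openflow nodes: %d" % len(openflow_nodes))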
>>>>>>>>>>> On Feb 18, 2016, at 2:28 PM, Luis Gomez <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Alexis, thanks very much for sharing this test. Would you mind
>>>>>>>>>>> opening a bug with all this info so we can track this?
>>>>>>>>>>>
>>>>>>>>>>>> On Feb 18, 2016, at 7:29 AM, Alexis de Talhouët <[email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Michal,
>>>>>>>>>>>>
>>>>>>>>>>>> ODL memory is capped at 2GB; the more memory I add, the more
>>>>>>>>>>>> OVS instances I can connect. Regarding CPU, it's around 10-20%
>>>>>>>>>>>> when connecting new OVS instances, with some peaks to 80%.
>>>>>>>>>>>>
>>>>>>>>>>>> After some investigation, here is what I observed:
>>>>>>>>>>>> Let's say I have 50 switches connected, stats manager disabled.
>>>>>>>>>>>> I have one open socket per switch, plus an additional one for
>>>>>>>>>>>> the controller.
>>>>>>>>>>>> Then I connect a new switch (2016-02-18 09:35:08,059), 51
>>>>>>>>>>>> switches... something happens that causes all connections to be
>>>>>>>>>>>> dropped (by the device?), and then ODL tries to recreate them
>>>>>>>>>>>> and goes into a crazy loop where it is never able to
>>>>>>>>>>>> re-establish communication, but keeps creating new sockets.
>>>>>>>>>>>> I suspect something is being garbage collected due to lack of
>>>>>>>>>>>> memory, although there are no OOM errors.
>>>>>>>>>>>>
>>>>>>>>>>>> Attached are the YourKit Java Profiler analysis for the
>>>>>>>>>>>> described scenario and the logs [1].
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Alexis
>>>>>>>>>>>>
>>>>>>>>>>>> [1]: https://www.dropbox.com/sh/dgqeqv4j76zwbh3/AACim0za1fUozc7DlYJ4fsMJa?dl=0
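(The one-socket-per-switch observation is easy to watch live from the controller host. A crude way, assuming Linux and the default OpenFlow port 6633, is to count established sockets straight out of /proc/net/tcp.)

    #!/usr/bin/env python
    """Count established TCP connections on the OpenFlow port (Linux only).

    One quick way to watch the one-socket-per-switch behaviour described
    above; 6633 is the default OpenFlow port used in this thread.
    """
    OF_PORT = 6633
    ESTABLISHED = "01"  # TCP_ESTABLISHED state code in /proc/net/tcp

    def connected_switches():
        count = 0
        with open("/proc/net/tcp") as f:
            next(f)  # skip the header line
            for line in f:
                fields = line.split()
                local_addr, state = fields[1], fields[3]
                port = int(local_addr.split(":")[1], 16)  # port is hex-encoded
                if port == OF_PORT and state == ESTABLISHED:
                    count += 1
        return count

    if __name__ == "__main__":
        print("established OpenFlow connections: %d" % connected_switches())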
You >>>>>>>>>>>>>> should have as many cores as possible in order to get max >>>>>>>>>>>>>> performance. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>> Michal >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> ________________________________________ >>>>>>>>>>>>>> From: [email protected] >>>>>>>>>>>>>> <mailto:[email protected]> >>>>>>>>>>>>>> <mailto:[email protected]> >>>>>>>>>>>>>> <mailto:[email protected]> >>>>>>>>>>>>>> <[email protected] >>>>>>>>>>>>>> <mailto:[email protected]> >>>>>>>>>>>>>> <mailto:[email protected]> >>>>>>>>>>>>>> <mailto:[email protected]>> >>>>>>>>>>>>>> on >>>>>>>>>>>>>> behalf of Alexis de Talhouët <[email protected] >>>>>>>>>>>>>> <mailto:[email protected]> >>>>>>>>>>>>>> <mailto:[email protected]> >>>>>>>>>>>>>> <mailto:[email protected]>> >>>>>>>>>>>>>> Sent: Tuesday, February 9, 2016 00:45 >>>>>>>>>>>>>> To: [email protected] >>>>>>>>>>>>>> <mailto:[email protected]> >>>>>>>>>>>>>> <mailto:[email protected]> >>>>>>>>>>>>>> <mailto:[email protected]> >>>>>>>>>>>>>> Subject: [openflowplugin-dev] Scalability issues >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hello openflowplugin-dev, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I’m currently running some scalability test against >>>>>>>>>>>>>> openflowplugin-li plugin, stable/lithium. >>>>>>>>>>>>>> Playing with CSIT job, I was able to connect up to 1090 >>>>>>>>>>>>>> switches: https://git.opendaylight.org/gerrit/#/c/33213/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> I’m now running the test against 40 OvS switches, each one of >>>>>>>>>>>>>> them is in a docker container. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Connecting around 30 of them works fine, but then, adding a >>>>>>>>>>>>>> new one break completely ODL, it goes crazy and >>>>>>>>>>>>>> unresponsible. >>>>>>>>>>>>>> Attach a snippet of the karaf.log with log set to DEBUG for >>>>>>>>>>>>>> org.opendaylight.openflowplugin, thus it’s a >>>>>>>>>>>>>> really >>>>>>>>>>>>>> big log (~2.5MB). >>>>>>>>>>>>>> >>>>>>>>>>>>>> Here it what I observed based on the log: >>>>>>>>>>>>>> I have 30 switches connected, all works fine. Then I add a >>>>>>>>>>>>>> new one: >>>>>>>>>>>>>> - SalRoleServiceImpl starts doing its thing (2016-02-08 >>>>>>>>>>>>>> 23:13:38,534) >>>>>>>>>>>>>> - RpcManagerImpl Registering Openflow RPCs (2016-02-08 >>>>>>>>>>>>>> 23:13:38,546) >>>>>>>>>>>>>> - ConnectionAdapterImpl Hello received (2016-02-08 >>>>>>>>>>>>>> 23:13:40,520) >>>>>>>>>>>>>> - Creation of the transaction chain, … >>>>>>>>>>>>>> >>>>>>>>>>>>>> Then all starts failing apart with this log: >>>>>>>>>>>>>>> 2016-02-08 23:13:50,021 | DEBUG | ntLoopGroup-11-9 | >>>>>>>>>>>>>>> ConnectionContextImpl | 190 - >>>>>>>>>>>>>>> org.opendaylight.openflowplugin.impl - 0.1.4.SNAPSHOT | >>>>>>>>>>>>>>> disconnecting: >>>>>>>>>>>>>>> node=/172.31.100.9:46736|auxId=0|connection state = RIP >>>>>>>>>>>>>> End then ConnectionContextImpl disconnects one by one the >>>>>>>>>>>>>> switches, RpcManagerImpl is unregistered >>>>>>>>>>>>>> Then it goes crazy for a while. >>>>>>>>>>>>>> But all I’ve done is adding a new switch.. 
>>>>>>>>>>>>>> ________________________________________
>>>>>>>>>>>>>> From: [email protected] on behalf of Alexis de Talhouët <[email protected]>
>>>>>>>>>>>>>> Sent: Tuesday, February 9, 2016 00:45
>>>>>>>>>>>>>> To: [email protected]
>>>>>>>>>>>>>> Subject: [openflowplugin-dev] Scalability issues
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hello openflowplugin-dev,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm currently running some scalability tests against the
>>>>>>>>>>>>>> openflowplugin-li plugin, stable/lithium.
>>>>>>>>>>>>>> Playing with the CSIT job, I was able to connect up to 1090
>>>>>>>>>>>>>> switches: https://git.opendaylight.org/gerrit/#/c/33213/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm now running the test against 40 OvS switches, each one of
>>>>>>>>>>>>>> them in a docker container.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Connecting around 30 of them works fine, but then adding a new
>>>>>>>>>>>>>> one completely breaks ODL; it goes crazy and becomes
>>>>>>>>>>>>>> unresponsive.
>>>>>>>>>>>>>> Attached is a snippet of the karaf.log with logging set to
>>>>>>>>>>>>>> DEBUG for org.opendaylight.openflowplugin, so it's a really
>>>>>>>>>>>>>> big log (~2.5MB).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Here is what I observed based on the log:
>>>>>>>>>>>>>> I have 30 switches connected, all works fine. Then I add a new
>>>>>>>>>>>>>> one:
>>>>>>>>>>>>>> - SalRoleServiceImpl starts doing its thing (2016-02-08 23:13:38,534)
>>>>>>>>>>>>>> - RpcManagerImpl registers the Openflow RPCs (2016-02-08 23:13:38,546)
>>>>>>>>>>>>>> - ConnectionAdapterImpl receives the Hello (2016-02-08 23:13:40,520)
>>>>>>>>>>>>>> - Creation of the transaction chain, ...
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Then it all starts falling apart with this log:
>>>>>>>>>>>>>>> 2016-02-08 23:13:50,021 | DEBUG | ntLoopGroup-11-9 | ConnectionContextImpl | 190 - org.opendaylight.openflowplugin.impl - 0.1.4.SNAPSHOT | disconnecting: node=/172.31.100.9:46736|auxId=0|connection state = RIP
>>>>>>>>>>>>>> And then ConnectionContextImpl disconnects the switches one by
>>>>>>>>>>>>>> one and RpcManagerImpl is unregistered. Then it goes crazy for
>>>>>>>>>>>>>> a while.
>>>>>>>>>>>>>> But all I've done is add a new switch..
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Finally, at 2016-02-08 23:14:26,666, exceptions are thrown:
>>>>>>>>>>>>>>> 2016-02-08 23:14:26,666 | ERROR | lt-dispatcher-85 | LocalThreePhaseCommitCohort | 172 - org.opendaylight.controller.sal-distributed-datastore - 1.2.4.SNAPSHOT | Failed to prepare transaction member-1-chn-5-txn-180 on backend
>>>>>>>>>>>>>>> akka.pattern.AskTimeoutException: Ask timed out on [ActorSelection[Anchor(akka://opendaylight-cluster-data/), Path(/user/shardmanager-operational/member-1-shard-inventory-operational#-1518836725)]] after [30000 ms]
>>>>>>>>>>>>>> And this goes on for a while.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Do you have any input on this?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Could you give some advice on how to scale? (I know disabling
>>>>>>>>>>>>>> the StatisticsManager can help, for instance.)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Am I doing something wrong?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I can provide any requested information regarding the issue
>>>>>>>>>>>>>> I'm facing.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Alexis
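(On the StatisticsManager aside: Michal's "disable it via rpc" maps, if I remember the Li-design yang correctly, to a change-statistics-work-mode RPC. Sketch below; the module/RPC names and the FULLYDISABLED mode are from memory, so verify them against the openflowplugin yang models before relying on this.)

    #!/usr/bin/env python
    """Turn off Li-design statistics polling via RESTCONF.

    The RPC path (statistics-manager-control:change-statistics-work-mode)
    and the FULLYDISABLED mode are recalled from memory -- verify against
    the openflowplugin yang before trusting this sketch.
    """
    import base64
    import json
    import urllib.request

    URL = ("http://127.0.0.1:8181/restconf/operations/"
           "statistics-manager-control:change-statistics-work-mode")
    AUTH = base64.b64encode(b"admin:admin").decode()
    BODY = json.dumps({"input": {"mode": "FULLYDISABLED"}}).encode()

    req = urllib.request.Request(URL, data=BODY, headers={
        "Authorization": "Basic " + AUTH,
        "Content-Type": "application/json",
    })
    with urllib.request.urlopen(req, timeout=10) as resp:
        print("RPC HTTP status:", resp.status)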
_______________________________________________
openflowplugin-dev mailing list
[email protected]
https://lists.opendaylight.org/mailman/listinfo/openflowplugin-dev