I believe it is partly because of the controller channel and partly
because POX is busy processing events. Anyway, with the shuffle removed, a
link should really expire only if no probe arrives for several seconds. I
do not see a reason for a probe to be delayed unless
a) it is lost,
b) the switch is really overloaded, or
c) there are >200 links in the network and the POX timer is unable to
generate enough timer events.
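
To illustrate why the shuffle matters, here is a minimal sketch (not POX's
actual discovery code). It assumes one probe per link per cycle, sent in
evenly spaced timer slots. The function name, the 40 links and the 5-second
cycle are made up for the example.

```python
import random

def max_probe_gap(num_links, cycle_time, cycles, shuffle, seed=0):
    """Largest gap between two consecutive probes on any single link."""
    rng = random.Random(seed)          # fixed seed keeps the run repeatable
    order = list(range(num_links))     # hypothetical link ids
    slot = cycle_time / num_links      # assumption: one probe per timer slot
    last_sent = {}
    worst = 0.0
    for c in range(cycles):
        if shuffle:
            rng.shuffle(order)         # reorder the probes every cycle
        for i, link in enumerate(order):
            now = c * cycle_time + i * slot
            if link in last_sent:
                worst = max(worst, now - last_sent[link])
            last_sent[link] = now
    return worst

cycle = 5.0                            # assumed cycle time in seconds
links = 40                             # assumed number of links
fixed = max_probe_gap(links, cycle, 200, shuffle=False)
shuffled = max_probe_gap(links, cycle, 200, shuffle=True)
print("fixed order, worst gap:   ", fixed)
print("shuffled order, worst gap:", shuffled)
print("expiration (2 * cycle):   ", 2 * cycle)
```

With a fixed order every link is probed exactly once per cycle time. With
shuffling, a link sent first in one cycle and last in the next waits almost
two cycle times, so even a small timer skew can push it past an expiration
of 2 * cycle time.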

Peter


On Fri, Apr 12, 2013 at 7:36 PM, Weiyang Mo <[email protected]> wrote:

> Hi Peter, thank you very much.
>
> I tried it on a Mininet tree network (120 switches) and it solved the
> timeouts in that case, but I'm still not sure whether it will solve them
> on the real network. My real network is smaller (4 switches), and if these
> switches have a larger timer skew, I think your solution will address it
> well. I cannot tell you now since it is the weekend.
>
> Murphy: If it is the shuffle issue, I still can't figure out why the
> time_out is more frequent when the 3 cases (in my previous mail) happen.
> In all 3 cases it looks like the controller channel is partly taken up by
> other packet headers. Is it still related to timer skew?
>
> Best
>
> Weiyang
>
>
> 2013/4/12 Peter Peresini <[email protected]>
>
>> Hi,
>>  I believe this might be related to a small bug in the discovery module
>> that I encountered some time ago (and, my bad, did not report to Murphy).
>> Can you try removing the "random.shuffle" of probes in the discovery
>> module?
>>
>> To Murphy: Shuffling the probes after each LLDP cycle might cause
>> problems if the number of links is big enough (I saw this on a fat-tree
>> topology with 20 switches). There is a slight timer skew (AFAIK the timer
>> can handle roughly 200 invocations/sec), and after shuffling, the probes
>> sent earliest in one cycle might wait almost another full cycle time,
>> which effectively puts them on the edge of expiring (expiration time is
>> 2 * cycle time).
>>
>> Peter
>>
>>
>>
>> On Fri, Apr 12, 2013 at 5:04 PM, Weiyang Mo <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I always use openflow.discovery as my topology module. However, I have
>>> run into some strange behavior recently, which raises some questions.
>>>
>>> The strange behavior is that link time_outs appear unexpectedly, which
>>> causes flow entries to be deleted (I'm using l2_multi). But the link is
>>> actually fine.
>>>
>>> The unexpected link time_out may happen in the following cases, and more
>>> frequently if several of them occur at the same time:
>>>
>>> (1)  If I keep requesting info from the switches (e.g. portstatus
>>> requests). I'm wondering why the requests cause this. Is it because the
>>> requests flush the LLDP packets?
>>>
>>> (2)  If new traffic is introduced. Is it because the traffic sent to the
>>> controller blocks LLDP during learning, before the flows are installed?
>>>
>>> (3)  If I modify flow entries. I guess the modification takes some time,
>>> and during this time many data packets are forwarded to the controller
>>> and occupy the control channel.
>>>
>>> Unfortunately, my program does all of the above for some intelligent
>>> routing, and an unexpected link time_out resets everything... I still
>>> need the time_out, because sometimes there really is a link
>>> disconnection. I'm requesting port_status every 2 seconds and using the
>>> feedback for intelligent routing.
>>>
>>> Is it because the LLDPs are blocked by other packets in the control
>>> channel, so the links cannot be updated? Is it possible to give LLDP the
>>> highest priority in the control channel? If the data channel is almost
>>> fully occupied, will the LLDPs be blocked in that channel and the link
>>> be treated as timed out?
>>>
>>> Another question: why not use only port_status as the link event rather
>>> than link updates? The main concern I can think of is that some cables
>>> may be really bad while the switches still treat them as connected?
>>>
>>> Thanks very much.
>>>
>>> Weiyang
>>>
>>
>>
>
