Martin,

The issue is that when zebra starts up it fails to recognize routes in the
kernel that it previously put there.  To put it another way, it fails to
recognize that those kernel routes were put there by itself (with flag
"1"), so it assumes they are kernel routes and adds them to the RIB.  From
then on, those routes will always take priority, and never go away, so in
effect OSPF is broken.
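
A minimal sketch of the startup check described above, for illustration
only.  This is not zebra's actual code: the struct, the function names and
the SKETCH_RTF_PROTO1 value are hypothetical stand-ins (the real flag is
RTF_PROTO1 from the BSD route headers, shown as "1" by netstat -r).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the BSD RTF_PROTO1 route flag; the numeric value is
 * chosen for illustration only. */
#define SKETCH_RTF_PROTO1 0x8000u

/* Hypothetical, simplified view of one route read back from the kernel. */
struct sketch_kroute {
    const char *prefix;   /* e.g. "10.1.0.0/24" */
    unsigned int flags;   /* kernel route flags */
};

/* True when the route carries the marker zebra sets on routes it installs. */
bool sketch_is_self_route(const struct sketch_kroute *r)
{
    return (r->flags & SKETCH_RTF_PROTO1) != 0;
}

/* Count the routes that should be imported as genuine kernel ("K") routes,
 * skipping leftovers that a previous zebra instance installed. */
size_t sketch_count_importable(const struct sketch_kroute *routes, size_t n)
{
    size_t count = 0;

    assert(routes != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!sketch_is_self_route(&routes[i]))
            count++;
    }
    return count;
}
```

With a table holding one ordinary route and two routes flagged "1", only
the ordinary route would be counted as a kernel route.  The behaviour seen
since 1.0 corresponds to counting all three.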

I have pointed out where (I think) the error comes from, both in this
thread and in a dev thread, but haven't received any response.  Whether
zebra is killed brutally or gracefully, it should recognize the routes it
inserted into the kernel the next time it starts up, which it has not done
since version 1.0.  Is this not a problem on other operating systems, or
is it just that in most cases zebra is never restarted unless the entire
operating system is restarted?

If you could suggest the proper way to restart zebra that would be
appreciated.  However, that would still not be ideal because the routes
would be removed for a short period of time until they are re-inserted
(when zebra starts back up), which is not the intention.  Of course it
would be ideal if pfSense could make config changes using the zebra VTY,
but it currently does not have that capability.  Currently pfSense writes
changes to the Quagga config files (when a change is made through the web
interface) and then restarts both daemons.

Please let us know what other info we can provide, and how we can help to
get this issue resolved.  Compensation for fixing this "bug" is a
possibility.

Thank you,
Nate Baker

On Tue, Nov 15, 2016 at 6:06 PM, Martin Winter <
mwin...@opensourcerouting.org> wrote:

> Reqlez,
>
> On 12 Nov 2016, at 22:13, Reqlez Guy wrote:
>
>> Wait ... is this bug related to this ???
>> https://lists.quagga.net/pipermail/quagga-dev/2016-February/014777.html
>> Sounds very familiar ( since this looks like similar thing people are
>> experiencing...  ) and Pfsense is on FreeBSD 10 ...
>>
>
> No, this is a different issue.
> (I hope to have some closure on that issue from February soon as well, but
> that issue is actually some updates from Zebra to FreeBSD getting ignored,
> and it looks like some data corruption.)
>
> The issue here (and what I still fail to understand) is the reason for
> pfsense to kill Quagga every time something changes.
> And it does this in the most brutal way, with a “kill -9”, which gives the
> quagga processes no chance to clean up.
>
> Quagga is designed for dynamic routing, so an interface going down etc. is
> exactly what it is designed to work with: it recalculates the routing
> tables based on this, and much faster than fully rebuilding all tables
> because of a restart.
> I assume there must be some reason for the kill - maybe another bug? - and
> I would love to get an answer for this.
>
> - Martin
>
> ________________________________
>> From: Reqlez Guy
>> Sent: 30 October 2016 01:08
>> To: quagga-users@lists.quagga.net
>> Cc: Martin Winter
>> Subject: Re: [quagga-users 14440] Issues with Routes in FreeBSD / PfSense
>> New to Release 1.0
>>
>> So here are the responses from Pfsense and another user in the forum:
>>
>> ----------
>>
>> Spydre13:
>>
>> I looked at the changelog too, and didn't see anything that would fix
>> this.  The main problem is that when Quagga restarts, it doesn't recognize
>> the routes that it previously put in there, so it pulls them in as "kernel"
>> routes and they will always take precedence.  That's why it works fine
>> until Quagga is restarted (which is basically kill & start, there is no
>> graceful restart in Quagga).  Since the rib_sweep_table() function isn't
>> used anymore, when it starts up it doesn't remove routes from the list of
>> kernel routes that it previously put there (which it flags as RTF_PROTO1,
>> or "1" in netstat -r).  I don't see how they aren't having more issues with
>> this, unless the common scenario is that Quagga never gets restarted unless
>> the whole OS is restarted.
>>
>> I don't see why kill -9 matters here, because it worked fine before v1.0,
>> and there is no graceful restart capability in Quagga.  Ideally pfSense
>> could use the Quagga VTY to make changes live without restarting, and then
>> write changes to the config files for the next time it starts up, but I
>> doubt anyone wants to take on a project like that.
>>
>> If you want more details let me know, but it would probably make more
>> sense to discuss on the Quagga list instead of here.
>> ----------
>>
>> ----------
>>
>> Jimp from pfsense team:
>>
>> That sounds like the issue. Preventing it from restarting is a hackish
>> workaround no matter what signal is used. It will get restarted at some
>> point and failing to recover gracefully is a regression in quagga's
>> behavior in 1.x.
>>
>> It needs to recognize the flags it sets on routes in the table, and it
>> isn't. Hopefully someone at Quagga can pick up and run with that on their
>> list.
>> ----------
>>
>> Can anybody comment ? Since this bug has been in the code for 8 months
>> now ... or more ...
>>
>> ________________________________
>> From: Reqlez Guy
>> Sent: 18 October 2016 19:46
>> To: quagga-users@lists.quagga.net
>> Cc: Martin Winter
>> Subject: Re: [quagga-users 14440] Issues with Routes in FreeBSD / PfSense
>> New to Release 1.0
>>
>> Sorry, it seems I'm a lists noob and didn't realize this stuff was not
>> going to the list... I'm assuming that I should be emailing the list only,
>> and not Martin or anybody else? I CCed Martin just in case.
>>
>> So as per Martin, he thinks what is triggering the issue is the use of -9
>> to terminate the quagga process in the pfsense rc scripts. I did submit the
>> debug logs to Martin... not sure if you need more. And no, I have not
>> tested the routers yet with -9 removed from the pfsense scripts.
>>
>> Martin: Please see below response from a person in pfsense forum:
>>
>> I see Martin's reply to you on Oct. 10, but I don't see anything after
>> that.  Are you emailing him off-list?
>>
>> I was looking through the Quagga code last night, and found something
>> that I suspect could be the problem.  Quagga (zebra daemon) puts routes
>> into the kernel with flag "1" (RTF_PROTO1, see netstat man page).  When
>> zebra starts up it's supposed to ignore (filter out) any kernel routes
>> with flag "1" because it should assume it put those there to begin with.
>> I think before Quagga version 1 this was working, and in version >= 1 it
>> pulls those kernel routes into the zebra RIB.
>>
>> If I reboot a firewall and go to OSPF -> Status -> Zebra routes, I see a
>> bunch of OSPF routes but barely any K (kernel) routes.  If I make any
>> change on the Global Settings or Interface Settings tab quagga restarts,
>> and then when looking at the zebra routes it is filled with kernel routes
>> (one for each OSPF route).
>>
>> Can you ask Martin to look at this:
>> Commit: https://github.com/Quagga/quagga/commit/0d0686f98e64017415071e590bde262f0ab5a4c9
>>
>> File: zebra/zebra_rib.c
>> Function: rib_sweep_table
>>
>> This function is commented out starting in version 1, but it was used in
>> version 0.99.24.  There is a block of code in it:
>>
>> if (rib->type == ZEBRA_ROUTE_KERNEL &&
>>   CHECK_FLAG (rib->flags, ZEBRA_FLAG_SELFROUTE))
>> {
>>     ret = rib_uninstall_kernel (rn, rib);
>>     if (! ret)
>>         rib_delnode (rn, rib);
>> }
>>
>> The rib_weed_tables function that is still being used doesn't seem to do
>> this same thing, from what I can tell.  This URL shows them side-by-side:
>> https://fossies.org/diffs/quagga/0.99.24.1_vs_1.0.20160315/zebra/zebra_rib.c-diff.html
>>
>> If you can point me to the thread where you are discussing this with
>> Martin, I can pass this along to him if you prefer.
>>
>> ________________________________
>> From: Martin Winter <mwin...@opensourcerouting.org>
>> Sent: 10 October 2016 04:42
>> To: Reqlez Guy
>> Cc: quagga-users@lists.quagga.net
>> Subject: Re: [quagga-users 14440] Issues with Routes in FreeBSD / PfSense
>> New to Release 1.0
>>
>> Just seeing this now...
>>
>> On 3 Oct 2016, at 18:13, Reqlez Guy wrote:
>>
>> Hello.
>>>
>>> Can anybody review this and see what they think ?
>>> https://forum.pfsense.org/index.php?topic=111108.0
>>>
>>
>>> When triggering failover ... the failover link does not work with
>>> version 1  .... reverting back to .99 no problems. Pfsense Team seems
>>> to think it's something regarding Zebra restart... Several users have
>>> confirmed this issue. See thread for further info.
>>>
>>
>> What is the "normal" version of Quagga in PfSense? (i.e. output of
>> "zebra -v")
>> Is this 1.0.20160315 ?
>>
>> Can you post the output of a "zebra -v" from it (it should give some
>> compile options as well)
>> And what is the base OS? FreeBSD 10?
>>
>> We are just about to get a new version out. If you are able to compile
>> your own version, then it might be worthwhile to download and build from
>> the latest git master.
>>
>> - Martin Winter
>>
>
>
>
> _______________________________________________
> Quagga-users mailing list
> Quagga-users@lists.quagga.net
> https://lists.quagga.net/mailman/listinfo/quagga-users
>
