On Fri, Apr 11, 2014 at 4:35 PM, Raimund Sacherer <[email protected]> wrote:

> Hello List,
>
> I checked a bit further and at first I deleted all the rrd graphs, because
> I had GW graphs from my old machine (which were i386) and the new one is
> amd64. I do not know if there is a problem with apinger if you have i386
> rrd files on an amd64 architecture. It could be that apinger has problems
> when it's not possible to write the RRD data correctly?
>
>
> After this change I still had the problem that the widget displays online
> for a couple of seconds, then Pending, then online, then Pending ... so I
> dug a bit deeper and, once I found out how the data gets updated, I saw
> that sometimes I get gateway status data and sometimes not.
>
> I checked the /var/run/apinger.status file and it was always being updated.
>
> Out of a hunch I commented out this bit:
>
>         /* Always get the latest status from apinger */
> /*
>         if (file_exists("{$g['varrun_path']}/apinger.pid"))
>                 sigkillbypid("{$g['varrun_path']}/apinger.pid", "USR1");
> */
>
> in /etc/inc/gwlb.inc
>
> And suddenly it seems to work fine.
>
> The data in /var/run/apinger.status is still updated constantly, it may be
> that in the widget I do not see every change right away because the USR1
> signal is not sent when we read the file, but at least the problem with the
> widget behaving erratically is now gone!
>
> I am not sure if/how the USR1 kill can influence gathering the data from
> the file, but somehow it does ...
>
>
> Hopefully someone who knows pfSense better than I do can shed some light
> on why ....
>
>
>
Thank you for the analysis.

Would you mind reporting your findings on redmine.pfsense.org and also
attaching your anonymized config.xml, or describing your configuration with
so many gateways in more detail?
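
For the ticket, it may help to spell out the suspected race. One possible explanation (an assumption, not something confirmed in this thread) is that the USR1 signal makes apinger rewrite its status file at the very moment the widget reads it, so the reader sometimes sees an empty or partial file. The sketch below is plain Python, not pfSense code; read_status and write_status_atomically are hypothetical names, and the status lines are treated as opaque text.

```python
import os
import tempfile
import time


def read_status(path, attempts=5, delay=0.2):
    """Return the non-empty lines of the status file.

    An empty or missing file is treated as "try again", not as "no data",
    because the writer may be in the middle of rewriting it.
    """
    for _ in range(attempts):
        try:
            with open(path, "r") as handle:
                lines = [line.strip() for line in handle if line.strip()]
        except FileNotFoundError:
            lines = []
        if lines:
            return lines
        time.sleep(delay)  # give the writer time to finish, then retry
    return []


def write_status_atomically(path, lines):
    """Hypothetical writer side: write a temp file, then rename it into place.

    A reader that opens `path` sees either the old or the new content,
    never a truncated file.
    """
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    os.replace(tmp_path, path)
```

If the real cause is something else, for example the signal interrupting apinger's probing loop, this sketch does not apply; it only illustrates why reading right after signalling could return nothing.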


> Best,
> Ray
>
>
>
> ----- "Raimund Sacherer" <[email protected]> wrote:
>
> > Hello,
> >
> > Over the weekend I installed our new firewall system. It consists of two
> > Dell R210 with Intel (igb) 2-port interface cards.
> >
> > The old system was 2.0-RELEASE.
> >
> > We have 11 gateways configured, a mix of WANs and LAN-type
> > interconnects with 2 other companies. We have a couple of ADSL lines, a
> > 10Mbit fiber and two 100Mbit fiber WANs.
> >
> > The apinger works perfectly on the 2.0-RELEASE.
> >
> > In the 2.1-RELEASE I have the following problems:
> >
> > On Sunday I made the switch and I noticed that all gateways are marked
> > as down, with status first pending, then unknown.
> > In the logs I have a message which says that all gateways can not be
> > contacted and they are assumed online.
> >
> > Now without the apinger working correctly I did not configure the 2nd
> > Firewall out of fear that there will be problems and I deactivated
> > gateway monitoring.
> >
> >
> >
> > In the last two days I played around with the 2nd Firewall and I
> > noticed this:
> >
> > With up to 4 interfaces/gateways configured (out of the 11) everything
> > works fine, I see stable behavior in the gateway section on the
> > dashboard.
> > Then I added one more interface and I saw problems in the dashboard,
> > the lines went from online to unknown/pending. When I deactivated the
> > last interface all went online again. I did not investigate further as
> > I had to go.
> >
> > (After a couple of activate/deactivate cycles I had the problem that
> > activating the interface in the GUI and clicking save/apply did not
> > configure the interface. ifconfig said it was simply not there, and I
> > had to execute /etc/rc.interfaces_opt_configure to get everything
> > configured again. I am not sure if this can occur if you have lots of
> > tabs open to the firewall or if there is another configuration/GUI bug.)
> >
> >
> > Today I configured 1 more interface and with 6 interfaces I see
> > something really weird. The dashboard shows me that all lines are
> > online (with RTT times which seem reasonable) for around 8 seconds,
> > then it shows me unknown for about 20-30 seconds, then online for
> > around 8 seconds again, then unknown ....
> >
> > It seems the more interfaces you configure, the weirder the apinger
> > behavior gets.
> >
> > I tried to copy the apinger from the 2.0-RELEASE and use it, but it
> > also did not work as expected.
> >
> >
> > I hope someone can find out what's wrong with apinger, because it
> > definitely *is* a problem. I have seen a couple of people in the
> > forums report it, and I think at least 2 bug reports. Maybe it does not
> > occur if you have only a couple of WANs.
> >
> >
> > Tomorrow I will try to see if I can install the 2.0-RELEASE on this
> > machine (I hope it can support the new hardware) because 2.0 was
> > rock-solid for me (we had the FW with an uptime of 895 days without
> > any signs of trouble).
> >
> > I am a little afraid of upgrading to 2.1.1-RELEASE because there seem to
> > be quite a few troubling problems with this release as well ... :-(
> >
> >
> > Thank you,
> > Best regards,
> >
> > Raimund
> >
> >
> > _______________________________________________
> > List mailing list
> > [email protected]
> > https://lists.pfsense.org/mailman/listinfo/list



-- 
Ermal