I have been having an issue with pfSense 2.0 for a few months (beta snapshots and RC1) that is driving me mad. I'm hoping someone can shed some light on this.

The server is a Dell PowerEdge R610 with four bce interfaces (bce0-bce3). It is a repurposed server, so it is built and configured for performance. In the simplest setup, I only have a LAN (bce0) and a WAN (bce1). This is a test server for evaluating 2.0, so it doesn't have much traffic; only a couple of us are using it as a gateway.

A few minutes after booting, the Web UI becomes unusably slow or completely unresponsive. Sometimes we are greeted with a 503 response. Other times the browser just spins forever. SSH access is similarly flaky. We have found that if we force some traffic through the gateway (e.g. an HTTP request from LAN to WAN) right after requesting a page from the Web UI or attempting an SSH session, it will respond to that request.

I have dug through posts related to this in the forums and archives, but haven't found much that's relevant. I did find one post [1], though, that was somewhat similar. Basically, the OP had to run tcpdump on the pfSense box to get it to work. I tried that, and it works! So now, every time I restart the pfSense box, I have to log in on the console or over SSH (if I can get in) and run `nohup tcpdump -i bce0 >& /dev/null' to make it behave.

Note that unlike the referenced post, we do not have any trouble going LAN->WAN through the gateway. The problem seems to be limited to accessing the gateway itself from the LAN. As long as my tcpdump is running, everything works beautifully, and the box is as fast and responsive as can be. But once that dump is stopped, pfSense stops responding reliably to the LAN.

I have:
- restarted the Web Configurator over and over
- stripped out all config except the most basic needed to function
- tried different ports for both LAN and WAN
- reinstalled the box from scratch several times
- watched `top' for cpu hogs; the system is bored
- verified all BIOS settings look normal
- scoured the logs
- found that sometimes when this issue is happening, I cannot kill (even `kill -9') the lighttpd process for the Web Configurator; it's almost like it is blocked waiting on something (?)

I am a FreeBSD server admin, so I have no problem digging through the system, debugging, installing tools, changing sysctls, or whatever to try to figure this out, but I don't know where to start. Does anyone have any ideas? I would rather not go to production with the tcpdump kludge. Am I the only one who has seen this?

- Jim

[1] http://forum.pfsense.org/index.php?topic=13701.0
