Well, it may not be the ALIX boards after all. I connected the servers directly to the modem, ran the crawlers, and I'm still getting UnknownHostExceptions. I'm guessing my modem's to blame... I'll have to upgrade it and find out.
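One way to separate a resolver problem from an HTMLUnit problem is to hammer DNS directly and count the failures. The following is a minimal sketch under stated assumptions, not the crawler's actual code: `DnsProbe`, `Resolver`, and `countFailures` are hypothetical names, and the thread count of 100 simply mirrors the per-instance figure quoted below.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper (not part of the crawler): counts failed DNS lookups under concurrency.
class DnsProbe {

    /** Pluggable lookup so the probe can also be exercised without a network. */
    interface Resolver {
        void resolve(String host) throws UnknownHostException;
    }

    /** Resolves every host on a fixed-size pool and returns how many lookups failed. */
    static int countFailures(List<String> hosts, int threads, Resolver resolver) {
        AtomicInteger failures = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String host : hosts) {
            pool.execute(() -> {
                try {
                    resolver.resolve(host);
                } catch (UnknownHostException e) {
                    failures.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            // Wait for all queued lookups; give up after five minutes.
            if (!pool.awaitTermination(5, TimeUnit.MINUTES)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return failures.get();
    }

    public static void main(String[] args) {
        // Hostnames come from the command line; 100 threads is an assumed setting.
        List<String> hosts = Arrays.asList(args);
        int failed = countFailures(hosts, 100, InetAddress::getByName);
        System.out.println(failed + " of " + hosts.size() + " lookups failed");
    }
}
```

Run it with a few hundred distinct hostnames as arguments, once behind pfSense and once directly on the modem, and compare the counts. Distinct names matter because the JVM caches successful lookups (see the `networkaddress.cache.ttl` security property), so repeated names would not reach the resolver again.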
On 3/18/14, David Noel <david.i.n...@gmail.com> wrote:
> Well, I bumped Maximum State Table from the default of 23,000 to
> 75,000, and now it's throwing fewer UnknownHostExceptions. But
> they're still being thrown. My resource utilization is getting pretty
> high though. I don't think these ALIX boards can handle much more of a
> load, and I still have 2 more servers I need to scale these crawlers
> out to. I do see there's a "Firewall Adaptive Timeouts" setting in the
> web configurator... this seems like it might be useful. Can anyone
> recommend any settings I should try to free up some system resources?
> I'm not clear on the consequences of purging pf state entries and
> whether that's something I'd want to do though.
>
> The state table on my primary router (alix1) is at roughly 50%
> utilization, or 40,000 states. The state table on my secondary router
> (alix2) is at 0%, roughly 250 states. This seems odd. Is this to be
> expected under CARP? Why is the load not distributed evenly?
>
> Memory usage on my primary router (alix1) is hovering around 55% (of
> 235MB). On my backup (alix2) it's pushing 85-90%. Does this make sense
> to anyone? Top output looks roughly the same... and now alix2 has gone
> down. 95% packet loss. Web Configurator unresponsive. ... It's back up
> but throwing "500 - Internal Server Error"s periodically. I've ssh'd
> in to alix2 and am looking at top output... tcpdump seems to be running
> for pflog purposes... and it's hogging quite a bit of CPU. Is this
> necessary? Can I disable it somehow?
>
> -David
>
> On 3/18/14, David Noel <david.i.n...@gmail.com> wrote:
>> I've encountered a strange issue while scaling a Java project that I'm
>> not quite sure how to resolve. Any thoughts would be appreciated.
>>
>> The code is a crawler that uses HTMLUnit to crawl a bunch of pages
>> concurrently. It uses HTMLUnit's getPage method to do the crawling. I'm
>> running 100 threads per instance.
>> When I have 1 instance up and running on 1 machine everything is
>> fine. When I scale it to a second machine though I start having
>> trouble. Calls to getPage keep throwing UnknownHostExceptions (DNS
>> resolution errors). With 2 servers running, roughly 1 out of every 20
>> calls to getPage throws this exception. For some reason it's unable to
>> resolve domain names... and it's not just the crawlers, my entire
>> network starts to choke on DNS queries. On different systems on the
>> same network I get 'unable to resolve host' errors in my web browser
>> periodically when loading URLs. Usually when I retry it goes through,
>> but it keeps happening sporadically as long as the crawlers are
>> running.
>>
>> So many things could be going wrong here. Thinking maybe it was my
>> provider throttling DNS queries I've tried changing DNS servers, but
>> that's done nothing. Thinking it might be a bandwidth issue I checked
>> systat, but the cumulative load is well under what my line can handle.
>> What else could be causing this? My network is pretty simple: Provider
>> <--> modem <--> 2 ALIX boards running pfSense <--> Servers and
>> workstations. The servers are running FreeBSD, and the workstations
>> run FreeBSD, Windows, and OSX.
>>
>> Has anyone encountered this before? Does anyone have any thoughts on
>> what might be causing it?
>>
>> My only other thought is that maybe pfSense is doing something strange
>> so if I can't come up with any better ideas I'll try plugging the
>> servers directly into the modem. I'd rather have them behind the
>> routers though, so this would be a less-than-ideal solution.
>>
>> UPDATE: Ok, so it seems to be a pfSense issue. I launched the crawlers
>> on 2 servers as before and waited for UnknownHostExceptions to be
>> thrown. I then took a spare laptop and connected it directly into my
>> modem, bypassing my 2 pfSense routers. All DNS queries have gone
>> through without a hitch, so something strange is going on with
>> pfSense.
>> Can anyone think of what might be causing this? I'm guessing
>> there's some tunable that needs to be tweaked, but I'm not sure where
>> to start. I also might have configured pfSense incorrectly, but I
>> think that's less likely to be the case than some default tunable
>> being set too low because at low volumes all DNS queries go through
>> just fine. If it were a configuration error it seems more likely that
>> no DNS queries would be going through. If it's relevant, with 200
>> active threads I'm probably querying DNS a minimum of 10 times per
>> second.
>>
>> Can anyone think of anything I might have done wrong setting up
>> pfSense? Does anyone know of any tunables that might be causing this
>> error?
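The original post notes that a failed lookup usually goes through on retry. While the root cause is being tracked down, that observation can be turned into a small retry wrapper. This is only a sketch under stated assumptions: `DnsRetry` and `callWithRetry` are hypothetical names, the linear backoff is an arbitrary choice, and retrying masks the symptom without fixing the router or modem.

```java
import java.net.UnknownHostException;
import java.util.concurrent.Callable;

// Hypothetical helper (not part of HTMLUnit): retries a call only when DNS resolution fails.
class DnsRetry {

    /**
     * Runs the call up to maxAttempts times, sleeping backoffMillis * attempt
     * between tries. Any exception other than UnknownHostException propagates
     * immediately; the last UnknownHostException is rethrown if every try fails.
     */
    static <T> T callWithRetry(Callable<T> call, int maxAttempts, long backoffMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        UnknownHostException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (UnknownHostException e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMillis * attempt);
                }
            }
        }
        throw last;
    }
}
```

In the crawler, the wrapped call would be the page fetch itself, for example a lambda around getPage on whatever client object each thread holds. Logging each retry is worth doing so the failure rate stays visible while tunables are being adjusted.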