Well, it may not be the ALIX boards after all. I connected the servers
directly to the modem, ran the crawlers, and I'm still getting
UnknownHostExceptions. I'm guessing my modem's to blame... I'll have
to upgrade it and find out.
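
To take HTMLUnit out of the picture, the failure rate can be measured with bare lookups. This is only a rough sketch, not the crawler's actual code: DnsProbe, Resolver, and countFailures are hypothetical names made up for illustration, and the pool size of 100 just mirrors the thread count per instance mentioned in this thread. The only real API it relies on is InetAddress.getByName, which throws UnknownHostException when a name can't be resolved.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of the crawler: counts failed DNS lookups under concurrency.
final class DnsProbe {

    // Hypothetical abstraction so the probe can also be run against a fake resolver.
    interface Resolver {
        void resolve(String host) throws UnknownHostException;
    }

    // Resolves every host on a fixed-size thread pool and returns how many lookups
    // threw UnknownHostException.
    static int countFailures(List<String> hosts, int threads, Resolver resolver)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger failures = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(hosts.size());
        for (String host : hosts) {
            pool.execute(() -> {
                try {
                    resolver.resolve(host);
                } catch (UnknownHostException e) {
                    failures.incrementAndGet();
                } finally {
                    done.countDown();
                }
            });
        }
        done.await();      // wait until every lookup has finished
        pool.shutdown();
        return failures.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // Hostnames are passed on the command line; 100 threads is an assumption
        // matching the per-instance thread count described in this thread.
        List<String> hosts = Arrays.asList(args);
        int failed = countFailures(hosts, 100, InetAddress::getByName);
        System.out.println(failed + " of " + hosts.size() + " lookups failed");
    }
}
```

Running it with a list of distinct hostnames, once behind the routers and once directly on the modem, should give comparable numbers. Note that the JVM caches lookups, so repeating the same hostname mostly measures the cache rather than the network.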

On 3/18/14, David Noel <david.i.n...@gmail.com> wrote:
> Well, I bumped Maximum State Table from the default of 23,000 to
> 75,000, and now it's throwing fewer UnknownHostExceptions. But
> they're still being thrown. My resource utilization is getting pretty
> high though. I don't think these ALIX boards can handle much more of a
> load, and I still have 2 more servers I need to scale these crawlers
> out to. I do see there's a "Firewall Adaptive Timeouts" setting in the
> web configurator. This seems like it might be useful. Can anyone
> recommend any settings I should try to free up some system resources?
> I'm not clear on the consequences of purging pf state entries and
> whether that's something I'd want to do though.
>
> The state table on my primary router (alix1) is at roughly 50%
> utilization, or 40,000 states. The state table on my secondary router
> (alix2) is at 0%, roughly 250 states. This seems odd. Is this to be
> expected under CARP? Why is the load not distributed evenly?
>
> Memory usage on my primary router (alix1) is hovering around 55% (of
> 235MB). On my backup (alix2) it's pushing 85-90%. Does this make sense
> to anyone? Top output looks roughly the same... and now alix2 has gone
> down. 95% packet loss. Web Configurator unresponsive. ... It's back up
> but throwing "500 - Internal Server Error"s periodically. I've ssh'd
> in to alix2 and am looking at top output. tcpdump seems to be running
> for pflog purposes, and it's hogging quite a bit of CPU. Is this
> necessary? Can I disable it somehow?
>
> -David
>
> On 3/18/14, David Noel <david.i.n...@gmail.com> wrote:
>> I've encountered a strange issue while scaling a Java project that I'm
>> not quite sure how to resolve. Any thoughts would be appreciated.
>>
>> The code is a crawler that uses HTMLUnit to crawl a bunch of pages
>> concurrently. It uses HTMLUnit's getPage method to do the crawling. I'm
>> running 100 threads per instance. When I have 1 instance up and
>> running on 1 machine everything is fine. When I scale it to a second
>> machine though I start having trouble. Calls to getPage keep throwing
>> UnknownHostExceptions (DNS resolution errors). With 2 servers running,
>> roughly 1 out of every 20 calls to getPage throws this exception. For
>> some reason it's unable to resolve domain names, and it's not just
>> the crawlers: my entire network starts failing DNS queries. On
>> different systems on the same network I get 'unable to resolve host'
>> errors in my web browser periodically when loading URLs. Usually when
>> I retry it goes through, but it keeps happening sporadically as long
>> as the crawlers are running.
>>
>> So many things could be going wrong here. Thinking maybe it was my
>> provider throttling DNS queries, I've tried changing DNS servers, but
>> that's done nothing. Thinking it might be a bandwidth issue, I checked
>> systat, but the cumulative load is well under what my line can handle.
>> What else could be causing this? My network is pretty simple: Provider
>> <--> modem <--> 2 ALIX boards running pfSense <--> Servers and
>> workstations. The servers are running FreeBSD, and the workstations
>> run FreeBSD, Windows, and OSX.
>>
>> Has anyone encountered this before? Does anyone have any thoughts on
>> what might be causing it?
>>
>> My only other thought is that maybe pfSense is doing something strange,
>> so if I can't come up with any better ideas I'll try plugging the
>> servers directly into the modem. I'd rather have them behind the
>> routers though, so this would be a less-than-ideal solution.
>>
>> UPDATE: Ok, so it seems to be a pfSense issue. I launched the crawlers
>> on 2 servers as before and waited for UnknownHostExceptions to be
>> thrown. I then took a spare laptop and connected it directly to my
>> modem, bypassing my 2 pfSense routers. All DNS queries have gone
>> through without a hitch, so something strange is going on with
>> pfSense. Can anyone think of what might be causing this? I'm guessing
>> there's some tunable that needs to be tweaked, but I'm not sure where
>> to start. I also might have configured pfSense incorrectly, but I
>> think that's less likely to be the case than some default tunable
>> being set too low because at low volumes all DNS queries go through
>> just fine. If it were a configuration error it seems more likely that
>> no DNS queries would be going through. If it's relevant, with 200
>> active threads I'm probably querying DNS a minimum of 10 times per
>> second.
>>
>> Can anyone think of anything I might have done wrong setting up
>> pfSense? Does anyone know of any tunables that might be causing this
>> error?
>>
>
_______________________________________________
List mailing list
List@lists.pfsense.org
https://lists.pfsense.org/mailman/listinfo/list
