What model is your BC Web filter, and how old is it?
-----Original Message-----
From: Kurt Buff
Sent: Thursday, June 09, 2011 12:13 AM
To: NT System Admin Issues
Subject: OTish: Need some ideas on troubleshooting web browsing problem
All,
I'm in need of a new approach to troubleshooting staff complaints about
intermittent slowness of web browsing. We have about 200 staff members
on site. The symptoms are intermittent at best, but include some
generalized slowness in page loads and occasional complete page misses
- that is, staff report that a page fails to load at all, with a message
that the system can't find the page, but hitting refresh will usually
bring the page right up.
My current testing methodology seems to be getting me nowhere and
causing me to lose hair in great chunks. I outline the methodology below
because someone might spot a flaw in it.
I'm not well versed in reading packets, so I haven't yet resorted to
Wireshark or tcpdump, but my testing so far leads me to believe that I
won't find much that way. If your reading of the situation leads you to
believe otherwise, I'm all ears. But I'm also really interested in
hearing other things all y'all might suggest on how to go about this.
Network physical configuration:
DS3 >> HP 2524 switch >> Sidewinder firewall >> HP 2524 switch >>
Barracuda web filter >> HP 3400cl switch >> production VLANs
Network logical configuration:
No VLANs externally, 9 VLANs that run over the 3400cl and 18 VLANs
(the ones on the 3400cl, plus 9 for test/dev/other) that run on the
internal HP 2524. The firewall is a HA pair (active/passive) and has a
VLANed interface to the HP 2524 - it sees all of the VLANs.
Other data:
I've got ntop running on two different points on the network - the
external HP 2524, and the HP 3400cl - no load anomalies for the LAN or
Internet connection noted.
Testing methodology:
I have placed a FreeBSD box with a public IP address external to
the firewall, and two FreeBSD boxes internal to the firewall on
different VLANs. One of the internal FreeBSD boxes is on a VLAN that
doesn't traverse the 3400cl, and the other is placed in a VLAN that does
- both VLANs transit the Barracuda, as do all staff machines.
Each box has cURL installed (there's a version for Windows as well), and
is given an identical list of about 2100 unique (http://fqdn only
- not http://fqdn/somepath) URLs to resolve and download. I kick off the
batch files manually - and simultaneously.
The batch file is simple:
date > /root/out.txt
/usr/local/bin/curl -K /root/urls.txt >> /root/out.txt
date >> /root/out.txt
The entries are all formatted similarly, e.g.:
url = "http://www.google.com"
silent
write-out = "%{url_effective}\t%{time_total}\t%{time_namelookup}\n"
output = "/dev/null"
The output looks like this:
http://www.google.com 0.093 0.066
Downloaded data is dumped to /dev/null, but I capture the timings
for name resolution and the total transaction so that if I want I can
analyze them later. I used this method before to identify a problem with
the DNS proxy on the firewall, so I thought it would be useful for
tracking down this problem as well.
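For the later analysis, something along these lines might do. This is only a minimal sketch: it assumes out.txt holds the tab-separated lines shown above between the two date stamps, and the function name summarize_timings and the 1-second name-lookup threshold are made up for illustration, not part of the original test.

```sh
#!/bin/sh
# Summarize curl timing lines of the form:
#   URL<TAB>time_total<TAB>time_namelookup
# Lines without three tab-separated fields (such as the date stamps)
# are skipped.
# Usage: summarize_timings FILE [DNS_THRESHOLD_SECONDS]
# (summarize_timings is a hypothetical helper name.)
summarize_timings() {
    awk -F '\t' -v limit="${2:-1.0}" '
        NF == 3 && $1 ~ /^https?:\/\// {
            n++
            total += $2                       # whole transaction time
            dns += $3                         # name resolution time
            if ($3 + 0 > limit + 0) slow++    # lookups over the threshold
        }
        END {
            if (n == 0) { print "no timing lines found"; exit 1 }
            printf "requests=%d mean_total=%.3f mean_dns=%.3f slow_dns=%d\n",
                n, total / n, dns / n, slow
        }
    ' "$1"
}
```

Running summarize_timings /root/out.txt 0.5 on each box after a run should give a rough idea of whether a slow run goes along with slow name lookups or with slow transfers.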
All three boxes are using Google for name resolution: 8.8.8.8 - so
that I can eliminate variances based on possible problems with our AD
DNS infrastructure - I don't think there are any, but....
Currently, our AD DNS points to 8.8.8.8 for its resolvers, but was
originally pointed at our ISP's DNS - that change doesn't seem to have
made a difference in staff experience.
I gathered the URLs from my syslogs, so they are real sites that
people here visit.
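Turning a raw list of logged URLs into the curl list could be sketched roughly as follows. It assumes one URL per line on standard input; make_curl_config is a hypothetical helper name, and the entries use curl's long option names (silent, write-out, output), which the config file format accepts with an equals sign.

```sh
#!/bin/sh
# Read URLs on stdin, cut each down to scheme://host, drop duplicates,
# and print one curl config entry per unique host.
# (make_curl_config is a hypothetical helper name.)
make_curl_config() {
    sed -n 's|^\(https\{0,1\}://[^/]*\).*|\1|p' | sort -u |
    while IFS= read -r url; do
        printf 'url = "%s"\n' "$url"
        printf 'silent\n'
        # %% prints a literal %, \\t and \\n print a literal \t and \n,
        # so curl itself expands them when it writes the timing line.
        printf 'write-out = "%%{url_effective}\\t%%{time_total}\\t%%{time_namelookup}\\n"\n'
        printf 'output = "/dev/null"\n'
    done
}
```

For example, make_curl_config < logged-urls.txt > /root/urls.txt, where logged-urls.txt stands in for whatever file the syslog extract ends up in.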
The problem with the results from the methodology:
Using the same data files each time, timings across all three boxes
have varied wildly. On Friday of last week, each of the three boxes took
40 minutes to run through the list of URLs. On Tuesday they each took
roughly three hours. Today the external box took 40 minutes, one of
the internal boxes took about three hours, and the other internal machine
hadn't finished by the time I left work. cURL hung on that machine, and
I'm going to rebuild it, as it had been mothballed and only revived for
this test, and really needs updating. Because there is no consistency in
the data, I cannot draw any conclusions.
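One way to get more out of two runs that differ this much might be to compare them URL by URL. The following is a minimal sketch under the assumption that each run's out.txt was saved under its own name; compare_runs and the default of 20 lines are hypothetical choices for illustration.

```sh
#!/bin/sh
# Print the URLs whose total time grew the most between two runs.
# Both files hold lines of: URL<TAB>time_total<TAB>time_namelookup
# Output columns: URL, increase in total time, increase in lookup time.
# Usage: compare_runs FAST_RUN SLOW_RUN [HOW_MANY]
# (compare_runs is a hypothetical helper name.)
compare_runs() {
    awk -F '\t' '
        NF != 3 { next }                  # skip date stamps and stray lines
        NR == FNR {                       # first file: remember its timings
            total[$1] = $2; dns[$1] = $3; next
        }
        ($1 in total) {                   # second file: print the difference
            printf "%s\t%.3f\t%.3f\n", $1, $2 - total[$1], $3 - dns[$1]
        }
    ' "$1" "$2" | sort -t "$(printf '\t')" -k2,2nr | head -n "${3:-20}"
}
```

If the extra time is spread evenly over all URLs, that points at something common to every request; if a handful of hosts account for most of it, or the increase sits almost entirely in the lookup column, that narrows things down.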
I'm going to try a few more runs, but definitely feel the need for a
different approach.
Any thoughts you might have will be appreciated. I'm out for the next
couple of days, so won't be able to try any suggestions until next week,
but would love to hear from folks on this.
Thanks,
Kurt
~ Finally, powerful endpoint security that ISN'T a resource hog! ~
~ <http://www.sunbeltsoftware.com/Business/VIPRE-Enterprise/> ~
---
To manage subscriptions click here:
http://lyris.sunbelt-software.com/read/my_forums/
or send an email to [email protected]
with the body: unsubscribe ntsysadmin