Re: dhcpd related issues

2009-09-21 Thread James Tanis
Yeah, it seems enough traffic was being generated to delay the DHCP
leases long enough that the client computers were giving up. I used
dhcping to watch and witnessed it in action. Moving DHCP to another
server solved the issue. I'll likely be moving some other services off
that server soon to cut down on other "hotspot" related problems.


--
James Tanis
Technical Coordinator
Computer Science Department
Monsignor Donovan Catholic High School



On Sep 20, 2009, at 10:41 PM, Olivier Nicole wrote:


Hi James,


> I have a FreeBSD 7.0 gateway/server with isc-dhcpd 3.1.2p1_2. Late
> yesterday I began having some unique and intermittent issues.
> Basically, random computers will all of a sudden lose their dhcp
> leases and be unable to contact the dhcp server.


I did not see any reply to your question.

I once had a secondary switch that worked badly with DHCP. If I
rebooted the switch, it would work for a while, then fail again. But
the failure was fairly random: only some ports would be affected, and
only some of the time. The problem would occur at the first lease in
the morning rather than at renewal time.

As it was a cheap switch, unmanageable, I replaced it.


> Sep 17 14:03:15 grendel dhcpd: ICMP Echo reply while lease 192.168.1.253 valid.
> Sep 17 15:25:19 grendel dhcpd: ICMP Echo Reply for 192.168.1.74 late or spurious.
>
> which doesn't seem particularly relevant or heinous. Many more
> computers than the ones above have been affected.


I don't remember ever seeing such an error, but I would think that
"late or spurious" is not that innocent: it could be the symptom of a
switch not working at its nominal speed.

Of course you could also consider a computer on that switch being
infected by some kind of virus and generating so much traffic it takes
your switch out.

Bests,

Olivier


___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "freebsd-questions-unsubscr...@freebsd.org"


dhcpd related issues

2009-09-18 Thread James Tanis
I have a FreeBSD 7.0 gateway/server with isc-dhcpd 3.1.2p1_2. Late  
yesterday I began having some unique and intermittent issues.  
Basically, random computers will all of a sudden lose their dhcp  
leases and be unable to contact the dhcp server.


At first I figured the dhcp server had crashed, but it had not; it was
still up and running. Next I figured we had run out of leases; this
has happened before -- the school is growing rapidly, not to mention
the kids keep getting more connected. Unfortunately, after doubling
the number of available leases, the problem still persists.


Now the issue gets more confused by the fact that some computers  
haven't been affected at all. There seems to be no real difference  
between their configurations and the configurations of the computers  
affected. For a while I was considering the possibility of the switch  
dropping packets or developing bad ports, but the behavior isn't  
consistent with that. One would think that if the port connecting a  
secondary switch to the main switch was going bad that it would affect  
all clients on the secondary switch -- this is not the case. There  
doesn't seem to be much rhyme or reason to which computers are affected.


The server isn't reporting any dropped packets on either of its  
interfaces and the links aren't even close to saturated. I'm  
completely at a loss as to the cause of the problem. The problem  
occurs in a time period that is  pretty consistent with the default  
lease time -- which would suggest there is something odd happening  
with lease renewal, but I certainly can't seem to get a grasp on it.
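
As a rough aid for reasoning about the timing, the sketch below works out when a client would normally try to renew. It assumes the clients use the RFC 2131 default timers (renew at 50% of the lease, rebind at 87.5%), which individual clients may override. The function name is made up for illustration, and the lease values are the ones from the dhcpd.conf in this post.

```python
# Default DHCP client timers per RFC 2131: T1 (renew) = 50% of the lease,
# T2 (rebind) = 87.5% of the lease. Clients may be configured differently.
def lease_timers(lease_seconds):
    """Return (t1_renew, t2_rebind, expiry) in seconds after the lease is granted."""
    t1 = lease_seconds * 0.5
    t2 = lease_seconds * 0.875
    return t1, t2, float(lease_seconds)


if __name__ == "__main__":
    # Lease times taken from the dhcpd.conf quoted in this post.
    for name, lease in (("default-lease-time", 600), ("max-lease-time", 7200)):
        t1, t2, expiry = lease_timers(lease)
        print(f"{name}: renew at {t1:.0f}s, rebind at {t2:.0f}s, expires at {expiry:.0f}s")
```

With a 600 second lease, a client that gets no answer to its renewal at 300 seconds keeps retrying and only loses its address at 600 seconds, so a failure that shows up on roughly that schedule fits the idea that renewals are going unanswered.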


If I do a "cat debug.log|grep dhcpd" I get:

Sep 17 08:36:07 grendel dhcpd: ICMP Echo Reply for 192.168.1.243 late or spurious.
Sep 17 12:58:04 grendel dhcpd: ICMP Echo Reply for 192.168.1.57 late or spurious.
Sep 17 12:58:04 grendel dhcpd: ICMP Echo Reply for 192.168.1.57 late or spurious.
Sep 17 13:56:27 grendel dhcpd: ICMP Echo Reply for 192.168.1.155 late or spurious.
Sep 17 14:03:15 grendel dhcpd: ICMP Echo reply while lease 192.168.1.253 valid.
Sep 17 15:25:19 grendel dhcpd: ICMP Echo Reply for 192.168.1.74 late or spurious.


which doesn't seem particularly relevant or heinous. Many more  
computers than the ones above have been affected.
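
To see whether particular addresses show up repeatedly, the messages above could be tallied per IP. This is only a sketch: the regular expression covers just the two message forms quoted in this post, the function name is hypothetical, and the sample lines are the ones from the log excerpt, written one entry per line.

```python
import re
from collections import Counter

# Covers only the two dhcpd ICMP messages quoted in this post.
PATTERN = re.compile(
    r"dhcpd: ICMP Echo [Rr]eply "
    r"(?:for (?P<late>\S+) late or spurious|while lease (?P<valid>\S+) valid)"
)


def count_icmp_events(lines):
    """Return two Counters keyed by IP: late/spurious replies, replies on valid leases."""
    late, valid = Counter(), Counter()
    for line in lines:
        match = PATTERN.search(line)
        if not match:
            continue  # not one of the two message forms
        if match.group("late"):
            late[match.group("late")] += 1
        else:
            valid[match.group("valid")] += 1
    return late, valid


SAMPLE = """\
Sep 17 08:36:07 grendel dhcpd: ICMP Echo Reply for 192.168.1.243 late or spurious.
Sep 17 12:58:04 grendel dhcpd: ICMP Echo Reply for 192.168.1.57 late or spurious.
Sep 17 12:58:04 grendel dhcpd: ICMP Echo Reply for 192.168.1.57 late or spurious.
Sep 17 13:56:27 grendel dhcpd: ICMP Echo Reply for 192.168.1.155 late or spurious.
Sep 17 14:03:15 grendel dhcpd: ICMP Echo reply while lease 192.168.1.253 valid.
Sep 17 15:25:19 grendel dhcpd: ICMP Echo Reply for 192.168.1.74 late or spurious.
""".splitlines()

if __name__ == "__main__":
    late, valid = count_icmp_events(SAMPLE)
    for ip, n in late.most_common():
        print(f"late or spurious: {ip} x{n}")
    for ip, n in valid.most_common():
        print(f"reply while lease valid: {ip} x{n}")
```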


Doing the same for console.log got me a whole bunch of:

Sep 17 16:45:18 grendel dhcpd: if mdchs203-2.mdchs.org IN A rrset doesn't exist add mdchs203-2.mdchs.org 300 IN A 192.168.1.162: timed out.
Sep 17 16:45:26 grendel dhcpd: if mdchs100-1.mdchs.org IN A rrset doesn't exist add mdchs100-1.mdchs.org 300 IN A 192.168.1.126: timed out.


which is pretty much the norm and shouldn't be causing the problem.

The main switch is an HP Procurve 1700-24 and it doesn't seem to be
reporting any problems. All ports that should be up are up. There is 1
"Rx Error Packet" being reported on Port 23. Port 23 is the one that
goes out to the server, but a single packet couldn't be causing this
kind of behavior.


Does anyone have *any* ideas? I'm about tapped out myself here. I'll
attack the problem fresh if it persists tomorrow, but I'd like to come
at it with some ideas from different perspectives.


Here is the dhcpd.conf file, recently changed to add more leases:

ddns-update-style ad-hoc;
option domain-name "mdchs.org";
option domain-name-servers 192.168.1.1;
option netbios-name-servers 192.168.1.1;
option netbios-node-type 8;

shared-network mdchs {
   default-lease-time 600;
   max-lease-time 7200;
   option subnet-mask 255.255.0.0;
   option broadcast-address 192.168.255.255;
   option routers 192.168.1.1;

   subnet 192.168.1.0 netmask 255.255.255.0 {
      range 192.168.1.46 192.168.1.253;

      host mdchs12 {
         hardware ethernet xx:xx:xx:xx:xx:xx;
         fixed-address 192.168.1.6;
      }
      # snipped the rest of the host entries for brevity
   }

   subnet 192.168.2.0 netmask 255.255.255.0 {
      range 192.168.2.1 192.168.2.254;
   }
}

It seems worth noting that this server was functioning perfectly well
for a year and a half before this occurred. Nothing was changed before
the problem manifested. After the problem manifested I upgraded to the
above mentioned version and added the shared-network with the second
subnet. So far the nature of the problem has not changed whatsoever.
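
For a quick sanity check of the pools, the sketch below counts the addresses in each range statement and confirms that each range sits inside its declared subnet. It uses only Python's standard ipaddress module; the helper names are made up for illustration and the values are copied from the dhcpd.conf above. It does not parse dhcpd.conf itself and says nothing about how many leases are actually in use.

```python
import ipaddress


def range_size(first, last):
    """Number of addresses in an inclusive dhcpd 'range first last;' statement."""
    return int(ipaddress.ip_address(last)) - int(ipaddress.ip_address(first)) + 1


def range_in_subnet(first, last, subnet, netmask):
    """True if both ends of the range fall inside the declared subnet."""
    net = ipaddress.ip_network(f"{subnet}/{netmask}")
    return ipaddress.ip_address(first) in net and ipaddress.ip_address(last) in net


# (subnet, netmask, range start, range end), copied from the config above.
POOLS = [
    ("192.168.1.0", "255.255.255.0", "192.168.1.46", "192.168.1.253"),
    ("192.168.2.0", "255.255.255.0", "192.168.2.1", "192.168.2.254"),
]

if __name__ == "__main__":
    total = 0
    for subnet, netmask, first, last in POOLS:
        size = range_size(first, last)
        total += size
        ok = range_in_subnet(first, last, subnet, netmask)
        print(f"{subnet}/{netmask}: {first} - {last}: {size} addresses, inside subnet: {ok}")
    print(f"total dynamic addresses: {total}")
```

The first range holds 208 addresses and the second 254, so adding the second subnet slightly more than doubles the pool.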


--
James Tanis
Technical Coordinator
Computer Science Department
Monsignor Donovan Catholic High School





Re: issues with Intel Pro/1000 and 1000baseTX

2009-05-14 Thread James Tanis

Bill Moran wrote:

> In response to James Tanis:
>
>> <.. snip ..>
>> Attempting to force 1000baseTX via:
>>
>> ifconfig em1 media 1000baseTX mediaopt full-duplex
>>
>> gets me:
>>
>> status: no carrier
>>
>> After forcing the NIC to go 1000baseTX the LEDs on the backpane are both
>> off. I can only come to the conclusion that this is a driver issue based
>> on previous experience and the simple fact that the end user system is
>> capable of connecting at 1000baseTX. Anybody have any suggestions? I'm
>> hoping I'm wrong. I'd rather not do an in-place upgrade, this is a
>> production system and the main gateway for an entire school, when I do
>> not even know for sure whether this will fix the problem. It's worth it
>> to me though, having a 1000baseTX uplink from the switch would remove a
>> major bottleneck for me.



> Try forcing on both ends (I assume the Procurve will allow you to do that).
> One thing I've seen consistently is that if you force the speed/duplex on
> one end, the other end will still try to autoneg, and will end up with
> something stupid like 100baseT/half-duplex, or will give up and disable
> the port.
  
Ok, I just did that -- I have now attempted to force 1000baseTX on both
sides, and on one side while the other was left auto. All three possible
combinations resulted in the same behavior (no carrier).

> Also, try autoneg on both ends.  Make absolutely sure the Procurve is set
> to autoneg.
  
This was the original setup. It is also how I have it set up currently;
it results in 100baseTX full-duplex on both sides.

> Replace the cable.  If the cable is marginal, autoneg will downgrade the
> speed to ensure reliability.  Use a cable that you know will produce
> 1000baseTX because you've tested it on other systems.
  
Well, I don't have any verified working cable of the appropriate length,
so I simply switched out the cables for the main server and the backup
server. They are both cat6 cables crimped with cat5e modules by me. For
whatever reason (bad crimp job?) that seemed to fix the issue.


Thanks for the advice!

--
James Tanis
Technical Coordinator
Computer Science Department
Monsignor Donovan Catholic High School



issues with Intel Pro/1000 and 1000baseTX

2009-05-14 Thread James Tanis
I have a FreeBSD 7.0 box with two Intel Pro/1000 NICs; the one in
question is:


em1:  port 0x2020-0x203f mem 0xd806-0xd807,0xd804-0xd805 irq 19 at device 0.1 on pci4


what we get after boot is:

em1: flags=8943 metric 0 mtu 1500
   options=19b
   ether 00:30:48:xx:xx:xx
   inet 192.168.1.1 netmask 0xff00 broadcast 192.168.1.255
   media: Ethernet autoselect (100baseTX )
   status: active

The problem is that the NIC refuses to connect at 1000baseTX.
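
For anyone wanting to check the negotiated speed from a script, the media line can be picked apart as sketched below. The function names are hypothetical. The sample line assumes the usual ifconfig layout, "media: Ethernet <selected> (<active> <options>)", and the full-duplex flag in it is an assumption, since the archived output above has lost whatever was inside the angle brackets.

```python
import re

# Assumed layout: "media: Ethernet <selected> (<active> <opt1,opt2>)".
# The parenthesised part and the options are both treated as optional.
MEDIA_RE = re.compile(
    r"media:\s+Ethernet\s+(?P<selected>\S+)"
    r"(?:\s+\((?P<active>\S+)(?:\s+<(?P<options>[^>]*)>)?\s*\))?"
)


def parse_media(line):
    """Return selected media, active media and option list, or None if not a media line."""
    match = MEDIA_RE.search(line)
    if not match:
        return None
    options = match.group("options")
    return {
        "selected": match.group("selected"),
        "active": match.group("active"),
        "options": options.split(",") if options else [],
    }


def is_gigabit(info):
    """True if the active media name starts with '1000' (e.g. 1000baseTX)."""
    return bool(info and info["active"] and info["active"].startswith("1000"))


if __name__ == "__main__":
    # Hypothetical sample line; the duplex option is assumed, not from the log.
    sample = "   media: Ethernet autoselect (100baseTX <full-duplex>)"
    info = parse_media(sample)
    print(info, "gigabit:", is_gigabit(info))
```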

It's connected to an HP Procurve 1700-24 switch which supports 1000baseTX
on ports 23 and 24. This particular computer is connected on port 24. I 
have a much older end user system which uses the same card (but earlier 
revision), runs Windows XP and is plugged in to port 23. The end user 
system has no problem connecting at 1000baseTX. I have of course tried 
switching ports.


Attempting to force 1000baseTX via:

ifconfig em1 media 1000baseTX mediaopt full-duplex

gets me:

status: no carrier

After forcing the NIC to 1000baseTX, both LEDs on the back panel are
off. Based on previous experience, and on the simple fact that the end
user system is capable of connecting at 1000baseTX, I can only conclude
that this is a driver issue. Anybody have any suggestions? I'm hoping
I'm wrong. This is a production system and the main gateway for an
entire school, so I'd rather not do an in-place upgrade when I do not
even know for sure whether it will fix the problem. It would be worth
it to me though: a 1000baseTX uplink from the switch would remove a
major bottleneck.


Any help would be appreciated.

--
James Tanis
Technical Coordinator
Computer Science Department
Monsignor Donovan Catholic High School



Re: The Design and Implementation of the FreeBSD Operating System

2008-07-23 Thread James Tanis
"Kevin Kinsey" <[EMAIL PROTECTED]> wrote:
> I stand ready for correction, but "Design & Implementation" is mostly
> about, well, the design of the system itself ... not an operational
> manual but a programmer's guide to OS internals.  And, not only that,
> but it's about 4.4BSD (1993?), so the exact OS described is quite old*;
> however, it's of great value not only as history but as 4.4BSD has
> fed code into not only FreeBSD, but NetBSD, OpenBSD, and others.
> (see /usr/share/misc/bsd-family-tree).  If that's not of interest
> to you I'd not worry about this book --- no offence to Mr. McKusick
> et al, of course.

You're thinking of "The Design and Implementation of the 4.4BSD Operating
System", not "The Design and Implementation of the FreeBSD Operating System."
They are, believe it or not, two different books. Your point is just as
valid though, as far as it being "not an operational manual but a
programmer's guide to OS internals."
--
James Tanis
Technical Coordinator
Monsignor Donovan Catholic High School
e: [EMAIL PROTECTED]




Re: Auto-saving distfiles on freebsd (was: FreeBSD for webserver?)

2008-07-23 Thread James Tanis
"cpghost" <[EMAIL PROTECTED]> wrote:
> The ports would still go to the primary sites (to conserve bandwidth),
> but should the original distfile disappear, it would be still available
> on freebsd.

I think his problem comes from the fact that some ports don't do this, not
that it isn't a good idea. The port maintainers just never did it.
--
James Tanis
Technical Coordinator
Monsignor Donovan Catholic High School
e: [EMAIL PROTECTED]




Re: Spamassassin very slow

2008-07-22 Thread James Tanis
"lyd mc" <[EMAIL PROTECTED]> wrote:
>
> What causes spamassassin to slow?
>
> Here is my config:
>
> snippet from sendmail.mc
> ..  ..
>
> I have .procmailrc in every home directory of my mail users and it goes
> like this:

So if I'm understanding you correctly, you're calling spamc from a sendmail
milter *and* .procmailrc. That's pretty redundant and would definitely slow
you down. Choose one based on your needs.

>
> I also have RulesDuJour installed and spamassassin --lint does complain
> about it.
>

Extra rules can slow you down regardless of syntax, but most computers
created this decade can handle RulesDuJour fine. Personally I think your
main problem is that you're effectively spam checking every message twice.
The spamassassin queues most likely fill up, and sendmail then has to
wait and queue up the slack.
--
James Tanis
Technical Coordinator
Monsignor Donovan Catholic High School
e: [EMAIL PROTECTED]

