Re: ATA Woes.

2005-07-19 Thread Jon Simola
On 7/19/05, Tony Byrne [EMAIL PROTECTED] wrote:

 Jul 19 13:01:48 roo kernel: ad0: FAILURE - READ_DMA
 status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=288810495
 Jul 19 13:01:59 roo kernel: ad0: FAILURE - READ_DMA
 status=51<READY,DSC,ERROR> error=1<ILLEGAL_LENGTH> LBA=288810495
 Jul 19 13:02:05 roo kernel: ad0: FAILURE - READ_DMA
 status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=288810495
 Jul 19 13:02:16 roo kernel: ad0: FAILURE - READ_DMA
 status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=288810495
 Jul 19 13:04:36 roo last message repeated 4 times

 I'm totally confused. I don't know enough about SMART to know whether
 I'm looking at real failing drives or some bug exposed by the
 interaction between drive firmware, hd controller and FreeBSD.

What I've recently learned the hard way is that desktop drives have no
place in a server. I've now failed 4 of 10 SATA drives (Maxtor and WD)
in 1U rackmounts, and am moving on to trying the WD Raptor SATA drives
(which claim to be low-end server).
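
For anyone trying to read the FAILURE lines quoted above: the hex value in front of the flag names is a register bitmask. Below is a minimal Python sketch of the decoding. It covers only the flag names that appear in that log, and the bit positions for the status flags (0x40, 0x10, 0x01) are an assumption based on the usual ATA register layout, not something taken from the driver source. The function and table names are made up for illustration.

```python
import re

# Only the flags seen in the quoted log. Bit values are assumed from the
# usual ATA register layout, not copied from the FreeBSD ata driver.
STATUS_BITS = {0x40: "READY", 0x10: "DSC", 0x01: "ERROR"}
ERROR_BITS = {0x40: "UNCORRECTABLE", 0x01: "ILLEGAL_LENGTH"}

# Accepts the line with or without the <...> around the flag names.
FAILURE_RE = re.compile(
    r"(?P<dev>ad\d+): FAILURE - (?P<cmd>\w+)\s+"
    r"status=(?P<status>[0-9a-f]{1,2})\S*\s+"
    r"error=(?P<error>[0-9a-f]{1,2})\S*\s+"
    r"LBA=(?P<lba>\d+)"
)


def flags(value, table):
    """Return the names of the known bits set in value, highest bit first."""
    return [name for bit, name in sorted(table.items(), reverse=True) if value & bit]


def decode(line):
    """Decode one (unwrapped) FAILURE line, or return None if it does not match."""
    m = FAILURE_RE.search(line)
    if m is None:
        return None
    return {
        "dev": m.group("dev"),
        "cmd": m.group("cmd"),
        "status": flags(int(m.group("status"), 16), STATUS_BITS),
        "error": flags(int(m.group("error"), 16), ERROR_BITS),
        "lba": int(m.group("lba")),
    }


sample = ("Jul 19 13:01:48 roo kernel: ad0: FAILURE - READ_DMA "
          "status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=288810495")
print(decode(sample))
```

The log lines are wrapped in the mail, so they need to be joined back into one line each before being passed to `decode`.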

-- 
Jon Simola
Systems Administrator
ABC Communications
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: ATA Woes.

2005-07-19 Thread Jon Simola
On 7/19/05, Wilko Bulte [EMAIL PROTECTED] wrote:
 On Tue, Jul 19, 2005 at 11:22:01AM -0700, Jon Simola wrote..

  I've now failed 4 of 10 SATA drives (Maxtor and WD)
  in 1U rackmounts, and am moving on to trying the WD Raptor SATA drives
  (which claim to be low-end server).
 
 Properly cooled?

Yeah, they're in the Supermicro 811 chassis with hotswap SATA sleds.
There's a decent amount of air flowing over the drives, and SMART says
they're running at about 26C. Compare that to my 10Krpm SCSI array, which
I've burned my fingers on frequently.

-- 
Jon Simola
Systems Administrator
ABC Communications


Re: atacontrol raid1 vs. gmirror

2005-06-13 Thread Jon Simola
On 6/11/05, Paul Mather [EMAIL PROTECTED] wrote:

 I found array rebuilding to be troublesome on atacontrol RAID

Some bits from the in-house documentation I've been writing. I've
tested this on multiple occasions on my 1U Supermicro SATA boxes, so
it might be useful to someone.

* RAID setup: (If required, FreeBSD only)
  - ensure that the drives are probed as ad4 and ad6 like:

  ad4: 76324MB [155072/16/63] at ata2-master SATA150
  ad6: 76324MB [155072/16/63] at ata3-master SATA150

  - if they do not probe correctly, check the BIOS settings above
  - perform a minimal install of FreeBSD 5.3 (do not worry about
    network or anything)
  - reboot from the installed OS and log in as root
  - run the command atacontrol create RAID1 ad4 ad6 to create the raid set
  - reboot and reinstall the OS, choosing ar0 as the drive, which
    should probe like:

  ad4: 76324MB [155072/16/63] at ata2-master SATA150
  ad6: 76324MB [155072/16/63] at ata3-master SATA150
  ar0: 76324MB [9730/255/63] status: READY subdisks:
  disk0 READY on ad4 at ata2-master
  disk1 READY on ad6 at ata3-master

Minimal Survival for FreeBSD software RAID1 sets

* Read the atacontrol man page
* atacontrol status ar0 - to check the status
* atacontrol detach 2 - to detach ad4 if failed (again, use 3 for ad6).
  The SATA disks in the 5013C-T chassis are hotswappable, so the failed
  disk can be pulled once detached.
* atacontrol attach 2 - to reattach ad4 once replaced
* atacontrol addspare ar0 ad4 - to add the replaced ad4 as a spare
  on the RAID set
* atacontrol rebuild ar0 - to rebuild the mirror 
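
The survival steps above can be written down as an ordered command list. This is only a sketch: it prints the commands rather than running them, the helper name is made up, and the final status check is my own addition rather than part of the documented procedure. The channel numbers (2 for ad4, 3 for ad6) are the ones given in the list.

```python
# Channel numbers as given in the list above: 2 for ad4, 3 for ad6.
CHANNEL_FOR_DISK = {"ad4": 2, "ad6": 3}


def replace_disk_commands(array, disk):
    """Return the atacontrol commands, in order, for swapping one mirror member.

    Nothing is executed here; the caller decides what to do with the strings.
    """
    channel = CHANNEL_FOR_DISK[disk]
    return [
        f"atacontrol status {array}",
        f"atacontrol detach {channel}",
        # ... physically swap the hotswap sled at this point ...
        f"atacontrol attach {channel}",
        f"atacontrol addspare {array} {disk}",
        f"atacontrol rebuild {array}",
        # Assumption: re-checking status afterwards is a sensible last step.
        f"atacontrol status {array}",
    ]


for cmd in replace_disk_commands("ar0", "ad4"):
    print(cmd)
```

Keeping the sequence as data makes it easy to paste into a runbook or review before touching a live array.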

-- 
Jon Simola
Systems Administrator
ABC Communications


Re: Using jails and djbdns

2005-05-12 Thread Jon Simola
On 5/12/05, Tony Arcieri [EMAIL PROTECTED] wrote:

 Is there some easy way to reverse this order, so svscan is started first and
 jails started afterward?

Rename the svscan.sh script to 000svscan.sh so that it shows up first
in a directory listing and is run before the jail scripts
(jail-x.x.x.x.sh in my case).

The rc(8) man page has a lot of good points to read through if you
have any further questions.
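
The effect of the rename can be checked with a plain lexical sort. This sketch assumes the scripts are picked up in the same order as a sorted directory listing, and the jail script name is a made-up example of the jail-x.x.x.x.sh pattern.

```python
def run_order(script_names):
    """Return script names in plain lexical order (assumed startup order)."""
    return sorted(script_names)


# Hypothetical file names for illustration only.
print(run_order(["svscan.sh", "jail-10.0.0.1.sh"]))
print(run_order(["000svscan.sh", "jail-10.0.0.1.sh"]))
```

With the original name the jail script sorts first ("j" comes before "s"); with the 000 prefix svscan sorts first because digits sort ahead of letters.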


-- 
Jon Simola
Systems Administrator
ABC Communications


Re: xl(4) polling

2005-05-11 Thread Jon Simola
On 5/10/05, Rob [EMAIL PROTECTED] wrote:
 Interestingly: HZ=1000 is apparently a problem with
 the xl devices (3Com 3c905B-TX), but not with the
 rl devices (RealTek 8139).
 What could cause that difference? Could a difference
 in buffer size on the LAN card cause this?

Yes. GigE cards tend to have larger packet buffers, but that certainly
doesn't solve all the problems. I've been having some problems with
the em cards in particular (fxp I've had no problems with) as no
matter what I've tried tuning (tcprecvspace, HZ, polling knobs) I've
been seeing packet loss of about 0.5%. That doesn't seem like much,
but it's an awful lot to the couple thousand users behind it.

Anyways, HZ=1000 shouldn't be a CPU problem on anything faster than a
500MHz-ish processor. There are also a few lightly documented sysctls
that might be worth playing with; they do things like poll during the
idle loop (actual usefulness in any particular case may be void in
your area, many will enter, few will win).
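
To see why buffer size and HZ interact at all, here is a back-of-the-envelope sketch. It assumes the card is only serviced once per tick and that traffic arrives at full line rate, which is a worst case. It says nothing about the actual buffer sizes of the xl or rl cards, which are not given in this thread.

```python
def bytes_between_polls(link_bits_per_sec, hz):
    """Worst-case bytes arriving at line rate during one 1/hz polling interval."""
    return link_bits_per_sec // 8 // hz


FAST_ETHERNET = 100_000_000  # 100 Mbit/s, the speed class of the 3c905B-TX and 8139

for hz in (100, 1000):
    print(f"HZ={hz}: up to {bytes_between_polls(FAST_ETHERNET, hz)} bytes per tick")
```

Whatever arrives between two polls has to fit in the card's receive buffering, so a longer interval (lower HZ) needs more of it.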

-- 
Jon Simola
Systems Administrator
ABC Communications


Re: Bad disk or kernel (ATA Driver) problem?

2005-01-19 Thread Jon Simola
On Wed, 19 Jan 2005 15:13:01 -0600, Karl Denninger [EMAIL PROTECTED] wrote:

 I've got a 2 x SATA system here I'm playing with in preparation to move
 over production to 5.x.
 
 These drives have been working under 4.x for quite some time - they're 250GB
 Maxtor disks
 
 ad4: 239372MB Maxtor 6B250S0/BANC1980 [486344/16/63] at ata2-master SATA150
 ad6: 239372MB Maxtor 6B250S0/BANC1980 [486344/16/63] at ata3-master SATA150

 The first disk runs nice and happy.
 
 The second does too, provided that the load isn't too high.  If it is, then I
 start to get DMA transfer errors, such as the following:
 
 ad6: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE>
 LBA=543191
 GEOM_MIRROR: Request failed (error=5). ad6[READ(offset=278048256,
 length=102400)]
 GEOM_MIRROR: Device m0: provider ad6 disconnected.
 ad6: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE>
 LBA=300463
 ad6: TIMEOUT - READ_DMA retrying (2 retries left) LBA=90863
 ad6: FAILURE - READ_DMA timed out
 ad6: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE>
 ad6: TIMEOUT - READ_DMA retrying (2 retries left) LBA=120663
 ad6: FAILURE - READ_DMA timed out
 
 I'm having a lot of trouble believing this is an actual disk problem.  Among
 other things, it's happening at different places - not always at the same
 block.

I've got a few 1U Supermicro boxes running dual SATA drives:
ad4: 78167MB Maxtor 6Y080M0/YAR51HW0 [158816/16/63] at ata2-master SATA150
ad6: 78167MB Maxtor 6Y080M0/YAR51HW0 [158816/16/63] at ata3-master SATA150

I've run into all sorts of problems with every one, and changing the
IDE channel settings in the BIOS always fixes it. Which really annoys
me, because I set up a new box, run it for a couple weeks, then the
drives start getting flaky under load. Then I go change the setting in
the BIOS (that I always forget to do on initial setup) and it's dead
stable for months at a time.

I've had the exact same problem with FreeBSD 5.3 and OpenBSD 3.5 as well.
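
One quick way to check the "different places" observation quoted above is to count how often each LBA shows up in the error lines. A minimal sketch follows. It assumes each log entry has already been joined back into a single line, and the function name is made up. It doesn't prove anything either way, it just makes the pattern visible.

```python
import re
from collections import Counter

LBA_RE = re.compile(r"\bLBA=(\d+)")


def lba_counts(lines):
    """Count how often each LBA appears in FAILURE/TIMEOUT lines."""
    counts = Counter()
    for line in lines:
        if "FAILURE" not in line and "TIMEOUT" not in line:
            continue
        m = LBA_RE.search(line)
        if m:  # some FAILURE lines ("timed out") carry no LBA at all
            counts[int(m.group(1))] += 1
    return counts


# Lines taken from the quoted log, unwrapped.
log = [
    "ad6: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=543191",
    "GEOM_MIRROR: Device m0: provider ad6 disconnected.",
    "ad6: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=300463",
    "ad6: TIMEOUT - READ_DMA retrying (2 retries left) LBA=90863",
    "ad6: FAILURE - READ_DMA timed out",
    "ad6: TIMEOUT - READ_DMA retrying (2 retries left) LBA=120663",
]

counts = lba_counts(log)
print(len(counts), "distinct LBAs,", max(counts.values()), "max hits on one LBA")
```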


Re: Bad disk or kernel (ATA Driver) problem?

2005-01-19 Thread Jon Simola
On Wed, 19 Jan 2005 13:33:12 -0800, Jon Simola [EMAIL PROTECTED] wrote:

 I've got a few 1U Supermicro boxes running dual SATA drives:
 I've run into all sorts of problems with every one, and changing the
 IDE channel settings in the BIOS always fixes it. Which really annoys
 me, because I set up a new box, run it for a couple weeks, then the
 drives start getting flaky under load. Then I go change the setting in
 the BIOS (that I always forget to do on initial setup) and it's dead
 stable for months at a time.

I was politely asked to actually dig up the settings, which cut
through my lack of sleep. I should have done this earlier :)

On this one box (Supermicro SuperServer 5013C-T, P4SCE BIOS v1.2c):
5.2.1-RELEASE-p4
atapci0: Intel ICH5 SATA150 controller port
0xf000-0xf00f,0-0x3,0-0x7,0-0x3,0-0x7 irq 16 at device 31.2 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata0: [MPSAFE]
ata1: at 0x170 irq 15 on atapci0
ata1: [MPSAFE]
GEOM: create disk ad0 dp=0xc671a560
ad0: 70911MB WDC WD740GD-00FLA0 [144073/16/63] at ata0-master UDMA100
GEOM: create disk ad1 dp=0xc671a460
ad1: 70911MB WDC WD740GD-00FLA0 [144073/16/63] at ata0-slave UDMA100
acd0: CDROM CD-224E at ata1-master PIO4

That's a pair of SATA 74GB WD Raptors. The BIOS IDE setting is
Combined, so the SATA drives appear on the Primary IDE channel.


On a different box (Supermicro SuperServer 5013C-T, P4SCE BIOS v1.2c):
5.3-STABLE-20050107
atapci0: Intel ICH5 UDMA100 controller port
0xf000-0xf00f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 31.1 on
pci0
ata0: channel #0 on atapci0
ata1: channel #1 on atapci0
atapci1: Intel ICH5 SATA150 controller port
0xd000-0xd00f,0xcc00-0xcc03,0xc800-0xc807,0xc400-0xc403,0xc000-0xc007
irq 18 at device 31.2 on pci0
ata2: channel #0 on atapci1
ata3: channel #1 on atapci1
acd0: CDROM CD-224E/1.9A at ata1-master UDMA33
ad4: 78167MB Maxtor 6Y080M0/YAR51HW0 [158816/16/63] at ata2-master SATA150
ad6: 78167MB Maxtor 6Y080M0/YAR51HW0 [158816/16/63] at ata3-master SATA150

A pair of Maxtor 80GBs, the BIOS is set for Enhanced, up to 6 drives
(4 IDE + 2 SATA).
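
The difference between the two boxes is visible in the probe lines themselves: in Combined mode the Raptors attach to ata0 as UDMA100, while in Enhanced mode the Maxtors attach to ata2/ata3 as SATA150. A small sketch for pulling those fields out of dmesg-style lines follows. The regex is written against the line format shown in this mail only, and the function name is made up.

```python
import re

# Matches lines such as:
#   ad4: 78167MB Maxtor 6Y080M0/YAR51HW0 [158816/16/63] at ata2-master SATA150
PROBE_RE = re.compile(
    r"^(?P<dev>ad\d+): (?P<size_mb>\d+)MB (?P<model>.*) "
    r"\[(?P<chs>\d+/\d+/\d+)\] at (?P<chan>ata\d+)-(?P<role>master|slave) (?P<mode>\S+)$"
)


def parse_probe(line):
    """Return the fields of one disk probe line as a dict, or None if it does not match."""
    m = PROBE_RE.match(line.strip())
    if m is None:
        return None
    info = m.groupdict()
    info["size_mb"] = int(info["size_mb"])
    info["model"] = info["model"].strip("<>")  # tolerate <model> style output
    return info


lines = [
    "ad0: 70911MB WDC WD740GD-00FLA0 [144073/16/63] at ata0-master UDMA100",
    "ad4: 78167MB Maxtor 6Y080M0/YAR51HW0 [158816/16/63] at ata2-master SATA150",
    "acd0: CDROM CD-224E at ata1-master PIO4",  # not a disk line, skipped
]

for line in lines:
    info = parse_probe(line)
    if info:
        print(info["dev"], info["chan"], info["role"], info["mode"])
```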


Crazy as it seems, I wasn't kidding about changing the BIOS.
The other 2 settings are SATA only and Auto. When the drives
started flaking out (timeouts on reads) I would go into the BIOS and
cycle through the settings. After changing it once or twice,
things would be fine for months at a time.

My best guess is that something makes the ICH5 a little flaky,
and twiddling the BIOS clears it somehow. My only evidence supporting
that is that twice the BIOS stalled on probing the drives once this
error had happened, and I had to physically remove the drives, twiddle
the BIOS settings, and put the drives back before it would work again.

On OpenBSD, this problem on the same hardware manifests as a read
timeout failure during the initial boot probes. Same fix, play with
the BIOS and it suddenly works. There's a term in the Jargon file for
this, but I can't recall it at the moment.


Re: ALTQ patch for if_vlan.c

2005-01-11 Thread Jon Simola
On Wed, 5 Jan 2005 12:51:56 -0800, Brooks Davis
[EMAIL PROTECTED] wrote:

 ALTQ makes no sense on virtual interfaces.  ALTQ works by providing
 fine-grained control of the dequeueing of packets on to the wire.  It's
 too early to do this when you're still in the virtual interface.

PF does not have any access to traffic on the vlan parent interface.
By my reading of the source, outbound traffic -> PF -> vlan ->
ether_output on the parent.

This seems accurate as there are no packets leaving on the vlan parent
(em1 in my case):
bash-3.00# pfctl -vvs rules
@0 pass in quick on em1 all
  [ Evaluations: 749738  Packets: 0  Bytes: 0  States: 0 ]
@1 pass out quick on em1 all
  [ Evaluations: 0 Packets: 0 Bytes: 0   States: 0 ]



I've had this patch running for a few hours now and it certainly seems
to accomplish what I was looking to do (throttle DSL customers at my
router):

# pfctl -vs rules
pass out quick on vlan130 from any to <throttled_ips> keep state queue throttle_130
  [ Evaluations: 249230  Packets: 6552  Bytes: 2443357  States: 554 ]

# pfctl -vs queue
queue  throttle_130 bandwidth 64Kb cbq( red )
  [ pkts:   1062  bytes: 348272  dropped pkts:   1588 bytes: 870884 ]
  [ qlength:  18/ 50  borrows:  0  suspends:  105 ]
  [ measured:  23.2 packets/s, 55.08Kb/s ]
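
To put numbers on the queue output above, here is a minimal sketch that pulls out the drop share and how close the measured rate is to the configured limit. It assumes that "pkts" counts packets actually sent and "dropped pkts" counts the rest, which is my reading of the output and not something checked against the pfctl source. The function name is made up.

```python
import re

BANDWIDTH_RE = re.compile(r"bandwidth\s+([\d.]+)Kb")
PKTS_RE = re.compile(
    r"pkts:\s*(\d+)\s+bytes:\s*(\d+)\s+dropped pkts:\s*(\d+)\s+bytes:\s*(\d+)"
)
RATE_RE = re.compile(r"measured:\s*([\d.]+) packets/s, ([\d.]+)Kb/s")


def summarize(text):
    """Return drop share and limit utilization for one queue's stats block."""
    sent_pkts, _, dropped_pkts, _ = map(int, PKTS_RE.search(text).groups())
    limit_kb = float(BANDWIDTH_RE.search(text).group(1))
    measured_kb = float(RATE_RE.search(text).group(2))
    return {
        # Assumption: sent and dropped are disjoint counters.
        "drop_share": dropped_pkts / (sent_pkts + dropped_pkts),
        "utilization": measured_kb / limit_kb,
    }


sample = """\
queue  throttle_130 bandwidth 64Kb cbq( red )
  [ pkts:   1062  bytes: 348272  dropped pkts:   1588 bytes: 870884 ]
  [ qlength:  18/ 50  borrows:  0  suspends:105 ]
  [ measured:23.2 packets/s, 55.08Kb/s ]
"""

stats = summarize(sample)
print(f"drop share {stats['drop_share']:.1%}, utilization {stats['utilization']:.1%}")
```

A high drop share next to a measured rate close to the limit is what you would expect from a queue that is actually throttling.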


 You can tag packets appropriately at this point, but the actual ALTQ queue
 needs to be on a physical interface.

I don't see any way to accomplish this, and my experiments were all
in vain until I patched ALTQ support into if_vlan.

 FYI, spl*() functions are all no-ops now.  We just have them around to
 remind us that we need to lock certain functions and to document what
 was protected before.

Thanks, good to know. I'm learning a lot about the kernel as I go.


ALTQ patch for if_vlan.c

2005-01-05 Thread Jon Simola
I just whipped up this against
5.3-STABLE #1: Wed Dec 22 17:11:02 PST 2004

Would someone who knows a bit more about this than myself give it a
quick lookover and see if it appears sane? I'm mostly wondering about
the splimp() and splx() and whether it's required or excessive due to
the mtx_lock/unlock in the VLAN_LOCK/UNLOCK macros.

Due to a lack of equipment it's difficult for me to run a separate
test environment, so any sort of review would be appreciated.

--- sys/net/if_vlan.c.orig  Wed Jan  5 12:25:19 2005
+++ sys/net/if_vlan.c   Wed Jan  5 12:53:45 2005
@@ -379,7 +379,10 @@
    ifp->if_init = vlan_ifinit;
    ifp->if_start = vlan_start;
    ifp->if_ioctl = vlan_ioctl;
-   ifp->if_snd.ifq_maxlen = ifqmaxlen;
+   IFQ_SET_MAXLEN(&ifp->if_snd, ifqmaxlen);
+   ifp->if_snd.ifq_drv_maxlen = 0;
+   IFQ_SET_READY(&ifp->if_snd);
+
    ether_ifattach(ifp, ifv->ifv_ac.ac_enaddr);
    /* Now undo some of the damage... */
    ifp->if_baudrate = 0;
@@ -423,11 +426,15 @@
 {
    int unit;
    struct ifvlan *ifv = ifp->if_softc;
+   int s;

    unit = ifp->if_dunit;

    VLAN_LOCK();
    LIST_REMOVE(ifv, ifv_list);
+   s = splimp();
+   IFQ_PURGE(&ifp->if_snd);
+   splx(s);
    vlan_unconfig(ifp);
    VLAN_UNLOCK();

@@ -458,12 +465,22 @@
    struct mbuf *m;
    int error;

+   if (ALTQ_IS_ENABLED(&ifp->if_snd)) {
+       IFQ_LOCK(&ifp->if_snd);
+       IFQ_POLL_NOLOCK(&ifp->if_snd, m);
+       if (m == NULL ) {
+           IFQ_UNLOCK(&ifp->if_snd);
+           return;
+       }
+       IFQ_UNLOCK(&ifp->if_snd);
+   }
+
    ifv = ifp->if_softc;
    p = ifv->ifv_p;

    ifp->if_flags |= IFF_OACTIVE;
    for (;;) {
-       IF_DEQUEUE(&ifp->if_snd, m);
+       IFQ_DEQUEUE(&ifp->if_snd, m);
        if (m == 0)
            break;
        BPF_MTAP(ifp, m);