Re: ATA Woes.
On 7/19/05, Tony Byrne [EMAIL PROTECTED] wrote:
> Jul 19 13:01:48 roo kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=288810495
> Jul 19 13:01:59 roo kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=1<ILLEGAL_LENGTH> LBA=288810495
> Jul 19 13:02:05 roo kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=288810495
> Jul 19 13:02:16 roo kernel: ad0: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=288810495
> Jul 19 13:04:36 roo last message repeated 4 times
>
> I'm totally confused. I don't know enough about SMART to know whether I'm looking at real failing drives or some bug exposed by the interaction between drive firmware, hd controller and FreeBSD.

What I've recently learned the hard way is that desktop drives have no place in a server. I've now failed 4 of 10 SATA drives (Maxtor and WD) in 1U rackmounts, and am moving on to trying the WD Raptor SATA drives (which claim to be low-end server).

--
Jon Simola
Systems Administrator
ABC Communications
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: ATA Woes.
On 7/19/05, Wilko Bulte [EMAIL PROTECTED] wrote:
> On Tue, Jul 19, 2005 at 11:22:01AM -0700, Jon Simola wrote..
> > I've now failed 4 of 10 SATA drives (Maxtor and WD) in 1U rackmounts, and am moving on to trying the WD Raptor SATA drives (which claim to be low-end server).
>
> Properly cooled?

Yeah, they're in the Supermicro 811 chassis with hotswap SATA sleds. There's a decent amount of air flowing over the drives, and SMART says they're running about 26C. Compare that to my 10Krpm SCSI array, which I've burned my fingers on frequently.

--
Jon Simola
Systems Administrator
ABC Communications
Re: atacontrol raid1 vs. gmirror
On 6/11/05, Paul Mather [EMAIL PROTECTED] wrote:
> I found array rebuilding to be troublesome on atacontrol RAID

Some bits from the in-house documentation I've been writing. I've tested this on multiple occasions on my 1U Supermicro SATA boxes, so it might be of interest to someone.

* RAID setup: (If required, FreeBSD only)
  - ensure that the drives are probed as ad4 and ad6 like:
      ad4: 76324MB [155072/16/63] at ata2-master SATA150
      ad6: 76324MB [155072/16/63] at ata3-master SATA150
  - if they do not probe correctly, check the BIOS settings above
  - perform a minimal install of FreeBSD 5.3 (do not worry about network or anything)
  - reboot from the installed OS and login as root
  - Run the command
      atacontrol create RAID1 ad4 ad6
    to create the raid set
  - Reboot and reinstall the OS, choosing ar0 as the drive, which should probe like:
      ad4: 76324MB [155072/16/63] at ata2-master SATA150
      ad6: 76324MB [155072/16/63] at ata3-master SATA150
      ar0: 76324MB [9730/255/63] status: READY subdisks:
        disk0 READY on ad4 at ata2-master
        disk1 READY on ad6 at ata3-master

Minimal Survival for FreeBSD software RAID1 sets

* Read the atacontrol man page
* atacontrol status ar0 - to check the status
* atacontrol detach 2 - to detach ad4 if failed (again, use 3 for ad6). The SATA disks in the 5013C-T chassis are hotswappable, so the disk can be pulled once detached.
* atacontrol attach 2 - to reattach ad4 once replaced
* atacontrol addspare ar0 ad4 - to add the replaced ad4 as a spare on the RAID set
* atacontrol rebuild ar0 - to rebuild the mirror

--
Jon Simola
Systems Administrator
ABC Communications
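For anyone who wants the replacement steps above as a checklist they can paste into a runbook, here is a rough Python sketch that only builds the command strings in order and never runs them. The `CHANNEL` mapping covers just the two disks mentioned above (2 for ad4, 3 for ad6), and the function name is made up for illustration.

```python
# Dry-run helper: builds the atacontrol commands from the survival list
# above, in order. It does not execute anything.

# Channel numbers as given in the post: detach/attach 2 for ad4, 3 for ad6.
# Other disks are deliberately not guessed at.
CHANNEL = {"ad4": 2, "ad6": 3}


def replacement_commands(array, disk):
    """Return the command sequence for swapping a failed mirror member."""
    if disk not in CHANNEL:
        raise ValueError("no known channel number for %s" % disk)
    channel = CHANNEL[disk]
    return [
        "atacontrol status %s" % array,              # confirm which disk failed
        "atacontrol detach %d" % channel,            # detach, then pull the disk
        "atacontrol attach %d" % channel,            # after the new disk is in
        "atacontrol addspare %s %s" % (array, disk), # add it back as a spare
        "atacontrol rebuild %s" % array,             # rebuild the mirror
    ]


if __name__ == "__main__":
    for command in replacement_commands("ar0", "ad4"):
        print(command)
```

Keeping it as a dry run is deliberate: the physical disk swap happens between the detach and attach steps, so the sequence cannot be run unattended anyway.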
Re: Using jails and djbdns
On 5/12/05, Tony Arcieri [EMAIL PROTECTED] wrote:
> Is there some easy way to reverse this order, so svscan is started first and jails started afterward?

Rename the svscan.sh script to 000svscan.sh so that it shows up first in a directory listing and is run before the jail scripts (jail-x.x.x.x.sh in my case). The rc(8) man page has a lot of good points to read through if you have any further questions.

--
Jon Simola
Systems Administrator
ABC Communications
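To see why the rename works, here is a small Python sketch that orders script names the way a plain sorted directory listing would. It assumes the startup scripts are run in simple byte-wise name order; the jail script names use made-up documentation addresses in place of the x.x.x.x above.

```python
def startup_order(names):
    """Order script names as a plain (byte-wise) sorted listing would."""
    # sorted() on str compares code points, which matches a C-locale
    # listing for these ASCII names. Digits sort before letters.
    return sorted(names)


# Hypothetical directory contents before and after the rename.
before = ["svscan.sh", "jail-192.0.2.10.sh", "jail-192.0.2.11.sh"]
after = ["000svscan.sh", "jail-192.0.2.10.sh", "jail-192.0.2.11.sh"]

if __name__ == "__main__":
    print(startup_order(before))  # svscan.sh ends up last
    print(startup_order(after))   # 000svscan.sh ends up first
```

Any prefix that sorts before "j" would do; the leading zeros just leave room to slot other scripts in later.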
Re: xl(4) polling
On 5/10/05, Rob [EMAIL PROTECTED] wrote:
> Interestingly: HZ=1000 is apparently a problem with the xl devices (3Com 3c905B-TX), but not with the rl devices (RealTek 8139). What could cause that difference? Could a difference in buffer size on the LAN card cause this?

Yes. GigE cards tend to have larger packet buffers, but that certainly doesn't solve all the problems. I've been having some problems with the em cards in particular (fxp I've had no problems with): no matter what I've tried tuning (tcprecvspace, HZ, polling knobs), I've been seeing packet loss of about 0.5%. That doesn't seem like much, but it's an awful lot to the couple thousand users behind it.

Anyways, HZ=1000 shouldn't be a CPU problem on anything faster than a 500MHz-ish processor. There are also a few lightly documented sysctls that might be useful to play with that do things like poll during the idle loop (actual usefulness in any particular case may be void in your area, many will enter, few will win).

--
Jon Simola
Systems Administrator
ABC Communications
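Some back-of-envelope arithmetic on why card buffer size matters under polling. This Python sketch assumes the card is serviced exactly once per clock tick (ignoring idle-loop polling) and a worst case of back-to-back minimum-size Ethernet frames; the function name and defaults are made up for illustration.

```python
# Ethernet worst case: a 64-byte minimum frame plus 8 bytes of preamble
# and 12 bytes of inter-frame gap occupies 84 bytes on the wire.
WIRE_OVERHEAD_BYTES = 20


def frames_per_tick(link_bps, hz, frame_bytes=64):
    """Frames that can arrive between two polls, one poll per clock tick."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    frames_per_second = link_bps / bits_per_frame
    return frames_per_second / hz


if __name__ == "__main__":
    for name, bps in (("100Mbit", 100e6), ("GigE", 1e9)):
        for hz in (100, 1000):
            n = frames_per_tick(bps, hz)
            print("%-8s HZ=%-5d up to %.0f frames per tick" % (name, hz, n))
```

Under these assumptions a 100Mbit card has to hold roughly 150 small frames between polls at HZ=1000, and ten times that at HZ=100, so a card with a small receive ring can drop packets where one with a larger ring does not.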
Re: Bad disk or kernel (ATA Driver) problem?
On Wed, 19 Jan 2005 15:13:01 -0600, Karl Denninger [EMAIL PROTECTED] wrote:
> I've got a 2 x SATA system here I'm playing with in preparation to move over production to 5.x. These drives have been working under 4.x for quite some time - they're 250GB Maxtor disks
>
> ad4: 239372MB <Maxtor 6B250S0/BANC1980> [486344/16/63] at ata2-master SATA150
> ad6: 239372MB <Maxtor 6B250S0/BANC1980> [486344/16/63] at ata3-master SATA150
>
> The first disk runs nice and happy. The second does too, provided that the load isn't too high. If it is, then I start to get DMA transfer errors, such as the following:
>
> ad6: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=543191
> GEOM_MIRROR: Request failed (error=5). ad6[READ(offset=278048256, length=102400)]
> GEOM_MIRROR: Device m0: provider ad6 disconnected.
> ad6: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=300463
> ad6: TIMEOUT - READ_DMA retrying (2 retries left) LBA=90863
> ad6: FAILURE - READ_DMA timed out
> ad6: FAILURE - READ_DMA status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE>
> ad6: TIMEOUT - READ_DMA retrying (2 retries left) LBA=120663
> ad6: FAILURE - READ_DMA timed out
>
> I'm having a lot of trouble believing this is an actual disk problem. Among other things, it's happening at different places - not always at the same block.

I've got a few 1U Supermicro boxes running dual SATA drives:

ad4: 78167MB <Maxtor 6Y080M0/YAR51HW0> [158816/16/63] at ata2-master SATA150
ad6: 78167MB <Maxtor 6Y080M0/YAR51HW0> [158816/16/63] at ata3-master SATA150

I've run into all sorts of problems with every one, and changing the IDE channel settings in the BIOS always fixes it. Which really annoys me, because I set up a new box, run it for a couple weeks, then the drives start getting flaky under load. Then I go change the setting in the BIOS (that I always forget to do on initial setup) and it's dead stable for months at a time. I've had the exact same problem with FreeBSD 5.3 and OpenBSD 3.5 as well.
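For sorting through a pile of these messages, here is a rough Python sketch that decodes the status and error bytes of a FAILURE line and tallies which LBAs are affected, which helps answer the "same block or different places" question. READY, DSC, ERROR, UNCORRECTABLE and ILLEGAL_LENGTH are the labels that appear in the logs in this thread; the remaining bit names are ATA register abbreviations picked for this sketch, and the function names are made up.

```python
import re
from collections import Counter

# Bit tables for the ATA status and error registers. Names not seen in
# the quoted logs are illustrative abbreviations, not driver output.
STATUS_BITS = {0x80: "BUSY", 0x40: "READY", 0x20: "DF", 0x10: "DSC",
               0x08: "DRQ", 0x04: "CORR", 0x02: "IDX", 0x01: "ERROR"}
ERROR_BITS = {0x80: "ICRC", 0x40: "UNCORRECTABLE", 0x20: "MC", 0x10: "IDNF",
              0x08: "MCR", 0x04: "ABRT", 0x02: "NM", 0x01: "ILLEGAL_LENGTH"}

# The hex value is lowercase and the flag list that follows is skipped,
# whether or not it is wrapped in angle brackets. LBA is optional.
FAILURE_RE = re.compile(
    r"(?P<dev>\w+): FAILURE - (?P<cmd>\w+) "
    r"status=(?P<status>[0-9a-f]+)\S* "
    r"error=(?P<error>[0-9a-f]+)\S*"
    r"(?: LBA=(?P<lba>\d+))?"
)


def decode_bits(value, table):
    """Return the names of the bits set in value, highest bit first."""
    return [name for bit, name in sorted(table.items(), reverse=True)
            if value & bit]


def parse_failure(line):
    """Parse one FAILURE line; return None for lines without status/error."""
    m = FAILURE_RE.search(line)
    if m is None:
        return None
    lba = m.group("lba")
    return {
        "dev": m.group("dev"),
        "cmd": m.group("cmd"),
        "status": decode_bits(int(m.group("status"), 16), STATUS_BITS),
        "error": decode_bits(int(m.group("error"), 16), ERROR_BITS),
        "lba": int(lba) if lba is not None else None,
    }


def lba_histogram(lines):
    """Count failures per LBA across a sequence of log lines."""
    parsed = (parse_failure(line) for line in lines)
    return Counter(p["lba"] for p in parsed if p and p["lba"] is not None)
```

One LBA showing up over and over points at a bad sector; failures scattered across many LBAs, as in the log quoted above, make a controller, cabling or driver problem more plausible.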
Re: Bad disk or kernel (ATA Driver) problem?
On Wed, 19 Jan 2005 13:33:12 -0800, Jon Simola [EMAIL PROTECTED] wrote:
> I've got a few 1U Supermicro boxes running dual SATA drives:
>
> I've run into all sorts of problems with every one, and changing the IDE channel settings in the BIOS always fixes it. Which really annoys me, because I setup a new box, run it for a couple weeks, then the drives start getting flaky under load. Then I go change the setting in the BIOS (that I always forget to do on initial setup) and it's dead stable for months at a time.

I was politely asked to actually dig up the settings, which cut through my lack of sleep. I should have done this earlier :)

On this one box (Supermicro SuperServer 5013C-T, P4SCE BIOS v1.2c):

5.2.1-RELEASE-p4
atapci0: <Intel ICH5 SATA150 controller> port 0xf000-0xf00f,0-0x3,0-0x7,0-0x3,0-0x7 irq 16 at device 31.2 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata0: [MPSAFE]
ata1: at 0x170 irq 15 on atapci0
ata1: [MPSAFE]
GEOM: create disk ad0 dp=0xc671a560
ad0: 70911MB <WDC WD740GD-00FLA0> [144073/16/63] at ata0-master UDMA100
GEOM: create disk ad1 dp=0xc671a460
ad1: 70911MB <WDC WD740GD-00FLA0> [144073/16/63] at ata0-slave UDMA100
acd0: CDROM <CD-224E> at ata1-master PIO4

That's a pair of SATA 74GB WD Raptors. The BIOS IDE setting is for Combined - SATA drives will appear on the Primary IDE channel.

On a different box (Supermicro SuperServer 5013C-T, P4SCE BIOS v1.2c):

5.3-STABLE-20050107
atapci0: <Intel ICH5 UDMA100 controller> port 0xf000-0xf00f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 31.1 on pci0
ata0: channel #0 on atapci0
ata1: channel #1 on atapci0
atapci1: <Intel ICH5 SATA150 controller> port 0xd000-0xd00f,0xcc00-0xcc03,0xc800-0xc807,0xc400-0xc403,0xc000-0xc007 irq 18 at device 31.2 on pci0
ata2: channel #0 on atapci1
ata3: channel #1 on atapci1
acd0: CDROM <CD-224E/1.9A> at ata1-master UDMA33
ad4: 78167MB <Maxtor 6Y080M0/YAR51HW0> [158816/16/63] at ata2-master SATA150
ad6: 78167MB <Maxtor 6Y080M0/YAR51HW0> [158816/16/63] at ata3-master SATA150

A pair of Maxtor 80GBs; the BIOS is set for Enhanced, up to 6 drives (4 IDE + 2 SATA).

Crazy as it seems, I wasn't kidding about changing the BIOS. The other 2 settings are SATA only and Auto. When the drives started flaking out (timeouts on reads) I would go into the BIOS and cycle through the BIOS settings. After changing it once or twice, things would be fine for months at a time.

My best suspicion is that something makes the ICH5 a little flaky, and twiddling the BIOS clears it somehow. My only evidence supporting that is that twice the BIOS stalled on probing the drives once this error had happened, and I had to physically remove the drives, twiddle the BIOS settings, and replace the drives before it would work again.

On OpenBSD, this problem on the same hardware manifests as a read timeout failure during the initial boot probes. Same fix: play with the BIOS and it suddenly works. There's a term in the Jargon file for this, but I can't recall it at the moment.
Re: ALTQ patch for if_vlan.c
On Wed, 5 Jan 2005 12:51:56 -0800, Brooks Davis [EMAIL PROTECTED] wrote:
> ALTQ makes no sense on virtual interfaces. ALTQ works by providing fine-grained control of the dequeueing of packets on to the wire. It's too early to do this when you're still in the virtual interface.

PF does not have any access to traffic on the vlan parent interface. By my reading of the source, outbound traffic -> PF -> vlan -> ether_output on the parent.

This seems accurate as there are no packets leaving on the vlan parent (em1 in my case):

bash-3.00# pfctl -vvs rules
@0 pass in quick on em1 all
  [ Evaluations: 749738 Packets: 0 Bytes: 0 States: 0 ]
@1 pass out quick on em1 all
  [ Evaluations: 0 Packets: 0 Bytes: 0 States: 0 ]

I've had this patch running for a few hours now and it certainly seems to accomplish what I was looking to do (throttle DSL customers at my router):

# pfctl -vs rules
pass out quick on vlan130 from any to <throttled_ips> keep state queue throttle_130
  [ Evaluations: 249230 Packets: 6552 Bytes: 2443357 States: 554 ]
# pfctl -vs queue
queue throttle_130 bandwidth 64Kb cbq( red )
  [ pkts: 1062 bytes: 348272 dropped pkts: 1588 bytes: 870884 ]
  [ qlength: 18/ 50 borrows: 0 suspends: 105 ]
  [ measured: 23.2 packets/s, 55.08Kb/s ]

> You can tag packets appropriately at this point, but the actual ALTQ queue needs to be on a physical interface.

I don't see any way to accomplish this, and my experimenting was in vain until I patched ALTQ into if_vlan.

> FYI, spl*() functions are all no-ops now. We just have them around to remind us that we need to lock certain functions and to document what was protected before.

Thanks, good to know. I'm learning a lot about the kernel as I go.
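To put a number on how hard the queue is throttling, here is a small Python sketch that pulls the counters out of the first stats line of the `pfctl -vs queue` output above and works out the share of packets dropped. It assumes `pkts` counts only packets that were actually sent, so the total offered is sent plus dropped; the function names are made up for illustration.

```python
import re

# Matches the first bracketed stats line of "pfctl -vs queue" as quoted
# above. Whitespace between fields is allowed to vary.
QUEUE_STATS_RE = re.compile(
    r"pkts:\s*(?P<pkts>\d+)\s+bytes:\s*(?P<bytes>\d+)\s+"
    r"dropped pkts:\s*(?P<dpkts>\d+)\s+bytes:\s*(?P<dbytes>\d+)"
)


def parse_queue_stats(line):
    """Return sent/dropped packet and byte counters, or None if no match."""
    m = QUEUE_STATS_RE.search(line)
    if m is None:
        return None
    return {key: int(value) for key, value in m.groupdict().items()}


def drop_fraction(stats):
    """Share of offered packets that were dropped (assumes pkts = sent)."""
    offered = stats["pkts"] + stats["dpkts"]
    return stats["dpkts"] / offered if offered else 0.0


if __name__ == "__main__":
    sample = "[ pkts: 1062 bytes: 348272 dropped pkts: 1588 bytes: 870884 ]"
    stats = parse_queue_stats(sample)
    print("dropped %.1f%% of offered packets" % (100 * drop_fraction(stats)))
```

With the numbers quoted above, more packets were dropped than sent, which is what you would expect from a 64Kb queue in front of customers trying to push far more than that.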
ALTQ patch for if_vlan.c
I just whipped up this against 5.3-STABLE #1: Wed Dec 22 17:11:02 PST 2004

Would someone who knows a bit more about this than myself give it a quick lookover and see if it appears sane? I'm mostly wondering about the splimp() and splx() and whether it's required or excessive due to the mtx_lock/unlock in the VLAN_LOCK/UNLOCK macros.

Due to a lack of equipment it's difficult for me to run a separate test environment, so any sort of review would be appreciated.

--- sys/net/if_vlan.c.orig	Wed Jan 5 12:25:19 2005
+++ sys/net/if_vlan.c	Wed Jan 5 12:53:45 2005
@@ -379,7 +379,10 @@
 	ifp->if_init = vlan_ifinit;
 	ifp->if_start = vlan_start;
 	ifp->if_ioctl = vlan_ioctl;
-	ifp->if_snd.ifq_maxlen = ifqmaxlen;
+	IFQ_SET_MAXLEN(&ifp->if_snd, ifqmaxlen);
+	ifp->if_snd.ifq_drv_maxlen = 0;
+	IFQ_SET_READY(&ifp->if_snd);
+
 	ether_ifattach(ifp, ifv->ifv_ac.ac_enaddr);
 	/* Now undo some of the damage... */
 	ifp->if_baudrate = 0;
@@ -423,11 +426,15 @@
 {
 	int unit;
 	struct ifvlan *ifv = ifp->if_softc;
+	int s;
 
 	unit = ifp->if_dunit;
 
 	VLAN_LOCK();
 	LIST_REMOVE(ifv, ifv_list);
+	s = splimp();
+	IFQ_PURGE(&ifp->if_snd);
+	splx(s);
 	vlan_unconfig(ifp);
 	VLAN_UNLOCK();
 
@@ -458,12 +465,22 @@
 	struct mbuf *m;
 	int error;
 
+	if (ALTQ_IS_ENABLED(&ifp->if_snd)) {
+		IFQ_LOCK(&ifp->if_snd);
+		IFQ_POLL_NOLOCK(&ifp->if_snd, m);
+		if (m == NULL ) {
+			IFQ_UNLOCK(&ifp->if_snd);
+			return;
+		}
+		IFQ_UNLOCK(&ifp->if_snd);
+	}
+
 	ifv = ifp->if_softc;
 	p = ifv->ifv_p;
 
 	ifp->if_flags |= IFF_OACTIVE;
 	for (;;) {
-		IF_DEQUEUE(&ifp->if_snd, m);
+		IFQ_DEQUEUE(&ifp->if_snd, m);
 		if (m == 0)
 			break;
 		BPF_MTAP(ifp, m);