watchdog questions
I need some help understanding FreeBSD's kernel watchdog functionality. I've been reading up, and here's what I think I understand (correct me if I'm wrong):

- If a watchdog timer is set in the kernel and not reset or disabled within the time given, the kernel reboots the system.
- 'watchdog -t n' starts a watchdog for n seconds. Running watchdog(8) again within n seconds resets the timer.
- If 'watchdog -t 0' is run, the kernel disables the watchdog.
- watchdogd(8) either runs stat(2) on /etc, or a user-defined cmd (with -e), and resets the watchdog only on a zero exit code.

There are a few things that aren't clear, though:

- How many watchdog timers can be enabled at a given time? If more than one, does a single 'watchdog -t 0' disable all timers?
- Upon timer expiration, can the kernel be configured to do anything OTHER than rebooting?
- Is it the general idea that watchdog(8) would be run in a script, making sure the script doesn't hang? And that watchdogd(8) is run to ensure the entire system doesn't hang?

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org
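The watchdog behaviour listed in the question above can be pictured with a small toy model. This is only a sketch of the semantics as the poster describes them (arm, re-arm, disable, re-arm only on a zero exit code), not the kernel's or watchdogd(8)'s actual implementation; the names `ToyWatchdog`, `run_guarded` and `check` are made up for illustration, and time is counted in abstract ticks.

```python
class ToyWatchdog:
    """Toy model of one countdown watchdog, mirroring the question's description."""

    def __init__(self):
        self.deadline = None  # None means the watchdog is disabled

    def arm(self, now, timeout):
        """Stand-in for 'watchdog -t n': n > 0 (re)arms, n == 0 disables."""
        self.deadline = now + timeout if timeout > 0 else None

    def expired(self, now):
        """True if the timer ran out without being re-armed or disabled."""
        return self.deadline is not None and now >= self.deadline


def run_guarded(wd, check, timeout, ticks):
    """Stand-in for the watchdogd(8) loop described in the question.

    `check` is a hypothetical callable playing the role of the stat(2) on
    /etc or the -e command; it returns an exit code, and the timer is only
    re-armed when that code is 0. Returns the tick at which the watchdog
    fired, or None if it never fired within `ticks`.
    """
    wd.arm(0, timeout)
    for now in range(1, ticks + 1):
        if wd.expired(now):
            return now
        if check(now) == 0:
            wd.arm(now, timeout)
    return None
```

With a check that always succeeds the model never fires; once the check starts failing, the model fires `timeout` ticks after the last successful re-arm.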
Broken drive geometry / partitions on 7.2 install
Hi all,

I was trying to install 7.2 RELEASE on top of a previous 6.4 RELEASE I'd set up (but not deployed). The server has a 40MB Intel service partition and the rest of the drive for FreeBSD.

Here's what greeted me when doing the fdisk from the install CD:

Disk name:      da0                                   FDISK Partition Editor
DISK Geometry:  2209 cyls/255 heads/63 sectors = 35487585 sectors (17327MB)

   Offset   Size(ST)        End  Name    PType  Desc               Subtype  Flags
        0         63         62  -          12  unused                   0
       63      64197      64259  da0as1      4  Compaq Diagnostic       18
    64260    3013499    3077758  -          12  unused                   0
  3077758      64197    3141955  da0cs1      4  Compaq Diagnostic       18
  3141956   32345629  354875584  -          12  unused                   0

It says there are 2 service partition slices (type 18) and no FreeBSD slice. Remember, I had successfully installed 6.4 on this drive and was able to boot into both the service partition and FreeBSD.

I ended up deleting all the partitions and recreating them by hand. I first created the service partition slice with a size of 80262 (which is what /sbin/fdisk under 6.4 reported), and the FBSD slice with a size of 35407260 (the remaining space).

After doing that, I was able to install 7.2 just fine and boot into it. I was also able to boot into the Intel service partition, since I hadn't blown over any of the original slice.
However, this is what I get from /usr/sbin/sysinstall's fdisk now:

Disk name:      da0                                   FDISK Partition Editor
DISK Geometry:  2209 cyls/255 heads/63 sectors = 35487585 sectors (17327MB)

   Offset   Size(ST)        End  Name    PType  Desc               Subtype  Flags
        0         63         62  -          12  unused                   0
       63      64197      64259  da0s1       4  Compaq Diagnostic       18
    64260   35423325   35487584  da0s2       8  freebsd                165

And /sbin/fdisk reports the same:

*** Working on device /dev/da0 ***
parameters extracted from in-core disklabel are:
cylinders=2209 heads=255 sectors/track=63 (16065 blks/cyl)

Figures below won't work with BIOS for partitions not in cyl 1
parameters to be used for BIOS calculations are:
cylinders=2209 heads=255 sectors/track=63 (16065 blks/cyl)

Media sector size is 512
Warning: BIOS sector numbering starts with sector 1
Information from DOS bootblock is:
The data for partition 1 is:
sysid 18 (0x12),(Compaq diagnostics)
    start 63, size 64197 (31 Meg), flag 0
        beg: cyl 0/ head 1/ sector 1;
        end: cyl 3/ head 254/ sector 63
The data for partition 2 is:
sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
    start 64260, size 35423325 (17296 Meg), flag 80 (active)
        beg: cyl 4/ head 0/ sector 1;
        end: cyl 1023/ head 254/ sector 63

Notice that I have only 2 slices, but the service partition slice is 64197 blocks instead of the 80262 I entered. On top of this, when I boot from the 7.2 install CD again, fdisk shows the same screwed-up setup with 2 Compaq Diag slices and no FBSD slice.

What on earth is happening? Is my drive geometry hosed? Is this some sort of weird LBA issue? I'm nervous about configuring and deploying this machine while it's acting like this. I also have an identical machine that's reporting the same thing.

Thanks.
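As a sanity check on the numbers in this post, the arithmetic behind the fdisk output can be reproduced directly. The sketch below only uses figures quoted above (geometry, slice starts and sizes, 512-byte sectors); the helper names are made up for illustration. It shows that both the hand-entered sizes and the sizes fdisk reports afterwards fill the disk exactly, and that 64197 and 80262 each end on a cylinder boundary under the 255 heads/63 sectors geometry (4 and 5 cylinders). That is an observation about the numbers, not a diagnosis of why they changed.

```python
SECTOR_BYTES = 512  # "Media sector size is 512" in the fdisk output


def total_sectors(cyls, heads, sectors_per_track):
    """Disk size in sectors implied by a CHS geometry."""
    return cyls * heads * sectors_per_track


def end_sector(start, size):
    """Last sector occupied by a slice that begins at `start`."""
    return start + size - 1


def whole_mb(sectors):
    """Size in megabytes, truncated the way the fdisk output appears to be."""
    return sectors * SECTOR_BYTES // (1024 * 1024)


disk = total_sectors(2209, 255, 63)
blks_per_cyl = 255 * 63

# (start, size) pairs as reported by /sbin/fdisk after the 7.2 install
slices_now = [(63, 64197), (64260, 35423325)]
# Sizes entered by hand: service slice, then FreeBSD slice
hand_sizes = (80262, 35407260)

reported_ends = [end_sector(start, size) for start, size in slices_now]
reported_mb = [whole_mb(size) for _, size in slices_now]

# Both layouts account for every sector once the 63-sector lead-in is added
used_now = 63 + sum(size for _, size in slices_now)
used_by_hand = 63 + sum(hand_sizes)

# Cylinders covered by the lead-in plus the service slice, and the remainder
cyls_now, rem_now = divmod(63 + 64197, blks_per_cyl)
cyls_hand, rem_hand = divmod(63 + 80262, blks_per_cyl)

print(disk, whole_mb(disk), blks_per_cyl)
print(reported_ends, reported_mb)
print(used_now, used_by_hand)
print((cyls_now, rem_now), (cyls_hand, rem_hand))
```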
/var or /usr for data?
It would appear that the proper allocation of filesystems on FreeBSD is to put all data in /usr. I'm used to this and have been doing it for years. However, there are a few issues that keep coming up.

A lot of the ports use /var for data dirs. MySQL, Qmail and dspam are a few that I've had issues with. Is there a canonical place to put data files on a modern FreeBSD server? Figuring out the sizes for each partition is an exercise in frustration when I don't know how big /var or /usr are going to grow.

For now, I've changed the default config files for MySQL and dspam to use /usr/local for data dirs, but is this the right thing to do?

I used to put everything on /, but that created problems when I couldn't fsck the single large partition and I had to boot from CD to fix things. That's an issue when the server's not in the same state.

A Solaris associate of mine is of the opinion that /usr should be able to be mounted RO for security purposes. If /var was the default for all add-ons and data, I could see that, but that wouldn't work the way things are now.
Xeon CPU temp
Wasn't sure if this would be better directed to -hardware.

I'm attempting to read the temperatures of my CPUs on my dual Xeon Tyan 2720 running 5.4-STABLE. According to the manual, the motherboard supports diagnostics via smbus, so I dutifully built a new kernel with the smbus and i2c options.

The docs say the Winbond 83782D is accessible on slave 0x29 for CPU fans, voltage and system temperature. The W83627HF at slave 0x2A has 3 additional chassis fan sensors. So far, no problem. I've been able to read these via healthd, xmbmon or lmmon. I've fiddled with them a bit to make sure they're looking at the right slave address, but other than that, reading the smbus makes sense.

Here's where I'm stuck. The manual says the Xeons have on-chip thermal sensors at slaves 0x18 and 0x19, both at bank 0 and register 0. When I try to read these, I get a 'Device not configured' error from the ioctl call.

I've come across a few refs to the hw.acpi.thermal.tz1.temperature sysctl, but that OID's apparently not available to me.

Any suggestions?
Building part of world
I'm trying to update my sys/pci/if_sk.c and would like to be able to build several versions without having to build the entire world. How would I do that?

Thanks,
Brad Waite
Re: Building part of world
In the last episode (Oct 29), Brad Waite said:

I'm trying to update my sys/pci/if_sk.c and would like to be able to build several versions without having to build the entire world.

Since that's a kernel driver, you only have to build a new kernel.

Heh. I realized that about 10 minutes after I posted the question. If I wasn't able to laugh at myself sometimes, I'd be in a heap of trouble.
Re: Building part of world
On 2004-10-29 13:37, Dan Nelson [EMAIL PROTECTED] wrote:

In the last episode (Oct 29), Brad Waite said:

I'm trying to update my sys/pci/if_sk.c and would like to be able to build several versions without having to build the entire world.

Since that's a kernel driver, you only have to build a new kernel.

An even better approach in the case of a single kernel driver is to leave it commented out in the kernel config file. Then it will be built as a module by default. After at least one buildworld/buildkernel cycle has finished correctly with this configuration, you can use the already populated /usr/obj tree to build just this module:

    # cd /usr/src/sys/i386/conf
    # config -g -d /usr/obj/usr/src/sys/MYKERNEL MYKERNEL
    # cd /usr/obj/usr/src/sys/MYKERNEL
    # make depend && make && make install

If you have only touched a single .c file, the 'make depend' step is AFAIK optional. The rest should finish pretty fast.

Brave people might even get away with building the sk module only, by emulating the specific part of the kernel build:

    # cd /usr/src/sys/modules/sk
    # env MAKEOBJDIRPREFIX=/tmp/sk \
        KMODDIR=/boot/kernel DEBUG_FLAGS=-g MACHINE=i386 \
        KERNBUILDDIR=/usr/obj/usr/src/sys/MYKERNEL make obj
    # env MAKEOBJDIRPREFIX=/tmp/sk \
        KMODDIR=/boot/kernel DEBUG_FLAGS=-g MACHINE=i386 \
        KERNBUILDDIR=/usr/obj/usr/src/sys/MYKERNEL make all

If all this works, you can just kldload the new if_sk.ko from `/tmp/sk/usr/src/sys/modules/sk' to test your changes.

HTH,
Giorgos

Wow, Giorgos, this really *does* help. It never dawned on me that FBSD even supported loadable kernel modules. Feel kinda sheepish now, but hey, I guess you learn something new every day.

In my stumbling around since you've enlightened me, I noticed a sk/ dir in /usr/src/sys/modules, and in there a Makefile. 'make install' apparently builds the .ko and installs it into /modules. Am I missing something here, or is this the way to go?
Brad
How do you increase the size of lost+found?
While trying to recover from a HD crash, 'fsck -y /dev/rad1s1a' reports the following error a number of times at the end of its run:

UNREF FILE I=3537799 OWNER=500 MODE=100644
SIZE=6611 MTIME=Oct 25 21:12 2003
RECONNECT? yes

SORRY. NO SPACE IN lost+found DIRECTORY

This tells me that it's not saving some of the files on the drive. Is that correct? Is there anything I can do to make more space in lost+found, either system-wide or while the fsck is running?

Some possibly pertinent info:

# ls -lad lost+found
drwxrwxrwt 1379 root wheel 182272 Apr 12 16:55 lost+found
# ls lost+found | wc -l
8899

This fs was copied, using dd, from a drive reporting 'hard error reading fsbn...' errors.

Thanks,
Brad Waite
Re: hard disk recover
[EMAIL PROTECTED] wrote:

I'm getting the dreaded

ad1s1a: hard error reading fsbn 524543 of 96-127 (ad1s1 bn 524543; cn 520 tn 6 sn 5) status=59 error=40

errors. Based on what I've read, it means my drive's going bye-bye. As it is, it won't even boot - fortunately I have another FBSD drive to boot from, and I get these errors while trying to fsck it.

Shame on me for not noticing the errors sooner and an even bigger shame for not having a proper backup. In any case, the milk is spilled and I need to mop it up as best I can.

While I can mount the partition, I can't cd to it (more hard errors...), and since fsck apparently isn't helping, what can I do to recover what's left?

I'm thinking dd's the tool to use, but I'm not really sure how to go about it. Here's what I get when I try to read from the beginning of the partition:

# dd if=/dev/ad1s1a bs=64k
dd: /dev/ad1s1a: Input/output error

However, when I add skip=1, the drive spits back data. That leads me to believe that if I skip over the bad sectors, I can read what's left.

I've got a spare drive I can use as a sandbox, but how should I dump the data? Should I label the second drive with the same partition size and dd if=/dev/ad1s1a of=/dev/ad2s1a? Is there any chance of recovering filesystem data going this route?

[Quoting myself as it's been 2 weeks since the first post]

Here's what's new:

ad0: 21557MB IBM-DJNA-372200 [43800/16/63] at ata0-master UDMA66
ad1: 39083MB Maxtor 5T040H4 [79408/16/63] at ata0-slave UDMA100
ad2: 29311MB Maxtor 5T030H3 [59554/16/63] at ata1-master UDMA100

ad2 is the 30GB drive reporting errors; ad1 is the new 40GB drive I copied the partition to.

I tried to fdisk the 40G to be identical to the 30G, but I could never get the size to match exactly. In the end, I just set up the 256M swap, and hoped the 524288 offset for the 'a' partition would work.
Here's the relevant disklabel output:

# disklabel -r /dev/ad1s1
# /dev/ad1s1:
type: ESDI
disk: ad0s1
label:
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 255
sectors/cylinder: 16065
cylinders: 4981
sectors/unit: 80035767
[...]
8 partitions:
#        size   offset   fstype   [fsize bsize bps/cpg]
  a: 79511479   524288   4.2BSD     2048 16384    89   # (Cyl.   32*- 4981*)
  b:   524288        0     swap                        # (Cyl.    0 - 32*)
  c: 80035767        0   unused        0     0         # (Cyl.    0 - 4981*)

# disklabel -r /dev/ad2s1
# /dev/ad2s1:
type: ESDI
disk: ad0s1
label:
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 16
sectors/cylinder: 1008
cylinders: 59553
sectors/unit: 60030369
[...]
8 partitions:
#        size   offset   fstype   [fsize bsize bps/cpg]
  a: 59506081   524288   4.2BSD     2048 16384    16   # (Cyl.  520*- 59553*)
  b:   524288        0     swap                        # (Cyl.    0 - 520*)
  c: 60030369        0   unused        0     0         # (Cyl.    0 - 59553*)

I used lewiz' suggestion to add 'conv=noerror,sync' to dd. I was able to copy the readable data from the bad drive to a new one. I changed it to bs=512b (redundant, I know) since if the old disk was bad on 512-byte block 0, I figured dd would skip to the next 64k. Here's what I used:

dd if=/dev/ad2s1a of=/dev/ad1s1a conv=noerror,sync bs=512b

Of course, I got about 165 'ad2s1a: hard error reading fsbn ...' errors, but it appeared to copy everything else okay.

The first 16 blocks of ad2s1a are null, but there are 16 blocks of data at block 32, so it appears the first backup superblock survived.

Is there a remote chance that I'll be able to fsck this fs and recover? I know that fsck will complain about the first alternate superblock not matching because the last superblock won't be in the first 30GB. Do the different sized partitions make this impossible?
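To make the effect of 'conv=noerror,sync' and the block size concrete, here is a simplified model of the copy loop. It is a sketch under stated assumptions, not dd's real implementation: a failed read is assumed to lose the whole input block, `sync` is modelled as padding every block to the full block size with NUL bytes, and `make_fake_disk` is a made-up stand-in for a drive with unreadable sectors. As far as the dd(1) manual describes it, a trailing `b` on a size multiplies it by 512, so `bs=512b` would mean 262144 bytes per block rather than 512; the model shows why that matters, since each unreadable sector then zeroes out its entire block in the copy.

```python
def copy_noerror_sync(read_block, total_bytes, block_size):
    """Copy block by block, continuing past read errors.

    Models `noerror` (keep going after a failed read) and `sync` (pad each
    input block to block_size with NULs). Returns the copied bytes and the
    list of block indexes that could not be read.
    """
    out = bytearray()
    bad_blocks = []
    for offset in range(0, total_bytes, block_size):
        try:
            data = read_block(offset, block_size)
        except OSError:
            data = b""  # assumption: nothing from a failed block is kept
            bad_blocks.append(offset // block_size)
        out += data.ljust(block_size, b"\0")
    return bytes(out), bad_blocks


def make_fake_disk(image, bad_sectors, sector_bytes=512):
    """Hypothetical test device: reads touching a bad sector raise OSError."""
    def read_block(offset, size):
        first = offset // sector_bytes
        last = (min(offset + size, len(image)) - 1) // sector_bytes
        if any(s in bad_sectors for s in range(first, last + 1)):
            raise OSError(5, "Input/output error")
        return image[offset:offset + size]
    return read_block


# dd's "b" suffix multiplies by 512, so bs=512b is this many bytes
BS_512B_BYTES = 512 * 512

# Eight 512-byte sectors of non-zero data with sector 2 unreadable
image = bytes([1]) * (8 * 512)
disk = make_fake_disk(image, bad_sectors={2})

small_copy, small_bad = copy_noerror_sync(disk, len(image), 512)
large_copy, large_bad = copy_noerror_sync(disk, len(image), 2048)

# Bytes zeroed in the copy for each block size
lost_small = small_copy.count(0)
lost_large = large_copy.count(0)
print(BS_512B_BYTES, small_bad, lost_small, large_bad, lost_large)
```

In this model a 512-byte block size loses only the bad sector, while a 2048-byte block size loses the three readable sectors that share its block as well, which is the trade-off to weigh when choosing bs for a recovery copy.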