watchdog questions

2009-05-05 Thread Brad Waite
I need some help understanding FreeBSD's kernel watchdog functionality.  I've
been reading up, and here's what I think I understand (correct me if I'm wrong):

If a watchdog timer is set in the kernel and not reset or disabled within the
time given, the kernel reboots the system.

'watchdog -t n' starts a watchdog for n seconds.  Running watchdog(8) again
within n seconds resets the timer.  If 'watchdog -t 0' is run, the kernel
disables the watchdog.

watchdogd(8) either runs stat(2) on /etc, or a user-defined cmd (with -e), and
resets the watchdog only on a zero exit code.
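
The behaviour described above (run a check, reset the timer only when the check exits with zero) can be sketched as follows. This restates the understanding in this post and is not taken from the watchdogd(8) source; `pat` is a hypothetical callback standing in for the real timer reset, and `rounds` exists only so the sketch terminates.

```python
import subprocess
import time


def check_passes(command):
    """Return True when the health-check command exits with status 0."""
    return subprocess.run(command, shell=True).returncode == 0


def pat_loop(command, pat, interval, rounds):
    """Run the check every `interval` seconds and call pat() only on success.

    `pat` is a hypothetical stand-in for resetting the kernel watchdog.
    Returns how many times the watchdog was patted.
    """
    pats = 0
    for _ in range(rounds):
        if check_passes(command):
            pat()
            pats += 1
        time.sleep(interval)
    return pats
```

If the check keeps failing, `pat` is never called, which is the situation in which the kernel timer would be left to expire.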

There are a few things that aren't clear, though:

How many watchdog timers can be enabled at a given time?  If more than one,
does a single 'watchdog -t 0' disable all timers?

Upon timer expiration, can the kernel be configured to do anything OTHER than
rebooting?

Is it the general idea that watchdog(8) would be run in a script, making sure
the script doesn't hang?  And that watchdogd(8) is run to ensure the entire
system doesn't hang?


___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org


Broken drive geometry / partitions on 7.2 install

2009-05-04 Thread Brad Waite
Hi all,

I was trying to install 7.2 RELEASE on top of a previous 6.4 RELEASE I'd set up
(but not deployed).  The server has a 40MB Intel service partition and the rest
of the drive for FreeBSD.  Here's what greeted me when doing the fdisk from the
install CD:



Disk name:      da0                                    FDISK Partition Editor
DISK Geometry:  2209 cyls/255 heads/63 sectors = 35487585 sectors (17327MB)

    Offset    Size(ST)         End  Name    PType  Desc               Subtype  Flags

         0          63          62  -          12  unused                   0
        63       64197       64259  da0as1      4  Compaq Diagnostic       18
     64260     3013499     3077758  -          12  unused                   0
   3077758       64197     3141955  da0cs1      4  Compaq Diagnostic       18
   3141956    32345629   354875584  -          12  unused                   0



It says there are 2 service partition slices (type 18) and no FreeBSD slice.
Remember, I had successfully installed 6.4 on this drive and was able to boot
into both the service partition and FreeBSD.

I ended up deleting all the partitions and recreating them by hand.  I first
created the service partition slice with a size of 80262 (which is what
/sbin/fdisk under 6.4 reported), and the FBSD slice with a size of 35407260
(the remaining space).

After doing that, I was able to install 7.2 just fine and boot into it.  I was
also able to boot into the Intel service partition, since I hadn't blown over
any of the original slice.

However, this is what I get from /usr/sbin/sysinstall's fdisk now:



Disk name:      da0                                    FDISK Partition Editor
DISK Geometry:  2209 cyls/255 heads/63 sectors = 35487585 sectors (17327MB)

    Offset    Size(ST)         End  Name    PType  Desc               Subtype  Flags

         0          63          62  -          12  unused                   0
        63       64197       64259  da0s1       4  Compaq Diagnostic       18
     64260    35423325    35487584  da0s2       8  freebsd                165



And /sbin/fdisk reports the same:



*** Working on device /dev/da0 ***
parameters extracted from in-core disklabel are:
cylinders=2209 heads=255 sectors/track=63 (16065 blks/cyl)

Figures below won't work with BIOS for partitions not in cyl 1
parameters to be used for BIOS calculations are:
cylinders=2209 heads=255 sectors/track=63 (16065 blks/cyl)

Media sector size is 512
Warning: BIOS sector numbering starts with sector 1
Information from DOS bootblock is:
The data for partition 1 is:
sysid 18 (0x12),(Compaq diagnostics)
start 63, size 64197 (31 Meg), flag 0
beg: cyl 0/ head 1/ sector 1;
end: cyl 3/ head 254/ sector 63
The data for partition 2 is:
sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
start 64260, size 35423325 (17296 Meg), flag 80 (active)
beg: cyl 4/ head 0/ sector 1;
end: cyl 1023/ head 254/ sector 63


Notice that I have only 2 slices, but the service partition slice is 64197
blocks instead of the 80262 I asked for.  On top of this, when I boot from the
7.2 install CD again, fdisk shows the same screwed-up layout: 2 Compaq Diag
slices and no FBSD slice.
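
As a sanity check, the numbers in the two fdisk outputs above can be cross-checked with a few lines of arithmetic. This is only a sketch using the reported geometry (2209/255/63); the helper names are made up for illustration, and it does not explain why the install CD shows a different layout.

```python
# Geometry reported by both fdisk outputs above.
CYLINDERS, HEADS, SECTORS = 2209, 255, 63
BLOCKS_PER_CYL = HEADS * SECTORS


def chs_to_block(cyl, head, sector):
    """Convert a CHS address to a 0-based block number (sectors count from 1)."""
    return (cyl * HEADS + head) * SECTORS + (sector - 1)


def last_block(start, size):
    """Last block of a slice that starts at `start` and spans `size` blocks."""
    return start + size - 1


def cylinder_boundary(block):
    """Return (cylinder, remainder); remainder 0 means a cylinder starts here."""
    return divmod(block, BLOCKS_PER_CYL)


print(CYLINDERS * BLOCKS_PER_CYL)      # 35487585, the reported disk size
print(chs_to_block(0, 1, 1))           # 63, start of slice 1
print(chs_to_block(3, 254, 63))        # 64259, end of slice 1
print(last_block(63, 64197))           # 64259 as well
print(chs_to_block(4, 0, 1))           # 64260, start of slice 2
print(last_block(64260, 35423325))     # 35487584, last block on the disk
print(cylinder_boundary(63 + 64197))   # (4, 0)
print(cylinder_boundary(63 + 80262))   # (5, 0)
```

Both the 64197-block slice on disk and the 80262-block slice that was requested end exactly on a cylinder boundary (cylinder 4 and cylinder 5 respectively), so the table itself is internally consistent under this geometry.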

What on earth is happening?  Is my drive geometry hosed?  Is this some sort of
weird LBA issue?  I'm nervous about configuring and deploying this machine
acting as it is.  I also have an identical machine that's reporting the same 
thing.

Thanks.


/var or /usr for data?

2007-08-22 Thread Brad Waite
It would appear that the proper allocation of filesystems on FreeBSD is
to put all data in /usr.  I'm used to this and have been doing it for
years.

However, there are a few issues that keep coming up.  A lot of the ports use
/var for data dirs.  MySQL, Qmail, dspam are a few that I've had issues
with.

Is there a canonical place to put data files on a modern FreeBSD server? 
Figuring out the sizes for each partition is an exercise in frustration
when I don't know how big /var or /usr are going to grow.

For now, I've changed the default config files for MySQL and dspam to use
/usr/local for data dirs, but is this the right thing to do?

I used to put everything on /, but that created problems when I couldn't
fsck the single large partition and I had to boot from CD to fix things. 
That's an issue when the server's not in the same state as I am.

A Solaris associate of mine is of the opinion that /usr should be able to
be mounted RO for security purposes.  If /var was the default for all
add-ons and data, I could see that, but that wouldn't work the way things
are now.



Xeon CPU temp

2006-02-11 Thread Brad Waite

Wasn't sure if this would be better directed to -hardware.

I'm attempting to read the temperatures of my CPUs on my dual Xeon Tyan 
2720 running 5.4-STABLE.


According to the manual, the motherboard supports diagnostics via 
smbus, so I dutifully built a new kernel with the smbus and i2c options. 
The docs say the Winbond 83782D is accessible at slave 0x29 for CPU fans, 
voltage and system temperature.  The W83627HF at slave 0x2A has 3 
additional chassis fan sensors.


So far, no problem.  I've been able to read these via healthd, xmbmon or 
lmmon.  I've fiddled with them a bit to make sure they're looking at the 
right slave address, but other than that reading the smbus makes sense.


Here's where I'm stuck.  The manual says the Xeons have on-chip thermal 
sensors at slaves 0x18 and 0x19, both at bank 0 and register 0.  When I try 
to read these, I get a "Device not configured" error from the ioctl call.


I've come across a few refs to the hw.acpi.thermal.tz1.temperature 
sysctl, but that OID's apparently not available to me.


Any suggestions?


Building part of world

2004-10-29 Thread Brad Waite
I'm trying to update my sys/pci/if_sk.c and would like to be able to build
several versions without having to build the entire world.

How would I do that?

Thanks,

Brad Waite


Re: Building part of world

2004-10-29 Thread Brad Waite
> In the last episode (Oct 29), Brad Waite said:
> > I'm trying to update my sys/pci/if_sk.c and would like to be able to
> > build several versions without having to build the entire world.
>
> Since that's a kernel driver, you only have to build a new kernel.

Heh.  I realized that about 10 minutes after I posted the question.

If I wasn't able to laugh at myself sometimes, I'd be in a heap of trouble.


Re: Building part of world

2004-10-29 Thread Brad Waite
> On 2004-10-29 13:37, Dan Nelson [EMAIL PROTECTED] wrote:
> > In the last episode (Oct 29), Brad Waite said:
> > > I'm trying to update my sys/pci/if_sk.c and would like to be able to
> > > build several versions without having to build the entire world.
> >
> > Since that's a kernel driver, you only have to build a new kernel.
>
> An even better approach in the case of a single kernel driver is to leave
> it commented out in the kernel config file.  Then it will be built as a
> module by default.  After at least one buildworld/buildkernel cycle has
> finished correctly with this configuration, you can use the already
> populated /usr/obj tree to build just this module:
>
>    # cd /usr/src/sys/i386/conf
>    # config -g -d /usr/obj/usr/src/sys/MYKERNEL MYKERNEL
>    # cd /usr/obj/usr/src/sys/MYKERNEL
>    # make depend && make && make install
>
> If you have only touched a single .c file, the 'make depend' step is AFAIK
> optional.  The rest should finish pretty fast.
>
> Brave people might even get away by building the sk module only, by
> emulating the specific part of the kernel build:
>
>    # cd /usr/src/sys/modules/sk
>    # env MAKEOBJDIRPREFIX=/tmp/sk \
>        KMODDIR=/boot/kernel DEBUG_FLAGS=-g MACHINE=i386 \
>        KERNBUILDDIR=/usr/obj/usr/src/sys/MYKERNEL make obj
>    # env MAKEOBJDIRPREFIX=/tmp/sk \
>        KMODDIR=/boot/kernel DEBUG_FLAGS=-g MACHINE=i386 \
>        KERNBUILDDIR=/usr/obj/usr/src/sys/MYKERNEL make all
>
> If all this works, you can just kldload the new if_sk.ko from
> `/tmp/sk/usr/src/sys/modules/sk' to test your changes.
>
> HTH,
> Giorgos
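
The kldload path in the quoted instructions is simply the MAKEOBJDIRPREFIX value followed by the module's source directory. A minimal sketch of that composition, where `object_dir` is a hypothetical helper and not part of the build system:

```python
import posixpath


def object_dir(prefix, source_dir):
    """Join an object-directory prefix with an absolute source directory."""
    return posixpath.join(prefix, source_dir.lstrip("/"))


print(object_dir("/tmp/sk", "/usr/src/sys/modules/sk"))
# /tmp/sk/usr/src/sys/modules/sk
```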

Wow, Giorgos, this really *does* help.  It never dawned on me that FBSD
even supported loadable kernel modules.  Feel kinda sheepish now, but hey,
I guess you learn something new every day.

In my stumbling around since you've enlightened me, I noticed a sk/ dir in
/usr/src/sys/modules, and in there a Makefile.  'make install' apparently
builds the .ko and installs it into /modules.

Am I missing something here, or is this the way to go?

Brad


How do you increase the size of lost+found?

2004-04-13 Thread Brad Waite
While trying to recover from a HD crash, 'fsck -y /dev/rad1s1a' reports 
the following error a number of times at the end of its run:

UNREF FILE  I=3537799  OWNER=500 MODE=100644
SIZE=6611 MTIME=Oct 25 21:12 2003
RECONNECT? yes
SORRY. NO SPACE IN lost+found DIRECTORY

This tells me that it's not saving some of the files on the drive.  Is 
that correct?

Is there anything I can do to make more space in lost+found, either 
system-wide or while the fsck is running?

Some possibly pertinent info:

# ls -lad lost+found
drwxrwxrwt  1379 root  wheel  182272 Apr 12 16:55 lost+found
# ls lost+found | wc -l
8899

This fs was copied using dd from a drive reporting "hard error reading 
fsbn..." errors.

Thanks,

Brad Waite


SORRY. NO SPACE IN lost+found DIRECTORY

2004-04-12 Thread Brad Waite
While trying to recover from a HD crash, 'fsck -y /dev/rad1s1a' reports 
the following error a number of times at the end of its run.

UNREF FILE  I=3537799  OWNER=500 MODE=100644
SIZE=6611 MTIME=Oct 25 21:12 2003
RECONNECT? yes
SORRY. NO SPACE IN lost+found DIRECTORY

This tells me that it's not saving some of the files on the drive.  Is 
that correct?  Is there anything I can do to make more space in 
lost+found, either system-wide or while the fsck is running?

Some possibly pertinent info:

# ls -lad lost+found
drwxrwxrwt  1379 root  wheel  182272 Apr 12 16:55 lost+found
# ls lost+found | wc -l
8899

This fs was copied using dd from a drive reporting "hard error reading 
fsbn..." errors.

Thanks,

Brad Waite


Re: hard disk recover

2004-04-09 Thread Brad Waite
[EMAIL PROTECTED] wrote:
> I'm getting the dreaded "ad1s1a: hard error reading fsbn 524543 of 96-127
> (ad1s1 bn 524543; cn 520 tn 6 sn 5) status=59 error=40" errors.  Based on
> what I've read, it means my drive's going bye-bye.  As it is, it won't
> even boot - fortunately I have another FBSD drive to boot from, and I get
> these errors while trying to fsck it.  Shame on me for not noticing the
> errors sooner and an even bigger shame for not having a proper backup.
>
> In any case, the milk is spilled and I need to mop it up as best I can.
> While I can mount the partition, I can't cd to it (more hard errors...),
> and since fsck apparently isn't helping, what can I do to recover what's
> left?  I'm thinking dd's the tool to use, but I'm not really sure how to
> go about it.  Here's what I get when I try to read from the beginning of
> the partition:
>
> # dd if=/dev/ad1s1a bs=64k
> dd: /dev/ad1s1a: Input/output error
>
> However, when I add skip=1, the drive spits back data.  That leads me to
> believe that if I skip over the bad sectors, I can read what's left.
>
> I've got a spare drive I can use as a sandbox, but how should I dump the
> data?  Should I label the second drive with the same partition size and
> dd if=/dev/ad1s1a of=/dev/ad2s1a?  Is there any chance of recovering
> filesystem data going this route?

[Quoting myself as it's been 2 weeks since the first post]

Here's what's new:

ad0: 21557MB IBM-DJNA-372200 [43800/16/63] at ata0-master UDMA66
ad1: 39083MB Maxtor 5T040H4 [79408/16/63] at ata0-slave UDMA100
ad2: 29311MB Maxtor 5T030H3 [59554/16/63] at ata1-master UDMA100

ad2 is the 30GB drive reporting errors; ad1 is the new 40GB drive I 
copied the partition to.

I tried to fdisk the 40G to be identical to the 30G, but I could never 
get the size to match exactly.  In the end, I just set up the 256M swap, 
and hoped the 524288 offset for the 'a' partition would work.  Here's the 
relevant disklabel output:

# disklabel -r /dev/ad1s1
# /dev/ad1s1:
type: ESDI
disk: ad0s1
label:
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 255
sectors/cylinder: 16065
cylinders: 4981
sectors/unit: 80035767
[...]
8 partitions:
#        size   offset    fstype   [fsize bsize bps/cpg]
  a: 79511479   524288    4.2BSD     2048 16384    89   # (Cyl.   32*- 4981*)
  b:   524288        0      swap                        # (Cyl.    0 - 32*)
  c: 80035767        0    unused        0     0         # (Cyl.    0 - 4981*)

# disklabel -r /dev/ad2s1
# /dev/ad2s1:
type: ESDI
disk: ad0s1
label:
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 16
sectors/cylinder: 1008
cylinders: 59553
sectors/unit: 60030369
[...]
8 partitions:
#        size   offset    fstype   [fsize bsize bps/cpg]
  a: 59506081   524288    4.2BSD     2048 16384    16   # (Cyl.  520*- 59553*)
  b:   524288        0      swap                        # (Cyl.    0 - 520*)
  c: 60030369        0    unused        0     0         # (Cyl.    0 - 59553*)
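
The two labels can be compared with a little arithmetic. This is only a sketch using the numbers printed above; the dictionary and helper names are made up for illustration.

```python
SECTOR_BYTES = 512  # "bytes/sector" in both labels

# Values copied from the two disklabel outputs above.
NEW_AD1S1 = {"unit": 80035767, "a_size": 79511479, "a_offset": 524288}
OLD_AD2S1 = {"unit": 60030369, "a_size": 59506081, "a_offset": 524288}


def mib(sectors):
    """Convert a sector count to MiB."""
    return sectors * SECTOR_BYTES / 2**20


def a_fills_rest(label):
    """True when partition 'a' runs from its offset to the end of the slice."""
    return label["a_offset"] + label["a_size"] == label["unit"]


print(mib(NEW_AD1S1["a_offset"]))                        # 256.0
print(a_fills_rest(NEW_AD1S1), a_fills_rest(OLD_AD2S1))  # True True
print(NEW_AD1S1["a_size"] - OLD_AD2S1["a_size"])         # 20005398
```

So on both drives 'a' starts right after a 256 MiB swap and runs to the end of the slice; the new 'a' is simply 20005398 sectors longer than the old one.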

I used lewiz' suggestion to add 'conv=noerror,sync' to dd, and was able to 
copy the readable data from the bad drive to the new one.  I changed the 
block size from 64k to bs=512b (redundant, I know), since I figured that 
with bs=64k a single bad 512-byte block would make dd skip the whole 64k.  
Here's what I used:

dd if=/dev/ad2s1a of=/dev/ad1s1a conv=noerror,sync bs=512b
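
For reference, here is a rough sketch of what conv=noerror,sync amounts to: read block by block, and when a block cannot be read, write NULs in its place so that everything after it keeps its original offset. `read_block` is a hypothetical callable standing in for the failing drive, and the sketch parses decimal sizes only. One thing worth double-checking against dd(1): a trailing `b` on a size multiplies it by 512, so `bs=512b` is 262144 bytes rather than 512.

```python
import io

# Size suffixes as documented in dd(1); only a subset is handled here.
DD_SUFFIXES = {"b": 512, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def dd_size(text):
    """Parse a dd-style decimal size such as '64k' or '512b' into bytes."""
    text = text.lower()
    if text and text[-1] in DD_SUFFIXES:
        return int(text[:-1]) * DD_SUFFIXES[text[-1]]
    return int(text)


def rescue_copy(read_block, total_size, out, block_size):
    """Copy total_size bytes in block_size chunks, skipping unreadable blocks.

    read_block(offset, length) is a hypothetical callable that returns bytes
    or raises OSError. A failed block is written as NUL bytes. Unlike dd's
    sync, a short final block is not padded up to the full block size.
    Returns the number of blocks that could not be read.
    """
    bad = 0
    for offset in range(0, total_size, block_size):
        length = min(block_size, total_size - offset)
        try:
            data = read_block(offset, length)
        except OSError:
            data = b""
            bad += 1
        out.write(data.ljust(length, b"\0"))
    return bad


print(dd_size("512b"))  # 262144
print(dd_size("64k"))   # 65536
```

The smaller the block size, the less good data is thrown away around each bad sector, at the cost of a slower copy.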

Of course, I got about 165 "ad2s1a: hard error reading fsbn ..." errors, 
but it appeared to copy everything else okay.  The first 16 blocks of 
ad2s1a are null, but there are 16 blocks of data at block 32, so it 
appears the first backup superblock survived.

Is there a remote chance that I'll be able to fsck this fs and recover? 
 I know that fsck will complain about the first alternate superblock 
not matching because the last superblock won't be in the first 30GB.  Do 
the different sized partitions make this impossible?
