Re: /usr/obj partition AWOL
On Thu, Jun 07, 2007 at 09:06:32AM +0200, Otto Moerbeek wrote: On Wed, 6 Jun 2007, Otto Moerbeek wrote: On Wed, 6 Jun 2007, Markus Lude wrote: On Tue, Jun 05, 2007 at 07:51:48AM +0200, Otto Moerbeek wrote: There were some validations checkc added to partitions. If a bad partition is found, it will be marked unused. The checks were a little to strict for some cases. A fix for that went in yesterday, so try a new snap. Thanks for your info. After rebuilding kernel and userland the problem still exists, but now the affected partitions are /var, /home and /data. Hmm. Unmounting /data and doing a manual fsck -f runs without problems. If the problem persists, please report with full disklabel output. $ cat /etc/fstab /dev/wd0a / ffs rw 1 1 /dev/wd0d /tmp ffs rw,nodev,nosuid 1 2 /dev/wd0e /usr ffs rw,nodev 1 2 /dev/wd0f /var ffs rw,nodev,nosuid 1 2 /dev/wd0g /home ffs rw,nodev,nosuid 1 2 /dev/wd0h /data ffs rw,nodev,nosuid 1 2 /dev/wd1d /backup ffs rw,nodev,nosuid 1 2 with an actual kernel: $ sudo disklabel wd0 # /dev/rwd0c: type: ESDI disk: ESDI/IDE disk label: ST3120213A flags: bytes/sector: 512 sectors/track: 63 tracks/cylinder: 16 sectors/cylinder: 1008 cylinders: 16383 total sectors: 16514064 ^^^ 1008 * 16383 = 16514064 rpm: 3600 interleave: 1 trackskew: 0 cylinderskew: 0 headswitch: 0 # microseconds track-to-track seek: 0 # microseconds drivedata: 0 16 partitions: # sizeoffset fstype [fsize bsize cpg] a: 1024128 0 4.2BSD 2048 16384 16 # Cyl 0 - 1015 b: 3072384 1024128swap # Cyl 1016 - 4063 c: 234441648 0 unused 0 0 # Cyl 0 -232580 ^ Your disk size and c partition size do not match. Can you send a dmesg, to see what the actual size of your disk is? This is really needed to see what is going on. Did you at any time edit the disk size by hand? No, at least I can't remember it. 
d: 2048256 4096512 4.2BSD 2048 16384 16 # Cyl 4064 - 6095 e: 20479536 6144768 4.2BSD 2048 16384 16 # Cyl 6096 - 26412 disklabel: partition c: partition extends past end of unit disklabel: partition e: partition extends past end of unit older kernel: $ sudo disklabel wd0 [...] 16 partitions: # sizeoffset fstype [fsize bsize cpg] a: 1024128 0 4.2BSD 0 0 16 # Cyl 0 - 1015 b: 3072384 1024128swap # Cyl 1016 - 4063 c: 234441648 0 unused 0 0 # Cyl 0 -232580 d: 2048256 4096512 4.2BSD 0 0 16 # Cyl 4064 - 6095 e: 20479536 6144768 4.2BSD 0 0 16 # Cyl 6096 - 26412 f: 4095504 26624304 4.2BSD 0 0 16 # Cyl 26413 - 30475 g: 20479536 30719808 4.2BSD 0 0 16 # Cyl 30476 - 50792 h: 183242304 51199344 4.2BSD 0 0 16 # Cyl 50793 -232580 disklabel: partition c: partition extends past end of unit disklabel: partition e: partition extends past end of unit disklabel: partition f: offset past end of unit disklabel: partition f: partition extends past end of unit disklabel: partition g: offset past end of unit disklabel: partition g: partition extends past end of unit disklabel: partition h: offset past end of unit disklabel: partition h: partition extends past end of unit Any hints how to fix this beside repartition and reinstall? If possible, please leave the disk as is, until we've done further diagnosis. If that is not possible, you can use the 'e' command in disklabel, to set the actual size of the disk to the size (in sectors) reported in the dmesg. You might need to adjust the 'c' partition as well. After having sen your dmesg, I see that your disk size is really 234441648 sectors. The disklabel says 16514064 though. The new consistency checks did not like that. The consistency checks have been disabled in two steps (rev 1.44. and rev 1.66 of sys/kern/subr_disk.c). So a current kernel should not trip on this anymore. There remain two questions: how did the size end up being wrong in the disklabel, and how to repair. 
To the first question I can only guess: it could be that you dd'ed an image from another disk, that you edited the size by hand, or that we are seeing the results of an (old?) bug in disklabel handling that has now surfaced because of the consistency checks. The second question I already answered: using the 'e' command in disklabel lets you set the size of the disk in the label. After that,
Invalid partition table (was /usr/obj partition AWOL)
On Thu, Jun 07, 2007 at 04:58:18PM -0500, Emilio Perea wrote: On Thu, Jun 07, 2007 at 07:50:24PM +0200, Otto Moerbeek wrote: I have thinking a bit more about the problem, and it is very likely the following scenario happened: 1. Kernel upgrade by source. 2. Reboot 3. Kernel reads old disklabel format and converts it in-memory to the new v1 format. 4. Run a newfs using the old executable that does not know about the new disklabel format. newfs writes the block and fragment size info the old way, on a spot that is used in v1 labels to store the high 16 bits of the offset and size of a partition. The label is written with version = 1, since the in-memory copy is v1. 5. Reboot, the kernel now sees a v1 disklabel with very high offset and/or size, the new consistency code (which is now disabled) kicks in and marks the partition as unused. So the lesson here is: keep userland and kernel in sync, or use a snapshot to upgrade. I believe that's exactly what happened the first time. The catch is that kernel and userland were being built from the same cvs update, and I thought I was keeping them in sync. In this case it would probably have been better to skip the reboot between building the kernel and the userland. It might have been better to start a whole new thread, but it seemed logical to believe that the problems might be related. Using recent snapshots, last night's insecurity output showed another disklabel change: == sd1 diffs (-OLD +NEW) == --- /var/backups/disklabel.sd1.current Fri Apr 20 01:31:19 2007 +++ /var/backups/disklabel.sd1 Fri Jun 8 01:31:55 2007 @@ -1,4 +1,4 @@ -# Inside MBR partition 0: type A6 start 63 size 71681967 +disklabel: warning, DOS partition table with no valid OpenBSD partition # /dev/rsd1c: type: SCSI disk: da0s1 *--* The full output of disklabel and dmesg follow, but as I was getting ready to send it, I remembered that this same disk had problems with the disklabel changes last October. 
For some reason it was shown as having a FreeBSD disklabel. Most of correspondence regarding it was off-list, but involved several developers and ended with Ken Westerback suggesting some tests before setting it to OpenBSD. This was fdisk then: Disk: sd1 geometry: 4462/255/63 [71682030 Sectors] Offset: 0 Signature: 0xAA55 Starting Ending LBA Info: #: idC H S -C H S [ start: size ] *0: A60 1 1 - 4461 254 63 [ 63:71681967 ] OpenBSD 1: 000 0 0 -0 0 0 [ 0: 0 ] unused 2: 000 0 0 -0 0 0 [ 0: 0 ] unused 3: 000 0 0 -0 0 0 [ 0: 0 ] unused This is now: Disk: sd1 geometry: 4462/255/63 [71687370 Sectors] Offset: 0 Signature: 0xAA55 Starting Ending LBA Info: #: idC H S -C H S [ start: size ] 0: 000 0 0 -0 0 0 [ 0: 0 ] unused 1: 000 0 0 -0 0 0 [ 0: 0 ] unused 2: 000 0 0 -0 0 0 [ 0: 0 ] unused *3: A50 0 1 -3 28 41 [ 0: 5 ] FreeBSD *--* It is currently working fine. Should I just change the partition ID to A6, or is there something else I should try first? *--* disklabel: warning, DOS partition table with no valid OpenBSD partition # /dev/rsd1c: type: SCSI disk: da0s1 label: flags: bytes/sector: 512 sectors/track: 63 tracks/cylinder: 255 sectors/cylinder: 16065 cylinders: 4462 total sectors: 71687370 rpm: 3600 interleave: 1 trackskew: 0 cylinderskew: 0 headswitch: 0 # microseconds track-to-track seek: 0 # microseconds drivedata: 0 15 partitions: # sizeoffset fstype [fsize bsize cpg] c: 7168196763 unused 0 0 # Cyl 0*- 4461 d: 210445263 4.2BSD 2048 16384 132 # Cyl 0*- 130 e: 8385930 2104515 4.2BSD 2048 16384 328 # Cyl 131 - 652 f: 23294250 48387780 4.2BSD 2048 16384 328 # Cyl 3012 - 4461 h: 4112640 15936480 4.2BSD 2048 16384 256 # Cyl 992 - 1247 i: 2104515 40933620 4.2BSD 2048 163841 # Cyl 2548 - 2678 j: 18828180 20049120 4.2BSD 2048 16384 328 # Cyl 1248 - 2419 k: 5349645 43038135 4.2BSD 2048 16384 16 # Cyl 2679 - 3011 l: 2056320 38877300 4.2BSD 2048 16384 128 # Cyl 2420 - 2547 m: 2104515 10490445 4.2BSD 2048 16384 132 # Cyl 653 - 783
Re: Invalid partition table (was /usr/obj partition AWOL)
This is very odd on several fronts.

First, someone has obviously been writing on the MBR for no good reason. I just tested an fdisk compiled today and noticed no oddities on my i386.

Second, the fact that you find a disklabel at all. Since we no longer store or look for disklabels in FreeBSD partitions, it is being read from sector 1, if I recall the code correctly. But it should not have been writing the disklabel there when there was an OpenBSD partition to store it in.

Do you know if this is exactly the same disklabel you were using before? Have you changed anything in the disklabel recently that would identify this as an artifact that just happened to be lying in sector 1 for a while?

Can you copy the MBR and send it to me? There might be a clue as to what overwrote it.

Then I would do fdisk -i and see what happens. This will move the OpenBSD partition to partition 3, but cover the entire disk as your original MBR did. Then see what the disklabel, which should be read from the OpenBSD partition, says.

Ken

On Fri, Jun 08, 2007 at 09:08:21PM -0500, Emilio Perea wrote: On Thu, Jun 07, 2007 at 04:58:18PM -0500, Emilio Perea wrote: On Thu, Jun 07, 2007 at 07:50:24PM +0200, Otto Moerbeek wrote:

I have been thinking a bit more about the problem, and it is very likely the following scenario happened:

1. Kernel upgrade by source.
2. Reboot.
3. Kernel reads old disklabel format and converts it in-memory to the new v1 format.
4. Run a newfs using the old executable that does not know about the new disklabel format. newfs writes the block and fragment size info the old way, on a spot that is used in v1 labels to store the high 16 bits of the offset and size of a partition. The label is written with version = 1, since the in-memory copy is v1.
5. Reboot, the kernel now sees a v1 disklabel with very high offset and/or size, the new consistency code (which is now disabled) kicks in and marks the partition as unused.
Re: Invalid partition table (was /usr/obj partition AWOL)
  c:  71681967        63  unused      0     0         # Cyl     0*-  4461
  d:   2104452        63  4.2BSD   2048 16384   132   # Cyl     0*-   130

Ah -- your 'c' partition does not start at 0. It's an old FreeBSD partition on your disk.

That should not work; it is bunk. We are removing the code from the kernel that allows it to work, because it requires extra stupid checks all over the place to support an old 386BSD stupidity.

I hope that our new disklabel command, upon re-writing that label, will repair that. Todd? That's the way to handle this, right?
Re: Invalid partition table (was /usr/obj partition AWOL)
On 6/8/07, Theo de Raadt [EMAIL PROTECTED] wrote: c: 7168196763 unused 0 0 # Cyl 0*- 4461 d: 210445263 4.2BSD 2048 16384 132 # Cyl 0*- 130 Ah -- your 'c' partition does not start at 0. It's an old FreeBSD partition on your disk. That should not work; it is bunk. We are removing the code from the kernel that allows it to work, because it requires extra stupid checks all over the place to support an old 386BSD stupidity. It appears I have the very same issue, though with a much larger offset. I created an OpenBSD partition on an existing partition table towards the end of the drive. [EMAIL PROTECTED]:~ sudo fdisk wd0 Disk: wd0 geometry: 11978/255/63 [192426570 Sectors] Offset: 0 Signature: 0xAA55 Starting Ending LBA Info: #: idC H S -C H S [ start: size ] 0: E8 15356 77 8 - 229721 118 4 [ 246698998: 3443776305 ] Unknown ID 1: 010 0 1 - 267349 89 4 [ 0: 0 ] DOS FAT-12 2: 000 0 0 -0 0 0 [ 0: 0 ] unused 3: 3F0 0 1 - 267349 89 4 [ 0: 0 ] Unknown ID [EMAIL PROTECTED]:~ sudo disklabel wd0 # /dev/rwd0c: type: ESDI disk: ad0s3 label: flags: bytes/sector: 512 sectors/track: 63 tracks/cylinder: 255 sectors/cylinder: 16065 cylinders: 11978 total sectors: 192426570 rpm: 3600 interleave: 1 trackskew: 0 cylinderskew: 0 headswitch: 0 # microseconds track-to-track seek: 0 # microseconds drivedata: 0 8 partitions: # sizeoffset fstype [fsize bsize cpg] a:208845 17395 4.2BSD 2048 16384 13 # Cyl 9683 - 9695 b: 4192965 155766240swap # Cyl 9696 - 9956 c: 36869175 17395 unused 0 0 # Cyl 9683 - 11977 d:401625 159959205 4.2BSD 2048 16384 25 # Cyl 9957 - 9981 e: 20964825 160360830 4.2BSD 2048 16384 328 # Cyl 9982 - 11286 f: 11100915 181325655 4.2BSD 2048 16384 328 # Cyl 11287 - 11977 disklabel: warning, unused partition i: size 1413615339 offset -2147417768 disklabel: warning, unused partition j: size -196918 offset 402701520 disklabel: warning, unused partition k: size 503365533 offset 1463353529 disklabel: warning, unused partition l: size -1407327343 offset -1382830702 disklabel: warning, 
unused partition m: size -2013104760 offset -1065155243 disklabel: warning, unused partition n: size 402998726 offset 268977606 disklabel: warning, unused partition o: size -400023365 offset 17760440 disklabel: warning, unused partition p: size 1086332943 offset -356507121 [EMAIL PROTECTED]:~ Jimmy.
Re: Invalid partition table (was /usr/obj partition AWOL)
On Fri, Jun 08, 2007 at 10:41:40PM -0400, Kenneth R Westerback wrote: This is very odd on several fronts. First, someone has obviously been writing on the MBR for no good reason. I just tested an fdisk compiled to day and noticed no oddities on my i386. Second, the fact that you find a disklabel. Since we no longer store or look for disklabels in FreeBSD partitions it is being read from sector 1 if I recall the code correctly. But it should not have been writing the disklabel there when there was an OpenBSD partition to store it in. Do you know if this is exactly the same disklabel you were using before? Have you changed anything in the disklabel recently that would identify this as an artifact that just happened to be lying in sector 1 for a while? Other than reducing the size of the last partition a couple of months ago, there has been no (intentional) change to that disklabel since: On Wed, Oct 11, 2006 at 08:09:08AM -0700, K WESTERBACK wrote: Darn. A perfectly good theory shot to hell. :-). It would seem that you have a 'valid' disklabel at sector 1 of that disk. First, if you could save the first two sectors of the disk with dd if=/dev/rsd1c of=SaveMySectors bs=512 count=2 and send me that file, and do two experiments, I would appreciate it. If you can run fdisk against the disk and change the partition type to 'A6' (OpenBSD) the correct disklabel should be read in and you should get the 'old' info back again. Second, if you are the risk taking type, change partition type back to 'A5' (FreeBSD) and zero out sector 1 on the disk with something like dd if=/dev/zero of=/dev/rsd1c bs=512 count=1 seek=1 Then see what disklabel says. You should get a simple spoofed disklabel with 'c' and 'i' partitions. Finally, changing the partition type to 'A6' again should give you access to the data. That was the last change I'm aware of. Can you copy the MBR and send it to me. There might be a clue as to what overwrote it. Then I would do fdisk -i and see what happens. 
This will move the OpenBSD partition to partition 3, but cover the entire disk as your original MBR did. Then see if the disklabel, which should be read from the OpenBSD partition says. I'll send the file attached to the next message, since I assume it would be stripped from the mailing list. After running fdisk -i sd1: # fdisk sd1 Disk: sd1 geometry: 4462/255/63 [71687370 Sectors] Offset: 0 Signature: 0xAA55 Starting Ending LBA Info: #: idC H S -C H S [ start: size ] 0: 000 0 0 -0 0 0 [ 0: 0 ] unused 1: 000 0 0 -0 0 0 [ 0: 0 ] unused 2: 000 0 0 -0 0 0 [ 0: 0 ] unused *3: A60 1 1 - 4461 254 63 [ 63:71681967 ] OpenBSD It's back as an OpenBSD disklabel, but the c partition still starts at 63 rather than 0: # disklabel sd1 # Inside MBR partition 3: type A6 start 63 size 71681967 # /dev/rsd1c: type: SCSI disk: da0s1 label: flags: bytes/sector: 512 sectors/track: 63 tracks/cylinder: 255 sectors/cylinder: 16065 cylinders: 4462 total sectors: 71687370 rpm: 3600 interleave: 1 trackskew: 0 cylinderskew: 0 headswitch: 0 # microseconds track-to-track seek: 0 # microseconds drivedata: 0 15 partitions: # sizeoffset fstype [fsize bsize cpg] c: 7168196763 unused 0 0 # Cyl 0*- 4461 d: 210445263 4.2BSD 2048 16384 132 # Cyl 0*- 130 e: 8385930 2104515 4.2BSD 2048 16384 328 # Cyl 131 - 652 f: 23294250 48387780 4.2BSD 2048 16384 328 # Cyl 3012 - 4461 h: 4112640 15936480 4.2BSD 2048 16384 256 # Cyl 992 - 1247 i: 2104515 40933620 4.2BSD 2048 163841 # Cyl 2548 - 2678 j: 18828180 20049120 4.2BSD 2048 16384 328 # Cyl 1248 - 2419 k: 5349645 43038135 4.2BSD 2048 16384 16 # Cyl 2679 - 3011 l: 2056320 38877300 4.2BSD 2048 16384 128 # Cyl 2420 - 2547 m: 2104515 10490445 4.2BSD 2048 16384 132 # Cyl 653 - 783 n: 2056320 12594960 4.2BSD 2048 163841 # Cyl 784 - 911 Emilio
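The two sectors saved with dd earlier in this message begin with the MBR, and its partition table is what fdisk prints. Below is a minimal sketch of reading that table, assuming the standard PC MBR layout (four 16-byte entries starting at byte 446, the type byte at entry offset 4, little-endian start and size at entry offsets 8 and 12, and the signature bytes 0x55 0xAA at offset 510). The helper names and the synthetic demo sector are hypothetical; the demo values mirror the fdisk output shown after fdisk -i.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446          # first of four 16-byte partition entries
ENTRY_SIZE = 16
OPENBSD_ID = 0xA6           # partition id reported by fdisk as "OpenBSD"
FREEBSD_ID = 0xA5           # partition id reported by fdisk as "FreeBSD"


def parse_mbr(sector):
    """Return the four MBR partition entries as dictionaries."""
    if len(sector) < SECTOR_SIZE:
        raise ValueError("need at least one 512-byte sector")
    if sector[510:512] != b"\x55\xaa":
        raise ValueError("missing 0xAA55 signature")
    entries = []
    for i in range(4):
        raw = sector[TABLE_OFFSET + ENTRY_SIZE * i:TABLE_OFFSET + ENTRY_SIZE * (i + 1)]
        start, size = struct.unpack_from("<II", raw, 8)
        entries.append({
            "active": raw[0] == 0x80,   # shown by fdisk as a leading '*'
            "type": raw[4],
            "start": start,
            "size": size,
        })
    return entries


def find_openbsd(entries):
    """Index of the first OpenBSD (A6) entry, or None if there is none."""
    for index, entry in enumerate(entries):
        if entry["type"] == OPENBSD_ID:
            return index
    return None


def make_demo_sector():
    """Synthetic sector: slot 3 is an active A6 partition, start 63, size 71681967."""
    sector = bytearray(SECTOR_SIZE)
    entry = bytearray(ENTRY_SIZE)
    entry[0] = 0x80
    entry[4] = OPENBSD_ID
    struct.pack_into("<II", entry, 8, 63, 71681967)
    sector[TABLE_OFFSET + 3 * ENTRY_SIZE:TABLE_OFFSET + 4 * ENTRY_SIZE] = entry
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)


demo_entries = parse_mbr(make_demo_sector())
demo_slot = find_openbsd(demo_entries)
```

To look at a real dump instead of the synthetic sector, the first 512 bytes of the SaveMySectors file could be passed to parse_mbr. A result of None from find_openbsd would correspond to the "no valid OpenBSD partition" warning seen earlier in the thread.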
Re: /usr/obj partition AWOL
On Wed, 6 Jun 2007, Otto Moerbeek wrote: On Wed, 6 Jun 2007, Markus Lude wrote: On Tue, Jun 05, 2007 at 07:51:48AM +0200, Otto Moerbeek wrote: On Tue, 5 Jun 2007, Markus Lude wrote: On Mon, Jun 04, 2007 at 06:02:59PM -0500, Emilio Perea wrote: I follow -current on an i386 at work and an amd64 at home, and rarely run into any problem which is not self-inflicted. So when I had a weird experience this weekend, I assumed it was my fault. What happened was that after the usual sequence of [build kernel; reboot; build userland; reboot] the system complained that it could not fsck wd1j and dropped into single-user mode. wd1j is mounted on /usr/obj, and I thought that something in the last build had messed it up, so I ran newfs wd1j and got newfs: /dev/rwd1j: Device not configured disklabel wd1 showed partitions d-i and k-p, but no j. I added the partition, ran newfs, and everything seemed fine. This afternoon I installed the i386 snapshot downloaded this morning (dated Jun 3 19:19) on the work pc, and after reboot it was missing the /usr/obj partition (sd0g in this case). Everything seems to be working fine on both computers, but I didn't expect the partitions to disappear. Did nobody else run into this problem? Or did everybody else who saw it thought it was too obvious to mention it to the mailing list? I had a similar problem on sparc64 with a snapshot from jun 2. The system was unable to fsck some partitions and dropped to single user mode. Here the problems were with the /usr, /var, /tmp and /home partitions. Some further (and larger partitions) weren't affected. I installed an older snapshot. Any suggestions how to get this fixed or what to test/try? There were some validations checkc added to partitions. If a bad partition is found, it will be marked unused. The checks were a little to strict for some cases. A fix for that went in yesterday, so try a new snap. Thanks for your info. 
After rebuilding kernel and userland the problem still exists, but now the affected partitions are /var, /home and /data. Hmm. Unmounting /data and doing a manual fsck -f runs without problems. If the problem persists, please report with full disklabel output. $ cat /etc/fstab /dev/wd0a / ffs rw 1 1 /dev/wd0d /tmp ffs rw,nodev,nosuid 1 2 /dev/wd0e /usr ffs rw,nodev 1 2 /dev/wd0f /var ffs rw,nodev,nosuid 1 2 /dev/wd0g /home ffs rw,nodev,nosuid 1 2 /dev/wd0h /data ffs rw,nodev,nosuid 1 2 /dev/wd1d /backup ffs rw,nodev,nosuid 1 2 with an actual kernel: $ sudo disklabel wd0 # /dev/rwd0c: type: ESDI disk: ESDI/IDE disk label: ST3120213A flags: bytes/sector: 512 sectors/track: 63 tracks/cylinder: 16 sectors/cylinder: 1008 cylinders: 16383 total sectors: 16514064 ^^^ 1008 * 16383 = 16514064 rpm: 3600 interleave: 1 trackskew: 0 cylinderskew: 0 headswitch: 0 # microseconds track-to-track seek: 0 # microseconds drivedata: 0 16 partitions: # sizeoffset fstype [fsize bsize cpg] a: 1024128 0 4.2BSD 2048 16384 16 # Cyl 0 - 1015 b: 3072384 1024128swap # Cyl 1016 - 4063 c: 234441648 0 unused 0 0 # Cyl 0 -232580 ^ Your disk size and c partition size do not match. Can you send a dmesg, to see what the actual size of your disk is? This is really needed to see what is going on. Did you at any time edit the disk size by hand? d: 2048256 4096512 4.2BSD 2048 16384 16 # Cyl 4064 - 6095 e: 20479536 6144768 4.2BSD 2048 16384 16 # Cyl 6096 - 26412 disklabel: partition c: partition extends past end of unit disklabel: partition e: partition extends past end of unit older kernel: $ sudo disklabel wd0 [...] 
16 partitions: # sizeoffset fstype [fsize bsize cpg] a: 1024128 0 4.2BSD 0 0 16 # Cyl 0 - 1015 b: 3072384 1024128swap # Cyl 1016 - 4063 c: 234441648 0 unused 0 0 # Cyl 0 -232580 d: 2048256 4096512 4.2BSD 0 0 16 # Cyl 4064 - 6095 e: 20479536 6144768 4.2BSD 0 0 16 # Cyl 6096 - 26412 f: 4095504 26624304 4.2BSD 0 0 16 # Cyl 26413 - 30475 g: 20479536 30719808 4.2BSD 0 0 16 # Cyl 30476 - 50792 h: 183242304 51199344 4.2BSD 0 0 16 # Cyl 50793 -232580 disklabel: partition c: partition extends past end of unit disklabel: partition e: partition extends past
Re: /usr/obj partition AWOL
On Tue, 5 Jun 2007, Emilio Perea wrote:

On Tue, Jun 05, 2007 at 07:51:48AM +0200, Otto Moerbeek wrote:

There were some validation checks added to partitions. If a bad partition is found, it will be marked unused. The checks were a little too strict for some cases. A fix for that went in yesterday, so try a new snap. If the problem persists, please report with full disklabel output.

The problem showed up on the latest snapshot as of now, which may well have been built before the fix you mention was incorporated. The home PC running -current has not had a problem since Saturday afternoon.

The daily insecurity reports show four changes in this partition during the last couple of months. (Note that since this is on /usr/obj on a PC running -current, newfs is run just about every day.) It seems funny that on May 29 the fsize and bsize were changed to 0, but nothing weird happened until the day after they were changed to what appeared to be more reasonable numbers. Anyhow, in case the information is useful, the insecurity messages and current disklabel follow:

== sd0 diffs (-OLD +NEW) ==
--- /var/backups/disklabel.sd0.current Fri Apr 21 01:31:35 2006
+++ /var/backups/disklabel.sd0 Tue Apr 17 01:31:10 2007
@@ -26,4 +26,4 @@
   d:  1048128  3144384  4.2BSD   2048 16384   416 # Cyl  1236 -  1647
   e:  1048128  4192512  4.2BSD   2048 16384   416 # Cyl  1648 -  2059
   f:  8387568  5240640  4.2BSD   2048 16384   480 # Cyl  2060 -  5356
-  g:  4139682 13628208  4.2BSD   2048 16384   480 # Cyl  5357 -  6984*
+  g:  4139682 13628208  4.2BSD   2048 16384     1 # Cyl  5357 -  6984*

The cpg change is due to making newfs cylinder unaware.

== sd0 diffs (-OLD +NEW) ==
--- /var/backups/disklabel.sd0.current Tue Apr 17 01:31:10 2007
+++ /var/backups/disklabel.sd0 Wed May 30 01:32:08 2007
@@ -26,4 +26,4 @@
   d:  1048128  3144384  4.2BSD   2048 16384   416 # Cyl  1236 -  1647
   e:  1048128  4192512  4.2BSD   2048 16384   416 # Cyl  1648 -  2059
   f:  8387568  5240640  4.2BSD   2048 16384   480 # Cyl  2060 -  5356
-  g:  4139682 13628208  4.2BSD   2048 16384     1 # Cyl  5357 -  6984*
+  g:  4139682 13628208  4.2BSD      0     0     1 # Cyl  5357 -  6984*

Here you are running with a new kernel, but userland is still old. Hence the 0 fsize and bsize.

== sd0 diffs (-OLD +NEW) ==
--- /var/backups/disklabel.sd0.current Wed May 30 01:32:08 2007
+++ /var/backups/disklabel.sd0 Fri Jun 1 01:32:15 2007
@@ -26,4 +26,4 @@
   d:  1048128  3144384  4.2BSD   2048 16384   416 # Cyl  1236 -  1647
   e:  1048128  4192512  4.2BSD   2048 16384   416 # Cyl  1648 -  2059
   f:  8387568  5240640  4.2BSD   2048 16384   480 # Cyl  2060 -  5356
-  g:  4139682 13628208  4.2BSD      0     0     1 # Cyl  5357 -  6984*
+  g:  4139682 13628208  4.2BSD   2048  8192     1 # Cyl  5357 -  6984*

newfs is run, but it is still using the old struct partition format. Hence the wrong fsize and bsize.

== sd0 diffs (-OLD +NEW) ==
--- /var/backups/disklabel.sd0.current Fri Jun 1 01:32:15 2007
+++ /var/backups/disklabel.sd0 Tue Jun 5 01:32:10 2007
@@ -26,4 +26,4 @@
   d:  1048128  3144384  4.2BSD   2048 16384   416 # Cyl  1236 -  1647
   e:  1048128  4192512  4.2BSD   2048 16384   416 # Cyl  1648 -  2059
   f:  8387568  5240640  4.2BSD   2048 16384   480 # Cyl  2060 -  5356
-  g:  4139682 13628208  4.2BSD   2048  8192     1 # Cyl  5357 -  6984*
+  g:  4139682 13628208  4.2BSD   2048 16384     1 # Cyl  5357 -  6984*

And here things are back in shape.

# Inside MBR partition 3: type A6 start 63 size 17767827
# /dev/rsd0c:
type: SCSI
disk: SCSI disk
label: ST39102LW
flags:
bytes/sector: 512
sectors/track: 212
tracks/cylinder: 12
sectors/cylinder: 2544
cylinders: 6962
total sectors: 17783240
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0 # microseconds
track-to-track seek: 0 # microseconds
drivedata: 0

16 partitions:
#        size   offset  fstype [fsize bsize  cpg]
  a:  2096193       63  4.2BSD   2048 16384   480 # Cyl     0*-   823
  b:  1048128  2096256    swap                    # Cyl   824 -  1235
  c: 17783240        0  unused      0     0       # Cyl     0 -  6990*
  d:  1048128  3144384  4.2BSD   2048 16384   416 # Cyl  1236 -  1647
  e:  1048128  4192512  4.2BSD   2048 16384   416 # Cyl  1648 -  2059
  f:  8387568  5240640  4.2BSD   2048 16384   480 # Cyl  2060 -  5356
  g:  4139682 13628208  4.2BSD   2048 16384     1 # Cyl  5357 -  6984*

We have seen some reports now on disappearing partitions. On sparc and
Re: /usr/obj partition AWOL
On Thu, 7 Jun 2007, Otto Moerbeek wrote:

We have seen some reports now on disappearing partitions. On sparc and sparc64, there were actual bugs that have been fixed now. For all platforms, the suspect new consistency checking code has now been disabled until we find out what is causing the mishap, and (very) recent kernels should be back to normal. Please report with disklabel info and dmesg if things are still going wrong, preferably with fdisk (if applicable) and old disklabel information as well.

I have been thinking a bit more about the problem, and it is very likely the following scenario happened:

1. Kernel upgrade by source.
2. Reboot.
3. Kernel reads old disklabel format and converts it in-memory to the new v1 format.
4. Run a newfs using the old executable that does not know about the new disklabel format. newfs writes the block and fragment size info the old way, on a spot that is used in v1 labels to store the high 16 bits of the offset and size of a partition. The label is written with version = 1, since the in-memory copy is v1.
5. Reboot, the kernel now sees a v1 disklabel with very high offset and/or size, the new consistency code (which is now disabled) kicks in and marks the partition as unused.

So the lesson here is: keep userland and kernel in sync, or use a snapshot to upgrade.

-Otto
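Step 4 of the scenario can be illustrated with a small sketch. The two 16-byte layouts below are assumptions made for illustration (the field order and widths are not quoted anywhere in this thread and should be checked against the real disklabel header). Only the idea comes from the message: the old fsize field sits on the spot where a v1 label keeps the high 16 bits of the offset and size. The numbers are those of the sd0 'g' partition posted elsewhere in the thread; the fstype and frag values are placeholders.

```python
import struct

# Hypothetical little-endian layouts, 16 bytes each (assumed, not from the thread):
#   old: size(u32) offset(u32) fsize(u32)                  fstype(u8) frag(u8)      cpg(u16)
#   v1 : size(u32) offset(u32) offset_hi(u16) size_hi(u16) fstype(u8) fragblock(u8) cpg(u16)
OLD_FMT = "<IIIBBH"
V1_FMT = "<IIHHBBH"


def write_old_entry(size, offset, fsize, fstype, frag, cpg):
    """Pack a partition entry the way an old userland tool would."""
    return struct.pack(OLD_FMT, size, offset, fsize, fstype, frag, cpg)


def read_v1_entry(raw):
    """Interpret the same 16 bytes the way a v1-aware reader would."""
    size_lo, off_lo, off_hi, size_hi, _fstype, _fragblock, _cpg = struct.unpack(V1_FMT, raw)
    return {
        "size": (size_hi << 32) | size_lo,
        "offset": (off_hi << 32) | off_lo,
    }


def fits_on_disk(entry, total_sectors):
    """A simple stand-in for a consistency check: the partition must fit."""
    return entry["offset"] + entry["size"] <= total_sectors


TOTAL_SECTORS = 17783240  # sd0 "total sectors" from the posted disklabel

# Old newfs stores fsize=2048 where v1 expects the high 16 bits.
raw_entry = write_old_entry(size=4139682, offset=13628208, fsize=2048,
                            fstype=7, frag=8, cpg=1)
misread = read_v1_entry(raw_entry)
intended = {"size": 4139682, "offset": 13628208}
```

Under these assumed layouts, the low half of fsize (2048) lands in the high-offset field, so the offset is read as 2048 * 2^32 + 13628208 sectors. The intended entry fits on the disk while the misread one does not, which is the kind of entry step 5 says gets marked unused.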
Re: /usr/obj partition AWOL
On Thu, Jun 07, 2007 at 07:50:24PM +0200, Otto Moerbeek wrote:

I have been thinking a bit more about the problem, and it is very likely the following scenario happened:

1. Kernel upgrade by source.
2. Reboot.
3. Kernel reads old disklabel format and converts it in-memory to the new v1 format.
4. Run a newfs using the old executable that does not know about the new disklabel format. newfs writes the block and fragment size info the old way, on a spot that is used in v1 labels to store the high 16 bits of the offset and size of a partition. The label is written with version = 1, since the in-memory copy is v1.
5. Reboot, the kernel now sees a v1 disklabel with very high offset and/or size, the new consistency code (which is now disabled) kicks in and marks the partition as unused.

So the lesson here is: keep userland and kernel in sync, or use a snapshot to upgrade.

I believe that's exactly what happened the first time. The catch is that kernel and userland were being built from the same cvs update, and I thought I was keeping them in sync. In this case it would probably have been better to skip the reboot between building the kernel and the userland.

I'll take newfs out of my build script (back to rm -rf /usr/obj/*) and try to remember to use newfs before rebooting with a new kernel if I want to avoid the wait.

Thanks again!

Emilio
Re: /usr/obj partition AWOL
On Tue, Jun 05, 2007 at 07:51:48AM +0200, Otto Moerbeek wrote:
> On Tue, 5 Jun 2007, Markus Lude wrote:
> > On Mon, Jun 04, 2007 at 06:02:59PM -0500, Emilio Perea wrote:
> > > I follow -current on an i386 at work and an amd64 at home, and rarely run into any problem which is not self-inflicted. So when I had a weird experience this weekend, I assumed it was my fault.
> > >
> > > What happened was that after the usual sequence of [build kernel; reboot; build userland; reboot] the system complained that it could not fsck wd1j and dropped into single-user mode. wd1j is mounted on /usr/obj, and I thought that something in the last build had messed it up, so I ran newfs wd1j and got
> > >
> > > newfs: /dev/rwd1j: Device not configured
> > >
> > > disklabel wd1 showed partitions d-i and k-p, but no j. I added the partition, ran newfs, and everything seemed fine.
> > >
> > > This afternoon I installed the i386 snapshot downloaded this morning (dated Jun 3 19:19) on the work PC, and after reboot it was missing the /usr/obj partition (sd0g in this case).
> > >
> > > Everything seems to be working fine on both computers, but I didn't expect the partitions to disappear. Did nobody else run into this problem? Or did everybody else who saw it think it was too obvious to mention it to the mailing list?
> >
> > I had a similar problem on sparc64 with a snapshot from Jun 2. The system was unable to fsck some partitions and dropped to single-user mode. Here the problems were with the /usr, /var, /tmp and /home partitions. Some further (and larger) partitions weren't affected. I installed an older snapshot.
> >
> > Any suggestions how to get this fixed or what to test/try?
>
> There were some validation checks added to partitions. If a bad partition is found, it will be marked unused. The checks were a little too strict for some cases. A fix for that went in yesterday, so try a new snap.

Thanks for your info. After rebuilding kernel and userland the problem still exists, but now the affected partitions are /var, /home and /data. Hmm. Unmounting /data and doing a manual fsck -f runs without problems.

> If the problem persists, please report with full disklabel output.

$ cat /etc/fstab
/dev/wd0a /       ffs rw              1 1
/dev/wd0d /tmp    ffs rw,nodev,nosuid 1 2
/dev/wd0e /usr    ffs rw,nodev        1 2
/dev/wd0f /var    ffs rw,nodev,nosuid 1 2
/dev/wd0g /home   ffs rw,nodev,nosuid 1 2
/dev/wd0h /data   ffs rw,nodev,nosuid 1 2
/dev/wd1d /backup ffs rw,nodev,nosuid 1 2

with a current kernel:

$ sudo disklabel wd0
# /dev/rwd0c:
type: ESDI
disk: ESDI/IDE disk
label: ST3120213A
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 16
sectors/cylinder: 1008
cylinders: 16383
total sectors: 16514064
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0           # microseconds
track-to-track seek: 0  # microseconds
drivedata: 0

16 partitions:
#             size     offset  fstype [fsize bsize  cpg]
  a:       1024128          0  4.2BSD   2048 16384   16 # Cyl     0 -  1015
  b:       3072384    1024128    swap                   # Cyl  1016 -  4063
  c:     234441648          0  unused      0     0      # Cyl     0 -232580
  d:       2048256    4096512  4.2BSD   2048 16384   16 # Cyl  4064 -  6095
  e:      20479536    6144768  4.2BSD   2048 16384   16 # Cyl  6096 - 26412
disklabel: partition c: partition extends past end of unit
disklabel: partition e: partition extends past end of unit

older kernel:

$ sudo disklabel wd0
[...]
16 partitions:
#             size     offset  fstype [fsize bsize  cpg]
  a:       1024128          0  4.2BSD      0     0   16 # Cyl     0 -  1015
  b:       3072384    1024128    swap                   # Cyl  1016 -  4063
  c:     234441648          0  unused      0     0      # Cyl     0 -232580
  d:       2048256    4096512  4.2BSD      0     0   16 # Cyl  4064 -  6095
  e:      20479536    6144768  4.2BSD      0     0   16 # Cyl  6096 - 26412
  f:       4095504   26624304  4.2BSD      0     0   16 # Cyl 26413 - 30475
  g:      20479536   30719808  4.2BSD      0     0   16 # Cyl 30476 - 50792
  h:     183242304   51199344  4.2BSD      0     0   16 # Cyl 50793 -232580
disklabel: partition c: partition extends past end of unit
disklabel: partition e: partition extends past end of unit
disklabel: partition f: offset past end of unit
disklabel: partition f: partition extends past end of unit
disklabel: partition g: offset past end of unit
disklabel: partition g: partition extends past end of unit
disklabel: partition h: offset past end of unit
disklabel: partition h: partition extends past end of unit

Any hints how to fix this besides repartitioning and reinstalling?

Regards,
Markus
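The warnings in the older-kernel listing can be re-derived from the numbers in the label. The following is a minimal Python sketch, assuming a simple "offset and offset + size must not exceed total sectors" rule; the function name `check_label` and the exact comparison are illustrative assumptions, not the actual disklabel(8) or kernel code.

```python
# Sketch only: re-derives the warnings from the label values posted above.
# check_label() and its comparison rule are assumptions for illustration,
# not the real disklabel(8) implementation.

SECTORS_PER_CYLINDER = 1008
CYLINDERS = 16383
TOTAL_SECTORS = 16514064  # "total sectors" field of the label

# name -> (size, offset) in sectors, copied from the older-kernel listing
PARTITIONS = {
    "a": (1024128, 0),
    "b": (3072384, 1024128),
    "c": (234441648, 0),
    "d": (2048256, 4096512),
    "e": (20479536, 6144768),
    "f": (4095504, 26624304),
    "g": (20479536, 30719808),
    "h": (183242304, 51199344),
}


def check_label(partitions, total_sectors):
    """Return warning strings for partitions that do not fit in total_sectors."""
    warnings = []
    for name, (size, offset) in sorted(partitions.items()):
        if offset > total_sectors:
            warnings.append(f"partition {name}: offset past end of unit")
        if offset + size > total_sectors:
            warnings.append(f"partition {name}: partition extends past end of unit")
    return warnings


warnings = check_label(PARTITIONS, TOTAL_SECTORS)
for line in warnings:
    print("disklabel:", line)
```

Under these assumptions the sketch flags c and e as extending past the end of the unit and f, g and h on both counts. The "total sectors" value is exactly sectors/cylinder times cylinders, while partition h ends at the same sector as the c partition, which suggests the partition table describes a much larger disk than the geometry fields do.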
Re: /usr/obj partition AWOL
On Wed, 6 Jun 2007, Markus Lude wrote:

> On Tue, Jun 05, 2007 at 07:51:48AM +0200, Otto Moerbeek wrote:
> > On Tue, 5 Jun 2007, Markus Lude wrote:
> > > On Mon, Jun 04, 2007 at 06:02:59PM -0500, Emilio Perea wrote:
> > > > I follow -current on an i386 at work and an amd64 at home, and rarely run into any problem which is not self-inflicted. So when I had a weird experience this weekend, I assumed it was my fault.
> > > >
> > > > What happened was that after the usual sequence of [build kernel; reboot; build userland; reboot] the system complained that it could not fsck wd1j and dropped into single-user mode. wd1j is mounted on /usr/obj, and I thought that something in the last build had messed it up, so I ran newfs wd1j and got
> > > >
> > > > newfs: /dev/rwd1j: Device not configured
> > > >
> > > > disklabel wd1 showed partitions d-i and k-p, but no j. I added the partition, ran newfs, and everything seemed fine.
> > > >
> > > > This afternoon I installed the i386 snapshot downloaded this morning (dated Jun 3 19:19) on the work PC, and after reboot it was missing the /usr/obj partition (sd0g in this case).
> > > >
> > > > Everything seems to be working fine on both computers, but I didn't expect the partitions to disappear. Did nobody else run into this problem? Or did everybody else who saw it think it was too obvious to mention it to the mailing list?
> > >
> > > I had a similar problem on sparc64 with a snapshot from Jun 2. The system was unable to fsck some partitions and dropped to single-user mode. Here the problems were with the /usr, /var, /tmp and /home partitions. Some further (and larger) partitions weren't affected. I installed an older snapshot.
> > >
> > > Any suggestions how to get this fixed or what to test/try?
> >
> > There were some validation checks added to partitions. If a bad partition is found, it will be marked unused. The checks were a little too strict for some cases. A fix for that went in yesterday, so try a new snap.
>
> Thanks for your info. After rebuilding kernel and userland the problem still exists, but now the affected partitions are /var, /home and /data. Hmm. Unmounting /data and doing a manual fsck -f runs without problems.
>
> > If the problem persists, please report with full disklabel output.
>
> $ cat /etc/fstab
> /dev/wd0a /       ffs rw              1 1
> /dev/wd0d /tmp    ffs rw,nodev,nosuid 1 2
> /dev/wd0e /usr    ffs rw,nodev        1 2
> /dev/wd0f /var    ffs rw,nodev,nosuid 1 2
> /dev/wd0g /home   ffs rw,nodev,nosuid 1 2
> /dev/wd0h /data   ffs rw,nodev,nosuid 1 2
> /dev/wd1d /backup ffs rw,nodev,nosuid 1 2
>
> with a current kernel:
>
> $ sudo disklabel wd0
> # /dev/rwd0c:
> type: ESDI
> disk: ESDI/IDE disk
> label: ST3120213A
> flags:
> bytes/sector: 512
> sectors/track: 63
> tracks/cylinder: 16
> sectors/cylinder: 1008
> cylinders: 16383
> total sectors: 16514064

^^^ 1008 * 16383 = 16514064

> rpm: 3600
> interleave: 1
> trackskew: 0
> cylinderskew: 0
> headswitch: 0           # microseconds
> track-to-track seek: 0  # microseconds
> drivedata: 0
>
> 16 partitions:
> #             size     offset  fstype [fsize bsize  cpg]
>   a:       1024128          0  4.2BSD   2048 16384   16 # Cyl     0 -  1015
>   b:       3072384    1024128    swap                   # Cyl  1016 -  4063
>   c:     234441648          0  unused      0     0      # Cyl     0 -232580

^ Your disk size and c partition size do not match. Can you send a dmesg, to see what the actual size of your disk is? This is really needed to see what is going on.

Did you at any time edit the disk size by hand?

>   d:       2048256    4096512  4.2BSD   2048 16384   16 # Cyl  4064 -  6095
>   e:      20479536    6144768  4.2BSD   2048 16384   16 # Cyl  6096 - 26412
> disklabel: partition c: partition extends past end of unit
> disklabel: partition e: partition extends past end of unit
>
> older kernel:
>
> $ sudo disklabel wd0
> [...]
> 16 partitions:
> #             size     offset  fstype [fsize bsize  cpg]
>   a:       1024128          0  4.2BSD      0     0   16 # Cyl     0 -  1015
>   b:       3072384    1024128    swap                   # Cyl  1016 -  4063
>   c:     234441648          0  unused      0     0      # Cyl     0 -232580
>   d:       2048256    4096512  4.2BSD      0     0   16 # Cyl  4064 -  6095
>   e:      20479536    6144768  4.2BSD      0     0   16 # Cyl  6096 - 26412
>   f:       4095504   26624304  4.2BSD      0     0   16 # Cyl 26413 - 30475
>   g:      20479536   30719808  4.2BSD      0     0   16 # Cyl 30476 - 50792
>   h:     183242304   51199344  4.2BSD      0     0   16 # Cyl 50793 -232580
> disklabel: partition c: partition extends past end of unit
> disklabel: partition e: partition extends past end of unit
> disklabel: partition f: offset past end of unit
> disklabel: partition f: partition extends past end of unit
> disklabel: partition g: offset past end of unit
> disklabel: partition g:
Re: /usr/obj partition AWOL
On Tue, 5 Jun 2007, Markus Lude wrote:

> On Mon, Jun 04, 2007 at 06:02:59PM -0500, Emilio Perea wrote:
> > I follow -current on an i386 at work and an amd64 at home, and rarely run into any problem which is not self-inflicted. So when I had a weird experience this weekend, I assumed it was my fault.
> >
> > What happened was that after the usual sequence of [build kernel; reboot; build userland; reboot] the system complained that it could not fsck wd1j and dropped into single-user mode. wd1j is mounted on /usr/obj, and I thought that something in the last build had messed it up, so I ran newfs wd1j and got
> >
> > newfs: /dev/rwd1j: Device not configured
> >
> > disklabel wd1 showed partitions d-i and k-p, but no j. I added the partition, ran newfs, and everything seemed fine.
> >
> > This afternoon I installed the i386 snapshot downloaded this morning (dated Jun 3 19:19) on the work PC, and after reboot it was missing the /usr/obj partition (sd0g in this case).
> >
> > Everything seems to be working fine on both computers, but I didn't expect the partitions to disappear. Did nobody else run into this problem? Or did everybody else who saw it think it was too obvious to mention it to the mailing list?
>
> I had a similar problem on sparc64 with a snapshot from Jun 2. The system was unable to fsck some partitions and dropped to single-user mode. Here the problems were with the /usr, /var, /tmp and /home partitions. Some further (and larger) partitions weren't affected. I installed an older snapshot.
>
> Any suggestions how to get this fixed or what to test/try?

There were some validation checks added to partitions. If a bad partition is found, it will be marked unused. The checks were a little too strict for some cases. A fix for that went in yesterday, so try a new snap.

If the problem persists, please report with full disklabel output.

-Otto
Re: /usr/obj partition AWOL
On Tue, Jun 05, 2007 at 07:51:48AM +0200, Otto Moerbeek wrote:
> There were some validation checks added to partitions. If a bad partition is found, it will be marked unused. The checks were a little too strict for some cases. A fix for that went in yesterday, so try a new snap.
>
> If the problem persists, please report with full disklabel output.

The problem showed up on the latest snapshot as of now, which may well have been built before the fix you mention was incorporated. The home PC running -current has not had a problem since Saturday afternoon.

The daily insecurity reports show four changes in this partition during the last couple of months. (Note that since this is on /usr/obj on a PC running -current, newfs is run just about every day.) It seems funny that on May 29 the fsize and bsize were changed to 0, but nothing weird happened until the day after they were changed to what appeared to be more reasonable numbers.

Anyhow, in case the information is useful, the insecurity messages and current disklabel follow:

== sd0 diffs (-OLD +NEW) ==
--- /var/backups/disklabel.sd0.current Fri Apr 21 01:31:35 2006
+++ /var/backups/disklabel.sd0 Tue Apr 17 01:31:10 2007
@@ -26,4 +26,4 @@
   d:   1048128   3144384  4.2BSD   2048 16384  416 # Cyl  1236 -  1647
   e:   1048128   4192512  4.2BSD   2048 16384  416 # Cyl  1648 -  2059
   f:   8387568   5240640  4.2BSD   2048 16384  480 # Cyl  2060 -  5356
-  g:   4139682  13628208  4.2BSD   2048 16384  480 # Cyl  5357 -  6984*
+  g:   4139682  13628208  4.2BSD   2048 16384    1 # Cyl  5357 -  6984*

== sd0 diffs (-OLD +NEW) ==
--- /var/backups/disklabel.sd0.current Tue Apr 17 01:31:10 2007
+++ /var/backups/disklabel.sd0 Wed May 30 01:32:08 2007
@@ -26,4 +26,4 @@
   d:   1048128   3144384  4.2BSD   2048 16384  416 # Cyl  1236 -  1647
   e:   1048128   4192512  4.2BSD   2048 16384  416 # Cyl  1648 -  2059
   f:   8387568   5240640  4.2BSD   2048 16384  480 # Cyl  2060 -  5356
-  g:   4139682  13628208  4.2BSD   2048 16384    1 # Cyl  5357 -  6984*
+  g:   4139682  13628208  4.2BSD      0     0    1 # Cyl  5357 -  6984*

== sd0 diffs (-OLD +NEW) ==
--- /var/backups/disklabel.sd0.current Wed May 30 01:32:08 2007
+++ /var/backups/disklabel.sd0 Fri Jun 1 01:32:15 2007
@@ -26,4 +26,4 @@
   d:   1048128   3144384  4.2BSD   2048 16384  416 # Cyl  1236 -  1647
   e:   1048128   4192512  4.2BSD   2048 16384  416 # Cyl  1648 -  2059
   f:   8387568   5240640  4.2BSD   2048 16384  480 # Cyl  2060 -  5356
-  g:   4139682  13628208  4.2BSD      0     0    1 # Cyl  5357 -  6984*
+  g:   4139682  13628208  4.2BSD   2048  8192    1 # Cyl  5357 -  6984*

== sd0 diffs (-OLD +NEW) ==
--- /var/backups/disklabel.sd0.current Fri Jun 1 01:32:15 2007
+++ /var/backups/disklabel.sd0 Tue Jun 5 01:32:10 2007
@@ -26,4 +26,4 @@
   d:   1048128   3144384  4.2BSD   2048 16384  416 # Cyl  1236 -  1647
   e:   1048128   4192512  4.2BSD   2048 16384  416 # Cyl  1648 -  2059
   f:   8387568   5240640  4.2BSD   2048 16384  480 # Cyl  2060 -  5356
-  g:   4139682  13628208  4.2BSD   2048  8192    1 # Cyl  5357 -  6984*
+  g:   4139682  13628208  4.2BSD   2048 16384    1 # Cyl  5357 -  6984*

# Inside MBR partition 3: type A6 start 63 size 17767827
# /dev/rsd0c:
type: SCSI
disk: SCSI disk
label: ST39102LW
flags:
bytes/sector: 512
sectors/track: 212
tracks/cylinder: 12
sectors/cylinder: 2544
cylinders: 6962
total sectors: 17783240
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0           # microseconds
track-to-track seek: 0  # microseconds
drivedata: 0

16 partitions:
#          size    offset  fstype [fsize bsize  cpg]
  a:    2096193        63  4.2BSD   2048 16384  480 # Cyl     0*-   823
  b:    1048128   2096256    swap                   # Cyl   824 -  1235
  c:   17783240         0  unused      0     0      # Cyl     0 -  6990*
  d:    1048128   3144384  4.2BSD   2048 16384  416 # Cyl  1236 -  1647
  e:    1048128   4192512  4.2BSD   2048 16384  416 # Cyl  1648 -  2059
  f:    8387568   5240640  4.2BSD   2048 16384  480 # Cyl  2060 -  5356
  g:    4139682  13628208  4.2BSD   2048 16384    1 # Cyl  5357 -  6984*
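As a sanity check on the sd0 label above, the partition layout can be walked to confirm that each partition begins where the previous one ends and that g stops at the end of the MBR partition. This is a small Python sketch; the helper `find_gaps` is hypothetical, and the 'a' entry is read as size 2096193 at offset 63, which is an assumption because the two columns run together in the archived text.

```python
# Sketch only: checks that the sd0 partitions are laid out back to back.
# find_gaps() is a hypothetical helper, not part of any OpenBSD tool.

MBR_START, MBR_SIZE = 63, 17767827  # from "# Inside MBR partition 3" line

# (name, size, offset) in sectors, in on-disk order.
# Assumption: 'a' is size 2096193 at offset 63 (columns fused in the archive).
LAYOUT = [
    ("a", 2096193, 63),
    ("b", 1048128, 2096256),
    ("d", 1048128, 3144384),
    ("e", 1048128, 4192512),
    ("f", 8387568, 5240640),
    ("g", 4139682, 13628208),
]


def find_gaps(layout, start):
    """Return (gaps, end): partitions not starting where the previous one
    ended, and the first sector after the last partition."""
    gaps = []
    expected = start
    for name, size, offset in layout:
        if offset != expected:
            gaps.append((name, expected, offset))
        expected = offset + size
    return gaps, expected


gaps, end = find_gaps(LAYOUT, MBR_START)
print("gaps:", gaps)
print("layout ends at sector", end, "MBR partition ends at", MBR_START + MBR_SIZE)
```

With that reading of the 'a' line, the layout has no gaps or overlaps and g ends on the same sector as the MBR partition, so on this machine the posted offsets and sizes look internally consistent and only the fsize, bsize and cpg fields of g changed between backups.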
/usr/obj partition AWOL
I follow -current on an i386 at work and an amd64 at home, and rarely run into any problem which is not self-inflicted. So when I had a weird experience this weekend, I assumed it was my fault.

What happened was that after the usual sequence of [build kernel; reboot; build userland; reboot] the system complained that it could not fsck wd1j and dropped into single-user mode. wd1j is mounted on /usr/obj, and I thought that something in the last build had messed it up, so I ran newfs wd1j and got

newfs: /dev/rwd1j: Device not configured

disklabel wd1 showed partitions d-i and k-p, but no j. I added the partition, ran newfs, and everything seemed fine.

This afternoon I installed the i386 snapshot downloaded this morning (dated Jun 3 19:19) on the work PC, and after reboot it was missing the /usr/obj partition (sd0g in this case).

Everything seems to be working fine on both computers, but I didn't expect the partitions to disappear. Did nobody else run into this problem? Or did everybody else who saw it think it was too obvious to mention it to the mailing list?

Emilio
Re: /usr/obj partition AWOL
On Mon, Jun 04, 2007 at 06:02:59PM -0500, Emilio Perea wrote:
> I follow -current on an i386 at work and an amd64 at home, and rarely run into any problem which is not self-inflicted. So when I had a weird experience this weekend, I assumed it was my fault.
>
> What happened was that after the usual sequence of [build kernel; reboot; build userland; reboot] the system complained that it could not fsck wd1j and dropped into single-user mode. wd1j is mounted on /usr/obj, and I thought that something in the last build had messed it up, so I ran newfs wd1j and got
>
> newfs: /dev/rwd1j: Device not configured
>
> disklabel wd1 showed partitions d-i and k-p, but no j. I added the partition, ran newfs, and everything seemed fine.
>
> This afternoon I installed the i386 snapshot downloaded this morning (dated Jun 3 19:19) on the work PC, and after reboot it was missing the /usr/obj partition (sd0g in this case).
>
> Everything seems to be working fine on both computers, but I didn't expect the partitions to disappear. Did nobody else run into this problem? Or did everybody else who saw it think it was too obvious to mention it to the mailing list?

I had a similar problem on sparc64 with a snapshot from Jun 2. The system was unable to fsck some partitions and dropped to single-user mode. Here the problems were with the /usr, /var, /tmp and /home partitions. Some further (and larger) partitions weren't affected. I installed an older snapshot.

Any suggestions how to get this fixed or what to test/try?

Regards,
Markus