Re: Boot-time hard drive errors
On Sunday 24 February 2013 14:33:06 Ronald F. Guilmette wrote:
> I have a somewhat eclectic system, currently running (or at any rate,
> trying to run) 9.1-RELEASE. The system in question contains three
> drives, to wit:
>
> ATA-8 SATA 3.x device
> ATA-8 SATA 1.x device
> ATA-8 SATA 3.x device
>
> Previously, I had the ST3500320AS in this system, along with one other
> entirely different Seagate drive, i.e. one not shown in the list above.
> (Also, I was previously running 8.3-RELEASE and only recently updated
> to 9.1-RELEASE.)
>
> Since I reconfigured the system to its current state, i.e. with the set
> of three drives listed above, whenever I reboot the system, about 50%
> of the time, when the boot process gets down to the point where it
> would ordinarily be printing out the messages relating to ada0, ada1,
> etc. suddenly I start to get a massive and apparently endless stream
> of error messages, apparently relating to one of the drives listed
> above, but the stream actually alternates between two consecutive
> error messages, both undoubtedly related to each other.
>

Is your HDD controller SATA 3? I had a similar problem (sometimes the
machine could not boot), and it was caused by the fact that my HDD
controller is SATA 1

    Intel ICH5 SATA150 controller

and my hard disk is SATA 2

    WDC WD2500AVVS-00L2B0 01.03A01

The problem disappeared when I locked the HDD at 150 MB/s (using the
jumper that sets the HDD to SATA 1).
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "freebsd-questions-unsubscr...@freebsd.org"
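Where a jumper is not practical, a similar cap can sometimes be set in software. The sketch below assumes the port is driven by ahci(4), whose manual page documents a per-channel `hint.ahcich.X.sata_rev` tunable (1, 2 and 3 meaning 1.5, 3 and 6 Gbps). The helper name `sata_rev_hint` and the channel number 2 are made up for illustration, and an older controller such as the ICH5 mentioned above is normally handled by ata(4) rather than ahci(4), so check dmesg for the driver in use before relying on it.

```sh
# Print a loader tunable that caps one ahci(4) channel at a SATA revision.
# Usage: sata_rev_hint CHANNEL REVISION   (REVISION is 1, 2 or 3)
# Hypothetical helper; only the tunable name comes from ahci(4).
sata_rev_hint() {
    case "$2" in
        1|2|3) printf 'hint.ahcich.%s.sata_rev="%s"\n' "$1" "$2" ;;
        *) echo "revision must be 1, 2 or 3" >&2; return 1 ;;
    esac
}

# Example, as root on the affected machine (channel 2 is an assumption):
#   sata_rev_hint 2 1 >> /boot/loader.conf
```

The setting takes effect at the next boot; removing the line from /boot/loader.conf restores normal speed negotiation.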
Re: Boot-time hard drive errors
Have you tried Pause/Break to see if you could freeze the screen and
read the error message?

I would stress test all three drives to see if they pass with flying
colors. One or more of your drives could indeed be flaky; being new
means little. Also, something could be conflicting from time to time,
and that could also show up under stress testing.

Back up any important data before stress testing.

-Simon

On Sun, 24 Feb 2013 13:33:06 -0800, Ronald F. Guilmette wrote:

>I have a somewhat eclectic system, currently running (or at any rate,
>trying to run) 9.1-RELEASE. The system in question contains three
>drives, to wit:
>
>ATA-8 SATA 3.x device
>ATA-8 SATA 1.x device
>ATA-8 SATA 3.x device
>
>Previously, I had the ST3500320AS in this system, along with one other
>entirely different Seagate drive, i.e. one not shown in the list above.
>(Also, I was previously running 8.3-RELEASE and only recently updated
>to 9.1-RELEASE.)
>
>Since I reconfigured the system to its current state, i.e. with the set
>of three drives listed above, whenever I reboot the system, about 50%
>of the time, when the boot process gets down to the point where it
>would ordinarily be printing out the messages relating to ada0, ada1,
>etc. suddenly I start to get a massive and apparently endless stream
>of error messages, apparently relating to one of the drives listed
>above, but the stream actually alternates between two consecutive
>error messages, both undoubtedly related to each other.
>
>The boot process never completes, and I am just left staring at a
>screen that's displaying, in very rapid succession, first the one
>error message and then the other, and then the first one again, and
>then the second one again, and on and on like that.
>
>Unfortunately, the two error messages are being printed on the screen
>so fast (and alternating, as described above) that I cannot even read
>them, but I could just barely make out that they seem to relate to ada2...
>well, anyway, one or another of the hard drives.
>
>I do not know the proper way to rectify whatever is causing these "flaky"
>errors. I use the term "flaky" because, as I have said, this boot-time
>problem only seems to occur maybe about 50% of the time, and the rest
>of the time when I boot up there is no problem whatsoever.
>
>Because I am able to boot up successfully, with no problems whatsoever,
>a significant fraction of the time, I am inclined to think that whatever
>is causing the failure is not actually a hardware fault. (And by the way,
>the WDC drive and the Hitachi drive are both practically brand new. That
>doesn't prove anything, of course, but it does make me think that they
>are unlikely to have serious hardware faults.)
>
>I would report this problem by filing a standard PR, but as I've said
>above, I can't even read the error messages, because they are being
>printed in such rapid succession, so I'm not sure that filing a PR
>would be useful to anybody. I mean what would it say? That I'm getting
>some unspecified failure at boot time that seems to relate to the hard
>drives in this system? That kind of PR would clearly not be very helpful.
>
>Has anyone else ever encountered symptoms like those I have listed
>above, either with 9.1-RELEASE or with any other version of FreeBSD?
>
>Regards,
>rfg
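One low-risk way to start the stress testing suggested above is a full read-only pass over each drive. This is only a sketch: the function name `read_test` is made up, the device names ada0 to ada2 are taken from the original report but should be confirmed on the machine, and smartctl is assumed to be installed from the sysutils/smartmontools port.

```sh
# Read every block of a device (or file) once and discard the data.
# Nothing is written to the disk; a failing drive typically shows up as
# an I/O error from dd or as new kernel messages in dmesg.
read_test() {
    dd if="$1" of=/dev/null bs=1048576
}

# Example, as root on the affected machine (device names are assumptions):
#   for d in ada0 ada1 ada2; do
#       read_test "/dev/$d" || echo "$d: read error"
#   done
#   smartctl -t long /dev/ada2   # start the drive's own extended self-test
#   smartctl -a /dev/ada2        # later: view the self-test log and attributes
```

A read pass exercises the cable, controller and drive together, so a failure here does not by itself say which of the three is at fault.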
Boot-time hard drive errors
I have a somewhat eclectic system, currently running (or at any rate,
trying to run) 9.1-RELEASE. The system in question contains three
drives, to wit:

ATA-8 SATA 3.x device
ATA-8 SATA 1.x device
ATA-8 SATA 3.x device

Previously, I had the ST3500320AS in this system, along with one other
entirely different Seagate drive, i.e. one not shown in the list above.
(Also, I was previously running 8.3-RELEASE and only recently updated
to 9.1-RELEASE.)

Since I reconfigured the system to its current state, i.e. with the set
of three drives listed above, whenever I reboot the system, about 50%
of the time, when the boot process gets down to the point where it
would ordinarily be printing out the messages relating to ada0, ada1,
etc. suddenly I start to get a massive and apparently endless stream
of error messages, apparently relating to one of the drives listed
above, but the stream actually alternates between two consecutive
error messages, both undoubtedly related to each other.

The boot process never completes, and I am just left staring at a
screen that's displaying, in very rapid succession, first the one
error message and then the other, and then the first one again, and
then the second one again, and on and on like that.

Unfortunately, the two error messages are being printed on the screen
so fast (and alternating, as described above) that I cannot even read
them, but I could just barely make out that they seem to relate to ada2...
well, anyway, one or another of the hard drives.

I do not know the proper way to rectify whatever is causing these "flaky"
errors. I use the term "flaky" because, as I have said, this boot-time
problem only seems to occur maybe about 50% of the time, and the rest
of the time when I boot up there is no problem whatsoever.

Because I am able to boot up successfully, with no problems whatsoever,
a significant fraction of the time, I am inclined to think that whatever
is causing the failure is not actually a hardware fault. (And by the way,
the WDC drive and the Hitachi drive are both practically brand new. That
doesn't prove anything, of course, but it does make me think that they
are unlikely to have serious hardware faults.)

I would report this problem by filing a standard PR, but as I've said
above, I can't even read the error messages, because they are being
printed in such rapid succession, so I'm not sure that filing a PR
would be useful to anybody. I mean what would it say? That I'm getting
some unspecified failure at boot time that seems to relate to the hard
drives in this system? That kind of PR would clearly not be very helpful.

Has anyone else ever encountered symptoms like those I have listed
above, either with 9.1-RELEASE or with any other version of FreeBSD?

Regards,
rfg
Re: "make package" vs "pkg create"
On 24/02/2013 16:33, Joshua Isom wrote:
> I tried making a build jail, not with poudriere or tinderbox. I just
> went to the ports and ran `make -DBATCH package-recursive clean` to get
> packages created. I ran `pkg add *` in the packages/All directory, but
> all failed because of MANIFEST missing. I'm guessing this is a bug in
> the .mk files, since I do have WITH_PKGNG set. Is this a known problem
> or is there supposed to be a different way to do it? Am I just supposed
> to use poudriere or the source to keep my ports up to date until all
> the packages are rebuilt on freebsd.org?

'MANIFEST' is pretty fundamental to pkgs -- the error you are seeing is
probably because some files that aren't pkgng packages are present.
That's going to upset pkg add.

What's the history of this jail? Did it start out using pkgng, or did
it get converted from pkg_tools? If the latter, did the conversion go
smoothly? Can you use e.g. 'pkg info' in your jail to get an accurate
listing of the packages installed there?

If 'WITH_PKGNG' is set in your make.conf, then 'make package' will
certainly use pkgng to generate packages. I do that a lot in testing,
and it works just fine.

If you can clear out the non-pkgng stuff, the recommended way to do what
you intend is to generate a repository catalogue, and then use 'pkg
install'. 'pkg add' really should only be considered for installing
single packages when there is absolutely no alternative.

You should be able to run 'pkg repo /usr/ports/packages' to build a
repository catalogue for all the pkgng packages you've built in your
jail. Then you can either mount the jail's package tree on the machine
where you want to install packages, or make it available through a web
server. Set PACKAGESITE appropriately in ${LOCALBASE}/etc/pkg.conf --
for instance, this is what you'd set to use a repo made as above and
mounted in the same location:

    PACKAGESITE : file:/usr/ports/packages

You can then use 'pkg install' or 'pkg upgrade' in the usual way.

Note: you won't need to install every package in your repo -- many of
them will exist solely in order to facilitate building other packages.
If you choose the packages you specifically want, pkgng will sort out
installing the required dependencies, and moreover will set the
autoremove flags appropriately, so you could later purge things
installed solely as dependencies of packages you no longer want.

Cheers,

Matthew

--
Dr Matthew J Seaman MA, D.Phil.
PGP: http://www.infracaninophile.co.uk/pgpkey
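The steps in the message above can be strung together roughly as follows. This is a sketch, not a tested recipe: the function name `publish_and_install` and the `PKG`/`REPO_DIR` variables are invented for illustration, the `pkg update` step is an addition (it refreshes the local copy of the catalogue), and PACKAGESITE is assumed to have been set in ${LOCALBASE}/etc/pkg.conf as shown in the message.

```sh
# Hypothetical wrapper around the workflow: build catalogue, refresh, install.
REPO_DIR=${REPO_DIR:-/usr/ports/packages}   # path used in the message
PKG=${PKG:-pkg}                             # set PKG=echo for a dry run

publish_and_install() {
    # 1. Build the repository catalogue from the packages already built.
    $PKG repo "$REPO_DIR" || return 1
    # 2. Fetch the catalogue from the location named by PACKAGESITE.
    $PKG update || return 1
    # 3. Install only the packages named by the caller; pkgng pulls in
    #    the dependencies and marks them as automatically installed.
    $PKG install "$@"
}

# Example (the package name is only an illustration):
#   publish_and_install vim
```

Running it with `PKG=echo` prints the three commands without executing them, which is a cheap way to check the paths before touching the package database.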
Re: "make package" vs "pkg create"
On Sun, 24 Feb 2013 10:33:45 -0600 Joshua Isom wrote:

> I tried making a build jail, not with poudriere or tinderbox. I just
> went to the ports and ran `make -DBATCH package-recursive clean` to get
> packages created. I ran `pkg add *` in the packages/All directory, but
> all failed because of MANIFEST missing. I'm guessing this is a bug in
> the .mk files, since I do have WITH_PKGNG set.

No bug, but you will need to run pkg repo to turn the collection of
packages you've built into a pkgng repository.

> Is this a known problem
> or is there supposed to be a different way to do it? Am I just supposed
> to use poudriere or the source to keep my ports up to date until all
> the packages are rebuilt on freebsd.org?

You don't need to use poudriere, but it is very convenient once set up.
For example, updating the ports tree and rebuilding the affected ports:

    poudriere ports -u
    poudriere bulk -f /root/packages -j build

Here "build" is my build jail, and /root/packages is a file listing the
packages I want.

--
Steve O'Hara-Smith
Re: Can't build kernel
On 02/23/2013 07:04 PM, ill...@gmail.com wrote:
> On 22 February 2013 18:56, Andre Goree wrote:
>
>> cc1: warnings being treated as errors
>
> Need to set NO_WERROR perhaps?
>

Thanks for the suggestion, though it did not help. This turned out to be
user error (i.e. a failed patch). After erasing /usr/src and pulling
everything down again, I was able to rebuild without issue.

Thanks.

--
Andre Goree
an...@drenet.info
"make package" vs "pkg create"
I tried making a build jail, not with poudriere or tinderbox. I just
went to the ports and ran `make -DBATCH package-recursive clean` to get
packages created. I ran `pkg add *` in the packages/All directory, but
all failed because of MANIFEST missing. I'm guessing this is a bug in
the .mk files, since I do have WITH_PKGNG set. Is this a known problem
or is there supposed to be a different way to do it? Am I just supposed
to use poudriere or the source to keep my ports up to date until all
the packages are rebuilt on freebsd.org?
Re: Strange delays in ZFS scrub or resilver
A bit of a stab in the dark here, but are any of the disks in your array
Advanced Format drives? If so, did you create a pool with a block size
of 4k? Lastly, are all the partitions on your disks (if any) aligned to
4k sector boundaries (in the case of the Advanced Format disks)?

Rob

On Feb 23, 2013, at 11:23 PM, John Levine wrote:

> I have a raidz of three 1 TB SATA drives, in USB enclosures. One of
> the disks went bad, so I replaced it last night and it's been
> resilvering ever since. I can watch the activity lights on the disks
> and it cranks away for a minute or so, then stops for a minute, then
> cranks for a minute, and so forth. If I do a zpool status while it's
> stopped, the zpool waits until the I/O resumes, and a ^T shows it
> waiting for zio->io_cv.
>
> I'm running FreeBSD 9.1, amd64 version, totally vanilla install on a
> mini-itx box with 4GB of RAM. The root/swap disk is an SSD separate
> from the zfs disks. When the disks are active, top shows about 10%
> system time and 4% interrupt. When it isn't, top shows about 99.8%
> idle. The server isn't doing much else, and nothing else currently
> touches the disks. (They're for remote backup of a system somewhere
> else, and I have the backup job turned off until resilvering
> completes.)
>
> I'm running this on the console, and there are no disk error messages.
>
> Any idea what's going on or how to fix it? I could move the disks to
> an ESATA enclosure if USB is losing interrupts or something.
>
> My recollection is that when I've done a scrub, it does the same thing,
> work, pause, work, pause.
>
> R's,
> John
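The alignment question above comes down to simple arithmetic once the partition start is known. A minimal sketch, assuming `gpart show` reports offsets in 512-byte sectors (it may not if the USB enclosure presents 4096-byte sectors); the function name `is_4k_aligned`, the device name da0 and the pool name are placeholders.

```sh
# Succeed if a partition that starts at the given 512-byte sector
# begins on a 4096-byte boundary (4096 / 512 = 8 sectors).
is_4k_aligned() {
    [ $(( $1 % 8 )) -eq 0 ]
}

# Example on the affected machine (names are assumptions):
#   gpart show da0                        # note each partition's start sector
#   is_4k_aligned 40 && echo aligned      # 40 * 512 = 20480 = 5 * 4096
#   is_4k_aligned 34 || echo misaligned   # 34 * 512 = 17408, not a multiple
#   zdb -C poolname | grep ashift         # 12 means 4k blocks, 9 means 512
```

If the whole disk was given to ZFS without partitioning, only the ashift value is relevant.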
Re: HAST - detect failure and restore avoiding an outage?
On Sun, Feb 24, 2013 at 12:05:06PM +0200, Mikolaj Golub wrote:
> On Sat, Feb 23, 2013 at 09:51:03PM +0100, Pawel Jakub Dawidek wrote:
>
> > I'm fine with the patch except for missing breaks in switch added to
> > hastd/primary.c.
>
> Oops. Fixed. Thanks!
>
> > I'm also wondering... You count all those errors separately just to
> > print them as one number. If we do that already let's print them
> > separately, e.g.
> >
> > local i/o errors: read(0), write(3), delete(5), flush(9)
>
> The idea was that hastd provided all available counters, and hastctl
> showed only an aggregated counter just to save screen space, but if one
> wanted to write her own utility to monitor hastd, which would talk
> directly to hastd via socket, she would be able to see all counters
> separately.
>
> But your idea with writing errors in one string looks better, as it
> saves screen space and provides more detailed info. I would
> prefer a little different output though:
>
>   role: secondary
>   provname: test
>   localpath: /dev/md102
>   extentsize: 2097152 (2.0MB)
>   keepdirty: 0
>   remoteaddr: kopusha:7771
>   replication: memsync
>   status: complete
>   dirty: 0 (0B)
>   statistics:
>     reads: 13
>     writes: 521
>     deletes: 0
>     flushes: 0
>     activemap updates: 0
>     local i/o errors:
>         read: 13, write: 425, delete: 0, flush: 0
>
> but don't have a strong opinion and will be ok with yours if you don't
> like my version.

My only comment would be to keep that in one line so it is easier to
grep. And merging those two lines won't exceed 80 chars.

> > BTW. Why not count activemap update errors as write and flush errors?
>
> I need (internally) separate counters for activemap errors because
> they are updated by a different thread and I wouldn't want to
> introduce locking for error counter update operations. As hastctl was
> supposed to show an aggregated counter I didn't much bother with
> making activemap update errors count as write and flush errors. I
> improved this too in the updated patch:
>
> http://people.freebsd.org/~trociny/hast.stat_error.2.patch

The patch looks good.

--
Pawel Jakub Dawidek http://www.wheelsystems.com
FreeBSD committer http://www.FreeBSD.org
Am I Evil? Yes, I Am! http://tupytaj.pl
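To illustrate why a single line is easier to grep, here is a small sketch that totals the per-operation counters. It assumes the merged form discussed in this thread ("local i/o errors: read: N, write: N, delete: N, flush: N" on one line); the format that was finally committed may differ, and the function name `sum_io_errors` and the resource name in the usage comment are made up.

```sh
# Read "hastctl status"-style text on stdin and print the total of the
# counters found on the "local i/o errors:" line (0 if there is none).
sum_io_errors() {
    grep 'local i/o errors:' |
        sed 's|.*local i/o errors:||' |   # keep only "read: N, write: N, ..."
        tr ',' '\n' |                     # one "name: N" pair per line
        awk -F': *' '{ total += $2 } END { print total + 0 }'
}

# Example (resource name "test" as in the sample output above):
#   hastctl status test | sum_io_errors
```

With the counters split over two lines, the grep would match only the heading and the numbers would be lost, which is the point of the request above.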
Re: HAST - detect failure and restore avoiding an outage?
On Sat, Feb 23, 2013 at 09:51:03PM +0100, Pawel Jakub Dawidek wrote:

> I'm fine with the patch except for missing breaks in switch added to
> hastd/primary.c.

Oops. Fixed. Thanks!

> I'm also wondering... You count all those errors separately just to
> print them as one number. If we do that already let's print them
> separately, e.g.
>
> local i/o errors: read(0), write(3), delete(5), flush(9)

The idea was that hastd provided all available counters, and hastctl
showed only an aggregated counter just to save screen space, but if one
wanted to write her own utility to monitor hastd, which would talk
directly to hastd via socket, she would be able to see all counters
separately.

But your idea with writing errors in one string looks better, as it
saves screen space and provides more detailed info. I would prefer a
little different output though:

  role: secondary
  provname: test
  localpath: /dev/md102
  extentsize: 2097152 (2.0MB)
  keepdirty: 0
  remoteaddr: kopusha:7771
  replication: memsync
  status: complete
  dirty: 0 (0B)
  statistics:
    reads: 13
    writes: 521
    deletes: 0
    flushes: 0
    activemap updates: 0
    local i/o errors:
        read: 13, write: 425, delete: 0, flush: 0

but don't have a strong opinion and will be ok with yours if you don't
like my version.

> BTW. Why not count activemap update errors as write and flush errors?

I need (internally) separate counters for activemap errors because
they are updated by a different thread and I wouldn't want to
introduce locking for error counter update operations. As hastctl was
supposed to show an aggregated counter I didn't much bother with making
activemap update errors count as write and flush errors. I improved
this too in the updated patch:

http://people.freebsd.org/~trociny/hast.stat_error.2.patch

--
Mikolaj Golub
ZFS root, error 2 when mounting root
Basically, I tried to follow
https://wiki.freebsd.org/RootOnZFS/GPTZFSBoot/9.0-RELEASE, but ended up
with a system that didn't know how to mount /.

There are two scripts attached.

zfsnocache.sh follows the instructions on the wiki. The system booted
just fine, but when it got to the part where it mounts the root
partition, it stopped with 'error 2' 'unknown file system'. I could
import the pool when booting from LiveFS and write to it, and it was
working fine, but at boot it just refused to be mounted as /.

zfswithcache.sh is from http://strahlert.net/wordpress/?p=142, I think.
This worked with no issues.

The main difference I see between those two scripts is that one doesn't
use a cache file and the other one does, hence the names of the scripts.
But it should work without a cachefile too, shouldn't it? The other
difference is how mountpoints are set, but I can't figure out what could
be wrong there.

Can someone please explain why zfsnocache fails to mount / ?
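The attachments are not reproduced in this archive, so what follows is not a diagnosis. It is only a sketch of what a cache-file step of the kind the second script is described as using usually looks like in root-on-ZFS guides of that period. The pool name zroot, the /mnt mount point and both function names are hypothetical.

```sh
# Hypothetical names: pool "zroot", new system mounted under /mnt.
POOL=${POOL:-zroot}
ALTROOT=${ALTROOT:-/mnt}

install_zpool_cache() {
    # Ask ZFS to record the pool configuration in a cache file ...
    zpool set cachefile=/var/tmp/zpool.cache "$POOL" || return 1
    # ... and copy it to where the installed system keeps it.
    cp /var/tmp/zpool.cache "$ALTROOT/boot/zfs/zpool.cache"
}

loader_conf_lines() {
    # Lines for /boot/loader.conf that load ZFS and name the root dataset.
    printf 'zfs_load="YES"\n'
    printf 'vfs.root.mountfrom="zfs:%s"\n' "$POOL"
}

# Example, from the install environment (paths are assumptions):
#   install_zpool_cache
#   loader_conf_lines >> /mnt/boot/loader.conf
```

Comparing the installed /boot/loader.conf and the presence of /boot/zfs/zpool.cache between the two installs may help narrow down which of the two differences matters.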