ZFS root, error 2 when mounting root

2013-02-24 Thread bw.mail.lists
Basically, I tried to follow 
https://wiki.freebsd.org/RootOnZFS/GPTZFSBoot/9.0-RELEASE, but ended up 
with a system that didn't know how to mount /.


There are two scripts attached.

zfsnocache.sh follows the instructions on the wiki. The system booted 
just fine until it got to the point where it mounts the root 
partition; there it stopped with 'error 2' and 'unknown file system'. I 
could import the pool when booting from LiveFS and write to it without 
problems, but at boot it just refused to be mounted as /.


zfswithcache.sh is from http://strahlert.net/wordpress/?p=142, I think. 
This one worked with no issues.


The main difference I see between those two scripts is that one doesn't 
use a cache file and the other one does, hence the name of the scripts. 
But it should work without cachefile too, shouldn't it? The other 
difference is how mountpoints are set, but I can't figure out what could 
be wrong there.


Can someone please explain why zfsnocache fails to mount / ?
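
The thread does not establish why the first script fails. As a starting point for comparing the two installs, here is a minimal sketch that checks three things a ZFS root boot commonly relies on: zfs_load and vfs.root.mountfrom in /boot/loader.conf, and a non-empty /boot/zfs/zpool.cache. The function name check_zfs_boot_files and the /mnt mount point are assumptions made for illustration; a "missing" line is a hint to look closer, not a diagnosis.

```shell
#!/bin/sh
# check_zfs_boot_files ROOT
# ROOT is where the new system's root dataset is mounted (for example /mnt
# after importing the pool from the live system). Prints one line per check.
# Hypothetical helper; none of this comes from the attached scripts.
check_zfs_boot_files() {
    root="$1"
    conf="$root/boot/loader.conf"
    cache="$root/boot/zfs/zpool.cache"

    # The loader has to load the ZFS kernel module before root is mounted.
    if grep -q '^zfs_load="YES"' "$conf" 2>/dev/null; then
        echo "ok: zfs_load is set"
    else
        echo "missing: zfs_load=\"YES\" in $conf"
    fi

    # The kernel is usually told which dataset to mount as /.
    if grep -q '^vfs.root.mountfrom="zfs:' "$conf" 2>/dev/null; then
        echo "ok: vfs.root.mountfrom names a ZFS dataset"
    else
        echo "missing: vfs.root.mountfrom=\"zfs:...\" in $conf"
    fi

    # The cache file is what the no-cache script, by its name, leaves out.
    if [ -s "$cache" ]; then
        echo "ok: $cache exists"
    else
        echo "missing: $cache"
    fi
}

# Usage (from the live system, with the pool imported under /mnt):
#   check_zfs_boot_files /mnt
```

Running it against both installs and comparing the output shows whether the cache file is the only difference at boot time.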
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org

Re: HAST - detect failure and restore avoiding an outage?

2013-02-24 Thread Mikolaj Golub
On Sat, Feb 23, 2013 at 09:51:03PM +0100, Pawel Jakub Dawidek wrote:

> I'm fine with the patch except for missing breaks in switch added to
> hastd/primary.c.

Oops. Fixed. Thanks!

> I'm also wondering... You count all those errors separately just to
> print them as one number. If we do that already let's print them
> separately, eg.
>
>   local i/o errors: read(0), write(3), delete(5), flush(9)

The idea was that hastd would provide all available counters and hastctl
would show only an aggregated counter, just to save screen space. Anyone
who wanted to write their own utility to monitor hastd, talking
directly to hastd via its socket, would still be able to see all
counters separately.

But your idea of writing the errors in one string looks better, as it
saves screen space and provides more detailed info. I would
prefer a slightly different output though:

  role: secondary
  provname: test
  localpath: /dev/md102
  extentsize: 2097152 (2.0MB)
  keepdirty: 0
  remoteaddr: kopusha:7771
  replication: memsync
  status: complete
  dirty: 0 (0B)
  statistics:
    reads: 13
    writes: 521
    deletes: 0
    flushes: 0
    activemap updates: 0
    local i/o errors:
      read: 13, write: 425, delete: 0, flush: 0

but I don't have a strong opinion and will be OK with yours if you don't
like my version.

 
> BTW. Why not count activemap update errors as write and flush errors?

I need (internally) separate counters for activemap errors because
they are updated by a different thread and I didn't want to
introduce locking for error counter updates. As hastctl was
supposed to show an aggregated counter, I didn't bother much with how to
make activemap update errors count as write and flush errors. I
improved this too in the updated patch:

http://people.freebsd.org/~trociny/hast.stat_error.2.patch

-- 
Mikolaj Golub


Re: HAST - detect failure and restore avoiding an outage?

2013-02-24 Thread Pawel Jakub Dawidek
On Sun, Feb 24, 2013 at 12:05:06PM +0200, Mikolaj Golub wrote:
> On Sat, Feb 23, 2013 at 09:51:03PM +0100, Pawel Jakub Dawidek wrote:
>
> > I'm fine with the patch except for missing breaks in switch added to
> > hastd/primary.c.
>
> Oops. Fixed. Thanks!
>
> > I'm also wondering... You count all those errors separately just to
> > print them as one number. If we do that already let's print them
> > separately, eg.
> >
> >   local i/o errors: read(0), write(3), delete(5), flush(9)
>
> The idea was that hastd provided all available counters, and hastctl
> showed only aggregated counter just to save a screen space, but if one
> wanted to write its own utility to monitor hastd, which would talk
> directly to hastd via socket, she would be able to see all counters
> separately.
>
> But your idea with writing errors in one string looks better, as it
> allows to save a screen space and provide more detailed info. I would
> prefer a little different output though:
>
>   role: secondary
>   provname: test
>   localpath: /dev/md102
>   extentsize: 2097152 (2.0MB)
>   keepdirty: 0
>   remoteaddr: kopusha:7771
>   replication: memsync
>   status: complete
>   dirty: 0 (0B)
>   statistics:
>     reads: 13
>     writes: 521
>     deletes: 0
>     flushes: 0
>     activemap updates: 0
>     local i/o errors:
>       read: 13, write: 425, delete: 0, flush: 0
>
> but don't have a strong opinion and will be ok with yours if you don't
> like my version.

My only comment would be to keep that in one line so it is easier to
grep. And merging those two lines won't exceed 80 chars.
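
To illustrate the grep argument, here is a small sketch comparing the two layouts. The sample text reuses the numbers from the quoted message; the merged line is only one possible way of joining the two lines and is an assumption, not hastctl's actual output. The function names are made up for the example.

```shell
#!/bin/sh
# Two-line layout, as proposed in the quoted message.
two_line_status() {
    cat <<'EOF'
  local i/o errors:
    read: 13, write: 425, delete: 0, flush: 0
EOF
}

# One-line layout (hypothetical merge of the two lines above).
one_line_status() {
    cat <<'EOF'
  local i/o errors: read: 13, write: 425, delete: 0, flush: 0
EOF
}

# grep works line by line: with the two-line layout the match contains
# only the label, while the one-line layout returns the counters as well.
two_line_status | grep 'local i/o errors'
one_line_status | grep 'local i/o errors'
```

The merged sample line is well under 80 characters, which is consistent with the remark above.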

> > BTW. Why not to count activemap update errors as write and flush errors?
>
> I need (internally) separate counters for activemap errors because
> they are updated by the different thread and I wouldn't want to
> introduce locking for error counter update operations. As hastctl was
> supposed to show an aggregated counter I didn't bother much how to
> make activemap update errors to count as write and flush errors. I
> improved this too in the updated patch:
>
> http://people.freebsd.org/~trociny/hast.stat_error.2.patch

The patch looks good.

-- 
Pawel Jakub Dawidek   http://www.wheelsystems.com
FreeBSD committer http://www.FreeBSD.org
Am I Evil? Yes, I Am! http://tupytaj.pl




Re: Strange delays in ZFS scrub or resilver

2013-02-24 Thread Rob Rati
A bit of a stab in the dark here, but are any of the disks in your array 
Advanced Format drives?  If so, did you create a pool with a block size of 4k?  
Lastly, are all the partitions on your disks (if any) aligned to 4k sector 
boundaries (in the case of the Advanced Format disks)?
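
The alignment question comes down to simple arithmetic: a partition is 4k-aligned when its starting byte offset is a multiple of 4096. A minimal sketch follows, assuming the start sector is read from the first column of `gpart show` and that the drive reports 512-byte logical sectors. The helper name is_4k_aligned and the sample start sectors are made up for illustration.

```shell
#!/bin/sh
# is_4k_aligned START_SECTOR [SECTOR_SIZE]
# Succeeds when the partition's starting byte offset is a multiple of 4096.
# SECTOR_SIZE defaults to 512 (an assumption; pass 4096 for 4Kn devices).
is_4k_aligned() {
    start="$1"
    sector_size="${2:-512}"
    [ $(( (start * sector_size) % 4096 )) -eq 0 ]
}

# Hypothetical start sectors, as they might appear in `gpart show` output.
for start in 34 40 2048; do
    if is_4k_aligned "$start"; then
        echo "$start: aligned"
    else
        echo "$start: NOT aligned"
    fi
done
```

With 512-byte sectors this is the same as asking whether the start sector is divisible by 8.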

Rob

On Feb 23, 2013, at 11:23 PM, John Levine wrote:

> I have a raidz of three 1 TB SATA drives, in USB enclosures.  One of
> the disks went bad, so I replaced it last night and it's been
> resilvering ever since.  I can watch the activity lights on the disks
> and it cranks away for a minute or so, then stops for a minute, then
> cranks for a minute, and so forth.  If I do a zpool status while it's
> stopped, the zpool waits until the I/O resumes, and a ^T shows it
> waiting for zio->io_cv.
>
> I'm running FreeBSD 9.1, amd64 version, totally vanilla install on a
> mini-itx box with 4GB of RAM.  The root/swap disk is an SSD separate
> from the zfs disks.  When the disks are active, top shows about 10%
> system time and 4% interrupt.  When it isn't, top shows about 99.8%
> idle.  The server isn't doing much else, and nothing else currently
> touches the disks.  (They're for remote backup of a system somewhere
> else, and I have the backup job turned off until resilvering
> completes.)
>
> I'm running this on the console, and there are no disk error messages.
>
> Any idea what's going on or how to fix it?  I could move the disks to
> an eSATA enclosure if USB is losing interrupts or something.
>
> My recollection is that when I've done a scrub, it does the same thing,
> work, pause, work, pause.
>
> R's,
> John


make package vs pkg create

2013-02-24 Thread Joshua Isom
I tried making a build jail, not with poudriere or tinderbox.  I just 
went to the ports and ran `make -DBATCH package-recursive clean` to get 
packages created.  I ran `pkg add *` in the packages/All directory, but 
all failed because of MANIFEST missing.  I'm guessing this is a bug in 
the .mk files, since I do have WITH_PKGNG set.  Is this a known problem 
or is there supposed to be a different way to do it?  Am I just supposed 
to use poudriere or the source to keep my ports up to date until all 
the packages are rebuilt on freebsd.org?



Re: Can't build kernel

2013-02-24 Thread Andre Goree
On 02/23/2013 07:04 PM, ill...@gmail.com wrote:
> On 22 February 2013 18:56, Andre Goree an...@drenet.info wrote:
>
> > cc1: warnings being treated as errors
>
> Need to set NO_WERROR perhaps?
 

Thanks for the suggestion, though it did not help.  This turned out to
be user error (i.e. a failed patch).  After erasing /usr/src and pulling
everything down again, I was able to rebuild without issue.  Thanks.

-- 
Andre Goree
an...@drenet.info





Re: make package vs pkg create

2013-02-24 Thread Steve O'Hara-Smith
On Sun, 24 Feb 2013 10:33:45 -0600
Joshua Isom jri...@gmail.com wrote:

> I tried making a build jail, not with poudriere or tinderbox.  I just 
> went to the ports and ran `make -DBATCH package-recursive clean` to get 
> packages created.  I ran `pkg add *` in the packages/All directory, but 
> all failed because of MANIFEST missing.  I'm guessing this is a bug in 
> the .mk files, since I do have WITH_PKGNG set.

No bug, but you will need to run pkg repo  to turn the collection
of packages you've built into a pkgng repository.

> Is this a known problem 
> or is there supposed to be a different way to do it?  Am I just supposed 
> to use poudriere or the source to keep my ports up to date until all 
> the packages are rebuilt on freebsd.org?

You don't need to use poudriere, but it is very convenient once set
up. For example, to update the ports tree and rebuild the affected ports:

    poudriere ports -u
    poudriere bulk -f /root/packages -j build

Here build is my build jail, and /root/packages is a file listing the
packages I want.

-- 
Steve O'Hara-Smith st...@sohara.org


Re: make package vs pkg create

2013-02-24 Thread Matthew Seaman
On 24/02/2013 16:33, Joshua Isom wrote:
> I tried making a build jail, not with poudriere or tinderbox.  I just
> went to the ports and ran `make -DBATCH package-recursive clean` to get
> packages created.  I ran `pkg add *` in the packages/All directory, but
> all failed because of MANIFEST missing.  I'm guessing this is a bug in
> the .mk files, since I do have WITH_PKGNG set.  Is this a known problem
> or is there supposed to be a different way to do it?  Am I just supposed
> to use poudriere or the source to keep my ports up to date until all
> the packages are rebuilt on freebsd.org?

'MANIFEST' is pretty fundamental to pkgs -- probably the error you are
seeing is because there are some other sort of files that aren't pkgng
packages present.  That's going to upset pkg add.  What's the history of
this jail?  Did it start out using pkgng, or did it get converted from
pkg_tools?  If the latter, did the conversion go smoothly?  Can you use
eg. 'pkg info' in your jail to get an accurate listing of the packages
installed there?

If 'WITH_PKGNG' is set in your make.conf, then 'make package' will
certainly use pkgng to generate packages.  I do that a lot in testing,
and it works just fine.

If you can clear out the non-pkgng stuff, the recommended way to do what
you intend is to generate a repository catalogue, and then use 'pkg
install'.  'pkg add' really should only be considered for installing
single packages when there is absolutely no alternative.

You should be able to run 'pkg repo /usr/ports/packages' to build a
repository catalogue for all the pkgng packages you've built in your
jail.  Then you can either mount the jail's package tree on the machine
where you want to install packages, or make it available through a web
server.  Set PACKAGESITE appropriately in ${LOCALBASE}/etc/pkg.conf --
for instance, this is what you'd set to use a repo made as above and
mounted in the same location:

   PACKAGESITE : file:/usr/ports/packages

You can then use 'pkg install' or 'pkg upgrade' in the usual way.
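
The steps above can be sketched as a short script. This is a dry-run sketch under stated assumptions: the wrapper run and the function publish_and_install are made up for illustration, the package name in the usage comment is only an example, and PACKAGESITE is assumed to be already set in pkg.conf as shown. With DRY_RUN=1 (the default) the commands are printed rather than executed.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to run them on a host with pkgng installed.
DRY_RUN="${DRY_RUN:-1}"

# Directory holding the packages built by `make package` (path from the text).
REPO_DIR="${REPO_DIR:-/usr/ports/packages}"

# run CMD [ARGS...]: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# publish_and_install PKG...
# Builds the repository catalogue, then installs only the named packages;
# dependencies are left for pkg to resolve.
publish_and_install() {
    run pkg repo "$REPO_DIR"
    for name in "$@"; do
        run pkg install -y "$name"
    done
}

# Usage (example package name only):
#   publish_and_install editors/vim
```

Installing by name through the catalogue, instead of `pkg add *`, is what lets pkg distinguish requested packages from those pulled in as dependencies.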

Note: you won't need to install every package in your repo -- many of
them will exist solely in order to facilitate building other packages.
If you choose the packages you specifically want, pkgng will sort out
installing the required dependencies, and moreover will set the
autoremove flags appropriately, so you could later purge things
installed solely as dependencies of packages you no longer want.

Cheers,

Matthew

-- 
Dr Matthew J Seaman MA, D.Phil.
PGP: http://www.infracaninophile.co.uk/pgpkey






Boot-time hard drive errors

2013-02-24 Thread Ronald F. Guilmette


I have a somewhat eclectic system, currently running (or at any rate,
trying to run) 9.1-RELEASE.  The system in question contains three
drives, to wit:

   WDC WD1002FAEX-00Z3A0 05.01D05 ATA-8 SATA 3.x device
   ST3500320AS SD1A ATA-8 SATA 1.x device
   Hitachi HTS541010A9E680 JA0OA480 ATA-8 SATA 3.x device

Previously, I had the ST3500320AS in this system, along with one other
entirely different Seagate drive, i.e. one not shown in the list above.
(Also, I was previously running 8.3-RELEASE and only recently updated
to 9.1-RELEASE.)

Since I reconfigured the system to its current state, i.e. with the set
of three drives listed above, whenever I reboot the system, about 50%
of the time, when the boot process gets down to the point where it
would ordinarily be printing out the messages relating to ada0, ada1,
etc. suddenly I start to get a massive and apparently endless stream
of error messages, apparently relating to one of the drives listed
above, but the stream actually alternates between two consecutive
error messages, both undoubtedly related to each other.

The boot process never completes, and I am just left staring at a
screen that's displaying, in very rapid succession, first the one
error message and then the other, and then the first one again, and
then the second one again, and on and on like that.

Unfortunately, the two error messages are being printed on the screen
so fast (and alternating, as described above) that I cannot even read
them, but I could just barely make out that they seem to relate to ada2...
well, anyway, one or another of the hard drives.

I do not know the proper way to rectify whatever is causing these flaky
errors.  I use the term flaky because, as I have said, this boot-time
problem only seems to occur maybe about 50% of the time, and the rest
of the time when I boot up there is no problem whatsoever.

Because I am able to boot up successfully, with no problems whatsoever,
a significant fraction of the time, I am inclined to think that whatever
is causing the failure is not actually a hardware fault.  (And by the way,
the WDC drive and the Hitachi drive are both practically brand new.  That
doesn't prove anything, of course, but it does make me think that they
are unlikely to have serious hardware faults.)

I would report this problem by filing a standard PR, but as I've said
above, I can't even read the error messages, because they are being
printed in such rapid succession, so I'm not sure that filing a PR
would be useful to anybody.  I mean what would it say?  That I'm getting
some unspecified failure at boot time that seems to relate to the hard
drives in this system?  That kind of PR would clearly not be very helpful.

Has anyone else ever encountered symptoms like those I have listed
above, either with 9.1-RELEASE or with any other version of FreeBSD?


Regards,
rfg


Re: Boot-time hard drive errors

2013-02-24 Thread Simon


Have you tried Pause/Break to see if you could freeze the screen to get the
error message?

I would stress test all three drives to see if they pass with flying colors. One
or more of your drives could indeed be flaky; being new means little. Also,
something could be conflicting from time to time, and that could also show up
under stress testing.

Make a backup if you have important data before stress testing.
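
"Stress test" is not defined further here. One non-destructive interpretation, offered only as a sketch, is to dump each drive's SMART data and then read the whole device once. It assumes smartctl from the smartmontools port is installed and that the drives show up as ada0 to ada2; the function names are made up for the example. Commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: each command is printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

# run CMD [ARGS...]: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# read_test DISK...
# For each disk: show SMART health and attributes, then read the whole
# device once. Nothing here writes to the disks.
read_test() {
    for disk in "$@"; do
        run smartctl -a "/dev/$disk"
        run dd if="/dev/$disk" of=/dev/null bs=1m
    done
}

# Usage (device names are an assumption about this system):
#   read_test ada0 ada1 ada2
```

A full read of a 1 TB drive takes hours, so running the disks one at a time makes it easier to tell which one triggers errors.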

-Simon



Re: Boot-time hard drive errors

2013-02-24 Thread Martin Alejandro Paredes Sanchez
On Sunday 24 February 2013 14:33:06 Ronald F. Guilmette wrote:
> I have a somewhat eclectic system, currently running (or at any rate,
> trying to run) 9.1-RELEASE.  The system in question contains three
> drives, to wit:
>
>    WDC WD1002FAEX-00Z3A0 05.01D05 ATA-8 SATA 3.x device
>    ST3500320AS SD1A ATA-8 SATA 1.x device
>    Hitachi HTS541010A9E680 JA0OA480 ATA-8 SATA 3.x device
>
> Previously, I had the ST3500320AS in this system, along with one other
> entirely different Seagate drive, i.e. one not shown in the list above.
> (Also, I was previously running 8.3-RELEASE and only recently updated
> to 9.1-RELEASE.)
>
> Since I reconfigured the system to its current state, i.e. with the set
> of three drives listed above, whenever I reboot the system, about 50%
> of the time, when the boot process gets down to the point where it
> would ordinarily be printing out the messages relating to ada0, ada1,
> etc. suddenly I start to get a massive and apparently endless stream
> of error messages, apparently relating to one of the drives listed
> above, but the stream actually alternates between two consecutive
> error messages, both undoubtedly related to each other.


Is your HDD controller SATA 3?

I had a similar problem (sometimes the system could not boot), and it was
caused by my HDD controller being SATA 1:

    Intel ICH5 SATA150 controller

while my hard disk is SATA 2:

    WDC WD2500AVVS-00L2B0 01.03A01

The problem disappeared when I locked the HDD at 150 MB/s (a jumper on the HDD
limits it to SATA 1).