Re: mountroot failing on zpools in Azure after upgrade from 10 to 11 due to lack of waiting for da0
[[ stupid mouse ]]

On Thu, Mar 16, 2017 at 10:01 AM, Warner Losh wrote:
> On Thu, Mar 16, 2017 at 6:06 AM, Pete French wrote:
>>> I don't like the delay and retry approach at all.
>>
>> It's not ideal, but it is what we do for UFS after all...
>>
>>> Imagine that you told the kernel that you want to mount your root from a ZFS
>>> pool which is on a USB drive which you have already thrown out. Should the
>>> kernel just keep waiting for that pool to appear?
>>
>> I'm not talking about an infinite loop here, just making it honour
>> the 'vfs.mountroot.timeout' setting like it does for UFS. So it
>> should wait for the timeout I have set and then proceed as it would if
>> there had been no timeout. Default behaviour is for it to behave as it
>> does now; it's only when you need the retry that you enable it.
>
> Put another way: with UFS it keeps retrying until the timeout expires.
> If the first try succeeds, the boot is immediate.
>
>> Right now this works for UFS, but not for ZFS, which is an inconsistency
>> that I don't like, and also means I am being forced down a UFS root
>> path if I require this.
>
> Yes. ZFS is special, but I don't think the assumptions behind its
> specialness are quite right:
>
> /*
>  * In case of ZFS and NFS we don't have a way to wait for
>  * specific device. Also do the wait if the user forced that
>  * behaviour by setting vfs.root_mount_always_wait=1.
>  */
> if (strcmp(fs, "zfs") == 0 || strstr(fs, "nfs") != NULL ||
>     dev[0] == '\0' || root_mount_always_wait != 0) {
>         vfs_mountroot_wait();
>         return (0);
> }
>
> So you can make it always succeed by forcing the wait, but that's lame...

Later we check to see if a device by a given name is present. Since ZFS
doesn't present its pool names as devices to the rest of the system, that's
not going to work quite right. That's the real reason that ZFS is special.

It isn't that we can't wait for individual devices, it's that we can't wait
for the 'mount token' that we use for what to mount to be 'ready'. NFS
suffers from the same problem, but since its device is always ready (it's
stateless), it isn't as noticeable.

Warner

___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: mountroot failing on zpools in Azure after upgrade from 10 to 11 due to lack of waiting for da0
On Thu, Mar 16, 2017 at 6:06 AM, Pete French wrote:
>> I don't like the delay and retry approach at all.
>
> It's not ideal, but it is what we do for UFS after all...
>
>> Imagine that you told the kernel that you want to mount your root from a ZFS
>> pool which is on a USB drive which you have already thrown out. Should the
>> kernel just keep waiting for that pool to appear?
>
> I'm not talking about an infinite loop here, just making it honour
> the 'vfs.mountroot.timeout' setting like it does for UFS. So it
> should wait for the timeout I have set and then proceed as it would if
> there had been no timeout. Default behaviour is for it to behave as it
> does now; it's only when you need the retry that you enable it.

Put another way: with UFS it keeps retrying until the timeout expires.
If the first try succeeds, the boot is immediate.

> Right now this works for UFS, but not for ZFS, which is an inconsistency
> that I don't like, and also means I am being forced down a UFS root
> path if I require this.

Yes. ZFS is special, but I don't think the assumptions behind its
specialness are quite right:

/*
 * In case of ZFS and NFS we don't have a way to wait for
 * specific device. Also do the wait if the user forced that
 * behaviour by setting vfs.root_mount_always_wait=1.
 */
if (strcmp(fs, "zfs") == 0 || strstr(fs, "nfs") != NULL ||
    dev[0] == '\0' || root_mount_always_wait != 0) {
        vfs_mountroot_wait();
        return (0);
}

So you can make it always succeed by forcing the wait, but that's lame...
Re: mountroot failing on zpools in Azure after upgrade from 10 to 11 due to lack of waiting for da0
> I don't like the delay and retry approach at all.

It's not ideal, but it is what we do for UFS after all...

> Imagine that you told the kernel that you want to mount your root from a ZFS
> pool which is on a USB drive which you have already thrown out. Should the
> kernel just keep waiting for that pool to appear?

I'm not talking about an infinite loop here, just making it honour the
'vfs.mountroot.timeout' setting like it does for UFS. So it should wait for
the timeout I have set and then proceed as it would if there had been no
timeout. Default behaviour is for it to behave as it does now; it's only
when you need the retry that you enable it.

Right now this works for UFS, but not for ZFS, which is an inconsistency
that I don't like, and also means I am being forced down a UFS root path if
I require this.

> Microsoft provides support for FreeBSD Hyper-V drivers.
> Please try to discuss this problem on virtualization@ or with sephe@ directly.

OK, will do, thanks...

-pete.
Re: mountroot failing on zpools in Azure after upgrade from 10 to 11 due to lack of waiting for da0
On 16/03/2017 13:18, Pete French wrote:
>> So, the kernel attempted to mount the root even before vmbus was attached
>> and, thus, before storvsc appeared and informed the kernel that it might
>> be holding the root.
>> How was ZFS supposed to know that vmbus was ever going to appear?
>> To me this sounds more like a problem with the Hyper-V drivers.
>
> I am currently running with the patch which waits for a number of seconds
> and retries the mount, and that appears to fix it. However I don't really
> like running a patched OS. How would I set about reporting this to
> Microsoft and getting it fixed, or getting the timeout patch committed?
> Preferably both, as the timeout patch is generally a useful thing to have
> working for ZFS, I think.

I don't like the delay and retry approach at all.

Imagine that you told the kernel that you want to mount your root from a ZFS
pool which is on a USB drive which you have already thrown out. Should the
kernel just keep waiting for that pool to appear?

Microsoft provides support for FreeBSD Hyper-V drivers.
Please try to discuss this problem on virtualization@ or with sephe@ directly.

-- 
Andriy Gapon
Re: mountroot failing on zpools in Azure after upgrade from 10 to 11 due to lack of waiting for da0
> So, the kernel attempted to mount the root even before vmbus was attached
> and, thus, before storvsc appeared and informed the kernel that it might be
> holding the root.
> How was ZFS supposed to know that vmbus was ever going to appear?
> To me this sounds more like a problem with the Hyper-V drivers.

I am currently running with the patch which waits for a number of seconds and
retries the mount, and that appears to fix it. However I don't really like
running a patched OS. How would I set about reporting this to Microsoft and
getting it fixed, or getting the timeout patch committed? Preferably both, as
the timeout patch is generally a useful thing to have working for ZFS, I
think.

-pete.
Re: 11-STABLE fails to build with MK_OFED enabled
Thanks - that is a better fix than my hack ;-)

On 03/15/17 20:12, Dimitry Andric wrote:
> On 15 Mar 2017, at 13:42, Pete French wrote:
>> /usr/src/sys/modules/mlx4ib/../../ofed/drivers/infiniband/hw/mlx4/sysfs.c:90:22:
>> error: format specifies type 'unsigned long long *' but the argument has
>> type 'u64 *' (aka 'unsigned long *') [-Werror,-Wformat]
>> sscanf(buf, "%llx", _ag_val);
>>             ^~~~
>>             %lx
>>
>> Fairly trivial to fix obviously - I changed it to %lx - but not sure that
>> would work on non 64 bit platforms.
>
> Hi Pete,
>
> I have merged the fix (r310232) in r315328.
>
> -Dimitry
Re: arm64 fork/swap data corruptions: A ~110 line C program demonstrating an example (Pine64+ 2GB context) [Corrected subject: arm64!]
On 2017-Mar-15, at 11:07 PM, Scott Bennett wrote:

> Mark Millard wrote:
>
>> [Something strange happened to the automatic CC: fill-in for my original
>> reply. Also I should have mentioned that for my test program if a
>> variant is made that does not fork the swapping works fine.]
>>
>> On 2017-Mar-15, at 9:37 AM, Mark Millard wrote:
>>
>>> On 2017-Mar-15, at 6:15 AM, Scott Bennett wrote:
>>>
>>>> On Tue, 14 Mar 2017 18:18:56 -0700 Mark Millard wrote:
>>>>> On 2017-Mar-14, at 4:44 PM, Bernd Walter wrote:
>>>>>
>>>>>> On Tue, Mar 14, 2017 at 03:28:53PM -0700, Mark Millard wrote:
>>>>>>> [test_check() between the fork and the wait/sleep prevents the
>>>>>>> failure from occurring. Even a small access to the memory at
>>>>>>> that stage prevents the failure. Details follow.]
>>>>>>
>>>>>> Maybe a stupid question, since you might have written it somewhere.
>>>>>> What medium do you swap to?
>>>>>> I've seen broken firmware on microSD cards doing silent data
>>>>>> corruption for some access patterns.
>>>>>
>>>>> The root filesystem is on a USB SSD on a powered hub.
>>>>>
>>>>> Only the kernel is from the microSD card.
>>>>>
>>>>> I have several examples of the USB SSD model and have
>>>>> never observed such problems in any other context.
>>>>>
>>>>> [remainder of irrelevant material deleted --SB]
>>>>
>>>> You gave a very long-winded non-answer to Bernd's question, so I'll
>>>> repeat it here. What medium do you swap to?
>>>
>>> My wording of:
>>>
>>> The root filesystem is on a USB SSD on a powered hub.
>>>
>>> was definitely poor. It should have explicitly mentioned the
>>> swap partition too:
>>>
>>> The root filesystem and swap partition are both on the same
>>> USB SSD on a powered hub.
>>>
>>> More detail from dmesg -a for usb:
>>>
>>> usbus0: 12Mbps Full Speed USB v1.0
>>> usbus1: 480Mbps High Speed USB v2.0
>>> usbus2: 12Mbps Full Speed USB v1.0
>>> usbus3: 480Mbps High Speed USB v2.0
>>> ugen0.1: at usbus0
>>> uhub0: on usbus0
>>> ugen1.1: at usbus1
>>> uhub1: on usbus1
>>> ugen2.1: at usbus2
>>> uhub2: on usbus2
>>> ugen3.1: at usbus3
>>> uhub3: on usbus3
>>> . . .
>>> uhub0: 1 port with 1 removable, self powered
>>> uhub2: 1 port with 1 removable, self powered
>>> uhub1: 1 port with 1 removable, self powered
>>> uhub3: 1 port with 1 removable, self powered
>>> ugen3.2: at usbus3
>>> uhub4 on uhub3
>>> uhub4: on usbus3
>>> uhub4: MTT enabled
>>> uhub4: 4 ports with 4 removable, self powered
>>> ugen3.3: at usbus3
>>> umass0 on uhub4
>>> umass0: on usbus3
>>> umass0: SCSI over Bulk-Only; quirks = 0x0100
>>> umass0:0:0: Attached to scbus0
>>> . . .
>>> da0 at umass-sim0 bus 0 scbus0 target 0 lun 0
>>> da0: Fixed Direct Access SPC-4 SCSI device
>>> da0: Serial Number
>>> da0: 40.000MB/s transfers
>>>
>>> (Edited a bit because there is other material interlaced, even
>>> internal to some lines. Also: I removed the serial number of the
>>> specific example device.)
>
> Thank you. That presents a much clearer picture.
>
>>>> I will further note that any kind of USB device cannot automatically
>>>> be trusted to behave properly. USB devices are notorious, for example,
>>>>
>>>> [reasons why deleted --SB]
>>>>
>>>> You should identify where you page/swap to and then try substituting
>>>> a different device for that function as a test to eliminate the possibility
>>>> of a bad storage device/controller. If the problem still occurs, that
>>>> means there still remains the possibility that another controller or its
>>>> firmware is defective instead. It could be a kernel bug, it is true, but
>>>> making sure there is no hardware or firmware error occurring is important,
>>>> and as I say, USB devices should always be considered suspect unless and
>>>> until proven innocent.
>>>
>>> [FYI: This is a ufs context, not a zfs one.]
>
> Right. It's only a Pi, after all. :-)

It is a Pine64+ 2GB, not an rpi3.

>>> I'm aware of such things. There is no evidence that has resulted in
>>> suggesting the USB devices that I can replace are a problem. Otherwise
>>> I'd not be going down this path. I only have access to the one arm64
>>> device (a Pine64+ 2GB) so I've no ability to substitution-test what
>>> is on that board.
>
> There isn't even one open port on that hub that you could plug a
> flash drive into temporarily to be the paging device?

Why do you think that I've never tried alternative devices? It is just
that the result was no evidence that my usually-in-use SSD is having a
special/local problem: the behavior continues across all such contexts
when the Pine64+ 2GB is involved. (Again I have not had access to an
alternate to the one arm64 board. That limits my substitution testing
possibilities.)

Why would you expect a Flash drive to be better than another SSD for
such testing? (The SSD that I usually use even happens to be a USB 3.0
SSD, capable of USB 3.0 speeds in USB 3.0 contexts. So is
Re: mountroot failing on zpools in Azure after upgrade from 10 to 11 due to lack of waiting for da0
On 13/03/2017 21:07, Edward Tomasz Napierała wrote:
> Are you sure the above transcript is right? There are three reasons
> I'm asking. First, you'll see the "Root mount waiting" message,
> which means the root mount code is, well, waiting for storvsc, exactly
> as expected. Second - there is no "Trying to mount root". But most
> of all - for some reason the "Mounting failed" is shown _before_ the
> "Root mount waiting", and I have no idea how this could ever happen.

Edward, your observation is not completely correct.
https://www.twisted.org.uk/~pete/914893a3-249e-4a91-851c-f467fc185eec.txt

We have:

Trying to mount root from zfs:rpool/ROOT/default []...  <===
vmbus0: version 3.0
...
storvsc0: on vmbus0
Solaris: NOTICE: Cannot find the pool label for 'rpool'
Mounting from zfs:rpool/ROOT/default failed with error 5.  <===
Root mount waiting for: storvsc  <===
...

So, the kernel attempted to mount the root even before vmbus was attached
and, thus, before storvsc appeared and informed the kernel that it might be
holding the root. How was ZFS supposed to know that vmbus was ever going to
appear? To me this sounds more like a problem with the Hyper-V drivers.

-- 
Andriy Gapon
Re: arm64 fork/swap data corruptions: A ~110 line C program demonstrating an example (Pine64+ 2GB context) [Corrected subject: arm64!]
Mark Millard wrote:

> [Something strange happened to the automatic CC: fill-in for my original
> reply. Also I should have mentioned that for my test program if a
> variant is made that does not fork the swapping works fine.]
>
> On 2017-Mar-15, at 9:37 AM, Mark Millard wrote:
>
>> On 2017-Mar-15, at 6:15 AM, Scott Bennett wrote:
>>
>>> On Tue, 14 Mar 2017 18:18:56 -0700 Mark Millard wrote:
>>>> On 2017-Mar-14, at 4:44 PM, Bernd Walter wrote:
>>>>
>>>>> On Tue, Mar 14, 2017 at 03:28:53PM -0700, Mark Millard wrote:
>>>>>> [test_check() between the fork and the wait/sleep prevents the
>>>>>> failure from occurring. Even a small access to the memory at
>>>>>> that stage prevents the failure. Details follow.]
>>>>>
>>>>> Maybe a stupid question, since you might have written it somewhere.
>>>>> What medium do you swap to?
>>>>> I've seen broken firmware on microSD cards doing silent data
>>>>> corruption for some access patterns.
>>>>
>>>> The root filesystem is on a USB SSD on a powered hub.
>>>>
>>>> Only the kernel is from the microSD card.
>>>>
>>>> I have several examples of the USB SSD model and have
>>>> never observed such problems in any other context.
>>>>
>>>> [remainder of irrelevant material deleted --SB]
>>>
>>> You gave a very long-winded non-answer to Bernd's question, so I'll
>>> repeat it here. What medium do you swap to?
>>
>> My wording of:
>>
>> The root filesystem is on a USB SSD on a powered hub.
>>
>> was definitely poor. It should have explicitly mentioned the
>> swap partition too:
>>
>> The root filesystem and swap partition are both on the same
>> USB SSD on a powered hub.
>>
>> More detail from dmesg -a for usb:
>>
>> usbus0: 12Mbps Full Speed USB v1.0
>> usbus1: 480Mbps High Speed USB v2.0
>> usbus2: 12Mbps Full Speed USB v1.0
>> usbus3: 480Mbps High Speed USB v2.0
>> ugen0.1: at usbus0
>> uhub0: on usbus0
>> ugen1.1: at usbus1
>> uhub1: on usbus1
>> ugen2.1: at usbus2
>> uhub2: on usbus2
>> ugen3.1: at usbus3
>> uhub3: on usbus3
>> . . .
>> uhub0: 1 port with 1 removable, self powered
>> uhub2: 1 port with 1 removable, self powered
>> uhub1: 1 port with 1 removable, self powered
>> uhub3: 1 port with 1 removable, self powered
>> ugen3.2: at usbus3
>> uhub4 on uhub3
>> uhub4: on usbus3
>> uhub4: MTT enabled
>> uhub4: 4 ports with 4 removable, self powered
>> ugen3.3: at usbus3
>> umass0 on uhub4
>> umass0: on usbus3
>> umass0: SCSI over Bulk-Only; quirks = 0x0100
>> umass0:0:0: Attached to scbus0
>> . . .
>> da0 at umass-sim0 bus 0 scbus0 target 0 lun 0
>> da0: Fixed Direct Access SPC-4 SCSI device
>> da0: Serial Number
>> da0: 40.000MB/s transfers
>>
>> (Edited a bit because there is other material interlaced, even
>> internal to some lines. Also: I removed the serial number of the
>> specific example device.)

Thank you. That presents a much clearer picture.

>>> I will further note that any kind of USB device cannot automatically
>>> be trusted to behave properly. USB devices are notorious, for example,
>>>
>>> [reasons why deleted --SB]
>>>
>>> You should identify where you page/swap to and then try substituting
>>> a different device for that function as a test to eliminate the possibility
>>> of a bad storage device/controller. If the problem still occurs, that
>>> means there still remains the possibility that another controller or its
>>> firmware is defective instead. It could be a kernel bug, it is true, but
>>> making sure there is no hardware or firmware error occurring is important,
>>> and as I say, USB devices should always be considered suspect unless and
>>> until proven innocent.
>>
>> [FYI: This is a ufs context, not a zfs one.]

Right. It's only a Pi, after all. :-)

>> I'm aware of such things. There is no evidence that has resulted in
>> suggesting the USB devices that I can replace are a problem. Otherwise
>> I'd not be going down this path. I only have access to the one arm64
>> device (a Pine64+ 2GB) so I've no ability to substitution-test what
>> is on that board.

There isn't even one open port on that hub that you could plug a
flash drive into temporarily to be the paging device? You could then
try your tests before returning to the normal configuration. If there
isn't an open port, then how about plugging a second hub into one of
the first hub's ports and moving the displaced device to the second
hub? A flash drive could then be plugged in. That kind of
configuration is obviously a bad idea for the long run, but just to
try your tests it ought to work well enough. (BTW, if a USB storage
device containing a paging area drops off-line even momentarily and
the system needs to use it, that is the beginning of the end, even
though it may take up to a few minutes for everything to lock up. You
probably won't be able to do an orderly