Re: NFS 4.1 RECLAIM_COMPLETE FS failed error in combination with ESXi client
Thanks! Please keep me updated if you find out more or when an updated version is available.

Now that I know it is working, I will start tomorrow to build a test system with 3 NFS servers (two of them in an HA pair with CARP and HAST) and several ESXi hosts, which will all access their NFS datastores over 4 uplinks with NICs on different subnets. It should always be possible to do some testing there.

andi

From: Rick Macklem
Sent: 10.03.2018 11:20 PM
To: NAGY Andreas; 'freebsd-stable@freebsd.org'
Subject: Re: NFS 4.1 RECLAIM_COMPLETE FS failed error in combination with ESXi client

NAGY Andreas wrote:
>Thanks, the "not issuing delegation" warnings disappeared with this patch.
>
>But now there are some new warnings I haven't seen so far:
>2018-03-10T13:01:39.441Z cpu8:68046)WARNING: NFS41: NFS41FSOpGetObject:2148: Failed to get object 0x43910e71b386 [36 c6b10167 9b157f95 5aa100fb 8ffcf2c1 c 2 9f22ad6d 0 0 0 0 0]: Stale file handle

I doubt these would be related to the patch. A stale FH means that the client tried to access a file via its FH after it was removed. (Normally this is a client bug, but hopefully not one that will cause grief.)

>These only appear several times after the NFS share is mounted or remounted after a connection loss.
>Everything works fine, but I haven't seen them till I applied the last patch.
>
>andi

Ok. Thanks for testing all of these patches. I will probably get cleaned up versions of them committed in April.

The main outstanding issue is the Readdir one about directory changing too much. Hopefully I can find out something about it via email.

Have fun with it, rick
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: zfs problems after rebuilding system [SOLVED]
On Sat, 2018-03-10 at 23:42 +, Pete French wrote:
> > It looks like r330745 applies fine to stable-11 without any changes,
> > and there's plenty of value in testing that as well, if you're already
> > set up for that world.
>
> I've been running the patch from the PR in production since the original
> bug report and it works fine. I haven't looked at r330745 yet, but can
> replace the PR patch with that and give it a whirl. Will take a look
> Monday at what's possible.
>
> -pete.

I based my fix heavily on that patch from the PR, but I rewrote it enough that I might've made any number of mistakes, so it needs fresh testing. The main change I made was to make it a lot less noisy while waiting (it only mentions the wait once, unless bootverbose is set, in which case it's once per second).

I also removed the logic that limited the retries to nfs and zfs, because I think we can remove all the old code related to waiting that only worked for ufs and let this new retry be the way it waits for all filesystems. But that's a bigger change we can do separately; I didn't want to hold up this fix any longer.

-- Ian
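The waiting behaviour described in this message can be sketched roughly as follows. This is only an illustration in Python, not the kernel code from r330745: the function name `wait_for_root`, its parameters, and the log text are all hypothetical, and the `verbose` flag merely stands in for bootverbose.

```python
import time


def wait_for_root(try_mount, timeout_s, verbose=False,
                  sleep=time.sleep, log=print):
    """Retry try_mount() about once per second until it succeeds
    or timeout_s seconds have been waited.

    Hypothetical illustration; none of these names come from r330745.
    Returns True if the mount succeeded, False if the wait timed out.
    """
    waited = 0
    announced = False
    while True:
        if try_mount():
            return True
        if waited >= timeout_s:
            # Give up; the caller would fall back to the mountroot prompt.
            return False
        # Mention the wait only once, or on every retry in verbose mode.
        if verbose or not announced:
            log(f"waiting for root device ({waited}s of {timeout_s}s)")
            announced = True
        sleep(1)
        waited += 1


if __name__ == "__main__":
    # Simulated device that shows up on the third attempt.
    attempts = iter([False, False, True])
    ok = wait_for_root(lambda: next(attempts), timeout_s=5,
                       sleep=lambda seconds: None)
    print("mounted" if ok else "timed out")
```

The sleep and log callables are passed in only so the sketch can be exercised without real delays; the same loop applies to any filesystem type, which matches the stated intent of not limiting the retries to nfs and zfs.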
Re: zfs problems after rebuilding system [SOLVED]
> It looks like r330745 applies fine to stable-11 without any changes,
> and there's plenty of value in testing that as well, if you're already
> set up for that world.

I've been running the patch from the PR in production since the original bug report and it works fine. I haven't looked at r330745 yet, but can replace the PR patch with that and give it a whirl. Will take a look Monday at what's possible.

-pete.
Re: zfs problems after rebuilding system [SOLVED]
On Sat, 2018-03-10 at 23:08 +, Pete French wrote:
> Ah, thank you! I haven't run current before, but as this is such an issue
> for us I'll set up an Azure machine running it and have it reboot every
> five minutes or so to check it works OK. Unfortunately the error doesn't
> show up consistently, as it's a race condition. Will let you know if it
> fails for any reason.
>
> -pete. [time to take a dive into the exciting world of current]

It looks like r330745 applies fine to stable-11 without any changes, and there's plenty of value in testing that as well, if you're already set up for that world.

-- Ian
Re: zfs problems after rebuilding system [SOLVED]
Ah, thank you! I haven't run current before, but as this is such an issue for us I'll set up an Azure machine running it and have it reboot every five minutes or so to check it works OK. Unfortunately the error doesn't show up consistently, as it's a race condition. Will let you know if it fails for any reason.

-pete. [time to take a dive into the exciting world of current]
Re: zfs problems after rebuilding system [SOLVED]
On Sat, 2018-03-03 at 16:19 +, Pete French wrote:
> > > That won't work for the boot drive.
> >
> > When no boot drive is detected early enough, the kernel goes to the
> > mountroot prompt. That seems to hold a Giant lock which inhibits
> > further progress being made. Sometimes progress can be made by trying
> > to mount unmountable partitions on other drives, but this usually goes
> > too fast, especially if the USB drive often times out.
>
> We have this problem in Azure with a ZFS root; it was fixed by the patch
> in this bug report, which actually starts off being about USB.
>
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=208882
>
> You can then set the mountroot timeout as normal and it works.
>
> I would really like this patch to be applied, but it seems to have
> languished since last summer. We use this as standard on all our cloud
> machines now, and it works very nicely.
>
> -pete.

I've committed a fix to -current (r330745) based on that patch. It would be good if people running -current who've had this problem could give it some testing. I'd like to get it merged back to 11 before the 11.2 release (and back to 10-stable as well).

With r330745 in place, the only setting that should be needed if your rootfs is on a device that is slow to arrive is vfs.mountroot.timeout= in loader.conf; the value is the number of seconds to wait before giving up and going to the mountroot prompt.

-- Ian
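For illustration, a loader.conf entry of the kind described above might look like the fragment below. The value 30 is an arbitrary example chosen here, not a figure from the thread; the right number of seconds depends on how slowly the root device shows up on a given machine.

```
# /boot/loader.conf
# Wait up to 30 seconds for the root device before giving up and
# going to the mountroot prompt (30 is an assumed example value).
vfs.mountroot.timeout="30"
```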
Re: NFS 4.1 RECLAIM_COMPLETE FS failed error in combination with ESXi client
NAGY Andreas wrote:
>Thanks, the "not issuing delegation" warnings disappeared with this patch.
>
>But now there are some new warnings I haven't seen so far:
>2018-03-10T13:01:39.441Z cpu8:68046)WARNING: NFS41: NFS41FSOpGetObject:2148: Failed to get object 0x43910e71b386 [36 c6b10167 9b157f95 5aa100fb 8ffcf2c1 c 2 9f22ad6d 0 0 0 0 0]: Stale file handle

I doubt these would be related to the patch. A stale FH means that the client tried to access a file via its FH after it was removed. (Normally this is a client bug, but hopefully not one that will cause grief.)

>These only appear several times after the NFS share is mounted or remounted after a connection loss.
>Everything works fine, but I haven't seen them till I applied the last patch.
>
>andi

Ok. Thanks for testing all of these patches. I will probably get cleaned up versions of them committed in April.

The main outstanding issue is the Readdir one about directory changing too much. Hopefully I can find out something about it via email.

Have fun with it, rick
RE: NFS 4.1 RECLAIM_COMPLETE FS failed error in combination with ESXi client
Thanks, the "not issuing delegation" warnings disappeared with this patch.

But now there are some new warnings I haven't seen so far:

2018-03-10T13:01:39.441Z cpu8:68046)WARNING: NFS41: NFS41FSOpGetObject:2148: Failed to get object 0x43910e71b386 [36 c6b10167 9b157f95 5aa100fb 8ffcf2c1 c 2 9f22ad6d 0 0 0 0 0]: Stale file handle

These only appear several times after the NFS share is mounted or remounted after a connection loss. Everything works fine, but I haven't seen them till I applied the last patch.

andi