Re: FreeBSD NFS client goes into infinite retry loop
On Tuesday 23 March 2010 7:03:06 pm Rick Macklem wrote:
> On Tue, 23 Mar 2010, John Baldwin wrote:
> > Ah, I had read that patch as being a temporary testing hack. If you
> > think that would be a good approach in general that would be ok with me.
>
> Well, it kinda was. I wasn't betting on it fixing the problem, but since
> it does...
>
> I think just mapping VFS_FHTOVP() errors to ESTALE is ok. Do you think
> I should ask pjd@ about it or just go ahead with a commit?

Go ahead and fix it I think.

--
John Baldwin

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "freebsd-questions-unsubscr...@freebsd.org"
Re: FreeBSD NFS client goes into infinite retry loop
On Tue, 23 Mar 2010, John Baldwin wrote:
> Ah, I had read that patch as being a temporary testing hack. If you think
> that would be a good approach in general that would be ok with me.

Well, it kinda was. I wasn't betting on it fixing the problem, but since
it does...

I think just mapping VFS_FHTOVP() errors to ESTALE is ok. Do you think
I should ask pjd@ about it or just go ahead with a commit?

Thanks for the help, rick
Re: FreeBSD NFS client goes into infinite retry loop
On Monday 22 March 2010 7:53:23 pm Rick Macklem wrote:
> > That I have no idea on. Maybe Rick can chime in? I'm actually not sure
> > why we would want to treat a FHTOVP failure as anything but an ESTALE
> > error in the NFS server to be honest.
>
> As far as I know, only if the underlying file system somehow has a
> situation where the file handle can't be translated at that point in
> time, but could be able to later. I have no idea if any file system is
> like that, and I don't think such a file system would be an appropriate
> choice for an NFS server, even if such a beast exists. (Even then,
> although FreeBSD's client assumes EIO might recover on a retry, that
> isn't specified in any RFC, as far as I know.)
>
> That's why I proposed a patch that simply translates all VFS_FHTOVP()
> errors to ESTALE in the NFS server. (It seems simpler than chasing down
> cases in all the underlying file systems?)

Ah, I had read that patch as being a temporary testing hack. If you think
that would be a good approach in general that would be ok with me.

--
John Baldwin
Re: Fwd: Re: FreeBSD NFS client goes into infinite retry loop
On 03/22/10 19:53, Rick Macklem wrote:
> On Mon, 22 Mar 2010, John Baldwin wrote:
>> It looks like it also returns ESTALE when the inode is invalid
>> (< ROOTINO || > max inodes?) - would an unlinked file in FFS referenced
>> at a later time report an invalid inode?
>
> I'm no ufs guy, but the only way I can think of is if the file system on
> the server was newfs'd with fewer i-nodes? (Unlikely, but...)
> (Basically, it is safe to return ESTALE for anything that is not a
> transient failure that could recover on a retry.)
>
>> But back to your point, zfs_zget() seems to be failing and returning
>> the EINVAL before zfs_fhtovp() even has a chance to set and check
>> zp_gen. I'm trying to get some more details through the use of
>> gratuitous dprintf()'s, but they don't seem to be making it to any logs
>> or the console even with vfs.zfs.debug=1 set. Any pointers on how to
>> get these dprintf() calls working?
>
> I know diddly (as in absolutely nothing) about zfs.
>
>> That I have no idea on. Maybe Rick can chime in? I'm actually not sure
>> why we would want to treat a FHTOVP failure as anything but an ESTALE
>> error in the NFS server to be honest.
>
> As far as I know, only if the underlying file system somehow has a
> situation where the file handle can't be translated at that point in
> time, but could be able to later. I have no idea if any file system is
> like that, and I don't think such a file system would be an appropriate
> choice for an NFS server, even if such a beast exists. (Even then,
> although FreeBSD's client assumes EIO might recover on a retry, that
> isn't specified in any RFC, as far as I know.)
>
> That's why I proposed a patch that simply translates all VFS_FHTOVP()
> errors to ESTALE in the NFS server. (It seems simpler than chasing down
> cases in all the underlying file systems?)
>
> rick, chiming in :-)

Makes sense to me. I'll continue to bang on NFS with your initial patch in
my lab for a while. Should I open a PR for further discussion / resolution
of the issue in -CURRENT / STABLE?

Thanks,
Steve Polyack
Re: FreeBSD NFS client goes into infinite retry loop
On Mon, 22 Mar 2010, John Baldwin wrote:
> It looks like it also returns ESTALE when the inode is invalid
> (< ROOTINO || > max inodes?) - would an unlinked file in FFS referenced
> at a later time report an invalid inode?

I'm no ufs guy, but the only way I can think of is if the file system on
the server was newfs'd with fewer i-nodes? (Unlikely, but...) (Basically,
it is safe to return ESTALE for anything that is not a transient failure
that could recover on a retry.)

> But back to your point, zfs_zget() seems to be failing and returning the
> EINVAL before zfs_fhtovp() even has a chance to set and check zp_gen.
> I'm trying to get some more details through the use of gratuitous
> dprintf()'s, but they don't seem to be making it to any logs or the
> console even with vfs.zfs.debug=1 set. Any pointers on how to get these
> dprintf() calls working?

I know diddly (as in absolutely nothing) about zfs.

> That I have no idea on. Maybe Rick can chime in? I'm actually not sure
> why we would want to treat a FHTOVP failure as anything but an ESTALE
> error in the NFS server to be honest.

As far as I know, only if the underlying file system somehow has a
situation where the file handle can't be translated at that point in time,
but could be able to later. I have no idea if any file system is like
that, and I don't think such a file system would be an appropriate choice
for an NFS server, even if such a beast exists. (Even then, although
FreeBSD's client assumes EIO might recover on a retry, that isn't
specified in any RFC, as far as I know.)

That's why I proposed a patch that simply translates all VFS_FHTOVP()
errors to ESTALE in the NFS server. (It seems simpler than chasing down
cases in all the underlying file systems?)

rick, chiming in :-)
Re: FreeBSD NFS client goes into infinite retry loop
On 03/22/10 13:39, John Baldwin wrote:
> On Monday 22 March 2010 12:44:04 pm Steve Polyack wrote:
>> On 03/22/10 12:00, John Baldwin wrote:
>>> On Monday 22 March 2010 11:47:43 am Steve Polyack wrote:
>>>> On 03/22/10 10:52, Steve Polyack wrote:
>>>>> On 3/19/2010 11:27 PM, Rick Macklem wrote:
>>>>>> On Fri, 19 Mar 2010, Steve Polyack wrote:
>>>>>> [good stuff snipped]
>>>>>>> This makes sense. According to wireshark, the server is indeed
>>>>>>> transmitting "Status: NFS3ERR_IO (5)". Perhaps this should be
>>>>>>> STALE instead; it sounds more correct than marking it a general IO
>>>>>>> error. Also, the NFS server is serving its share off of a ZFS
>>>>>>> filesystem, if it makes any difference. I suppose ZFS could be
>>>>>>> talking to the NFS server threads with some mismatched language,
>>>>>>> but I doubt it.
>>>>>>
>>>>>> Ok, now I think we're making progress. If VFS_FHTOVP() doesn't
>>>>>> return ESTALE when the file no longer exists, the NFS server
>>>>>> returns whatever error it has returned.
>>>>>>
>>>>>> So, either VFS_FHTOVP() succeeds after the file has been deleted,
>>>>>> which would be a problem that needs to be fixed within ZFS
>>>>>> OR
>>>>>> ZFS returns an error other than ESTALE when it doesn't exist.
>>>>>>
>>>>>> Try the following patch on the server (which just makes any error
>>>>>> returned by VFS_FHTOVP() into ESTALE) and see if that helps.
>>>>>>
>>>>>> --- nfsserver/nfs_srvsubs.c.sav  2010-03-19 22:06:43.0 -0400
>>>>>> +++ nfsserver/nfs_srvsubs.c      2010-03-19 22:07:22.0 -0400
>>>>>> @@ -1127,6 +1127,8 @@
>>>>>>           }
>>>>>>       }
>>>>>>       error = VFS_FHTOVP(mp, &fhp->fh_fid, vpp);
>>>>>> +     if (error != 0)
>>>>>> +             error = ESTALE;
>>>>>>       vfs_unbusy(mp);
>>>>>>       if (error)
>>>>>>           goto out;
>>>>>>
>>>>>> Please let me know if the patch helps, rick
>>>>>
>>>>> The patch seems to fix the bad behavior. Running with the patch, I
>>>>> see the following output from my patch (return code of nfs_doio from
>>>>> within nfsiod):
>>>>> nfssvc_iod: iod 0 nfs_doio returned errno: 70
>>>>>
>>>>> Furthermore, when inspecting the transaction with Wireshark, after
>>>>> deleting the file on the NFS server it looks like there is only a
>>>>> single error. This time it is a reply to a V3 Lookup call that
>>>>> contains a status of "NFS3ERR_NOENT (2)" coming from the NFS server.
>>>>> The client also does not repeatedly try to complete the failed
>>>>> request.
>>>>>
>>>>> Any suggestions on the next step here? Based on what you said it
>>>>> looks like ZFS is falsely reporting an IO error to VFS instead of
>>>>> ESTALE / NOENT. I tried looking around zfs_fhtovp() and only saw
>>>>> returns of EINVAL, but I'm not even sure I'm looking in the right
>>>>> place.
>>>>
>>>> Further on down the rabbit hole... here's the piece in zfs_fhtovp()
>>>> where it's kicking out EINVAL instead of ESTALE - the following patch
>>>> corrects the behavior, but of course also suggests further digging
>>>> within the zfs_zget() function to ensure that _it_ is returning the
>>>> correct thing and whether or not it needs to be handled there or
>>>> within zfs_fhtovp().
>>>>
>>>> --- src-orig/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c  2010-03-22 11:41:21.0 -0400
>>>> +++ src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c  2010-03-22 16:25:21.0 -0400
>>>> @@ -1246,7 +1246,7 @@
>>>>       dprintf("getting %llu [%u mask %llx]\n", object, fid_gen, gen_mask);
>>>>       if (err = zfs_zget(zfsvfs, object, &zp)) {
>>>>               ZFS_EXIT(zfsvfs);
>>>> -             return (err);
>>>> +             return (ESTALE);
>>>>       }
>>>>       zp_gen = zp->z_phys->zp_gen & gen_mask;
>>>>       if (zp_gen == 0)
>>>
>>> So the odd thing here is that ffs_fhtovp() doesn't return ESTALE if
>>> VFS_VGET() (which calls ffs_vget()) fails, it only returns ESTALE if
>>> the generation count doesn't match.
>>
>> It looks like it also returns ESTALE when the inode is invalid
>> (< ROOTINO || > max inodes?) - would an unlinked file in FFS referenced
>> at a later time report an invalid inode?
>>
>> But back to your point, zfs_zget() seems to be failing and returning
>> the EINVAL before zfs_fhtovp() even has a chance to set and check
>> zp_gen. I'm trying to get some more details through the use of
>> gratuitous dprintf()'s, but they don't seem to be making it to any logs
>> or the console even with vfs.zfs.debug=1 set. Any pointers on how to
>> get these dprintf() calls working?
>
> That I have no idea on. Maybe Rick can chime in? I'm actually not sure
> why we would want to treat a FHTOVP failure as anything but an ESTALE
> error in the NFS server to be honest.

I resorted to changing dprintf()s to printf()s. The failure in
zfs_fhtovp() is indeed from zfs_zget(), which fails right at the top where
it calls dmu_bonus_hold():

Mar 22 16:55:44 zfs-dev kernel: zfs_zget(): dmu_bonus_hold() failed, returning err: 17
Mar 22 16:55:44 zfs-dev kernel: zfs_fhtovp(): zfs_zget() failed, bailing out with err: 17

errno 17 seems to map to EEXIST. In zfs_zget():

	err = dmu_bonus_hold(zfsvfs->z_os, obj_num, NULL, &db);
	if (err) {
		ZFS_OBJ_HOLD_EXIT(zfsvfs, obj_num);
		printf("zfs_zget(): dmu_bonus_hold()
Re: FreeBSD NFS client goes into infinite retry loop
On Monday 22 March 2010 12:44:04 pm Steve Polyack wrote:
> On 03/22/10 12:00, John Baldwin wrote:
>> On Monday 22 March 2010 11:47:43 am Steve Polyack wrote:
>>> On 03/22/10 10:52, Steve Polyack wrote:
>>>> On 3/19/2010 11:27 PM, Rick Macklem wrote:
>>>>> On Fri, 19 Mar 2010, Steve Polyack wrote:
>>>>> [good stuff snipped]
>>>>>> This makes sense. According to wireshark, the server is indeed
>>>>>> transmitting "Status: NFS3ERR_IO (5)". Perhaps this should be STALE
>>>>>> instead; it sounds more correct than marking it a general IO error.
>>>>>> Also, the NFS server is serving its share off of a ZFS filesystem,
>>>>>> if it makes any difference. I suppose ZFS could be talking to the
>>>>>> NFS server threads with some mismatched language, but I doubt it.
>>>>>
>>>>> Ok, now I think we're making progress. If VFS_FHTOVP() doesn't
>>>>> return ESTALE when the file no longer exists, the NFS server returns
>>>>> whatever error it has returned.
>>>>>
>>>>> So, either VFS_FHTOVP() succeeds after the file has been deleted,
>>>>> which would be a problem that needs to be fixed within ZFS
>>>>> OR
>>>>> ZFS returns an error other than ESTALE when it doesn't exist.
>>>>>
>>>>> Try the following patch on the server (which just makes any error
>>>>> returned by VFS_FHTOVP() into ESTALE) and see if that helps.
>>>>>
>>>>> --- nfsserver/nfs_srvsubs.c.sav  2010-03-19 22:06:43.0 -0400
>>>>> +++ nfsserver/nfs_srvsubs.c      2010-03-19 22:07:22.0 -0400
>>>>> @@ -1127,6 +1127,8 @@
>>>>>          }
>>>>>      }
>>>>>      error = VFS_FHTOVP(mp, &fhp->fh_fid, vpp);
>>>>> +    if (error != 0)
>>>>> +            error = ESTALE;
>>>>>      vfs_unbusy(mp);
>>>>>      if (error)
>>>>>          goto out;
>>>>>
>>>>> Please let me know if the patch helps, rick
>>>>
>>>> The patch seems to fix the bad behavior. Running with the patch, I
>>>> see the following output from my patch (return code of nfs_doio from
>>>> within nfsiod):
>>>> nfssvc_iod: iod 0 nfs_doio returned errno: 70
>>>>
>>>> Furthermore, when inspecting the transaction with Wireshark, after
>>>> deleting the file on the NFS server it looks like there is only a
>>>> single error. This time it is a reply to a V3 Lookup call that
>>>> contains a status of "NFS3ERR_NOENT (2)" coming from the NFS server.
>>>> The client also does not repeatedly try to complete the failed
>>>> request.
>>>>
>>>> Any suggestions on the next step here? Based on what you said it
>>>> looks like ZFS is falsely reporting an IO error to VFS instead of
>>>> ESTALE / NOENT. I tried looking around zfs_fhtovp() and only saw
>>>> returns of EINVAL, but I'm not even sure I'm looking in the right
>>>> place.
>>>
>>> Further on down the rabbit hole... here's the piece in zfs_fhtovp()
>>> where it's kicking out EINVAL instead of ESTALE - the following patch
>>> corrects the behavior, but of course also suggests further digging
>>> within the zfs_zget() function to ensure that _it_ is returning the
>>> correct thing and whether or not it needs to be handled there or
>>> within zfs_fhtovp().
>>>
>>> --- src-orig/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c  2010-03-22 11:41:21.0 -0400
>>> +++ src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c  2010-03-22 16:25:21.0 -0400
>>> @@ -1246,7 +1246,7 @@
>>>      dprintf("getting %llu [%u mask %llx]\n", object, fid_gen, gen_mask);
>>>      if (err = zfs_zget(zfsvfs, object, &zp)) {
>>>              ZFS_EXIT(zfsvfs);
>>> -            return (err);
>>> +            return (ESTALE);
>>>      }
>>>      zp_gen = zp->z_phys->zp_gen & gen_mask;
>>>      if (zp_gen == 0)
>>
>> So the odd thing here is that ffs_fhtovp() doesn't return ESTALE if
>> VFS_VGET() (which calls ffs_vget()) fails, it only returns ESTALE if
>> the generation count doesn't match.
>
> It looks like it also returns ESTALE when the inode is invalid
> (< ROOTINO || > max inodes?) - would an unlinked file in FFS referenced
> at a later time report an invalid inode?
>
> But back to your point, zfs_zget() seems to be failing and returning the
> EINVAL before zfs_fhtovp() even has a chance to set and check zp_gen.
> I'm trying to get some more details through the use of gratuitous
> dprintf()'s, but they don't seem to be making it to any logs or the
> console even with vfs.zfs.debug=1 set. Any pointers on how to get these
> dprintf() calls working?

That I have no idea on. Maybe Rick can chime in? I'm actually not sure why
we would want to treat a FHTOVP failure as anything but an ESTALE error in
the NFS server to be honest.

--
John Baldwin
Re: FreeBSD NFS client goes into infinite retry loop
On 03/22/10 12:00, John Baldwin wrote:
> On Monday 22 March 2010 11:47:43 am Steve Polyack wrote:
>> On 03/22/10 10:52, Steve Polyack wrote:
>>> On 3/19/2010 11:27 PM, Rick Macklem wrote:
>>>> On Fri, 19 Mar 2010, Steve Polyack wrote:
>>>> [good stuff snipped]
>>>>> This makes sense. According to wireshark, the server is indeed
>>>>> transmitting "Status: NFS3ERR_IO (5)". Perhaps this should be STALE
>>>>> instead; it sounds more correct than marking it a general IO error.
>>>>> Also, the NFS server is serving its share off of a ZFS filesystem,
>>>>> if it makes any difference. I suppose ZFS could be talking to the
>>>>> NFS server threads with some mismatched language, but I doubt it.
>>>>
>>>> Ok, now I think we're making progress. If VFS_FHTOVP() doesn't return
>>>> ESTALE when the file no longer exists, the NFS server returns
>>>> whatever error it has returned.
>>>>
>>>> So, either VFS_FHTOVP() succeeds after the file has been deleted,
>>>> which would be a problem that needs to be fixed within ZFS
>>>> OR
>>>> ZFS returns an error other than ESTALE when it doesn't exist.
>>>>
>>>> Try the following patch on the server (which just makes any error
>>>> returned by VFS_FHTOVP() into ESTALE) and see if that helps.
>>>>
>>>> --- nfsserver/nfs_srvsubs.c.sav  2010-03-19 22:06:43.0 -0400
>>>> +++ nfsserver/nfs_srvsubs.c      2010-03-19 22:07:22.0 -0400
>>>> @@ -1127,6 +1127,8 @@
>>>>         }
>>>>     }
>>>>     error = VFS_FHTOVP(mp, &fhp->fh_fid, vpp);
>>>> +   if (error != 0)
>>>> +           error = ESTALE;
>>>>     vfs_unbusy(mp);
>>>>     if (error)
>>>>         goto out;
>>>>
>>>> Please let me know if the patch helps, rick
>>>
>>> The patch seems to fix the bad behavior. Running with the patch, I see
>>> the following output from my patch (return code of nfs_doio from
>>> within nfsiod):
>>> nfssvc_iod: iod 0 nfs_doio returned errno: 70
>>>
>>> Furthermore, when inspecting the transaction with Wireshark, after
>>> deleting the file on the NFS server it looks like there is only a
>>> single error. This time it is a reply to a V3 Lookup call that
>>> contains a status of "NFS3ERR_NOENT (2)" coming from the NFS server.
>>> The client also does not repeatedly try to complete the failed
>>> request.
>>>
>>> Any suggestions on the next step here? Based on what you said it looks
>>> like ZFS is falsely reporting an IO error to VFS instead of ESTALE /
>>> NOENT. I tried looking around zfs_fhtovp() and only saw returns of
>>> EINVAL, but I'm not even sure I'm looking in the right place.
>>
>> Further on down the rabbit hole... here's the piece in zfs_fhtovp()
>> where it's kicking out EINVAL instead of ESTALE - the following patch
>> corrects the behavior, but of course also suggests further digging
>> within the zfs_zget() function to ensure that _it_ is returning the
>> correct thing and whether or not it needs to be handled there or within
>> zfs_fhtovp().
>>
>> --- src-orig/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c  2010-03-22 11:41:21.0 -0400
>> +++ src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c  2010-03-22 16:25:21.0 -0400
>> @@ -1246,7 +1246,7 @@
>>      dprintf("getting %llu [%u mask %llx]\n", object, fid_gen, gen_mask);
>>      if (err = zfs_zget(zfsvfs, object, &zp)) {
>>              ZFS_EXIT(zfsvfs);
>> -            return (err);
>> +            return (ESTALE);
>>      }
>>      zp_gen = zp->z_phys->zp_gen & gen_mask;
>>      if (zp_gen == 0)
>
> So the odd thing here is that ffs_fhtovp() doesn't return ESTALE if
> VFS_VGET() (which calls ffs_vget()) fails, it only returns ESTALE if the
> generation count doesn't match.

It looks like it also returns ESTALE when the inode is invalid (< ROOTINO
|| > max inodes?) - would an unlinked file in FFS referenced at a later
time report an invalid inode?

But back to your point, zfs_zget() seems to be failing and returning the
EINVAL before zfs_fhtovp() even has a chance to set and check zp_gen. I'm
trying to get some more details through the use of gratuitous dprintf()'s,
but they don't seem to be making it to any logs or the console even with
vfs.zfs.debug=1 set. Any pointers on how to get these dprintf() calls
working?

Thanks again.
Re: FreeBSD NFS client goes into infinite retry loop
On Monday 22 March 2010 11:47:43 am Steve Polyack wrote:
> On 03/22/10 10:52, Steve Polyack wrote:
>> On 3/19/2010 11:27 PM, Rick Macklem wrote:
>>> On Fri, 19 Mar 2010, Steve Polyack wrote:
>>> [good stuff snipped]
>>>> This makes sense. According to wireshark, the server is indeed
>>>> transmitting "Status: NFS3ERR_IO (5)". Perhaps this should be STALE
>>>> instead; it sounds more correct than marking it a general IO error.
>>>> Also, the NFS server is serving its share off of a ZFS filesystem, if
>>>> it makes any difference. I suppose ZFS could be talking to the NFS
>>>> server threads with some mismatched language, but I doubt it.
>>>
>>> Ok, now I think we're making progress. If VFS_FHTOVP() doesn't return
>>> ESTALE when the file no longer exists, the NFS server returns whatever
>>> error it has returned.
>>>
>>> So, either VFS_FHTOVP() succeeds after the file has been deleted,
>>> which would be a problem that needs to be fixed within ZFS
>>> OR
>>> ZFS returns an error other than ESTALE when it doesn't exist.
>>>
>>> Try the following patch on the server (which just makes any error
>>> returned by VFS_FHTOVP() into ESTALE) and see if that helps.
>>>
>>> --- nfsserver/nfs_srvsubs.c.sav  2010-03-19 22:06:43.0 -0400
>>> +++ nfsserver/nfs_srvsubs.c      2010-03-19 22:07:22.0 -0400
>>> @@ -1127,6 +1127,8 @@
>>>        }
>>>    }
>>>    error = VFS_FHTOVP(mp, &fhp->fh_fid, vpp);
>>> +  if (error != 0)
>>> +          error = ESTALE;
>>>    vfs_unbusy(mp);
>>>    if (error)
>>>        goto out;
>>>
>>> Please let me know if the patch helps, rick
>>
>> The patch seems to fix the bad behavior. Running with the patch, I see
>> the following output from my patch (return code of nfs_doio from within
>> nfsiod):
>> nfssvc_iod: iod 0 nfs_doio returned errno: 70
>>
>> Furthermore, when inspecting the transaction with Wireshark, after
>> deleting the file on the NFS server it looks like there is only a
>> single error. This time it is a reply to a V3 Lookup call that contains
>> a status of "NFS3ERR_NOENT (2)" coming from the NFS server. The client
>> also does not repeatedly try to complete the failed request.
>>
>> Any suggestions on the next step here? Based on what you said it looks
>> like ZFS is falsely reporting an IO error to VFS instead of ESTALE /
>> NOENT. I tried looking around zfs_fhtovp() and only saw returns of
>> EINVAL, but I'm not even sure I'm looking in the right place.
>
> Further on down the rabbit hole... here's the piece in zfs_fhtovp()
> where it's kicking out EINVAL instead of ESTALE - the following patch
> corrects the behavior, but of course also suggests further digging
> within the zfs_zget() function to ensure that _it_ is returning the
> correct thing and whether or not it needs to be handled there or within
> zfs_fhtovp().
>
> --- src-orig/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c  2010-03-22 11:41:21.0 -0400
> +++ src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c  2010-03-22 16:25:21.0 -0400
> @@ -1246,7 +1246,7 @@
>      dprintf("getting %llu [%u mask %llx]\n", object, fid_gen, gen_mask);
>      if (err = zfs_zget(zfsvfs, object, &zp)) {
>              ZFS_EXIT(zfsvfs);
> -            return (err);
> +            return (ESTALE);
>      }
>      zp_gen = zp->z_phys->zp_gen & gen_mask;
>      if (zp_gen == 0)

So the odd thing here is that ffs_fhtovp() doesn't return ESTALE if
VFS_VGET() (which calls ffs_vget()) fails, it only returns ESTALE if the
generation count doesn't match.

--
John Baldwin
Re: FreeBSD NFS client goes into infinite retry loop
On 03/22/10 10:52, Steve Polyack wrote:
> On 3/19/2010 11:27 PM, Rick Macklem wrote:
>> On Fri, 19 Mar 2010, Steve Polyack wrote:
>> [good stuff snipped]
>>> This makes sense. According to wireshark, the server is indeed
>>> transmitting "Status: NFS3ERR_IO (5)". Perhaps this should be STALE
>>> instead; it sounds more correct than marking it a general IO error.
>>> Also, the NFS server is serving its share off of a ZFS filesystem, if
>>> it makes any difference. I suppose ZFS could be talking to the NFS
>>> server threads with some mismatched language, but I doubt it.
>>
>> Ok, now I think we're making progress. If VFS_FHTOVP() doesn't return
>> ESTALE when the file no longer exists, the NFS server returns whatever
>> error it has returned.
>>
>> So, either VFS_FHTOVP() succeeds after the file has been deleted, which
>> would be a problem that needs to be fixed within ZFS
>> OR
>> ZFS returns an error other than ESTALE when it doesn't exist.
>>
>> Try the following patch on the server (which just makes any error
>> returned by VFS_FHTOVP() into ESTALE) and see if that helps.
>>
>> --- nfsserver/nfs_srvsubs.c.sav  2010-03-19 22:06:43.0 -0400
>> +++ nfsserver/nfs_srvsubs.c      2010-03-19 22:07:22.0 -0400
>> @@ -1127,6 +1127,8 @@
>>        }
>>    }
>>    error = VFS_FHTOVP(mp, &fhp->fh_fid, vpp);
>> +  if (error != 0)
>> +          error = ESTALE;
>>    vfs_unbusy(mp);
>>    if (error)
>>        goto out;
>>
>> Please let me know if the patch helps, rick
>
> The patch seems to fix the bad behavior. Running with the patch, I see
> the following output from my patch (return code of nfs_doio from within
> nfsiod):
> nfssvc_iod: iod 0 nfs_doio returned errno: 70
>
> Furthermore, when inspecting the transaction with Wireshark, after
> deleting the file on the NFS server it looks like there is only a single
> error. This time it is a reply to a V3 Lookup call that contains a
> status of "NFS3ERR_NOENT (2)" coming from the NFS server. The client
> also does not repeatedly try to complete the failed request.
>
> Any suggestions on the next step here? Based on what you said it looks
> like ZFS is falsely reporting an IO error to VFS instead of ESTALE /
> NOENT. I tried looking around zfs_fhtovp() and only saw returns of
> EINVAL, but I'm not even sure I'm looking in the right place.

Further on down the rabbit hole... here's the piece in zfs_fhtovp() where
it's kicking out EINVAL instead of ESTALE - the following patch corrects
the behavior, but of course also suggests further digging within the
zfs_zget() function to ensure that _it_ is returning the correct thing and
whether or not it needs to be handled there or within zfs_fhtovp().

--- src-orig/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c  2010-03-22 11:41:21.0 -0400
+++ src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c  2010-03-22 16:25:21.0 -0400
@@ -1246,7 +1246,7 @@
 	dprintf("getting %llu [%u mask %llx]\n", object, fid_gen, gen_mask);
 	if (err = zfs_zget(zfsvfs, object, &zp)) {
 		ZFS_EXIT(zfsvfs);
-		return (err);
+		return (ESTALE);
 	}
 	zp_gen = zp->z_phys->zp_gen & gen_mask;
 	if (zp_gen == 0)
Re: FreeBSD NFS client goes into infinite retry loop
On 3/19/2010 11:27 PM, Rick Macklem wrote:
> On Fri, 19 Mar 2010, Steve Polyack wrote:
> [good stuff snipped]
>> This makes sense. According to wireshark, the server is indeed
>> transmitting "Status: NFS3ERR_IO (5)". Perhaps this should be STALE
>> instead; it sounds more correct than marking it a general IO error.
>> Also, the NFS server is serving its share off of a ZFS filesystem, if
>> it makes any difference. I suppose ZFS could be talking to the NFS
>> server threads with some mismatched language, but I doubt it.
>
> Ok, now I think we're making progress. If VFS_FHTOVP() doesn't return
> ESTALE when the file no longer exists, the NFS server returns whatever
> error it has returned.
>
> So, either VFS_FHTOVP() succeeds after the file has been deleted, which
> would be a problem that needs to be fixed within ZFS
> OR
> ZFS returns an error other than ESTALE when it doesn't exist.
>
> Try the following patch on the server (which just makes any error
> returned by VFS_FHTOVP() into ESTALE) and see if that helps.
>
> --- nfsserver/nfs_srvsubs.c.sav  2010-03-19 22:06:43.0 -0400
> +++ nfsserver/nfs_srvsubs.c      2010-03-19 22:07:22.0 -0400
> @@ -1127,6 +1127,8 @@
>       }
>   }
>   error = VFS_FHTOVP(mp, &fhp->fh_fid, vpp);
> + if (error != 0)
> +         error = ESTALE;
>   vfs_unbusy(mp);
>   if (error)
>       goto out;
>
> Please let me know if the patch helps, rick

The patch seems to fix the bad behavior. Running with the patch, I see the
following output from my patch (return code of nfs_doio from within
nfsiod):

nfssvc_iod: iod 0 nfs_doio returned errno: 70

Furthermore, when inspecting the transaction with Wireshark, after
deleting the file on the NFS server it looks like there is only a single
error. This time it is a reply to a V3 Lookup call that contains a status
of "NFS3ERR_NOENT (2)" coming from the NFS server. The client also does
not repeatedly try to complete the failed request.

Any suggestions on the next step here? Based on what you said it looks
like ZFS is falsely reporting an IO error to VFS instead of ESTALE /
NOENT. I tried looking around zfs_fhtovp() and only saw returns of EINVAL,
but I'm not even sure I'm looking in the right place.
Re: FreeBSD NFS client goes into infinite retry loop
On Friday 19 March 2010 11:27:13 pm Rick Macklem wrote:
> On Fri, 19 Mar 2010, Steve Polyack wrote:
> [good stuff snipped]
>> This makes sense. According to wireshark, the server is indeed
>> transmitting "Status: NFS3ERR_IO (5)". Perhaps this should be STALE
>> instead; it sounds more correct than marking it a general IO error.
>> Also, the NFS server is serving its share off of a ZFS filesystem, if
>> it makes any difference. I suppose ZFS could be talking to the NFS
>> server threads with some mismatched language, but I doubt it.
>
> Ok, now I think we're making progress. If VFS_FHTOVP() doesn't return
> ESTALE when the file no longer exists, the NFS server returns whatever
> error it has returned.
>
> So, either VFS_FHTOVP() succeeds after the file has been deleted, which
> would be a problem that needs to be fixed within ZFS
> OR
> ZFS returns an error other than ESTALE when it doesn't exist.
>
> Try the following patch on the server (which just makes any error
> returned by VFS_FHTOVP() into ESTALE) and see if that helps.
>
> --- nfsserver/nfs_srvsubs.c.sav  2010-03-19 22:06:43.0 -0400
> +++ nfsserver/nfs_srvsubs.c      2010-03-19 22:07:22.0 -0400
> @@ -1127,6 +1127,8 @@
>       }
>   }
>   error = VFS_FHTOVP(mp, &fhp->fh_fid, vpp);
> + if (error != 0)
> +         error = ESTALE;
>   vfs_unbusy(mp);
>   if (error)
>       goto out;
>
> Please let me know if the patch helps, rick

I can confirm that ZFS's FHTOVP() method never returns ESTALE. Perhaps
this patch would fix it? It changes zfs_fhtovp() to return ESTALE if the
generation count doesn't match. If this doesn't help, you can try changing
some of the other return cases in this function to ESTALE (many use
EINVAL) until you find the one that matches this condition.

Index: zfs_vfsops.c
===================================================================
--- zfs_vfsops.c	(revision 205334)
+++ zfs_vfsops.c	(working copy)
@@ -1256,7 +1256,7 @@
 		dprintf("znode gen (%u) != fid gen (%u)\n", zp_gen, fid_gen);
 		VN_RELE(ZTOV(zp));
 		ZFS_EXIT(zfsvfs);
-		return (EINVAL);
+		return (ESTALE);
 	}
 	*vpp = ZTOV(zp);

--
John Baldwin
Re: FreeBSD NFS client goes into infinite retry loop
On Fri, 19 Mar 2010, Steve Polyack wrote:
[good stuff snipped]
> This makes sense. According to wireshark, the server is indeed
> transmitting "Status: NFS3ERR_IO (5)". Perhaps this should be STALE
> instead; it sounds more correct than marking it a general IO error.
> Also, the NFS server is serving its share off of a ZFS filesystem, if it
> makes any difference. I suppose ZFS could be talking to the NFS server
> threads with some mismatched language, but I doubt it.

Ok, now I think we're making progress. If VFS_FHTOVP() doesn't return
ESTALE when the file no longer exists, the NFS server returns whatever
error it has returned.

So, either VFS_FHTOVP() succeeds after the file has been deleted, which
would be a problem that needs to be fixed within ZFS
OR
ZFS returns an error other than ESTALE when it doesn't exist.

Try the following patch on the server (which just makes any error returned
by VFS_FHTOVP() into ESTALE) and see if that helps.

--- nfsserver/nfs_srvsubs.c.sav	2010-03-19 22:06:43.0 -0400
+++ nfsserver/nfs_srvsubs.c	2010-03-19 22:07:22.0 -0400
@@ -1127,6 +1127,8 @@
 		}
 	}
 	error = VFS_FHTOVP(mp, &fhp->fh_fid, vpp);
+	if (error != 0)
+		error = ESTALE;
 	vfs_unbusy(mp);
 	if (error)
 		goto out;

Please let me know if the patch helps, rick
Re: FreeBSD NFS client goes into infinite retry loop
On 3/19/2010 9:32 PM, Rick Macklem wrote:
> On Fri, 19 Mar 2010, Steve Polyack wrote:
> > To anyone who is interested: I did some poking around with DTrace,
> > which led me to the nfsiod client code. In
> > src/sys/nfsclient/nfs_nfsiod.c:
> >
> > 		} else {
> > 			if (bp->b_iocmd == BIO_READ)
> > 				(void) nfs_doio(bp->b_vp, bp, bp->b_rcred, NULL);
> > 			else
> > 				(void) nfs_doio(bp->b_vp, bp, bp->b_wcred, NULL);
> > 		}
>
> If you look at nfs_doio(), it decides whether or not to mark the buffer
> invalid, based on the return value it gets. Some (EINTR, ETIMEDOUT, EIO)
> are not considered fatal, but the others are. (When the async I/O
> daemons call nfs_doio(), they are threads that couldn't care less if the
> underlying I/O op succeeded. The outcome of the I/O operation determines
> what nfs_doio() does with the buffer cache block.)

I was looking at this and noticed the above after my last post.

> > The result is that my problematic repeatable circumstance begins
> > logging "nfssvc_iod: iod 0 nfs_doio returned errno: 5" (corresponding
> > to NFSERR_INVAL?) for each repetition of the failed write. The only
> > things triggering this are my failed writes. I can also see the
> > nfsiod0 process waking up each iteration.
>
> Nope, errno 5 is EIO and that's where the problem is. I don't know why
> the server is returning EIO after the file has been deleted on the
> server (I assume you did that when running your little shell script?).

Yes, while running the simple shell script I simply deleted the file on
the NFS server itself.

> > Do we need some kind of "retry x times then abort" logic within
> > nfsiod_iod(), or does this belong in the subsequent functions, such as
> > nfs_doio()? I think it's best to avoid these sorts of infinite loops
> > which have the potential to take out the system or overload the
> > network due to dumb decisions made by unprivileged users.
>
> Nope, people don't like data not getting written back to a server when
> it is slow or temporarily network partitioned. The only thing that
> should stop a client from retrying a write back to the server is a fatal
> error from the server that says "this won't ever succeed".
>
> I think we need to figure out if the server is actually sending the EIO
> (NFS3ERR_IO in wireshark) or if it is sending NFS3ERR_STALE and the
> client is somehow munging that into EIO, causing the confusion.

This makes sense. According to wireshark, the server is indeed
transmitting "Status: NFS3ERR_IO (5)". Perhaps this should be STALE
instead; it sounds more correct than marking it a general IO error. Also,
the NFS server is serving its share off of a ZFS filesystem, if it makes
any difference. I suppose ZFS could be talking to the NFS server threads
with some mismatched language, but I doubt it.

Thanks for the informative response,
Steve
Re: FreeBSD NFS client goes into infinite retry loop
On Fri, 19 Mar 2010, Steve Polyack wrote:
> To anyone who is interested: I did some poking around with DTrace, which
> led me to the nfsiod client code. In src/sys/nfsclient/nfs_nfsiod.c:
>
> 		} else {
> 			if (bp->b_iocmd == BIO_READ)
> 				(void) nfs_doio(bp->b_vp, bp, bp->b_rcred, NULL);
> 			else
> 				(void) nfs_doio(bp->b_vp, bp, bp->b_wcred, NULL);
> 		}

If you look at nfs_doio(), it decides whether or not to mark the buffer
invalid, based on the return value it gets. Some (EINTR, ETIMEDOUT, EIO)
are not considered fatal, but the others are. (When the async I/O daemons
call nfs_doio(), they are threads that couldn't care less if the
underlying I/O op succeeded. The outcome of the I/O operation determines
what nfs_doio() does with the buffer cache block.)

> The result is that my problematic repeatable circumstance begins logging
> "nfssvc_iod: iod 0 nfs_doio returned errno: 5" (corresponding to
> NFSERR_INVAL?) for each repetition of the failed write. The only things
> triggering this are my failed writes. I can also see the nfsiod0 process
> waking up each iteration.

Nope, errno 5 is EIO and that's where the problem is. I don't know why
the server is returning EIO after the file has been deleted on the server
(I assume you did that when running your little shell script?).

> Do we need some kind of "retry x times then abort" logic within
> nfsiod_iod(), or does this belong in the subsequent functions, such as
> nfs_doio()? I think it's best to avoid these sorts of infinite loops
> which have the potential to take out the system or overload the network
> due to dumb decisions made by unprivileged users.

Nope, people don't like data not getting written back to a server when it
is slow or temporarily network partitioned. The only thing that should
stop a client from retrying a write back to the server is a fatal error
from the server that says "this won't ever succeed".

I think we need to figure out if the server is actually sending the EIO
(NFS3ERR_IO in wireshark) or if it is sending NFS3ERR_STALE and the
client is somehow munging that into EIO, causing the confusion.

rick
Re: FreeBSD NFS client goes into infinite retry loop
On Fri, 19 Mar 2010, John Baldwin wrote:
> On Friday 19 March 2010 7:34:23 am Steve Polyack wrote:
> > Hi, we use a FreeBSD 8-STABLE (from shortly after release) system as
> > an NFS server to provide user home directories which get mounted
> > across a few machines (all 6.3-RELEASE). For the past few weeks we
> > have been running into problems where one particular client will go
> > into an infinite loop where it is repeatedly trying to write data
> > which causes the NFS server to return "reply ok 40 write ERROR:
> > Input/output error PRE: POST:". This retry loop can cause between
> > 20mbps and 500mbps of

I'm afraid I don't quite understand what you mean by "causes the NFS
server to return "reply ok 40 write ERROR..."". Is this something logged
by syslog (I can't find a printf like this in the kernel sources) or is
this something that tcpdump is giving you or ???

Why I ask is that it seems to say that the server is returning EIO (or
maybe 40 == EMSGSIZE). The server should return ESTALE (NFSERR_STALE)
after a file has been deleted. If it is returning EIO, then that will
cause the client to keep trying to write the dirty block to the server.
(EIO is interpreted by the client as a "transient error".)

[good stuff snipped]

> > I have a feeling that using NFS in such a manner may simply be prone
> > to such problems, but what confuses me is why the NFS client system is
> > infinitely retrying the write operation and causing itself so much
> > grief.
>
> Yes, your feeling is correct. This sort of race is inherent to NFS if
> you do not use some sort of locking protocol to resolve the race. The
> infinite retries sound like a client-side issue. Have you been able to
> try a newer OS version on a client to see if it still causes the same
> behavior?

As John notes, having one client delete a file while another is trying to
write it is not a good thing. However, the server should return ESTALE
after the file is deleted, and that tells the client that the write can
never succeed, so it marks the buffer cache block invalid and returns the
error to the app. (The app may not see it, if it doesn't check for error
returns upon close as well as write, but that's another story...)

If you could look at a packet trace via wireshark when the problem
occurs, it would be nice to see what the server is returning. (If it
isn't ESTALE and the file no longer exists on the server, then that's a
server problem.) If it is returning ESTALE, then the client is busted.
(At a glance, the client code looks like it would handle ESTALE as a
fatal error for the buffer cache, but that doesn't mean it isn't broken,
just that it doesn't appear wrong. Also, it looks like mmap'd writes
won't recognize a fatal write error and will just keep trying to write
the dirty page back to the server. Take this with a big grain of salt,
since I just took a quick look at the sources. FreeBSD 6->8 appear to be
pretty much the same as far as this goes, in the client.)

Please let us know if you can see the server's error reply code.

Good luck with it, rick

ps: If the server isn't returning ESTALE, you could try switching to the
experimental nfs server and see if it exhibits the same behaviour? ("-e"
option on both mountd and nfsd, assuming the server is FreeBSD 8.)
Re: FreeBSD NFS client goes into infinite retry loop
On 03/19/10 11:05, Steve Polyack wrote:
> On 03/19/10 09:23, Steve Polyack wrote:
> > On 03/19/10 08:31, John Baldwin wrote:
> > > On Friday 19 March 2010 7:34:23 am Steve Polyack wrote:
> > > > Hi, we use a FreeBSD 8-STABLE (from shortly after release) system
> > > > as an NFS server to provide user home directories which get
> > > > mounted across a few machines (all 6.3-RELEASE). For the past few
> > > > weeks we have been running into problems where one particular
> > > > client will go into an infinite loop where it is repeatedly trying
> > > > to write data which causes the NFS server to return "reply ok 40
> > > > write ERROR: Input/output error PRE: POST:". This retry loop can
> > > > cause between 20mbps and 500mbps of constant traffic on our
> > > > network, depending on the size of the data associated with the
> > > > failed write.
> > >
> > > Yes, your feeling is correct. This sort of race is inherent to NFS
> > > if you do not use some sort of locking protocol to resolve the race.
> > > The infinite retries sound like a client-side issue. Have you been
> > > able to try a newer OS version on a client to see if it still causes
> > > the same behavior?
> >
> > I can't try a newer FBSD version on the client where we are seeing the
> > problems, but I can recreate the problem fairly easily. Perhaps I'll
> > try it with an 8.0 client. If I remember correctly, one of the strange
> > things is that it doesn't seem to hit "critical mass" until a few
> > hours after the operation first fails. I may be wrong, but I'll double
> > check that when I check vs. 8.0-release.
> >
> > I forgot to add this in the first post, but these are all TCP NFS v3
> > mounts. Thanks for the response.
>
> Ok, so I'm still able to trigger what appears to be the same retry loop
> with an 8.0-RELEASE nfsv3 client (going on 1.5 hours now):
>
> client$ cat nfs.sh
> #!/usr/local/bin/bash
> for a in {1..15} ; do sleep 1; echo "$a$a$"; done
> client$ ./nfs.sh > ~/output
>
> then on the server, while the above is running:
>
> server$ rm ~/output
>
> What happens is that you will see 3-4 of the same write attempts happen
> per minute via tcpdump. Our previous logs show that this is how it
> starts, and then ~4 hours later it begins to spiral out of control,
> throwing out up to 3,000 of the same failed write requests per second.

To anyone who is interested: I did some poking around with DTrace, which
led me to the nfsiod client code. In src/sys/nfsclient/nfs_nfsiod.c:

		} else {
			if (bp->b_iocmd == BIO_READ)
				(void) nfs_doio(bp->b_vp, bp, bp->b_rcred, NULL);
			else
				(void) nfs_doio(bp->b_vp, bp, bp->b_wcred, NULL);
		}

These two calls to nfs_doio() trash the return codes (which are errors
cascading up from various other NFS write-related functions). I'm not
entirely familiar with the way nfsiod works, but if nfs_doio() or other
subsequent functions are supposed to be removing the current async NFS
operation from a queue which nfsiod handles, they are not doing so when
they encounter an error. They simply report the error back to the caller,
who in this case is not even looking at the value. I've tested this by
pushing the return code into a new int, errno, and adding:

	if (errno) {
		NFS_DPF(ASYNCIO, ("nfssvc_iod: iod %d nfs_doio returned errno: %d\n",
		    myiod, errno));
	}

The result is that my problematic repeatable circumstance begins logging
"nfssvc_iod: iod 0 nfs_doio returned errno: 5" (corresponding to
NFSERR_INVAL?) for each repetition of the failed write. The only things
triggering this are my failed writes. I can also see the nfsiod0 process
waking up each iteration.

Do we need some kind of "retry x times then abort" logic within
nfsiod_iod(), or does this belong in the subsequent functions, such as
nfs_doio()? I think it's best to avoid these sorts of infinite loops
which have the potential to take out the system or overload the network
due to dumb decisions made by unprivileged users.
Re: FreeBSD NFS client goes into infinite retry loop
On 03/19/10 09:23, Steve Polyack wrote:
> On 03/19/10 08:31, John Baldwin wrote:
> > On Friday 19 March 2010 7:34:23 am Steve Polyack wrote:
> > > Hi, we use a FreeBSD 8-STABLE (from shortly after release) system as
> > > an NFS server to provide user home directories which get mounted
> > > across a few machines (all 6.3-RELEASE). For the past few weeks we
> > > have been running into problems where one particular client will go
> > > into an infinite loop where it is repeatedly trying to write data
> > > which causes the NFS server to return "reply ok 40 write ERROR:
> > > Input/output error PRE: POST:". This retry loop can cause between
> > > 20mbps and 500mbps of constant traffic on our network, depending on
> > > the size of the data associated with the failed write.
> > >
> > > We spent some time on the issue and determined that something on one
> > > of the clients is deleting a file as it is being written to by
> > > another NFS client. We were able to enable the NFS lockmgr and use
> > > lockf(1) to fix most of these conditions, and the frequency of this
> > > problem has dropped from once a night to once a week. However, it's
> > > still a problem and we can't necessarily force all of our users to
> > > "play nice" and use lockf/flock.
> > >
> > > Has anyone seen this before? No errors are being logged on the NFS
> > > server itself, but the "Server Ret-Failed" counter begins to
> > > increase rapidly whenever a client gets stuck in this infinite retry
> > > loop:
> > >
> > > Server Ret-Failed
> > > 224768961
> > >
> > > I have a feeling that using NFS in such a manner may simply be prone
> > > to such problems, but what confuses me is why the NFS client system
> > > is infinitely retrying the write operation and causing itself so
> > > much grief.
> >
> > Yes, your feeling is correct. This sort of race is inherent to NFS if
> > you do not use some sort of locking protocol to resolve the race. The
> > infinite retries sound like a client-side issue. Have you been able to
> > try a newer OS version on a client to see if it still causes the same
> > behavior?
>
> I can't try a newer FBSD version on the client where we are seeing the
> problems, but I can recreate the problem fairly easily. Perhaps I'll try
> it with an 8.0 client. If I remember correctly, one of the strange
> things is that it doesn't seem to hit "critical mass" until a few hours
> after the operation first fails. I may be wrong, but I'll double check
> that when I check vs. 8.0-release.
>
> I forgot to add this in the first post, but these are all TCP NFS v3
> mounts. Thanks for the response.

Ok, so I'm still able to trigger what appears to be the same retry loop
with an 8.0-RELEASE nfsv3 client (going on 1.5 hours now):

client$ cat nfs.sh
#!/usr/local/bin/bash
for a in {1..15} ; do sleep 1; echo "$a$a$"; done
client$ ./nfs.sh > ~/output

then on the server, while the above is running:

server$ rm ~/output

What happens is that you will see 3-4 of the same write attempts happen
per minute via tcpdump. Our previous logs show that this is how it
starts, and then ~4 hours later it begins to spiral out of control,
throwing out up to 3,000 of the same failed write requests per second.
Re: FreeBSD NFS client goes into infinite retry loop
On 03/19/10 08:31, John Baldwin wrote:
> On Friday 19 March 2010 7:34:23 am Steve Polyack wrote:
> > Hi, we use a FreeBSD 8-STABLE (from shortly after release) system as
> > an NFS server to provide user home directories which get mounted
> > across a few machines (all 6.3-RELEASE). For the past few weeks we
> > have been running into problems where one particular client will go
> > into an infinite loop where it is repeatedly trying to write data
> > which causes the NFS server to return "reply ok 40 write ERROR:
> > Input/output error PRE: POST:". This retry loop can cause between
> > 20mbps and 500mbps of constant traffic on our network, depending on
> > the size of the data associated with the failed write.
> >
> > We spent some time on the issue and determined that something on one
> > of the clients is deleting a file as it is being written to by another
> > NFS client. We were able to enable the NFS lockmgr and use lockf(1) to
> > fix most of these conditions, and the frequency of this problem has
> > dropped from once a night to once a week. However, it's still a
> > problem and we can't necessarily force all of our users to "play nice"
> > and use lockf/flock.
> >
> > Has anyone seen this before? No errors are being logged on the NFS
> > server itself, but the "Server Ret-Failed" counter begins to increase
> > rapidly whenever a client gets stuck in this infinite retry loop:
> >
> > Server Ret-Failed
> > 224768961
> >
> > I have a feeling that using NFS in such a manner may simply be prone
> > to such problems, but what confuses me is why the NFS client system is
> > infinitely retrying the write operation and causing itself so much
> > grief.
>
> Yes, your feeling is correct. This sort of race is inherent to NFS if
> you do not use some sort of locking protocol to resolve the race. The
> infinite retries sound like a client-side issue. Have you been able to
> try a newer OS version on a client to see if it still causes the same
> behavior?

I can't try a newer FBSD version on the client where we are seeing the
problems, but I can recreate the problem fairly easily. Perhaps I'll try
it with an 8.0 client. If I remember correctly, one of the strange things
is that it doesn't seem to hit "critical mass" until a few hours after
the operation first fails. I may be wrong, but I'll double check that
when I check vs. 8.0-release.

I forgot to add this in the first post, but these are all TCP NFS v3
mounts. Thanks for the response.
Re: FreeBSD NFS client goes into infinite retry loop
On Friday 19 March 2010 7:34:23 am Steve Polyack wrote:
> Hi, we use a FreeBSD 8-STABLE (from shortly after release) system as an
> NFS server to provide user home directories which get mounted across a
> few machines (all 6.3-RELEASE). For the past few weeks we have been
> running into problems where one particular client will go into an
> infinite loop where it is repeatedly trying to write data which causes
> the NFS server to return "reply ok 40 write ERROR: Input/output error
> PRE: POST:". This retry loop can cause between 20mbps and 500mbps of
> constant traffic on our network, depending on the size of the data
> associated with the failed write.
>
> We spent some time on the issue and determined that something on one of
> the clients is deleting a file as it is being written to by another NFS
> client. We were able to enable the NFS lockmgr and use lockf(1) to fix
> most of these conditions, and the frequency of this problem has dropped
> from once a night to once a week. However, it's still a problem and we
> can't necessarily force all of our users to "play nice" and use
> lockf/flock.
>
> Has anyone seen this before? No errors are being logged on the NFS
> server itself, but the "Server Ret-Failed" counter begins to increase
> rapidly whenever a client gets stuck in this infinite retry loop:
>
> Server Ret-Failed
> 224768961
>
> I have a feeling that using NFS in such a manner may simply be prone to
> such problems, but what confuses me is why the NFS client system is
> infinitely retrying the write operation and causing itself so much
> grief.

Yes, your feeling is correct. This sort of race is inherent to NFS if you
do not use some sort of locking protocol to resolve the race. The
infinite retries sound like a client-side issue. Have you been able to
try a newer OS version on a client to see if it still causes the same
behavior?
-- 
John Baldwin
FreeBSD NFS client goes into infinite retry loop
Hi, we use a FreeBSD 8-STABLE (from shortly after release) system as an
NFS server to provide user home directories which get mounted across a
few machines (all 6.3-RELEASE). For the past few weeks we have been
running into problems where one particular client will go into an
infinite loop where it is repeatedly trying to write data which causes
the NFS server to return "reply ok 40 write ERROR: Input/output error
PRE: POST:". This retry loop can cause between 20mbps and 500mbps of
constant traffic on our network, depending on the size of the data
associated with the failed write.

We spent some time on the issue and determined that something on one of
the clients is deleting a file as it is being written to by another NFS
client. We were able to enable the NFS lockmgr and use lockf(1) to fix
most of these conditions, and the frequency of this problem has dropped
from once a night to once a week. However, it's still a problem and we
can't necessarily force all of our users to "play nice" and use
lockf/flock.

Has anyone seen this before? No errors are being logged on the NFS
server itself, but the "Server Ret-Failed" counter begins to increase
rapidly whenever a client gets stuck in this infinite retry loop:

Server Ret-Failed
224768961

I have a feeling that using NFS in such a manner may simply be prone to
such problems, but what confuses me is why the NFS client system is
infinitely retrying the write operation and causing itself so much grief.

Thanks for any suggestions anyone can provide,
Steve Polyack