Re: NFS EINVAL on open(... | O_TRUNC) on 2.6.23.9

2008-02-06 Thread Andrew Morton
On Wed, 06 Feb 2008 22:55:02 +0100
Gianluca Alberici <[EMAIL PROTECTED]> wrote:

> I finally got it. Problem and solution have been found from 6 month but 
> nobody cared...up to now those servers have not been mantained, this 
> problem is not discussed anywhere else than the following link.
> The bug (userspace server side i would say at this point) is well 
> described from the author of an nfs-user-server patch which has not been 
> managed yet. The magic hint to find it on google was 'nfs server 
> utimensat' :-)
> 
> http://marc.info/?l=linux-nfs=118724649406144=2

This is pretty significant.  We have on several occasions in recent years
tightened up the argument checking on long-standing system calls and it's
always a concern that this will break previously-working applications.

And now it has happened.

If we put buggy code into the kernel then we're largely stuck with it: we
need to be back-compatible with our bugs so we don't break things like
this.

> I have already prepared a working patch for cfsd based upon the one ive 
> listed. The nfs patch is of course waiting for commit since august, 
> 2007. Ill submit it to debian cfsd mantainers, hoping to have more 
> chance than my predecessor.
> It doesnt seem to me that there was any kernel related issue.
> 
> Thanks a lot again, sorry for the lots of noise i have done. I will try 
> to be more appropriate next time.

That wasn't noise - it was quite valuable.  Thanks for all the work you did
on this.


Given that our broken-by-unbreaking code has been out there in several
releases there isn't really any point in rebreaking it to fix this - the
offending applications need to be repaired so they'll work on 2.6.22 and
2.6.23 anyway.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: NFS EINVAL on open(... | O_TRUNC) on 2.6.23.9

2008-02-06 Thread Gianluca Alberici

Hi Chuck,

I finally got it. Problem and solution have been found from 6 month but 
nobody cared...up to now those servers have not been mantained, this 
problem is not discussed anywhere else than the following link.
The bug (userspace server side i would say at this point) is well 
described from the author of an nfs-user-server patch which has not been 
managed yet. The magic hint to find it on google was 'nfs server 
utimensat' :-)


http://marc.info/?l=linux-nfs=118724649406144=2

I have already prepared a working patch for cfsd based upon the one ive 
listed. The nfs patch is of course waiting for commit since august, 
2007. Ill submit it to debian cfsd mantainers, hoping to have more 
chance than my predecessor.

It doesnt seem to me that there was any kernel related issue.

Thanks a lot again, sorry for the lots of noise i have done. I will try 
to be more appropriate next time.


Cheers,

Gianluca

Chuck Lever wrote:


Hi Gianluca-

On Feb 6, 2008, at 1:25 PM, Gianluca Alberici wrote:


Hello all,

Thanks to Chuck's help i finally decided to proceed to a git bisect 
and found the bad patch. Is there anybody that has an idea why it 
breaks userspace nfs servers as we have seen ? Sorry for emailing 
directly Chuck Lever and Andrew Morton but i really wanted to thank 
Chuck for his precious help and thought that /akpm/ having signed 
this commit maybe he's going to figure out whats wrong easily



The commit you found is a plausible source of the trouble (based on 
our current theory about the problem).


What isn't quite clear to me is whether this commit causes your user- 
space server to start failing suddenly, or it causes the client to 
start sending the special non-standard time stamps in the SETATTR 
request. My guess is the latter, but I want to confirm this guess 
against reality :-)


Are you running the client and server concurrently on the same system? 
If so, it would be helpful if you could run this test with a constant 
kernel version on one side while varying it on the other. If client 
and server are already on different systems, can you tell us which 
system and which kernel combinations caused the failure?


A matrix of combinations might be:

1. server kernel is before 1c710c89, client kernel is before 1c710c89
2. server kernel is before 1c710c89, client kernel is after 1c710c89
3. server kernel is after 1c710c89, client kernel is before 1c710c89
4. server kernel is after 1c710c89, client kernel is after 1c710c89

Thanks.


This is what i finally get from git:

1c710c896eb461895d3c399e15bb5f20b39c9073 is first bad commit
commit 1c710c896eb461895d3c399e15bb5f20b39c9073
Author: Ulrich Drepper <[EMAIL PROTECTED]>
Date: Tue May 8 00:33:25 2007 -0700

utimensat implementation

Implement utimensat(2) which is an extension to futimesat(2) in that it

a) supports nano-second resolution for the timestamps
b) allows to selectively ignore the atime/mtime value
c) allows to selectively use the current time for either atime or mtime
d) supports changing the atime/mtime of a symlink itself along the lines
of the BSD lutimes(3) functions

[...]

[EMAIL PROTECTED]: add missing i386 syscall table entry]
Signed-off-by: Ulrich Drepper <[EMAIL PROTECTED]>
Cc: Alexey Dobriyan <[EMAIL PROTECTED]>
Cc: Michael Kerrisk <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>

:04 04 3bedbc7fd919ba167b8e5f208a630261570853bb 
927002a9423dcb51ba4f7bee53e60cdca6c1df43 M arch
:04 04 fd688c5b534efd3111cbf1e1095d6ff631738325 
3d0fbf20fb3da1cb380c92f5b2b39815897376d3 M fs
:04 04 bfb1a907a9a842db4fa3543e12a8381d4e11b1eb 
9c1d99324db12e066c0d17870fe48457809ad43b M include


Thanks in advance, regards,

Gianluca


Hi Gianluca-

On Jan 30, 2008, at 7:40 AM, Gianluca Alberici wrote:


Hello again everybody

Here follows the testbench:

- I got two mirrors, same machine, same disk etc...chaged hostname, 
IP, and on the second i have recompiled kernel.

- First: 2.6.21.7 on debian sarge
- Second: 2.6.22 same system.
- Onto both i got nfs-user-server and cfsd last versions
- The export file is the same (localhost /opt/nfs (rw, async), 
stripping off the async option does not changes anything)

- Mount options are exactly the same.

The problem arises in the very same manner with both nfs and cfsd:

NFS:setattr {
...
...
RPC:call_decode {
return 22;
}
...
return 22;
}




Again, there is nothing wrong with the RPC client or call_decode. 
The *server* is returning NFSERR_INVAL (22) to a SETATTR request; 
the RPC client is simply passing that along to the NFS client, as it 
is designed to do.



I have tried these kernels:

2.6.16.11 works
2.6.20 works
2.6.21 works
2.6.21.7 works
2.6.22 doesnt work (contiguous to previous version)
2.6.23 doesnt work (same behavior as previous)
2.6.23.9 doesnt work (as above)
2.6.24rc7 doesnt work (as above)

I would really like to do more, client or server side, if you ave 
any 

Re: NFS EINVAL on open(... | O_TRUNC) on 2.6.23.9

2008-02-06 Thread Chuck Lever

Hi Gianluca-

On Feb 6, 2008, at 1:25 PM, Gianluca Alberici wrote:

Hello all,

Thanks to Chuck's help i finally decided to proceed to a git bisect  
and found the bad patch. Is there anybody that has an idea why it  
breaks userspace nfs servers as we have seen ? Sorry for emailing  
directly Chuck Lever and Andrew Morton but i really wanted to thank  
Chuck for his precious help and thought that /akpm/ having signed  
this commit maybe he's going to figure out whats wrong easily


The commit you found is a plausible source of the trouble (based on  
our current theory about the problem).


What isn't quite clear to me is whether this commit causes your user- 
space server to start failing suddenly, or it causes the client to  
start sending the special non-standard time stamps in the SETATTR  
request.  My guess is the latter, but I want to confirm this guess  
against reality  :-)


Are you running the client and server concurrently on the same  
system?  If so, it would be helpful if you could run this test with a  
constant kernel version on one side while varying it on the other.   
If client and server are already on different systems, can you tell  
us which system and which kernel combinations caused the failure?


A matrix of combinations might be:

1. server kernel is before 1c710c89, client kernel is before 1c710c89
2. server kernel is before 1c710c89, client kernel is after 1c710c89
3. server kernel is after 1c710c89, client kernel is before 1c710c89
4. server kernel is after 1c710c89, client kernel is after 1c710c89

Thanks.


This is what i finally get from git:

1c710c896eb461895d3c399e15bb5f20b39c9073 is first bad commit
commit 1c710c896eb461895d3c399e15bb5f20b39c9073
Author: Ulrich Drepper <[EMAIL PROTECTED]>
Date:   Tue May 8 00:33:25 2007 -0700

   utimensat implementation

   Implement utimensat(2) which is an extension to futimesat(2) in  
that it


   a) supports nano-second resolution for the timestamps
   b) allows to selectively ignore the atime/mtime value
   c) allows to selectively use the current time for either atime  
or mtime
   d) supports changing the atime/mtime of a symlink itself along  
the lines

  of the BSD lutimes(3) functions

[...]

   [EMAIL PROTECTED]: add missing i386 syscall table entry]
   Signed-off-by: Ulrich Drepper <[EMAIL PROTECTED]>
   Cc: Alexey Dobriyan <[EMAIL PROTECTED]>
   Cc: Michael Kerrisk <[EMAIL PROTECTED]>
   Cc: <[EMAIL PROTECTED]>
   Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
   Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>

:04 04 3bedbc7fd919ba167b8e5f208a630261570853bb  
927002a9423dcb51ba4f7bee53e60cdca6c1df43 M  arch
:04 04 fd688c5b534efd3111cbf1e1095d6ff631738325  
3d0fbf20fb3da1cb380c92f5b2b39815897376d3 M  fs
:04 04 bfb1a907a9a842db4fa3543e12a8381d4e11b1eb  
9c1d99324db12e066c0d17870fe48457809ad43b M  include


Thanks in advance, regards,

Gianluca


Hi Gianluca-

On Jan 30, 2008, at 7:40 AM, Gianluca Alberici wrote:


Hello again everybody

Here follows the testbench:

- I got two mirrors, same machine, same disk etc...chaged  
hostname, IP, and on the second i have recompiled kernel.

- First: 2.6.21.7 on debian sarge
- Second: 2.6.22 same system.
- Onto both i got nfs-user-server and cfsd last versions
- The export file is the same (localhost /opt/nfs (rw, async),  
stripping off the async option does not changes anything)

- Mount options are exactly the same.

The problem arises in the very same manner with both nfs and cfsd:

NFS:setattr {
...
...
RPC:call_decode {
return 22;
}
...
return 22;
}



Again, there is nothing wrong with the RPC client or call_decode.  
The *server* is returning NFSERR_INVAL (22) to a SETATTR request;  
the RPC client is simply passing that along to the NFS client, as  
it is designed to do.



I have tried these kernels:

2.6.16.11 works
2.6.20 works
2.6.21 works
2.6.21.7 works
2.6.22 doesnt work (contiguous to previous version)
2.6.23 doesnt work (same behavior as previous)
2.6.23.9 doesnt work (as above)
2.6.24rc7 doesnt work (as above)

I would really like to do more, client or server side, if you ave  
any suggestions.
Can we find out what is the change (doesnt matter if it is a buf  
or bug fix) that caused this problem ?



The goal here is to identify the kernel change between 2.6.21 and  
2.6.22 that makes the client generate SETATTR requests the user- 
space server chokes on. It may be a change in the NFS client, or  
it could be somewhere else in the file system stack, like the VFS.


The usual procedure is to use "git bisect". It does a binary  
search on the kernel patches between the working kernel version  
and the kernel version that is known not to work. It works like this:


1. You clone a linux kernel git repository (if you don't have a git
repository already)

2. You tell git bisect which kernel version is working, and which  
isn't.
git bisect then selects a commit about half way in between the  
working

and non-working 

Re: NFS EINVAL on open(... | O_TRUNC) on 2.6.23.9

2008-02-06 Thread Gianluca Alberici

Hello all,

Thanks to Chuck's help i finally decided to proceed to a git bisect and 
found the bad patch. Is there anybody that has an idea why it breaks 
userspace nfs servers as we have seen ? Sorry for emailing directly 
Chuck Lever and Andrew Morton but i really wanted to thank Chuck for his 
precious help and thought that /akpm/ having signed this commit maybe 
he's going to figure out whats wrong easily


This is what i finally get from git:

1c710c896eb461895d3c399e15bb5f20b39c9073 is first bad commit
commit 1c710c896eb461895d3c399e15bb5f20b39c9073
Author: Ulrich Drepper <[EMAIL PROTECTED]>
Date:   Tue May 8 00:33:25 2007 -0700

   utimensat implementation

   Implement utimensat(2) which is an extension to futimesat(2) in that it

   a) supports nano-second resolution for the timestamps
   b) allows to selectively ignore the atime/mtime value
   c) allows to selectively use the current time for either atime or mtime
   d) supports changing the atime/mtime of a symlink itself along the lines
  of the BSD lutimes(3) functions

[...]

   [EMAIL PROTECTED]: add missing i386 syscall table entry]
   Signed-off-by: Ulrich Drepper <[EMAIL PROTECTED]>
   Cc: Alexey Dobriyan <[EMAIL PROTECTED]>
   Cc: Michael Kerrisk <[EMAIL PROTECTED]>
   Cc: <[EMAIL PROTECTED]>
   Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
   Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>

:04 04 3bedbc7fd919ba167b8e5f208a630261570853bb 
927002a9423dcb51ba4f7bee53e60cdca6c1df43 M  arch
:04 04 fd688c5b534efd3111cbf1e1095d6ff631738325 
3d0fbf20fb3da1cb380c92f5b2b39815897376d3 M  fs
:04 04 bfb1a907a9a842db4fa3543e12a8381d4e11b1eb 
9c1d99324db12e066c0d17870fe48457809ad43b M  include

Thanks in advance, regards,

Gianluca


Hi Gianluca-

On Jan 30, 2008, at 7:40 AM, Gianluca Alberici wrote:


Hello again everybody

Here follows the testbench:

- I got two mirrors, same machine, same disk etc...chaged hostname, 
IP, and on the second i have recompiled kernel.

- First: 2.6.21.7 on debian sarge
- Second: 2.6.22 same system.
- Onto both i got nfs-user-server and cfsd last versions
- The export file is the same (localhost /opt/nfs (rw, async), 
stripping off the async option does not changes anything)

- Mount options are exactly the same.

The problem arises in the very same manner with both nfs and cfsd:

NFS:setattr {
...
...
RPC:call_decode {
return 22;
}
...
return 22;
}



Again, there is nothing wrong with the RPC client or call_decode. The 
*server* is returning NFSERR_INVAL (22) to a SETATTR request; the RPC 
client is simply passing that along to the NFS client, as it is 
designed to do.



I have tried these kernels:

2.6.16.11 works
2.6.20 works
2.6.21 works
2.6.21.7 works
2.6.22 doesnt work (contiguous to previous version)
2.6.23 doesnt work (same behavior as previous)
2.6.23.9 doesnt work (as above)
2.6.24rc7 doesnt work (as above)

I would really like to do more, client or server side, if you ave any 
suggestions.
Can we find out what is the change (doesnt matter if it is a buf or 
bug fix) that caused this problem ?



The goal here is to identify the kernel change between 2.6.21 and 
2.6.22 that makes the client generate SETATTR requests the user-space 
server chokes on. It may be a change in the NFS client, or it could be 
somewhere else in the file system stack, like the VFS.


The usual procedure is to use "git bisect". It does a binary search on 
the kernel patches between the working kernel version and the kernel 
version that is known not to work. It works like this:


1. You clone a linux kernel git repository (if you don't have a git
repository already)

2. You tell git bisect which kernel version is working, and which isn't.
git bisect then selects a commit about half way in between the working
and non-working versions, and checks out that version of the kernel

3. You build that kernel, and run your test case

4. You tell git bisect whether the resulting kernel passes your test 
case,

it selects a new commit, and checks out that version of the kernel.

5. Repeat steps 3 and 4 until git bisect has identified the commit that
causes the kernel to stop passing your test case

If the number of patches between 2.6.21 and 2.6.22 is N, then git 
bisect will find the faulty patch in O(log2(N)) steps. For example, if 
there are 250 patches between 2.6.21 and 2.6.22, it will take about 8 
iterations of steps 3 and 4 to find the faulty patch, if all goes 
well; far fewer than the total number of patches you would need to 
test one at a time.


Naturally you can also do this by applying and reverting patches with 
"patch -p1", but it's a little more work.



Chuck Lever wrote:


On Jan 29, 2008, at 3:31 PM, Trond Myklebust wrote:


On Tue, 2008-01-29 at 20:50 +0100, Gianluca Alberici wrote:


Hello,

I confirm that i have encountered this same problem (EINVAL on open
(...O | TRUNC) with the following userspace servers:

- nfs-user-server shipped with debian sarge/etch etc...
- cfsd 

Re: NFS EINVAL on open(... | O_TRUNC) on 2.6.23.9

2008-02-06 Thread Chuck Lever

Hi Gianluca-

On Feb 6, 2008, at 1:25 PM, Gianluca Alberici wrote:

Hello all,

Thanks to Chuck's help i finally decided to proceed to a git bisect  
and found the bad patch. Is there anybody that has an idea why it  
breaks userspace nfs servers as we have seen ? Sorry for emailing  
directly Chuck Lever and Andrew Morton but i really wanted to thank  
Chuck for his precious help and thought that /akpm/ having signed  
this commit maybe he's going to figure out whats wrong easily


The commit you found is a plausible source of the trouble (based on  
our current theory about the problem).


What isn't quite clear to me is whether this commit causes your user- 
space server to start failing suddenly, or it causes the client to  
start sending the special non-standard time stamps in the SETATTR  
request.  My guess is the latter, but I want to confirm this guess  
against reality  :-)


Are you running the client and server concurrently on the same  
system?  If so, it would be helpful if you could run this test with a  
constant kernel version on one side while varying it on the other.   
If client and server are already on different systems, can you tell  
us which system and which kernel combinations caused the failure?


A matrix of combinations might be:

1. server kernel is before 1c710c89, client kernel is before 1c710c89
2. server kernel is before 1c710c89, client kernel is after 1c710c89
3. server kernel is after 1c710c89, client kernel is before 1c710c89
4. server kernel is after 1c710c89, client kernel is after 1c710c89

Thanks.


This is what i finally get from git:

1c710c896eb461895d3c399e15bb5f20b39c9073 is first bad commit
commit 1c710c896eb461895d3c399e15bb5f20b39c9073
Author: Ulrich Drepper [EMAIL PROTECTED]
Date:   Tue May 8 00:33:25 2007 -0700

   utimensat implementation

   Implement utimensat(2) which is an extension to futimesat(2) in  
that it


   a) supports nano-second resolution for the timestamps
   b) allows to selectively ignore the atime/mtime value
   c) allows to selectively use the current time for either atime  
or mtime
   d) supports changing the atime/mtime of a symlink itself along  
the lines

  of the BSD lutimes(3) functions

[...]

   [EMAIL PROTECTED]: add missing i386 syscall table entry]
   Signed-off-by: Ulrich Drepper [EMAIL PROTECTED]
   Cc: Alexey Dobriyan [EMAIL PROTECTED]
   Cc: Michael Kerrisk [EMAIL PROTECTED]
   Cc: [EMAIL PROTECTED]
   Signed-off-by: Andrew Morton [EMAIL PROTECTED]
   Signed-off-by: Linus Torvalds [EMAIL PROTECTED]

:04 04 3bedbc7fd919ba167b8e5f208a630261570853bb  
927002a9423dcb51ba4f7bee53e60cdca6c1df43 M  arch
:04 04 fd688c5b534efd3111cbf1e1095d6ff631738325  
3d0fbf20fb3da1cb380c92f5b2b39815897376d3 M  fs
:04 04 bfb1a907a9a842db4fa3543e12a8381d4e11b1eb  
9c1d99324db12e066c0d17870fe48457809ad43b M  include


Thanks in advance, regards,

Gianluca


Hi Gianluca-

On Jan 30, 2008, at 7:40 AM, Gianluca Alberici wrote:


Hello again everybody

Here follows the testbench:

- I got two mirrors, same machine, same disk etc...chaged  
hostname, IP, and on the second i have recompiled kernel.

- First: 2.6.21.7 on debian sarge
- Second: 2.6.22 same system.
- Onto both i got nfs-user-server and cfsd last versions
- The export file is the same (localhost /opt/nfs (rw, async),  
stripping off the async option does not changes anything)

- Mount options are exactly the same.

The problem arises in the very same manner with both nfs and cfsd:

NFS:setattr {
...
...
RPC:call_decode {
return 22;
}
...
return 22;
}



Again, there is nothing wrong with the RPC client or call_decode.  
The *server* is returning NFSERR_INVAL (22) to a SETATTR request;  
the RPC client is simply passing that along to the NFS client, as  
it is designed to do.



I have tried these kernels:

2.6.16.11 works
2.6.20 works
2.6.21 works
2.6.21.7 works
2.6.22 doesnt work (contiguous to previous version)
2.6.23 doesnt work (same behavior as previous)
2.6.23.9 doesnt work (as above)
2.6.24rc7 doesnt work (as above)

I would really like to do more, client or server side, if you ave  
any suggestions.
Can we find out what is the change (doesnt matter if it is a buf  
or bug fix) that caused this problem ?



The goal here is to identify the kernel change between 2.6.21 and  
2.6.22 that makes the client generate SETATTR requests the user- 
space server chokes on. It may be a change in the NFS client, or  
it could be somewhere else in the file system stack, like the VFS.


The usual procedure is to use git bisect. It does a binary  
search on the kernel patches between the working kernel version  
and the kernel version that is known not to work. It works like this:


1. You clone a linux kernel git repository (if you don't have a git
repository already)

2. You tell git bisect which kernel version is working, and which  
isn't.
git bisect then selects a commit about half way in between the  
working

and non-working versions, and checks 

Re: NFS EINVAL on open(... | O_TRUNC) on 2.6.23.9

2008-02-06 Thread Gianluca Alberici

Hi Chuck,

I finally got it. Problem and solution have been found from 6 month but 
nobody cared...up to now those servers have not been mantained, this 
problem is not discussed anywhere else than the following link.
The bug (userspace server side i would say at this point) is well 
described from the author of an nfs-user-server patch which has not been 
managed yet. The magic hint to find it on google was 'nfs server 
utimensat' :-)


http://marc.info/?l=linux-nfsm=118724649406144w=2

I have already prepared a working patch for cfsd based upon the one ive 
listed. The nfs patch is of course waiting for commit since august, 
2007. Ill submit it to debian cfsd mantainers, hoping to have more 
chance than my predecessor.

It doesnt seem to me that there was any kernel related issue.

Thanks a lot again, sorry for the lots of noise i have done. I will try 
to be more appropriate next time.


Cheers,

Gianluca

Chuck Lever wrote:


Hi Gianluca-

On Feb 6, 2008, at 1:25 PM, Gianluca Alberici wrote:


Hello all,

Thanks to Chuck's help i finally decided to proceed to a git bisect 
and found the bad patch. Is there anybody that has an idea why it 
breaks userspace nfs servers as we have seen ? Sorry for emailing 
directly Chuck Lever and Andrew Morton but i really wanted to thank 
Chuck for his precious help and thought that /akpm/ having signed 
this commit maybe he's going to figure out whats wrong easily



The commit you found is a plausible source of the trouble (based on 
our current theory about the problem).


What isn't quite clear to me is whether this commit causes your user- 
space server to start failing suddenly, or it causes the client to 
start sending the special non-standard time stamps in the SETATTR 
request. My guess is the latter, but I want to confirm this guess 
against reality :-)


Are you running the client and server concurrently on the same system? 
If so, it would be helpful if you could run this test with a constant 
kernel version on one side while varying it on the other. If client 
and server are already on different systems, can you tell us which 
system and which kernel combinations caused the failure?


A matrix of combinations might be:

1. server kernel is before 1c710c89, client kernel is before 1c710c89
2. server kernel is before 1c710c89, client kernel is after 1c710c89
3. server kernel is after 1c710c89, client kernel is before 1c710c89
4. server kernel is after 1c710c89, client kernel is after 1c710c89

Thanks.


This is what i finally get from git:

1c710c896eb461895d3c399e15bb5f20b39c9073 is first bad commit
commit 1c710c896eb461895d3c399e15bb5f20b39c9073
Author: Ulrich Drepper [EMAIL PROTECTED]
Date: Tue May 8 00:33:25 2007 -0700

utimensat implementation

Implement utimensat(2) which is an extension to futimesat(2) in that it

a) supports nano-second resolution for the timestamps
b) allows to selectively ignore the atime/mtime value
c) allows to selectively use the current time for either atime or mtime
d) supports changing the atime/mtime of a symlink itself along the lines
of the BSD lutimes(3) functions

[...]

[EMAIL PROTECTED]: add missing i386 syscall table entry]
Signed-off-by: Ulrich Drepper [EMAIL PROTECTED]
Cc: Alexey Dobriyan [EMAIL PROTECTED]
Cc: Michael Kerrisk [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
Signed-off-by: Linus Torvalds [EMAIL PROTECTED]

:04 04 3bedbc7fd919ba167b8e5f208a630261570853bb 
927002a9423dcb51ba4f7bee53e60cdca6c1df43 M arch
:04 04 fd688c5b534efd3111cbf1e1095d6ff631738325 
3d0fbf20fb3da1cb380c92f5b2b39815897376d3 M fs
:04 04 bfb1a907a9a842db4fa3543e12a8381d4e11b1eb 
9c1d99324db12e066c0d17870fe48457809ad43b M include


Thanks in advance, regards,

Gianluca


Hi Gianluca-

On Jan 30, 2008, at 7:40 AM, Gianluca Alberici wrote:


Hello again everybody

Here follows the testbench:

- I got two mirrors, same machine, same disk etc...chaged hostname, 
IP, and on the second i have recompiled kernel.

- First: 2.6.21.7 on debian sarge
- Second: 2.6.22 same system.
- Onto both i got nfs-user-server and cfsd last versions
- The export file is the same (localhost /opt/nfs (rw, async), 
stripping off the async option does not changes anything)

- Mount options are exactly the same.

The problem arises in the very same manner with both nfs and cfsd:

NFS:setattr {
...
...
RPC:call_decode {
return 22;
}
...
return 22;
}




Again, there is nothing wrong with the RPC client or call_decode. 
The *server* is returning NFSERR_INVAL (22) to a SETATTR request; 
the RPC client is simply passing that along to the NFS client, as it 
is designed to do.



I have tried these kernels:

2.6.16.11 works
2.6.20 works
2.6.21 works
2.6.21.7 works
2.6.22 doesnt work (contiguous to previous version)
2.6.23 doesnt work (same behavior as previous)
2.6.23.9 doesnt work (as above)
2.6.24rc7 doesnt work (as above)

I would really like to do more, client or server side, if you ave 
any suggestions.

Re: NFS EINVAL on open(... | O_TRUNC) on 2.6.23.9

2008-02-06 Thread Andrew Morton
On Wed, 06 Feb 2008 22:55:02 +0100
Gianluca Alberici [EMAIL PROTECTED] wrote:

 I finally got it. Problem and solution have been found from 6 month but 
 nobody cared...up to now those servers have not been mantained, this 
 problem is not discussed anywhere else than the following link.
 The bug (userspace server side i would say at this point) is well 
 described from the author of an nfs-user-server patch which has not been 
 managed yet. The magic hint to find it on google was 'nfs server 
 utimensat' :-)
 
 http://marc.info/?l=linux-nfsm=118724649406144w=2

This is pretty significant.  We have on several occasions in recent years
tightened up the argument checking on long-standing system calls and it's
always a concern that this will break previously-working applications.

And now it has happened.

If we put buggy code into the kernel then we're largely stuck with it: we
need to be back-compatible with our bugs so we don't break things like
this.

 I have already prepared a working patch for cfsd based upon the one ive 
 listed. The nfs patch is of course waiting for commit since august, 
 2007. Ill submit it to debian cfsd mantainers, hoping to have more 
 chance than my predecessor.
 It doesnt seem to me that there was any kernel related issue.
 
 Thanks a lot again, sorry for the lots of noise i have done. I will try 
 to be more appropriate next time.

That wasn't noise - it was quite valuable.  Thanks for all the work you did
on this.


Given that our broken-by-unbreaking code has been out there in several
releases there isn't really any point in rebreaking it to fix this - the
offending applications need to be repaired so they'll work on 2.6.22 and
2.6.23 anyway.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: NFS EINVAL on open(... | O_TRUNC) on 2.6.23.9

2008-01-19 Thread Gianluca Alberici

Hello Chuck,

As you requested i have finally found out that the problem was 
introduced in 2.6.22 (2.6.21.7 works, after 2.6.21.7 theres 2.6.22)


I have also examined the changeslogs: 2.6.22 introduce a certain number 
of patches into NFS client. Many of them are yours and Trond's.


Pls can you help with this ?

TIA

Gianluca
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: NFS EINVAL on open(... | O_TRUNC) on 2.6.23.9

2008-01-19 Thread Gianluca Alberici

Hello all,

Sorry for being down for so long but i have decided to test the latest 
2.6.24 kernel because of the huge number of NFS patches (especially at 
RPC UDP protocol level) introduced mainly by Netapp people (Trond 
Myklebust et al.), hoping that maybe the problem would have magically 
disappeared, but...


On kernel 2.6.24 rc7:

- Debian nfs-user-server (v2) is giving the very same problem i reported 
last times: EINVAL on open() syscall with O|TRUNC on existing file

- CFSD, same as debian nfs-user-server (v2)

To answer Chuck's questions:

i have already straced all the applications, which puts in evidence that 
the problem is into the open() syscall with O|TRUNC on existing files, 
always.
I prepared that very basic piece of c code to rise the problem, as you 
can see it just 'syscalls' open(), and that is enough.

NFS version is 2, no options (for CFSD [ -oport=3049,intr ] as i ever did)

When im done with this email i will work on finding where exactly the 
problem arises, for now i can say that 2.6.21.7 is working so well, and 
2.6.22.16 is not working exactly as 2.6.24rc7 does.


I am really interested in finding out whats wrong with NFSD/CFSD and if 
it was a problem of NFS compliance i would really be glad to find out 
what to do to patch those servers, but im really far to be able to do it 
alone.
But: dont you think that we should find out that the problem is surely 
into the userspace servers before saying that everything is alright with 
linux kernel ?


I propose to anybody interested to test cfsd or nfs-user-server 
2.2beta47-23 (from debian etch) on any kernel >= 2.6.23. You ca use 
something like the piece of code included in this email, or you can simply:


echo "Hello" > /mnt/nfsmount/test

which is gonna fail after the first time (the first it creates a new 
file and it works) with EINVAL.


Now as i already said the problem is an RPC call_decode which fails, 
this is evident from an RPC debug. I have noticed a different number of 
bytes in the response from knfsd to userspace (28 instead of 96).


Somebody can help ?

Gianluca

Chuck Lever wrote:


On Dec 25, 2007, at 5:04 PM, Andrew Morton wrote:

On Sun, 23 Dec 2007 12:35:17 +0100 Gianluca Alberici 
<[EMAIL PROTECTED]> wrote:



Hello,

I can do better. I have investigated a bit the problem:

1) The problem arises only with the userspace nfsd (Universal nfsd 
2.2).

I have realized that the latest patches introduced in 2.6.22 have
changed a lot of things into NFS.

2) The following code has been debugged with sunrpc.rpc_debug and
sunrpc.nfs_debug

#include 
#include 
#include 
#include 
#include 

int main(int argc, char *argv[]) {
int fp;
if ((fp=open("/mnt/tmp/test",O_WRONLY | O_CREAT | O_TRUNC, S_IRWXU 
)) == -1) printf("ERR: %d\n",errno);

else {
write(fp, argv[1], strlen(argv[1]));
printf("OK\n");
close(fp);
};
}


2b) The output

[...]

<8>NFS call setattr
<8>RPC: new task initialized, procpid 18656
<8>RPC: allocated task f7bec500
<8>RPC: 0 looking up UNIX cred
<8>RPC: 740 __rpc_execute flags=0x480
<8>RPC: 740 call_start nfs2 proc 2 (sync)
<8>RPC: 740 call_reserve (status 0)
<8>RPC: 740 reserved req f1416000 xid 643f3c42
<8>RPC: 740 call_reserveresult (status 0)
<8>RPC: 740 call_allocate (status 0)
<8>RPC: 740 allocated buffer of size 528 at e627b800
<8>RPC: 740 call_bind (status 0)
<8>RPC: 740 call_connect xprt f70d4000 is connected
<8>RPC: 740 call_transmit (status 0)
<8>RPC: 740 xprt_prepare_transmit
<8>RPC: 740 xprt_cwnd_limited cong = 0 cwnd = 512
<8>RPC: 740 call_encode (status 0)
<8>RPC: 740 marshaling UNIX cred f4efcb40
<8>RPC: 740 using AUTH_UNIX cred f4efcb40 to wrap rpc data
<8>RPC: 740 xprt_transmit(148)
<8>RPC: xs_udp_data_ready...
<8>RPC: cong 256, cwnd was 512, now 512
<8>RPC: 740 xid 643f3c42 complete (28 bytes received)
<8>RPC: xs_udp_send_request(148) = 148
<8>RPC: 740 xmit complete
<8>RPC: wake_up_next(f70d4114 "xprt_resend")
<8>RPC: wake_up_next(f70d40e4 "xprt_sending")
<8>RPC: 740 call_status (status 28)
<8>RPC: 740 call_decode (status 28)
<8>RPC: 740 validating UNIX cred f4efcb40
<8>RPC: 740 using AUTH_UNIX cred f4efcb40 to unwrap rpc data
<8>RPC: 740 call_decode result -22
<8>RPC: 740 return 0, status -22
<8>RPC: 740 release task
<8>RPC: freeing buffer of size 528 at e627b800
<8>RPC: 740 release request f1416000
<8>RPC: wake_up_next(f70d4174 "xprt_backlog")
<8>RPC: 740 releasing UNIX cred f4efcb40
<8>RPC: rpc_release_client(f6f1f880)
<8>NFS reply setattr: -22

[...]

3) It turns out the following:

- setattr returns EINVAL due to...
- RPC call_decode returns status 22

4) In conclusion in my understanding:

- At present linux nfs filesystem support is not anymore compatible 
with
old userspace servers like universal nfsd and crypto filesystems as 
cfsd

(which is an nfs server pretty old fashioned).




Linux NFS client issues with servers that are old or not widely used 
should be reported to [EMAIL PROTECTED] (thanks for forwarding 
this, Andrew).


User space servers are usually not tested 

Re: NFS EINVAL on open(... | O_TRUNC) on 2.6.23.9

2008-01-19 Thread Gianluca Alberici

Hello all,

Sorry for being down for so long but i have decided to test the latest 
2.6.24 kernel because of the huge number of NFS patches (especially at 
RPC UDP protocol level) introduced mainly by Netapp people (Trond 
Myklebust et al.), hoping that maybe the problem would have magically 
disappeared, but...


On kernel 2.6.24 rc7:

- Debian nfs-user-server (v2) is giving the very same problem i reported 
last times: EINVAL on open() syscall with O|TRUNC on existing file

- CFSD, same as debian nfs-user-server (v2)

To answer Chuck's questions:

i have already straced all the applications, which puts in evidence that 
the problem is into the open() syscall with O|TRUNC on existing files, 
always.
I prepared that very basic piece of c code to rise the problem, as you 
can see it just 'syscalls' open(), and that is enough.

NFS version is 2, no options (for CFSD [ -oport=3049,intr ] as i ever did)

When im done with this email i will work on finding where exactly the 
problem arises, for now i can say that 2.6.21.7 is working so well, and 
2.6.22.16 is not working exactly as 2.6.24rc7 does.


I am really interested in finding out whats wrong with NFSD/CFSD and if 
it was a problem of NFS compliance i would really be glad to find out 
what to do to patch those servers, but im really far to be able to do it 
alone.
But: dont you think that we should find out that the problem is surely 
into the userspace servers before saying that everything is alright with 
linux kernel ?


I propose to anybody interested to test cfsd or nfs-user-server 
2.2beta47-23 (from debian etch) on any kernel = 2.6.23. You ca use 
something like the piece of code included in this email, or you can simply:


echo Hello  /mnt/nfsmount/test

which is gonna fail after the first time (the first it creates a new 
file and it works) with EINVAL.


Now as i already said the problem is an RPC call_decode which fails, 
this is evident from an RPC debug. I have noticed a different number of 
bytes in the response from knfsd to userspace (28 instead of 96).


Somebody can help ?

Gianluca

Chuck Lever wrote:


On Dec 25, 2007, at 5:04 PM, Andrew Morton wrote:

On Sun, 23 Dec 2007 12:35:17 +0100 Gianluca Alberici 
[EMAIL PROTECTED] wrote:



Hello,

I can do better. I have investigated a bit the problem:

1) The problem arises only with the userspace nfsd (Universal nfsd 
2.2).

I have realized that the latest patches introduced in 2.6.22 have
changed a lot of things into NFS.

2) The following code has been debugged with sunrpc.rpc_debug and
sunrpc.nfs_debug

#include sys/types.h
#include sys/stat.h
#include fcntl.h
#include errno.h
#include stdio.h

int main(int argc, char *argv[]) {
int fp;
if ((fp=open(/mnt/tmp/test,O_WRONLY | O_CREAT | O_TRUNC, S_IRWXU 
)) == -1) printf(ERR: %d\n,errno);

else {
write(fp, argv[1], strlen(argv[1]));
printf(OK\n);
close(fp);
};
}


2b) The output

[...]

8NFS call setattr
8RPC: new task initialized, procpid 18656
8RPC: allocated task f7bec500
8RPC: 0 looking up UNIX cred
8RPC: 740 __rpc_execute flags=0x480
8RPC: 740 call_start nfs2 proc 2 (sync)
8RPC: 740 call_reserve (status 0)
8RPC: 740 reserved req f1416000 xid 643f3c42
8RPC: 740 call_reserveresult (status 0)
8RPC: 740 call_allocate (status 0)
8RPC: 740 allocated buffer of size 528 at e627b800
8RPC: 740 call_bind (status 0)
8RPC: 740 call_connect xprt f70d4000 is connected
8RPC: 740 call_transmit (status 0)
8RPC: 740 xprt_prepare_transmit
8RPC: 740 xprt_cwnd_limited cong = 0 cwnd = 512
8RPC: 740 call_encode (status 0)
8RPC: 740 marshaling UNIX cred f4efcb40
8RPC: 740 using AUTH_UNIX cred f4efcb40 to wrap rpc data
8RPC: 740 xprt_transmit(148)
8RPC: xs_udp_data_ready...
8RPC: cong 256, cwnd was 512, now 512
8RPC: 740 xid 643f3c42 complete (28 bytes received)
8RPC: xs_udp_send_request(148) = 148
8RPC: 740 xmit complete
8RPC: wake_up_next(f70d4114 xprt_resend)
8RPC: wake_up_next(f70d40e4 xprt_sending)
8RPC: 740 call_status (status 28)
8RPC: 740 call_decode (status 28)
8RPC: 740 validating UNIX cred f4efcb40
8RPC: 740 using AUTH_UNIX cred f4efcb40 to unwrap rpc data
8RPC: 740 call_decode result -22
8RPC: 740 return 0, status -22
8RPC: 740 release task
8RPC: freeing buffer of size 528 at e627b800
8RPC: 740 release request f1416000
8RPC: wake_up_next(f70d4174 xprt_backlog)
8RPC: 740 releasing UNIX cred f4efcb40
8RPC: rpc_release_client(f6f1f880)
8NFS reply setattr: -22

[...]

3) It turns out the following:

- setattr returns EINVAL due to...
- RPC call_decode returns status 22

4) In conclusion in my understanding:

- At present linux nfs filesystem support is not anymore compatible 
with
old userspace servers like universal nfsd and crypto filesystems as 
cfsd

(which is an nfs server pretty old fashioned).




Linux NFS client issues with servers that are old or not widely used 
should be reported to [EMAIL PROTECTED] (thanks for forwarding 
this, Andrew).


User space servers are usually not tested with NFS client changes, 
since their use is infrequent 

Re: NFS EINVAL on open(... | O_TRUNC) on 2.6.23.9

2008-01-19 Thread Gianluca Alberici

Hello Chuck,

As you requested i have finally found out that the problem was 
introduced in 2.6.22 (2.6.21.7 works, after 2.6.21.7 theres 2.6.22)


I have also examined the changeslogs: 2.6.22 introduce a certain number 
of patches into NFS client. Many of them are yours and Trond's.


Pls can you help with this ?

TIA

Gianluca
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: NFS EINVAL on open(... | O_TRUNC) on 2.6.23.9

2007-12-26 Thread Chuck Lever

On Dec 25, 2007, at 5:04 PM, Andrew Morton wrote:
On Sun, 23 Dec 2007 12:35:17 +0100 Gianluca Alberici  
<[EMAIL PROTECTED]> wrote:



Hello,

I can do better. I have investigated a bit the problem:

1) The problem arises only with the userspace nfsd (Universal nfsd  
2.2).

I have realized that the latest patches introduced in 2.6.22 have
changed a lot of things into NFS.

2) The following code has been debugged with sunrpc.rpc_debug and
sunrpc.nfs_debug

#include 
#include 
#include 
#include 
#include 

int main(int argc, char *argv[]) {
int fp;
if ((fp=open("/mnt/tmp/test",O_WRONLY | O_CREAT | O_TRUNC,  
S_IRWXU )) == -1) printf("ERR: %d\n",errno);

else {
write(fp, argv[1], strlen(argv[1]));
printf("OK\n");
close(fp);
};
}


2b) The output

[...]

<8>NFS call  setattr
<8>RPC:   new task initialized, procpid 18656
<8>RPC:   allocated task f7bec500
<8>RPC: 0 looking up UNIX cred
<8>RPC:   740 __rpc_execute flags=0x480
<8>RPC:   740 call_start nfs2 proc 2 (sync)
<8>RPC:   740 call_reserve (status 0)
<8>RPC:   740 reserved req f1416000 xid 643f3c42
<8>RPC:   740 call_reserveresult (status 0)
<8>RPC:   740 call_allocate (status 0)
<8>RPC:   740 allocated buffer of size 528 at e627b800
<8>RPC:   740 call_bind (status 0)
<8>RPC:   740 call_connect xprt f70d4000 is connected
<8>RPC:   740 call_transmit (status 0)
<8>RPC:   740 xprt_prepare_transmit
<8>RPC:   740 xprt_cwnd_limited cong = 0 cwnd = 512
<8>RPC:   740 call_encode (status 0)
<8>RPC:   740 marshaling UNIX cred f4efcb40
<8>RPC:   740 using AUTH_UNIX cred f4efcb40 to wrap rpc data
<8>RPC:   740 xprt_transmit(148)
<8>RPC:   xs_udp_data_ready...
<8>RPC:   cong 256, cwnd was 512, now 512
<8>RPC:   740 xid 643f3c42 complete (28 bytes received)
<8>RPC:   xs_udp_send_request(148) = 148
<8>RPC:   740 xmit complete
<8>RPC:   wake_up_next(f70d4114 "xprt_resend")
<8>RPC:   wake_up_next(f70d40e4 "xprt_sending")
<8>RPC:   740 call_status (status 28)
<8>RPC:   740 call_decode (status 28)
<8>RPC:   740 validating UNIX cred f4efcb40
<8>RPC:   740 using AUTH_UNIX cred f4efcb40 to unwrap rpc data
<8>RPC:   740 call_decode result -22
<8>RPC:   740 return 0, status -22
<8>RPC:   740 release task
<8>RPC:   freeing buffer of size 528 at e627b800
<8>RPC:   740 release request f1416000
<8>RPC:   wake_up_next(f70d4174 "xprt_backlog")
<8>RPC:   740 releasing UNIX cred f4efcb40
<8>RPC:   rpc_release_client(f6f1f880)
<8>NFS reply setattr: -22

[...]

3) It turns out the following:

- setattr returns EINVAL due to...
- RPC call_decode returns status 22

4) In conclusion in my understanding:

- At present linux nfs filesystem support is not anymore  
compatible with
old userspace servers like universal nfsd and crypto filesystems  
as cfsd

(which is an nfs server pretty old fashioned).


Linux NFS client issues with servers that are old or not widely used  
should be reported to [EMAIL PROTECTED] (thanks for  
forwarding this, Andrew).


User space servers are usually not tested with NFS client changes,  
since their use is infrequent compared with Solaris, NetApp filers,  
and the Linux knfsd (and several others).  We have to draw the line  
somewhere to make NFS client development a manageable process.



- This problem makes UNFSD and CFSD unusable on 2.6.22 and up (this
doesnt sound good to me)


It's likely the NFS client team is not aware of problems with these  
servers simply because none of us run them, and nothing has been  
reported so far.



The question are:

- Is this wanted or is a bug ?


An EINVAL return is fairly generic, but since it is coming from the  
reply path on the client, that probably indicates that decoding the  
reply failed somehow.  That could be a client problem, or the server  
reply was incorrect or corrupt.


More specific information about the problem is needed.  Could we see  
a wire trace?  A raw tcpdump or tethereal dump captured during the  
problematic interaction would be helpful.  Maybe even an strace of  
the failing application would tell us what the arguments for the  
setattr() call are.  And what are your mount options, especially  
which NFS version? (cat /proc/mounts)


Have you tried a 'git bisect' or something similar to track down  
exactly when client behavior changed?


- Can this be solved with some backwards compat stuff into the  
kernel or

all the old packages as UNFSD have to be
bugfixed/upgraded


If NFS servers don't conform to the NFS protocol specification, then  
it's unlikely that the NFS client will be changed to fix such  
issues.  Ie: server bugs need to be fixed on the server.  That is  
sometimes difficult if the server is not maintained, for instance.


- I have found other strange behaviors of the new NFS filesystem  
support

that i am investigating. All are bound to the user of
old userspace servers onto the new NFS (since 2.6.22). What to do ?


Report the problems on the 

Re: NFS EINVAL on open(... | O_TRUNC) on 2.6.23.9

2007-12-26 Thread Gianluca Alberici

Hello all,

I've seen that Trond and the other guys at Netapp are working hard on 
the NFS support and there are dozens of patches in latest

2.6.24rcX s kernels regarding both RPC and NFS.
I can only hope, as Andrew Morton said, that the policy of new NFS impl 
was 'dont break'.
In the end those buggy, dreaded, old fashioned NFSs have done their job 
excellently for over 15 years and will be around for

another 5 IMHO.

Anyways, id really would be so thankful if somebody from the NFS team 
was givin us a clue about the whole thing.


TIA

Gianluca

Andrew Morton wrote:


On Sun, 23 Dec 2007 12:35:17 +0100 Gianluca Alberici <[EMAIL PROTECTED]> wrote:

 


Hello,

I can do better. I have investigated a bit the problem:

1) The problem arises only with the userspace nfsd (Universal nfsd 2.2). 
I have realized that the latest patches introduced in 2.6.22 have 
changed a lot of things into NFS.


2) The following code has been debugged with sunrpc.rpc_debug and 
sunrpc.nfs_debug


#include 
#include 
#include 
#include 
#include 

int main(int argc, char *argv[]) {
   int fp;
   if ((fp=open("/mnt/tmp/test",O_WRONLY | O_CREAT | O_TRUNC, S_IRWXU )) == -1) 
printf("ERR: %d\n",errno);
   else {
   write(fp, argv[1], strlen(argv[1]));
   printf("OK\n");
   close(fp);
   };
}


2b) The output

[...]

<8>NFS call  setattr
<8>RPC:   new task initialized, procpid 18656
<8>RPC:   allocated task f7bec500
<8>RPC: 0 looking up UNIX cred
<8>RPC:   740 __rpc_execute flags=0x480
<8>RPC:   740 call_start nfs2 proc 2 (sync)
<8>RPC:   740 call_reserve (status 0)
<8>RPC:   740 reserved req f1416000 xid 643f3c42
<8>RPC:   740 call_reserveresult (status 0)
<8>RPC:   740 call_allocate (status 0)
<8>RPC:   740 allocated buffer of size 528 at e627b800
<8>RPC:   740 call_bind (status 0)
<8>RPC:   740 call_connect xprt f70d4000 is connected
<8>RPC:   740 call_transmit (status 0)
<8>RPC:   740 xprt_prepare_transmit
<8>RPC:   740 xprt_cwnd_limited cong = 0 cwnd = 512
<8>RPC:   740 call_encode (status 0)
<8>RPC:   740 marshaling UNIX cred f4efcb40
<8>RPC:   740 using AUTH_UNIX cred f4efcb40 to wrap rpc data
<8>RPC:   740 xprt_transmit(148)
<8>RPC:   xs_udp_data_ready...
<8>RPC:   cong 256, cwnd was 512, now 512
<8>RPC:   740 xid 643f3c42 complete (28 bytes received)
<8>RPC:   xs_udp_send_request(148) = 148
<8>RPC:   740 xmit complete
<8>RPC:   wake_up_next(f70d4114 "xprt_resend")
<8>RPC:   wake_up_next(f70d40e4 "xprt_sending")
<8>RPC:   740 call_status (status 28)
<8>RPC:   740 call_decode (status 28)
<8>RPC:   740 validating UNIX cred f4efcb40
<8>RPC:   740 using AUTH_UNIX cred f4efcb40 to unwrap rpc data
<8>RPC:   740 call_decode result -22
<8>RPC:   740 return 0, status -22
<8>RPC:   740 release task
<8>RPC:   freeing buffer of size 528 at e627b800
<8>RPC:   740 release request f1416000
<8>RPC:   wake_up_next(f70d4174 "xprt_backlog")
<8>RPC:   740 releasing UNIX cred f4efcb40
<8>RPC:   rpc_release_client(f6f1f880)
<8>NFS reply setattr: -22

[...]

3) It turns out the following:

- setattr returns EINVAL due to...
- RPC call_decode returns status 22

4) In conclusion in my understanding:

- At present linux nfs filesystem support is not anymore compatible with 
old userspace servers like universal nfsd and crypto filesystems as cfsd 
(which is an nfs server pretty old fashioned).
- This problem makes UNFSD and CFSD unusable on 2.6.22 and up (this 
doesnt sound good to me)


The question are:

- Is this wanted or is a bug ?
- Can this be solved with some backwards compat stuff into the kernel or 
all the old packages as UNFSD have to be

bugfixed/upgraded
- I have found other strange behaviors of the new NFS filesystem support 
that i am investigating. All are bound to the user of

old userspace servers onto the new NFS (since 2.6.22). What to do ?

   



I'm not sure what the NFS client's policy is regarding support for
userspace servers.  But I'd certainly hope that it is "don't break them".

Which would make this an NFS client regression.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
 



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: NFS EINVAL on open(... | O_TRUNC) on 2.6.23.9

2007-12-26 Thread Gianluca Alberici

Hello all,

I've seen that Trond and the other guys at Netapp are working hard on 
the NFS support and there are dozens of patches in latest

2.6.24rcX s kernels regarding both RPC and NFS.
I can only hope, as Andrew Morton said, that the policy of new NFS impl 
was 'dont break'.
In the end those buggy, dreaded, old fashioned NFSs have done their job 
excellently for over 15 years and will be around for

another 5 IMHO.

Anyways, id really would be so thankful if somebody from the NFS team 
was givin us a clue about the whole thing.


TIA

Gianluca

Andrew Morton wrote:


On Sun, 23 Dec 2007 12:35:17 +0100 Gianluca Alberici [EMAIL PROTECTED] wrote:

 


Hello,

I can do better. I have investigated a bit the problem:

1) The problem arises only with the userspace nfsd (Universal nfsd 2.2). 
I have realized that the latest patches introduced in 2.6.22 have 
changed a lot of things into NFS.


2) The following code has been debugged with sunrpc.rpc_debug and 
sunrpc.nfs_debug


#include sys/types.h
#include sys/stat.h
#include fcntl.h
#include errno.h
#include stdio.h

int main(int argc, char *argv[]) {
   int fp;
   if ((fp=open(/mnt/tmp/test,O_WRONLY | O_CREAT | O_TRUNC, S_IRWXU )) == -1) 
printf(ERR: %d\n,errno);
   else {
   write(fp, argv[1], strlen(argv[1]));
   printf(OK\n);
   close(fp);
   };
}


2b) The output

[...]

8NFS call  setattr
8RPC:   new task initialized, procpid 18656
8RPC:   allocated task f7bec500
8RPC: 0 looking up UNIX cred
8RPC:   740 __rpc_execute flags=0x480
8RPC:   740 call_start nfs2 proc 2 (sync)
8RPC:   740 call_reserve (status 0)
8RPC:   740 reserved req f1416000 xid 643f3c42
8RPC:   740 call_reserveresult (status 0)
8RPC:   740 call_allocate (status 0)
8RPC:   740 allocated buffer of size 528 at e627b800
8RPC:   740 call_bind (status 0)
8RPC:   740 call_connect xprt f70d4000 is connected
8RPC:   740 call_transmit (status 0)
8RPC:   740 xprt_prepare_transmit
8RPC:   740 xprt_cwnd_limited cong = 0 cwnd = 512
8RPC:   740 call_encode (status 0)
8RPC:   740 marshaling UNIX cred f4efcb40
8RPC:   740 using AUTH_UNIX cred f4efcb40 to wrap rpc data
8RPC:   740 xprt_transmit(148)
8RPC:   xs_udp_data_ready...
8RPC:   cong 256, cwnd was 512, now 512
8RPC:   740 xid 643f3c42 complete (28 bytes received)
8RPC:   xs_udp_send_request(148) = 148
8RPC:   740 xmit complete
8RPC:   wake_up_next(f70d4114 xprt_resend)
8RPC:   wake_up_next(f70d40e4 xprt_sending)
8RPC:   740 call_status (status 28)
8RPC:   740 call_decode (status 28)
8RPC:   740 validating UNIX cred f4efcb40
8RPC:   740 using AUTH_UNIX cred f4efcb40 to unwrap rpc data
8RPC:   740 call_decode result -22
8RPC:   740 return 0, status -22
8RPC:   740 release task
8RPC:   freeing buffer of size 528 at e627b800
8RPC:   740 release request f1416000
8RPC:   wake_up_next(f70d4174 xprt_backlog)
8RPC:   740 releasing UNIX cred f4efcb40
8RPC:   rpc_release_client(f6f1f880)
8NFS reply setattr: -22

[...]

3) It turns out the following:

- setattr returns EINVAL due to...
- RPC call_decode returns status 22

4) In conclusion in my understanding:

- At present linux nfs filesystem support is not anymore compatible with 
old userspace servers like universal nfsd and crypto filesystems as cfsd 
(which is an nfs server pretty old fashioned).
- This problem makes UNFSD and CFSD unusable on 2.6.22 and up (this 
doesnt sound good to me)


The question are:

- Is this wanted or is a bug ?
- Can this be solved with some backwards compat stuff into the kernel or 
all the old packages as UNFSD have to be

bugfixed/upgraded
- I have found other strange behaviors of the new NFS filesystem support 
that i am investigating. All are bound to the user of

old userspace servers onto the new NFS (since 2.6.22). What to do ?

   



I'm not sure what the NFS client's policy is regarding support for
userspace servers.  But I'd certainly hope that it is don't break them.

Which would make this an NFS client regression.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
 



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: NFS EINVAL on open(... | O_TRUNC) on 2.6.23.9

2007-12-26 Thread Chuck Lever

On Dec 25, 2007, at 5:04 PM, Andrew Morton wrote:
On Sun, 23 Dec 2007 12:35:17 +0100 Gianluca Alberici  
[EMAIL PROTECTED] wrote:



Hello,

I can do better. I have investigated a bit the problem:

1) The problem arises only with the userspace nfsd (Universal nfsd  
2.2).

I have realized that the latest patches introduced in 2.6.22 have
changed a lot of things into NFS.

2) The following code has been debugged with sunrpc.rpc_debug and
sunrpc.nfs_debug

#include sys/types.h
#include sys/stat.h
#include fcntl.h
#include errno.h
#include stdio.h

int main(int argc, char *argv[]) {
int fp;
if ((fp=open(/mnt/tmp/test,O_WRONLY | O_CREAT | O_TRUNC,  
S_IRWXU )) == -1) printf(ERR: %d\n,errno);

else {
write(fp, argv[1], strlen(argv[1]));
printf(OK\n);
close(fp);
};
}


2b) The output

[...]

8NFS call  setattr
8RPC:   new task initialized, procpid 18656
8RPC:   allocated task f7bec500
8RPC: 0 looking up UNIX cred
8RPC:   740 __rpc_execute flags=0x480
8RPC:   740 call_start nfs2 proc 2 (sync)
8RPC:   740 call_reserve (status 0)
8RPC:   740 reserved req f1416000 xid 643f3c42
8RPC:   740 call_reserveresult (status 0)
8RPC:   740 call_allocate (status 0)
8RPC:   740 allocated buffer of size 528 at e627b800
8RPC:   740 call_bind (status 0)
8RPC:   740 call_connect xprt f70d4000 is connected
8RPC:   740 call_transmit (status 0)
8RPC:   740 xprt_prepare_transmit
8RPC:   740 xprt_cwnd_limited cong = 0 cwnd = 512
8RPC:   740 call_encode (status 0)
8RPC:   740 marshaling UNIX cred f4efcb40
8RPC:   740 using AUTH_UNIX cred f4efcb40 to wrap rpc data
8RPC:   740 xprt_transmit(148)
8RPC:   xs_udp_data_ready...
8RPC:   cong 256, cwnd was 512, now 512
8RPC:   740 xid 643f3c42 complete (28 bytes received)
8RPC:   xs_udp_send_request(148) = 148
8RPC:   740 xmit complete
8RPC:   wake_up_next(f70d4114 xprt_resend)
8RPC:   wake_up_next(f70d40e4 xprt_sending)
8RPC:   740 call_status (status 28)
8RPC:   740 call_decode (status 28)
8RPC:   740 validating UNIX cred f4efcb40
8RPC:   740 using AUTH_UNIX cred f4efcb40 to unwrap rpc data
8RPC:   740 call_decode result -22
8RPC:   740 return 0, status -22
8RPC:   740 release task
8RPC:   freeing buffer of size 528 at e627b800
8RPC:   740 release request f1416000
8RPC:   wake_up_next(f70d4174 xprt_backlog)
8RPC:   740 releasing UNIX cred f4efcb40
8RPC:   rpc_release_client(f6f1f880)
8NFS reply setattr: -22

[...]

3) It turns out the following:

- setattr returns EINVAL due to...
- RPC call_decode returns status 22

4) In conclusion in my understanding:

- At present linux nfs filesystem support is not anymore  
compatible with
old userspace servers like universal nfsd and crypto filesystems  
as cfsd

(which is an nfs server pretty old fashioned).


Linux NFS client issues with servers that are old or not widely used  
should be reported to [EMAIL PROTECTED] (thanks for  
forwarding this, Andrew).


User space servers are usually not tested with NFS client changes,  
since their use is infrequent compared with Solaris, NetApp filers,  
and the Linux knfsd (and several others).  We have to draw the line  
somewhere to make NFS client development a manageable process.



- This problem makes UNFSD and CFSD unusable on 2.6.22 and up (this
doesnt sound good to me)


It's likely the NFS client team is not aware of problems with these  
servers simply because none of us run them, and nothing has been  
reported so far.



The question are:

- Is this wanted or is a bug ?


An EINVAL return is fairly generic, but since it is coming from the  
reply path on the client, that probably indicates that decoding the  
reply failed somehow.  That could be a client problem, or the server  
reply was incorrect or corrupt.


More specific information about the problem is needed.  Could we see  
a wire trace?  A raw tcpdump or tethereal dump captured during the  
problematic interaction would be helpful.  Maybe even an strace of  
the failing application would tell us what the arguments for the  
setattr() call are.  And what are your mount options, especially  
which NFS version? (cat /proc/mounts)


Have you tried a 'git bisect' or something similar to track down  
exactly when client behavior changed?


- Can this be solved with some backwards compat stuff into the  
kernel or

all the old packages as UNFSD have to be
bugfixed/upgraded


If NFS servers don't conform to the NFS protocol specification, then  
it's unlikely that the NFS client will be changed to fix such  
issues.  Ie: server bugs need to be fixed on the server.  That is  
sometimes difficult if the server is not maintained, for instance.


- I have found other strange behaviors of the new NFS filesystem  
support

that i am investigating. All are bound to the user of
old userspace servers onto the new NFS (since 2.6.22). What to do ?


Report the problems on the [EMAIL PROTECTED] mailing list, or
document them in the 

Re: NFS EINVAL on open(... | O_TRUNC) on 2.6.23.9

2007-12-25 Thread Andrew Morton
On Sun, 23 Dec 2007 12:35:17 +0100 Gianluca Alberici <[EMAIL PROTECTED]> wrote:

> Hello,
> 
> I can do better. I have investigated a bit the problem:
> 
> 1) The problem arises only with the userspace nfsd (Universal nfsd 2.2). 
> I have realized that the latest patches introduced in 2.6.22 have 
> changed a lot of things into NFS.
> 
> 2) The following code has been debugged with sunrpc.rpc_debug and 
> sunrpc.nfs_debug
> 
> #include 
> #include 
> #include 
> #include 
> #include 
> 
> int main(int argc, char *argv[]) {
> int fp;
> if ((fp=open("/mnt/tmp/test",O_WRONLY | O_CREAT | O_TRUNC, S_IRWXU )) 
> == -1) printf("ERR: %d\n",errno);
> else {
> write(fp, argv[1], strlen(argv[1]));
> printf("OK\n");
> close(fp);
> };
> }
> 
> 
> 2b) The output
> 
> [...]
> 
> <8>NFS call  setattr
> <8>RPC:   new task initialized, procpid 18656
> <8>RPC:   allocated task f7bec500
> <8>RPC: 0 looking up UNIX cred
> <8>RPC:   740 __rpc_execute flags=0x480
> <8>RPC:   740 call_start nfs2 proc 2 (sync)
> <8>RPC:   740 call_reserve (status 0)
> <8>RPC:   740 reserved req f1416000 xid 643f3c42
> <8>RPC:   740 call_reserveresult (status 0)
> <8>RPC:   740 call_allocate (status 0)
> <8>RPC:   740 allocated buffer of size 528 at e627b800
> <8>RPC:   740 call_bind (status 0)
> <8>RPC:   740 call_connect xprt f70d4000 is connected
> <8>RPC:   740 call_transmit (status 0)
> <8>RPC:   740 xprt_prepare_transmit
> <8>RPC:   740 xprt_cwnd_limited cong = 0 cwnd = 512
> <8>RPC:   740 call_encode (status 0)
> <8>RPC:   740 marshaling UNIX cred f4efcb40
> <8>RPC:   740 using AUTH_UNIX cred f4efcb40 to wrap rpc data
> <8>RPC:   740 xprt_transmit(148)
> <8>RPC:   xs_udp_data_ready...
> <8>RPC:   cong 256, cwnd was 512, now 512
> <8>RPC:   740 xid 643f3c42 complete (28 bytes received)
> <8>RPC:   xs_udp_send_request(148) = 148
> <8>RPC:   740 xmit complete
> <8>RPC:   wake_up_next(f70d4114 "xprt_resend")
> <8>RPC:   wake_up_next(f70d40e4 "xprt_sending")
> <8>RPC:   740 call_status (status 28)
> <8>RPC:   740 call_decode (status 28)
> <8>RPC:   740 validating UNIX cred f4efcb40
> <8>RPC:   740 using AUTH_UNIX cred f4efcb40 to unwrap rpc data
> <8>RPC:   740 call_decode result -22
> <8>RPC:   740 return 0, status -22
> <8>RPC:   740 release task
> <8>RPC:   freeing buffer of size 528 at e627b800
> <8>RPC:   740 release request f1416000
> <8>RPC:   wake_up_next(f70d4174 "xprt_backlog")
> <8>RPC:   740 releasing UNIX cred f4efcb40
> <8>RPC:   rpc_release_client(f6f1f880)
> <8>NFS reply setattr: -22
> 
> [...]
> 
> 3) It turns out the following:
> 
> - setattr returns EINVAL due to...
> - RPC call_decode returns status 22
> 
> 4) In conclusion in my understanding:
> 
> - At present linux nfs filesystem support is not anymore compatible with 
> old userspace servers like universal nfsd and crypto filesystems as cfsd 
> (which is an nfs server pretty old fashioned).
> - This problem makes UNFSD and CFSD unusable on 2.6.22 and up (this 
> doesnt sound good to me)
> 
> The question are:
> 
> - Is this wanted or is a bug ?
> - Can this be solved with some backwards compat stuff into the kernel or 
> all the old packages as UNFSD have to be
> bugfixed/upgraded
> - I have found other strange behaviors of the new NFS filesystem support 
> that i am investigating. All are bound to the user of
> old userspace servers onto the new NFS (since 2.6.22). What to do ?
> 

I'm not sure what the NFS client's policy is regarding support for
userspace servers.  But I'd certainly hope that it is "don't break them".

Which would make this an NFS client regression.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: NFS EINVAL on open(... | O_TRUNC) on 2.6.23.9

2007-12-25 Thread Andrew Morton
On Sun, 23 Dec 2007 12:35:17 +0100 Gianluca Alberici [EMAIL PROTECTED] wrote:

 Hello,
 
 I can do better. I have investigated a bit the problem:
 
 1) The problem arises only with the userspace nfsd (Universal nfsd 2.2). 
 I have realized that the latest patches introduced in 2.6.22 have 
 changed a lot of things into NFS.
 
 2) The following code has been debugged with sunrpc.rpc_debug and 
 sunrpc.nfs_debug
 
 #include sys/types.h
 #include sys/stat.h
 #include fcntl.h
 #include errno.h
 #include stdio.h
 
 int main(int argc, char *argv[]) {
 int fp;
 if ((fp=open(/mnt/tmp/test,O_WRONLY | O_CREAT | O_TRUNC, S_IRWXU )) 
 == -1) printf(ERR: %d\n,errno);
 else {
 write(fp, argv[1], strlen(argv[1]));
 printf(OK\n);
 close(fp);
 };
 }
 
 
 2b) The output
 
 [...]
 
 8NFS call  setattr
 8RPC:   new task initialized, procpid 18656
 8RPC:   allocated task f7bec500
 8RPC: 0 looking up UNIX cred
 8RPC:   740 __rpc_execute flags=0x480
 8RPC:   740 call_start nfs2 proc 2 (sync)
 8RPC:   740 call_reserve (status 0)
 8RPC:   740 reserved req f1416000 xid 643f3c42
 8RPC:   740 call_reserveresult (status 0)
 8RPC:   740 call_allocate (status 0)
 8RPC:   740 allocated buffer of size 528 at e627b800
 8RPC:   740 call_bind (status 0)
 8RPC:   740 call_connect xprt f70d4000 is connected
 8RPC:   740 call_transmit (status 0)
 8RPC:   740 xprt_prepare_transmit
 8RPC:   740 xprt_cwnd_limited cong = 0 cwnd = 512
 8RPC:   740 call_encode (status 0)
 8RPC:   740 marshaling UNIX cred f4efcb40
 8RPC:   740 using AUTH_UNIX cred f4efcb40 to wrap rpc data
 8RPC:   740 xprt_transmit(148)
 8RPC:   xs_udp_data_ready...
 8RPC:   cong 256, cwnd was 512, now 512
 8RPC:   740 xid 643f3c42 complete (28 bytes received)
 8RPC:   xs_udp_send_request(148) = 148
 8RPC:   740 xmit complete
 8RPC:   wake_up_next(f70d4114 xprt_resend)
 8RPC:   wake_up_next(f70d40e4 xprt_sending)
 8RPC:   740 call_status (status 28)
 8RPC:   740 call_decode (status 28)
 8RPC:   740 validating UNIX cred f4efcb40
 8RPC:   740 using AUTH_UNIX cred f4efcb40 to unwrap rpc data
 8RPC:   740 call_decode result -22
 8RPC:   740 return 0, status -22
 8RPC:   740 release task
 8RPC:   freeing buffer of size 528 at e627b800
 8RPC:   740 release request f1416000
 8RPC:   wake_up_next(f70d4174 xprt_backlog)
 8RPC:   740 releasing UNIX cred f4efcb40
 8RPC:   rpc_release_client(f6f1f880)
 8NFS reply setattr: -22
 
 [...]
 
 3) It turns out the following:
 
 - setattr returns EINVAL due to...
 - RPC call_decode returns status 22
 
 4) In conclusion in my understanding:
 
 - At present linux nfs filesystem support is not anymore compatible with 
 old userspace servers like universal nfsd and crypto filesystems as cfsd 
 (which is an nfs server pretty old fashioned).
 - This problem makes UNFSD and CFSD unusable on 2.6.22 and up (this 
 doesnt sound good to me)
 
 The question are:
 
 - Is this wanted or is a bug ?
 - Can this be solved with some backwards compat stuff into the kernel or 
 all the old packages as UNFSD have to be
 bugfixed/upgraded
 - I have found other strange behaviors of the new NFS filesystem support 
 that i am investigating. All are bound to the user of
 old userspace servers onto the new NFS (since 2.6.22). What to do ?
 

I'm not sure what the NFS client's policy is regarding support for
userspace servers.  But I'd certainly hope that it is don't break them.

Which would make this an NFS client regression.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: NFS EINVAL on open(... | O_TRUNC) on 2.6.23.9

2007-12-23 Thread Gianluca Alberici

Hello,

I can do better. I have investigated a bit the problem:

1) The problem arises only with the userspace nfsd (Universal nfsd 2.2). 
I have realized that the latest patches introduced in 2.6.22 have 
changed a lot of things into NFS.


2) The following code has been debugged with sunrpc.rpc_debug and 
sunrpc.nfs_debug


#include 
#include 
#include 
#include 
#include 

int main(int argc, char *argv[]) {
   int fp;
   if ((fp=open("/mnt/tmp/test",O_WRONLY | O_CREAT | O_TRUNC, S_IRWXU )) == -1) 
printf("ERR: %d\n",errno);
   else {
   write(fp, argv[1], strlen(argv[1]));
   printf("OK\n");
   close(fp);
   };
}


2b) The output

[...]

<8>NFS call  setattr
<8>RPC:   new task initialized, procpid 18656
<8>RPC:   allocated task f7bec500
<8>RPC: 0 looking up UNIX cred
<8>RPC:   740 __rpc_execute flags=0x480
<8>RPC:   740 call_start nfs2 proc 2 (sync)
<8>RPC:   740 call_reserve (status 0)
<8>RPC:   740 reserved req f1416000 xid 643f3c42
<8>RPC:   740 call_reserveresult (status 0)
<8>RPC:   740 call_allocate (status 0)
<8>RPC:   740 allocated buffer of size 528 at e627b800
<8>RPC:   740 call_bind (status 0)
<8>RPC:   740 call_connect xprt f70d4000 is connected
<8>RPC:   740 call_transmit (status 0)
<8>RPC:   740 xprt_prepare_transmit
<8>RPC:   740 xprt_cwnd_limited cong = 0 cwnd = 512
<8>RPC:   740 call_encode (status 0)
<8>RPC:   740 marshaling UNIX cred f4efcb40
<8>RPC:   740 using AUTH_UNIX cred f4efcb40 to wrap rpc data
<8>RPC:   740 xprt_transmit(148)
<8>RPC:   xs_udp_data_ready...
<8>RPC:   cong 256, cwnd was 512, now 512
<8>RPC:   740 xid 643f3c42 complete (28 bytes received)
<8>RPC:   xs_udp_send_request(148) = 148
<8>RPC:   740 xmit complete
<8>RPC:   wake_up_next(f70d4114 "xprt_resend")
<8>RPC:   wake_up_next(f70d40e4 "xprt_sending")
<8>RPC:   740 call_status (status 28)
<8>RPC:   740 call_decode (status 28)
<8>RPC:   740 validating UNIX cred f4efcb40
<8>RPC:   740 using AUTH_UNIX cred f4efcb40 to unwrap rpc data
<8>RPC:   740 call_decode result -22
<8>RPC:   740 return 0, status -22
<8>RPC:   740 release task
<8>RPC:   freeing buffer of size 528 at e627b800
<8>RPC:   740 release request f1416000
<8>RPC:   wake_up_next(f70d4174 "xprt_backlog")
<8>RPC:   740 releasing UNIX cred f4efcb40
<8>RPC:   rpc_release_client(f6f1f880)
<8>NFS reply setattr: -22

[...]

3) It turns out the following:

- setattr returns EINVAL due to...
- RPC call_decode returns status 22

4) In conclusion in my understanding:

- At present linux nfs filesystem support is not anymore compatible with 
old userspace servers like universal nfsd and crypto filesystems as cfsd 
(which is an nfs server pretty old fashioned).
- This problem makes UNFSD and CFSD unusable on 2.6.22 and up (this 
doesnt sound good to me)


The question are:

- Is this wanted or is a bug ?
- Can this be solved with some backwards compat stuff into the kernel or 
all the old packages as UNFSD have to be

bugfixed/upgraded
- I have found other strange behaviors of the new NFS filesystem support 
that i am investigating. All are bound to the user of

old userspace servers onto the new NFS (since 2.6.22). What to do ?

Thanks

Gianluca




Scott wrote:



On Dec 22, 2007, at 5:52 AM, Gianluca Alberici wrote:


I've run into this problem 2.6.23.9. The open syscall will return
"Invalid argument" when O_TRUNC is set on existing files.

The same file can be opened for append or removed.

The evidence is for example:

mars:~# mount localhost:/opt/nfs/ /mnt/tmp
mars:~# echo "Hello" > /mnt/tmp/test-file
mars:~# echo "Hello" > /mnt/tmp/test-file
bash: /mnt/tmp/test-file: Invalid argument



Can you show an strace of this please? Also, might suggest "rpcdebug - 
m nfsd -s all"


--
Scott --
To unsubscribe from this list: send the line "unsubscribe 
linux-kernel" in

the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: NFS EINVAL on open(... | O_TRUNC) on 2.6.23.9

2007-12-23 Thread Gianluca Alberici

Hello,

I can do better. I have investigated a bit the problem:

1) The problem arises only with the userspace nfsd (Universal nfsd 2.2). 
I have realized that the latest patches introduced in 2.6.22 have 
changed a lot of things into NFS.


2) The following code has been debugged with sunrpc.rpc_debug and 
sunrpc.nfs_debug


#include sys/types.h
#include sys/stat.h
#include fcntl.h
#include errno.h
#include stdio.h

int main(int argc, char *argv[]) {
   int fp;
   if ((fp=open(/mnt/tmp/test,O_WRONLY | O_CREAT | O_TRUNC, S_IRWXU )) == -1) 
printf(ERR: %d\n,errno);
   else {
   write(fp, argv[1], strlen(argv[1]));
   printf(OK\n);
   close(fp);
   };
}


2b) The output

[...]

8NFS call  setattr
8RPC:   new task initialized, procpid 18656
8RPC:   allocated task f7bec500
8RPC: 0 looking up UNIX cred
8RPC:   740 __rpc_execute flags=0x480
8RPC:   740 call_start nfs2 proc 2 (sync)
8RPC:   740 call_reserve (status 0)
8RPC:   740 reserved req f1416000 xid 643f3c42
8RPC:   740 call_reserveresult (status 0)
8RPC:   740 call_allocate (status 0)
8RPC:   740 allocated buffer of size 528 at e627b800
8RPC:   740 call_bind (status 0)
8RPC:   740 call_connect xprt f70d4000 is connected
8RPC:   740 call_transmit (status 0)
8RPC:   740 xprt_prepare_transmit
8RPC:   740 xprt_cwnd_limited cong = 0 cwnd = 512
8RPC:   740 call_encode (status 0)
8RPC:   740 marshaling UNIX cred f4efcb40
8RPC:   740 using AUTH_UNIX cred f4efcb40 to wrap rpc data
8RPC:   740 xprt_transmit(148)
8RPC:   xs_udp_data_ready...
8RPC:   cong 256, cwnd was 512, now 512
8RPC:   740 xid 643f3c42 complete (28 bytes received)
8RPC:   xs_udp_send_request(148) = 148
8RPC:   740 xmit complete
8RPC:   wake_up_next(f70d4114 xprt_resend)
8RPC:   wake_up_next(f70d40e4 xprt_sending)
8RPC:   740 call_status (status 28)
8RPC:   740 call_decode (status 28)
8RPC:   740 validating UNIX cred f4efcb40
8RPC:   740 using AUTH_UNIX cred f4efcb40 to unwrap rpc data
8RPC:   740 call_decode result -22
8RPC:   740 return 0, status -22
8RPC:   740 release task
8RPC:   freeing buffer of size 528 at e627b800
8RPC:   740 release request f1416000
8RPC:   wake_up_next(f70d4174 xprt_backlog)
8RPC:   740 releasing UNIX cred f4efcb40
8RPC:   rpc_release_client(f6f1f880)
8NFS reply setattr: -22

[...]

3) It turns out the following:

- setattr returns EINVAL due to...
- RPC call_decode returns status 22

4) In conclusion in my understanding:

- At present linux nfs filesystem support is not anymore compatible with 
old userspace servers like universal nfsd and crypto filesystems as cfsd 
(which is an nfs server pretty old fashioned).
- This problem makes UNFSD and CFSD unusable on 2.6.22 and up (this 
doesnt sound good to me)


The question are:

- Is this wanted or is a bug ?
- Can this be solved with some backwards compat stuff into the kernel or 
all the old packages as UNFSD have to be

bugfixed/upgraded
- I have found other strange behaviors of the new NFS filesystem support 
that i am investigating. All are bound to the user of

old userspace servers onto the new NFS (since 2.6.22). What to do ?

Thanks

Gianluca




Scott wrote:



On Dec 22, 2007, at 5:52 AM, Gianluca Alberici wrote:


I've run into this problem 2.6.23.9. The open syscall will return
Invalid argument when O_TRUNC is set on existing files.

The same file can be opened for append or removed.

The evidence is for example:

mars:~# mount localhost:/opt/nfs/ /mnt/tmp
mars:~# echo Hello  /mnt/tmp/test-file
mars:~# echo Hello  /mnt/tmp/test-file
bash: /mnt/tmp/test-file: Invalid argument



Can you show an strace of this please? Also, might suggest rpcdebug - 
m nfsd -s all


--
Scott --
To unsubscribe from this list: send the line unsubscribe 
linux-kernel in

the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: NFS EINVAL on open(... | O_TRUNC) on 2.6.23.9

2007-12-22 Thread Scott


On Dec 22, 2007, at 5:52 AM, Gianluca Alberici wrote:

I've run into this problem 2.6.23.9. The open syscall will return
"Invalid argument" when O_TRUNC is set on existing files.

The same file can be opened for append or removed.

The evidence is for example:

mars:~# mount localhost:/opt/nfs/ /mnt/tmp
mars:~# echo "Hello" > /mnt/tmp/test-file
mars:~# echo "Hello" > /mnt/tmp/test-file
bash: /mnt/tmp/test-file: Invalid argument


Can you show an strace of this please? Also, might suggest "rpcdebug - 
m nfsd -s all"


--
Scott 
--

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: NFS EINVAL on open(... | O_TRUNC) on 2.6.23.9

2007-12-22 Thread Scott


On Dec 22, 2007, at 5:52 AM, Gianluca Alberici wrote:

I've run into this problem 2.6.23.9. The open syscall will return
Invalid argument when O_TRUNC is set on existing files.

The same file can be opened for append or removed.

The evidence is for example:

mars:~# mount localhost:/opt/nfs/ /mnt/tmp
mars:~# echo Hello  /mnt/tmp/test-file
mars:~# echo Hello  /mnt/tmp/test-file
bash: /mnt/tmp/test-file: Invalid argument


Can you show an strace of this please? Also, might suggest rpcdebug - 
m nfsd -s all


--
Scott 
--

To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/