Re: [zfs-discuss] CR6894234 -- improved sgid directory compatibility with non-Solaris NFS clients
On Fri, 6 Nov 2009, James Andrewartha wrote:

> How about attacking it the other way? Sign the SCA, get a sponsor and put
> the fix into OpenSolaris, then sustaining just have to backport it.
> http://hub.opensolaris.org/bin/view/Main/participate

Do you mean the samba bug or the NFS bug? For the samba bug, I've already submitted a patch to fix the problem. For the NFS bug, while I have in the past pursued such options with open-source software, considering Solaris 10 is a commercial product for which we're paying a fairly substantial amount for support, I'd really prefer they fix it themselves...

> Also, since you know it's a NFS server issue now, have you tried asking
> on nfs-discuss?

Yup:

http://opensolaris.org/jive/thread.jspa?messageID=430745

No responses...

--
Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst | hen...@csupomona.edu
California State Polytechnic University | Pomona CA 91768
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] CR6894234 -- improved sgid directory compatibility with non-Solaris NFS clients
On Nov 6, 2009, at 11:23 PM, "Paul B. Henson" wrote:

> NFSv3 gss:
>
> damien cfservd # mount -o sec=krb5p ike.unx.csupomona.edu:/export/user/henson /mnt
>
> hen...@damien /mnt/sgid_test $ ls -ld
> drwx--s--x+ 2 henson iit 2 Nov 6 20:14 .
>
> hen...@damien /mnt/sgid_test $ mkdir gss
> hen...@damien /mnt/sgid_test $ ls -l
> drwx--s--x+ 2 henson iit 2 Nov 6 20:14 gss
>
> NFSv3 sys:
>
> damien cfservd # mount -o sec=sys ike.unx.csupomona.edu:/export/user/henson /mnt
>
> hen...@damien /mnt/sgid_test $ ls -ld
> drwx--s--x+ 3 henson iit 3 Nov 6 20:14 .
>
> hen...@damien /mnt/sgid_test $ mkdir sys
> hen...@damien /mnt/sgid_test $ ls -l
> drwx--s--x+ 2 henson iit 2 Nov 6 20:16 sys
>
> NFSv3 both auth gss and auth sys respects the sgid bit. NFSv4 both auth gss
> and auth sys does not. Unless you're talking about a different problem,
> this sure looks like an NFSv3 vs NFSv4 problem to me.

IMHO, the v4 server is broken. It would appear Sun didn't even investigate it. It's a shame.

-Ross
Re: [zfs-discuss] CR6894234 -- improved sgid directory compatibility with non-Solaris NFS clients
On Thu, 5 Nov 2009, Miles Nordin wrote:

> allowing the first local patch into your site? or you are running a
> closed-source release where you have to roll over and beg for support?

We're running Solaris 10. It does seem like I spend an undue amount of time lately dealing with Sun support; I have another open issue (http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg30169.html) that's lasted over two months, involved multiple high-level support managers, and has probably already cost the company considerably more resources in thrashing than simply applying the patch I already provided to fix the bug would have.

As far as this NFS issue, when we initially reported the problem (which occurred with NFSv4), they claimed NFSv3 had the exact same behavior and since v4 worked like v3 it wasn't a bug. We didn't actually verify that at the time, but at a later point a Red Hat support engineer indicated it worked correctly for him on a Solaris 10 server under v3, and only v4 was broken. We set up a v3 test and verified that was the case, which we've now pointed out on the support ticket, and we are hoping it will actually get fixed now.

--
Paul B. Henson
Re: [zfs-discuss] CR6894234 -- improved sgid directory compatibility with non-Solaris NFS clients
> "pbh" == Paul B Henson writes:

pbh> I've got a cron job running every hour on the backend servers
pbh> crawling around and fixing permissions on new directories :(.

To my view, if there's a problem it's first with the build system, second with NFS.

You can fix Solaris to do what you want, right---it is like a 1-line change, and you've already found the line to change? Why can't you just fix all your systems in a local hg repository? Is that more painful than the cron job?

When I had a lot of NetBSD systems, I had plenty of local patches, and with cvs and build.sh it wasn't a nightmare at all. Is the Solaris build framework really so much less convenient, or is it just a barrier to allowing the first local patch into your site? or you are running a closed-source release where you have to roll over and beg for support?
Re: [zfs-discuss] CR6894234 -- improved sgid directory compatibility with non-Solaris NFS clients
On Tue, 3 Nov 2009, Ross Walker wrote:

> Maybe this isn't an interoperability fix, but a security fix as it allows
> non-Sun clients to bypass security restrictions placed on a sgid
> protected directory tree because it doesn't properly test the existence
> of that bit upon file creation.
>
> If an appropriate scenario can be made, and I'm sure it can, one might
> even post a CERT advisory to make sure operators are made aware of this
> potential security problem.

I agree it's a security issue; I think I mentioned that at some point in this thread. However, it doesn't allow a client to do something they couldn't do anyway. If the sgid bit was respected and the directory was created with the right group, the client could chgrp it to their primary group afterwards. The security issue isn't that an evil client will avail of this to end up with a directory owned by the wrong group, it's that a poor innocent client will end up with a directory owned by their primary group rather than the group of the parent directory, and any inherited group@ ACL will apply to the primary group, resulting in insecure and unintended access :(.

Another possible security issue that came up while I was discussing this issue with one of the Linux NFSv4 developers is that relying upon the client to set the ownership of the directory results in a race condition and is in their opinion buggy. In between the time the client generates the mkdir request and sends it over the wire and the server receives it, someone else might have changed the permissions or group ownership of the parent directory, resulting in the explicitly specified group provided by the client being wrong. They refuse to implement this buggy behavior, and to quote them, "You should get Sun to fix their server". I'm trying to do that, but no luck so far ...

--
Paul B. Henson
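The race described in the message above can be sketched as a tiny simulation. This is only an illustration of the timing argument, not NFS code: `ParentDir`, `sgid_group`, and the gids 2000 and 3000 are hypothetical; 1012 is the csupomona gid shown by `id` elsewhere in this thread.

```python
from dataclasses import dataclass

IIT_GID = 2000       # hypothetical gid for the parent's original group
OTHER_GID = 3000     # hypothetical gid the parent is changed to mid-flight
PRIMARY_GID = 1012   # csupomona, the user's primary group


@dataclass
class ParentDir:
    gid: int
    sgid: bool = True


def sgid_group(parent: ParentDir, primary_gid: int) -> int:
    """Group a new subdirectory should get under sgid semantics."""
    return parent.gid if parent.sgid else primary_gid


parent = ParentDir(gid=IIT_GID)

# 1. The client looks at the parent and builds its request with an explicit group.
client_choice = sgid_group(parent, PRIMARY_GID)

# 2. While the request is on the wire, someone chgrp's the parent directory.
parent.gid = OTHER_GID

# 3. A server that applies the rule itself sees the parent as it is now.
server_choice = sgid_group(parent, PRIMARY_GID)

print("client asked for", client_choice, "but parent is now", parent.gid)
print("server-side decision would be", server_choice)
```

The client's value is a snapshot that can go stale; a decision taken by the server at creation time cannot, which is the Linux developers' argument for doing it there.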
Re: [zfs-discuss] CR6894234 -- improved sgid directory compatibility with non-Solaris NFS clients
On Nov 2, 2009, at 2:38 PM, "Paul B. Henson" wrote:

> On Sat, 31 Oct 2009, Al Hopper wrote:
>
> > Kudos to you - nice technical analysis and presentation, Keep lobbying
> > your point of view - I think interoperability should win out if it comes
> > down to an arbitrary decision.
>
> Thanks; but so far that doesn't look promising. Right now I've got a cron
> job running every hour on the backend servers crawling around and fixing
> permissions on new directories :(.
>
> You would have thought something like this would have been noticed in one
> of the NFS interoperability bake offs.

Paul,

Maybe you're approaching this the wrong way.

Maybe this isn't an interoperability fix, but a security fix, as it allows non-Sun clients to bypass security restrictions placed on a sgid protected directory tree because it doesn't properly test the existence of that bit upon file creation.

If an appropriate scenario can be made, and I'm sure it can, one might even post a CERT advisory to make sure operators are made aware of this potential security problem.

-Ross
Re: [zfs-discuss] CR6894234 -- improved sgid directory compatibility with non-Solaris NFS clients
On Sat, 31 Oct 2009, Al Hopper wrote:

> Kudos to you - nice technical analysis and presentation, Keep lobbying
> your point of view - I think interoperability should win out if it comes
> down to an arbitrary decision.

Thanks; but so far that doesn't look promising. Right now I've got a cron job running every hour on the backend servers crawling around and fixing permissions on new directories :(.

You would have thought something like this would have been noticed in one of the NFS interoperability bake offs.

--
Paul B. Henson
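The actual cron job is not shown in the thread. As a rough idea of what such a fix-up pass might look like, here is a minimal sketch that only corrects group ownership (any ACL repair is left out); `needs_fix` and `fix_tree` are hypothetical names, and the pass defaults to a dry run.

```python
import os
import stat


def needs_fix(parent_mode: int, parent_gid: int, child_mode: int, child_gid: int) -> bool:
    """True if child is a directory under an sgid parent but has another group."""
    if not stat.S_ISDIR(child_mode):
        return False
    if not parent_mode & stat.S_ISGID:
        return False
    return child_gid != parent_gid


def fix_tree(root: str, dry_run: bool = True) -> list:
    """Walk root top-down and return the directories whose group is wrong.

    With dry_run=False the group is corrected as well. Because the walk is
    top-down, a corrected directory is re-read before its own children are
    compared against it.
    """
    mismatched = []
    for dirpath, dirnames, _filenames in os.walk(root):
        parent = os.lstat(dirpath)
        for name in dirnames:
            path = os.path.join(dirpath, name)
            child = os.lstat(path)  # lstat: symlinks are never treated as directories
            if needs_fix(parent.st_mode, parent.st_gid, child.st_mode, child.st_gid):
                mismatched.append(path)
                if not dry_run:
                    os.chown(path, -1, parent.st_gid)  # -1 leaves the owner unchanged
    return mismatched
```

Running `fix_tree("/export/user")` first, with that path as an assumed example, would list what a real run would change without touching anything.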
Re: [zfs-discuss] CR6894234 -- improved sgid directory compatibility with non-Solaris NFS clients
On Thu, 29 Oct 2009 casper@sun.com wrote:

> Do you have the complete NFS trace output? My reading of the source code
> says that the file will be created with the proper gid so I am actually
> believing that the client "over corrects" the attributes after creating
> the file/directory.

Just wondering if you had a chance to look at the packet capture I sent and the pointers to the Solaris source code that appear to be causing the problem that results in ignoring the sgid bits on directory creations over NFS.

The feedback I'm getting from sustaining on my support request is that they don't think it's broken and they're not inclined to fix it. Even if the spec doesn't explicitly define the behavior, respecting the sgid bit on directory creation still seems like the right thing to do. If you agree, perhaps you could use your considerable influence to try and improve interoperability ;)? Or perhaps put me in touch with someone in forward development, or someone in charge of attending NFS interoperability bakeoffs, that might be more interested in improvements?

Thanks...

--
Paul B. Henson
Re: [zfs-discuss] CR6894234 -- improved sgid directory compatibility with non-Solaris NFS clients
On Thu, Oct 29, 2009 at 8:52 PM, Paul B. Henson wrote:

> On Thu, 29 Oct 2009 casper@sun.com wrote:
>
> > Do you have the complete NFS trace output? My reading of the source code
> > says that the file will be created with the proper gid so I am actually
> > believing that the client "over corrects" the attributes after creating
> > the file/directory.
>
> I dug around the OpenSolaris source code and believe I found where this
> behavior is coming from.
>
> In
>
> http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/nfs/nfs4_srv.c
>
> on line 1643, there's a comment:
>
> "Set default initial values for attributes when not specified in
> createattrs."
>
... snip the good stuff ...

Hi Paul.

Kudos to you - nice technical analysis and presentation. Keep lobbying your point of view - I think interoperability should win out if it comes down to an arbitrary decision.

Regards,

--
Al Hopper Logical Approach Inc, Plano, TX a...@logical-approach.com
Voice: 972.379.2133 Timezone: US CDT
OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007
http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/
Re: [zfs-discuss] CR6894234 -- improved sgid directory compatibility with non-Solaris NFS clients
On Fri, 30 Oct 2009, Darren J Moffat wrote:

> Have you tried using different values for the per dataset aclinherit or
> aclmode properties ?

We have aclmode set to passthrough and aclinherit to passthrough-x (thanks again Mark!). We haven't tried anything else.

> I'm not sure they will help you much but I was curious if you had looked
> at this area for help.

If you saw the message I sent late yesterday, I found the code in the NFS server which explicitly sets the group owner if one is not specified by the client, so I don't think the filesystem level has much choice; it's being told explicitly which group the new directory should be owned by.

Thanks...

--
Paul B. Henson
Re: [zfs-discuss] CR6894234 -- improved sgid directory compatibility with non-Solaris NFS clients
Paul B. Henson wrote:

> I posted a little while back about a problem we are having where when a
> new directory gets created over NFS on a Solaris NFS server from a Linux
> NFS client, the new directory group ownership is that of the primary group
> of the process, even if the parent directory has the sgid bit set and is
> owned by a different group.

Have you tried using different values for the per dataset aclinherit or aclmode properties ?

aclinherit   YES   YES   discard | noallow | restricted | passthrough | passthrough-x
aclmode      YES   YES   discard | groupmask | passthrough

I'm not sure they will help you much but I was curious if you had looked at this area for help.

--
Darren J Moffat
Re: [zfs-discuss] CR6894234 -- improved sgid directory compatibility with non-Solaris NFS clients
On Thu, 29 Oct 2009 casper@sun.com wrote:

> Do you have the complete NFS trace output? My reading of the source code
> says that the file will be created with the proper gid so I am actually
> believing that the client "over corrects" the attributes after creating
> the file/directory.

I dug around the OpenSolaris source code and believe I found where this behavior is coming from.

In

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/nfs/nfs4_srv.c

on line 1643, there's a comment:

"Set default initial values for attributes when not specified in createattrs."

And if the uid/gid is not explicitly specified in the NFS CREATE operation, the code calls crgetuid and crgetgid to determine what uid/gid to use for the mkdir operation. crgetgid is just "return (cr->cr_gid);", which would result in the behavior we describe -- if there is no group owner explicitly specified, new subdirectories are always created based on the primary group of the user, disregarding the presence of any sgid bit on the parent directory.

As far as the client:

http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/nfs/nfs4_vnops.c

On line 6790 the client code explicitly checks whether or not the new directory is being created inside of a parent directory with a sgid bit set, and then explicitly includes the group owner if so.

I'm guessing you are probably looking at the actual underlying filesystem code? That probably does do the right thing if the gid is not specified. But given that the NFS server code explicitly uses the primary gid when no gid is specified by the client, by the time the request gets to the underlying file system the gid is already specified and any filesystem level sgid handling is bypassed.

I doubt if the resolution to the problem is as simple as not having the NFS server code explicitly specify a gid if none is given by the client, allowing the underlying filesystem to "do the right thing", but who knows :)...

I still think that the preferred behavior would be to respect the sgid bit semantics, and continue to hope I can convince the engineers in charge of this decision to agree.

Thanks...

--
Paul B. Henson
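The difference between the behavior described above and the behavior being asked for can be modeled in a few lines. This is a sketch of the decision only, not a translation of nfs4_srv.c: `Dir`, `Cred`, both function names, and gid 2000 for iit are hypothetical; gid 1012 for csupomona comes from the `id` output elsewhere in this thread.

```python
import stat
from dataclasses import dataclass
from typing import Optional


@dataclass
class Dir:
    gid: int
    mode: int


@dataclass
class Cred:
    uid: int
    gid: int  # primary group, i.e. what crgetgid() is said to return


def group_as_described(parent: Dir, cred: Cred, requested_gid: Optional[int]) -> int:
    """Behavior described in the message: default to the caller's primary gid."""
    if requested_gid is not None:
        return requested_gid
    return cred.gid


def group_respecting_sgid(parent: Dir, cred: Cred, requested_gid: Optional[int]) -> int:
    """Requested behavior: fall back to the parent's group when it is sgid."""
    if requested_gid is not None:
        return requested_gid
    if parent.mode & stat.S_ISGID:
        return parent.gid
    return cred.gid


IIT_GID = 2000  # hypothetical numeric gid for iit
parent = Dir(gid=IIT_GID, mode=stat.S_IFDIR | 0o2711)
henson = Cred(uid=1005, gid=1012)

# Linux-style request (no group given) versus Solaris-style request (group given).
print(group_as_described(parent, henson, None))
print(group_as_described(parent, henson, IIT_GID))
print(group_respecting_sgid(parent, henson, None))
```

Only the request that omits the group gives a different answer under the two rules, which matches the observation that Solaris clients are unaffected.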
Re: [zfs-discuss] CR6894234 -- improved sgid directory compatibility with non-Solaris NFS clients
On Thu, 29 Oct 2009 casper@sun.com wrote:

> Do you have the complete NFS trace output? My reading of the source code
> says that the file will be created with the proper gid so I am actually
> believing that the client "over corrects" the attributes after creating
> the file/directory.

Yes, we submitted that to support. It's SR#71757154, although I don't know if they've kept the ticket up-to-date. My understanding of the current status is that they have verified the behavior we describe, and given the ambiguity of the POSIX spec are not necessarily inclined to change it.

I've attached a small packet capture from creating a subdirectory on a Solaris 10U8 NFS server from both a Linux client and a Solaris client.

For the Linux client:

--
hen...@damien /mnt/sgid_test $ ls -ld .
drwx--s--x 2 henson iit 2 Oct 29 17:29 .

hen...@damien /mnt/sgid_test $ id
uid=1005(henson) gid=1012(csupomona)

hen...@damien /mnt/sgid_test $ mkdir linux
hen...@damien /mnt/sgid_test $ ls -l
total 2
drwx--s--x 2 henson csupomona 2 Oct 29 17:31 linux
--

The mkdir operation appears to consist of the compound call "PUTFH;SAVEFH;CREATE;GETFH;GETATTR;RESTOREFH;GETATTR"; the CREATE call specifies an attrmask of just FATTR4_MODE. The response to the GETATTR call shows the FATTR4_OWNER_GROUP to be csupomona.

For the Solaris client:

--
hen...@s10 /mnt/sgid_test $ ls -ld .
drwx--s--x+ 3 henson iit 3 Oct 29 17:31 .

hen...@s10 /mnt/sgid_test $ id
uid=1005(henson) gid=1012(csupomona)

hen...@s10 /mnt/sgid_test $ mkdir solaris
hen...@s10 /mnt/sgid_test $ ls -l
total 4
drwx--s--x+ 2 henson iit 2 Oct 29 17:33 solaris
--

The mkdir in this case consists of the compound call "PUTFH;CREATE;GETFH;GETATTR;SAVEFH;PUTFH;GETATTR;RESTOREFH;NVERIFY;SETATTR"; the CREATE call specifies an attrmask of both FATTR4_MODE *and* FATTR4_OWNER_GROUP with iit as the group. In the reply to GETATTR, FATTR4_OWNER_GROUP is iit.

We don't see any evidence that the Linux client explicitly changes the group ownership after the directory is made.

If I might inquire, which source code are you looking at? Is it available through the OpenSolaris online source browser? If so, could I trouble you for a link to it?

Thanks much for any help you might provide in clarifying this issue, and if our understanding of the behavior turns out to be accurate, any help in getting a change committed to better respect the sgid bit :)...

--
Paul B. Henson

[Attachments: linux_mkdir.pcap, solaris_mkdir.pcap]
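For anyone reading the captures by hand, the attrmask in a CREATE is an NFSv4 bitmap4: a list of 32-bit words in which attribute number n is bit n % 32 of word n // 32. The sketch below encodes and decodes the two masks described above. The attribute numbers 33 and 37 are taken from the NFSv4 specification (RFC 3530), not read out of the attached captures, and the helper names are hypothetical.

```python
FATTR4_MODE = 33          # attribute number per RFC 3530
FATTR4_OWNER_GROUP = 37   # attribute number per RFC 3530


def encode_bitmap4(attr_numbers):
    """Return the list of 32-bit words with the given attribute bits set."""
    words = []
    for n in attr_numbers:
        index, bit = divmod(n, 32)
        while len(words) <= index:
            words.append(0)
        words[index] |= 1 << bit
    return words


def decode_bitmap4(words):
    """Return the sorted attribute numbers whose bits are set."""
    return [32 * i + bit
            for i, word in enumerate(words)
            for bit in range(32)
            if word & (1 << bit)]


def names_group(words):
    """True if the mask asks the server to set an explicit group owner."""
    return FATTR4_OWNER_GROUP in decode_bitmap4(words)


linux_create = encode_bitmap4([FATTR4_MODE])
solaris_create = encode_bitmap4([FATTR4_MODE, FATTR4_OWNER_GROUP])

print([hex(w) for w in linux_create], names_group(linux_create))
print([hex(w) for w in solaris_create], names_group(solaris_create))
```

The only difference between the two requests is the single FATTR4_OWNER_GROUP bit in the second word, which is what decides whether the server has to pick a group on its own.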
Re: [zfs-discuss] CR6894234 -- improved sgid directory compatibility with non-Solaris NFS clients
>I posted a little while back about a problem we are having where when a
>new directory gets created over NFS on a Solaris NFS server from a Linux
>NFS client, the new directory group ownership is that of the primary group
>of the process, even if the parent directory has the sgid bit set and is
>owned by a different group.
>
>Basically, a Solaris client in such an instance explicitly requests that
>the new directory be owned by the group of the parent directory, and the
>server follows that request. A Linux NFS client, on the other hand, does
>not explicitly request any particular group ownership for the new
>directory, leaving the server to decide that on its own, which in the case
>of the Solaris server, is not the "right" group.

Do you have the complete NFS trace output? My reading of the source code says that the file will be created with the proper gid so I am actually believing that the client "over corrects" the attributes after creating the file/directory.

Casper
[zfs-discuss] CR6894234 -- improved sgid directory compatibility with non-Solaris NFS clients
I posted a little while back about a problem we are having where when a new directory gets created over NFS on a Solaris NFS server from a Linux NFS client, the new directory group ownership is that of the primary group of the process, even if the parent directory has the sgid bit set and is owned by a different group.

Basically, a Solaris client in such an instance explicitly requests that the new directory be owned by the group of the parent directory, and the server follows that request. A Linux NFS client, on the other hand, does not explicitly request any particular group ownership for the new directory, leaving the server to decide that on its own, which in the case of the Solaris server, is not the "right" group.

The POSIX spec on this is somewhat ambiguous, so you can't really say the Solaris implementation is "broken", but while perhaps following the letter of the spec, I don't think it's following the spirit of the sgid bit on directories.

I have a CR, #6894234, which is currently being reviewed through Sun support. It seems their current inclination is to not change the behavior. Again, while not technically broken, I would argue this behavior is undesirable. The semantics of the sgid bit on directories are that new subdirectories should be owned by the group of the parent directory. That's what happens under Solaris for local file system access. That's what happens under Solaris if a directory is made via NFS from a Solaris NFS client. It's not what happens when a new directory is created via NFS from a Linux NFS client, or any other NFS client that does not explicitly request the group ownership when creating a directory.

While POSIX does not explicitly specify what a server should do when creating a new directory and the client does not specify the group ownership, in the case where the new directory resides in an existing directory with the sgid bit set, following standard sgid bit directory group ownership semantics seems the most appropriate thing to do.

If any Sun engineers with an interest in improved interoperability and keeping true to the spirit of the sgid bit could take a look at this CR and weigh in on its final resolution, that would be greatly appreciated.

--
Paul B. Henson