I have encountered this issue before as well. Something on the system creates a new session keyring for the root user, and keyctl_read fails after that happens. For now, my workaround has been to reload the key into the keyring. On the client you could mount with the skpath option so that the key is reloaded every time the file system is mounted. That still leaves a problem, though: when the session context expires and the keys are re-established, keyctl_read will fail again if a new keyring has been created in the meantime. I'm not sure when I'll have time to put together a fix for this, but let me know if mounting with the skpath option works.
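As a rough way to spot when the key needs reloading, here is a minimal sketch that looks for the Lustre key in `keyctl show` output. It assumes the output looks like the listings quoted further down this thread; `find_key` and `SAMPLE` are hypothetical names made up for illustration, not part of Lustre or keyutils.

```python
import re

# One row of `keyctl show` output, for example:
#   146272009 --alswrv      0     0       \_ user: lustre:scratch
# Columns: serial, permissions, uid, gid, optional tree marker, "type: description".
ROW = re.compile(r"^\s*(\d+)\s+\S+\s+\d+\s+\d+\s+(?:\\_\s+)?(\w+):\s+(\S+)\s*$")


def find_key(keyctl_output, description):
    """Return the serial of the 'user' key with this description, or None."""
    for line in keyctl_output.splitlines():
        m = ROW.match(line)
        if m and m.group(2) == "user" and m.group(3) == description:
            return int(m.group(1))
    return None


# Sample text copied from the client listing quoted in this thread.
SAMPLE = """\
Session Keyring
 269152212 --alswrv      0     0  keyring: _ses
1059491764 --alswrv      0 65534   \\_ keyring: _uid.0
 146272009 --alswrv      0     0       \\_ user: lustre:scratch
"""

if __name__ == "__main__":
    serial = find_key(SAMPLE, "lustre:scratch")
    if serial is None:
        print("key not visible in this session; reload it with lgss_sk -l <keyfile>")
    else:
        print("key present, serial", serial)
```

On a real node the text would come from running `keyctl show` as root instead of `SAMPLE`. Note that this only shows what the calling session can see, which may well differ from what lsvcgssd sees once a new session keyring has been created.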
Jeremy

On Sun, Jun 24, 2018 at 4:41 PM, Mark Roper <[email protected]> wrote:

> Hi Jeremy,
>
> Thanks for taking a look at my question. I have validated that the key on
> the server and the client match and that the client key has the prime
> generated.
>
> When I ssh to the client node and run
>
> sudo mount -t lustre -o skpath=/secure_directory/scratch.client.key 172.31.46.245@tcp:/scratch /scratch
>
> I see the following output in /var/log/messages on the MDS node, with
> verbosity turned up to trace:
>
> Jun 24 20:26:41 ip-172-31-44-121 lsvcgssd[23975]: keyctl_read() failed for key 27091278: Permission denied
> Jun 24 20:26:41 ip-172-31-44-121 lsvcgssd[23975]: Failed to create sk credentials
>
> As I mentioned, if I remove the option I'm able to mount the FS. I'm using
> Lustre 2.11 server and clients. The server kernel is
> 3.10.0-693.21.1.el7_lustre.x86_64 and the client kernel is
> 3.10.0-693.21.1.el7.x86_64.
>
> I am wondering if this has something to do with Linux keyring permissions
> on CentOS. When I ssh to my server and client nodes as the user `centos`
> and run `sudo lgss_sk -l /secure_directory/scratch.<server | client>.key`
> followed by `keyctl show`, the lustre user key does not appear in the list
> of keys. If I ssh to the client & server nodes as root and run the same
> two commands, the lustre key shows up on the server as:
>
> 772711346 --alswrv 0 0 keyring: _ses
> 1047091535 --alswrv 0 65534 \_ keyring: _uid.0
> 27091278 --alswrv 0 0 \_ user: lustre:scratch:default
>
> ... and on the client as:
>
> Session Keyring
> 269152212 --alswrv 0 0 keyring: _ses
> 1059491764 --alswrv 0 65534 \_ keyring: _uid.0
> 146272009 --alswrv 0 0 \_ user: lustre:scratch
>
> I'm going to try setting up a 2.10.3 server and client to see if this is
> some kind of regression in 2.11 and not just me fat-fingering something.
> I'm also going to dive deeper into keyring permissions and see if I can
> find anything there. I'll update this thread for those interested if I
> figure it out.
>
> Any additional thoughts would be appreciated!
>
> Cheers,
>
> Mark
>
>
> On Sun, Jun 24, 2018 at 4:02 PM Jeremy Filizetti <[email protected]> wrote:
>
>> GSS error 0x60000 is GSS bad signature, which would mean the HMAC was
>> invalid. Can you verify your key files have the same shared key? Do you
>> have any logs for the server side as well? You can increase server
>> verbosity by adding some extra v's to LSVCGSSDARGS in
>> /etc/sysconfig/lsvcgss.
>>
>> Jeremy
>>
>> On Fri, Jun 22, 2018 at 3:41 PM, Mark Roper <[email protected]> wrote:
>>
>>> Hi Lustre Admins,
>>>
>>> I am hoping someone can help me understand what I'm doing wrong with SSK
>>> setup. I have set up a Lustre 2.11 server and worked through the steps to
>>> use shared secret keys (SSKs) to encrypt data in transit between client
>>> nodes and the MDT and OSS. I followed the manual instructions here:
>>> http://doc.lustre.org/lustre_manual.xhtml#idm140687075065344
>>>
>>> Before enabling the encryption settings on the MDT, I can mount the FS
>>> on the client node. After I turn on the encryption I get back an
>>> encryption refused error and cannot mount:
>>>
>>> mount.lustre: mount 172.31.46.245@tcp:/scratch at /scratch failed: Connection refused
>>>
>>> The keys are definitely distributed to client nodes and server nodes and
>>> the settings have all been made as instructed in the manual (I did this a
>>> few times from scratch to make sure). I can manually load the keys into
>>> the keyring and see them by running `keyctl show`. I can compare the key
>>> files on client and server nodes with the command `lgss_sk --read
>>> /secure_directory/scratch.client.key` and validate that they all match
>>> and that the client has a prime.
>>>
>>> The commands I'm using to enable the encryption are:
>>>
>>> mdt# sudo lctl conf_param scratch.srpc.flavor.tcp.cli2mdt=skpi
>>> mdt# sudo lctl conf_param scratch.srpc.flavor.tcp.cli2ost=skpi
>>>
>>> I tried tailing /var/log/messages and am not able to interpret the
>>> output, I'm wondering - does anyone have a hypothesis about what might be
>>> wrong or instructions to debug?
>>>
>>> Log output is below! Many thanks to anyone who can help!
>>>
>>> Mark
>>>
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:main(): start parsing parameters
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:INFO:main(): key 428863463, desc 0@26, ugid 0:0, sring 46159405, coinfo 38:sk:0:0:m:p:2:0x20000ac1f2109:scratch-OST1cd0-osc-MDT0000:0x20000ac1f2ef5:1
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:parse_callout_info(): components: 38,sk,0,0,m,p,2,0x20000ac1f2109,scratch-OST1cd0-osc-MDT0000,0x20000ac1f2ef5,1
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:DEBUG:parse_callout_info(): parse call out info: secid 38, mech sk, ugid 0:0, is_root 0, is_mdt 1, is_ost 0, svc type p, svc 2, nid 0x20000ac1f2109, tgt scratch-OST1cd0-osc-MDT0000, self nid 0x20000ac1f2ef5, pid 1
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:main(): parsing parameters OK
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:lgss_mech_initialize(): initialize mech sk
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:lgss_create_cred(): create a sk cred at 0x1ecc2e0
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:main(): caller's namespace is the same
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:lgss_prepare_cred(): preparing sk cred 0x1ecc2e0
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:INFO:sk_create_cred(): Creating credentials for target: scratch-OST1cd0-osc-MDT0000 with nodemap: (null)
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:INFO:sk_create_cred(): Searching for key with description: lustre:scratch
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:prepare_and_instantiate(): instantiated kernel key 198fefe7
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22250]:TRACE:main(): forked child 22251
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:TRACE:lgssc_kr_negotiate(): child start on behalf of key 198fefe7: cred 0x1ecc2e0, uid 0, svc 2, nid 20000ac1f2109, uids: 0:0/0:0
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:INFO:ipv4_nid2hostname(): SOCKLND: net 0x20000, addr 0x9211fac => ip-172-31-33-9.us-west-2.compute.internal
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:DEBUG:lgss_get_service_str(): constructed service string: [email protected] west-2.compute.internal
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:TRACE:lgss_using_cred(): using sk cred 0x1ecc2e0
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:main(): start parsing parameters
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:INFO:main(): key 189483693, desc 0@25, ugid 0:0, sring 46159405, coinfo 37:sk:0:0:m:p:2:0x20000ac1f2687:scratch-OST2b9d-osc-MDT0000:0x20000ac1f2ef5:1
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:parse_callout_info(): components: 37,sk,0,0,m,p,2,0x20000ac1f2687,scratch-OST2b9d-osc-MDT0000,0x20000ac1f2ef5,1
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:DEBUG:parse_callout_info(): parse call out info: secid 37, mech sk, ugid 0:0, is_root 0, is_mdt 1, is_ost 0, svc type p, svc 2, nid 0x20000ac1f2687, tgt scratch-OST2b9d-osc-MDT0000, self nid 0x20000ac1f2ef5, pid 1
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:main(): parsing parameters OK
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:lgss_mech_initialize(): initialize mech sk
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:lgss_create_cred(): create a sk cred at 0x21b02e0
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:main(): caller's namespace is the same
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:lgss_prepare_cred(): preparing sk cred 0x21b02e0
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:INFO:sk_create_cred(): Creating credentials for target: scratch-OST2b9d-osc-MDT0000 with nodemap: (null)
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:INFO:sk_create_cred(): Searching for key with description: lustre:scratch
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:prepare_and_instantiate(): instantiated kernel key 0b4b4aad
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22253]:TRACE:main(): forked child 22254
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:TRACE:lgssc_kr_negotiate(): child start on behalf of key 0b4b4aad: cred 0x21b02e0, uid 0, svc 2, nid 20000ac1f2687, uids: 0:0/0:0
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:INFO:ipv4_nid2hostname(): SOCKLND: net 0x20000, addr 0x87261fac => ip-172-31-38-135.us-west-2.compute.internal
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:DEBUG:lgss_get_service_str(): constructed service string: lustre_oss@ip-172-31-38-135.us-west-2.compute.internal
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:TRACE:lgss_using_cred(): using sk cred 0x21b02e0
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:INFO:sk_encode_netstring(): Encoded netstring of 647 bytes
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:INFO:lgss_sk_using_cred(): Created netstring of 647 bytes
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:TRACE:lgssc_negotiation_manual(): starting gss negotation
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:TRACE:do_nego_rpc(): start negotiation rpc
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:TRACE:gss_do_ioctl(): to open /proc/fs/lustre/sptlrpc/gss/init_channel
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:TRACE:gss_do_ioctl(): to down-write
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:INFO:sk_encode_netstring(): Encoded netstring of 647 bytes
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:INFO:lgss_sk_using_cred(): Created netstring of 647 bytes
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:TRACE:lgssc_negotiation_manual(): starting gss negotation
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:TRACE:do_nego_rpc(): start negotiation rpc
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:TRACE:gss_do_ioctl(): to open /proc/fs/lustre/sptlrpc/gss/init_channel
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:TRACE:gss_do_ioctl(): to down-write
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:TRACE:do_nego_rpc(): do_nego_rpc: to parse reply
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:DEBUG:do_nego_rpc(): do_nego_rpc: receive handle len 0, token len 0, res 0
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:ERROR:lgssc_negotiation_manual(): negotiation gss error 60000
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:ERROR:lgssc_kr_negotiate_manual(): key 198fefe7: failed to negotiate
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:TRACE:error_kernel_key(): revoking kernel key 198fefe7
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:INFO:error_kernel_key(): key 198fefe7: revoked
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22251]:TRACE:lgss_release_cred(): releasing sk cred 0x1ecc2e0
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:TRACE:do_nego_rpc(): do_nego_rpc: to parse reply
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:DEBUG:do_nego_rpc(): do_nego_rpc: receive handle len 0, token len 0, res 0
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:ERROR:lgssc_negotiation_manual(): negotiation gss error 60000
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:ERROR:lgssc_kr_negotiate_manual(): key 0b4b4aad: failed to negotiate
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:TRACE:error_kernel_key(): revoking kernel key 0b4b4aad
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:INFO:error_kernel_key(): key 0b4b4aad: revoked
>>> Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:TRACE:lgss_release_cred(): releasing sk cred 0x21b02e0
>>>
>>> _______________________________________________
>>> lustre-discuss mailing list
>>> [email protected]
>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>>
>>>
>>
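For anyone else reading the `negotiation gss error 60000` entries in the quoted log, here is a small sketch of how that status breaks down. It follows the GSS-API major status layout from RFC 2744 (calling error in bits 24-31, routine error in bits 16-23, supplementary info in bits 0-15) and assumes the log prints the value in hex without a `0x` prefix, which is how it was read earlier in the thread. `decode_major` is a hypothetical helper name, not something from the Lustre tools.

```python
# Routine error codes from the GSS-API C bindings (RFC 2744).
ROUTINE_ERRORS = {
    1: "GSS_S_BAD_MECH",
    2: "GSS_S_BAD_NAME",
    3: "GSS_S_BAD_NAMETYPE",
    4: "GSS_S_BAD_BINDINGS",
    5: "GSS_S_BAD_STATUS",
    6: "GSS_S_BAD_SIG",  # also known as GSS_S_BAD_MIC
    7: "GSS_S_NO_CRED",
    8: "GSS_S_NO_CONTEXT",
    9: "GSS_S_DEFECTIVE_TOKEN",
    10: "GSS_S_DEFECTIVE_CREDENTIAL",
    11: "GSS_S_CREDENTIALS_EXPIRED",
    12: "GSS_S_CONTEXT_EXPIRED",
    13: "GSS_S_FAILURE",
    14: "GSS_S_BAD_QOP",
    15: "GSS_S_UNAUTHORIZED",
    16: "GSS_S_UNAVAILABLE",
    17: "GSS_S_DUPLICATE_ELEMENT",
    18: "GSS_S_NAME_NOT_MN",
}


def decode_major(status):
    """Split a GSS major status into (calling error, routine error name, supplementary bits)."""
    calling = (status >> 24) & 0xFF
    routine = (status >> 16) & 0xFF
    supplementary = status & 0xFFFF
    name = ROUTINE_ERRORS.get(routine, "unknown") if routine else "GSS_S_COMPLETE"
    return calling, name, supplementary


if __name__ == "__main__":
    # "60000" is the string printed by lgssc_negotiation_manual() in the log.
    print(decode_major(int("60000", 16)))
```

The routine error field comes out as 6, i.e. a bad signature/MIC, which matches the "GSS bad signature" reading given in the thread.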
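The NIDs in the quoted log (for example `nid 0x20000ac1f2109`) can be turned back into IP addresses, which helps when matching log entries to nodes. The sketch below assumes the layout the log itself suggests: the upper 32 bits are the network (LND type in its upper 16 bits, network number in its lower 16), and the lower 32 bits are the IPv4 address. `decode_nid` is a hypothetical helper written for illustration only.

```python
import ipaddress


def decode_nid(nid):
    """Split a 64-bit NID into LND type, network number and dotted-quad IPv4 address."""
    net = nid >> 32            # e.g. 0x20000 in the log
    addr = nid & 0xFFFFFFFF    # IPv4 address in host order
    return {
        "lnd_type": net >> 16,  # 2 is reported as SOCKLND (tcp) in the log
        "net_num": net & 0xFFFF,
        "ip": str(ipaddress.IPv4Address(addr)),
    }


if __name__ == "__main__":
    # The two target NIDs and the self NID from the log.
    for nid in (0x20000AC1F2109, 0x20000AC1F2687, 0x20000AC1F2EF5):
        print(hex(nid), decode_nid(nid))
```

The three values decode to 172.31.33.9, 172.31.38.135 and 172.31.46.245, which agree with the hostnames that ipv4_nid2hostname() resolves and with the host writing the log.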
