Sailesh Mukil has posted comments on this change. Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed ......................................................................
Patch Set 5: (6 comments) @Todd: Thanks, I got the ASAN build working. Aside from the code comments, I had a very silly bug in an intermediate patch that I got rid of. http://gerrit.cloudera.org:8080/#/c/5820/4/src/kudu/integration-tests/external_mini_cluster.cc File src/kudu/integration-tests/external_mini_cluster.cc: Line 35: #include "kudu/server/server_base.pb.h" > error: 'kudu/master/master_rpc.h' file not found [clang-diagnostic-error] Done http://gerrit.cloudera.org:8080/#/c/5820/4/src/kudu/security/init.cc File src/kudu/security/init.cc: Line 81: krb5_context krb5_ctx_; > warning: 'kinit_ctx' is a static definition in anonymous namespace; static Done PS4, Line 125: auto cleanup_cursor = MakeScopedCleanup([&]() { : krb5_cc_end_seq_get(krb5_ctx_, *ccache_, &cursor); }); : : > the kerberos code uses a 'data_eq' helper here which also uses the length f I ported in the data_eq functions (they're very small), and that did away with the heap-buffer overflow. There still is a leak though that I'm unable to track down. Line 134: auto cleanup_creds = MakeScopedCleanup([&]() { > we may want some jitter on this. As is, it seems like if you started up 100 You're right, but adding a jitter here wouldn't help, as this is just a deadline and if all 100 servers try to renew at the same point in time before this deadline, they would still flood the servers. I've added the jitter while setting the sleep_interval_ in Kinit(), so that each server, with a high probability, wakes up at a different time than all or most other servers. PS4, Line 144: continue; : } : > this probably needs the same OSX workaround performed below in the initial Done PS4, Line 157: // If the ticket has already expired or if there's only a short period before which the : // renew window closes, we acquire a new ticket. : if (ticket_expiry < now || renew_deadline < now) { : // Acquire a new ticket using the keytab. This ticket will automatically be put into the : // credential cache. : KRB5_RETURN_NOT_OK_PREPEND(krb5_get_init_creds_keytab(krb5_ctx_, &new_creds, principal_, : > does this leave open a race where a connection might be attempted in betwee Good catch. To verify, I manually added a sleep between both the function calls and authentication of the peer failed. I changed it so that we allocate a new ccache and store the credentials there and swap out the old cache for the new one, before closing and deleting the old ccache. Odd how most other implementations of this don't take care of this case. -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil <sail...@cloudera.com> Gerrit-Reviewer: Adar Dembo <a...@cloudera.com> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil <sail...@cloudera.com> Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon <t...@apache.org> Gerrit-HasComments: Yes