[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Todd Lipcon has posted comments on this change. Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. Patch Set 10: lgtm, i'll put this on a cluster and see how it goes. -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 10 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Todd Lipcon has submitted this change and it was merged. Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. KUDU-1845: Kerberos client keytab should be periodically renewed With the current kerberos support, if a ticket expires, all communication on a secure cluster would stop as we do not attempt to renew the ticket. This patch adds a renewal thread which periodically wakes up and either renews the Ticket Granting Ticket or acquires a new one if the renewal window is closed. The renew interval is controlled by a newly introduced flag called 'kerberos_reinit_interval'. A reader-writer lock is used to avoid a race at the time of ticket reinitializtaion, which can occur between the time the credential cache is reinitialized and the time the new credentials are placed in the cache. The reader lock is taken before any call to the SASL library (if kerberos is enabled) and the writer lock is taken before reinitializing the cache and placing the new creds in it. Testing: Added 2 tests, one for ticket renewal and one for ticket reacquisition. I haven't added tests to confirm if disabling the renew thread makes these tests fail as that would require quite some plumbing which I think is unnecessary. Instead, I manually tested that disabling the renew thread causes the tests to fail. Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Reviewed-on: http://gerrit.cloudera.org:8080/5820 Tested-by: Kudu Jenkins Reviewed-by: Todd Lipcon--- M src/kudu/integration-tests/external_mini_cluster-test.cc M src/kudu/integration-tests/external_mini_cluster.cc M src/kudu/integration-tests/external_mini_cluster.h M src/kudu/rpc/sasl_common.cc M src/kudu/security/init.cc M src/kudu/security/init.h M src/kudu/security/test/mini_kdc.cc M src/kudu/security/test/mini_kdc.h 8 files changed, 318 insertions(+), 43 deletions(-) Approvals: Todd Lipcon: Looks good to me, approved Kudu Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 11 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Todd Lipcon has posted comments on this change. Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. Patch Set 10: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 10 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Sailesh Mukil has posted comments on this change. Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. Patch Set 10: (11 comments) http://gerrit.cloudera.org:8080/#/c/5820/9/src/kudu/security/init.cc File src/kudu/security/init.cc: PS9, Line 69: : // Acquires a new Ticket Granting Ticket (TGT). > is this still accurate? If this method isn't renewing a ticket, but rather Whoops sorry, I forgot to revert the comment. It does renewal again now. Line 83: // Returns a value based on 'time_remaining' that increases exponentially with > please rename to include units (sleep_interval_secs()) or have it return a Renamed to sleep_interval_secs() PS9, Line 95: > nit: name g_kinit_ctx Done Line 100: > I think non-POD global data is a no-no. Better to use a pointer and explici Done PS9, Line 120: entation > I don't think the d.length check is necessary, since memcmp with 0 length i Done Line 176: > did we determine that the underlying get_init_creds_keytab held the appropr It doesn't use internal locks, but it looks like it retries once if it doesn't go through the first time but that won't be enough, so I'll wrap the lock around it. PS9, Line 197: // This follows the same format as is_local_tgt() from krb5:src/clients/klist/klist.c : if (creds.server->length != 2 || : data_eq(creds.server->data[1], principal_->realm) == 0 || > isn't this already set when we initialized opts_? Oh yea, this is redundant. Removed it. PS9, Line 203: } : : time_t now = time(nullptr); : time_t ticket_expiry = creds.times.endtime; : time_t renew_till = creds.times.renew_till; : time_t renew_deadline = renew_till - 30; : : krb5_creds new_creds; > wrap this in a block so that the log message below isn't holding the lock Done PS9, Line 252: KRB5_RETURN_NOT_OK_PREPEND(krb5_cc_store_cred(krb5_ctx_, ccache_, _creds), :"Failed to store credentials in ccache"); : } : LOG(INFO) << "Successfully renewed kerberos TGT"; : } : ticket_end_timestamp_ = new_creds.times.endtime; : break; : } > I'm worried that, since we're just setting the interval to wae up just befo Added a backoff mechanism on failures, and also this is recalculated after every ticket renewal. http://gerrit.cloudera.org:8080/#/c/5820/9/src/kudu/security/init.h File src/kudu/security/init.h: Line 20: > can just use a forward decl Done Line 29: Status InitKerberosForServer(); > I think it's worth documenting this further in the header. Something like: Done -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 10 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Todd Lipcon has posted comments on this change. Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. Patch Set 9: (11 comments) http://gerrit.cloudera.org:8080/#/c/5820/9/src/kudu/security/init.cc File src/kudu/security/init.cc: PS9, Line 69: // Does not do actual ticket renewal since there is a fundamental race in the way ticket : // renewal is done. Even from krb5:src/clients/kinit/kinit.c is this still accurate? If this method isn't renewing a ticket, but rather reacquiring, I think 'DoReacquire()' is probably a better name. Line 83: int32_t sleep_interval() { return sleep_interval_; } please rename to include units (sleep_interval_secs()) or have it return a MonoDelta so that units aren't confused PS9, Line 95: kinit_ctx nit: name g_kinit_ctx Line 100: RWMutex kerberos_reinit_lock(RWMutex::Priority::PREFER_WRITING); I think non-POD global data is a no-no. Better to use a pointer and explicitly leak it. Otherwise we can hit weird destruction-order related crashes PS9, Line 120: d.length I don't think the d.length check is necessary, since memcmp with 0 length is guaranteed to return 0, according to the man page. (same above) Line 176: KRB5_RETURN_NOT_OK_PREPEND(krb5_get_init_creds_keytab(krb5_ctx_, _creds, principal_, did we determine that the underlying get_init_creds_keytab held the appropriate internal locks such that it isn't necessary to wrap this in the writelock? I thought that even the krb5 library itself didn't prevent against this race. PS9, Line 197: KRB5_RETURN_NOT_OK_PREPEND(krb5_get_init_creds_opt_set_out_ccache(krb5_ctx_, opts_, : ccache_), : "unable to set init_creds options"); isn't this already set when we initialized opts_? PS9, Line 203: std::lock_guard l(kerberos_reinit_lock); : // Clear existing credentials in cache. : KRB5_RETURN_NOT_OK_PREPEND(krb5_cc_initialize(krb5_ctx_, ccache_, principal_), : "Failed to re-initialize ccache"); : : // Store the new credentials in the cache. : KRB5_RETURN_NOT_OK_PREPEND(krb5_cc_store_cred(krb5_ctx_, ccache_, _creds), : "Failed to store credentials in ccache"); wrap this in a block so that the log message below isn't holding the lock PS9, Line 252: // If the interval between now and ticket expiry is: : // * > 10 minutes: We attempt to renew the ticket between 5 seconds and 5 minutes before the : // ticket expires. : // * 5 - 10 minutes: We attempt to renew the ticket betwen 5 seconds and 1 minute before the : // ticket expires. : // * < 5 minutes:Attempt to renew the ticket every 'interval'. : // The jitter is added to make sure that every server doesn't flood the KDC at the same time. : { I'm worried that, since we're just setting the interval to wae up just before expiration, that if there's any slight blip with the KDC (eg it's being restarted), we'll then go to sleep for an entire 2nd interval. I think we either need to reset the interval after each renewal attempt (regardless of failure) so that you get a very short interval just after a failure, or need to make it so that on failure it does some kind of backoff. http://gerrit.cloudera.org:8080/#/c/5820/9/src/kudu/security/init.h File src/kudu/security/init.h: Line 20: #include "kudu/util/rw_mutex.h" can just use a forward decl Line 29: // Returns the process lock 'kerberos_reinit_lock' I think it's worth documenting this further in the header. Something like: This lock is acquired in write mode while the ticket is being renewed, and acquired in read mode before using the SASL library which might require a ticket. -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 9 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5820 to look at the new patch set (#9). Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. KUDU-1845: Kerberos client keytab should be periodically renewed With the current kerberos support, if a ticket expires, all communication on a secure cluster would stop as we do not attempt to renew the ticket. This patch adds a renewal thread which periodically wakes up and either renews the Ticket Granting Ticket or acquires a new one if the renewal window is closed. The renew interval is controlled by a newly introduced flag called 'kerberos_reinit_interval'. A reader-writer lock is used to avoid a race at the time of ticket reinitializtaion, which can occur between the time the credential cache is reinitialized and the time the new credentials are placed in the cache. The reader lock is taken before any call to the SASL library (if kerberos is enabled) and the writer lock is taken before reinitializing the cache and placing the new creds in it. Testing: Added 2 tests, one for ticket renewal and one for ticket reacquisition. I haven't added tests to confirm if disabling the renew thread makes these tests fail as that would require quite some plumbing which I think is unnecessary. Instead, I manually tested that disabling the renew thread causes the tests to fail. Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d --- M src/kudu/integration-tests/external_mini_cluster-test.cc M src/kudu/integration-tests/external_mini_cluster.cc M src/kudu/integration-tests/external_mini_cluster.h M src/kudu/rpc/sasl_common.cc M src/kudu/security/init.cc M src/kudu/security/init.h M src/kudu/security/test/mini_kdc.cc M src/kudu/security/test/mini_kdc.h 8 files changed, 289 insertions(+), 43 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/20/5820/9 -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 9 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Sailesh Mukil has posted comments on this change. Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. Patch Set 8: (2 comments) > Build Failed > > http://104.196.14.100/job/kudu-gerrit/6321/ : FAILURE Will address these asap. http://gerrit.cloudera.org:8080/#/c/5820/7/src/kudu/security/init.cc File src/kudu/security/init.cc: Line 108: return (d1.length == d2.length && (d1.length == 0 || > warning: 'data_eq' is a static definition in anonymous namespace; static is Done Line 115: !memcmp(d.data, s, d.length))); > warning: 'data_eq_string' is a static definition in anonymous namespace; st Done -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 8 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5820 to look at the new patch set (#8). Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. KUDU-1845: Kerberos client keytab should be periodically renewed With the current kerberos support, if a ticket expires, all communication on a secure cluster would stop as we do not attempt to renew the ticket. This patch adds a renewal thread which periodically wakes up and either renews the Ticket Granting Ticket or acquires a new one if the renewal window is closed. The renew interval is controlled by a newly introduced flag called 'kerberos_reinit_interval'. A reader-writer lock is used to avoid a race at the time of ticket reinitializtaion, which can occur between the time the credential cache is reinitialized and the time the new credentials are placed in the cache. The reader lock is taken before any call to the SASL library (if kerberos is enabled) and the writer lock is taken before reinitializing the cache and placing the new creds in it. Testing: Added 2 tests, one for ticket renewal and one for ticket reacquisition. I haven't added tests to confirm if disabling the renew thread makes these tests fail as that would require quite some plumbing which I think is unnecessary. Instead, I manually tested that disabling the renew thread causes the tests to fail. Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d --- M src/kudu/integration-tests/external_mini_cluster-test.cc M src/kudu/integration-tests/external_mini_cluster.cc M src/kudu/integration-tests/external_mini_cluster.h M src/kudu/rpc/sasl_common.cc M src/kudu/security/init.cc M src/kudu/security/init.h M src/kudu/security/test/mini_kdc.cc M src/kudu/security/test/mini_kdc.h 8 files changed, 284 insertions(+), 43 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/20/5820/8 -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 8 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Sailesh Mukil has posted comments on this change. Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. Patch Set 7: (2 comments) http://gerrit.cloudera.org:8080/#/c/5820/6/src/kudu/security/init.cc File src/kudu/security/init.cc: PS6, Line 189: if (krb5_init_context(_ctx_) != 0) { : return Status::RuntimeError("could not initialize krb5 library"); > I'm not sure this is accomplishing what we want. Doesn't this just end up w You're right. I looked at the krb5 code and this call just returns the handle to the old ccache. As we spoke, the only known workaround to this is to not renew the tickets and just reacquire the tickets every time. I've removed the renewal code, and just left the reacquire code. However, this race still exists when using Heimdal's krb5 and there's no workaround for that. I've left a comment explaining that. Line 248: KRB5_RETURN_NOT_OK_PREPEND(krb5_cc_store_cred(krb5_ctx_, ccache_, ), > isn't it possible that the renewal time changes when we renew the ticket? m How so? The renewal time is based off the 'ticket_lifetime' which is a config value in krb5.conf. So 'endtime - now' will always be the same assuming 'now' is the 'start_time' which it almost always will be. Are you talking about the case where someone changes that value in the conf file midway? -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5820 to look at the new patch set (#7). Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. KUDU-1845: Kerberos client keytab should be periodically renewed With the current kerberos support, if a ticket expires, all communication on a secure cluster would stop as we do not attempt to renew the ticket. This patch adds a renewal thread which periodically wakes up and acquires a new ticket. We don't attempt to renew the existing ticket as there's a fundamental race in how ticket renewal works, as explained in the comment above KinitContext::DoRenewal(). Testing: Added a test for ticket reacquisition. I haven't added tests to confirm if disabling the renew thread makes these tests fail as that would require quite some plumbing which I think is unnecessary. Instead, I manually tested that disabling the renew thread causes the tests to fail. Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d --- M src/kudu/integration-tests/external_mini_cluster-test.cc M src/kudu/integration-tests/external_mini_cluster.cc M src/kudu/integration-tests/external_mini_cluster.h M src/kudu/security/init.cc M src/kudu/security/test/mini_kdc.cc M src/kudu/security/test/mini_kdc.h 6 files changed, 225 insertions(+), 43 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/20/5820/7 -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Todd Lipcon has posted comments on this change. Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. Patch Set 6: (2 comments) http://gerrit.cloudera.org:8080/#/c/5820/6/src/kudu/security/init.cc File src/kudu/security/init.cc: PS6, Line 189: KRB5_RETURN_NOT_OK_PREPEND(krb5_cc_default(krb5_ctx_, new_ccache), : "unable to get default credentials cache"); I'm not sure this is accomplishing what we want. Doesn't this just end up with krb5_ccache pointing at the same ticket cache that the old one was, rather than creating a new one? Line 248: int32_t interval = creds.times.endtime - time(nullptr); isn't it possible that the renewal time changes when we renew the ticket? maybe we should recalculate our "next renewal time" every time we renew? -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Sailesh Mukil has posted comments on this change. Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. Patch Set 6: > Uploaded patch set 6. I just had to #include The util/random library wouldn't have helped here because we need to seed it with something random as well. If we use the same seed across nodes, we will end up with the same random number on all the nodes, which means all of them will still end up flooding the KDC at the same time, completely defeating the purpose. -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Sailesh Mukil has uploaded a new patch set (#6). Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. KUDU-1845: Kerberos client keytab should be periodically renewed With the current kerberos support, if a ticket expires, all communication on a secure cluster would stop as we do not attempt to renew the ticket. This patch adds a renewal thread which periodically wakes up and either renews the Ticket Granting Ticket or acquires a new one if the renewal window is closed. The renew interval is controlled by a newly introduced flag called 'kerberos_reinit_interval'. Testing: Added 2 tests, one for ticket renewal and one for ticket reacquisition. I haven't added tests to confirm if disabling the renew thread makes these tests fail as that would require quite some plumbing which I think is unnecessary. Instead, I manually tested that disabling the renew thread causes the tests to fail. Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d --- M src/kudu/integration-tests/external_mini_cluster-test.cc M src/kudu/integration-tests/external_mini_cluster.cc M src/kudu/integration-tests/external_mini_cluster.h M src/kudu/security/init.cc M src/kudu/security/test/mini_kdc.cc M src/kudu/security/test/mini_kdc.h 6 files changed, 273 insertions(+), 43 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/20/5820/6 -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5820 to look at the new patch set (#6). Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. KUDU-1845: Kerberos client keytab should be periodically renewed With the current kerberos support, if a ticket expires, all communication on a secure cluster would stop as we do not attempt to renew the ticket. This patch adds a renewal thread which periodically wakes up and either renews the Ticket Granting Ticket or acquires a new one if the renewal window is closed. The renew interval is controlled by a newly introduced flag called 'kerberos_reinit_interval'. Testing: Added 2 tests, one for ticket renewal and one for ticket reacquisition. I haven't added tests to confirm if disabling the renew thread makes these tests fail as that would require quite some plumbing which I think is unnecessary. Instead, I manually tested that disabling the renew thread causes the tests to fail. Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d --- M src/kudu/integration-tests/external_mini_cluster-test.cc M src/kudu/integration-tests/external_mini_cluster.cc M src/kudu/integration-tests/external_mini_cluster.h M src/kudu/security/init.cc M src/kudu/security/test/mini_kdc.cc M src/kudu/security/test/mini_kdc.h 6 files changed, 273 insertions(+), 43 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/20/5820/6 -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 6 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
Re: [kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Will fix it now and post an update. On Fri, Feb 3, 2017 at 9:34 AM, Todd Lipcon (Code Review) < ger...@cloudera.org> wrote: > Todd Lipcon has posted comments on this change. > > Change subject: KUDU-1845: Kerberos client keytab should be periodically > renewed > .. > > > Patch Set 5: > > seems the TSAN build doesn't like the std::random stuff (maybe it's not > available in libc++?) We have our own RNG in util/random.h > > -- > To view, visit http://gerrit.cloudera.org:8080/5820 > To unsubscribe, visit http://gerrit.cloudera.org:8080/settings > > Gerrit-MessageType: comment > Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d > Gerrit-PatchSet: 5 > Gerrit-Project: kudu > Gerrit-Branch: master > Gerrit-Owner: Sailesh Mukil> Gerrit-Reviewer: Adar Dembo > Gerrit-Reviewer: Kudu Jenkins > Gerrit-Reviewer: Sailesh Mukil > Gerrit-Reviewer: Tidy Bot > Gerrit-Reviewer: Todd Lipcon > Gerrit-HasComments: No >
[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Todd Lipcon has posted comments on this change. Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. Patch Set 5: seems the TSAN build doesn't like the std::random stuff (maybe it's not available in libc++?) We have our own RNG in util/random.h -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Sailesh Mukil has posted comments on this change. Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. Patch Set 5: (6 comments) @Todd: Thanks, I got the ASAN build working. Aside from the code comments, I had a very silly bug in an intermediate patch that I got rid of. http://gerrit.cloudera.org:8080/#/c/5820/4/src/kudu/integration-tests/external_mini_cluster.cc File src/kudu/integration-tests/external_mini_cluster.cc: Line 35: #include "kudu/server/server_base.pb.h" > error: 'kudu/master/master_rpc.h' file not found [clang-diagnostic-error] Done http://gerrit.cloudera.org:8080/#/c/5820/4/src/kudu/security/init.cc File src/kudu/security/init.cc: Line 81: krb5_context krb5_ctx_; > warning: 'kinit_ctx' is a static definition in anonymous namespace; static Done PS4, Line 125: auto cleanup_cursor = MakeScopedCleanup([&]() { : krb5_cc_end_seq_get(krb5_ctx_, *ccache_, ); }); : : > the kerberos code uses a 'data_eq' helper here which also uses the length f I ported in the data_eq functions (they're very small), and that did away with the heap-buffer overflow. There still is a leak though that I'm unable to track down. Line 134: auto cleanup_creds = MakeScopedCleanup([&]() { > we may want some jitter on this. As is, it seems like if you started up 100 You're right, but adding a jitter here wouldn't help, as this is just a deadline and if all 100 servers try to renew at the same point in time before this deadline, they would still flood the servers. I've added the jitter while setting the sleep_interval_ in Kinit(), so that each server, with a high probability, wakes up at a different time than all or most other servers. PS4, Line 144: continue; : } : > this probably needs the same OSX workaround performed below in the initial Done PS4, Line 157: // If the ticket has already expired or if there's only a short period before which the : // renew window closes, we acquire a new ticket. : if (ticket_expiry < now || renew_deadline < now) { : // Acquire a new ticket using the keytab. This ticket will automatically be put into the : // credential cache. : KRB5_RETURN_NOT_OK_PREPEND(krb5_get_init_creds_keytab(krb5_ctx_, _creds, principal_, : > does this leave open a race where a connection might be attempted in betwee Good catch. To verify, I manually added a sleep between both the function calls and authentication of the peer failed. I changed it so that we allocate a new ccache and store the credentials there and swap out the old cache for the new one, before closing and deleting the old ccache. Odd how most other implementations of this don't take care of this case. -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5820 to look at the new patch set (#5). Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. KUDU-1845: Kerberos client keytab should be periodically renewed With the current kerberos support, if a ticket expires, all communication on a secure cluster would stop as we do not attempt to renew the ticket. This patch adds a renewal thread which periodically wakes up and either renews the Ticket Granting Ticket or acquires a new one if the renewal window is closed. The renew interval is controlled by a newly introduced flag called 'kerberos_reinit_interval'. Testing: Added 2 tests, one for ticket renewal and one for ticket reacquisition. I haven't added tests to confirm if disabling the renew thread makes these tests fail as that would require quite some plumbing which I think is unnecessary. Instead, I manually tested that disabling the renew thread causes the tests to fail. Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d --- M src/kudu/integration-tests/external_mini_cluster-test.cc M src/kudu/integration-tests/external_mini_cluster.cc M src/kudu/integration-tests/external_mini_cluster.h M src/kudu/security/init.cc M src/kudu/security/test/mini_kdc.cc M src/kudu/security/test/mini_kdc.h 6 files changed, 273 insertions(+), 43 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/20/5820/5 -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 5 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Todd Lipcon has posted comments on this change. Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. Patch Set 4: (4 comments) http://gerrit.cloudera.org:8080/#/c/5820/4/src/kudu/security/init.cc File src/kudu/security/init.cc: PS4, Line 125: strcmp(creds.server->data[1].data, principal_->realm.data) != 0 || : strcmp(creds.server->data[0].data, KRB5_TGS_NAME) != 0 || : strcmp(creds.server->realm.data, principal_->realm.data) != 0) { : the kerberos code uses a 'data_eq' helper here which also uses the length field of the data (and probably avoids the heap buffer overflow that ASAN is hitting) Line 134: time_t renew_deadline = renew_till - 30; we may want some jitter on this. As is, it seems like if you started up 100 servers, they'd all get their initial ticket at the same time. Then, they'd all hit this 30-second-before-expiration window at the same moment and cause a coordinated flood of load at the KDC at that moment. Some randomization like between 30 seconds and 5 minutes would probably be a good idea (potentailly also a random sleep before the first renewal attempt so that the checks themselves are offset) PS4, Line 144: // Acquire a new ticket using the keytab. This ticket will automatically be put into the : // credential cache. : this probably needs the same OSX workaround performed below in the initial kinit PS4, Line 157: // Clear existing credentials in cache. : KRB5_RETURN_NOT_OK_PREPEND(krb5_cc_initialize(krb5_ctx_, ccache_, principal_), : "Failed to re-initialize ccache"); : // Store the new credentials in the cache. : KRB5_RETURN_NOT_OK_PREPEND(krb5_cc_store_cred(krb5_ctx_, ccache_, _creds), : "Failed to store credentials in ccache"); : does this leave open a race where a connection might be attempted in between the two calls, see no TGT, and fail? is there a way to atomically replace the cred? (what happens if you don't use krb5_cc_initialize to reset it first?) -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Todd Lipcon has posted comments on this change. Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. Patch Set 4: > So the ASAN build fails because the renew thread is leaking all its > allocations, since the thread runs forever and most of the objects are never > freed. I think the leak is actually that the scoped_cleanup for 'creds' should be inside the krb5_cc_next_cred(...) loop, since each iteration creates a new creds object, right? BTW I'm also seeing a heap buffer overflow in the current rev of the patch -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Sailesh Mukil has posted comments on this change. Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. Patch Set 4: So the ASAN build fails because the renew thread is leaking all its allocations, since the thread runs forever and most of the objects are never freed. I tried adding a security::Shutdown() which gets called from ServerBase::Shutdown(), but that seems to not be of any use in the case of the tests since an ExternalDaemon is killed and not shutdown properly. So CheckForLeaks() would still complain and fail the test. The only option I see is to have a ScopedLeakCheckDisabler in the scope of the renew thread. However, that too makes me nervous since a lot of the structures can have nested allocations and future changes could introduce bugs easily. What do you think is the right path of action here? -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Sailesh Mukil has posted comments on this change. Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. Patch Set 4: Thanks Todd. As Todd fixed in the patch, the issue was that the client had a separate Kinit path and there was no renewal for that path which meant that the client could not communicate with the master or tserver after the ticket expired. The fix was to ensure communication between the master and the tserver worked well after the ticket expired. I added another test to check if acquiring a new ticket works as well (as opposed to renewing the existing one). -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: No
[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Hello Kudu Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5820 to look at the new patch set (#4). Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. KUDU-1845: Kerberos client keytab should be periodically renewed With the current kerberos support, if a ticket expires, all communication on a secure cluster would stop as we do not attempt to renew the ticket. This patch adds a renewal thread which periodically wakes up and either renews the Ticket Granting Ticket or acquires a new one if the renewal window is closed. The renew interval is controlled by a newly introduced flag called 'kerberos_reinit_interval'. Testing: Added 2 tests, one for ticket renewal and one for ticket reacquisition. I haven't added tests to confirm if disabling the renew thread makes these tests fail as that would require quite some plumbing which I think is unnecessary. Instead, I manually tested that disabling the renew thread causes the tests to fail. Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d --- M src/kudu/integration-tests/external_mini_cluster-test.cc M src/kudu/integration-tests/external_mini_cluster.cc M src/kudu/integration-tests/external_mini_cluster.h M src/kudu/security/init.cc M src/kudu/security/test/mini_kdc.cc M src/kudu/security/test/mini_kdc.h 6 files changed, 212 insertions(+), 43 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/20/5820/4 -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 4 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Todd Lipcon has posted comments on this change. Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/5820/1/src/kudu/security/init.cc File src/kudu/security/init.cc: Line 86: SleepFor(MonoDelta::FromSeconds(FLAGS_kerberos_reinit_interval * 60)); > AFAICT it is called in MiniClusters. Here's a call stack: Yea, but we never currently configure security in miniclusters, AFAIK, because it would mix state (like env vars) between all the participants in the cluster and the client and thus not give us much benefit. -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Adar Dembo has posted comments on this change. Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/5820/1/src/kudu/security/init.cc File src/kudu/security/init.cc: Line 86: SleepFor(MonoDelta::FromSeconds(FLAGS_kerberos_reinit_interval * 60)); > In MiniCluster I don't think any of this code gets called (because it deals AFAICT it is called in MiniClusters. Here's a call stack: 1. MiniCluster::AddTabletServer() 2. MiniTabletServer::Start() 3. TabletServer::Init() 4. ServerBase::Init() 5. security::InitKerberosForServer() -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Todd Lipcon has posted comments on this change. Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/5820/1/src/kudu/security/init.cc File src/kudu/security/init.cc: Line 86: SleepFor(MonoDelta::FromSeconds(FLAGS_kerberos_reinit_interval * 60)); > Will this be OK in MiniCluster, where all the servers are in-proc and we ex In MiniCluster I don't think any of this code gets called (because it deals with process-wide state). Though, maybe I'm wrong about that? Sailesh? -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Adar Dembo has posted comments on this change. Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/5820/1/src/kudu/security/init.cc File src/kudu/security/init.cc: Line 86: SleepFor(MonoDelta::FromSeconds(FLAGS_kerberos_reinit_interval * 60)); > I think this only runs in the "real daemon" context, in which case there is Will this be OK in MiniCluster, where all the servers are in-proc and we explicitly shutdown to restart them? FWIW, in that context the log rotator thread does get shut down (see ServerBase::Shutdown), but the async logger thread does not. -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Todd Lipcon has posted comments on this change. Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. Patch Set 1: (13 comments) http://gerrit.cloudera.org:8080/#/c/5820/1//COMMIT_MSG Commit Message: Line 19: TODO: Add tests. Still working on it. I think with our minikdc, it should be possible to set the TGT expiration time to something like 5 or 10 seconds, and then verify that the renewal works properly by having the test run for 30 seconds or so? http://gerrit.cloudera.org:8080/#/c/5820/1/src/kudu/security/init.cc File src/kudu/security/init.cc: Line 38: DEFINE_int32(kerberos_reinit_interval, 60, "Duration in minutes before which an attempt to " > I don't have much context on the security work going on in Kudu, but I was yea, I agree we should be able to do this dynamically based on the expiration time. Basing it on the messenger is nice but I agree the circular dependency makes it troublesome. I believe the LevelDB thread wrapper has some kind of 'SchedulePeriodicTask()' helper which allows all of these various periodic-but-light-weight things to run on a single thread, instead of a bunch of separate ones... but either way, what's one more thread on top of our 1000+? :-D Line 53: struct KinitContext { it seems like this global instance is accessed a bunch by some free functions that could just as well be class members (and avoid a bunch of 'kinit_ctx->' expressions)? Line 62: static scoped_refptr renew_thread; > warning: 'renew_thread' is a static definition in anonymous namespace; stat hrm, does this need to be kept around? Our default behavior if you just drop a Thread reference is to detach the thread, which is what we want. Line 71: #define KRB5_LOG_NOT_OK_ERROR(call, prepend_msg, ret) \ I think you could avoid needing this macro by instead moving the body of the renewal thread into a new function like: Status DoRenewal(); and then just use the existing KRB5_RETURN_NOT_OK_PREPEND within that function, and at the call site use WARN_NOT_OK(DoRenewal()) Line 86: SleepFor(MonoDelta::FromSeconds(FLAGS_kerberos_reinit_interval * 60)); > Shouldn't this thread participate in the rest of the server lifecycle thoug I think this only runs in the "real daemon" context, in which case there is no "shutdown", there is only getting killed. eg I don't think we attempt to shut down our log rotater thread, async logger threads, etc, and that's ok Line 108: strcmp(creds.server->data[1].data, kinit_ctx->principal->realm.data) || Looking at the 'is_local_tgt' function in krb5:src/clients/klist/klist.c it does: return princ->length == 2 && data_eq(princ->realm, *realm) && data_eq_string(princ->data[0], KRB5_TGS_NAME) && data_eq(princ->data[1], *realm); we seem to be missing the check on creds.server->realm (not sure when it might differ from the creds.server->data[1] but apparently they might?) It might also be worth referencing this function in a comment so we know where this incantation came from PS1, Line 109: creds.server->length != 2) I think this probably belongs first in the series of conditions, otherwise we risk dereferencing data[0] and data[1] when they don't exist. Line 110: krb5_free_cred_contents(kinit_ctx->krb5_ctx, ); can you do this as a scoped cleanup up above to avoid repeating it three times? PS1, Line 115: MonoDelta I don't know if MonoDelta is giving us much in this case. Typically we use monotime/monodelta only in the case when it's truly a monotonic clock reading, whereas here we have wall times. PS1, Line 120: krb5_creds new_creds; : auto cleanup_new_creds = MakeScopedCleanup([&]() { : krb5_free_cred_contents(kinit_ctx->krb5_ctx, _creds); }); I believe for this to be safe, you'd need to memset the 'new_creds' object to \0s first, right? Otherwise it is arbitrary data and the free call could crash? Line 132: kinit_ctx->principal, alignment slightly off Line 145: if (ret) goto end; if you use a ScopedCleanup up above, then no need for the gotos and such here. -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-HasComments: Yes
[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Adar Dembo has posted comments on this change. Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. Patch Set 1: (2 comments) Just passing through... http://gerrit.cloudera.org:8080/#/c/5820/1/src/kudu/security/init.cc File src/kudu/security/init.cc: Line 38: DEFINE_int32(kerberos_reinit_interval, 60, "Duration in minutes before which an attempt to " I don't have much context on the security work going on in Kudu, but I was wondering: would it be possible to use the expiration of the Kerberos credentials to dictate how long to sleep? That is, if we know our tickets are expiring in 6 hours, we can sleep until then, rather than waking up periodically? Ideally we wouldn't even need a thread, but schedule the future work as a task. Unfortunately, that facility is provided by the messenger in krpc, so using it might create a circular dependency. Line 86: SleepFor(MonoDelta::FromSeconds(FLAGS_kerberos_reinit_interval * 60)); Shouldn't this thread participate in the rest of the server lifecycle though? That is, if the server is shutdown, shouldn't we explicitly stop this thread? -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes
[kudu-CR] KUDU-1845: Kerberos client keytab should be periodically renewed
Sailesh Mukil has uploaded a new change for review. http://gerrit.cloudera.org:8080/5820 Change subject: KUDU-1845: Kerberos client keytab should be periodically renewed .. KUDU-1845: Kerberos client keytab should be periodically renewed With the current kerberos support, if a ticket expires, all communication on a secure cluster would stop as we do not attempt to renew the ticket. This patch adds a renewal thread which periodically wakes up and either renews the Ticket Granting Ticket or acquires a new one if the renewal window is closed. The renew interval is controlled by a newly introduced flag called 'kerberos_reinit_interval'. Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d TODO: Add tests. Still working on it. --- M src/kudu/security/init.cc 1 file changed, 137 insertions(+), 27 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/20/5820/1 -- To view, visit http://gerrit.cloudera.org:8080/5820 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ic4c072c1210216369e60eac88be4a20d9b166b2d Gerrit-PatchSet: 1 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil