On Fri, Apr 24, 2015 at 02:01:11PM +0200, Lukas Slebodnik wrote: > On (24/04/15 12:43), Jakub Hrozek wrote: > >On Thu, Apr 23, 2015 at 07:29:07AM -0400, Stephen Gallagher wrote: > >> On Thu, 2015-04-23 at 08:14 +0200, Lukas Slebodnik wrote: > >> > On (20/04/15 14:38), Stephen Gallagher wrote: > >> > > On Mon, 2015-04-20 at 08:53 +0200, Lukas Slebodnik wrote: > >> > > > ehlo, > >> > > > > >> > > > attached patch fixes crash in > >> > > > https://fedorahosted.org/sssd/ticket/2629 > >> > > > > >> > > > >> > > > >> > > Nack. I'd rather we fixed the root of this problem. I did some > >> > > digging > >> > > this afternoon and tracked the issue back to ad_gpo.c line 3499 (in > >> > > current master). If we get back a NULL result or num_results == 0, > >> > > then we just skip over this item in the list and start processing > >> > > the > >> > > next one. Unfortunately, that leaves an item in the candidate_gpos > >> > > list that was never properly constructed. > >> > > > >> > You are right. > >> > > >> > We got a referral to GPO and therefore we do not find any attributes. > >> > > >> > [sdap_sd_search_send] (0x0400): Searching entry [cn={2BA15B73-9524- > >> > 419F-B4B7-185E1F0D3DCF},cn=policies,cn=system,DC=example,DC=com] > >> > using SD > >> > [sdap_print_server] (0x2000): Searching 10.1.1.14 > >> > [sdap_get_generic_ext_step] (0x0400): calling ldap_search_ext with > >> > [(objectclass=*)][cn={2BA15B73-9524-419F-B4B7- > >> > 185E1F0D3DCF},cn=policies,cn=system,DC=example,DC=com]. > >> > [sdap_get_generic_ext_step] (0x1000): Requesting attrs: > >> > [nTSecurityDescriptor] > >> > [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [cn] > >> > [sdap_get_generic_ext_step] (0x1000): Requesting attrs: > >> > [gPCFileSysPath] > >> > [sdap_get_generic_ext_step] (0x1000): Requesting attrs: > >> > [gPCMachineExtensionNames] > >> > [sdap_get_generic_ext_step] (0x1000): Requesting attrs: > >> > [gPCFunctionalityVersion] > >> > [sdap_get_generic_ext_step] (0x1000): Requesting attrs: [flags] > >> > [sdap_get_generic_ext_step] (0x2000): ldap_search_ext called, msgid = > >> > 14 > >> > [sdap_process_result] (0x2000): Trace: sh[0x7f5d409a8c60], > >> > connected[1], ops[0x7f5d409d2c10], ldap[0x7f5d409a9ed0] > >> > [sdap_process_result] (0x2000): Trace: ldap_result found nothing! > >> > [sdap_process_result] (0x2000): Trace: sh[0x7f5d409a8c60], > >> > connected[1], ops[0x7f5d409d2c10], ldap[0x7f5d409a9ed0] > >> > [sdap_process_message] (0x4000): Message type: > >> > [LDAP_RES_SEARCH_RESULT] > >> > [sdap_get_generic_op_finished] (0x0400): Search result: > >> > Referral(10), 0000202B: RefErr: DSID-0310063C, data 0, 1 access > >> > points > >> > ref 1: 'lzb.hq' > >> > > >> > [sdap_get_generic_op_finished] (0x1000): Ref: > >> > ldap://lzb.hq/cn=%7B2BA15B73-9524-419F-B4B7- > >> > 185E1F0D3DCF%7D,cn=policies,cn=system,DC=example,DC=com > >> > [ad_gpo_get_gpo_attrs_done] (0x0040): no attrs found for GPO; try > >> > next GPO. > >> > > >> > > Under what circumstances can the secinfo_dacl search return success > >> > > but with zero results? Is there a bug or a race here (such as the > >> > > AD > >> > > server has updated the GPO since we got the list of candidate > >> > > GPOs?). > >> > > How best to handle this? > >> > > > >> > > With your patch here, it looks like we're assuming that it's okay > >> > > to > >> > > just skip over this GPO. If that's the case, then what we really > >> > > need > >> > > to be doing in ad_gpo_get_gpo_attrs_step() is to mark the gp_gpo as > >> > > being invalid and then after we've gone through them all, shrink > >> > > the > >> > > array, removing all of the invalid entries. This will be more > >> > > future- > >> > > proof, as it's not just the gpo_sd that is uninitialized here. > >> > > Every > >> > > member of this gp_gpo is NULL except for the DN. > >> > > > >> > > If it's *not* okay that we've gotten no results for this lookup > >> > > (such > >> > > as in the race case; we don't want to be skipping over a GPO that > >> > > might properly be denying users), we may need to restart > >> > > processing at > >> > > least a couple times to try to avoid the race and go offline if we > >> > > can't complete the processing (so we at least stick to our cached > >> > > rules). > >> > > >> > I'm sorry I didn't noticed it in log files the first time. > >> > Now we know the root of problem. > >> > Which of your proposal do you prefer now? > >> > > >> > LS > >> > >> OK, I hadn't considered the referral scenario. (I wasn't actually > >> aware that GPOs could *be* referred to elsewhere). Is this referral > >> pointing to a different machine than the host is enrolled to? It's > >> trying to reach a DN at lzb.hq that is identical to the one we just > >> requested. > >> > >> I'd really like to know how to reproduce the setup for this situation; > >> is it inheriting a GPO from a parent domain somehow? > >> > >> Neither of my above proposals are a proper solution for this case, > >> unfortunately. The first one would make the crash go away, but if this > >> GPO really exists and is restrictive, not processing it is *really > >> bad*. Right now we're actually slightly lucky in that the crash has a > >> net result of causing an access denial rather than erroneously > >> resulting in a user login, but if we were to "solve" this problem the > >> way Lukas's original patch intended (or my first alternative above), > >> it would be introducing a security issue. So let's not do either of > >> those. > >> > >> The second option I suggested there is invalid as well, since the > >> entry *exists*, it's just been converted to a referral. There are > >> therefore only two safe options at this time: > >> > >> 1) Quick-and-ugly: If we encounter a referral, skip straight to denial > >> and loudly warn in the logs that this is a situation we are unable to > >> handle yet. This will avoid the crash, but will leave the GPO behavior > >> equivalently bad to the current state. Might be acceptable if this is > >> only happening on a very small subset of systems. > > > >I'm afraid we need to go this way now. > > > >> Users experiencing > >> this might be able to set "ldap_referrals = True" on the affected > >> systems to quietly solve the problem at the expense of all the other > >> side-effects to that (particularly performance) > > > >I wouldn't even suggest enabling referrals. I've seen the LDAP provider > >misbehave completely with referrals enabled, openldap-libs is not really > >great at chasing them and LDAP provider would go offline etc.. > >So enabling referrals doesn't just degrade performance, but also > >functionality.. > > > >I also tested that using Global Catalog wouldn't help here, some > >attributes we require are not replicated to GC, such as gPCFileSysPath. > > > >Are you sure we're using the right connection to the right domain? We > >can select the right connection based on the DN that we know. > >I'm sorry, but I couldn't reproduce the bug myself. > > > Stephen wrote a reproducer into ticket > https://fedorahosted.org/sssd/ticket/2629
I know, that's what I tried, but I couldn't get the setup to break. For some reason SSSD didn't examine the GPO from the parent domain if it was linked to a site the client was at.. > > We agreed to do the "Quick-and-ugly" way for 1.12 + documentation > but he mentioned that it is walid use-case and it would be good to > implement it in next release. Yes, the LDAP provider is in a big need of refactoring, but I think even for 1.13 it's too late. _______________________________________________ sssd-devel mailing list sssd-devel@lists.fedorahosted.org https://lists.fedorahosted.org/mailman/listinfo/sssd-devel