Re: [Freeipa-users] Fwd: Marking subdomain offline

2017-04-07 Thread Jakub Hrozek
On Thu, Apr 06, 2017 at 02:39:02PM -0400, Chris Dagdigian wrote:
> 
> I see similar things in our environment where IPA is used as "glue" between
> AD Forests that have a 1-way trust relationship. We believe that the root
> cause has something to do with the 30+ domain controllers the IPA client
> tries to make contact with (in seemingly random order) across the AD Forest.

When an AD user logs in to an IPA client, there are actually two actions --
a user lookup and the authentication.

The user lookup is in fact done by the SSSD instance running on one of
the IPA masters; the clients just talk to the masters, but the SSSD on
the master talks to one of the AD DCs.

Authentication is done directly against one of the AD DCs.
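
To tell the two halves apart when debugging, the lookup half can be
exercised on its own through NSS, with no authentication involved. This
is only a minimal sketch, assuming Python is available on the IPA client;
lookup_user is a hypothetical helper name, not part of SSSD or IPA.

```python
import os
import pwd


def lookup_user(name):
    """Resolve a user through NSS only (the path getent/id use); no authentication."""
    try:
        entry = pwd.getpwnam(name)
    except KeyError:
        # NSS (and SSSD, where it is configured as an NSS source)
        # could not resolve the user.
        return None
    # Group list as seen through NSS for this user.
    groups = os.getgrouplist(entry.pw_name, entry.pw_gid)
    return {"uid": entry.pw_uid, "gid": entry.pw_gid, "groups": groups}
```

If this returns an entry while logins still fail, the problem is more
likely in the authentication half than in the lookup half.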


> Very hard to reproduce but the "subdomain marked offline" problem is one we
> see often in the sssd logs. We think that there are some AD servers in our
> sprawling environment that we either can't reach properly over the network
> (firewalls, etc.) or are just plain not configured to talk properly to us.
> Login success depends on hitting a happy domain controller.
> 
> We are VERY interested in the recent updates to IPA server that seem to
> indicate we can "pin" clients to certain specific AD controllers and from my
> understanding we just need to wait until the SSSD software gets broad
> support for this feature as well. Once we can do that we plan to pin our
> clients to named controllers and see if that helps with any of the
> intermittent login problems.

I don't think there are any changes needed to the IPA server (maybe
some management framework), but in general you're looking for this
feature:
https://docs.pagure.org/SSSD.sssd/design_pages/subdomain_configuration.html

(after we migrated the upstream projects from fedorahosted to pagure,
our documentation is still in a bit of a flux, but we're migrating the
docs gradually..)

As the design page says, you will be able to set up the AD DCs the IPA
masters talk to using the subdomain configuration, but the DCs the
clients authenticate to must currently be set in krb5.conf on the
clients until https://pagure.io/SSSD/sssd/issue/3336 is implemented.
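
For the client side, the krb5.conf part could look roughly like the
output of the sketch below. It only renders a [realms] stanza with
explicit kdc entries; render_realm_stanza is a hypothetical helper and
the DC host names are placeholders, so treat it as an illustration
rather than a tested configuration for any particular environment.

```python
def render_realm_stanza(realm, kdcs):
    """Render a krb5.conf [realms] entry that lists explicit KDCs for one realm."""
    lines = ["[realms]", f" {realm} = {{"]
    # One "kdc = host" line per domain controller that should be used.
    lines += [f"  kdc = {host}" for host in kdcs]
    lines.append(" }")
    return "\n".join(lines) + "\n"


# Placeholder realm and host names, for illustration only.
print(render_realm_stanza("FOO.LOCAL", ["dc1.foo.local", "dc2.foo.local"]))
```

The rendered stanza would still have to be merged into /etc/krb5.conf on
each client by hand or by configuration management.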

> 
> One workaround we've started to use for power users is collecting public SSH
> keys and hosting them in the IPA server -- as long as IPA knows that the
> user "exists" in AD and has a roughly complete group membership list then
> logging in with SSH key instead of AD password bypasses the transient
> password checking failures and is very quick.

Another workaround (for the IPA masters at least) would be to put the
reachable AD DCs into a site and assign the IPA masters to this site.

-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project


Re: [Freeipa-users] Fwd: Marking subdomain offline

2017-04-06 Thread mike

On 2017-04-06 20:18, Jakub Hrozek wrote:
> On Thu, Apr 06, 2017 at 07:21:01PM +0200, m...@chinewalking.com wrote:
> > Hi,
> >
> > My IPA<->AD trust setup experiences intermittent failures during login
> > events. The AD subdomain goes into an inactive/offline state and users
> > logging in are put into a 'delayed authentication' queue. Usually logging
> > in after a minute or so succeeds as the subdomain is reset and the user is
> > cached for following events. At all times getent/id and kinit are
> > successful, even with a purged sssd cache.
> > SRV records are correctly resolved, except for _kerberos-master.
> >
> > I have not been able to further troubleshoot the intermittent failures.
> > Traffic captures show no strange behaviour, yet the sssd_domain log is
> > clearly showing AD to be unreachable at times. All AD servers are W2012 and
> > DNS masking _ldap and _kerberos to single nodes, factoring out any faulty
> > Windows configs, so far has not had any effect (Would it?).
> >
> > sssd's data_provider_fo.c :> be_fo_reset_svc() calls fo_get_service(),
> > which returns EOK. I'm not familiar yet with the variables at play, would
> > adding debug statements here reveal faults that may cause this?
>
> Could you paste a bit more context? I think what would work is to trim
> the logs (truncate --size 0), then reproduce the issue and search for
> the first occurrence of a "NOT_WORKING" message from any of the fo_*
> functions.


After truncating the logs I noticed a comparable error that was fixed 
earlier today. I created a number of existing groups (sudo, app, etc) 
with low GIDs during initial deployment of IPA. One group caused issues 
and I deleted it earlier on. Now another group triggered exactly the 
same sequence of errors:


[{"CODE_FILE=src/providers/ipa/ipa_id.c", 
36}{"CODE_FUNC=ipa_initgr_get_overrides_step"{"The group 
name=s...@unix.foo.local,cn=groups,cn=unix.foo.local,cn=sysdb has no 
UUID attribute objectSIDString, error!\n"
[{"CODE_FILE=src/providers/ipa/ipa_subdomains_id.c", 
47}{"CODE_FUNC=ipa_id_get_groups_overrides_done", 42}{"IPA resolve user 
groups overrides failed [22].\n"
[{"CODE_FUNC=be_mark_dom_offline", 29}{"Marking subdomain foo.local 
offline\n"


With all these troublesome groups removed I have not been able to
reproduce the issue. I will test further with different users and
mapped groups. I guess the main fault was in how I read the logs:
with multiple logins interleaved, I overlooked the real error and only
saw the mentions of offline AD backends and subdomains.


I am not sure why these POSIX groups had no objectSIDString while others
did.


Thank you,

Mike



Re: [Freeipa-users] Fwd: Marking subdomain offline

2017-04-06 Thread Chris Dagdigian


I see similar things in our environment where IPA is used as "glue" 
between AD Forests that have a 1-way trust relationship. We believe that 
the root cause has something to do with the 30+ domain controllers the 
IPA client tries to make contact with (in seemingly random order) across 
the AD Forest.  Very hard to reproduce but the "subdomain marked 
offline" problem is one we see often in the sssd logs. We think that 
there are some AD servers in our sprawling environment that we either 
can't reach properly over the network (firewalls, etc.) or are just 
plain not configured to talk properly to us.  Login success depends on 
hitting a happy domain controller.


We are VERY interested in the recent updates to IPA server that seem to 
indicate we can "pin" clients to certain specific AD controllers and 
from my understanding we just need to wait until the SSSD software gets 
broad support for this feature as well. Once we can do that we plan to 
pin our clients to named controllers and see if that helps with any of 
the intermittent login problems.


One workaround we've started to use for power users is collecting public 
SSH keys and hosting them in the IPA server -- as long as IPA knows that 
the user "exists" in AD and has a roughly complete group membership list 
then logging in with SSH key instead of AD password bypasses the 
transient password checking failures and is very quick.


Chris


m...@chinewalking.com 
April 6, 2017 at 1:21 PM
Hi,

My IPA<->AD trust setup experiences intermittent failures during login 
events. The AD subdomain goes into an inactive/offline state and users
logging in are put into a 'delayed authentication' queue. Usually
logging in after a minute or so succeeds as the subdomain is reset and
the user is cached for following events. At all times getent/id and
kinit are successful, even with a purged sssd cache.

SRV records are correctly resolved, except for _kerberos-master.

I have not been able to further troubleshoot the intermittent 
failures. Traffic captures show no strange behaviour, yet the 
sssd_domain log is clearly showing AD to be unreachable at times. All 
AD servers are W2012 and DNS masking _ldap and _kerberos to single 
nodes, factoring out any faulty Windows configs, so far has not had 
any effect (Would it?).


sssd's data_provider_fo.c :> be_fo_reset_svc() calls fo_get_service(), 
which returns EOK. I'm not familiar yet with the variables at play, 
would adding debug statements here reveal faults that may cause this?


Any pointers are very much appreciated.

Mike


[sssd[be[unix.foo.local]]] [ipa_srv_ad_acct_lookup_step] (0x0400): Looking up AD account
[sssd[be[unix.foo.local]]] [ipa_srv_ad_acct_lookup_done] (0x0080): Sudomain lookup failed, will try to reset sudomain..
[sssd[be[unix.foo.local]]] [ipa_server_trusted_dom_setup_send] (0x1000): Trust direction of subdom foo.local from forest foo.local is: one-way inbound: local domain trusts the remote domain
[sssd[be[unix.foo.local]]] [ipa_server_trusted_dom_setup_1way] (0x0400): Will re-fetch keytab for foo.local
[sssd[be[unix.foo.local]]] [ipa_getkeytab_send] (0x0400): Retrieving keytab for UNIX$@FOO.local from ipa01.unix.foo.local into /var/lib/sss/keytabs/foo.local.keytab6AXxWV using ccache /var/lib/sss/db/ccache_UNIX.FOO.local
[sssd[be[unix.foo.local]]] [child_handler_setup] (0x2000): Setting up signal handler up for pid [6242]
[sssd[be[unix.foo.local]]] [child_handler_setup] (0x2000): Signal handler set up for pid [6242]
[sssd[be[unix.foo.local]]] [sdap_process_result] (0x2000): Trace: sh[0x7f71cd9ddb80], connected[1], ops[(nil)], ldap[0x7f71cd9e65a0]
[sssd[be[unix.foo.local]]] [sdap_process_result] (0x2000): Trace: end of ldap_result list
[sssd[be[unix.foo.local]]] [ad_online_cb] (0x0400): The AD provider is online
[sssd[be[unix.foo.local]]] [be_ptask_online_cb] (0x0400): Back end is online
[sssd[be[unix.foo.local]]] [be_ptask_enable] (0x0080): Task [Subdomains Refresh]: already enabled
Keytab successfully retrieved and stored in: /var/lib/sss/keytabs/foo.local.keytab6AXxWV
[sssd[be[unix.foo.local]]] [child_sig_handler] (0x1000): Waiting for child [6242].
[sssd[be[unix.foo.local]]] [child_sig_handler] (0x0100): child [6242] finished successfully.
[sssd[be[unix.foo.local]]] [ipa_getkeytab_recv] (0x2000): ipa-getkeytab status 0
[sssd[be[unix.foo.local]]] [ipa_server_trust_1way_kt_done] (0x0400): Keytab successfully retrieved to /var/lib/sss/keytabs/foo.local.keytab6AXxWV
[sssd[be[unix.foo.local]]] [ipa_server_trust_1way_kt_done] (0x2000): Keytab renamed to /var/lib/sss/keytabs/foo.local.keytab
[sssd[be[unix.foo.local]]] [ipa_server_trust_1way_kt_done] (0x0400): Keytab /var/lib/sss/keytabs/foo.local.keytab6AXxWV contains the expected principals
[sssd[be[unix.foo.local]]] [ipa_server_trust_1way_kt_done] (0x0400): Established trust context for foo.local
[sssd[be[unix.foo.local]]] [unique_filename_destructor] (0x2000): Unlinking 

Re: [Freeipa-users] Fwd: Marking subdomain offline

2017-04-06 Thread Jakub Hrozek
On Thu, Apr 06, 2017 at 07:21:01PM +0200, m...@chinewalking.com wrote:
> Hi,
> 
> My IPA<->AD trust setup experiences intermittent failures during login
> events. The AD subdomain goes into an inactive/offline state and users logging
> in are put into a 'delayed authentication' queue. Usually logging in after a
> minute or so succeeds as the subdomain is reset and the user is cached for
> following events. At all times getent/id and kinit are successful, even
> with a purged sssd cache.
> SRV records are correctly resolved, except for _kerberos-master.
> 
> I have not been able to further troubleshoot the intermittent failures.
> Traffic captures show no strange behaviour, yet the sssd_domain log is
> clearly showing AD to be unreachable at times. All AD servers are W2012 and
> DNS masking _ldap and _kerberos to single nodes, factoring out any faulty
> Windows configs, so far has not had any effect (Would it?).
> 
> sssd's data_provider_fo.c :> be_fo_reset_svc() calls fo_get_service(), which
> returns EOK. I'm not familiar yet with the variables at play, would adding
> debug statements here reveal faults that may cause this?

Could you paste a bit more context? I think what would work is to trim
the logs (truncate --size 0), then reproduce the issue and search for
the first occurrence of a "NOT_WORKING" message from any of the fo_*
functions.
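
The search step could be scripted along these lines. This is a rough
sketch, not an SSSD tool: first_not_working is a hypothetical helper, the
line shape is assumed from the log excerpts in this thread (one entry per
line), and since the exact wording of the status message is an
assumption, the match ignores case and treats spaces and underscores
alike.

```python
import re

# Assumed line shape, modelled on the excerpts in this thread:
#   [sssd[be[DOMAIN]]] [function_name] (0xLEVEL): message
LINE_RE = re.compile(
    r"\[sssd\[be\[[^\]]*\]\]\]\s+\[(?P<func>\w+)\]\s+"
    r"\((?P<level>0x[0-9a-fA-F]+)\):\s*(?P<msg>.*)"
)


def first_not_working(lines, needle="NOT_WORKING"):
    """Return (line number, line) of the first fo_* entry mentioning the needle."""
    for lineno, line in enumerate(lines, start=1):
        match = LINE_RE.search(line)
        if not match or not match.group("func").startswith("fo_"):
            continue
        # Normalise so that "not working" and "NOT_WORKING" both match.
        normalised = match.group("msg").upper().replace(" ", "_")
        if needle in normalised:
            return lineno, line.rstrip("\n")
    return None
```

It accepts any iterable of lines, so an open file object for the domain
log (typically under /var/log/sssd/) can be passed in directly.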
