Re: [SSSD] Design Discussion: SSSD should support DNS sites

Sumit Bose Mon, 18 Feb 2013 09:06:13 -0800

On Thu, Jan 31, 2013 at 07:04:32PM +0100, Sumit Bose wrote:
> On Thu, Jan 31, 2013 at 11:32:21AM -0500, Simo Sorce wrote:
> > On Thu, 2013-01-31 at 16:40 +0100, Sumit Bose wrote:
> > > On Thu, Jan 31, 2013 at 09:43:09AM -0500, Simo Sorce wrote:
> > > > On Thu, 2013-01-31 at 10:49 +0100, Sumit Bose wrote:
> > > > > Hi,
> > > > > 
> > > > > I have created a design page for
> > > > > https://fedorahosted.org/sssd/ticket/1032 "[RFE] sssd should support
> > > > > DNS sites" at
> > > > > https://fedorahosted.org/sssd/wiki/DesignDocs/ActiveDirectoryDNSSites .
> > > > > It can be found below as well.
> > > > > 
> > > > > Corrections, comments and enhancements are welcome.
> > > > 
> > 
> > But if we conceal site discovery into resolv_ctx, we can simply avoid a
> > new separate interface altogether.
> > 
> > We simply add site support to the resolver library through a module and
> > let all current calls be 'magically' made site aware.
> > We might need to change calls that specify SRV searches so that we
> > separate Service and Domain, but I think that would be a better
> > architecture as we can abstract away how we do resolve stuff into a
> > module of the resolver.
> > 
> > This way when we later add an IPA location discovery mechanism we do not
> > have to change anything in the common code, we just plug in a different
> > resolver plugin for 'location' discovery for the ipa domain case.
> > 
> > and we can actually mix both as long as we make the 'site/location'
> > plugin domain bound (hence why we need to split service and domain
> > arguments in requests so each request can be bound to the correct site
> > discovery code).
> 
> I have to think about it. It certainly would make code reuse much easier
> but I wonder whether this really works in complex setups with multiple IPA
> and AD domains.
>


I have updated the design page accordingly; a copy of the current version can be
found below. Comments are welcome.

bye,
Sumit

== Use Active Directory's DNS sites ==
Related ticket(s):
* [https://fedorahosted.org/sssd/ticket/1032 RFE sssd should support DNS sites]

=== Problem Statement ===
In larger Active Directory environments there is typically more than one domain 
controller. Some of them are used for redundancy, others to build different 
administrative domains. But in environments with multiple physical locations 
each location often has at least one local domain controller to reduce latency 
and network load between the locations.

Now clients have to find the local or nearest domain controller. For this the
concept of sites was introduced, where each physical location can be seen as an
individual site with a unique name. The naming scheme for DNS service records
was extended (see e.g.
http://technet.microsoft.com/en-us/library/cc759550(v=ws.10).aspx) so that
clients can first try to find the needed service in the local site and can fall
back to looking in the whole domain if there is no local service available.

Additionally, clients have to find out which site they belong to. This
must be done dynamically because clients might move from one location to a
different one on a regular basis (roaming users). For this a special LDAP
request, the (C)LDAP ping
(http://msdn.microsoft.com/en-us/library/cc223811.aspx), was introduced.

=== Overview of the solution ===
==== General considerations ====
The solution in SSSD should take into account that other types of domains, e.g. 
a FreeIPA domain, want to implement their own scheme to discover the nearest 
service of a certain type. A plugin interface where the configured ID provider 
can implement methods to determine the location of the client looks like the 
most flexible solution here.

Since the currently available (AD sites) or discussed schemes
(http://www.freeipa.org/page/V3/DNS_Location_Mechanism) use DNS SRV lookups
with special extensions to the _service._protocol.domain.name format, the plugin
will provide the needed string for the SRV lookup, e.g.
_ldap._tcp.site-name._sites.domain.name for AD or
_ldap._tcp.hostname.domain.name for FreeIPA. Since network lookups might be
needed to find out all the components of the string, the plugin interface must
allow asynchronous operations.

QUESTION: Should the plugin offer a tevent_req style interface with *_send and 
*_recv or should it be a single method which has a callback as a parameter?

Additionally, the plugin may return a fallback string for the SRV lookup, which
would typically be _service._protocol.domain.name. The common resolver code can
then run the SRV lookup first with the primary string and mark all returned
hosts as primary servers (as it currently does). If a fallback string is
returned, a second SRV lookup can be run; all hosts which were already returned
in the first request will be removed and the remaining hosts will be added as
backup servers.
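
The primary/backup merge described above can be sketched as follows. This is only an illustration under stated assumptions: struct srv_host, host_known() and merge_srv_results() are hypothetical names, not existing fail-over code, and hosts are compared case-insensitively by name only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>

/* Hypothetical stand-in for a fail-over server entry. */
struct srv_host {
    const char *name;
    bool primary;
};

/* True if name is already present in hosts[0..count).
 * DNS names are compared case-insensitively. */
static bool host_known(const struct srv_host *hosts, size_t count,
                       const char *name)
{
    for (size_t i = 0; i < count; i++) {
        if (strcasecmp(hosts[i].name, name) == 0) {
            return true;
        }
    }
    return false;
}

/* Write the hosts of the primary lookup first, then those hosts of the
 * fallback lookup that were not already returned, marked as backup.
 * Returns the number of entries written, never more than out_size. */
static size_t merge_srv_results(const char **primary, size_t n_primary,
                                const char **backup, size_t n_backup,
                                struct srv_host *out, size_t out_size)
{
    size_t n = 0;

    assert(out != NULL);

    for (size_t i = 0; i < n_primary && n < out_size; i++) {
        if (!host_known(out, n, primary[i])) {
            out[n].name = primary[i];
            out[n].primary = true;
            n++;
        }
    }
    for (size_t i = 0; i < n_backup && n < out_size; i++) {
        if (!host_known(out, n, backup[i])) {
            out[n].name = backup[i];
            out[n].primary = false;
            n++;
        }
    }
    return n;
}
```

With a primary result of dc1.site.example.com and a fallback result containing the same host plus dc2.example.com, only dc2.example.com ends up as a backup server.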

==== Sites specific details ====

The plugin of the AD provider will do the following steps:
1. do a DNS lookup to find any DC
1. send a CLDAP ping to the first DC returned to get the client's site
1. if no reply arrives before a timeout, send a CLDAP ping to the next DC on the list
1. if the client's site is known, return
_service._protocol.site-name._sites.domain.name for primary server and
_service._protocol.domain.name for backup server, otherwise return only
_service._protocol.domain.name.
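
The last step, building the lookup strings, might look like the following minimal sketch. build_srv_query() is a hypothetical helper name chosen for this example; passing NULL or an empty site yields the domain-wide string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Build an SRV lookup string into buf.
 * With a site:    _service._protocol.site._sites.domain
 * Without a site: _service._protocol.domain
 * Returns 0 on success, -1 if the result did not fit into buf. */
static int build_srv_query(char *buf, size_t len,
                           const char *service, const char *protocol,
                           const char *site, const char *domain)
{
    int ret;

    assert(buf != NULL && service != NULL);
    assert(protocol != NULL && domain != NULL);

    if (site != NULL && strlen(site) > 0) {
        ret = snprintf(buf, len, "_%s._%s.%s._sites.%s",
                       service, protocol, site, domain);
    } else {
        ret = snprintf(buf, len, "_%s._%s.%s",
                       service, protocol, domain);
    }

    /* snprintf returns the length it wanted to write. */
    return (ret < 0 || (size_t)ret >= len) ? -1 : 0;
}
```

A plugin that knows the site would call the helper twice, once with the site for the primary string and once with NULL for the backup string; if the site is unknown only the second call is needed.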

The results of the different steps should be available with one of the debug
levels reserved for tracing to make debugging easier and to allow acceptance
tests to validate the behavior with the help of the debug logs.

=== Implementation details ===
struct resolv_ctx should get 2 (or 3, depending on the interface type) members:
one holds a function pointer to the plugin and the other a pointer to private
data for the plugin. Since most of the structs related to the fail-over and
resolver code are private, a setter method to add the pointers should be added
as well. This is more flexible than adding additional arguments to
resolv_init().
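
To illustrate the setter idea, here is a rough sketch. Everything in it is an assumption made for the example: struct resolv_ctx_sketch stands in for the private struct resolv_ctx, the plugin signature is simplified to a synchronous call, and none of the function names exist in SSSD.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified synchronous plugin signature, used only for this sketch. */
typedef int (*location_plugin_fn_t)(const char *service,
                                    const char *protocol,
                                    const char *domain,
                                    void *private_data);

/* Stand-in for the private struct resolv_ctx; only the new members. */
struct resolv_ctx_sketch {
    location_plugin_fn_t location_plugin; /* NULL: no plugin set */
    void *location_plugin_data;           /* opaque plugin data */
};

/* Setter, so callers never need to know the struct layout. */
static void resolv_set_location_plugin_sketch(struct resolv_ctx_sketch *ctx,
                                              location_plugin_fn_t fn,
                                              void *private_data)
{
    assert(ctx != NULL);
    ctx->location_plugin = fn;
    ctx->location_plugin_data = private_data;
}

/* Run the plugin if one is set; -1 tells the caller to fall back to
 * the default SRV query. */
static int resolv_run_location_plugin_sketch(const struct resolv_ctx_sketch *ctx,
                                             const char *service,
                                             const char *protocol,
                                             const char *domain)
{
    assert(ctx != NULL);
    if (ctx->location_plugin == NULL) {
        return -1;
    }
    return ctx->location_plugin(service, protocol, domain,
                                ctx->location_plugin_data);
}

/* Example plugin: only counts how often it was called. */
static int counting_plugin(const char *service, const char *protocol,
                           const char *domain, void *private_data)
{
    (void)service;
    (void)protocol;
    (void)domain;
    *(int *)private_data += 1;
    return 0;
}
```

The point of the setter is that an ID provider can register its plugin after the resolver context was created, without the context having to be exposed outside the resolver code.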

Besides the service type, protocol and domain, which are all available
in struct srv_data, the plugin should get a tevent context and its private data
as arguments. With this the plugin interface might look like:

{{{
typedef struct tevent_req *(*location_plugin_send_t)(TALLOC_CTX *mem_ctx,
                                                     struct tevent_context *ev,
                                                     const char *service,
                                                     const char *protocol,
                                                     const char *domain,
                                                     void *private_data);
typedef int (*location_plugin_recv_t)(TALLOC_CTX *mem_ctx,
                                      char **primary_srv_lookup_string,
                                      char **backup_srv_lookup_string);
}}}
or
{{{
typedef void (*location_plugin_callback_t)(int status,
                                           char *primary_srv_lookup_string,
                                           char *backup_srv_lookup_string);
typedef void (*location_plugin_call_t)(TALLOC_CTX *mem_ctx,
                                       struct tevent_context *ev,
                                       const char *service,
                                       const char *protocol,
                                       const char *domain,
                                       void *private_data,
                                       location_plugin_callback_t cb_fn);
}}}

Please note, if in the future there is a method to find the nearest service of a
certain kind which is not based on DNS SRV records like the ones currently
discussed here, the plugin interface can be extended to return not a string for
the SRV lookup, but a list of hostnames or even a list of IP addresses. But for
the time being it does not make sense to add code for this because, e.g., it
cannot be tested.

If a plugin is defined, it can then be called in resolve_srv_cont() instead of
get_srv_query(). If it is not defined, either the result of get_srv_query() can
be used or a default request with the same interface as the plugin can be used.
I think the latter would make the code flow easier to follow.

Additionally, the srv resolver code in fail_over.c must be extended to run the
lookup for the backup servers as well, if a backup srv lookup string is
returned.
 
==== Finding a DC for the CLDAP ping ====
To find any DC in the domain, Samba looks up _ldap._tcp.domain.name. I would
suggest using _ldap._tcp.domain.name for the SSSD implementation as well.

==== Sending the CLDAP ping ====
The CLDAP ping is an LDAP search request with a filter like
{{{
(&(&(DnsDomain=ad.domain)(DomainSid=S-1-5-21-1111-2222-3333))(NtVer=0x01000016))
}}}
and the attribute "!NetLogon".
The flags given with the !NtVer component of the search filter will be 
different for a domain member (AD provider) and an IPA server in an environment 
with trusts (IPA provider).

A domain member will belong to a site and the following flags from
/usr/include/samba-4.0/gen_ndr/nbt.h should be used: 'NETLOGON_NT_VERSION_5 |
NETLOGON_NT_VERSION_5EX | NETLOGON_NT_VERSION_IP'. A trusted server does not
belong to one of the sites of the trusting domain, so it can only ask for the
closest site with 'NETLOGON_NT_VERSION_5 | NETLOGON_NT_VERSION_5EX |
NETLOGON_NT_VERSION_WITH_CLOSEST_SITE'. Maybe
NETLOGON_NT_VERSION_WITH_CLOSEST_SITE is useful for a domain member as well, if
e.g. the services on the local site are not available.
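
The two flag combinations can be sketched as below. The SKETCH_* constants are local copies of what I believe the corresponding NETLOGON_NT_VERSION_* values in Samba's nbt.h to be; they should be checked against the installed header, and real code should use the header's definitions. format_cldap_filter() is a hypothetical helper that only produces the readable notation used in the filter example above, not the encoding that is sent on the wire.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed values, mirroring NETLOGON_NT_VERSION_* from nbt.h. */
#define SKETCH_NT_VERSION_5                 0x00000002u
#define SKETCH_NT_VERSION_5EX               0x00000004u
#define SKETCH_NT_VERSION_WITH_CLOSEST_SITE 0x00000010u
#define SKETCH_NT_VERSION_IP                0x20000000u

/* Flags for a domain member (AD provider). */
static uint32_t ntver_for_domain_member(void)
{
    return SKETCH_NT_VERSION_5 | SKETCH_NT_VERSION_5EX |
           SKETCH_NT_VERSION_IP;
}

/* Flags for a trusted server (IPA provider with trusts). */
static uint32_t ntver_for_trusted_server(void)
{
    return SKETCH_NT_VERSION_5 | SKETCH_NT_VERSION_5EX |
           SKETCH_NT_VERSION_WITH_CLOSEST_SITE;
}

/* Format the search filter in readable notation.
 * Returns 0 on success, -1 if the result did not fit into buf. */
static int format_cldap_filter(char *buf, size_t len,
                               const char *dns_domain,
                               const char *domain_sid,
                               uint32_t ntver)
{
    int ret;

    assert(buf != NULL && dns_domain != NULL && domain_sid != NULL);

    ret = snprintf(buf, len,
                   "(&(&(DnsDomain=%s)(DomainSid=%s))(NtVer=0x%08x))",
                   dns_domain, domain_sid, (unsigned int)ntver);
    return (ret < 0 || (size_t)ret >= len) ? -1 : 0;
}
```

Keeping the flag selection in two small functions makes it easy to log which combination was sent, which helps when validating the behavior from the debug logs.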

==== Parsing the server response ====
The server response is a single attribute "!NetLogon" which is a binary blob 
containing multiple NDR encoded values. This value can be decoded with 
ndr_pull_netlogon_samlogon_response() from the Samba library libndr-nbt.

==== Side note about struct resolv_ctx and the usage of resolv_init() ====
In previous discussions it was said that resolv_init() should be called only
once during the initialization of a provider, preferably from the common
responder code. This means that there is only one instance of the resolv_ctx
for the whole provider.

Currently resolv_init() is called at two other places as well, in ipa_dyndns.c
and sdap_async_sudo_hostinfo.c. I think the only reason for calling
resolv_init() at those two places is that both need to call some low-level
resolver routines which need a resolv_ctx as parameter, and that there is no
easy way to get the resolv_ctx because it is hidden in a private struct. Instead
of adding an appropriate getter method which returns the current resolv_ctx,
resolv_init() was called a second time.

If the resolv_init() calls are removed from those two places with the help of a
getter method or similar, I think the prev and next members can be removed from
struct resolv_ctx as well, because there will not be a list of resolver
contexts, but only one.

=== How to test ===
When this feature is tested, the following scenarios can be considered:
==== AD domain has only a single site ====
* site name might be 'Default-First-Site-Name' but it can be renamed or 
localized as well
* SSSD should be able to discover the site, e.g. 'Default-First-Site-Name' 
* SSSD should connect to any DC.

==== AD domain has sites but the local site of the SSSD client has no domain controller ====
* SSSD should be able to discover the local site
* SSSD should connect to any DC

==== AD domain has sites and the local site of the SSSD client has a domain controller ====
* SSSD should be able to discover the local site
* SSSD should connect to a DC from the local site

Besides inspecting the log files at a high debug level, the connection to the
domain controller can also be verified with the netstat or ss utilities.

=== Author(s) ===
Sumit Bose <sb...@redhat.com>