[389-devel] Re: Please have a look at rewriters design

thierry bordaz Wed, 18 Mar 2020 11:10:14 -0700


On 3/18/20 1:51 AM, William Brown wrote:

On 18 Mar 2020, at 04:08, thierry bordaz <[email protected]> wrote:

Hi William,

I updated the design according to our offline exchange

Thanks Thierry, I appreciate the conversation and the updates to the document: 
it made clear there were extra details up in your brain but not in words yet :) 
it's always hard to remember all the details as we write things, so thanks for 
the discussion. Like you said, it's always good to have a team who is really 
invested and cares about the work we do!


Your design for the core server version looks much better! Thank you. I still 
think there are some missing points. The reason to have a libpath rather than 
inbuild is to avoid a potential linking to sssd/samba. I think also that the 
problem space of the global catalog here needs to be looked at too. This 
feature is not in isolation, it's really a part of that.

Okay, I will work on a new PR making the core server able to retrieve/register rewriters.

I think the "need" to improve the usability of rewriters is not specific to global catalog. Global Catalog is just an opportunity to implement it. I think parts of slapi-nis, integration of vsphere, GC (and likely others) are also use cases for rewriters. They were implemented in different ways because rewriters were not easy to use or simply not known.


This means we have a whole set of deployment cases to look at.

So the deployment will look like:

IPA DS --> IPA GC


So an ipaAccount from the IPA DS instance will be "copied and transformed" into 
the IPA GC. This process is as yet undefined (it sounds like it may be offline or 
something else ...). We are simply not dealing with one instance now, but an out-of-band 
replication and transformation process. It's unclear whether the data transform is during 
this loading process, or in the IPA GC somehow.

From what I understand, it sounds like a method to take an ipaAccount and 
transform it to an AD GC account stub. Then inside of that IPA GC there are 
some virtual attributes you wish to add like objectSid binary vs string 
representations, objectCategory, maybe others.
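
To make the objectSid example concrete, here is a minimal Python sketch of converting between the binary and string forms of a SID. It assumes the standard Windows SID layout (1-byte revision, 1-byte sub-authority count, 6-byte big-endian identifier authority, then 32-bit little-endian sub-authorities). The function names are hypothetical and are not part of 389-ds or FreeIPA; this is only an illustration of the kind of value transformation being discussed, not the proposed implementation.

```python
import struct


def sid_bytes_to_str(raw: bytes) -> str:
    """Render a binary SID as its S-R-A-S1-S2-... string form."""
    revision, count = raw[0], raw[1]
    # The identifier authority is a 6-byte big-endian integer.
    authority = int.from_bytes(raw[2:8], "big")
    # Each sub-authority is a 32-bit little-endian unsigned integer.
    subs = struct.unpack("<%dI" % count, raw[8:8 + 4 * count])
    return "S-%d-%d" % (revision, authority) + "".join("-%d" % s for s in subs)


def sid_str_to_bytes(sid: str) -> bytes:
    """Parse an S-R-A-S1-S2-... string into the binary SID layout."""
    parts = sid.split("-")
    if len(parts) < 3 or parts[0] != "S":
        raise ValueError("not a SID string: %r" % sid)
    revision = int(parts[1])
    authority = int(parts[2])
    subs = [int(p) for p in parts[3:]]
    return (bytes([revision, len(subs)])
            + authority.to_bytes(6, "big")
            + struct.pack("<%dI" % len(subs), *subs))
```

The two functions are inverses of each other, so a value stored in one form can be presented in the other without losing information.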

So from our discussion, we have currently focused on "how do we transform entries 
within a single directory server". But that's not the problem here. We are saying:

"We take an entry from IPA DS, transform it to an IPA GC stub entry, and then apply a set of 
further "in memory" transformations"

One of the biggest issues with GC is schema. IPA DS and IPA GC have incompatible schemas. They cannot be in the same replication topology. So provisioning of IPA GC requires transformation rules to present another "view" of IPA DS data. Those transformations will be on the write path (i.e. stored in DB/indexed). This transformation work is almost done and is completely independent of 389-ds. All of this is "write" path: provisioning (online or offline) and transformation.

The problem for IPA GC is now on the "read" path. AD clients are used to smart shortcuts/controls that are supported by IPA GC. It is the IPA GC instance that will register the rewriters to act as GC does.
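
The distinction between the two paths can be sketched in a few lines of Python. This is only a toy model under stated assumptions: entries are plain dicts, the function names are hypothetical, and the attribute names and values are chosen for illustration rather than taken from the actual GC transformation rules or the 389-ds rewriter interface.

```python
def provision_gc_entry(ds_entry: dict) -> dict:
    """Write path: derive the GC stub once; the result is stored and indexed."""
    return {
        "cn": ds_entry["cn"],
        # Illustrative mapping only; the real rules live outside 389-ds.
        "objectSid": ds_entry["ipaNTSecurityIdentifier"],
    }


def read_with_rewriter(stored: dict, requested: list) -> dict:
    """Read path: compute extra attributes at search time; nothing is stored."""
    result = {k: v for k, v in stored.items() if k in requested}
    if "objectCategory" in requested:
        # Hypothetical computed value, added only to the returned view.
        result["objectCategory"] = "person"
    return result


ds_entry = {"cn": "alice", "uid": "alice",
            "ipaNTSecurityIdentifier": "S-1-5-21-1-2-3-1001"}
stub = provision_gc_entry(ds_entry)
view = read_with_rewriter(stub, ["cn", "objectCategory"])
```

In this model the stored stub never contains objectCategory; only the view returned to the client does, which is the property that separates a read-path rewriter from a write-path transformation.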


If that's the process, why not do all the transforms as required in the DS -> GC 
load process? You raised a critically key point - we have a concern about the write 
path as the transform point due to IO or time to do the transform, but it sounds like 
you have to do this anyway as an element of the DS -> GC process.

Some of the transformation rules, on the write path, are quite complex. Looking at slapi-nis config entries gives an idea of what is needed. In addition to those transformations, DS to GC online provisioning is not simple at all. Relying on sync-repl, you then need to transform a received entry into an update. At the moment it is an offline provisioning via transformation and import (much simpler).

To be honest I am afraid that the transform rules will result in rewriting slapi-nis.

I think every time I have spoken to you about this, I have learned more,
and the more I see, the more concerns I have about this feature. I think we do
not have the full picture. You have admitted that you don't know the full extent or ideas
here. There is clearly a communication breakdown here to our team from the IPA project,
and they aren't telling us what they want. It sounds like they are asking you to just do
"a small piece" but only they know the bigger picture.

The IPA project has the following designs:

https://www.freeipa.org/page/V4/Global_Catalog_Support

https://www.freeipa.org/page/V4/Global_Catalog_HLD

https://www.freeipa.org/page/V4/Global_Catalog_Access_Control

https://www.freeipa.org/page/V4/Global_Catalog_Data_Transformation

This also links to:

https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-server-2003/cc737410(v=ws.10)?redirectedfrom=MSDN

The freeipa design pages are extremely shallow on details. The entire section
on how they plan to get data into the GC is:

"""
Global Catalog provisioning

The data in Global Catalog is provisioned from the primary LDAP server instance
running on the same FreeIPA master. A SYNCREPL mechanism is used to retrieve
the changes and a modified slapi-nis module is used to transform FreeIPA
original data to a schema compatible with Global Catalog in Active Directory.
Unlike the original slapi-nis module, the data is stored in a proper LDAP
backend so it is persistent across the directory server restarts.
"""

You are right, I do not know the big picture. What I know is that parts of the GC needs can be solved with rewriters, which are, by the way, a supported 389-ds interface. So storing rewriters in a simple shared library rather than in plugins will help both IPA and 389-ds.


Where is the example config? Proof of concept? Even a conceptual set of 
accounts and groups showing the data transformation? How will they synthesise 
stable object data points?

The section of "data transformation" even goes to a blank page. Is the rewrite 
you are being asked to do just for objectSid once all these other transforms are done? Or 
is there more?


Honestly, it's worth reading the "how global catalog works" from msdn. Just to 
put it in contrast, that document (when converted to a pdf) is 61 pages long. Look at the 
features. Group caching, GC replication, partialAttribute replication based on schema, 
more ...


Honestly, Thierry, I trust you as a very smart and capable engineer, but you do 
not have the full picture here - none of us do. This seems like a feature that 
will explode in complexity and scale, and if not done *properly* from the 
start, may end up with many many half-baked, poorly designed solutions tacked 
together to make it look like it works. And that means we'll end up carrying 
that burden, just like slapi-nis (which is everyone's favourite plugin ...)

Again, rewriters are not new. They have been a supported interface for years. The design is just to make them simpler to develop/deploy. Looking at some plugins, I think they are related to a way to give different "views" of the same dataset. Many times, a rewriter specific to an LDAP client's needs is a good option. If GC can make use of it, great. But I am sure that others (like vsphere) will appreciate it.



I really think that right now, if the FreeIPA team wants to commit to providing 
GC support, they need to present a more robust and fully scoped design, that 
really encompasses the scale and complexity of this feature. Without them 
providing us clear, communicated designs, we are not able to actually provide 
well engineered solutions to the problems at hand and we risk another tech debt 
explosion.

Thanks,

regards
thierry

On 3/17/20 11:12 AM, thierry bordaz wrote:


On 3/17/20 2:42 AM, William Brown wrote:

On 17 Mar 2020, at 02:49, thierry bordaz <[email protected]> wrote:

Hi,

As a follow up of the PR https://pagure.io/389-ds-base/pull-request/50939,
I wrote down a small design about  rewriters (filter/computed_attr) plugin: 
http://www.port389.org/docs/389ds/design/search_rewriters.html

Comments are welcome

Probably the most dangerous thing to say in all of history?

Well, decisions are dangerous. Sharing your wise comments reduces the risk of bad 
decisions ;)
So be sure I sincerely appreciate your feedback.

Like, your design is very smart, but that cleverness and flexibility carries 
many risks. The problem at hand is rewriting ad attributes - not to make a 
framework. I still say focus on that problem alone rather than trying to solve 
a generic class of problems.

Anyway, I still don't think this is the right avenue. There are two major 
reasons for this:

First, is the attempt to make a "generic framework" to solve a "specific 
problem". We should not have a generic rewrite framework, when all we need is a specific, 
focused, module just for doing known and well tested attribute transformations.

Code like COS or MEP may be generic, and it solves many cases but the surface 
area is huge, it's hard to test, and it's hard to reason about.

We do not have a need for allowing generic, and arbitrary rewriters to exist, especially 
not when you have to "compile in" the rewriters anyway!

Rewriting attributes is not a problem; it is what LDAP clients need. But I 
agree rewriting attributes is not that easy.

Clearly we have been hitting a regular demand to rewrite attributes and 
attribute values. Many plugins (cos, mep, addn, roles, views, slapi-nis, 
filter/attribute rewriters and now AD attributes, vsphere integration) have 
been related to rewriting attributes/values. This has always been a big need. 
Many parts of those plugins are similar (finding pattern, scope, craft 
values..) but implemented in slightly different ways. Those plugins are 
generic and already let the client select, through config, the specific 
transformation they need. This design does not introduce a new generic plugin 
but just simplifies the use of already supported interfaces.

IMHO those interfaces are clever as they are flexible and open. They do not 
force rewriters to use the strict and limited abilities of plugins (like cos, 
mep,..) and let them be as complex as they need to be to match their needs.

This should be simply, an "ad rewrite" plugin, where all it does is that one thing - 
rewrite the attributes as required for AD emulation for IPA. This is far easier to deploy, test and 
reason about. Ideally, the configuration is simply "the plugin is enabled or disabled".


Second, is the idea of this being a "search rewriter". I don't think this is a 
good idea. The search path should be simple, it's our hot path. We have many things that 
have to interact like indexes etc. Look at virtual attribute indexing and such and the 
work needed for COS to have these used?

This plugin should be on the write path, transforming when a change occurs. This means 
the code is much simpler, easier to test, and we need no modifications to our read paths. 
Things like MEP and replication will "just work" as will indexing and much more.

I disagree here. Many times the write path is just not possible. Because of 
schema or historical reasons, the entries already exist and will not be updated. 
The customer just wants to see them in a transformed way. Sometimes they cannot 
even run a batch load to provision the missing attributes/values.


For me to approve this plugin, I really want to see it being a write-path 
transformation of values into other values, and it should be focused, targeted, 
and simple.

I do want to make one thing clear though - I think it's much better that this 
plugin exist in 389-ds rather than in freeipa. The 389-ds project has better 
tooling (like ASAN/LSAN), faster testing capability and a group of subject 
matter experts for code review. I think that if you were to move this to 
freeipa, you would not have the same level of testing or review quality as 
here, so I'd prefer to see you put it here. Sure, I might be difficult on this 
topic, but I do it because I believe there is a better, more robust manner to 
approach this problem space than currently you are considering. :)

I agree with you. I prefer the rewriter callback be part of 389-ds because I 
think the more rewriter samples there are, the easier it is for a developer to write their own.

best regards
thierry


Thanks,

best regards
thierry
_______________________________________________
389-devel mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/[email protected]

—
Sincerely,

William Brown

Senior Software Engineer, 389 Directory Server
SUSE Labs