More inline...
Alex Karasulu wrote:
Hi Emmanuel,
On Mon, Nov 23, 2009 at 3:31 AM, Emmanuel Lecharny <[email protected]> wrote:
Hi,
When we modify the schema, for instance by adding a new AT, we apply the
modification to a cloned Registries so that we do not break the current
schema. So far, so good. If the cloned registries are still consistent with
the newly added SchemaObject, we swap the registries.
Now, this creates a dull issue: all the ServerEntry instances still point
to the previous registries, and as we clear the previous registries to avoid
memory leaks, all those instances now point to non-existent SchemaObjects
(more specifically, each instance has attributes which hold a reference
to the AttributeType SchemaObject they are associated with)!
We cannot hold references to AttributeType or other schema objects anymore.
Instead we need to perform a lookup every time to get these schema objects
from the registries via the schema manager. This way, once the swap has
occurred, every lookup returns the new objects. No member references should
be made to schema objects any longer.
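A rough sketch of that lookup-on-every-access idea, assuming the attribute keeps only the OID and asks the schema manager each time. The names (LookupSchemaManager, LookupEntryAttribute, lookupAttributeType, swap) are made up for the example and are not the real API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins for illustration only; not the ApacheDS classes.
final class LookupAttributeType {
    final String oid;
    final String name;
    LookupAttributeType(String oid, String name) { this.oid = oid; this.name = name; }
}

final class LookupSchemaManager {
    // volatile so that a swap becomes visible to every request thread
    private volatile Map<String, LookupAttributeType> attributeTypes = new ConcurrentHashMap<>();

    LookupAttributeType lookupAttributeType(String oid) { return attributeTypes.get(oid); }

    // Replaces the whole registry in one step, as the swap would.
    void swap(Map<String, LookupAttributeType> replacement) {
        attributeTypes = new ConcurrentHashMap<>(replacement);
    }
}

final class LookupEntryAttribute {
    private final String attributeTypeOid;          // only the OID is stored
    private final LookupSchemaManager schemaManager;

    LookupEntryAttribute(String attributeTypeOid, LookupSchemaManager schemaManager) {
        this.attributeTypeOid = attributeTypeOid;
        this.schemaManager = schemaManager;
    }

    // Resolved on every call, so a swap is picked up automatically,
    // at the price of one map lookup per access.
    LookupAttributeType getAttributeType() {
        return schemaManager.lookupAttributeType(attributeTypeOid);
    }
}
```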
Question : how can we avoid this problem ?
See above.
I discussed this point this afternoon with Pierre Arnaud, but I did not
want to decide anything before updating the wiki with a page describing
the current data structure and all the interactions between objects.
The page is
http://directory.apache.org/apacheds/1.5/apache-ds-schemamanager.html
The big problem with this approach (which is what was implemented
before) is that each lookup is costly, and we do a lot of them while
processing a request.
Having the AT stored in the AttributeEntry solves this problem, as we
have direct access to the data: we just do a single lookup at the
beginning.
My idea is to apply the modification to the initial registries once it has
been proven that the alteration leaves the Registries consistent.
We would then just clear the cloned registries, and not swap anymore.
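To make the proposal concrete, here is a small sketch of "check on a clone, then apply to the original". The names (InPlaceRegistries, InPlaceSchemaManager) and the toy consistency rule (a declared superior must already be registered) are assumptions chosen for the example, not the real checks.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; not the ApacheDS classes.
final class InPlaceAttributeType {
    final String oid;
    final String superiorOid; // may be null
    InPlaceAttributeType(String oid, String superiorOid) {
        this.oid = oid;
        this.superiorOid = superiorOid;
    }
}

final class InPlaceRegistries {
    private final Map<String, InPlaceAttributeType> attributeTypes = new HashMap<>();

    InPlaceAttributeType lookup(String oid) { return attributeTypes.get(oid); }

    void register(InPlaceAttributeType at) { attributeTypes.put(at.oid, at); }

    // Shallow copy, used only as a scratch area for the consistency check.
    InPlaceRegistries copy() {
        InPlaceRegistries clone = new InPlaceRegistries();
        clone.attributeTypes.putAll(attributeTypes);
        return clone;
    }

    // Toy rule (an assumption): every declared superior must be registered.
    boolean isConsistent() {
        for (InPlaceAttributeType at : attributeTypes.values()) {
            if (at.superiorOid != null && !attributeTypes.containsKey(at.superiorOid)) {
                return false;
            }
        }
        return true;
    }

    void clear() { attributeTypes.clear(); }
}

final class InPlaceSchemaManager {
    private final InPlaceRegistries live = new InPlaceRegistries();

    InPlaceRegistries registries() { return live; }

    synchronized boolean add(InPlaceAttributeType at) {
        InPlaceRegistries trial = live.copy(); // try the change on a clone first
        trial.register(at);
        boolean ok = trial.isConsistent();
        trial.clear();                         // the clone is thrown away in both cases
        if (ok) {
            live.register(at);                 // no swap: existing objects keep their identity
        }
        return ok;
    }
}
```

With this variant, references held by existing entries stay valid after an add, because the live registries object and its schema objects are never replaced.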
I think this is not a good idea. The swap is a clean approach that does the
job just fine. Otherwise if there are sticking points in the registries we
will have problems. This is too much to do to solve this issue. Instead we
need dynamic lookups of all the schema objects, or we need an update pattern
that notifies holders of schema changes so they can swap out these objects.
I am thinking of the Observer or Listener notification patterns here.
Otherwise we will have to use dynamic lookups every time, which can cost us a
lot.
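For reference, the Listener variant could look roughly like the sketch below. The interface and class names (SchemaChangeListener, NotifyingSchemaManager, NotifEntryAttribute) are invented for the example, and the sketch assumes the schema manager can tell which old object each new object replaces.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical stand-ins for illustration only; not the ApacheDS classes.
final class NotifAttributeType {
    final String oid;
    NotifAttributeType(String oid) { this.oid = oid; }
}

interface SchemaChangeListener {
    void attributeTypeReplaced(NotifAttributeType oldType, NotifAttributeType newType);
}

final class NotifyingSchemaManager {
    private final List<SchemaChangeListener> listeners = new CopyOnWriteArrayList<>();

    void addListener(SchemaChangeListener listener) { listeners.add(listener); }

    void removeListener(SchemaChangeListener listener) { listeners.remove(listener); }

    // Called once per replaced object when the registries are swapped.
    void replace(NotifAttributeType oldType, NotifAttributeType newType) {
        for (SchemaChangeListener listener : listeners) {
            listener.attributeTypeReplaced(oldType, newType);
        }
    }
}

final class NotifEntryAttribute implements SchemaChangeListener {
    // The reference is cached, so normal access costs no lookup.
    private volatile NotifAttributeType attributeType;

    NotifEntryAttribute(NotifAttributeType attributeType) { this.attributeType = attributeType; }

    NotifAttributeType getAttributeType() { return attributeType; }

    @Override
    public void attributeTypeReplaced(NotifAttributeType oldType, NotifAttributeType newType) {
        if (attributeType == oldType) {
            attributeType = newType; // swap out the stale reference
        }
    }
}
```

Note that in this sketch every attribute has to register itself and be removed again when the entry is discarded, otherwise the listener list keeps it alive.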
We discussed this point with Alex extensively lately. This Observer/Listener pattern sounds good, but IMO, it does not solve the problem.
1) We have to deal with all the ServerEntries currently being processed, and maybe being modified. A ServerEntry is not an atomic object, and we don't want to deal with the extra complexity of an event occurring while we are in the middle of a modification of this ServerEntry.
2) Many ServerEntries might be stuck in memory, waiting for a thread to be free. This is the case for every partial ServerEntry waiting for some more bytes from the client.
3) We have many places where we store cached ServerEntries. The question is how do we update them ?
4) What do we do with the original entry, which is stored in each OpContext,
as we may need it later? Do we update it too? IMO, that would defeat the
purpose of this object.
5) ServerEntries are serialized in the Backend, and if we modify the schema, it
will most certainly impact them. If they are not migrated, they might not be
usable anymore after the schema modification. Also, if we have millions of
entries, changing them online is probably not realistic. In any case, the admin
has to deal with this problem.
I don't see how we can possibly deal with a live schema modification, except
for a few modifications:
- AT, OC, S, MR, C, N and SC, and only for Add or Move operations
- schema enabling
Any other operation (delete, modify, rename, disabling a schema) will most
certainly lead to dire errors, something an administrator will not want to
experience in production. IMO, they should be forbidden on a working base. Such
an operation is like manipulating a loaded weapon with no safety...
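The restriction above boils down to a very small policy check. The sketch below is only an illustration: the enum and method names are invented, and the expansion of the abbreviations (AT = AttributeType, OC = ObjectClass, S = Syntax, MR = MatchingRule, C = Comparator, N = Normalizer, SC = SyntaxChecker) is my reading of the list.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical names for illustration only; not an existing ApacheDS API.
enum SchemaObjectKind {
    ATTRIBUTE_TYPE, OBJECT_CLASS, SYNTAX, MATCHING_RULE,
    COMPARATOR, NORMALIZER, SYNTAX_CHECKER
}

enum SchemaOperation { ADD, MOVE, DELETE, MODIFY, RENAME }

final class LiveSchemaPolicy {
    // Only additive operations are accepted on a running server.
    private static final Set<SchemaOperation> ALLOWED_LIVE =
        EnumSet.of(SchemaOperation.ADD, SchemaOperation.MOVE);

    private LiveSchemaPolicy() { }

    // Every listed kind is treated the same way; only the operation matters.
    static boolean isAllowedLive(SchemaObjectKind kind, SchemaOperation operation) {
        return kind != null && ALLOWED_LIVE.contains(operation);
    }

    // Enabling a schema is accepted live, disabling one is not.
    static boolean isSchemaToggleAllowedLive(boolean enable) {
        return enable;
    }
}
```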
Does it sound good to you, or do you have any better idea ?
No, I think we either go with expensive dynamic lookups all over the place or
we utilize an update notification mechanism via the Observer or Listener
patterns.
Does this sound like a viable alternative?
For the reasons I mentioned, I don't think that either alternative is OK.
We probably don't have a perfect solution because there is none. As we
say: "any problem vanishes when there is no solution"...
More seriously, I don't think we need a dynamic schemaManager for an LDAP
server in production: admins don't change such a critical thing in
production, except those who are insane or desperate. We must accept
the idea that we might have some downtime, we just have to minimize it.
Unless someone has a genius idea!
Regards,
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org