Alex Karasulu wrote:
Short abstract:
So we should consider that checking for H/R is mandatory when adding a
new value, but no more than that.
Some more elements about the other parts of your mail:
We can let the Schema interceptor deal with normalization and syntax
checking, instead of asking the EntryAttribute to do the checking. That
means we _must_ put this interceptor very high in the chain.
Right now I think this is split into two interceptors. The first one
which is executed immediately is the Normalization interceptor. It's
really an extension of the schema subsystem. Normalization cannot
occur without schema information and the process of normalization
automatically enforces value syntax. This is because, in order to
normalize, most parsers embedded in a normalizer must validate the
syntax while transforming the value into a canonical representation
using string prep rules.
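To make that concrete, here is a tiny, self-contained sketch (the class
and method names are invented for the example, this is not the actual
Normalizer API) of a normalizer whose canonicalization step doubles as a
syntax check:

// Illustrative only: a normalizer for a case-insensitive directory string.
// Producing the canonical form forces the value through a parse/validate
// step, so a value with an invalid syntax is rejected as a side effect.
public final class CaseIgnoreStringNormalizer
{
    public String normalize( String value )
    {
        if ( value == null )
        {
            // A value we cannot parse has no canonical form: syntax violation
            throw new IllegalArgumentException( "null value has no canonical form" );
        }

        // Very rough stand-in for the string prep rules:
        // trim, collapse inner whitespace, map to lower case
        String canonical = value.trim().replaceAll( "\\s+", " " ).toLowerCase();

        if ( canonical.isEmpty() )
        {
            throw new IllegalArgumentException( "empty value violates the syntax" );
        }

        return canonical;
    }
}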
Those two guys work hand in hand... If we consider the Normalization
Interceptor alone, it is a pretty specific animal, yes. It is run as
soon as possible, to be sure that the elements sent by the client are in
good shape (i.e., comparable). But we may have to normalize values later
too: while searching for a value in an attribute, or when adding some
new attributes through any inner mechanism (a trigger, for instance)...
The big difference that has evolved between the Normalization
interceptor and the Schema interceptor is that the Normalization
interceptor is not designed to fully check schema. It does *ONLY*
what it needs to do to evaluate the validity of a request against the
DIT. For example, the DN and the filter expression are normalized early
to determine if we can short-circuit this process with a rapid return.
This reduces latency and weeds out most incorrect requests. Now, with
normalized parameters, the Exception interceptor can more accurately do
its work to determine whether or not the request makes sense: i.e., does
the entry that is being deleted actually exist? Then the request goes
deeper into the interceptor chain for further processing. The key
concept in terms of normalization and schema checking is lazy execution.
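A rough, self-contained illustration of that "normalize only what you
need, then fail fast" flow (all the names below are made up for the
example, not the real interceptor API):

import java.util.Set;

// Sketch: normalize only the DN, check existence, and reject the
// request before it travels deeper into the chain.
public final class FailFastDelete
{
    private final Set<String> existingNormalizedDns;

    public FailFastDelete( Set<String> existingNormalizedDns )
    {
        this.existingNormalizedDns = existingNormalizedDns;
    }

    // Very crude DN normalization, just enough for the example
    private static String normalizeDn( String dn )
    {
        return dn.toLowerCase().replaceAll( "\\s*,\\s*", "," ).trim();
    }

    public void delete( String requestedDn )
    {
        String normalized = normalizeDn( requestedDn );

        if ( !existingNormalizedDns.contains( normalized ) )
        {
            // Fail fast: the existence check weeds the request out
            // with a rapid return
            throw new IllegalStateException( "No such entry: " + normalized );
        }

        // ... otherwise hand the normalized request to the next interceptor ...
    }
}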
Yes, but failing fast is also a good thing to have.
Lazy execution makes sense most of the time, but from the many
conversations we've had it seems this might actually be harming us,
since we're doing many of the same computations over and over again
while discarding the results, especially where normalization is
concerned.
So true... At some point, we might want to keep the UP form and the
normalized form for values, as we do for DNs. It will cost some more
memory, but:
1) entries are transient, and can be discarded at will,
2) now that we will have StreamedValue, this won't be a big issue anymore,
3) normalizing values over and over may cost much more than storing
twice the size of the data (in the worst case),
4) very often, UP value == normalized value, so we have an easy way to
avoid doubled memory consumption (sketched below).
This needs to be discussed further, in another thread...
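Just to sketch point 4 (the class is invented, not the actual Value
implementation):

// A value keeping both its user-provided (UP) form and its normalized
// form. When the two are equal, the same String reference is reused,
// so memory is not doubled in the common case.
public final class DualFormValue
{
    private final String upValue;
    private final String normValue;

    public DualFormValue( String upValue, String normValue )
    {
        this.upValue = upValue;
        this.normValue = upValue.equals( normValue ) ? upValue : normValue;
    }

    public String getUpValue()
    {
        return upValue;
    }

    public String getNormValue()
    {
        return normValue;
    }
}

Normalizing once, at creation time, and keeping the result would also
avoid recomputing the normalized form again and again.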
Here are the possible checks we can have on a value for an attribute:
H/R: could be done when creating the attribute or adding some value into it
Yes, this will have to happen very early, within the codec I guess, right?
Yes. We will build the ServerEntry objects in the codec, as we are doing
for DNs at the moment. That means we will need access to the registries
in the codec.
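Something along these lines, just to illustrate the H/R check when a
value is added (the map below is a stand-in for the real registries, and
every name is invented):

import java.nio.charset.StandardCharsets;
import java.util.Map;

// When a value is added, look up the attribute's H/R flag and decide
// whether the value is kept as a String (human readable) or as raw bytes.
public final class HumanReadableCheck
{
    private final Map<String, Boolean> humanReadableByAttribute;

    public HumanReadableCheck( Map<String, Boolean> humanReadableByAttribute )
    {
        this.humanReadableByAttribute = humanReadableByAttribute;
    }

    public Object addValue( String attributeId, byte[] rawValue )
    {
        Boolean humanReadable = humanReadableByAttribute.get( attributeId.toLowerCase() );

        if ( humanReadable == null )
        {
            // Unknown attribute: in a real server this would be a schema violation
            throw new IllegalArgumentException( "Unknown attribute: " + attributeId );
        }

        return humanReadable
            ? new String( rawValue, StandardCharsets.UTF_8 )
            : rawValue;
    }
}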
Syntax checking: SchemaInterceptor
Normalization: SchemaInterceptor
Right now, request parameters are normalized within the Normalization
interceptor, and these other aspects (items) are handled in the Schema
interceptor.
<snip/>
It brings to my mind another concern: let's think about what could
happen if we change the schema. We would have to update all the existing
Attributes, which is simply not possible. Thus, storing the
AttributeType within the EntryAttribute does not sound good anymore
(unless we kill all the current requests before we change the schema).
It would be better to store an accessor to the schema subsystem, no?
This is a big concern. For this reason I prefer holding references to
high level service objects which can swap out things like registries
when the schema changes. This is especially important within services
and interceptors that depend in particular on the schema service. I
would rather spend an extra cycle doing more lookups with lazy
resolution, which leads to a more dynamic architecture. Changes to
components are reflected immediately this way and have little impact in
terms of leaving stale objects around which may present problems and
need to be cleaned up.
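For instance, something like this (only a sketch with invented names,
not the actual registries API):

import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Components hold an accessor to the registries instead of the
// registries (or an AttributeType) themselves. A schema change swaps
// the reference in one place; every later lookup sees the new
// registries, and no stale AttributeType is kept around.
public final class RegistriesHolder<R>
{
    private final AtomicReference<R> current;

    public RegistriesHolder( R initial )
    {
        this.current = new AtomicReference<>( initial );
    }

    // Called by the schema subsystem when the schema changes
    public void swap( R newRegistries )
    {
        current.set( newRegistries );
    }

    // What services, interceptors and entries would hold on to
    public Supplier<R> accessor()
    {
        return current::get;
    }
}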
You are right. I was overlooking this part. We should simply consider
that if the schema changes, then we must 'reboot' the server. At least,
it will work in any case. Schema updates are not really meant to be done
often (we are not designing AD, are we? ;).
The fact is that if we need to keep the server up and running even when
we need to change the schema, then it's a little bit more complex than
simply interacting with the loaded values in the request being processed.
However on the flip side there's a line we need to draw. Where we
draw this line will determine the level of isolation we want. Let me
draw out a couple of specific scenarios to clarify.
Scenario 1
========
A client binds to the server and pulls the schema at version 1, then
before issuing an add operation for a specific objectClass the schema
changes and one of the objectClasses in the entry to be added is no
longer present. The request will fail, and should, since the schema
changed. Incidentally, a smart client should check the
subschemaSubentry timestamps before issuing write operations to see if
it needs to check for schema changes that would make the request invalid.
That won't be enough. Here, we need a kind of two-phase commit, as we
are modifying two sources of data at the same time. Not very simple to
handle. We should also consider that we may have concurrent requests on
the same data...
Scenario 2
========
A client binds to the server and pulls the schema at version 1, then
issues an add request; as the add request is being processed by the
server, the schema changes and one of the objectClasses in the entry to
be added is no longer present.
Scenario 1 is pretty clear and easy to handle. It will be handled
automatically for us anyway, without having to explicitly code the
correct behavior. Scenario 2 is a bit tricky. First of all we have to
determine the correct behavior that needs to be exhibited. Before
confirming with the specifications (which we need to do), my suspicions
would incline me to think that this add request should be allowed, since
it was issued and received before the schema change was committed. In
this case it's OK for the add request to contain handles on schema data
which might be old but consistent with the time at which that request
was issued.
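One possible way to express that per-request view (again only a sketch,
assuming some accessor-style handle to the registries; none of these
names are real):

import java.util.function.Supplier;

// A request resolves the schema once, when it is received, and uses
// that snapshot for the rest of its processing, even if the live schema
// is swapped in the meantime.
public final class RequestSchemaView<R>
{
    private final R schemaSnapshot;

    public RequestSchemaView( Supplier<R> registriesAccessor )
    {
        // Captured at the time the request is issued/received
        this.schemaSnapshot = registriesAccessor.get();
    }

    public R schema()
    {
        return schemaSnapshot;
    }
}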
So to conclude, I think it's OK, preferred and efficient for request
parameters and intermediate derived data structures used to evaluate
requests to have and leverage schema information that is not necessarily
up to date with the last schema change. This brings up a slew of other
problems we have to tackle, btw, but we can talk about this in another
thread.
Oh, yeah... No need to stop and think right now, as the current server
does not handle those problems anyway. First, we have to 'clean' the
Entry code :)
<snipped the rest of the convo, it will bring us far away from my
initial short Q ;) />
--
cordialement, regards,
Emmanuel Lécharny
www.iktek.com
directory.apache.org