On 30/11/09 11:15, Emmanuel Lecharny wrote:
Hi,

while reviewing the proposed names, I find Schema a bit problematic
when it comes to cover all the schemaObjects as a whole. Openldap and
ADS can use more than one schema (typically, core, system, ...). At
this point, naming the system in charge of all the schemas something
like 'SchemaHandler' or 'SchemaManager' would be better...

thoughts ?


Our SDK prototype supports multiple schemas, each of which is an instance of "Schema". Since a Schema is a large, heavily used, but relatively static object, we have made the class immutable. To create a new Schema we provide a SchemaBuilder class with methods like "addAttributeType", etc. Once the schema has been defined, the application converts the builder to a Schema instance as follows:

   SchemaBuilder builder = new SchemaBuilder();
   builder.addAttributeType(...);
   builder.addObjectClass(...);
   ...

   Schema schema = builder.toSchema();

We also provide a version of the toSchema() method which supports storage of error messages (e.g. missing elements):

   List<String> messages = new LinkedList<String>();
   Schema schema = builder.toSchema(messages);
   for (String message : messages) {
      // Display errors to user
   }
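
To make the builder idea concrete, here is a minimal sketch of how an immutable Schema and a SchemaBuilder that collects warnings could fit together. It is not the prototype's code: the string-based method signatures, the hasAttributeType/hasObjectClass accessors, the fluent return values, and the wording of the warning message are all assumptions made for illustration.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, heavily simplified stand-in for the SDK's Schema class.
final class Schema {
    private final Set<String> attributeTypes;
    private final Map<String, List<String>> objectClasses;

    Schema(Set<String> attributeTypes, Map<String, List<String>> objectClasses) {
        // Defensive copies: later changes to the builder cannot leak in.
        this.attributeTypes =
            Collections.unmodifiableSet(new LinkedHashSet<>(attributeTypes));
        this.objectClasses =
            Collections.unmodifiableMap(new LinkedHashMap<>(objectClasses));
    }

    boolean hasAttributeType(String name) { return attributeTypes.contains(name); }
    boolean hasObjectClass(String name) { return objectClasses.containsKey(name); }
}

// Hypothetical builder; the real one would take full schema definitions.
final class SchemaBuilder {
    private final Set<String> attributeTypes = new LinkedHashSet<>();
    private final Map<String, List<String>> objectClasses = new LinkedHashMap<>();

    SchemaBuilder addAttributeType(String name) {
        attributeTypes.add(name);
        return this;
    }

    SchemaBuilder addObjectClass(String name, String... attributes) {
        objectClasses.put(name, List.of(attributes)); // immutable value list
        return this;
    }

    // Always returns a schema; problems are reported, not thrown.
    Schema toSchema(List<String> messages) {
        for (Map.Entry<String, List<String>> oc : objectClasses.entrySet()) {
            for (String attr : oc.getValue()) {
                if (!attributeTypes.contains(attr)) {
                    messages.add("Object class " + oc.getKey()
                        + " references undefined attribute type " + attr);
                }
            }
        }
        return new Schema(attributeTypes, objectClasses);
    }
}
```

The point of the sketch is that toSchema(messages) never fails on an incomplete definition: the caller gets a usable Schema plus a list of problems to show or ignore.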


Why do we need to handle errors like this? Because schema definitions may be incomplete, and we don't want that to prevent a client from proceeding or from fixing the broken schema. In particular, RFC 4512, section 4.4, states:

       Clients SHOULD NOT assume that a published subschema is complete,
       that the server supports all of the schema elements it publishes, or
       that the server does not support an unpublished element.

So far in our prototype SDK we haven't needed a SchemaManager. We provide access to the singleton Schema objects directly from the Schema class (e.g. Schema.getCoreSchema(), Schema.getDefaultSchema(), etc.).

For other schemas (e.g. a schema retrieved from a server) we were going to provide factory methods in the Connection/AsynchronousConnection interfaces (we don't actually have these yet; I was going to add them today). Here's the idea (we'd have async versions in our AsynchronousConnection interface):

   public interface Connection {

      // Common LDAP operations.
      Result add(AddRequest request) throws ErrorResultException;

      ...

      // Special LDAP operations.
      Schema getSchemaForEntry(String dn, List<String> errors)
            throws ErrorResultException;

      Schema getSchemaForEntry(DN dn, List<String> errors)
            throws ErrorResultException;

      RootDSE getRootDSE() throws ErrorResultException;
   }


This would allow the Connection implementation to manage its schemas. For example, implementations may choose to look up schemas on demand and/or cache them. Certain implementations may only support a single schema.

Something we'd like to support is an "LDIF connection": a virtual LDAP connection to an LDIF file. Basically, we read an LDIF file into memory and allow client apps to "connect" to it and perform normal LDAP operations against it. The LDIF connection would have a faked-up RootDSE and either a faked-up schema or one provided when the LDIF connection is constructed. LDIF connections would be great for offline upgrades of the server: the upgrade tool can load the server's config.ldif into memory and modify it without needing any of the server infrastructure to be available (thus avoiding "chicken and egg" problems).
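
As an illustration of the "look up on demand and/or cache" option, a Connection implementation could keep a small cache keyed by the DN of the subschema subentry. This is only a sketch under assumptions: SchemaCache is a hypothetical helper, the loader function stands in for whatever search the connection would perform, and the key is the DN string as given (a real implementation would need proper DN normalization).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper: caches one schema object per subschema subentry DN.
// S would be the SDK's Schema type; it is generic here to stay self-contained.
final class SchemaCache<S> {
    private final Map<String, S> bySubentryDn = new ConcurrentHashMap<>();
    private final Function<String, S> loader;

    // The loader stands in for "read the subschema subentry from the server".
    SchemaCache(Function<String, S> loader) {
        this.loader = loader;
    }

    S get(String subschemaSubentryDn) {
        // Assumption: the raw DN string is a good enough key for a sketch.
        // Equivalent DNs spelled differently would be loaded twice.
        return bySubentryDn.computeIfAbsent(subschemaSubentryDn, loader);
    }
}
```

A single-schema implementation such as the LDIF connection would not need the cache at all; it could simply return the schema it was constructed with.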

One problem with multiple schema support is decoding search result entries. We should decode the search result entry using the schema identified in its subschemaSubentry operational attribute. However, there are various problems:

   * The subschemaSubentry attribute may not be supported by the
     server, it may be filtered by the server (e.g. access control), or
     it may not have been requested by the client. In the latter case,
     we could add the attribute to the list of attributes requested in
     every search request but that's a bit naughty in my opinion.

   * The subschemaSubentry attribute could occur anywhere in the list
     of attributes returned in the entry requiring two passes in order
     to decode the returned entry :-( This is not very efficient.
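
The two-pass problem can be sketched as follows: the first pass scans the raw attributes only to find subschemaSubentry, and only then can the second pass decode the values with the chosen schema. RawAttribute and SchemaSelector are hypothetical names, and falling back to a default schema DN when the attribute is absent is an assumption, not something the prototype does today.

```java
import java.util.List;

// Hypothetical holder for an attribute as it arrives off the wire,
// before any schema-aware decoding has taken place.
final class RawAttribute {
    final String name;
    final List<String> values;

    RawAttribute(String name, List<String> values) {
        this.name = name;
        this.values = values;
    }
}

final class SchemaSelector {
    // First pass: find the DN of the schema governing this entry.
    // The attribute may appear at any position, or not at all.
    static String selectSchemaDn(List<RawAttribute> attributes, String defaultDn) {
        for (RawAttribute attribute : attributes) {
            // Attribute descriptions are compared case-insensitively.
            if (attribute.name.equalsIgnoreCase("subschemaSubentry")
                    && !attribute.values.isEmpty()) {
                return attribute.values.get(0); // the attribute is single-valued
            }
        }
        return defaultDn; // absent, filtered, or not requested
    }
}
```

The second pass would then walk the same list again and decode each attribute against the schema named by the returned DN, which is where the extra cost comes from.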

I haven't quite figured out the best approach for schema discovery so far - certainly using the subschemaSubentry operational attribute seems quite cumbersome. We could implement LDAP connections such that they *optionally* perform a search for all subschema sub-entries in the server immediately after bind.

I think schema support, and especially multiple schema support, is hard. Even decoding a DN becomes complicated because individual RDN components may be associated with different schemas! Yuck! :-(

Matt
