Re: [caiman-discuss] Object Registration for XML Parsing in the DOC

Dermot McCluskey Mon, 31 May 2010 04:37:34 -0700

Darren,

Another advantage of option #2.3 is that it allows a clear-cut
method of ordering the registration of classes, which may be important.



I'm also wondering whether a totally dynamic option is possible, one
that doesn't require any specific registration code. For example, as in
2.4, search the Python Path for modules, load all the modules and search
their attribute dictionaries for sub-classes of DataObject, then
automatically register any such classes found.

I think Python allows you enough control of the import mechanism to
actually do this, but it would probably open up some other can of worms?
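
A rough sketch of the attribute-dictionary scan, for a single module that
has already been imported. The DataObject class here is only a stand-in,
and find_data_objects(), module1 and Class1 are hypothetical names used for
illustration; walking the Python Path and importing everything is left out.

```python
import inspect
import types


class DataObject(object):
    """Stand-in for the real base class."""


def find_data_objects(module, base=DataObject):
    """Return the sub-classes of `base` found in a module's namespace."""
    found = []
    # Sort by attribute name so the result order is predictable.
    for name, value in sorted(vars(module).items()):
        if inspect.isclass(value) and issubclass(value, base) and value is not base:
            found.append(value)
    return found


# Hypothetical module built in memory so the sketch runs on its own.
module1 = types.ModuleType("module1")


class Class1(DataObject):
    pass


module1.Class1 = Class1
module1.DataObject = DataObject  # an imported base class must not be registered
module1.helper = lambda: None    # non-class attributes are skipped

classes = find_data_objects(module1)
```

Each class returned could then be passed to register_class(). Note that a
scan like this gives no natural way to control registration order beyond
whatever order the modules happen to be visited in.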


- Dermot




On 05/28/10 17:09, Darren Kenny wrote:

Hi,

I mentioned in the DOC review thread that I'd like to start a separate
discussion about the object registration for XML parsing in the DOC.

So, here it is... ;-)

The attached HTML file outlines possible solutions here.

I'm personally in favour of the one outlined in section 2.3 (combined with 2.2).

Would greatly appreciate people's feedback since this will be added to the next
revision of the DOC Design document.

Thanks,

Darren.


------------------------------------------------------------------------


  Object Registration for XML Parsing


    1. Introduction

The purpose of this document is to outline the way that XML parsing is done using the Data Object Cache.



      1.1. XML to Object Conversion

The Data Object Cache uses the can_handle() and from_xml() methods of the DataObject class as a factory for generating a new object from a snippet of XML.


In general this is done using code like:

Example Source for DataObjectCache methods

class DataObjectCache(DataObject):

    # ...

    @classmethod
    def find_class_to_handle( cls, node ):                      <2>
        """
        Find a class that handles a node in the known_classes list.
        """
        for klass in cls.known_classes:
            if klass.can_handle( node ):
                return klass

        return None

    @classmethod
    def create_doc_from_xml( cls, parent, node ):               <1>
        # Use same parent, skip level by default
        new_parent = parent

        klass = cls.find_class_to_handle(node)
        if klass:
            obj = klass.from_xml(node)
            if obj:
                obj.parent = parent
                parent.children.append( obj )
                new_parent = obj

        for child in node.getchildren():
            cls.create_doc_from_xml( new_parent, child )

The create_doc_from_xml() method (<1>) is called recursively to traverse the XML tree, and uses the method find_class_to_handle() (<2>) to find a class that will handle a given node.

For find_class_to_handle() to work, it needs to know in advance what classes are in the system, so that it can ask each of them whether it can handle the XML or not.
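
A minimal sketch of how known_classes might be populated, assuming a
register_class() classmethod like the one called in the registration
examples in section 2. The DataObject below is only a stand-in, the Disk
class and its "disk" tag are hypothetical, and xml.etree.ElementTree is used
here purely so that the sketch runs on its own.

```python
import xml.etree.ElementTree as ET


class DataObject(object):
    """Stand-in base class; the real DataObject has more to it."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []

    @classmethod
    def can_handle(cls, node):
        # Sub-classes override this to claim the XML nodes they understand.
        return False

    @classmethod
    def from_xml(cls, node):
        return None


class DataObjectCache(DataObject):
    known_classes = []  # searched in registration order

    @classmethod
    def register_class(cls, klass):
        # Assumed behaviour: ignore repeats so re-importing a package is safe.
        if klass not in cls.known_classes:
            cls.known_classes.append(klass)

    @classmethod
    def find_class_to_handle(cls, node):
        for klass in cls.known_classes:
            if klass.can_handle(node):
                return klass
        return None


class Disk(DataObject):  # hypothetical sub-class
    @classmethod
    def can_handle(cls, node):
        return node.tag == "disk"

    @classmethod
    def from_xml(cls, node):
        return cls(node.get("name"))


DataObjectCache.register_class(Disk)
node = ET.fromstring('<disk name="c0t0d0"/>')
klass = DataObjectCache.find_class_to_handle(node)
obj = klass.from_xml(node)
```

Because find_class_to_handle() returns the first class whose can_handle()
accepts the node, the order in which classes are registered decides which
one wins when more than one could handle the same XML.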



    2. Class Registration

There are various mechanisms for knowing what classes are available for instantiation; here I will try to outline the options:



      2.1. Single Editable Registry

This is probably the most basic mechanism, but relies totally upon the list of possible classes being known at build-time.

Any new sub-class of DataObject should be registered by the developer by adding it to a file in a form like:


Example - Single File Registration - registry.py

from cache.data_object_cache import DataObjectCache

from my_new_module import MyClass   # New Class Import

DataObjectCache.register_class( MyClass ) # Register Class

*Advantages*

    *

      It’s very simple to do.

*Disadvantages*

    *

      Not extensible: it can only be updated reliably by developers
      with access to the source.


      2.2. Register By Package

This one allows each new package to have its own registration by using similar code to the /Single Editable Registry/, but on a package-by-package basis.

This allows each package to control what it registers, as opposed to a central point.


Example - Sample Directory Setup

    src
    |-- cache/
    |   |-- __init__.py
    |   |-- data_object.py
    |   `-- data_object_cache.py
    |-- package1/
    |   |-- __init__.py
    |   `-- module1.py
    |-- package2/
    |   |-- __init__.py
    |   `-- module2.py
    |-- package3/
    |   |-- __init__.py
    |   `-- module3.py
    |-- package4/
    |   |-- __init__.py
    |   `-- module4.py
    `-- package5/
        |-- __init__.py
        `-- module5.py

The __init__.py in a package would need to have code along the lines of the following to perform the registration:


Example - Package Initialisation

# Static registry of known class types...
from cache.data_object_cache import DataObjectCache
from package1.module1 import Class1

DataObjectCache.register_class(Class1)

This does the same type of registration that we saw in the /Single Editable Registry/ example above.


*Advantages*

    *

      Extensible

          o

            Each package can define what it wants to register

          o

            Not limited to packages provided from Install team.

*Disadvantages*

    *

      Requires someone (anyone) to import the package before
      ManifestParser is run

          o

            For most applications this really isn’t a problem, since
            it’s likely that some code will import it

          o

            If modules are loaded dynamically (as is thought best for
            Checkpoints), the import may not happen early enough, but
            this can be mitigated by having the class that pre-defines
            the loading parameters be the one that is registered.

          o

            Also can be a problem for the introduction of new
            checkpoints (the most likely use here is as finalizers).


      2.3. Extension To XML Manifest and Register By Package

This is pretty much the same thing as the /Register By Package/ section above, but with the addition of a section at the start of the XML Manifest that provides a package loading mechanism, which would be handled specifically by ManifestParser to trigger imports of Python packages.


Example - XML Manifest Python Package Loading

    <load_packages>
        <module package_name="package1"/>
        <module package_name="package2"/>
        <module package_name="package3"/>
    </load_packages>
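
A minimal sketch of how ManifestParser might act on this section, assuming
the section sits directly under the manifest's root element (the <manifest>
root below is an assumption). The function name load_manifest_packages() is
hypothetical, and the standard-library packages json and xml.dom stand in
for package1, package2, etc. so that the sketch runs on its own.

```python
import importlib
import xml.etree.ElementTree as ET


def load_manifest_packages(manifest_root, importer=importlib.import_module):
    """Import each package named under <load_packages>, in document order."""
    loaded = []
    section = manifest_root.find("load_packages")
    if section is None:
        return loaded  # nothing to pre-load
    for module in section.findall("module"):
        name = module.get("package_name")
        if name:
            # Importing runs the package's __init__.py, which is where
            # the package registers its classes with the DOC.
            importer(name)
            loaded.append(name)
    return loaded


manifest = ET.fromstring("""
<manifest>
    <load_packages>
        <module package_name="json"/>
        <module package_name="xml.dom"/>
    </load_packages>
</manifest>
""")
loaded = load_manifest_packages(manifest)
```

Since the packages are imported in the order they appear in the manifest,
the manifest author also controls the order in which classes are registered.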

*Advantages*

    *

      Allows for finalizers to be written in Python, and have code to
      handle special tags for it pre-loaded and thus registered with the
      DOC for completing the import of the rest of the XML into DOC.

*Disadvantages*

    *

      Requires an addition to the XML Manifest Schema


      2.4. Register By Searching Python Path

This one would require the DOC, on start-up, to recursively search the Python Path (sys.path) for packages that contain a specific signature file, which we would then execute/process to register objects with the DOC.

An example of such a signature file would be a Python module file of a specific name, e.g. __DataObjectCache__.py, which would then be looked for and, if found, executed.
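
A rough sketch of such a search, under a few assumptions: the function names
are hypothetical, the signature file is executed with runpy.run_path(), and
a plain list called registry is handed to the file as a stand-in for the
DOC. The demonstration searches a throwaway directory rather than the real
sys.path.

```python
import os
import runpy
import tempfile

SIGNATURE = "__DataObjectCache__.py"  # file name taken from the example above


def find_signature_files(search_path):
    """Yield every signature file found under the given directories."""
    for top in search_path:
        if not os.path.isdir(top):
            continue  # sys.path may hold zip files or missing directories
        for dirpath, dirnames, filenames in os.walk(top):
            dirnames.sort()  # make the traversal order deterministic
            if SIGNATURE in filenames:
                yield os.path.join(dirpath, SIGNATURE)


def run_signature_files(search_path, registry):
    """Execute each signature file, exposing `registry` to it as a global."""
    found = []
    for path in find_signature_files(search_path):
        runpy.run_path(path, init_globals={"registry": registry})
        found.append(path)
    return found


# Demonstration against a temporary directory tree.
root = tempfile.mkdtemp()
pkg = os.path.join(root, "package1")
os.mkdir(pkg)
with open(os.path.join(pkg, SIGNATURE), "w") as f:
    f.write("registry.append('package1.Class1')\n")

registry = []
found = run_signature_files([root], registry)
```

In real use the call would be run_signature_files(sys.path, ...), which
walks every directory tree on the path and executes whatever signature files
it finds there.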


*Advantages*

    *

      Very Extensible

*Disadvantages*

    *

      Possibly long start-up time when performing search.

    *

      If Python Path contains insecure directories, then there is a risk
      of malicious code being auto-loaded.

------------------------------------------------------------------------
Last updated 2010-05-28 17:04:02 IST


------------------------------------------------------------------------

_______________________________________________
caiman-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/caiman-discuss
