Darren,
Another advantage of option #2.3 is that it allows a clear-cut
method of ordering the registration of classes, which may be important.
I'm also wondering whether a totally dynamic option is possible, one
which doesn't require any specific registration code. For example, as
in 2.4, search the Python path for modules, load all the modules, search
their attribute dictionaries for sub-classes of DataObject, and
automatically register any such classes found.
I think Python allows you enough control of the import mechanism to
actually do this, but it would probably open up some other can of worms.
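As a rough illustration of the class-scanning step only, here is a minimal sketch. It assumes a stand-in DataObject base class and a hypothetical helper named find_data_object_classes(); neither is taken from the real DOC code, and module discovery and loading are left out.

```python
import inspect
import types


class DataObject(object):
    """Stand-in for the real DataObject base class (assumption)."""


def find_data_object_classes(module, base=DataObject):
    """Return sub-classes of base found in the module's attribute dictionary."""
    found = []
    for name, value in sorted(vars(module).items()):
        # Skip non-classes, unrelated classes, and the base class itself.
        if inspect.isclass(value) and issubclass(value, base) and value is not base:
            found.append(value)
    return found


# Hypothetical module built in memory, purely for illustration.
demo = types.ModuleType("demo")


class Disk(DataObject):
    pass


demo.Disk = Disk
demo.DataObject = DataObject  # an imported base class must not be registered
demo.helper = len             # non-class attributes are ignored

classes = find_data_object_classes(demo)
```

Each class returned could then be passed to DataObjectCache.register_class(). Note that this sketch orders classes by attribute name, which is not necessarily the order in which they should be consulted.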
- Dermot
On 05/28/10 17:09, Darren Kenny wrote:
Hi,
I mentioned in the DOC review thread that I'd like to start a separate
discussion about the object registration for XML parsing in the DOC.
So, here it is... ;-)
The attached HTML file outlines possible solutions here.
I'm personally in favour of the one outlined in section 2.3 (combined with 2.2).
Would greatly appreciate people's feedback since this will be added to the next
revision of the DOC Design document.
Thanks,
Darren.
------------------------------------------------------------------------
Object Registration for XML Parsing
1. Introduction
The purpose of this document is to outline the way that XML parsing is
done using the Data Object Cache.
1.1. XML to Object Conversion
The Data Object Cache uses the can_handle() and from_xml() methods of
the DataObject class as a factory for generating a new object from a
snippet of XML.
In general this is done using code like:
Example Source for DataObjectCache methods
class DataObjectCache(DataObject):
    # ...

    @classmethod
    def find_class_to_handle(cls, node):  # <2>
        """
        Find a class that handles a node in the known_classes list.
        """
        for klass in cls.known_classes:
            if klass.can_handle(node):
                return klass
        return None

    @classmethod
    def create_doc_from_xml(cls, parent, node):  # <1>
        # Use same parent, skip level by default
        new_parent = parent
        klass = cls.find_class_to_handle(node)
        if klass:
            obj = klass.from_xml(node)
            if obj:
                obj.parent = parent
                parent.children.append(obj)
                new_parent = obj
        # Recurse into the child elements of this node
        for child in node.getchildren():
            cls.create_doc_from_xml(new_parent, child)
The create_doc_from_xml() method (<1>) is called recursively to traverse
the XML tree, and uses the method find_class_to_handle() (<2>) to find a
class that will handle a given node.
For find_class_to_handle() to work, it needs to know in advance which
classes are in the system, so that it can ask each of them whether it
can handle a given XML node.
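To make the dependency on a list of known classes concrete, here is a minimal sketch of how such a registry might look. It assumes that known_classes is a plain list and that register_class() simply appends to it. The simplified DataObject and the Disk class are hypothetical stand-ins, not the real DOC classes.

```python
import xml.etree.ElementTree as ET


class DataObject(object):
    """Simplified stand-in for the real DataObject base class."""

    @classmethod
    def can_handle(cls, node):
        return False

    @classmethod
    def from_xml(cls, node):
        return None


class DataObjectCache(DataObject):
    # Classes are consulted in the order in which they were registered.
    known_classes = []

    @classmethod
    def register_class(cls, klass):
        """Add klass to known_classes, ignoring duplicate registrations."""
        if not issubclass(klass, DataObject):
            raise TypeError("%r is not a DataObject sub-class" % (klass,))
        if klass not in cls.known_classes:
            cls.known_classes.append(klass)

    @classmethod
    def find_class_to_handle(cls, node):
        for klass in cls.known_classes:
            if klass.can_handle(node):
                return klass
        return None


class Disk(DataObject):
    """Hypothetical class that handles <disk> elements."""

    @classmethod
    def can_handle(cls, node):
        return node.tag == "disk"


DataObjectCache.register_class(Disk)
handler = DataObjectCache.find_class_to_handle(ET.fromstring("<disk/>"))
unhandled = DataObjectCache.find_class_to_handle(ET.fromstring("<other/>"))
```

Because the first matching class wins, the order of registration matters whenever more than one class could handle the same node.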
2. Class Registration
There are various mechanisms for knowing what classes are available for
instantiation; here I will try to outline what options there are:
2.1. Single Editable Registry
This is probably the most basic mechanism, but relies totally upon the
list of possible classes being known at build-time.
Any new sub-class of DataObject should be registered by the developer by
adding it to a file in a form like:
Example - Single File Registration - registry.py
from cache.data_object_cache import DataObjectCache
from my_new_module import MyClass # New Class Import
DataObjectCache.register_class( MyClass ) # Register Class
*Advantages*
* It’s very simple to do.
*Disadvantages*
* Not extensible; can only be updated reliably by developers with
  access to the source.
2.2. Register By Package
This one allows each new package to have its own registration, using
code similar to the /Single Editable Registry/ example, but on a
package-by-package basis.
This allows each package to control what it registers, as opposed to
a central point.
Example - Sample Directory Setup
src
|-- cache/
|   |-- __init__.py
|   |-- data_object.py
|   `-- data_object_cache.py
|-- package1/
|   |-- __init__.py
|   `-- module1.py
|-- package2/
|   |-- __init__.py
|   `-- module2.py
|-- package3/
|   |-- __init__.py
|   `-- module3.py
|-- package4/
|   |-- __init__.py
|   `-- module4.py
`-- package5/
    |-- __init__.py
    `-- module5.py
The __init__.py in a package would need to have code along the lines of
the following to perform the registration:
Example - Package Initialisation
# Static registry of known class types...
from cache.data_object_cache import DataObjectCache
from package1.module1 import Class1
DataObjectCache.register_class(Class1)
This does the same type of registration that we saw in the /Single
Editable Registry/ example above.
*Advantages*
* Extensible
    o Each package can define what it wants to register.
    o Not limited to packages provided by the Install team.
*Disadvantages*
* Requires someone (anyone) to import the package before
  ManifestParser is run
    o For most applications this really isn’t a problem, since
      it’s likely that some code will import it.
    o If using dynamic loading of modules (as is thought best
      for Checkpoints), then the import may not happen early enough,
      but this can be mitigated by having the class that pre-defines
      the loading parameters be the one registered.
    o Also can be a problem for the introduction of new
      checkpoints (the most likely use here is as finalizers).
2.3. Extension To XML Manifest and Register By Package
This is pretty much the same thing as the /Register By Package/ section
above, but with the addition of a section at the start of the XML
Manifest providing a package loading mechanism, which would be
specifically handled by ManifestParser to trigger imports of Python
packages.
Example - XML Manifest Python Package Loading
<load_packages>
    <module package_name="package1"/>
    <module package_name="package2"/>
    <module package_name="package3"/>
</load_packages>
*Advantages*
* Allows for finalizers to be written in Python, and have code to
  handle special tags for them pre-loaded and thus registered with the
  DOC for completing the import of the rest of the XML into the DOC.
*Disadvantages*
* Requires an addition to the XML Manifest Schema
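As an illustration of how ManifestParser might act on such a section, here is a minimal sketch. The load_packages() function and the enclosing <manifest> root element are assumptions made for the example, and the standard-library json package stands in for a real package whose __init__.py performs registration.

```python
import importlib
import xml.etree.ElementTree as ET


def load_packages(manifest_root):
    """Import each package named under <load_packages>; return the names."""
    loaded = []
    for module in manifest_root.findall("load_packages/module"):
        name = module.get("package_name")
        if name:
            # Importing the package runs its __init__.py, which is where
            # the register_class() calls are expected to live.
            importlib.import_module(name)
            loaded.append(name)
    return loaded


# "json" is only a stand-in for a package that registers DOC classes.
manifest = ET.fromstring(
    "<manifest>"
    "<load_packages>"
    '<module package_name="json"/>'
    "</load_packages>"
    "</manifest>"
)
loaded = load_packages(manifest)
```

Since the elements are processed in document order, a manifest handled this way would also fix the order in which packages get to register their classes.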
2.4. Register By Searching Python Path
This one would require the DOC, on start-up, to recursively search the
Python Path (sys.path) for packages that contain a specific signature
file, which we would then execute/process to register objects with the DOC.
An example of such a signature file would be a Python module file with
a specific name, e.g. __DataObjectCache__.py, which would be looked
for and, if found, executed.
*Advantages*
* Very Extensible
*Disadvantages*
* Possibly long start-up time when performing search.
* If Python Path contains insecure directories, then there is a risk
  of malicious code being auto-loaded.
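A minimal sketch of the search step is shown below. It makes several assumptions: the find_signature_packages() helper is hypothetical, and it only inspects top-level packages in each directory rather than recursing, which keeps the example short but is narrower than the search described above.

```python
import os
import sys

SIGNATURE = "__DataObjectCache__.py"


def find_signature_packages(search_path=None):
    """Return names of top-level packages that contain the signature file."""
    found = []
    for directory in (sys.path if search_path is None else search_path):
        if not os.path.isdir(directory):
            continue  # sys.path may list zip files or missing directories
        for entry in sorted(os.listdir(directory)):
            package_dir = os.path.join(directory, entry)
            has_init = os.path.isfile(os.path.join(package_dir, "__init__.py"))
            has_signature = os.path.isfile(os.path.join(package_dir, SIGNATURE))
            if has_init and has_signature:
                found.append(entry)
    return found
```

Each package found this way could then have its signature module loaded with importlib.import_module(), provided the package's parent directory is on sys.path. The scan touches every entry of every directory on the path, which is where the start-up cost noted above would come from.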
------------------------------------------------------------------------
Last updated 2010-05-28 17:04:02 IST
------------------------------------------------------------------------
_______________________________________________
caiman-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/caiman-discuss