(Sorry for the cross-post. I think this mail legitimately pertains to
both GeoTools and GeoAPI.)

I introduced myself to this list about three weeks ago, when I first
started working with GeoTools. Since then I have been working on a
project in which I inherited an existing code base, so I have had
reasonably broad exposure to the framework (the core framework,
anyway). I wanted to share the thoughts, impressions, and suggestions
of a novice user (and GIS outsider) with the community.

I know that GeoTools is currently undergoing a fairly major refactoring,
and that's exactly the reason why I am motivated to share these
experiences. I hope that some of what I've experienced as a novice user
might be able to influence a process that is still evolving. Were the
structure of the framework set in stone, I probably wouldn't bother.


First of all, I strongly support the move to the GeoAPI, despite the
warts that it has exposed in the combined framework in the interim
(complex nested structures of deprecated classes, and the like). I don't
know if it is the eventual goal, but the idea of a GeoAPI as the primary
focus, with GeoTools as the GeoAPI reference implementation, has strong
appeal.

In that respect, I think that there are improvements that could be made
to the GeoAPI data model that would allow it to function better as a
stand-alone API. In any Java API composed entirely of interfaces, the
primary obstacle is going to be the creation of objects. Once the object
exists, of course, the interface is no different (to the user of the
API) than any other type. Where interface-only APIs succeed and fail,
then, is in how they manage the creation of the API types.

The profusion of factories (in the GeoAPI) and factory finders (in
GeoTools) is an attempt to address this problem. Unfortunately, I think
the current model is far too complex to be usable. Additionally, it
creates a large implementation-specific footprint in code that uses the
API.

Ideally, the API should be the public face through which the user uses a
framework. The implementation of that API should remain largely hidden
from view. Of course, completely hiding it from view would be
impossible. In the end, there must be some sort of "API Binding" layer
that involves calls to actual constructors or static methods. 

I believe that an API /must/ be built with the structure of this binding
layer in mind and must give hints to potential implementors as to how
the layer should be designed. The design of this layer is the crucial
factor that determines the usability of the API. This has been done
poorly, and it has been done well. 

The core Java DOM interface, for example, uses registries and JVM-level
properties to dynamically create the various DOM types at runtime.
JVM-level properties are basically global variables, and we all learned
a long time ago that global variables are a Bad Thing. Anyone who has
ever tried to use two different packages that used different XML parsers
(and therefore overwrote each other's system properties) has felt this
pain. One package or another will try to use an implementation-specific
feature of a particular XML parser. If that package was the first to
register its favored parser, and the other package has since overwritten
the registration, then it will throw a ClassCastException at runtime.
Bugs like these can be fiendishly difficult to identify and, once
identified, still more difficult to fix (since the problem is in the
design of the framework). The typical solution is simply to stop using
the common API, forgoing all of its benefits.
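
To make the failure mode concrete, here is a minimal sketch of the
pattern in Java. It deliberately does not use the real XML parser
classes: Parser, FastParser, StrictParser, ParserRegistry, and the
"demo.parser" property are all invented for illustration, and only show
how a JVM-wide property lets one library silently change what another
library receives.

```java
// Hypothetical stand-ins; none of these names come from a real XML parser.
interface Parser {
    String parse(String input);
}

class FastParser implements Parser {
    public String parse(String input) { return "fast:" + input; }
    // Implementation-specific extra that "library A" relies on.
    public void enableTurbo() { }
}

class StrictParser implements Parser {
    public String parse(String input) { return "strict:" + input; }
}

class ParserRegistry {
    static final String KEY = "demo.parser"; // a JVM-wide (global) property

    // Each library "registers" its favorite by overwriting the property.
    static void register(Class<? extends Parser> impl) {
        System.setProperty(KEY, impl.getName());
    }

    // Instantiates whichever class name was written last.
    static Parser newParser() {
        try {
            return (Parser) Class.forName(System.getProperty(KEY))
                    .getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot create parser", e);
        }
    }
}

class GlobalPropertyDemo {
    // Returns true if library A's implementation-specific cast fails.
    static boolean libraryABreaks() {
        ParserRegistry.register(FastParser.class);   // library A registers first
        ParserRegistry.register(StrictParser.class); // library B overwrites it
        try {
            FastParser p = (FastParser) ParserRegistry.newParser();
            p.enableTurbo();
            return false;
        } catch (ClassCastException e) {
            return true; // library A was handed library B's parser
        }
    }

    public static void main(String[] args) {
        System.out.println("library A broken: " + libraryABreaks());
    }
}
```

Neither library did anything wrong locally; the failure only appears
when both are loaded into the same JVM, which is what makes it so hard
to track down.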

I have used two frameworks where this binding was done quite well: the
Jena framework (jena.sourceforge.net) and the OWL API
(owlapi.sourceforge.net). Both use an almost pure-interface approach.
Typical user code will almost never deal with an actual Java class. In
each case, the API-binding layer is a single class (ModelFactory for
Jena, OWLManager for the OWL API) that creates any number of API-level
objects. All other objects are created from this core set.

The net effect of this approach is that the footprint of the actual
implementation is extremely small. If the goal of creating a separate
API is that different implementations can be transparently swapped in
without affecting existing code, then this model serves that purpose
very well.
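
As a rough sketch of the shape I mean (with made-up names, not the
actual Jena or OWL API classes): Graph, Node, and Edge play the role of
the interface-only API, GraphManager is the single binding class, and
the Default* classes are an implementation that user code never names.

```java
import java.util.ArrayList;
import java.util.List;

// API layer: interfaces only. All names here are hypothetical.
interface Graph {
    Node createNode(String label); // further objects come from API objects
    List<Node> nodes();
}

interface Node {
    String label();
    Edge connectTo(Node target);
}

interface Edge {
    Node from();
    Node to();
}

// Binding layer: the one class that user code refers to directly.
final class GraphManager {
    private GraphManager() { }

    static Graph createGraph() {
        return new DefaultGraph();
    }
}

// Implementation layer: never referenced by user code.
class DefaultGraph implements Graph {
    private final List<Node> nodes = new ArrayList<>();

    public Node createNode(String label) {
        Node n = new DefaultNode(label);
        nodes.add(n);
        return n;
    }

    public List<Node> nodes() { return nodes; }
}

class DefaultNode implements Node {
    private final String label;
    DefaultNode(String label) { this.label = label; }
    public String label() { return label; }
    public Edge connectTo(Node target) { return new DefaultEdge(this, target); }
}

class DefaultEdge implements Edge {
    private final Node from;
    private final Node to;
    DefaultEdge(Node from, Node to) { this.from = from; this.to = to; }
    public Node from() { return from; }
    public Node to() { return to; }
}

class BindingDemo {
    public static void main(String[] args) {
        Graph g = GraphManager.createGraph(); // the only implementation-aware line
        Node a = g.createNode("a");
        Node b = g.createNode("b");
        Edge e = a.connectTo(b);
        System.out.println(e.from().label() + " -> " + e.to().label());
    }
}
```

Swapping implementations means changing what GraphManager.createGraph()
returns; nothing else in the user's code has to move.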

The lesson from these two examples is that the API Binding layer should
be as small as possible, but no smaller. The Java DOM approach of trying
to make it invisible is a non-starter, but so is creating an
implementation-specific factory for every type (or category of types) in
the API.


I don't think the single-class approach could possibly work for
GeoTools. It is many times bigger and more complex than either of the
projects mentioned above. However, some approximation of that approach,
baked into the structure and design intent of the GeoAPI, would in my
opinion be a Very Good Thing.

I am not well enough versed in either GeoTools or GIS in general to
speculate on how it might work, but I don't think that the current
structure of factories and factory finders is it. It forces the user to
constantly break out of the common API and go back to
implementation-specific classes to create the objects they want.

I will, however, take a crack at a very focused portion of the API to
illustrate what I have in mind. I think that the interfaces in the
org.opengis.feature package (and its subpackages) could benefit from
such an approach.

The AttributeTypeBuilder (which is GeoTools specific), for example, is a
confusing web of hidden functionality, where the AttributeType required
to create an AttributeDescriptor is created behind the scenes, and then
the state is quietly reset. The relationship between the AttributeType
and AttributeDescriptor is entirely hidden by this approach. Knowledge
of this relationship is clearly assumed, and a misunderstanding will
result in the misuse of this class. This could lead to some extremely
confusing bugs.
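
The kind of surprise I mean can be shown with a toy model. This is not
the GeoTools AttributeTypeBuilder and does not reproduce its methods;
ToyType, ToyDescriptor, and ToyBuilder are invented names, and the only
assumption carried over is the behavior described above (a type created
behind the scenes, followed by a quiet reset of the builder's state).

```java
// Toy model only: not GeoTools' AttributeTypeBuilder; every name is invented.
class ToyType {
    final Class<?> binding;
    ToyType(Class<?> binding) { this.binding = binding; }
}

class ToyDescriptor {
    final String name;
    final ToyType type;
    final boolean nillable;
    ToyDescriptor(String name, ToyType type, boolean nillable) {
        this.name = name;
        this.type = type;
        this.nillable = nillable;
    }
}

class ToyBuilder {
    private Class<?> binding = Object.class; // defaults
    private boolean nillable = true;

    ToyBuilder binding(Class<?> b) { this.binding = b; return this; }
    ToyBuilder nillable(boolean n) { this.nillable = n; return this; }

    ToyDescriptor buildDescriptor(String name) {
        ToyType hidden = new ToyType(binding); // type created behind the scenes
        ToyDescriptor d = new ToyDescriptor(name, hidden, nillable);
        binding = Object.class;                // state quietly reset to defaults
        nillable = true;
        return d;
    }
}

class BuilderDemo {
    public static void main(String[] args) {
        ToyBuilder builder = new ToyBuilder().binding(String.class).nillable(false);
        ToyDescriptor first = builder.buildDescriptor("name");
        // The caller expects the same settings here, but they were reset.
        ToyDescriptor second = builder.buildDescriptor("alias");
        System.out.println(first.type.binding.getSimpleName() + " " + first.nillable);
        System.out.println(second.type.binding.getSimpleName() + " " + second.nillable);
    }
}
```

In this toy, the second descriptor silently falls back to the defaults
(an Object binding, nillable), and nothing in the calling code hints
that a type object was ever involved.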

Fundamentally, the class structure should reflect the data model being
described. The current structure masks the navigation from data (the
features) to metadata (the types and descriptors) and back again. In
this case, a FeatureType represents a template to which features must
conform. Accordingly, a FeatureType object should have the ability to
create new Features of this type. Similarly, AttributeTypes should be
able to create AttributeDescriptors, which should be able to create
Attributes. Since the metadata objects are already required to create
the data objects (i.e. they are parameters to the factory methods), it
makes sense that they should simply be the factories themselves. Moving
from metadata to data essentially involves "filling in the blanks",
where the metadata object defines the blanks.

Such a structure not only makes code more readable, but helps decrease
adoption time by making the data structures being represented explicit
in the code structure. (In this case, a Feature contains Properties,
whose metadata is described by PropertyDescriptor, each of which has a
PropertyType. The Feature has a FeatureType, which defines the
PropertyDescriptors that are appropriate to it.)
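
A minimal sketch of that navigation, under stated assumptions: the
classes below borrow the names used in this message but are not the
org.opengis.feature interfaces, and the methods describe(), newProperty(),
add(), and newFeature() are hypothetical. Concrete classes are used only
to keep the example short; in a real API these would be interfaces.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical classes; not the org.opengis.feature API.
final class PropertyType {
    final Class<?> binding;
    PropertyType(Class<?> binding) { this.binding = binding; }

    // A type creates the descriptors that refer to it.
    PropertyDescriptor describe(String name) {
        return new PropertyDescriptor(name, this);
    }
}

final class PropertyDescriptor {
    final String name;
    final PropertyType type;
    PropertyDescriptor(String name, PropertyType type) {
        this.name = name;
        this.type = type;
    }

    // A descriptor creates properties: the value "fills in the blank".
    Property newProperty(Object value) {
        if (value != null && !type.binding.isInstance(value)) {
            throw new IllegalArgumentException(
                    name + " expects " + type.binding.getSimpleName());
        }
        return new Property(this, value);
    }
}

final class Property {
    final PropertyDescriptor descriptor;
    final Object value;
    Property(PropertyDescriptor descriptor, Object value) {
        this.descriptor = descriptor;
        this.value = value;
    }
}

final class FeatureType {
    final Map<String, PropertyDescriptor> descriptors = new LinkedHashMap<>();

    FeatureType add(PropertyDescriptor d) {
        descriptors.put(d.name, d);
        return this;
    }

    // The feature type is the factory for features that conform to it.
    Feature newFeature(Map<String, Object> values) {
        Feature f = new Feature(this);
        for (PropertyDescriptor d : descriptors.values()) {
            f.properties.put(d.name, d.newProperty(values.get(d.name)));
        }
        return f;
    }
}

final class Feature {
    final FeatureType type;
    final Map<String, Property> properties = new LinkedHashMap<>();
    Feature(FeatureType type) { this.type = type; }
}

class FeatureDemo {
    public static void main(String[] args) {
        PropertyType text = new PropertyType(String.class);
        FeatureType road = new FeatureType().add(text.describe("name"));

        Map<String, Object> values = new LinkedHashMap<>();
        values.put("name", "Main St");
        Feature f = road.newFeature(values);

        // Data back to metadata: feature -> property -> descriptor -> type.
        Property p = f.properties.get("name");
        System.out.println(p.descriptor.name + " = " + p.value
                + " (" + p.descriptor.type.binding.getSimpleName() + ")");
    }
}
```

Every object here is reachable from the one above it, so the code reads
the same way the data model does, and no separate factory class appears
in user code.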

I've noticed that a similar approach was followed in some of the legacy
API, but it appears that this approach is being abandoned. I strongly
urge you to reconsider.

I hope that's clear. I'm inclined to elaborate further, but this message
is already quite long.


Also, I hope that you don't think me presumptuous as a GIS and GeoTools
novice. But the truth is that I really like GeoTools and GeoAPI. I think
it's a great project with a lot of potential. Thank you for your time.


Thanks,



Tim Swanson 
Software Engineer


Tyler Technologies, Inc.
14142 Denver West Parkway, Suite 155
Lakewood, CO 80401
Fax: 303-271-1930
E-mail: [EMAIL PROTECTED]
Web: www.tylertech.com

_______________________________________________
Geotools-gt2-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/geotools-gt2-users
