(I wrote this back in February, and got no feedback on it. If you
have some feedback, now's the time... ;)
-jdb
This is an early sketch for how I might propose we re-factor the
Xerces x-platform support for "3.0".
The Current Way
=============
There are five main locations of platform specific code in Xerces:
(1) util/Platforms
(2) util/Compilers
(3) util/Transcoders
(4) util/NetAccessors
(5) util/MsgLoaders
Compilers, Transcoders, NetAccessors, and MsgLoaders are built to
support particular functionality that may or may not be available on
a given compiler/platform, and are selected at compile time or
runtime. Each of these holds fairly specific functionality that is
narrowly focused, and is pretty clearly either present, or not.
In Platforms, a new subclass of Platform is created for each port.
Functionality that must be implemented by the platform class includes:
- Factory for NetAccessor, MsgLoader, and Transcoder
- Panic handling
- File access
- Path and Directory handling
- isAnySlash
- Standard Input
- Timing
- Atomic Ops
- Mutexes
The Problems
===========
In general, the current approach works well, I believe, for
Transcoders, NetAccessors, and MsgLoaders. As I said above, these are
generally either simply available or not in a given configuration.
And they're pretty well focused on doing one thing.
There are two areas where the Transcoders might be improved:
- In general, about half of the transcoder code is taken up by
supporting the LCP functions, which are disjoint from the standard
virtual transcoder interface. As I proved recently for the Mac
transcoder, the LCP transcoder can be written in terms of the virtual
transcoder; doing so would reduce the amount of code needed within
any given transcoder implementation, and reduce its complexity.
- There looks to be some redundancy in the (six!) iconv-based
transcoders, and perhaps the Cygwin/Win32 transcoders, but I'm not
familiar enough with them yet to understand the reasons for this, or
whether/how they might be collapsed. Any input on this from folks
would be interesting.
The Platform class is where most of the issues arise. Because the
class is monolithic, you pretty much have to accept it in its
entirety or not at all, which means that (a) most new ports need to
write a new Platform class, (b) doing so is quite a bit of work, and
(c) this leads to a lot of redundancy between ports. Not to mention
(d) that this architecture doesn't lend itself to the autoconf way of
selecting functionality in a very fine-grained manner.
The (Potential) Solutions
===================
Given some of my goals for "3.0" of enabling configuration based on
autoconf and making it far easier to port Xerces to new platforms,
while also trimming redundancy and code bloat, I've started to
formulate some ideas on how to re-factor some of this code. I will
present those here, and ask for any feedback people have.
The major thing that needs to be done is to break down the monolithic
Platform class. We can perhaps leave the interface as it is, but we
need to gut it underneath in order to allow ad-hoc selection of bits
of functionality therein.
* Create a new "unified" platform subclass that:
- Has factories for NetAccessor, Transcoder, and MsgLoader. They
will make their selection based on preprocessor defines, probably.
- Relies on new factories for new Mutex, File, Timer, and
AtomicOp classes. The timer functionality may not need to be
abstracted to this degree, and could be done with defines if we're lazy.
- Has a panic method that calls abort(). Maybe also have a
pluggable panic function for apps that need to do something else.
- The isAnySlash problem should be solved by defines that
specify which of the path separators are used, or by having the File
class provide a string of path separators. I _think_ I prefer the
former. BTW: it looks to me like isAnySlash on most platforms accepts
either / or \. This doesn't look right to me: \ isn't a path
separator on most POSIX systems, is it...? Another way to handle this
might be to use platform specific code in the File class to do the
path weaving/manipulation...but this takes us back in time ;)
- Perhaps eliminate the standard input support entirely. (Is
this used anywhere--in tests or samples?)
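To make the panic and isAnySlash bullets a little more concrete, here
is a rough sketch. Everything in it is an assumption for illustration
only: the XERCES_PATHSEP_BACKSLASH define, isPathSeparator,
PanicHandler, setPanicHandler and panic are hypothetical names, not
existing Xerces API, and XMLCh is just a stand-in typedef here.

```cpp
#include <cstdlib>

typedef unsigned short XMLCh;   // stand-in for the Xerces character type

// Hypothetical define; a real build would set it from configure/autoconf.
#if defined(_WIN32) && !defined(XERCES_PATHSEP_BACKSLASH)
#  define XERCES_PATHSEP_BACKSLASH 1
#endif

// Possible replacement for isAnySlash: '/' always counts as a separator,
// '\' only when the configuration says so.
inline bool isPathSeparator(XMLCh ch)
{
    if (ch == 0x2F)             // '/'
        return true;
#if defined(XERCES_PATHSEP_BACKSLASH)
    if (ch == 0x5C)             // '\'
        return true;
#endif
    return false;
}

// Pluggable panic hook (hypothetical). The default handler calls abort().
typedef void (*PanicHandler)(int reason);

inline void abortingPanicHandler(int /*reason*/)
{
    std::abort();
}

// Holds the currently installed handler.
inline PanicHandler& currentPanicHandler()
{
    static PanicHandler handler = abortingPanicHandler;
    return handler;
}

// Installs a new handler and returns the previous one.
// Passing a null pointer restores the aborting default.
inline PanicHandler setPanicHandler(PanicHandler handler)
{
    PanicHandler previous = currentPanicHandler();
    currentPanicHandler() = handler ? handler : abortingPanicHandler;
    return previous;
}

// What the unified platform class would call on a fatal error.
inline void panic(int reason)
{
    currentPanicHandler()(reason);
}
```

An app that needs to do something other than abort would call
setPanicHandler() once at startup; everyone else gets abort() for free.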
New class hierarchies will be created for Mutex, FileHandle,
Timer and AtomicOp, implementing the various flavors of these
required by different platforms. Note, for instance, that posix file
handle, mutex, and timer classes will serve many platforms. AtomicOp
will probably be written in terms of Mutex for those platforms that
don't provide built-in atomic operations or special functions.
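As a sketch of what the Mutex hierarchy and the Mutex-based AtomicOp
fallback might look like (assumptions throughout: the class names,
makeMutex, and the XERCES_USE_MUTEX_NOTHREAD define are hypothetical,
and std::mutex is only standing in for a pthread or Win32 wrapper):

```cpp
#include <mutex>

// Hypothetical abstract interface that each flavor implements.
class Mutex {
public:
    virtual ~Mutex() {}
    virtual void lock() = 0;
    virtual void unlock() = 0;
};

// One concrete flavor; std::mutex stands in for a platform primitive.
class StdMutex : public Mutex {
public:
    void lock() override   { fMutex.lock(); }
    void unlock() override { fMutex.unlock(); }
private:
    std::mutex fMutex;
};

// Flavor for single-threaded builds: locking is a no-op.
class NoThreadMutex : public Mutex {
public:
    void lock() override   {}
    void unlock() override {}
};

// Factory: the flavor is picked by a (hypothetical) configure-time define.
inline Mutex* makeMutex()
{
#if defined(XERCES_USE_MUTEX_NOTHREAD)
    return new NoThreadMutex;
#else
    return new StdMutex;
#endif
}

// Fallback AtomicOp for platforms without built-in atomics:
// every operation is serialized through a Mutex.
class MutexAtomicOp {
public:
    explicit MutexAtomicOp(Mutex& mutex) : fMutex(mutex) {}

    // Returns the value after the increment.
    int increment(int& location)
    {
        fMutex.lock();
        const int result = ++location;
        fMutex.unlock();
        return result;
    }

    // Returns the value after the decrement.
    int decrement(int& location)
    {
        fMutex.lock();
        const int result = --location;
        fMutex.unlock();
        return result;
    }

    // Stores newValue if *location equals expected.
    // Always returns the value that was there before the call.
    void* compareAndSwap(void** location, void* newValue, void* expected)
    {
        fMutex.lock();
        void* const previous = *location;
        if (previous == expected)
            *location = newValue;
        fMutex.unlock();
        return previous;
    }

private:
    Mutex& fMutex;
};
```

Platforms that do have native atomics would supply their own AtomicOp
flavor instead and never touch the mutex.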
* The Transcoder LCP interface should perhaps be re-factored as a
single class that's written in terms of the virtual transcoder. This
assumes that for any given platform, the virtual transcoder can
always handle the platform's LCP encoding. The task on a given
platform, then, is to determine the proper LCP encoding.
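Roughly, the generic LCP class could look like the sketch below. This
is not the real Xerces transcoder interface: Transcoder here is a
deliberately simplified stand-in for the virtual transcoder,
Latin1Transcoder is a toy implementation so the example is
self-contained, and GenericLCPTranscoder, toUnicode and toLocal are
hypothetical names.

```cpp
#include <string>

// Simplified stand-in for the virtual transcoder interface.
class Transcoder {
public:
    virtual ~Transcoder() {}
    // Decode bytes in this transcoder's encoding to UTF-16.
    virtual std::u16string transcodeFrom(const std::string& bytes) = 0;
    // Encode UTF-16 text into this transcoder's encoding.
    virtual std::string transcodeTo(const std::u16string& text) = 0;
};

// Toy virtual transcoder for ISO-8859-1: each byte is its code point.
class Latin1Transcoder : public Transcoder {
public:
    std::u16string transcodeFrom(const std::string& bytes) override
    {
        std::u16string out;
        for (unsigned char b : bytes)
            out.push_back(static_cast<char16_t>(b));
        return out;
    }

    std::string transcodeTo(const std::u16string& text) override
    {
        std::string out;
        for (char16_t c : text) {
            // Characters outside Latin-1 are replaced with '?'.
            out.push_back(c <= 0xFF
                ? static_cast<char>(static_cast<unsigned char>(c))
                : '?');
        }
        return out;
    }
};

// The single, platform-independent LCP class. All it needs is a virtual
// transcoder for the platform's LCP encoding; working out which encoding
// that is remains the per-platform task.
class GenericLCPTranscoder {
public:
    explicit GenericLCPTranscoder(Transcoder& lcpEncoding)
        : fImpl(lcpEncoding) {}

    // Local code page string to UTF-16.
    std::u16string toUnicode(const char* local)
    {
        return fImpl.transcodeFrom(local ? std::string(local) : std::string());
    }

    // UTF-16 to local code page string.
    std::string toLocal(const char16_t* text)
    {
        return fImpl.transcodeTo(text ? std::u16string(text) : std::u16string());
    }

private:
    Transcoder& fImpl;
};
```

With something like this in place, none of the individual transcoder
implementations would need to carry their own LCP code.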
Open Questions
=============
- isAnySlash, which is misnamed and is simply supposed to
determine whether the character is a path separator, is implemented
on nearly all the platforms to include both / and \. Isn't this wrong
for posix paths???
- Is the openStdInHandle function used anywhere? I haven't
searched yet.
- Can the virtual transcoder handle the needs of the LCP
transcoder, particularly on the Windows and Cygwin platforms, if we
were to implement a generic LCP transcoder in terms of the virtual
transcoder? We would need to supply the LCP encoding.
- What is the nature of redundancy in the six iconv-based
transcoder implementations? Or between the Win32 and Cygwin transcoders?
- More thinking/research has to go into how we should handle the
"Compilers" section. Leave it alone? Do more in terms of finer-
grained tests through autoconf?
This has just been a brain-dump of my current thoughts on these
issues. Your feedback is encouraged.
James.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]