> Good ideas though. I think we should make it easy to add a class for
> each type of doc/mimetype and then wrap a set of parser rules around that
> class.
> 

Good call.  I've been working on this very thing.  I realized as I was
working on writing an xml parser that I was duplicating a lot of the
methods, etc. from the other parsers.  I'm working on a patch which
would create a GenericParser() class that each content-specific Parser
would inherit from, to eliminate a lot of potential code overlap and be
a little more 'plugin-friendly'.

Here's what I'm thinking:

1. rename generic_parser() in Parser.py to handle_mimetype() or similar,
as this is more descriptive of what it does.  Change its if-statement
logic to a lookup in a dictionary of mimetypes and their parsers.

2. create a GenericParser class which has all methods/code common to all
parsers: get_plucker_doc, get_images, get_anchors, setting self._doc =
TextDocBuilder(), etc.

3. create a subdirectory in PyPlucker for the content specific parser
classes, each in its own file.

This would greatly simplify adding support for additional mime-types. 
To add a new content parser, one would only have to create a new file
with their class and put an entry in a dictionary.
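
As a rough sketch of steps 1 and 2 (not the actual patch), something
like the following is what I have in mind.  The TextDocBuilder here is
only a stand-in so the example runs on its own, and PlainTextParser,
PARSERS, the parse() hook and the constructor arguments are names I made
up for illustration:

```python
class TextDocBuilder:
    """Stand-in for the real document builder (assumption, not PyPlucker's)."""

    def __init__(self):
        self.chunks = []

    def add_text(self, text):
        self.chunks.append(text)


class GenericParser:
    """Holds everything the content-specific parsers would otherwise duplicate."""

    def __init__(self, url, data):
        self._url = url
        self._doc = TextDocBuilder()
        self._images = []
        self._anchors = []
        self.parse(data)

    def parse(self, data):
        # Hypothetical hook: each content-specific parser overrides this.
        raise NotImplementedError

    def get_plucker_doc(self):
        return self._doc

    def get_images(self):
        return self._images

    def get_anchors(self):
        return self._anchors


class PlainTextParser(GenericParser):
    """Example content-specific parser; would live in its own file."""

    def parse(self, data):
        self._doc.add_text(data)


# The dictionary that replaces the if/elif chain: mimetype -> parser class.
PARSERS = {"text/plain": PlainTextParser}


def handle_mimetype(url, mimetype, data):
    """Look up the parser for a mimetype; return None if none is registered."""
    parser_class = PARSERS.get(mimetype)
    if parser_class is None:
        return None
    return parser_class(url, data)
```

Adding a new type would then be one new subclass plus one new PARSERS
entry, with handle_mimetype() left untouched.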

My xml parser code is written in a similar manner: to support a new type
of xml file, one only has to create the ContentHandler and specify the
root tag in a dictionary, and the xml parser class will find and use it.
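
To give an idea of the root-tag lookup, here is a minimal sketch using
the standard xml.sax module.  It is not my actual parser code:
RootTagSniffer, RssHandler, XML_HANDLERS and parse_xml are illustrative
names, and for simplicity it parses the data twice (once to find the
root tag, once with the chosen handler), which a real version would
probably avoid:

```python
import xml.sax
from xml.sax.handler import ContentHandler


class RootTagSniffer(ContentHandler):
    """Remembers the name of the first element seen, i.e. the root tag."""

    def __init__(self):
        super().__init__()
        self.root = None

    def startElement(self, name, attrs):
        if self.root is None:
            self.root = name


class RssHandler(ContentHandler):
    """Example handler for one kind of xml file: collects <title> text."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buf = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buf = []

    def characters(self, content):
        if self._in_title:
            self._buf.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buf))
            self._in_title = False


# Root tag -> ContentHandler class; supporting a new file type is one entry.
XML_HANDLERS = {"rss": RssHandler}


def parse_xml(data):
    """Pick a handler by root tag and run it; return None if none matches."""
    sniffer = RootTagSniffer()
    xml.sax.parseString(data, sniffer)
    handler_class = XML_HANDLERS.get(sniffer.root)
    if handler_class is None:
        return None
    handler = handler_class()
    xml.sax.parseString(data, handler)
    return handler
```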

If the above dictionaries were populated from an ini file or something,
it would give the end user a lot of freedom to support whatever they want
without modifying the actual plucker code in any way.  It would also allow
the user to explicitly disable a parser for a particular mime-type if they
choose.
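
One way the ini idea could look, again only as a sketch: the [parsers]
section name, the "disabled" keyword and the
PyPlucker.parsers.html.HTMLParser path in the usage note are all
assumptions on my part, not anything that exists today.  It uses the
standard configparser and importlib modules:

```python
import configparser
import importlib


def load_parser_table(ini_text):
    """Build {mimetype: (module_name, class_name)} from a [parsers] section.

    An empty value, "off" or "disabled" leaves the mimetype out of the
    table, which is how a user would switch a parser off.
    """
    config = configparser.ConfigParser()
    config.read_string(ini_text)
    table = {}
    for mimetype, target in config["parsers"].items():
        target = target.strip()
        if target.lower() in ("", "off", "disabled"):
            continue
        # Split "package.module.ClassName" at the last dot.
        module_name, _, class_name = target.rpartition(".")
        table[mimetype] = (module_name, class_name)
    return table


def resolve(entry):
    """Import the module and return the parser class named by a table entry."""
    module_name, class_name = entry
    return getattr(importlib.import_module(module_name), class_name)
```

A user's file might then contain lines such as
"text/html = PyPlucker.parsers.html.HTMLParser" and
"image/gif = disabled" under [parsers].  Classes are only imported when
resolve() is called, so a bad entry for one type would not break the
others.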

What do ya think?

PS -- Hope I'm not offending, the current codebase is well done.  I'm
just excited about the possibilities! ;)

-- 
Dave <[EMAIL PROTECTED]>

_______________________________________________
plucker-dev mailing list
[EMAIL PROTECTED]
http://lists.rubberchicken.org/mailman/listinfo/plucker-dev
