-1!
XSL is terribly slow!
XML will blow up memory and storage usage.
Dublin Core may be good for the semantic web, but not for content storage.
In general, the goal must be to minimize memory usage and improve performance. Such a parser would increase memory usage and definitely slow down parsing.
The magic word is minimalism.
So I vote against this suggestion!
Stefan





On 24.11.2005 at 00:01, Jérôme Charron wrote:

Hi,

We (Chris Mattmann, François Martelet, Sébastien Le Callonnec and I) have just
added a new proposal to the Nutch Wiki:
http://wiki.apache.org/nutch/MarkupLanguageParserProposal

Here is the Summary of Issue:
"Currently, Nutch provides some specific markup language parsing plugins:
one for handling HTML, another one for RSS, but no generic XML parsing
plugin. This is extremely cumbersome as adding support for a new markup
language implies that you have to develop the whole XML parsing code from
scratch. This methodology causes: (1) code duplication, with little or no
reuse of common pieces of XML parsing code, and (2) dependency library
duplication, where many XML parsing plugins may rely on similar xml parsing
libraries, such as jaxen, or jdom, or dom4j, etc., but each parsing plugin
keeps its own local copy of these libraries. It is also very difficult to
identify precisely the type of XML content encountered during a parse. That
difficult issue is outside the scope of this proposal, and will be
identified in a future proposal."

Thanks for your feedback, comments, suggestions (and votes).

Regards

Jérôme

--
http://motrech.free.fr/
http://www.frutch.org/


