-1!
XSL is terribly slow!
XML will blow up memory and storage usage.
Dublin Core may be good for the semantic web, but not for content storage.
In general, the goal must be to minimize memory usage and improve
performance; such a parser would increase memory usage and definitely
slow down parsing.
The magic word is minimalism.
So I vote against this suggestion!
Stefan
On 24.11.2005 at 00:01, Jérôme Charron wrote:
Hi,
We (Chris Mattmann, François Martelet, Sébastien Le Callonnec and I)
have just added a new proposal to the Nutch wiki:
http://wiki.apache.org/nutch/MarkupLanguageParserProposal
Here is the Summary of Issue:
"Currently, Nutch provides some specific markup language parsing plugins:
one for handling HTML, another one for RSS, but no generic XML parsing
plugin. This is extremely cumbersome as adding support for a new markup
language implies that you have to develop the whole XML parsing code
from scratch. This methodology causes: (1) code duplication, with little
or no reuse of common pieces of XML parsing code, and (2) dependency
library duplication, where many XML parsing plugins may rely on similar
xml parsing libraries, such as jaxen, or jdom, or dom4j, etc., but each
parsing plugin keeps its own local copy of these libraries. It is also
very difficult to identify precisely the type of XML content encountered
during a parse. That difficult issue is outside the scope of this
proposal, and will be identified in a future proposal."
Thanks for your feedback, comments, suggestions (and votes).
Regards
Jérôme
--
http://motrech.free.fr/
http://www.frutch.org/