A small update on this:
I can successfully run the plugin from the command line like this:
bash-3.2$ bin/nutch plugin parse-wms info.geometa.parse.wms.WMSParser sogis_geologie.wms.xml
However, Nutch still refuses to use the plugin when crawling...
Greets
Silvio
Silvio Heuberger wrote:
Here's the relevant part of my plugin.xml:
<implementation id="info.geometa.parse.wms.WMSParser"
class="info.geometa.parse.wms.WMSParser">
<parameter name="contentType" value="application/vnd.ogc.wms_xml"/>
<parameter name="pathSuffix" value=""/>
</implementation>
I think that should nail it, right?
Doğacan Güney wrote:
Hi,
On Thu, Nov 27, 2008 at 6:30 PM, Silvio Heuberger
<[EMAIL PROTECTED]> wrote:
I have built a plugin for rudimentary parsing of WMS service descriptors.
The content type for that is:
application/vnd.ogc.wms_xml
I have the plugin built by ant, tested by ant and all. Now I add the plugin
to my registered extension points and hadoop.log shows the plugin is being
loaded:
Registered Plugins:
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - the nutch core extension points (nutch-extensionpoints)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - Basic Query Filter (query-basic)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - Html Parse Plug-in (parse-html)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - Basic Indexing Filter (index-basic)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - Basic Summarizer Plug-in (summary-basic)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - Site Query Filter (query-site)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - HTTP Framework (lib-http)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - Text Parse Plug-in (parse-text)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - Regex URL Filter (urlfilter-regex)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - Http Protocol Plug-in (protocol-http)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - OPIC Scoring Plug-in (scoring-opic)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - CyberNeko HTML Parser (lib-nekohtml)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - Log4j (lib-log4j)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - URL Query Filter (query-url)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - WMS Parse Plug-in (parse-wms)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - Regex URL Filter Framework (lib-regex-filter)
However, Nutch seems to ignore the plugin:
2008-11-27 17:23:01,350 WARN parse.ParseUtil - No suitable parser found when trying to parse content http://isk.geobasis-bb.de/ows/dnm025.php?SERVICE=WMS&REQUEST=GetCapabilities&VERSION=1.2.0 of type application/vnd.ogc.wms_xml
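To compare the two log observations side by side, something like the following sketch might help. It pulls the registered plugin ids and the "No suitable parser found" content types out of a log. The sample lines are copied from this message with the mail wrapping undone. The function name and the regular expressions are my own assumptions about the log layout, not anything Nutch provides.

```python
import re

# Sample lines taken from the log excerpts in this thread (unwrapped).
SAMPLE_LOG = """\
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - Text Parse Plug-in (parse-text)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - WMS Parse Plug-in (parse-wms)
2008-11-27 17:23:01,350 WARN parse.ParseUtil - No suitable parser found when trying to parse content http://isk.geobasis-bb.de/ows/dnm025.php?SERVICE=WMS&REQUEST=GetCapabilities&VERSION=1.2.0 of type application/vnd.ogc.wms_xml
"""

# Assumed layout: "... plugin.PluginRepository - <name> (<plugin-id>)"
PLUGIN_RE = re.compile(r"plugin\.PluginRepository - .*\(([\w-]+)\)\s*$")
# Assumed layout of the ParseUtil warning quoted above.
NO_PARSER_RE = re.compile(
    r"No suitable parser found when trying to parse content (\S+) of type (\S+)"
)


def summarize(log_text):
    """Return (set of registered plugin ids, list of (url, content_type) without a parser)."""
    plugins, unparsed = set(), []
    for line in log_text.splitlines():
        m = PLUGIN_RE.search(line)
        if m:
            plugins.add(m.group(1))
            continue
        m = NO_PARSER_RE.search(line)
        if m:
            unparsed.append((m.group(1), m.group(2)))
    return plugins, unparsed


if __name__ == "__main__":
    plugins, unparsed = summarize(SAMPLE_LOG)
    print("parse-wms registered:", "parse-wms" in plugins)
    for url, ctype in unparsed:
        print("no parser for", ctype, "at", url)
```

If parse-wms shows up as registered while its content type still appears among the unparsed entries, the plugin is loaded and the problem lies in how the content type is mapped to it.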
I added the plugin to the parse-plugins.xml file:
<!-- types for wms stuff -->
<mimeType name="application/vnd.ogc.wms_xml">
<plugin id="parse-wms" />
</mimeType>
[...]
<alias name="parse-wms" extension="info.geometa.parse.wms.WMSParser" />
What gives??
I am not sure, but can you check whether your plugin's plugin.xml
declares the desired content type?
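One way to run that check mechanically is sketched below. It cross-checks the plugin.xml fragment against the parse-plugins.xml entries, both copied from this thread and wrapped in a dummy root element so they parse. The function name check_consistency is hypothetical, and the comparison is a plain string match, so it does not reproduce whatever matching Nutch itself performs.

```python
import xml.etree.ElementTree as ET

# Fragments copied from this thread; <root> is added only so they parse.
PLUGIN_XML = """
<root>
  <implementation id="info.geometa.parse.wms.WMSParser"
                  class="info.geometa.parse.wms.WMSParser">
    <parameter name="contentType" value="application/vnd.ogc.wms_xml"/>
    <parameter name="pathSuffix" value=""/>
  </implementation>
</root>
"""

PARSE_PLUGINS_XML = """
<root>
  <mimeType name="application/vnd.ogc.wms_xml">
    <plugin id="parse-wms" />
  </mimeType>
  <alias name="parse-wms" extension="info.geometa.parse.wms.WMSParser" />
</root>
"""


def check_consistency(plugin_xml, parse_plugins_xml):
    """Return a list of human-readable mismatches (empty if none were found)."""
    plugin = ET.fromstring(plugin_xml)
    mapping = ET.fromstring(parse_plugins_xml)

    # implementation id -> value of its contentType parameter (or None)
    declared = {}
    for impl in plugin.iter("implementation"):
        params = {p.get("name"): p.get("value") for p in impl.iter("parameter")}
        declared[impl.get("id")] = params.get("contentType")

    # plugin id -> extension id, taken from the <alias> entries
    aliases = {a.get("name"): a.get("extension") for a in mapping.iter("alias")}

    problems = []
    for mime in mapping.iter("mimeType"):
        for ref in mime.iter("plugin"):
            ext_id = aliases.get(ref.get("id"))
            if ext_id is None:
                problems.append(f"no alias for plugin {ref.get('id')}")
            elif ext_id not in declared:
                problems.append(f"alias target {ext_id} not found in plugin.xml")
            elif declared[ext_id] != mime.get("name"):
                problems.append(
                    f"{ext_id} declares {declared[ext_id]}, "
                    f"but the mapping says {mime.get('name')}"
                )
    return problems


if __name__ == "__main__":
    for line in check_consistency(PLUGIN_XML, PARSE_PLUGINS_XML) or ["no mismatches found"]:
        print(line)
```

For the fragments quoted in this thread the ids and the content type line up, so a mismatch between these two files would not by itself explain the warning.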
Greets
Silvio