A small update on this:

I can successfully run the plugin from the command line like this:

bash-3.2$ bin/nutch plugin parse-wms info.geometa.parse.wms.WMSParser sogis_geologie.wms.xml

However, Nutch still refuses to use the plugin when crawling...

Greets
Silvio

Silvio Heuberger wrote:
Here's the relevant part of my plugin.xml:

<implementation id="info.geometa.parse.wms.WMSParser"
                class="info.geometa.parse.wms.WMSParser">
  <parameter name="contentType" value="application/vnd.ogc.wms_xml"/>
  <parameter name="pathSuffix"  value=""/>
</implementation>

I think that should nail it, right?
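
For context, here is a rough sketch of where that <implementation> element would normally sit inside a complete plugin.xml for a Nutch parser. The plugin id and name come from the log output in this thread, and org.apache.nutch.parse.Parser is the parser extension point; the version, provider-name, jar name and extension id/name are assumptions made up for illustration, not taken from the actual plugin.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: version, provider-name, library name and the
     extension id/name below are hypothetical placeholders. -->
<plugin id="parse-wms"
        name="WMS Parse Plug-in"
        version="1.0.0"
        provider-name="info.geometa">

  <runtime>
    <!-- Assumed jar name; it must match what the ant build produces. -->
    <library name="parse-wms.jar">
      <export name="*"/>
    </library>
  </runtime>

  <requires>
    <import plugin="nutch-extensionpoints"/>
  </requires>

  <!-- The implementation has to be registered against the Parser
       extension point for the parser to be considered at all. -->
  <extension id="info.geometa.parse.wms"
             name="WMSParse"
             point="org.apache.nutch.parse.Parser">
    <implementation id="info.geometa.parse.wms.WMSParser"
                    class="info.geometa.parse.wms.WMSParser">
      <parameter name="contentType" value="application/vnd.ogc.wms_xml"/>
      <parameter name="pathSuffix"  value=""/>
    </implementation>
  </extension>

</plugin>
```

If the real file differs from this shape, the first things to compare are probably the point attribute and the implementation id, since parse-plugins.xml refers back to that id.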

Doğacan Güney wrote:
Hi,

On Thu, Nov 27, 2008 at 6:30 PM, Silvio Heuberger
<[EMAIL PROTECTED]> wrote:
I have built a plugin for rudimentary parsing of WMS service descriptors.

The content type for that is:
application/vnd.ogc.wms_xml

I have the plugin built by ant, tested by ant and all. Now I add the plugin
to my registered extension points and hadoop.log shows the plugin is being
loaded:

Registered Plugins:
2008-11-27 17:23:08,928 INFO  plugin.PluginRepository -     the nutch core extension points (nutch-extensionpoints)
2008-11-27 17:23:08,928 INFO  plugin.PluginRepository -     Basic Query Filter (query-basic)
2008-11-27 17:23:08,928 INFO  plugin.PluginRepository -     Html Parse Plug-in (parse-html)
2008-11-27 17:23:08,928 INFO  plugin.PluginRepository -     Basic Indexing Filter (index-basic)
2008-11-27 17:23:08,928 INFO  plugin.PluginRepository -     Basic Summarizer Plug-in (summary-basic)
2008-11-27 17:23:08,928 INFO  plugin.PluginRepository -     Site Query Filter (query-site)
2008-11-27 17:23:08,928 INFO  plugin.PluginRepository -     HTTP Framework (lib-http)
2008-11-27 17:23:08,928 INFO  plugin.PluginRepository -     Text Parse Plug-in (parse-text)
2008-11-27 17:23:08,928 INFO  plugin.PluginRepository -     Regex URL Filter (urlfilter-regex)
2008-11-27 17:23:08,928 INFO  plugin.PluginRepository -     Http Protocol Plug-in (protocol-http)
2008-11-27 17:23:08,928 INFO  plugin.PluginRepository -     OPIC Scoring Plug-in (scoring-opic)
2008-11-27 17:23:08,928 INFO  plugin.PluginRepository -     CyberNeko HTML Parser (lib-nekohtml)
2008-11-27 17:23:08,928 INFO  plugin.PluginRepository -     Log4j (lib-log4j)
2008-11-27 17:23:08,928 INFO  plugin.PluginRepository -     URL Query Filter (query-url)
2008-11-27 17:23:08,928 INFO  plugin.PluginRepository -     WMS Parse Plug-in (parse-wms)
2008-11-27 17:23:08,928 INFO  plugin.PluginRepository -     Regex URL Filter Framework (lib-regex-filter)

However, Nutch seems to ignore the plugin:

2008-11-27 17:23:01,350 WARN  parse.ParseUtil - No suitable parser found when trying to parse content http://isk.geobasis-bb.de/ows/dnm025.php?SERVICE=WMS&REQUEST=GetCapabilities&VERSION=1.2.0 of type application/vnd.ogc.wms_xml

I added the plugin to the parse-plugins.xml file:

<!-- types for wms stuff -->
  <mimeType name="application/vnd.ogc.wms_xml">
    <plugin id="parse-wms" />
  </mimeType>
[...]
<alias name="parse-wms" extension="info.geometa.parse.wms.WMSParser" />

What gives??
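
For comparison, a sketch of how the same mapping would look if it followed the entries in the parse-plugins.xml that ships with Nutch. This is written from memory of the stock file rather than taken from this thread: as far as I recall, the alias entries there use an attribute called extension-id (not extension) and live inside an <aliases> element, so it may be worth checking the attribute name against the bundled conf/parse-plugins.xml.

```xml
<parse-plugins>

  <!-- Maps the content type to the plugin id. -->
  <mimeType name="application/vnd.ogc.wms_xml">
    <plugin id="parse-wms" />
  </mimeType>

  <!-- Assumption: stock alias entries use "extension-id", and its value
       is the implementation id declared in the plugin's plugin.xml. -->
  <aliases>
    <alias name="parse-wms"
           extension-id="info.geometa.parse.wms.WMSParser" />
  </aliases>

</parse-plugins>
```

If the alias cannot be resolved, the parser lookup for that content type could plausibly come back empty even though the plugin itself is listed as loaded, which would match the "No suitable parser found" warning.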

I am not sure, but can you check whether you modified your plugin's
plugin.xml to include the desired content type?

Greets
Silvio
