Another update: I've got Nutch running inside Eclipse, and the log level
has now been raised.
Here is part of the output:
2008-12-01 14:44:36,052 WARN parse.ParserFactory
(ParserFactory.java:matchExtensions(322)) - ParserFactory: Plugin:
info.geometa.parse.wms.WMSParser mapped to contentType
application/vnd.ogc.wms_xml via parse-plugins.xml, but not enabled via
plugin.includes in nutch-default.xml
2008-12-01 14:44:36,053 WARN parse.ParseUtil (ParseUtil.java:parse(73))
- No suitable parser found when trying to parse content
http://isk.geobasis-bb.de/ows/dnm025.php?SERVICE=WMS&REQUEST=GetCapabilities&VERSION=1.2.0
of type application/vnd.ogc.wms_xml
2008-12-01 14:44:36,053 WARN fetcher.Fetcher (Fetcher.java:output(315))
- Error parsing:
http://isk.geobasis-bb.de/ows/dnm025.php?SERVICE=WMS&REQUEST=GetCapabilities&VERSION=1.2.0:
failed(2,200): org.apache.nutch.parse.ParseException: parser not found
for contentType=application/vnd.ogc.wms_xml
url=http://isk.geobasis-bb.de/ows/dnm025.php?SERVICE=WMS&REQUEST=GetCapabilities&VERSION=1.2.0
anyone?
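For context, the first warning above refers to the plugin.includes property. Its default lives in nutch-default.xml, and local overrides normally go in conf/nutch-site.xml. The fragment below is only a sketch of such an override. It assumes the plugin IDs shown in the "Registered Plugins" list quoted further down in this thread, and the exact pattern is an illustration, not the poster's actual configuration:

```xml
<!-- conf/nutch-site.xml (sketch; the value is a regular expression over plugin IDs) -->
<property>
  <name>plugin.includes</name>
  <!-- parse-wms added to the parse-(...) group; the other IDs are assumed from the log -->
  <value>protocol-http|urlfilter-regex|parse-(text|html|wms)|index-basic|query-(basic|site|url)|summary-basic|scoring-opic</value>
</property>
```

When Nutch is started from Eclipse, it may also be worth checking that the conf directory on the classpath is the one containing this override.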
Silvio Heuberger wrote:
Here's the relevant part of my plugin.xml:
<implementation id="info.geometa.parse.wms.WMSParser"
class="info.geometa.parse.wms.WMSParser">
<parameter name="contentType" value="application/vnd.ogc.wms_xml"/>
<parameter name="pathSuffix" value=""/>
</implementation>
I think that should nail it, right?
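For reference, an <implementation> element like the one above normally sits inside an <extension> element that names the parser extension point. The following is a rough sketch of the surrounding plugin.xml, assuming the usual Nutch plugin layout. The plugin id and name are taken from the log output in this thread; the extension id, extension name, version, provider, and jar file name are placeholders, not values from the actual plugin:

```xml
<plugin id="parse-wms" name="WMS Parse Plug-in" version="1.0.0"
        provider-name="info.geometa">
  <runtime>
    <!-- jar file name is an assumption -->
    <library name="parse-wms.jar">
      <export name="*"/>
    </library>
  </runtime>
  <requires>
    <import plugin="nutch-extensionpoints"/>
  </requires>
  <!-- extension id and name are hypothetical -->
  <extension id="info.geometa.parse.wms" name="WMSParse"
             point="org.apache.nutch.parse.Parser">
    <implementation id="info.geometa.parse.wms.WMSParser"
                    class="info.geometa.parse.wms.WMSParser">
      <parameter name="contentType" value="application/vnd.ogc.wms_xml"/>
      <parameter name="pathSuffix" value=""/>
    </implementation>
  </extension>
</plugin>
```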
Doğacan Güney wrote:
Hi,
On Thu, Nov 27, 2008 at 6:30 PM, Silvio Heuberger
<[EMAIL PROTECTED]> wrote:
I have built a plugin for rudimentary parsing of WMS service descriptors.
The content type for that is:
application/vnd.ogc.wms_xml
I have the plugin built by ant, tested by ant and all. Now I have added the
plugin to my registered extension points and hadoop.log shows the plugin is
being loaded:
Registered Plugins:
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - the nutch core
extension points (nutch-extensionpoints)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - Basic Query
Filter (query-basic)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - Html Parse
Plug-in (parse-html)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - Basic Indexing
Filter (index-basic)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - Basic Summarizer
Plug-in (summary-basic)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - Site Query
Filter (query-site)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - HTTP Framework
(lib-http)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - Text Parse
Plug-in (parse-text)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - Regex URL Filter
(urlfilter-regex)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - Http Protocol
Plug-in (protocol-http)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - OPIC Scoring
Plug-in (scoring-opic)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - CyberNeko HTML
Parser (lib-nekohtml)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - Log4j
(lib-log4j)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - URL Query Filter
(query-url)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - WMS Parse
Plug-in (parse-wms)
2008-11-27 17:23:08,928 INFO plugin.PluginRepository - Regex URL Filter
Framework (lib-regex-filter)
However, nutch seems to ignore the plugin.
2008-11-27 17:23:01,350 WARN parse.ParseUtil - No suitable parser found
when trying to parse content
http://isk.geobasis-bb.de/ows/dnm025.php?SERVICE=WMS&REQUEST=GetCapabilities&VERSION=1.2.0
of type application/vnd.ogc.wms_xml
I added the plugin to the parse-plugins.xml file:
<!-- types for wms stuff -->
<mimeType name="application/vnd.ogc.wms_xml">
<plugin id="parse-wms" />
</mimeType>
[...]
<alias name="parse-wms" extension="info.geometa.parse.wms.WMSParser" />
What gives??
I am not sure, but can you check whether you modified your plugin's
plugin.xml to include the desired content type?
Greets
Silvio