Hi,

Ah, sorry. Both are actually copy-and-paste errors. Of course I have only one logger, with the correct class name, and the extension point is "org.apache.nutch.indexer.IndexingFilter".

This is the actual plugin.xml I am using.

<?xml version="1.0" encoding="UTF-8"?>
<plugin id="simpletestplugin" name="URL Meta Indexing Filter" version="1.0.0" provider-name="alaak">
    <runtime>
        <library name="simpletestplugin.jar">
            <export name="*"/>
        </library>
    </runtime>

    <requires>
        <import plugin="nutch-extensionpoints"/>
    </requires>

    <extension id="de.effingo.crawler" name="Some Simple Test Plugin" point="org.apache.nutch.indexer.IndexingFilter">
        <implementation id="page-filter" class="testplugin.SimpleFilter"/>
    </extension>
</plugin>
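
Since copy-and-paste slips in the descriptor are easy to miss, here is a minimal sketch of how one could sanity-check a plugin.xml before deploying it. It assumes only the XML parser that ships with the JDK. The class name PluginXmlCheck and the method extensionPoints are hypothetical and not part of Nutch. The sketch only checks that the file is well-formed and lists the extension points it declares; it does not verify that those points or the implementation classes actually exist.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Nutch: a well-formedness check for plugin.xml.
public class PluginXmlCheck {

    /**
     * Parses the descriptor and returns the "point" attribute of every
     * <extension> element, in document order.
     * Throws IllegalArgumentException if the XML cannot be parsed.
     */
    public static List<String> extensionPoints(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList extensions = doc.getElementsByTagName("extension");
            List<String> points = new ArrayList<>();
            for (int i = 0; i < extensions.getLength(); i++) {
                Element extension = (Element) extensions.item(i);
                points.add(extension.getAttribute("point"));
            }
            return points;
        } catch (Exception e) {
            // A stray quote or an unclosed tag ends up here.
            throw new IllegalArgumentException("plugin.xml is not well-formed", e);
        }
    }

    public static void main(String[] args) {
        // Shortened copy of the descriptor above, inlined to keep the sketch self-contained.
        String xml =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<plugin id=\"simpletestplugin\" name=\"URL Meta Indexing Filter\""
            + " version=\"1.0.0\" provider-name=\"alaak\">"
            + "<extension id=\"de.effingo.crawler\" name=\"Some Simple Test Plugin\""
            + " point=\"org.apache.nutch.indexer.IndexingFilter\">"
            + "<implementation id=\"page-filter\" class=\"testplugin.SimpleFilter\"/>"
            + "</extension>"
            + "</plugin>";
        for (String point : extensionPoints(xml)) {
            System.out.println("extension point: " + point);
        }
    }
}
```

A descriptor with a doubled quote in an attribute would make extensionPoints throw instead of returning a list, so this kind of mistake shows up before Nutch ever loads the plugin.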

On Sun, 12 Aug 2012 12:31:46 CEST, Lewis John Mcgibbney wrote:

Hi Alaak,

On Sun, Aug 12, 2012 at 10:58 AM, Alaak <[email protected]> wrote:

I always get output with the following
exception which basically tells me nothing:

...
Fetcher: finished at 2012-08-12 11:06:47, elapsed: 00:00:07
ParseSegment: starting at 2012-08-12 11:06:47
ParseSegment: segment: crawl/segments/20120812110633
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1265)
    at org.apache.nutch.parse.ParseSegment.parse(ParseSegment.java:209)


It tells you that there is a problem whilst parsing a particular
segment. This is quite a lot to go on.

All the Java code looks fine. I don't see any problems except that you
have an additional logging variable which seems to point outside of the
class.



<extension id="testplugin" name="Some Simple Test Plugin"
point="org.apache.nutch.segment.SegmentMergeFilter">
<implementation id="page-filter" class="testplugin.SimpleFilter"/>
</extension>
</plugin>


Now we come to the main point of concern. As far as I understand what
you are trying to do, you should not extend the SegmentMergeFilter
point. The extension should instead refer to the IndexingFilter point
you wish to extend. A list of extension points can be seen here [0].

[0] http://svn.apache.org/repos/asf/nutch/trunk/src/plugin/nutch-extensionpoints/plugin.xml

hth

Lewis
