Sorry, I posted the wrong error output in my previous messages.
Here is the output I get when running ant with the Nutch-87 plugin:
[caribmag]$ ant -v
Apache Ant version 1.6.5 compiled on June 2 2005
Buildfile: build.xml
Detected Java version: 1.4 in: /home/1/caribmag/j2sdk1.4.2_10/jre
Detected OS: Linux
parsing buildfile /home/1/caribmag/caribbeanlinks.com/nutch/nutch/
src/plugin/epile/build.xml with URI = file:///home/1/caribmag/
caribbeanlinks.com/nutch/nutch/src/plugin/epile/build.xml
Project base dir set to: /home/1/caribmag/caribbeanlinks.com/nutch/
nutch/src/plugin/epile
Importing file ../build-plugin.xml from /home/1/caribmag/
caribbeanlinks.com/nutch/nutch/src/plugin/epile/build.xml
parsing buildfile /home/1/caribmag/caribbeanlinks.com/nutch/nutch/
src/plugin/build-plugin.xml with URI = file:///home/1/caribmag/
caribbeanlinks.com/nutch/nutch/src/plugin/build-plugin.xml
[property] Loading /home/caribmag/WhitelistURLFilter.build.properties
[property] Unable to find property file: /home/caribmag/
WhitelistURLFilter.build.properties
[property] Loading /home/1/caribmag/caribbeanlinks.com/nutch/nutch/
src/plugin/epile/build.properties
[property] Unable to find property file: /home/1/caribmag/
caribbeanlinks.com/nutch/nutch/src/plugin/epile/build.properties
[available] Unable to find dir src/test to set property test.available
Build sequence for target(s) `jar' is [init, compile, jar]
Complete build sequence is [init, compile, jar, init-plugin,
deploy, compile-test, clean, test, ]
init:
[mkdir] Created dir: /home/1/caribmag/caribbeanlinks.com/nutch/
nutch/build/WhitelistURLFilter
[mkdir] Created dir: /home/1/caribmag/caribbeanlinks.com/nutch/
nutch/build/WhitelistURLFilter/classes
[mkdir] Created dir: /home/1/caribmag/caribbeanlinks.com/nutch/
nutch/build/WhitelistURLFilter/test
Project base dir set to: /home/1/caribmag/caribbeanlinks.com/nutch/
nutch/src/plugin/epile
[antcall] calling target(s) [init-plugin] in build file /home/1/
caribmag/caribbeanlinks.com/nutch/nutch/src/plugin/epile/build.xml
parsing buildfile /home/1/caribmag/caribbeanlinks.com/nutch/nutch/
src/plugin/epile/build.xml with URI = file:///home/1/caribmag/
caribbeanlinks.com/nutch/nutch/src/plugin/epile/build.xml
Project base dir set to: /home/1/caribmag/caribbeanlinks.com/nutch/
nutch/src/plugin/epile
Importing file ../build-plugin.xml from /home/1/caribmag/
caribbeanlinks.com/nutch/nutch/src/plugin/epile/build.xml
parsing buildfile /home/1/caribmag/caribbeanlinks.com/nutch/nutch/
src/plugin/build-plugin.xml with URI = file:///home/1/caribmag/
caribbeanlinks.com/nutch/nutch/src/plugin/build-plugin.xml
Override ignored for property name
Override ignored for property root
[property] Loading /home/caribmag/WhitelistURLFilter.build.properties
[property] Unable to find property file: /home/caribmag/
WhitelistURLFilter.build.properties
[property] Loading /home/1/caribmag/caribbeanlinks.com/nutch/nutch/
src/plugin/epile/build.properties
[property] Unable to find property file: /home/1/caribmag/
caribbeanlinks.com/nutch/nutch/src/plugin/epile/build.properties
Override ignored for property nutch.root
Override ignored for property src.dir
Override ignored for property src.test
[available] Unable to find dir src/test to set property test.available
Override ignored for property conf.dir
Override ignored for property build.dir
Override ignored for property build.classes
Override ignored for property build.test
Override ignored for property deploy.dir
Override ignored for property javac.deprecation
Override ignored for property javac.debug
Override ignored for property javadoc.link
Override ignored for property build.encoding
Build sequence for target(s) `init-plugin' is [init-plugin]
Complete build sequence is [init-plugin, init, compile, jar,
deploy, compile-test, clean, test, ]
[antcall] Entering /home/1/caribmag/caribbeanlinks.com/nutch/
nutch/src/plugin/epile/build.xml...
Build sequence for target(s) `init-plugin' is [init-plugin]
Complete build sequence is [init-plugin, init, compile, jar,
deploy, compile-test, clean, test, ]
init-plugin:
[antcall] Exiting /home/1/caribmag/caribbeanlinks.com/nutch/nutch/
src/plugin/epile/build.xml.
compile:
[echo] Compiling plugin: WhitelistURLFilter
[javac] crawl/plugin/whitelisturlfilter/WhitelistURLFilter.java
added as crawl/plugin/whitelisturlfilter/WhitelistURLFilter.class
doesn't exist.
[javac] Compiling 1 source file to /home/1/caribmag/
caribbeanlinks.com/nutch/nutch/build/WhitelistURLFilter/classes
[javac] Using modern compiler
dropping /home/1/caribmag/caribbeanlinks.com/nutch/nutch/build/
classes from path as it doesn't exist
dropping /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/plugin/
epile/home/caribmag/tomcatcommonlibservlet.jar from path as it
doesn't exist
[javac] Compilation arguments:
[javac] '-d'
[javac] '/home/1/caribmag/caribbeanlinks.com/nutch/nutch/build/
WhitelistURLFilter/classes'
[javac] '-classpath'
[javac] '/home/1/caribmag/caribbeanlinks.com/nutch/nutch/build/
WhitelistURLFilter/classes:/home/1/caribmag/caribbeanlinks.com/
nutch/nutch/lib/commons-logging-api-1.0.4.jar:/home/1/caribmag/
caribbeanlinks.com/nutch/nutch/lib/concurrent-1.3.4.jar:/home/1/
caribmag/caribbeanlinks.com/nutch/nutch/lib/jakarta-oro-2.0.7.jar:/
home/1/caribmag/caribbeanlinks.com/nutch/nutch/lib/jetty-5.1.2.jar:/
home/1/caribmag/caribbeanlinks.com/nutch/nutch/lib/junit-3.8.1.jar:/
home/1/caribmag/caribbeanlinks.com/nutch/nutch/lib/lucene-1.9-rc1-
dev.jar:/home/1/caribmag/caribbeanlinks.com/nutch/nutch/lib/lucene-
misc-1.9-rc1-dev.jar:/home/1/caribmag/caribbeanlinks.com/nutch/
nutch/lib/servlet-api.jar:/home/1/caribmag/caribbeanlinks.com/nutch/
nutch/lib/spring-beans.jar:/home/1/caribmag/caribbeanlinks.com/
nutch/nutch/lib/spring-core.jar:/home/1/caribmag/caribbeanlinks.com/
nutch/nutch/lib/taglibs-i18n.jar:/home/1/caribmag/
caribbeanlinks.com/nutch/nutch/lib/xerces-2_6_2-apis.jar:/home/1/
caribmag/caribbeanlinks.com/nutch/nutch/lib/xerces-2_6_2.jar:/home/
caribmag/apache-ant/lib/ant-launcher.jar:/home/1/caribmag/
caribbeanlinks.com/nutch/nutch/src/
plugin/epile:/home/caribmag/apache-ant/lib/ant-antlr.jar:/home/
caribmag/apache-ant/lib/ant-apache-bcel.jar:/home/caribmag/apache-
ant/lib/ant-apache-bsf.jar:/home/caribmag/apache-ant/lib/ant-apache-
log4j.jar:/home/caribmag/apache-ant/lib/ant-apache-oro.jar:/home/
caribmag/apache-ant/lib/ant-apache-regexp.jar:/home/caribmag/apache-
ant/lib/ant-apache-resolver.jar:/home/caribmag/apache-ant/lib/ant-
commons-logging.jar:/home/caribmag/apache-ant/lib/ant-commons-
net.jar:/home/caribmag/apache-ant/lib/ant-icontract.jar:/home/
caribmag/apache-ant/lib/ant-jai.jar:/home/caribmag/apache-ant/lib/
ant-javamail.jar:/home/caribmag/apache-ant/lib/ant-jdepend.jar:/
home/caribmag/apache-ant/lib/ant-jmf.jar:/home/caribmag/apache-ant/
lib/ant-jsch.jar:/home/caribmag/apache-ant/lib/ant-junit.jar:/home/
caribmag/apache-ant/lib/ant-netrexx.jar:/home/caribmag/apache-ant/
lib/ant-nodeps.jar:/home/caribmag/apache-ant/lib/ant-starteam.jar:/
home/caribmag/apache-ant/lib/ant-stylebook.jar:/home/caribmag/
apache-ant/lib/ant-swing.jar:/home/caribmag/apache-ant/lib/
ant-trax.jar:/home/caribmag/
apache-ant/lib/ant-vaj.jar:/home/caribmag/apache-ant/lib/ant-
weblogic.jar:/home/caribmag/apache-ant/lib/ant-xalan1.jar:/home/
caribmag/apache-ant/lib/ant-xslp.jar:/home/caribmag/apache-ant/lib/
ant.jar:/home/caribmag/apache-ant/lib/xercesImpl.jar:/home/caribmag/
apache-ant/lib/xml-apis.jar:/home/1/caribmag/j2sdk1.4.2_10/lib/
tools.jar'
[javac] '-sourcepath'
[javac] '/home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java'
[javac] '-encoding'
[javac] 'ISO-8859-1'
[javac] '-g'
[javac]
[javac] The ' characters around the executable and arguments are
[javac] not part of the command.
[javac] File to be compiled:
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:3: package epile.crawl.util does not exist
[javac] import epile.crawl.util.StringURL;
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:4: package epile.util does not exist
[javac] import epile.util.LogLevel;
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:5: package org.apache.nutch.util does not
exist
[javac] import org.apache.nutch.util.NutchConf;
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:6: package org.apache.nutch.plugin does not
exist
[javac] import org.apache.nutch.plugin.Extension;
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:7: package org.apache.nutch.plugin does not
exist
[javac] import org.apache.nutch.plugin.PluginRepository;
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:8: package org.apache.nutch.net does not exist
[javac] import org.apache.nutch.net.URLFilter;
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:9: package org.apache.nutch.fs does not exist
[javac] import org.apache.nutch.fs.*;
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:10: package org.apache.nutch.io does not exist
[javac] import org.apache.nutch.io.*;
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:27: cannot resolve symbol
[javac] symbol : class URLFilter
[javac] location: class epile.crawl.plugin.WhitelistURLFilter
[javac] public class WhitelistURLFilter implements URLFilter {
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:59: cannot resolve symbol
[javac] symbol : class NutchFileSystem
[javac] location: class epile.crawl.plugin.WhitelistURLFilter
[javac] static private NutchFileSystem nfs;
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:60: package MapFile does not exist
[javac] static private MapFile.Reader whitelistMap;
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:29: cannot resolve symbol
[javac] symbol : variable LogLevel
[javac] location: class epile.crawl.plugin.WhitelistURLFilter
[javac] private static final Logger LOG = LogLevel.get
(WhitelistURLFilter.class.getName());
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:39: cannot resolve symbol
[javac] symbol : class Extension
[javac] location: class epile.crawl.plugin.WhitelistURLFilter
[javac] Extension[] extensions =
PluginRepository.getInstance().getExtensionPoint
(URLFilter.class.getName()).getExtentens();
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:39: cannot resolve symbol
[javac] symbol : class URLFilter
[javac] location: class epile.crawl.plugin.WhitelistURLFilter
[javac] Extension[] extensions =
PluginRepository.getInstance().getExtensionPoint
(URLFilter.class.getName()).getExtentens();
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:39: cannot resolve symbol
[javac] symbol : variable PluginRepository
[javac] location: class epile.crawl.plugin.WhitelistURLFilter
[javac] Extension[] extensions =
PluginRepository.getInstance().getExtensionPoint
(URLFilter.class.getName()).getExtentens();
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:42: cannot resolve symbol
[javac] symbol : class Extension
[javac] location: class epile.crawl.plugin.WhitelistURLFilter
[javac] Extension extension = extensions[i];
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:69: cannot resolve symbol
[javac] symbol : class NutchConf
[javac] location: class epile.crawl.plugin.WhitelistURLFilter
[javac] NutchConf nutchConf = NutchConf.get();
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:69: cannot resolve symbol
[javac] symbol : variable NutchConf
[javac] location: class epile.crawl.plugin.WhitelistURLFilter
[javac] NutchConf nutchConf = NutchConf.get();
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:82: cannot resolve symbol
[javac] symbol : class NutchConf
[javac] location: class epile.crawl.plugin.WhitelistURLFilter
[javac] NutchConf nutchConf = NutchConf.get();
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:82: cannot resolve symbol
[javac] symbol : variable NutchConf
[javac] location: class epile.crawl.plugin.WhitelistURLFilter
[javac] NutchConf nutchConf = NutchConf.get();
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:90: cannot resolve symbol
[javac] symbol : class LocalFileSystem
[javac] location: class epile.crawl.plugin.WhitelistURLFilter
[javac] this.nfs = new LocalFileSystem();
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:92: package MapFile does not exist
[javac] whitelistMap = new MapFile.Reader(this.nfs,
mapFileDir);
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:184: cannot resolve symbol
[javac] symbol : variable StringURL
[javac] location: class epile.crawl.plugin.WhitelistURLFilter
[javac] String hostname = StringURL.extractHostname(url);
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:187: cannot resolve symbol
[javac] symbol : class UTF8
[javac] location: class epile.crawl.plugin.WhitelistURLFilter
[javac] UTF8 value = new UTF8();
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:187: cannot resolve symbol
[javac] symbol : class UTF8
[javac] location: class epile.crawl.plugin.WhitelistURLFilter
[javac] UTF8 value = new UTF8();
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:190: cannot resolve symbol
[javac] symbol : class UTF8
[javac] location: class epile.crawl.plugin.WhitelistURLFilter
[javac] UTF8 entry = (UTF8) whitelistMap.get(new UTF8
(hostname), value);
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:190: cannot resolve symbol
[javac] symbol : class UTF8
[javac] location: class epile.crawl.plugin.WhitelistURLFilter
[javac] UTF8 entry = (UTF8) whitelistMap.get(new UTF8
(hostname), value);
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:190: cannot resolve symbol
[javac] symbol : class UTF8
[javac] location: class epile.crawl.plugin.WhitelistURLFilter
[javac] UTF8 entry = (UTF8) whitelistMap.get(new UTF8
(hostname), value);
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:191: cannot resolve symbol
[javac] symbol : variable StringURL
[javac] location: class epile.crawl.plugin.WhitelistURLFilter
[javac] String strippedURL = StringURL.removeHostname(url);
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:198: cannot resolve symbol
[javac] symbol : variable StringURL
[javac] location: class epile.crawl.plugin.WhitelistURLFilter
[javac] String domain =
StringURL.extractDomainFromHostname(hostname);
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:201: cannot resolve symbol
[javac] symbol : class UTF8
[javac] location: class epile.crawl.plugin.WhitelistURLFilter
[javac] entry = (UTF8) whitelistMap.get(new UTF8
(domain), value);
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:201: cannot resolve symbol
[javac] symbol : class UTF8
[javac] location: class epile.crawl.plugin.WhitelistURLFilter
[javac] entry = (UTF8) whitelistMap.get(new UTF8
(domain), value);
[javac] ^
[javac] /home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/
plugin/epile/src/java/crawl/plugin/whitelisturlfilter/
WhitelistURLFilter.java:215: cannot resolve symbol
[javac] symbol : variable StringURL
[javac] location: class epile.crawl.plugin.WhitelistURLFilter
[javac] if (StringURL.isCGI(url))
[javac] ^
[javac] 33 errors
BUILD FAILED
/home/1/caribmag/caribbeanlinks.com/nutch/nutch/src/plugin/build-
plugin.xml:85: Compile failed; see the compiler error output for
details.
at org.apache.tools.ant.taskdefs.Javac.compile(Javac.java:933)
at org.apache.tools.ant.taskdefs.Javac.execute(Javac.java:757)
at org.apache.tools.ant.UnknownElement.execute
(UnknownElement.java:275)
at org.apache.tools.ant.Task.perform(Task.java:364)
at org.apache.tools.ant.Target.execute(Target.java:341)
at org.apache.tools.ant.Target.performTasks(Target.java:369)
at org.apache.tools.ant.Project.executeSortedTargets
(Project.java:1216)
at org.apache.tools.ant.Project.executeTarget(Project.java:
1185)
at
org.apache.tools.ant.helper.DefaultExecutor.executeTargets
(DefaultExecutor.java:40)
at org.apache.tools.ant.Project.executeTargets(Project.java:
1068)
at org.apache.tools.ant.Main.runBuild(Main.java:668)
at org.apache.tools.ant.Main.startAnt(Main.java:187)
at org.apache.tools.ant.launch.Launcher.run(Launcher.java:246)
at org.apache.tools.ant.launch.Launcher.main(Launcher.java:67)
Total time: 13 seconds
[caribmag]$
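
One thing I can read from the log: javac reports the class as
epile.crawl.plugin.WhitelistURLFilter, while the file sits under
crawl/plugin/whitelisturlfilter/ relative to src/java. By the usual
Java convention the source directory mirrors the package name. The
sketch below only illustrates that convention. It is not part of Nutch
or Ant, the class PackagePathCheck and its methods are made up for this
example, and I do not know whether this mismatch is related to the
errors above.

```java
// Hypothetical helper, not a Nutch or Ant class: compares a declared
// package with the directory a source file actually lives in.
class PackagePathCheck {

    // Directory (relative to the source root) that conventionally
    // holds the sources of the given package.
    static String expectedDir(String packageName) {
        return packageName.replace('.', '/');
    }

    // Directory part of a '/'-separated relative file path.
    static String dirOf(String relativeFile) {
        int slash = relativeFile.lastIndexOf('/');
        return slash < 0 ? "" : relativeFile.substring(0, slash);
    }

    // True when the file's directory mirrors the package name.
    static boolean matches(String packageName, String relativeFile) {
        return dirOf(relativeFile).equals(expectedDir(packageName));
    }

    public static void main(String[] args) {
        // Both values are taken from the build log above.
        String declared = "epile.crawl.plugin";
        String file = "crawl/plugin/whitelisturlfilter/WhitelistURLFilter.java";
        System.out.println("expected dir: " + expectedDir(declared));
        System.out.println("actual dir:   " + dirOf(file));
        System.out.println("matches:      " + matches(declared, file));
    }
}
```

With the two values from the log, the expected and actual directories
differ, so under that convention the file would belong in
src/java/epile/crawl/plugin/.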
At 01:51 PM 1/3/2006, you wrote:
Nutch-87 Setup
I am looking to create a vertical/regional search application and
the Nutch-87 plugin sounds perfect for what I want to do.
However, this is all VERY new to me (java, ant, tomcat, nutch etc.),
but I was able to hack my way through the installation and have a
working copy of Nutch running.
I am having problems trying to install and build the plugin. I
have read the docs but it's not totally clear what the steps are to
add a new plugin into nutch.
Can anyone give me any pointers as to what's happening here? Please
bear in mind I am a nutch newbie.
Here are the steps I have taken:
1.) I downloaded the oc-0[1].3.2.zip file.
2.) FTP'd the zip to the server
3.) unzipped in: "/caribbeanlinks.com/nutch/nutch/src/plugin/"
4.) Created the "/epile/src/java" folder, placed the "/crawl/
plugin/whitelisturlfilter" directory in it and added
WhitelistURLFilter.java, giving:
/caribbeanlinks.com/nutch/nutch/src/plugin/epile/src/java/crawl/
plugin/whitelisturlfilter
5.) Created the build.xml and plugin.xml files in "/
caribbeanlinks.com/nutch/nutch/src/plugin/epile" (see examples
below)
6.) ran "ant"
<snip>
build.xml
---------------------------
<?xml version="1.0"?>
<project name="WhitelistURLFilter" default="jar">
  <import file="../build-plugin.xml"/>
</project>
plugin.xml
---------------------------
<?xml version="1.0" encoding="UTF-8"?>
<plugin
   id="epile-whitelisturlfilter"
   name="Epile whitelist URL filter"
   version="1.0.0"
   provider-name="teamgigabyte.com">
   <extension-point
      id="org.apache.nutch.net.URLFilter"
      name="Nutch URL Filter"/>
   <runtime></runtime>
   <extension id="org.apache.nutch.net.urlfiler"
      name="Epile Whitelist URL Filter"
      point="org.apache.nutch.net.URLFilter">
      <implementation id="WhitelistURLFilter"
         class="epile.crawl.plugin.WhitelistURLFilter"/>
   </extension>
</plugin>