Gabriele,

The pom.xml currently shipped with 1.3 is inaccurate and declares the wrong
dependencies and versions. See
https://issues.apache.org/jira/browse/NUTCH-995 for an attempt to
generate it automatically.
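
For reference, the warnings quoted below indicate that some <exclusion>
entries in the shipped pom.xml have no <artifactId>; Maven expects both a
groupId and an artifactId in every exclusion. A minimal sketch of a
well-formed entry follows. The excluded coordinates
(com.example:unwanted-artifact) and TIKA_VERSION are hypothetical
placeholders, not values taken from the real pom.xml:

```xml
<!-- Sketch only: the excluded coordinates and the version are placeholders. -->
<dependency>
  <groupId>org.apache.tika</groupId>
  <artifactId>tika-parsers</artifactId>
  <!-- Placeholder: substitute the Tika version Nutch 1.3 actually needs. -->
  <version>TIKA_VERSION</version>
  <exclusions>
    <exclusion>
      <!-- Both elements are required; omitting artifactId triggers the warning. -->
      <groupId>com.example</groupId>
      <artifactId>unwanted-artifact</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

The same shape would apply to the log4j and gora-sql entries flagged in the
warnings.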

Julien

On 24 May 2011 09:05, Gabriele Kahlout <[email protected]> wrote:

> Hello,
>
> I'm struggling to install nutch-1.3 into my local maven repo. How do you do
> it?
>
> I'm writing a Nutch plugin (actually a Boilerpipe integration test) as a
> maven project (module of other projects). The challenges are:
> 1. Getting it to compile with a dependency on nutch
> 2. Getting the tests to run.
>
> For 1:
> $ mvn install:install-file -Dfile=build/nutch-1.3.jar -DpomFile=pom.xml
> works but prints:
> [WARNING] 'dependencies.dependency.exclusions.exclusion.artifactId' for
> org.apache.tika:tika-parsers:jar is missing. @ line 173, column 22
> [WARNING] 'dependencies.dependency.exclusions.exclusion.artifactId' for
> org.apache.tika:tika-parsers:jar is missing. @ line 176, column 22
> [WARNING] 'dependencies.dependency.exclusions.exclusion.artifactId' for
> org.apache.tika:tika-parsers:jar is missing. @ line 179, column 22
> [WARNING] 'dependencies.dependency.exclusions.exclusion.artifactId' for
> log4j:log4j:jar is missing. @ line 189, column 22
> [WARNING] 'dependencies.dependency.exclusions.exclusion.artifactId' for
> log4j:log4j:jar is missing. @ line 192, column 22
> [WARNING] 'dependencies.dependency.exclusions.exclusion.artifactId' for
> log4j:log4j:jar is missing. @ line 195, column 22
> [WARNING] 'dependencies.dependency.exclusions.exclusion.artifactId' for
> org.apache.gora:gora-sql:jar is missing. @ line 292, column 22
> [WARNING] 'dependencies.dependency.exclusions.exclusion.artifactId' for
> org.apache.gora:gora-sql:jar is missing. @ line 295, column 22
> [WARNING] 'dependencies.dependency.exclusions.exclusion.artifactId' for
> org.apache.gora:gora-sql:jar is missing. @ line 298, column 22
> [WARNING]
> [WARNING] It is highly recommended to fix these problems because they
> threaten the stability of your build.
> [WARNING]
> [WARNING] For this reason, future Maven versions might no longer support
> building such malformed projects.
> [WARNING]
>
> Those warnings turn out to be important: building the dummy plugin below
> fails with: The POM for
> org.apache.gora:gora-sql:jar:0.1-incubating is missing, no dependency
> information available
>
>
> import java.util.logging.Level;
> import java.util.logging.Logger;
> import org.apache.nutch.crawl.CrawlDatum;
> import org.apache.nutch.crawl.Inlinks;
> import org.apache.nutch.indexer.IndexingException;
> import org.apache.nutch.indexer.NutchDocument;
> import org.apache.nutch.parse.Parse;
> import org.apache.nutch.indexer.IndexingFilter;
> /**
>  *
>  * @author simpatico
>  */
> public class BPIntegrationTest implements IndexingFilter {
>
>    @Override
>    public NutchDocument filter(NutchDocument nd, Parse parse,
>            org.apache.hadoop.io.Text text, CrawlDatum cd, Inlinks inlnks)
>            throws IndexingException {
>        // Log each document passing through the filter, then return it unchanged.
>        Logger.getLogger(getClass().getName()).log(Level.SEVERE,
>                "intercepted parsing of " + text);
>        return nd;
>    }
> }
>
> I went through http://wiki.apache.org/nutch/WritingPluginExample.
>
>
>
> --
> Regards,
> K. Gabriele
>
> --- unchanged since 20/9/10 ---
> P.S. If the subject contains "[LON]" or the addressee acknowledges the
> receipt within 48 hours then I don't resend the email.
> subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧
> time(x) < Now + 48h) ⇒ ¬resend(I, this).
>
> If an email is sent by a sender that is not a trusted contact or the email
> does not contain a valid code then the email is not received. A valid code
> starts with a hyphen and ends with "X".
> ∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧ y ∈
> L(-[a-z]+[0-9]X)).
>



-- 
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
