Which version of Nutch is this? Did you follow the tutorial? I can help you if
you provide all the steps you took, starting with downloading Nutch.

Alex.

-----Original Message-----
From: Arcondo Dasilva <arcondo.dasi...@gmail.com>
To: user <user@nutch.apache.org>
Sent: Fri, Jan 4, 2013 1:23 pm
Subject: Re: Native Hadoop library not loaded and Cannot parse sites contents


Hi Alex,

I tried that. It was the first thing I did, but without success.
I don't understand why I have to use Neko instead of Tika. As far as I
know, Tika can parse more than 1200 different formats.

Kr, Arcondo


On Fri, Jan 4, 2013 at 7:47 PM, <alx...@aim.com> wrote:

> Move or copy that jar file to local/lib and try again.
>
> hth.
> Alex.
>
> -----Original Message-----
> From: Arcondo <arcondo.dasi...@gmail.com>
> To: user <user@nutch.apache.org>
> Sent: Fri, Jan 4, 2013 2:55 am
> Subject: Re: Native Hadoop library not loaded and Cannot parse sites
> contents
>
>
> I hope you can see them now.
>
> Plugin folder
> <http://lucene.472066.n3.nabble.com/file/n4030524/plugin_folder.png>
>
> Parse Job
>
> <http://lucene.472066.n3.nabble.com/file/n4030524/parse_job.png>
>
> Parse error : Hadoop.log
>
> <http://lucene.472066.n3.nabble.com/file/n4030524/parse_error.png>
>
> My nutch-site.xml (plugin includes)
>
> <property>
>   <name>plugin.includes</name>
>   <value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|urlnormalizer-(pass|regex|basic)|scoring-opic</value>
>   <description>Regular expression naming plugin directory names to
>   include. Any plugin not matching this expression is excluded.
>   In any case you need at least include the nutch-extensionpoints plugin.
>   By default Nutch includes crawling just HTML and plain text via HTTP,
>   and basic indexing and search plugins. In order to use HTTPS please
>   enable protocol-httpclient, but be aware of possible intermittent
>   problems with the underlying commons-httpclient library.
>   </description>
> </property>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Native-Hadoop-library-not-loaded-and-Cannot-parse-sites-contents-tp4029542p4030524.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
>
>
>

 
