On Thu, Jan 3, 2013 at 10:38 PM, Arcondo Dasilva
<[email protected]> wrote:

> Hi Lewis,
>
> Thanks for your feedback. I went through the process step by step and I'm
> still getting the error :
>
> my plugins folder looks like this :
>
> [image: Inline image 1]
>
> When I ran the parse job it gave me this :
>
> [image: Inline image 2]
>
> when I look at the log file, I get this :
>
> [image: Inline image 3]
>
> My nutch-site.xml contains this :
>
> <property>
>   <name>plugin.includes</name>
>   <value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|urlnormalizer-(pass|regex|basic)|scoring-opic</value>
>   <description>Regular expression naming plugin directory names to
>   include.  Any plugin not matching this expression is excluded.
>   In any case you need at least include the nutch-extensionpoints plugin.
>   By default Nutch includes crawling just HTML and plain text via HTTP,
>   and basic indexing and search plugins. In order to use HTTPS please
>   enable protocol-httpclient, but be aware of possible intermittent
>   problems with the underlying commons-httpclient library.
>   </description>
> </property>
>
>
> Am I missing something else?
>
> Thanks for your precious help.
>
> Arcondo.
>
>
>
> On Thu, Jan 3, 2013 at 11:20 PM, Lewis John Mcgibbney <
> [email protected]> wrote:
>
>> Hi Arcondo,
>>
>> The nekohtml jar should be version 0.9.5, and should reside in
>> build/plugins/lib-nekohtml once you build Nutch from source.
>> Once you use the default 'runtime' target, the corresponding plugins
>> folders should be copied into runtime/local/plugins.
>> Can you check that the jar is copied to this directory before attempting
>> to parse the URLs in your segment(s), if using 1.x?
>> I'm also assuming that you have parse-html included in the plugin.includes
>> property within nutch-site.xml before building the source.
>>
>> Lewis
>>
>> On Thu, Jan 3, 2013 at 9:11 PM, Arcondo Dasilva
>> <[email protected]> wrote:
>>
>> > Thanks for the explanation. I'm more a functional guy with no solid
>> > background in Java.
>> > Could you give some details on how to enforce it manually ?
>> >
>> > Thanks in advance, Arcondo
>> >
>> >
>> >
>> > On Thu, Jan 3, 2013 at 2:49 PM, Lewis John Mcgibbney <
>> > [email protected]> wrote:
>> >
>> > > the jar is not on the classpath
>> >
>>
>>
>>
>> --
>> *Lewis*
>>
>
>
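
The check Lewis describes above (is the nekohtml jar actually present in
the plugin folders?) could be scripted roughly as follows. This is only a
sketch: the helper name find_plugin_jars is made up for illustration, the
two directory paths are the ones quoted in the thread and are taken as
relative to the Nutch source root, and the assumption that the jar file
name starts with "nekohtml" is not confirmed anywhere in the thread.

```python
from pathlib import Path


def find_plugin_jars(plugins_dir, plugin="lib-nekohtml", name_prefix="nekohtml"):
    """Return the sorted names of jars in <plugins_dir>/<plugin>.

    Only files ending in .jar whose name starts with name_prefix are listed.
    The default prefix "nekohtml" is an assumption about the jar's file name.
    """
    plugin_dir = Path(plugins_dir) / plugin
    if not plugin_dir.is_dir():
        # The plugin folder was never built or copied.
        return []
    return sorted(
        p.name for p in plugin_dir.glob("*.jar") if p.name.startswith(name_prefix)
    )


if __name__ == "__main__":
    # Run from the Nutch source root after building (paths as quoted in the thread).
    for plugins_dir in ("build/plugins", "runtime/local/plugins"):
        jars = find_plugin_jars(plugins_dir)
        print(plugins_dir, "->", ", ".join(jars) if jars else "no nekohtml jar found")
```

If either directory reports no jar, that would be consistent with the
earlier diagnosis that the jar is not on the classpath.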
