Is it just me, or can nobody else see the images attached inline by Arcondo?


On Fri, Jan 4, 2013 at 1:18 AM, Tejas Patil <[email protected]> wrote:

>
>
>
> On Thu, Jan 3, 2013 at 10:38 PM, Arcondo Dasilva <
> [email protected]> wrote:
>
>> Hi Lewis,
>>
>> Thanks for your feedback. I went through the process step by step and I'm
>> still getting the error :
>>
>> my plugins folder looks like this :
>>
>> [image: Inline image 1]
>>
>> When I ran the parse job it gave me this :
>>
>> [image: Inline image 2]
>>
>> when I look at the log file, I get this :
>>
>> [image: Inline image 3]
>>
>> My nutch-site.xml contains this :
>>
>> <property>
>>   <name>plugin.includes</name>
>>   <value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|urlnormalizer-(pass|regex|basic)|scoring-opic</value>
>>   <description>Regular expression naming plugin directory names to
>>   include.  Any plugin not matching this expression is excluded.
>>   In any case you need at least include the nutch-extensionpoints plugin. By
>>   default Nutch includes crawling just HTML and plain text via HTTP,
>>   and basic indexing and search plugins. In order to use HTTPS please enable
>>   protocol-httpclient, but be aware of possible intermittent problems with the
>>   underlying commons-httpclient library.
>>   </description>
>> </property>
>>
>>
>> am I missing something else ?
>>
>> Thanks for your precious help.
>>
>> Arcondo.
>>
>>
>>
>> On Thu, Jan 3, 2013 at 11:20 PM, Lewis John Mcgibbney <
>> [email protected]> wrote:
>>
>>> Hi Arcondo,
>>>
>>> The nekohtml jar should be version 0.9.5, and should reside in
>>> build/plugins/lib-nekohtml once you build Nutch from source.
>>> Once you use the default 'runtime' target, the corresponding plugins
>>> folders should be copied into runtime/local/plugins.
>>> Can you check that the jar is copied to this directory before attempting
>>> to parse the URLs in your segment(s), if using 1.x?
>>> I'm also assuming that you have parse-html included in the
>>> plugin.includes property within nutch-site.xml before building the source.
>>>
>>> Lewis
>>>
>>> On Thu, Jan 3, 2013 at 9:11 PM, Arcondo Dasilva
>>> <[email protected]> wrote:
>>>
>>> > Thanks for the explanation. I'm more a functional guy with no solid
>>> > background in Java.
>>> > Could you give some details on how to enforce it manually ?
>>> >
>>> > Thanks in advance, Arcondo
>>> >
>>> >
>>> >
>>> > On Thu, Jan 3, 2013 at 2:49 PM, Lewis John Mcgibbney <
>>> > [email protected]> wrote:
>>> >
>>> > > the jar is not on the classpath
>>> >
>>>
>>>
>>>
>>> --
>>> *Lewis*
>>>
>>
>>
>
