Is it just me, or can nobody else see the images attached inline by Arcondo?

On Fri, Jan 4, 2013 at 1:18 AM, Tejas Patil <[email protected]> wrote:

> On Thu, Jan 3, 2013 at 10:38 PM, Arcondo Dasilva <[email protected]> wrote:
>
>> Hi Lewis,
>>
>> Thanks for your feedback. I went through the process step by step and I'm
>> still getting the error :
>>
>> my plugins folder looks like this :
>>
>> [image: Inline image 1]
>>
>> When I ran the parse job it gave me this :
>>
>> [image: Inline image 2]
>>
>> when I look at the log file, I get this :
>>
>> [image: Inline image 3]
>>
>> My nutch-site.xml contains this :
>>
>> <property>
>>   <name>plugin.includes</name>
>>   <value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|urlnormalizer-(pass|regex|basic)|scoring-opic</value>
>>   <description>Regular expression naming plugin directory names to
>>   include. Any plugin not matching this expression is excluded.
>>   In any case you need at least include the nutch-extensionpoints plugin. By
>>   default Nutch includes crawling just HTML and plain text via HTTP,
>>   and basic indexing and search plugins. In order to use HTTPS please enable
>>   protocol-httpclient, but be aware of possible intermittent problems with the
>>   underlying commons-httpclient library.
>>   </description>
>> </property>
>>
>> am I missing something else ?
>>
>> Thanks for your precious help.
>>
>> Arcondo.
>>
>> On Thu, Jan 3, 2013 at 11:20 PM, Lewis John Mcgibbney <[email protected]> wrote:
>>
>>> Hi Arcondo,
>>>
>>> The nekohtml jar should be version 0.9.5, and should reside in
>>> build/plugins/lib-nekohtml once you build Nutch from source.
>>> Once you use the default 'runtime' target, the corresponding plugins
>>> folders should be copied into runtime/local/plugins
>>> Can you check that the jar is copied to this directory before attempting
>>> to parse the URLs in your segment(s) if using 1.x.
>>> I'm also assuming that you have parse-html included in the
>>> plugin.includes property within nutch-site.xml before building the source.
>>>
>>> Lewis
>>>
>>> On Thu, Jan 3, 2013 at 9:11 PM, Arcondo Dasilva <[email protected]> wrote:
>>>
>>> > Thanks for the explanation. I'm more a functional guy with no solid
>>> > background in Java.
>>> > Could you give some details on how to enforce it manually ?
>>> >
>>> > Thanks in advance, Arcondo
>>> >
>>> > On Thu, Jan 3, 2013 at 2:49 PM, Lewis John Mcgibbney <[email protected]> wrote:
>>> >
>>> > > the jar is not on the classpath
>>>
>>> --
>>> *Lewis*
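
For what it's worth, the two checks Lewis asks for in the quoted thread (is the nekohtml jar under runtime/local/plugins/lib-nekohtml, and is parse-html covered by plugin.includes) could be scripted roughly as below. This is only a sketch, not anything shipped with Nutch: the helper names find_jars and is_included are made up for illustration, the plugin.includes value is copied from Arcondo's nutch-site.xml, and it assumes the expression has to match the whole plugin directory name.

```python
import re
from pathlib import Path

# Value copied from the nutch-site.xml quoted in this thread.
PLUGIN_INCLUDES = (
    "protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)"
    "|urlnormalizer-(pass|regex|basic)|scoring-opic"
)


def is_included(plugin_name, includes=PLUGIN_INCLUDES):
    """Hypothetical helper: True if plugin_name is covered by the expression.

    Assumption: the expression must match the whole plugin directory name.
    """
    return re.fullmatch(includes, plugin_name) is not None


def find_jars(plugins_dir, plugin_name="lib-nekohtml"):
    """Hypothetical helper: sorted jar file names in <plugins_dir>/<plugin_name>.

    Returns an empty list if the plugin folder does not exist.
    """
    plugin_dir = Path(plugins_dir) / plugin_name
    if not plugin_dir.is_dir():
        return []
    return sorted(p.name for p in plugin_dir.glob("*.jar"))


if __name__ == "__main__":
    # Run from the Nutch source root after building the 'runtime' target.
    print("parse-html included:", is_included("parse-html"))
    print("lib-nekohtml jars:", find_jars("runtime/local/plugins"))
```

If the second line prints an empty list, the jar was not copied into runtime/local/plugins, which would fit the "jar is not on the classpath" diagnosis earlier in the thread.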

