On Thu, Jan 3, 2013 at 10:38 PM, Arcondo Dasilva <[email protected]> wrote:
> Hi Lewis,
>
> Thanks for your feedback. I went through the process step by step and I'm
> still getting the error :
>
> my plugins folder looks like this :
>
> [image: Inline image 1]
>
> When I ran the parse job it gave me this :
>
> [image: Inline image 2]
>
> when I look at the log file, I get this :
>
> [image: Inline image 3]
>
> My nutch-site.xml contains this :
>
> <property>
>   <name>plugin.includes</name>
>   <value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|urlnormalizer-(pass|regex|basic)|scoring-opic</value>
>   <description>Regular expression naming plugin directory names to
>   include. Any plugin not matching this expression is excluded.
>   In any case you need at least include the nutch-extensionpoints plugin. By
>   default Nutch includes crawling just HTML and plain text via HTTP,
>   and basic indexing and search plugins. In order to use HTTPS please enable
>   protocol-httpclient, but be aware of possible intermittent problems with the
>   underlying commons-httpclient library.
>   </description>
> </property>
>
> am I missing something else ?
>
> Thanks for your precious help.
>
> Arcondo.
>
> On Thu, Jan 3, 2013 at 11:20 PM, Lewis John Mcgibbney <[email protected]> wrote:
>
>> Hi Arcondo,
>>
>> The nekohtml jar should be version 0.9.5, and should reside in
>> build/plugins/lib-nekohtml once you build Nutch from source.
>> Once you use the default 'runtime' target, the corresponding plugins
>> folders should be copied into runtime/local/plugins
>> Can you check that the jar is copied to this directory before attempting to
>> parse the URLs in your segment(s) if using 1.x.
>> I'm also assuming that you have parse-html included in the plugin.includes
>> property within nutch-site.xml before building the source.
>>
>> Lewis
>>
>> On Thu, Jan 3, 2013 at 9:11 PM, Arcondo Dasilva <[email protected]> wrote:
>>
>> > Thanks for the explanation. I'm more a functional guy with no solid
>> > background in Java.
>> > Could you give some details on how to enforce it manually ?
>> >
>> > Thanks in advance, Arcondo
>> >
>> > On Thu, Jan 3, 2013 at 2:49 PM, Lewis John Mcgibbney <[email protected]> wrote:
>> >
>> > > the jar is not on the classpath
>>
>> --
>> *Lewis*
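
The two checks suggested in the quoted thread (is a nekohtml jar present under runtime/local/plugins/lib-nekohtml, and does plugin.includes cover parse-html) can be sketched roughly as below. This is only an illustration, not part of Nutch: the helper names `plugin_enabled` and `find_jars` are made up, matching the expression against the whole plugin name is an assumption about how Nutch applies plugin.includes, and the relative path assumes the script is run from the Nutch source root after a build.

```python
import re
from pathlib import Path

# plugin.includes value copied from the nutch-site.xml quoted above.
PLUGIN_INCLUDES = (
    "protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)"
    "|urlnormalizer-(pass|regex|basic)|scoring-opic"
)


def plugin_enabled(plugin_id, includes=PLUGIN_INCLUDES):
    """Return True if the whole plugin name matches the expression.

    Assumption: the expression is matched against the full plugin
    directory name, not searched for as a substring.
    """
    return re.fullmatch(includes, plugin_id) is not None


def find_jars(plugins_dir, plugin_name="lib-nekohtml"):
    """List the jar file names found in <plugins_dir>/<plugin_name>."""
    plugin_dir = Path(plugins_dir) / plugin_name
    if not plugin_dir.is_dir():
        return []
    return sorted(p.name for p in plugin_dir.glob("*.jar"))


if __name__ == "__main__":
    # Hypothetical location: run from the Nutch source root after building.
    print("parse-html enabled:", plugin_enabled("parse-html"))
    print("jars in lib-nekohtml:", find_jars("runtime/local/plugins"))
```

If the second line prints an empty list, the jar was not copied into the runtime plugins folder, which would be consistent with the "jar is not on the classpath" diagnosis earlier in the thread.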

