How exactly should I write it? This is what it looks like now:

<property>
  <name>plugin.includes</name>
  <value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
  <description>Regular expression naming plugin directory names to
  include.  Any plugin not matching this expression is excluded.
  In any case you need at least include the nutch-extensionpoints plugin. By
  default Nutch includes crawling just HTML and plain text via HTTP,
  and basic indexing and search plugins. In order to use HTTPS please enable
  protocol-httpclient, but be aware of possible intermittent problems with the
  underlying commons-httpclient library.
  </description>
</property>
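
For reference, a minimal sketch of how the override mentioned in the quoted reply below could look in nutch-site.xml. It assumes the plugin list stays the same as above and that urlfilter-regex is the filter meant to be picked up; the conf/ location and the comments are illustrative and not taken from this thread:

```xml
<?xml version="1.0"?>
<!-- Sketch of conf/nutch-site.xml: properties set here override nutch-default.xml. -->
<configuration>
  <property>
    <name>plugin.includes</name>
    <!-- Assumption: same plugin list as above; urlfilter-regex is the URL filter that gets loaded. -->
    <value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
  </property>
</configuration>
```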


On Sun, Nov 4, 2012 at 2:07 PM, Lewis John Mcgibbney <[email protected]> wrote:

> It is in your nutch-default.xml, which should be overridden in
> nutch-site.xml before compiling if you are using the source distribution.
>
>
>
> On Sun, Nov 4, 2012 at 6:10 PM, Joe Zhang <[email protected]> wrote:
> > Where is the plugin.includes property?
> >
> > On Sun, Nov 4, 2012 at 4:53 AM, Lewis John Mcgibbney <[email protected]> wrote:
> >
> >> Please ensure you have the correct spacing and string formatting when
> >> executing command line tasks.
> >>
> >> You don't seem to have a space between your crawldb directory and the
> >> Solr core. It is my understanding that the -filter option does not
> >> take a parameter; it picks up your urlfilter as specified in the
> >> plugin.includes property.
> >>
> >> On Sun, Nov 4, 2012 at 6:10 AM, Joe Zhang <[email protected]> wrote:
> >>
> >> Lewis
> >>
>
>
>
> --
> Lewis
>
