-filter is just a binary flag, right?

How do I specify the actual pattern file then?


On Sat, Nov 3, 2012 at 4:16 AM, Lewis John Mcgibbney <
[email protected]> wrote:

> Hi,
>
> Markus was referring to the -filter flag you can add to your solrindex
> command. Please take a look at the relevant wiki entry [0]
>
> You should be able to point this to a specific regex or automaton
> urlfilter file and achieve what you want... hopefully without dabbling
> in Java and indexing filters.
>
> hth
>
> Lewis
>
> [0] http://wiki.apache.org/nutch/bin/nutch%20solrindex
>
> On Sat, Nov 3, 2012 at 3:57 AM, Joe Zhang <[email protected]> wrote:
> > Markus gave me a little hint, but he's not available today, and this is an
> > urgent issue.
> >
> > The question is simple (nutch 1.5.1 and solr 3.6.1 working together):
> >
> > - The URL patterns in regex-urlfilter.txt control the behavior of crawling,
> > i.e., which pages to visit (or not to visit)
> > - What I need to do is to specify **which pages are to be indexed by solr**
> > (this is a subset of the pages visited) --> I wonder whether there is a
> > place to specify such URL patterns.
>
>
>
> --
> Lewis
>
