Hello,
I think you need to remove (or comment out) the following rule in your
conf/regex-urlfilter.txt; otherwise your seed URLs, which contain '?' and
'=', will be filtered out when they are injected.

# skip URLs containing certain characters as probable queries, etc.
-[?*!@=]
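
To see why that rule drops the seeds, here is a rough Python sketch of how I
understand the filter to behave: rules are tried top to bottom, the first
pattern found anywhere in the URL decides, '-' rejects and '+' accepts. This
is a simplified model and not Nutch's actual code; the rule list, the
accepts() helper and the view number 12345 are made up for illustration.

```python
import re

# Hypothetical, simplified rule list in the style of conf/regex-urlfilter.txt.
RULES = [
    ("-", re.compile(r"[?*!@=]")),  # the rule quoted above
    ("+", re.compile(r".")),        # accept anything else
]


def accepts(url, rules=RULES):
    """Return True if the first rule whose pattern is found in url is a '+' rule."""
    for sign, pattern in rules:
        if pattern.search(url):
            return sign == "+"
    # Assumption: a URL that matches no rule at all is rejected.
    return False


# A seed URL from the question, and a made-up example of a detail page URL.
seed = "http://www.domain.com/classifieds/something/?pg=1"
detail = "http://www.domain.com/classifieds/something/view/12345/"

print(accepts(seed))             # with the '-' rule in place
print(accepts(detail))           # no '?' or '=' in this URL
print(accepts(seed, RULES[1:]))  # same seed once the '-' rule is removed
```

The seed is rejected only because it contains '?' and '='; with that rule
removed it falls through to the accept-all rule.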

Give it a try and let me know if this works.

Thank you,
Sidharth

On Mon, Mar 23, 2015 at 3:58 PM, Adamantios Corais <
[email protected]> wrote:

> I apologize for insisting, but any help would be highly appreciated since I
> am a newbie to Apache Nutch. Thank you!
>
>
> *// Adamantios*
>
>
>
> On Sun, Mar 22, 2015 at 4:35 PM, Adamantios Corais <
> [email protected]> wrote:
>
> > I would like to set up Nutch so that it goes through all
> > http://www.domain.com/classifieds/something/?pg=<page> pages, where <page>
> > goes from 1 to 200, and stores the URLs of the form
> > http://www.domain.com/classifieds/something/view/<number>/, where <number>
> > is a long number. Then I would like to print out all these URLs in my
> > terminal. I am using Apache Nutch 1.9 and Apache Solr 4.10.4.
> >
> >
> > *// Adamantios*
> >
> >
> >
>
