Yes, I did comment out the -[?*!@=] line in crawl-urlfilter.txt as Mark
suggested, but it still did not fetch the URLs. Is this the only change
needed, or should I also escape the characters in the URL list file?
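
Concretely, the section of crawl-urlfilter.txt I changed now reads roughly
like this (a sketch based on the default filter file; exact patterns may
differ between Nutch versions):

  # skip URLs containing certain characters as probable queries, etc.
  # -[?*!@=]

and the urls folder just holds plain seed URLs, one per line, e.g. a
hypothetical http://www.example.com/search?q=foo&page=2 with no escaping.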

Thanks



On 3/10/06, Richard Braman <[EMAIL PROTECTED]> wrote:
>
> Whoa!
>
> If you want to include all URLs, don't change the rule to a "+", because
> that will make every URL containing ?, & or = get fetched, ignoring all of
> your other filters.
>
> Just comment the line out.
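>
> In crawl-urlfilter.txt the rules are applied top to bottom and the first
> matching pattern decides, so the two alternatives behave very differently.
> With the skip rule commented out:
>
>   # -[?*!@=]
>
> URLs containing ?, & or = simply fall through to your later rules (the
> domain accept and the final "-."). With a "+[?*!@=]" rule instead, every
> such URL is accepted as soon as it matches, before your other rules are
> checked.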
>
> -----Original Message-----
> From: Vertical Search [mailto:[EMAIL PROTECTED]]
> Sent: Friday, March 10, 2006 8:27 AM
> To: nutch-user
> Subject: Re: URL containing "?", "&" and "="
>
>
> Mark,
> I did follow your advice and modified the line in crawl-urlfilter.txt,
> but it made no difference. Should I escape the characters in the urls
> folder?
>
> Thanks
>
>
>
> On 3/9/06, Vertical Search <[EMAIL PROTECTED]> wrote:
> >
> > Okay, I have noticed that I cannot crawl URLs containing "?", "&" and
> > "=". I have tried all combinations of modifying crawl-urlfilter.txt,
> > including the rule under "# skip URLs containing certain characters as
> > probable queries, etc.":
> >
> > -[?*!@=]
> >
> > But in vain. I have hit a road block, and that is terrible. :(
> >
> >
> >
>
>
