Yes, I did comment out the [EMAIL PROTECTED] line in crawl-urlfilter.txt as Mark suggested, but it still did not fetch the URLs. Is commenting that line the only change needed, or should I also escape the characters in the urls file list?
Thanks

On 3/10/06, Richard Braman <[EMAIL PROTECTED]> wrote:
>
> Whoa!
>
> If you want to include all URLs, don't use +, as that will make all URLs
> containing ?&= get fetched, ignoring all of your other filters.
>
> Just comment the line out.
>
> -----Original Message-----
> From: Vertical Search [mailto:[EMAIL PROTECTED]
> Sent: Friday, March 10, 2006 8:27 AM
> To: nutch-user
> Subject: Re: URL containing "?", "&" and "="
>
> Mark,
> I did follow your advice. I modified the following line in
> crawl-urlfilter.txt, but it made no difference. Should I escape the
> characters in the urls folder?
>
> Thanks
>
> On 3/9/06, Vertical Search <[EMAIL PROTECTED]> wrote:
> >
> > Okay, I have noticed that I cannot crawl URLs containing "?", "&" and
> > "=". I have tried all combinations of modifying crawl-urlfilter.txt,
> > including the line under "# skip URLs containing certain characters as
> > probable queries, etc.":
> > [EMAIL PROTECTED]
> >
> > But in vain. I have hit a road block, and that is terrible. :(
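For anyone hitting the same wall, here is a minimal sketch of the relevant block in crawl-urlfilter.txt, assuming the stock filter file that ships with Nutch (the exact character class may differ between versions). Following Richard's advice, the skip rule is commented out rather than flipped to "+", because a matching "+" rule accepts the URL immediately and short-circuits any later, more restrictive rules:

    # skip URLs containing certain characters as probable queries, etc.
    # -[?*!@=]
    #
    # the line above is commented out so that query-style URLs
    # (those containing ?, & and =) are no longer rejected; leave the
    # rest of the file (domain include rules, final catch-all) unchanged

As far as I know, there is no need to escape "?", "&" or "=" in the seed URL list; the filter plugins operate on the plain URL string. One other thing worth checking, if the URLs are still not fetched after this change: depending on which tool you run, Nutch may be reading regex-urlfilter.txt instead of crawl-urlfilter.txt (the crawl command normally points at crawl-urlfilter.txt, other tools at regex-urlfilter.txt), so the same line may need to be commented out there as well.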
