Whoa! If you want to include all URLs, don't change the rule to +, as that will make every URL containing ?&= get fetched, ignoring all of your other filters;
just comment the line out instead.

-----Original Message-----
From: Vertical Search [mailto:[EMAIL PROTECTED]]
Sent: Friday, March 10, 2006 8:27 AM
To: nutch-user
Subject: Re: URL containing "?", "&" and "="

Mark, I did follow your advice. I modified the following line in crawl-urlfilter.txt, but it made no difference. Should I escape the characters in the urls folder? Thanks

On 3/9/06, Vertical Search <[EMAIL PROTECTED]> wrote:
>
> Okay, I have noticed that I cannot crawl URLs containing "?", "&" and
> "=". I have tried all combinations of modifying crawl-urlfilter.txt,
> including the rule under "# skip URLs containing certain characters as
> probable queries, etc.":
> [EMAIL PROTECTED]
>
> But in vain. I have hit a roadblock, and that is terrible. :(

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
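The advice above can be sketched as follows. This is a toy Python model, not Nutch code (the function names and the www.example.com site filter are mine), of how the regex URL filter evaluates crawl-urlfilter.txt rules top-down with first-match-wins: flipping the query-character rule from `-` to `+` accepts every query URL before any later rule is consulted, while commenting it out lets the remaining rules decide.

```python
import re

def parse_rules(text):
    """Parse crawl-urlfilter.txt-style lines into (sign, pattern)
    pairs, skipping comments and blank lines."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        rules.append((line[0], line[1:]))
    return rules

def accepts(url, rules):
    """The first rule whose pattern matches decides: '+' accepts,
    '-' rejects. A URL matching no rule at all is rejected."""
    for sign, pattern in rules:
        if re.search(pattern, url):
            return sign == '+'
    return False

# Flipping '-' to '+' on the query-character rule accepts EVERY query
# URL before the site filter below it is even consulted:
flipped = parse_rules(r"""
+[?*!@=]
+^http://www\.example\.com/
""")
assert accepts('http://elsewhere.com/page?id=1', flipped)

# Commenting the rule out instead lets the site filter decide:
commented = parse_rules(r"""
# -[?*!@=]
+^http://www\.example\.com/
""")
assert accepts('http://www.example.com/page?id=1', commented)
assert not accepts('http://elsewhere.com/page?id=1', commented)
```

With the rule commented out, query URLs from your own site pass through while off-site query URLs are still rejected, which is the behavior the original poster was after.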
