First of all, thank you Richard and Mark. I am able to move forward. Now I have to make sure I don't parse unnecessary URLs on a given page. Typically, sites are organized with a common look and feel, with links looping back to home and so on. I want to ignore URLs that are not relevant to my crawl and only crawl those matching a specific pattern. Can I use the whitelist urlfilter for this purpose? Can someone help me understand how it works? I know how a plugin works in general, but I need to know how this one actually works.
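For context, here is my rough understanding of the contract a urlfilter plugin has to satisfy. This is a standalone sketch, not actual Nutch code: in a real plugin the class would implement `org.apache.nutch.net.URLFilter` and be registered through its plugin.xml, and the class name and example host below are made up.

```java
import java.util.regex.Pattern;

// Standalone sketch of the URL-filter contract. In a real Nutch plugin
// this class would implement org.apache.nutch.net.URLFilter; the
// whitelist pattern and host are assumed examples.
public class PatternWhitelistFilter {

    // Only keep product pages on an assumed example host.
    private static final Pattern ALLOWED =
            Pattern.compile("^http://www\\.example\\.com/products/.*");

    // The filter contract: return the URL (possibly rewritten) to keep
    // it, or return null to tell the crawler to discard it.
    public String filter(String urlString) {
        return ALLOWED.matcher(urlString).matches() ? urlString : null;
    }

    public static void main(String[] args) {
        PatternWhitelistFilter f = new PatternWhitelistFilter();
        System.out.println(f.filter("http://www.example.com/products/item1.html"));
        System.out.println(f.filter("http://www.example.com/about.html"));
    }
}
```

If that is roughly right, then a whitelist filter would just be this logic driven by a configurable list of patterns instead of a hardcoded one.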
Thanks

On 3/9/06, Vertical Search <[EMAIL PROTECTED]> wrote:
>
> Okay, I have noticed that I cannot crawl URLs containing "?", "&" and "=".
> I have tried all combinations of modifying crawl-urlfilter.txt and
> # skip URLs containing certain characters as probable queries, etc.
> [EMAIL PROTECTED]
>
> But in vain. I have hit a roadblock... that is terrible. :(
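For reference, the kind of crawl-urlfilter.txt entries I have been experimenting with look roughly like this. Rules are tried top to bottom and the first match wins; a leading `-` rejects and a leading `+` accepts. The host below is an assumed example, not my real target:

```
# skip URLs containing certain characters as probable queries, etc.
# (commenting this rule out, or narrowing it, is what allows
#  "?", "&" and "=" through)

# accept only URLs matching the pattern I care about (assumed example host)
+^http://www\.example\.com/products/

# reject everything else
-.
```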
