If I understand you correctly, you want to fetch only injected URLs and you do not want to follow any link found in fetched pages. I'm not aware of any configuration that could do this easily. But you could:
1. Use the regex URL filter to allow only the links you want, e.g. add every URL you inject to the accepted patterns and reject everything else. It works, but it can get very expensive as the list grows.
2. Write your own custom URL filter plugin that accepts only your URLs.
3. Write an HtmlParseFilter that overrides the parse output and removes all outlinks from the parsed page. But then you won't have those links in your crawldb at all.

Rough, untested sketches of all three options follow.
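For option 1, the entries in conf/regex-urlfilter.txt could look roughly like this (the example.com URLs are placeholders for whatever you actually inject; the first matching pattern wins):

    # accept only the injected URLs
    +^http://www\.example\.com/pageA\.html$
    +^http://www\.example\.com/pageB\.html$
    # reject everything else
    -.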
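For option 2, here is a minimal, untested sketch of a custom URL filter (assuming a Hadoop-based Nutch, i.e. 0.8 or later). The class name and hard-coded URL set are illustrative only; a real plugin would load the list from a file named in the configuration, and also needs the usual plugin.xml descriptor plus an entry in plugin.includes:

    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.Set;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.nutch.net.URLFilter;

    public class InjectedOnlyURLFilter implements URLFilter {

      private Configuration conf;

      // Hard-coded for illustration; a real plugin would read the
      // injected URL list from a file named in the configuration.
      private static final Set<String> ALLOWED = new HashSet<String>(
          Arrays.asList("http://www.example.com/pageA.html",
                        "http://www.example.com/pageB.html"));

      // URLFilter contract: return the URL to accept it, null to reject.
      public String filter(String urlString) {
        return ALLOWED.contains(urlString) ? urlString : null;
      }

      public void setConf(Configuration conf) {
        this.conf = conf;
      }

      public Configuration getConf() {
        return conf;
      }
    }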
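For option 3, a rough sketch of an HtmlParseFilter that drops all outlinks. Be warned that the filter method's signature has changed between Nutch versions (newer releases pass a ParseResult instead of a single Parse), so treat this as an outline of the idea rather than drop-in code:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.nutch.parse.HTMLMetaTags;
    import org.apache.nutch.parse.HtmlParseFilter;
    import org.apache.nutch.parse.Outlink;
    import org.apache.nutch.parse.Parse;
    import org.apache.nutch.parse.ParseData;
    import org.apache.nutch.parse.ParseImpl;
    import org.apache.nutch.protocol.Content;
    import org.w3c.dom.DocumentFragment;

    public class StripOutlinksFilter implements HtmlParseFilter {

      private Configuration conf;

      public Parse filter(Content content, Parse parse,
                          HTMLMetaTags metaTags, DocumentFragment doc) {
        ParseData old = parse.getData();
        // Rebuild the parse data with an empty outlink array so the
        // updatedb step finds no new links to add to the crawldb.
        ParseData stripped = new ParseData(old.getStatus(), old.getTitle(),
            new Outlink[0], old.getContentMeta(), old.getParseMeta());
        return new ParseImpl(parse.getText(), stripped);
      }

      public void setConf(Configuration conf) {
        this.conf = conf;
      }

      public Configuration getConf() {
        return conf;
      }
    }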
Regards,
Marcin

> Hi everyone,
>
> Does anyone know if there is an option to fetch a single URL or a group of
> wanted URLs using the fetcher, but (!!) without fetching links previously
> extracted from URLs that have already been fetched (i.e. depth 1+x)?
>
> For example: if I ran an inject --> generate --> fetch --> update loop
> once, the crawldb is now updated with the depth-1 links from the root URL
> list. I now want to inject a new URL list for fetching, but want to
> exclude the current links from fetching.
>
> --
> Eyal Edri