Hi,

When crawling a site, you often already know that many of its page URLs share a similar structure.
For example, on IMDb the pages for entertainment artists have the following link structure:

http://www.imdb.com/name/nm1/
http://www.imdb.com/name/nm2/
http://www.imdb.com/name/nm6499112/

How about allowing URLs to be added based on generators? For example, you would define in the URL file:

http://www.imdb.com/name/nm{{[1-6499112]}}

where {{ <simple-regex> }} marks the place to put a number/letter generator, so that all of these URLs are injected into Nutch.

I could work on that if people are interested.

Regards,
Diaa
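
P.S. To make the idea concrete, here is a rough sketch of how such a template could be expanded. It is plain Python rather than Nutch code, and it is only an illustration under my own assumptions: the names PLACEHOLDER, _values and expand are hypothetical, and I assume a placeholder is either an inclusive numeric range like {{[1-3]}} or a single-letter range like {{[a-c]}}. Nothing here is an existing Nutch API.

```python
import re

# Hypothetical placeholder syntax from the proposal: {{[start-end]}}
# Assumption: start/end are either both integers or both single ASCII letters.
PLACEHOLDER = re.compile(r"\{\{\[([0-9]+|[A-Za-z])-([0-9]+|[A-Za-z])\]\}\}")


def _values(start, end):
    """Yield every value in the inclusive range start..end as a string."""
    if start.isdigit() and end.isdigit():
        for n in range(int(start), int(end) + 1):
            yield str(n)
    elif start.isalpha() and end.isalpha():
        for code in range(ord(start), ord(end) + 1):
            yield chr(code)
    else:
        raise ValueError(f"mixed range bounds: {start!r}-{end!r}")


def expand(template):
    """Lazily yield all URLs produced by a template line.

    A line without any placeholder is yielded unchanged, so ordinary
    URL lines would keep working as they do today.
    """
    match = PLACEHOLDER.search(template)
    if match is None:
        yield template
        return
    head = template[:match.start()]
    tail = template[match.end():]
    for value in _values(match.group(1), match.group(2)):
        # Recurse so that any further placeholders in the tail are expanded too.
        yield from expand(head + value + tail)


if __name__ == "__main__":
    for url in expand("http://www.imdb.com/name/nm{{[1-3]}}/"):
        print(url)
```

The expansion is a generator, so a range as large as 1-6499112 is produced one URL at a time instead of being built in memory first. A real implementation would presumably do the equivalent at the point where the URL file is read for injection.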