Thanks Markus. I cannot use freegen, as that tool is not available via the REST API.
With the combination of the -adddays and -expr options of the generator I achieved my requirement. Here is what I did:

1. Inject the URLs with some metadata, say pageId=<unique value>. The seed file contains the entry below:

   http://localhost:9090/nutchsite/html/page1.html pageId=<unique value>

2. Now issue the generate command with the -adddays option (to make all the URLs due for fetch) and the -expr option (to filter the URLs), so that only the URLs to be fetched again are selected:

   $ bin/nutch generate examplesite/crawldb examplesite/segments -expr "(pageId == '<unique value>')" -adddays 30

(A sketch of the same generate call issued over the REST API is at the end of this message, below the quoted thread.)

Please comment if you see any issues with this approach.

Thanks
Sujan

-----Original Message-----
From: Markus Jelsma [mailto:markus.jel...@openindex.io]
Sent: Thursday, October 06, 2016 7:32 PM
To: user@nutch.apache.org
Subject: RE: nutch 1.12 How can I force a URL to get re-indexed

Hi

You can use -adddays N in the generator job to fool it, or just use a lower interval. Or, use the freegen tool to immediately crawl a set of URLs.

Markus

-----Original message-----
> From: Sujan Suppala <ssupp...@opentext.com>
> Sent: Thursday 6th October 2016 15:56
> To: user@nutch.apache.org
> Subject: nutch 1.12 How can I force a URL to get re-indexed
>
> Hi,
>
> By default, Nutch fetches a URL based on the next fetch interval already set (30 days). If the page is updated before this interval (30 days), how can I force it to be re-indexed?
>
> How can I just 're-inject' the URLs to set the next fetch date to 'immediately'?
>
> FYI, I am using the Nutch REST API client to index the URLs.
>
> Thanks
> Sujan
>
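---

REST sketch referenced above: a minimal example of how the same generate call might be submitted to the Nutch 1.x REST server (started with "bin/nutch startserver", which listens on port 8081 by default). The POST /job/create endpoint and the GENERATE job type exist in Nutch 1.12, but the "expr" and "adddays" keys in "args" simply mirror the CLI flags and are my assumption; please verify them against the JobResource of your Nutch version before relying on this.

    # assumption: the "args" keys mirror the CLI flags -expr and -adddays
    $ curl -X POST http://localhost:8081/job/create \
           -H "Content-Type: application/json" \
           -d '{
                 "crawlId": "examplesite",
                 "type": "GENERATE",
                 "confId": "default",
                 "args": {
                   "expr": "(pageId == '\''<unique value>'\'')",
                   "adddays": "30"
                 }
               }'

If the job is accepted, the server returns a job record whose id can then be polled via the /job endpoint to check progress (again, check the exact response shape in your deployment).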