Hi,
I am using Nutch 1.12.
I don't see an option to overwrite a URL in the CrawlDb via
the INJECT REST call. I have also observed that this REST call does not honor
the "db.injector.overwrite" and "db.injector.update" config properties
when they are set to true. This is the request I am sending:
POST /job/create
{
"type":"INJECT",
"confId":"default",
"crawlId":"TestCrawl",
"args": {"url_dir":"c:\\cygwin64\\tmp\\1475752235404-0"}
}
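
For anyone who wants to reproduce this, here is a minimal sketch of how the request above could be sent from a script. The server address http://localhost:8081 and the helper names build_inject_job and post_job are assumptions for illustration only; the endpoint and the JSON fields are the ones shown above. It does not add any overwrite option, since I have not found one.

```python
import json
import urllib.request


def build_inject_job(crawl_id, url_dir, conf_id="default"):
    """Build the JSON body for POST /job/create (hypothetical helper)."""
    return {
        "type": "INJECT",
        "confId": conf_id,
        "crawlId": crawl_id,
        "args": {"url_dir": url_dir},
    }


def post_job(base_url, job):
    """POST the job to <base_url>/job/create and return the response text."""
    request = urllib.request.Request(
        base_url.rstrip("/") + "/job/create",
        data=json.dumps(job).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Raw string so the Windows backslashes are kept as-is;
    # json.dumps escapes them as in the request above.
    job = build_inject_job("TestCrawl", r"c:\cygwin64\tmp\1475752235404-0")
    print(json.dumps(job, indent=2))
    # Uncomment to submit; the address is an assumption, adjust as needed.
    # print(post_job("http://localhost:8081", job))
```
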
However, the inject command does provide this option:
$ bin/nutch inject TestCrawl/crawldb urls -overwrite
I want to overwrite a URL so that its status changes to
UNFETCHED, and the REST service is the only option available to me.
Could someone help with this?
Thanks
Sujan