Modified injector to allow newly injected CrawlDatum to overwrite original
--------------------------------------------------------------------------
Key: NUTCH-521
URL: https://issues.apache.org/jira/browse/NUTCH-521
Project: Nutch
Issue Type: Improvement
Components: injector
Affects Versions: 0.9.0
Environment: Tested on Solaris and Windows with Java 1.5
Reporter: Rob Young
Attachments: inject.patch
Before this patch, if a CrawlDatum is already in the crawldb, it is used in
preference to the CrawlDatum created for the newly injected URL. This patch
gives the user the ability to force the injected CrawlDatum to be used
instead. The use case was a requirement for injected URLs to jump to the top
of the TopN list so that we can guarantee they will be crawled immediately
(useful for intranet crawling, where changes can trigger injects).
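
The behaviour described above can be sketched roughly as follows. This is an
illustration only, not the contents of inject.patch: the Datum class, the
merge and inject methods, and the overwrite flag are hypothetical stand-ins
and do not use the real Nutch classes. It only shows the choice between the
existing and the injected entry for a URL.

```java
import java.util.HashMap;
import java.util.Map;

public class InjectOverwriteSketch {

    // Hypothetical stand-in for a CrawlDatum: just a status label and a score.
    static final class Datum {
        final String status;
        final float score;

        Datum(String status, float score) {
            this.status = status;
            this.score = score;
        }
    }

    // Pick the entry to keep for one URL.
    // overwrite == false: the existing crawldb entry wins (behaviour before the patch).
    // overwrite == true:  the injected entry wins (behaviour the patch makes possible).
    static Datum merge(Datum existing, Datum injected, boolean overwrite) {
        if (existing == null) {
            return injected;
        }
        if (injected == null) {
            return existing;
        }
        return overwrite ? injected : existing;
    }

    // Apply a batch of injected entries to an in-memory "crawldb" map.
    static void inject(Map<String, Datum> crawlDb,
                       Map<String, Datum> injected,
                       boolean overwrite) {
        for (Map.Entry<String, Datum> e : injected.entrySet()) {
            Datum kept = merge(crawlDb.get(e.getKey()), e.getValue(), overwrite);
            crawlDb.put(e.getKey(), kept);
        }
    }

    public static void main(String[] args) {
        Map<String, Datum> crawlDb = new HashMap<>();
        crawlDb.put("http://intranet.example/page", new Datum("fetched", 0.1f));

        Map<String, Datum> injected = new HashMap<>();
        injected.put("http://intranet.example/page", new Datum("injected", 1.0f));

        inject(crawlDb, injected, true);
        System.out.println(crawlDb.get("http://intranet.example/page").status);
    }
}
```

With the flag off, the entry already in the map is kept untouched; with it on,
the injected entry replaces it, so whatever score the injector assigns is the
one later stages see.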
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.