Hello,

I have figured out that it can indeed be done. However, when I run
inject, generate, and then a readdb dump, I get the following:

Score: 1.0
Signature: null
Metadata: status: 9catId: 1

In the metadata part there is no space between 9 and catId. I wonder
if that is a problem.
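
For reference, this is how I understand the seed line format described
in the Injector javadoc quoted below: one URL, then tab-separated
key=value pairs. The following is only a minimal sketch of that format,
not Nutch's actual parsing code. The class name SeedLineParser and its
methods are hypothetical names made up for illustration, and catId is
just the custom key from my own seed file.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, NOT part of Nutch: illustrates the seed line
// layout "URL \t key=value \t key=value ..." from the Injector javadoc.
public class SeedLineParser {

    /** Returns the URL, i.e. the first tab-separated field of a seed line. */
    public static String url(String line) {
        return line.split("\t")[0].trim();
    }

    /** Returns the key=value fields that follow the URL, in file order. */
    public static Map<String, String> metadata(String line) {
        Map<String, String> meta = new LinkedHashMap<>();
        String[] fields = line.split("\t");
        for (int i = 1; i < fields.length; i++) {
            int eq = fields[i].indexOf('=');
            if (eq < 0) {
                continue; // assumption: ignore fields that have no '='
            }
            String key = fields[i].substring(0, eq).trim();
            String value = fields[i].substring(eq + 1).trim();
            meta.put(key, value);
        }
        return meta;
    }

    public static void main(String[] args) {
        // Example seed line with one reserved key and one custom key.
        String line = "http://www.nutch.org/\tnutch.score=10\tcatId=1";
        System.out.println(url(line) + " -> " + metadata(line));
    }
}
```

With a seed line like the one in main, the custom key catId ends up as
an ordinary key/value pair next to the reserved nutch.score key.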

Best Regards,
C.B.



On Mon, Jul 25, 2011 at 7:21 PM, Cam Bazz <[email protected]> wrote:
> Hello,
>
> How could I inject metadata for URLs that I provide?
>
> In Injector.java :
>
> /** This class takes a flat file of URLs and adds them to the of pages to be
>  * crawled.  Useful for bootstrapping the system.
>  * The URL files contain one URL per line, optionally followed by custom metadata
>  * separated by tabs with the metadata key separated from the corresponding value by '='. <br>
>  * Note that some metadata keys are reserved : <br>
>  * - <i>nutch.score</i> : allows to set a custom score for a specific URL <br>
>  * - <i>nutch.fetchInterval</i> : allows to set a custom fetch interval for a specific URL <br>
>  * e.g. http://www.nutch.org/ \t nutch.score=10 \t nutch.fetchInterval=2592000 \t userType=open_source
>  **/
>
>
> Could I extend this structure to store metadata about URLs?
>
> Best Regards,
> -C.B.
>
