Hello,

And how can I access this data afterwards, when parsing or indexing?
Is it going to be in the parseMeta?

Best

On Mon, Jul 25, 2011 at 7:50 PM, Julien Nioche <[email protected]> wrote:
> that's just the way the toString() method concatenates things, the key
> values are stored correctly and this should not be a problem.
> look at plugin/urlmeta for a way of propagating the features to the outlinks
>
> On 25 July 2011 17:47, Cam Bazz <[email protected]> wrote:
>
>> Hello,
>>
>> I have figured out that it can be done indeed. However when I
>> inject/generate/readdb dump
>>
>> Score: 1.0
>> Signature: null
>> Metadata: status: 9catId: 1
>>
>> In the metadata part there is no space between 9 and catId, I wonder
>> if that is a problem.
>>
>> Best Regards,
>> C.B.
>>
>> On Mon, Jul 25, 2011 at 7:21 PM, Cam Bazz <[email protected]> wrote:
>> > Hello,
>> >
>> > How could I inject metadata for urls that I provide?
>> >
>> > In Injector.java :
>> >
>> > /** This class takes a flat file of URLs and adds them to the of pages to be
>> > * crawled. Useful for bootstrapping the system.
>> > * The URL files contain one URL per line, optionally followed by custom metadata
>> > * separated by tabs with the metadata key separated from the
>> > corresponding value by '='. <br>
>> > * Note that some metadata keys are reserved : <br>
>> > * - <i>nutch.score</i> : allows to set a custom score for a specific URL <br>
>> > * - <i>nutch.fetchInterval</i> : allows to set a custom fetch
>> > interval for a specific URL <br>
>> > * e.g. http://www.nutch.org/ \t nutch.score=10 \t
>> > nutch.fetchInterval=2592000 \t userType=open_source
>> > **/
>> >
>> > could I extend this structure to store metadata about urls?
>> >
>> > Best Regards,
>> > -C.B.
>> >
>
> --
> *
> *Open Source Solutions for Text Engineering
>
> http://digitalpebble.blogspot.com/
> http://www.digitalpebble.com
>
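
For reference, the seed-file line format described in the Injector Javadoc quoted above (a URL, then tab-separated key=value pairs) can be sketched roughly as below. This is only an illustration of the format, not Nutch's own code: the SeedLine class and its parse method are hypothetical names made up for this example, and the real Injector may treat edge cases (and the reserved nutch.* keys) differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of Nutch): splits one seed-file line into a URL and its metadata. */
final class SeedLine {
    final String url;
    final Map<String, String> metadata;

    private SeedLine(String url, Map<String, String> metadata) {
        this.url = url;
        this.metadata = metadata;
    }

    /** Parses "url \t key=value \t key=value ..." as described in the quoted Javadoc. */
    static SeedLine parse(String line) {
        String[] fields = line.split("\t");
        String url = fields[0].trim();
        // LinkedHashMap keeps the keys in the order they appear on the line.
        Map<String, String> metadata = new LinkedHashMap<>();
        for (int i = 1; i < fields.length; i++) {
            String field = fields[i].trim();
            int eq = field.indexOf('=');
            if (eq <= 0) {
                continue; // assumption: silently skip fields that are not key=value
            }
            metadata.put(field.substring(0, eq).trim(), field.substring(eq + 1).trim());
        }
        return new SeedLine(url, metadata);
    }

    public static void main(String[] args) {
        SeedLine seed = SeedLine.parse(
            "http://www.nutch.org/\tnutch.score=10\tnutch.fetchInterval=2592000\tcatId=1");
        System.out.println(seed.url);
        System.out.println(seed.metadata);
    }
}
```

Keeping the pairs in a map, rather than relying on the printed dump, sidesteps the "9catId" concatenation seen in the readdb output, since each key is looked up separately.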

