Hi Adriana,

You can add metadata to each seed URL like this:

http://www.example.com	id=123
http://www.example.com	id=456

Put one URL per line and separate the URL from each name=value pair with a tab. Each CrawlDatum (the WebPage record in Nutch 2.x) carries a metadata map, so you can use it to store any information about the URL.

On Fri, May 10, 2013 at 5:26 PM, Adriana Farina <[email protected]> wrote:

> Hello,
>
> I'm using Nutch 2.1 on top of Hadoop 1.0.4, with HBase 0.90.4 as storage
> system. I run Nutch in distributed mode.
>
> I need to associate an id to each url inside the seed list of nutch and to
> store this information in HBase. I think that I have to create a new column
> family in HBase and modify the gora and hbase configuration files in the
> nutch conf folder.
>
> However, I think I need to modify the code of Nutch, but I don't know which
> classes I have to modify. I googled a bit, but I didn't find any
> documentation; I've searched inside the code but I wasn't able to solve my
> problem.
>
> Can anybody help me?
>
> Thank you!
>
>
> --
> Adriana Farina
>

--
Don't Grow Old, Grow Up... :-)

