Thank you very much!

2013/5/14 feng lu <[email protected]>

> Yes, the id will be automatically stored in HBase, and the outlinks
> extracted from the seed url will not have any of this information. The
> information is stored as part of the metadata of the current url.
>
>
> On Fri, May 10, 2013 at 10:59 PM, Renato Marroquín Mogrovejo <
> [email protected]> wrote:
>
> > Hi Feng,
> >
> > So this means I could put any type of information for the seed urls, but
> > what about the ones fetched in the next cycles? They won't have any of
> > this information, right?
> > And where is this information stored? As part of the fetched or the
> > parsed information?
> > Thanks!
> >
> > Renato M.
> > On May 10, 2013 9:46 AM, "Adriana Farina" <[email protected]>
> > wrote:
> >
> > > And the ids will be automatically stored in HBase?
> > >
> > >
> > > 2013/5/10 feng lu <[email protected]>
> > >
> > > > Hi Adriana
> > > >
> > > > You can add metadata to each seed url like this:
> > > >
> > > > http://www.example.com id=123
> > > > http://www.example.com id=456
> > > >
> > > > Each CrawlDatum includes a set of metadata; you can use that to store
> > > > any information about the url.
> > > >
> > > >
> > > > On Fri, May 10, 2013 at 5:26 PM, Adriana Farina
> > > > <[email protected]> wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > I'm using Nutch 2.1 on top of Hadoop 1.0.4, with HBase 0.90.4 as the
> > > > > storage system. I run Nutch in distributed mode.
> > > > >
> > > > > I need to associate an id with each url in the Nutch seed list and
> > > > > to store this information in HBase. I think that I have to create a
> > > > > new column family in HBase and modify the gora and hbase
> > > > > configuration files in the nutch conf folder.
> > > > >
> > > > > However, I think I also need to modify the code of Nutch, but I
> > > > > don't know which classes I have to modify. I googled a bit, but I
> > > > > didn't find any documentation; I've searched inside the code but I
> > > > > wasn't able to solve my problem.
> > > > >
> > > > > Can anybody help me?
> > > > >
> > > > > Thank you!
> > > > >
> > > > >
> > > > > --
> > > > > Adriana Farina
> > > > >
> > > >
> > > >
> > > > --
> > > > Don't Grow Old, Grow Up... :-)
> > > >
> > >
> > >
> > > --
> > > Adriana Farina
> > >
> >
>
>
> --
> Don't Grow Old, Grow Up... :-)
>

--
Adriana Farina
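
For anyone reading this thread later, here is a rough sketch of how a seed line carrying metadata (as in feng lu's "url id=123" example) breaks down into a url plus key=value pairs. It is plain Python for illustration only, not Nutch code: the function name `parse_seed_line` and the second url are made up, and it assumes the url and each metadata field are separated by a tab, which as far as I know is what the Nutch injector expects in the seed file.

```python
def parse_seed_line(line):
    """Split one seed-list line into (url, metadata dict).

    Assumption: fields are tab-separated, url first, then key=value pairs.
    """
    parts = line.strip().split("\t")
    url = parts[0]
    metadata = {}
    for field in parts[1:]:
        if "=" not in field:
            # Skip anything that is not a key=value pair.
            continue
        key, value = field.split("=", 1)
        metadata[key.strip()] = value.strip()
    return url, metadata


# Hypothetical seed lines; the ids mirror the ones used in the thread.
seeds = [
    "http://www.example.com\tid=123",
    "http://www.example.org\tid=456",
]

parsed = [parse_seed_line(s) for s in seeds]
for url, meta in parsed:
    print(url, meta)
```

Per the answers above, only the seed urls would carry the id; urls discovered as outlinks in later cycles would have no such entry in their metadata.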

