I checked the database after the dbupdate job ran, and I could see only the markers, signature, and fetch fields.
The initial seed, which was crawled and parsed, has only outlinks. I noticed that one of the outlinks is actually an inlink. Aren't inlinks supposed to be saved during the DbUpdaterJob?

When I debugged in Eclipse, I could see in the DbUpdateReducer that the inlinks are saved to the page object along with the fetch fields and markers, but I did not understand where the data goes from there. Is the data written to HBase during the DbUpdateReducer job?

Thanks,
Kiran.

On Wed, Jan 30, 2013 at 1:43 PM, <[email protected]> wrote:

> I see that inlinks are saved as ol in hbase.
>
> Alex.
>
> -----Original Message-----
> From: kiran chitturi <[email protected]>
> To: user <[email protected]>
> Sent: Wed, Jan 30, 2013 9:31 am
> Subject: Re: Nutch 2.0 updatedb and gora query
>
> Link to the reference
> (http://lucene.472066.n3.nabble.com/Inlinks-not-being-saved-in-the-database-td4037067.html)
> and jira (https://issues.apache.org/jira/browse/NUTCH-1524)
>
> On Wed, Jan 30, 2013 at 12:25 PM, kiran chitturi
> <[email protected]> wrote:
>
> > Hi,
> >
> > I have posted a similar issue in dev list [0]. The problem comes with
> > inlinks not being saved to database even though they are added to the
> > webpage object.
> >
> > I am curious about what happens after the fields are saved in the webpage
> > object. How are they sent to Gora ? Which class is used to communicate with
> > Gora ?
> >
> > I have seen Storage Utils class but i want to know if its the only class
> > that is used to communicate with databases.
> >
> > Please let me know your suggestions. I feel, the inlinks are not being
> > saved due to small problem in the code.
> >
> > [0] -
> > http://mail-archives.apache.org/mod_mbox/nutch-dev/201301.mbox/browser
> >
> > Thanks,
> > --
> > Kiran Chitturi
>
> --
> Kiran Chitturi

--
Kiran Chitturi

