I checked the database after the dbupdate job ran, and I could see
only the markers, signature, and fetch fields.

The initial seed, which was crawled and parsed, has only outlinks. I notice
that one of the outlinks is actually the inlink.

Aren't inlinks supposed to be saved during the dbupdate job? When I
debugged in Eclipse, I could see in the dbUpdateReducer job that the
inlinks are being saved to the page object along with the fetch fields and
markers, but I did not understand where the data goes from there.

Is the data written to HBase during the dbUpdateReducer job?

Thanks,
Kiran.




On Wed, Jan 30, 2013 at 1:43 PM, <[email protected]> wrote:

> I see that inlinks are saved as ol in HBase.
>
> Alex.
>
> -----Original Message-----
> From: kiran chitturi <[email protected]>
> To: user <[email protected]>
> Sent: Wed, Jan 30, 2013 9:31 am
> Subject: Re: Nutch 2.0 updatedb and gora query
>
>
> Link to the reference:
> http://lucene.472066.n3.nabble.com/Inlinks-not-being-saved-in-the-database-td4037067.html
> and the JIRA issue: https://issues.apache.org/jira/browse/NUTCH-1524
>
>
> On Wed, Jan 30, 2013 at 12:25 PM, kiran chitturi
> <[email protected]> wrote:
>
> > Hi,
> >
> > I have posted a similar issue on the dev list [0]. The problem is that
> > inlinks are not being saved to the database even though they are added
> > to the webpage object.
> >
> > I am curious about what happens after the fields are saved in the webpage
> > object. How are they sent to Gora? Which class is used to communicate
> > with Gora?
> >
> > I have seen the StorageUtils class, but I want to know if it's the only
> > class that is used to communicate with databases.
> >
> > Please let me know your suggestions. I feel the inlinks are not being
> > saved due to a small problem in the code.
> >
> >
> >
> > [0] -
> > http://mail-archives.apache.org/mod_mbox/nutch-dev/201301.mbox/browser
> >
> > Thanks,
> > --
> > Kiran Chitturi
> >
>
>
>
> --
> Kiran Chitturi
>
>
>


-- 
Kiran Chitturi
