Hi Lewis,
I looked at the JIRA you mentioned and it is a little different from what I was
looking for. What I need is a way to associate a seed URL with all the records
which are derived from that URL. So during the inject phase I added seedUrl and
its value to the metadata column if it is null, and later on, in the updatedb
phase, I propagated it to subsequent outlinks / new records. So now all the
records and any future child records will have the same seedUrl as one of
their metadata entries.
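
A minimal sketch of the idea described above, written in plain Java over an
in-memory map rather than against the real Nutch 2 classes. Everything in it
is an assumption made for illustration: the class name SeedUrlPropagation, the
methods inject, updateDb and seedOf, and the nested map that stands in for the
metadata column are hypothetical and are not part of the Nutch API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the web table: url -> metadata map.
// None of these names come from Nutch; they only illustrate the flow.
public class SeedUrlPropagation {
    public static final String SEED_KEY = "seedUrl";

    private final Map<String, Map<String, String>> table = new HashMap<>();

    // "Inject" step: a seed record gets its own URL as seedUrl,
    // but only if no seedUrl has been stored for it yet.
    public void inject(String url) {
        Map<String, String> meta = table.computeIfAbsent(url, k -> new HashMap<>());
        meta.putIfAbsent(SEED_KEY, url);
    }

    // "Updatedb" step: every outlink of a parent record inherits the
    // parent's seedUrl, unless the outlink already carries one.
    public void updateDb(String parentUrl, List<String> outlinks) {
        Map<String, String> parentMeta = table.get(parentUrl);
        if (parentMeta == null || parentMeta.get(SEED_KEY) == null) {
            return; // nothing to propagate
        }
        String seed = parentMeta.get(SEED_KEY);
        for (String outlink : outlinks) {
            table.computeIfAbsent(outlink, k -> new HashMap<>())
                 .putIfAbsent(SEED_KEY, seed);
        }
    }

    // Returns the stored seedUrl for a record, or null if there is none.
    public String seedOf(String url) {
        Map<String, String> meta = table.get(url);
        return meta == null ? null : meta.get(SEED_KEY);
    }

    public static void main(String[] args) {
        SeedUrlPropagation db = new SeedUrlPropagation();
        db.inject("http://example.com/");
        db.updateDb("http://example.com/", Arrays.asList("http://example.com/a"));
        db.updateDb("http://example.com/a", Arrays.asList("http://example.com/a/b"));
        System.out.println(db.seedOf("http://example.com/a/b"));
    }
}
```

Because each update round copies the parent's value forward, records several
hops away from the seed still end up with the original seed URL.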

I was looking for some plugin which I could use but in this case I did not
find any suitable plugin.

Regards,
Anand.

On 13 March 2013 22:40, Lewis John Mcgibbney <[email protected]> wrote:

> Hi Anand,
> The first step is to look at the issue over on NUTCH-1533.
> If you feel like addressing anything then please do.
> This particular issue has nothing to do with Gora or Hadoop, so you will
> not need to look at any of the code there.
> I will also be working on that issue when I get some time.
> Thanks
> Lewis
>
> On Mon, Mar 11, 2013 at 9:44 PM, Anand Bhagwat <[email protected]> wrote:
>
> > I would love to work on it but the thing is I am new to all the frameworks
> > which are being used here. I mean Apache Hadoop, Apache Gora and Nutch
> > itself. I am going through the source code of Nutch 2. But as you said,
> > with a little bit of help I think I would be able to contribute.
> >
> > -Anand.
> >
> >
>
