Sorry for the confusion. What I'd like to do is define the inlink for the URL www.example.com/john_doe myself, let's say www.inlink.com. Is there a way to define an inlink for a certain URL? If so, how can I get the inlink for a certain URL afterwards? Thanks in advance. I hope that clears everything up.
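To make the lookup concrete, here is a minimal sketch of the result I am after. It is plain Python, not Nutch code: the `INLINK` map and the `find_base_url` function are hypothetical names, and the map is filled in by hand from the example tree quoted below. It assumes each crawled URL has exactly one recorded inlink, which will not hold in general on a real crawl.

```python
# Hypothetical inlink map for the example tree: each crawled URL maps to
# the URL that linked to it. This is hand-written data, not Nutch output.
INLINK = {
    "www.example.com/john_doe1": "www.example.com/john_doe",
    "www.example.com/john_doe2": "www.example.com/john_doe",
    "www.example.com/john_doe3": "www.example.com/john_doe",
    "www.example.com/john_doe4": "www.example.com/john_doe1",
    "www.example.com/john_doe5": "www.example.com/john_doe2",
    "www.example.com/john_doe6": "www.example.com/john_doe3",
}


def find_base_url(url, inlink):
    """Follow inlinks upward until reaching a URL with no recorded inlink."""
    seen = set()
    # Stop at a URL without an inlink (the injected URL in this example),
    # or when a URL repeats, so a link cycle cannot loop forever.
    while url in inlink and url not in seen:
        seen.add(url)
        url = inlink[url]
    return url


if __name__ == "__main__":
    print(find_base_url("www.example.com/john_doe4", INLINK))
```

So for www.example.com/john_doe4 I would like to get back www.example.com/john_doe, the URL that was injected.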
Cheers,
MyD

On Fri, Jan 8, 2010 at 7:16 AM, xiao yang <yangxiao9...@gmail.com> wrote:
> What do you mean? You already know the url. Why do you want to find it?
>
> On Thu, Jan 7, 2010 at 7:12 PM, MyD <myd.ro...@googlemail.com> wrote:
> > Dear Nutch developers:
> > Is there any way to inject URLs and define the inlink for those URLs? How
> > and where can I find the inlink from a certain URL?
> > Example:
> > We inject a URL www.example.com/john_doe. We start the crawl and maybe we
> > are crawling the URL www.example.com/john_doe4.
> > => www.example.com/john_doe
> > ==> www.example.com/john_doe1
> > ====> www.example.com/john_doe4
> > ==> www.example.com/john_doe2
> > ====> www.example.com/john_doe5
> > ==> www.example.com/john_doe3
> > ===>www.example.com/john_doe6
> > Is there any way to find the base (inlink) URL www.example.com/john_doe???
> > Thanks in advance.
> > Cheers,
> > MyD