Hi,

I noticed that for the other URLs in the seed list, inlinks are saved as ol. I checked the code and found that this is done by the part that saves anchors. So, in my case, inlinks are saved as anchors in the ol field in HBase. But for one of the URLs, the title and inlinks are not retrieved, although its parse status is marked success/ok (1/0), args=[].
Alex.

-----Original Message-----
From: kiran chitturi <[email protected]>
To: user <[email protected]>
Sent: Wed, Feb 13, 2013 12:40 pm
Subject: Re: nutch cannot retrive title and inlinks of a domain

Hi Alex,

Inlinks does not work with me now for the same domain [0] currently. I am using Nutch-2.x and Hbase.

Does the inlinks get saved for you for some of the crawl seeds ?

Surprising, the title does not get saved. Did you try using parsechecker ?

[0] - http://www.mail-archive.com/[email protected]/msg08627.html

On Wed, Feb 13, 2013 at 3:26 PM, <[email protected]> wrote:

> Hello,
>
> I noticed that nutch cannot retrieve title and inlinks of one of the
> domains in the seed list. However, if I run identical code from the server
> where this domain is hosted then it correctly parses it. The surprising
> thing is that in both cases this urls has
>
> status: 2 (status_fetched)
> parseStatus: success/ok (1/0), args=[]
>
>
> I used nutch-2.1 with hbase-0.92.1 and nutch 1.4.
>
>
> Any ideas why this happens?
>
> Thanks.
>
> Alex.
>

--
Kiran Chitturi

