Hi,
I have crawled 10K web pages with the "index-links" plugin enabled and
"linkdb.ignore.internal.links" set to false. However, nearly every page in
the index has exactly one outlink and one inlink, which is not what I
expected. Here is a sample document:
{
"inlinks": "http://www.planetary.org/blogs/bruce-betts/",
"tstamp": "2017-04-18T15:45:31.457Z",
"nutch_score": 0.439538,
"segment": "20170418154526",
"digest": "1ef28e97795b40be08d312f630b1728f",
"host": "www.planetary.org",
"boost": "1.0",
"contentLength": "10355",
"outlinks": "http://ajax.googleapis.com/"
}
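A minimal sketch of how the link fields in a document like this could be checked, assuming the indexed document has been exported as JSON. Only three fields from the sample are reproduced here, and the helper name `count_links` is hypothetical and not part of Nutch. It tells a field holding a single string apart from one holding a list of values:

```python
import json

# Subset of the sample document above; the other fields are left out for brevity.
doc = json.loads("""
{
  "inlinks": "http://www.planetary.org/blogs/bruce-betts/",
  "outlinks": "http://ajax.googleapis.com/",
  "host": "www.planetary.org"
}
""")


def count_links(document, field):
    """Return how many link values a field holds (hypothetical helper)."""
    value = document.get(field)
    if value is None:
        return 0          # field missing from the document
    if isinstance(value, list):
        return len(value)  # multi-valued field
    return 1              # single string value


for field in ("inlinks", "outlinks"):
    # Print the count and the Python type of the stored value.
    print(field, count_links(doc, field), type(doc[field]).__name__)
```

In the sample, both fields come back as a single string rather than a list.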
Thanks,
Yongyao
--
Yongyao Jiang
https://www.linkedin.com/in/yongyao-jiang-42516164
Ph.D. Student in Earth Systems and GeoInformation Sciences
NSF Spatiotemporal Innovation Center
George Mason University