Hi everyone! Sorry to bother you, but I need some help getting the outlinks of http://elpais.com. I am using Nutch 2.2.1.
The web page is parsed correctly: in debug mode, I can see all the outlinks in the Parse object. I use these basic plugins:

protocol-http|parse-html|index-(basic|anchor)|query-(basic|site|url)|response-(json|xml)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)

However, the outlinks are never written to HBase (with either http://elpais.com or http://www.elpais.com). If I parse www.nytimes.com instead, the outlinks are written as expected and added to the fetch list.

Any ideas?

Thanks,
Yann

==> I have the same issue with http://www.lemonde.fr

