Hi everyone,

I'm sorry to disturb you, but I need some help getting the
outlinks of http://elpais.com.
I am using Nutch 2.2.1.

The web page is parsed correctly: in the debugger I can see all the
outlinks in the Parse object.
I use these basic plugins:

protocol-http|parse-html|index-(basic|anchor)|query-(basic|site|url)|response-(json|xml)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)

But the outlinks are never written to HBase (with either http://elpais.com
or http://www.elpais.com).
If I parse www.nytimes.com instead, the outlinks are written as expected
and added to the fetch list.

Any ideas?
Thanks
Yann

PS: I have the same issue with http://www.lemonde.fr