> > Anyone here know how to make Nutch read "<a href=javascript(aaa);>" as
> > http://www.myurl.com/one.php?id=aaa ?

It's a hard issue.
If you just want to map every javascript(aaa) link to a fixed URL such as
http://www.myurl.com/one.php?id=aaa, it's quite easy to patch the Nutch
code to do that.
Otherwise, if you want to resolve such links in a general way, you must 
embed a JavaScript interpreter (Rhino, for instance) in Nutch's HTML 
parser.
It could be a good feature (I planned to do it in earlier work several 
years ago, but never did), but I think it is not easy.
Hope this helps.
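
For the simple case (a fixed mapping for every javascript(aaa) link), a
rough sketch in Java could look like the one below. The class name
JsLinkMapper and the method map are made up for illustration and are not
part of Nutch. The base URL is taken from the question. Where exactly to
call this in the Nutch parser is left open here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, NOT part of Nutch: rewrites href values of the form
// "javascript(aaa);" into a fixed URL pattern.
final class JsLinkMapper {
    // Assumed target prefix, taken from the question.
    private static final String BASE = "http://www.myurl.com/one.php?id=";

    // Matches "javascript(<arg>)" with an optional trailing ';'
    // and captures <arg>.
    private static final Pattern JS_LINK = Pattern.compile(
        "\\s*javascript\\(([^)]*)\\)\\s*;?\\s*", Pattern.CASE_INSENSITIVE);

    // Returns the mapped URL, or the href unchanged if it does not match.
    static String map(String href) {
        if (href == null) {
            return null;
        }
        Matcher m = JS_LINK.matcher(href);
        if (!m.matches()) {
            return href;
        }
        // The argument is appended as-is; no URL encoding is applied.
        return BASE + m.group(1).trim();
    }
}
```

The idea would be to pass each href found by the HTML parser through
JsLinkMapper.map before the link is added to the outlinks, so ordinary
links stay untouched.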

Jerome

-- 
http://motrech.free.fr/
http://frutch.free.fr/
