> > Anyone here know how to make Nutch read "<a href=javascript(aaa);>" as
> > http://www.myurl.com/one.php?id=aaa ?
It's a hard issue. If you just want to map every javascript(aaa) link to a fixed URL of the form http://www.myurl.com/one.php?id=aaa, it's quite easy to patch the Nutch code to do that. Otherwise, if you want to resolve such links in a general way, you must embed a JavaScript interpreter (Rhino, for instance) in Nutch's HTML parser. It could be a good feature (I planned to do it in previous work several years ago, but never did), but I think it is not easy.

Hope this helps.

Jerome
--
http://motrech.free.fr/
http://frutch.free.fr/
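
P.S. For the simple fixed-mapping case, the rewrite itself could look roughly like the sketch below. This is only an illustration in plain Java, not Nutch code: the class name JavascriptLinkRewriter, the method rewrite, and the TARGET_PREFIX constant are all hypothetical, and where exactly you would call it from inside Nutch's link extraction is left open. It also assumes the href always has the exact form javascript(arg), as in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch: maps href values of the form
// "javascript(arg);" to a fixed URL with arg as the id parameter.
public class JavascriptLinkRewriter {

    // Assumed target prefix, taken from the URL in the question.
    private static final String TARGET_PREFIX = "http://www.myurl.com/one.php?id=";

    // Matches "javascript(arg)" with an optional trailing semicolon
    // and optional surrounding whitespace; group 1 captures arg.
    private static final Pattern JS_HREF =
        Pattern.compile("^\\s*javascript\\(\\s*([^)]*?)\\s*\\)\\s*;?\\s*$");

    // Returns the mapped URL, or the href unchanged when it does not match.
    public static String rewrite(String href) {
        Matcher m = JS_HREF.matcher(href);
        if (!m.matches()) {
            return href;
        }
        return TARGET_PREFIX + m.group(1);
    }

    public static void main(String[] args) {
        System.out.println(rewrite("javascript(aaa);"));
        System.out.println(rewrite("http://example.org/page.html"));
    }
}
```

Ordinary links pass through untouched, so the helper can be applied to every extracted href. Anything more dynamic than this fixed pattern (real JavaScript that builds the URL) is the general case described above and would need an interpreter.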
