Hello,

I'm trying to crawl and index the website http://www.ceamasa.com. My main
problem is that the home page contains an SWF file with links to other ASP
pages.

I've added the SWF parser (parse-swf) to plugin.includes in nutch-default.xml,
but Nutch still doesn't extract the links from the SWF file, so the crawl
can't reach the rest of the pages.

Here is my plugin.includes property:

<property>
  <name>plugin.includes</name>
  <value>protocol-http|urlfilter-regex|parse-(text|html|js|tika|swf)|index-(basic|anchor)|query-(basic|site|url)|response-(json|xml)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
</property>

Any suggestions? Thanks.