Thank you!

Iain

From: Andrzej Bialecki [mailto:[EMAIL PROTECTED]
Sent: 14 August 2006 17:31
To: [email protected]
Cc: [EMAIL PROTECTED]
Subject: Re: Crawling flash

Iain wrote:
> I don't suppose anyone on the list has ever managed to include a flash
> object in a crawl?
>
> There's a number of sites I need to crawl which use flash for navigation
> (and have HTML content. Go figure!).
>
> I want to include embedded flash in my crawls.
>
> Despite (apparently successfully) including the parse-swf plugin, embedded
> flash does not seem to be retrieved. I'm assuming that the object tags are
> not being parsed to find the .swf files.
>
> Can anyone comment?

I can ;)

You will need to add some code to DOMContentUtils. Currently it skips <object> and <embed> tags, so outlinks leading to the Flash content are never collected. Instead, when the code encounters an <object> tag, it should descend into its <param> children, pick the one of the form <param name="src" value="myFlash.swf">, extract the value, and make a new Outlink from it.

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
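
The traversal described in the reply (on an <object> tag, descend into the <param> children and take the value of the one named "src") might look roughly like the sketch below. This is not the actual DOMContentUtils code: the class FlashLinkSketch and its methods are hypothetical names, and it uses the JDK's XML parser rather than Nutch's HTML parser, so the input must be well-formed. Also accepting name="movie" and reading the src attribute of <embed> are assumptions that go beyond what the reply states. Inside Nutch, each collected string would still have to be resolved against the page's base URL and turned into an Outlink, which is not shown here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Nutch: collects Flash URLs from a DOM tree.
final class FlashLinkSketch {

    /** Parses well-formed markup and returns the distinct Flash URLs in document order. */
    static List<String> extractFlashLinks(String xhtml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xhtml)));
            Set<String> links = new LinkedHashSet<>();
            collect(doc.getDocumentElement(), links);
            return new ArrayList<>(links);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse markup", e);
        }
    }

    // Walks the tree; <object> and <embed> are inspected instead of skipped.
    private static void collect(Node node, Set<String> links) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            Element el = (Element) node;
            if ("object".equalsIgnoreCase(el.getTagName())) {
                addParamTargets(el, links);
            } else if ("embed".equalsIgnoreCase(el.getTagName())) {
                // Assumption: <embed src="..."> also points at the Flash file.
                addIfPresent(el.getAttribute("src"), links);
            }
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collect(children.item(i), links);
        }
    }

    // Looks at the direct <param> children of an <object> element.
    private static void addParamTargets(Element object, Set<String> links) {
        NodeList children = object.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue;
            }
            Element param = (Element) child;
            if (!"param".equalsIgnoreCase(param.getTagName())) {
                continue;
            }
            String name = param.getAttribute("name");
            // "src" comes from the reply; "movie" is an added assumption.
            if ("src".equalsIgnoreCase(name) || "movie".equalsIgnoreCase(name)) {
                addIfPresent(param.getAttribute("value"), links);
            }
        }
    }

    // getAttribute returns "" for a missing attribute, so empty values are dropped.
    private static void addIfPresent(String value, Set<String> links) {
        if (value != null && !value.trim().isEmpty()) {
            links.add(value.trim());
        }
    }
}
```

Calling FlashLinkSketch.extractFlashLinks on a page fragment returns the candidate .swf targets as plain strings; an <embed> nested inside an <object> that names the same file is reported only once.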
