[ https://issues.apache.org/jira/browse/TIKA-337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Julien Nioche updated TIKA-337:
-------------------------------

    Attachment: test.swf

Test file for the SWF parser.

> SWF parser
> ----------
>
>                 Key: TIKA-337
>                 URL: https://issues.apache.org/jira/browse/TIKA-337
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>            Reporter: Julien Nioche
>            Assignee: Jukka Zitting
>         Attachments: test.swf, TIKA-337.patch
>
>
> Here is an initial implementation of a SWF parser which uses JavaSWF and has
> been adapted from A. Bialecki's implementation for Nutch.
> The main difference from the Nutch implementation is that we use the
> latest version of JavaSWF and do not try to extract text from the actions or
> structured URLs. As usual, URLs can be obtained from the extracted text using
> ParserPostProcessor.
> JavaSWF has changed quite a bit since the Nutch integration and I wanted to
> keep this initial port nice and simple. It should be possible to extract the
> URLs from the actions using JavaSWF's API; I think this is what they did in
> Heritrix.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
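
For illustration only, here is a minimal sketch of the idea mentioned in the description: pulling URLs out of text that a parser has already extracted. It is not Tika's ParserPostProcessor and does not use JavaSWF. The class name UrlFromTextSketch, the method extractUrls and the regular expression are all hypothetical and deliberately simple, and the sample string merely stands in for text extracted from a SWF file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, NOT part of Tika: scans already-extracted text for URLs.
public class UrlFromTextSketch {

    // Assumption: a URL is an http(s) scheme followed by characters that are
    // not whitespace, quotes or angle brackets. Real link extraction is stricter.
    private static final Pattern URL_PATTERN =
            Pattern.compile("https?://[^\\s\"'<>]+");

    public static List<String> extractUrls(String text) {
        List<String> urls = new ArrayList<>();
        Matcher matcher = URL_PATTERN.matcher(text);
        while (matcher.find()) {
            // Drop sentence punctuation that happens to follow a URL.
            String url = matcher.group().replaceAll("[.,;:!?)]+$", "");
            // Keep the first occurrence only, preserving order of appearance.
            if (!urls.contains(url)) {
                urls.add(url);
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        // Stand-in for text extracted from a SWF file.
        String extracted =
                "Visit http://tika.apache.org/ or https://example.com/demo.swf.";
        for (String url : extractUrls(extracted)) {
            System.out.println(url);
        }
    }
}
```

Running the text through a separate pass like this keeps the parser itself simple, which matches the stated goal of the initial port. URLs embedded only in SWF actions would not be found this way, since they never appear in the extracted text.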