SWF parser
----------

                 Key: TIKA-337
                 URL: https://issues.apache.org/jira/browse/TIKA-337
             Project: Tika
          Issue Type: New Feature
          Components: parser
            Reporter: Julien Nioche


Here is an initial implementation of an SWF parser which uses JavaSWF and has 
been adapted from A. Bialecki's implementation for Nutch.
The main differences from the Nutch implementation are that we use the 
latest version of JavaSWF and do not try to extract text from the actions or 
structured URLs. As usual, URLs can be obtained from the extracted text using 
ParserPostProcessor.
JavaSWF has changed quite a bit since the Nutch integration and I wanted to 
keep this initial port nice and simple. It should be possible to extract the 
URLs from the actions using JavaSWF's API; I think this is what they did in 
Heritrix.
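
As a rough illustration of what "obtaining URLs from the extracted text" 
means, here is a minimal, self-contained sketch. It is not the 
ParserPostProcessor code and does not use Tika or JavaSWF at all; the class 
name UrlExtractor, the method extractUrls and the regular expression are 
assumptions made purely for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only: scans plain text (such as the
// text a parser has already extracted from an SWF file) for http/https URLs.
class UrlExtractor {

    // Deliberately simple pattern (an assumption, not Tika's): a scheme
    // followed by any run of characters that are not whitespace, quotes or
    // angle brackets.
    private static final Pattern URL =
            Pattern.compile("https?://[^\\s\"'<>]+");

    static List<String> extractUrls(String text) {
        List<String> urls = new ArrayList<>();
        Matcher m = URL.matcher(text);
        while (m.find()) {
            urls.add(m.group());
        }
        return urls;
    }

    public static void main(String[] args) {
        String extracted =
                "Visit http://example.com/a.swf or https://example.org/home";
        for (String url : extractUrls(extracted)) {
            System.out.println(url);
        }
    }
}
```

Working on the extracted text keeps the parser itself simple, at the cost of 
missing URLs that only appear inside actions rather than in visible text.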


