Sebastian Nagel created NUTCH-2861:
--------------------------------------

             Summary: Remove parse-swf
                 Key: NUTCH-2861
                 URL: https://issues.apache.org/jira/browse/NUTCH-2861
             Project: Nutch
          Issue Type: Improvement
          Components: parser, plugin
    Affects Versions: 1.18
            Reporter: Sebastian Nagel
             Fix For: 1.19


We should consider removing the Shockwave Flash parser plugin 
([parse-swf|https://github.com/apache/nutch/tree/master/src/plugin/parse-swf]):
- Shockwave/[Adobe Flash|https://en.wikipedia.org/wiki/Adobe_Flash] reached 
[end-of-life|https://helpx.adobe.com/shockwave/shockwave-end-of-life-faq.html]
- major browsers now block playing Flash content
- the plugin is based on a 15-year-old library 
([javaswf|https://github.com/apache/nutch/tree/master/src/plugin/parse-swf/lib]),
 which is no longer maintained and not available in a Maven repository
- it is also shipped in binary form in the source package, which contradicts the 
[Apache release 
policy|https://www.apache.org/legal/release-policy.html#source-packages]

Notes:
- we should place a notice about the removal in the release notes, as parse-tika is 
not able to extract textual content from *.swf files
- do not forget to unregister the plugin in 
[parse-plugins.xml|https://github.com/apache/nutch/blob/6c02da053d8ce65e0283a144ab59586e563608b8/conf/parse-plugins.xml.template#L54]
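
To double-check the last note, a small script could list any leftover references to the plugin in a parse-plugins.xml file. This is only a sketch: the function name, the sample XML and its "example.SWFParser" extension id are hypothetical and not copied from the Nutch sources. It deliberately makes no assumption about element names and simply reports every attribute whose value equals the plugin id.

```python
import xml.etree.ElementTree as ET


def find_plugin_references(xml_text, plugin_id="parse-swf"):
    """Return (tag, attribute, value) for every attribute equal to plugin_id."""
    root = ET.fromstring(xml_text)
    hits = []
    # Walk all elements in document order and compare every attribute value.
    for elem in root.iter():
        for attr, value in elem.attrib.items():
            if value == plugin_id:
                hits.append((elem.tag, attr, value))
    return hits


# Hypothetical sample for illustration only; the real file may look different.
SAMPLE = """<parse-plugins>
  <mimeType name="application/x-shockwave-flash">
    <plugin id="parse-swf"/>
  </mimeType>
  <mimeType name="text/html">
    <plugin id="parse-html"/>
  </mimeType>
  <aliases>
    <alias name="parse-swf" extension-id="example.SWFParser"/>
  </aliases>
</parse-plugins>"""

if __name__ == "__main__":
    for tag, attr, value in find_plugin_references(SAMPLE):
        print(f"<{tag}> still references {value} via attribute '{attr}'")
```

An empty result after the removal would suggest that the plugin is no longer registered in the checked file. Other places, such as a plugin.includes setting, would need a separate check.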




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
