Reinhard Schwab created TIKA-1500:
-------------------------------------
Summary: FeedParser extracts XML markup with BodyContentHandler
Key: TIKA-1500
URL: https://issues.apache.org/jira/browse/TIKA-1500
Project: Tika
Issue Type: Bug
Components: parser
Affects Versions: 1.6
Reporter: Reinhard Schwab
Priority: Minor
Fix For: 1.8
I am using FeedParser to extract text and links from feeds and have discovered
that the extracted text contains XML markup.
FeedParser normally strips markup from text when generating SAX events,
but one line in the parser omits this step.
The fix is trivial. I will provide a patch.
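
For illustration, here is a minimal sketch of the kind of markup stripping
described above. It assumes a simple regex-based approach; the class name
MarkupStripSketch and the method stripTags are hypothetical and are not
taken from the Tika source, so the actual patch may look different.

```java
// Hypothetical sketch: removes anything that looks like an XML/HTML tag
// from a text value before it would be emitted as character data.
public class MarkupStripSketch {

    // Returns the input with all <...> sequences removed and the result trimmed.
    // A null input is treated as empty text.
    static String stripTags(String value) {
        if (value == null) {
            return "";
        }
        return value.replaceAll("<[^>]*>", "").trim();
    }

    public static void main(String[] args) {
        String description = "<p>Hello <b>world</b></p>";
        // prints "Hello world"
        System.out.println(stripTags(description));
    }
}
```

Under this assumption, the bug would amount to one text value being passed
to the content handler without first going through such a helper.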