Michael McCandless created TIKA-1067:
----------------------------------------
Summary: Tika extracts non-existent asterisks (*) from .ppt files
Key: TIKA-1067
URL: https://issues.apache.org/jira/browse/TIKA-1067
Project: Tika
Issue Type: Bug
Reporter: Michael McCandless
I created a new blank presentation, added a title and a subtitle, saved it as .ppt,
and then ran TikaCLI -t:
{noformat}
<body><div class="slideShow"><div class="slide"><p class="slide-master-content">*<br/>
*<br/>
</p>
<p class="slide-content">Testing<br/>
testing<br/>
</p>
</div>
</div>
<div class="slideNotes"/>
{noformat}
The two extra asterisks seem to be coming from the master slide (they show up
inside the slide-master-content paragraph), but I'm not sure which text runs
produce them or how to suppress them ...
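
To make the symptom checkable in a regression test, here is a minimal sketch in plain Java (no Tika or POI calls) that scans XHTML output shaped like the sample above and counts lines inside slide-master-content paragraphs that consist of a lone asterisk. The class name MasterAsteriskCheck and its methods are hypothetical helpers, not part of Tika, and the regex assumes the paragraph markup looks exactly like the output shown in this report.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Tika.
class MasterAsteriskCheck {

    // One <p class="slide-master-content"> ... </p> block.
    // \s+ tolerates a line break between "<p" and "class"; DOTALL lets ".*?" span lines.
    private static final Pattern MASTER_P = Pattern.compile(
            "<p\\s+class=\"slide-master-content\">(.*?)</p>", Pattern.DOTALL);

    /** Returns the non-empty text lines found inside slide-master-content paragraphs. */
    static List<String> masterLines(String xhtml) {
        List<String> lines = new ArrayList<>();
        Matcher m = MASTER_P.matcher(xhtml);
        while (m.find()) {
            // Lines inside the paragraph are separated by <br/> elements.
            for (String part : m.group(1).split("<br\\s*/>")) {
                String text = part.trim();
                if (!text.isEmpty()) {
                    lines.add(text);
                }
            }
        }
        return lines;
    }

    /** Counts master-slide lines that are exactly a single asterisk. */
    static long countStrayAsterisks(String xhtml) {
        return masterLines(xhtml).stream().filter("*"::equals).count();
    }

    public static void main(String[] args) {
        // Sample modeled on the output pasted in this report.
        String sample =
                "<div class=\"slide\"><p\nclass=\"slide-master-content\">*<br/>\n*<br/>\n</p>\n"
              + "<p class=\"slide-content\">Testing<br/>\ntesting<br/>\n</p>\n</div>";
        System.out.println(countStrayAsterisks(sample)); // prints 2
    }
}
```

Once the bug is fixed, the same check on the output for this file would be expected to report zero stray asterisks, while any real master-slide text would still be listed by masterLines.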