Matt Sheppard created TIKA-1730:
-----------------------------------

             Summary: Excel to HTML filtering seems to produce some font 
setting gibberish in output
                 Key: TIKA-1730
                 URL: https://issues.apache.org/jira/browse/TIKA-1730
             Project: Tika
          Issue Type: Bug
            Reporter: Matt Sheppard


Noticed while upgrading from Tika 1.8 to 1.10 - an .xls file (which I can 
provide) that used to filter pretty normally now produces the following...

{noformat}
<div class="outside">&amp;C&amp;"Arial,Bold"&amp;11&amp;F</div>
{noformat}

...seemingly at the end of the first sheet's output when filtered with {{java 
-jar tika-app-1.10.jar funnelback-claim-form-with-expense-codes.xls}}.

It looks like some styling information which should not be getting displayed as 
text here.
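
Once the HTML escaping is undone, the string reads {{&C&"Arial,Bold"&11&F}}, 
which matches Excel's header/footer formatting codes: {{&C}} selects the centre 
section, {{&"Arial,Bold"}} sets the font, {{&11}} sets the font size and {{&F}} 
stands for the file name. As a rough sketch only, assuming that is what the 
string is, the following shows how such codes could be stripped before output. 
The class and method names are hypothetical and are not Tika or POI API, and 
the handling is deliberately simplified (every single-letter code is dropped 
rather than expanded).

```java
import java.util.regex.Pattern;

public class HeaderCodeStripper {
    // Font selector such as &"Arial,Bold"
    private static final Pattern FONT = Pattern.compile("&\"[^\"]*\"");
    // Font size such as &11
    private static final Pattern SIZE = Pattern.compile("&\\d+");
    // Single-letter codes such as &L, &C, &R (sections), &F (file name), &P (page)
    private static final Pattern LETTER = Pattern.compile("&[A-Za-z]");

    // Hypothetical helper: removes formatting codes and keeps only literal text.
    public static String strip(String raw) {
        // Protect escaped literal ampersands (&&) while the codes are removed.
        String s = raw.replace("&&", "\u0000");
        s = FONT.matcher(s).replaceAll("");
        s = SIZE.matcher(s).replaceAll("");
        s = LETTER.matcher(s).replaceAll("");
        return s.replace("\u0000", "&").trim();
    }

    public static void main(String[] args) {
        // The string from this report contains only codes, so nothing remains.
        System.out.println("[" + strip("&C&\"Arial,Bold\"&11&F") + "]");
        // Literal text mixed with codes survives.
        System.out.println("[" + strip("&L&\"Arial,Bold\"&11Expenses && Codes") + "]");
    }
}
```

With this sketch the string from the report reduces to an empty string, so 
under that assumption there would be no visible text to emit for this header.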

Would be nice if that could be fixed in some future version.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
