[ 
https://issues.apache.org/jira/browse/TIKA-922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13276390#comment-13276390
 ] 

Nick Burch commented on TIKA-922:
---------------------------------

Tika returns the values and text stored in the file itself, and does not 
interpret them. If iWork stores 90% as 0.9 (or as close to that as floating 
point allows), then that is what we return.

Something very similar is stored in Excel files too. For the Excel formats, 
however, we have a full library (Apache POI) that handles the formatting.

As there's no such library for iWork at the moment, I wonder how close the 
iWork formatting rules are to the Excel ones. If they're close enough, then we 
might be able to reuse some of the formatting support in POI.
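
To illustrate the gap between the stored value and the displayed value, here 
is a minimal sketch using only java.text from the JDK. It is not POI code and 
not something Tika does today. The class name RawVsDisplay, the method names, 
and the two format patterns are assumptions made up for this example; the raw 
numbers are the ones quoted in the issue.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Tika or POI.
class RawVsDisplay {
    // Fixed symbols so the output does not depend on the default locale.
    private static final DecimalFormatSymbols US =
            DecimalFormatSymbols.getInstance(Locale.US);

    // "%" in a DecimalFormat pattern multiplies the value by 100.
    static String asPercent(double raw) {
        return new DecimalFormat("0%", US).format(raw);
    }

    // "$" is not a pattern character, so it is emitted as a literal prefix.
    static String asCurrency(double raw) {
        return new DecimalFormat("$0.00", US).format(raw);
    }

    public static void main(String[] args) {
        double storedPercent = 0.9000000002; // what the file holds for 90%
        double storedCurrency = 0.59999;     // what the file holds for $0.60
        System.out.println(storedPercent + " -> " + asPercent(storedPercent));
        System.out.println(storedCurrency + " -> " + asCurrency(storedCurrency));
    }
}
```

The point is that the display string depends on a format that is stored 
separately from the number, so a parser that returns only the number cannot 
reproduce "90%" or "$0.60" without applying that format itself.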
                
> iWork number cell formats which are being modified in parsing
> -------------------------------------------------------------
>
>                 Key: TIKA-922
>                 URL: https://issues.apache.org/jira/browse/TIKA-922
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.0
>         Environment: Windows 7, 64 bit
>            Reporter: Erik Peterson
>              Labels: iwork
>
> iWork number cell formats are parsed by Tika, but the values come out in a modified form:
>   Percentage turns into a decimal, e.g. 90% becomes .9000000002
>   Accounting appends a $, but the $ is missing from the parsed data
>   Fraction is turned into a decimal
>   Number System (e.g. Binary) is translated to decimal, e.g. '11001000' becomes '200'
>   Scientific numbers are translated to decimal, e.g. 9.0000E-03 becomes 9000
>   Drop-down menu: all the menu items are parsed, but not which one is selected
>   Currency & Number aren't displayed properly, e.g. $0.60 becomes .59999


        
