I'm having an issue with custom number formats in an Excel spreadsheet.
I found some fairly old issues that lead me to believe custom formats
should be supported, and in my testing they are, for the most part.

https://issues.apache.org/jira/browse/TIKA-103
https://issues.apache.org/jira/browse/TIKA-360
https://issues.apache.org/jira/browse/TIKA-2025

Hopefully my attachment comes through.  I've made a simple xlsx with 3
columns: 'formatting', 'expected' and 'actual'.  The 'formatting'
column holds the name of the built-in format applied, or the
definition of the custom format.  The 'expected' column is a
text-formatted version of what I expect, and the 'actual' column is
the same value with that formatting applied.

Things seem to work fine for built-in formats.  The one exception is
that the 14-digit number does not come through Tika in E-notation, but
that seems to be due to TIKA-2025, so that's fine.  However, my two
custom formats that zero-pad don't seem to work at all, while my
format that appends an 'a' to a number works fine.  I've pasted the
plain text output from the TikaCLI app below.

Is there some way to get Tika to respect the zero-pad formats?  Or are
my expectations wrong somehow?

Sheet1
formatting expected actual
General 123 123
General 12345678901234 12345678901234
General 12345678901 12345678901
Short Date 12/18/19 12/18/19
Long Date Wednesday, December 18, 2019 Wednesday, December 18, 2019
Percentage 50.00% 50.00%
Number w Thousands Sep 1,234 1,234
Accounting $ (1,234.56) $   (1,234.56)
0# 01 1
0############# 012345678980123 1.23457E+12
###a 123a 123a
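
One possible explanation, offered only as a guess and not a confirmed
diagnosis: if the Excel format strings end up being handed to
java.text.DecimalFormat somewhere underneath Tika (for example via
Apache POI), then a pattern with '0' before '#' may simply be rejected
as malformed, while '###a' is accepted with 'a' as a literal suffix.
The sketch below uses only the JDK, no Tika or POI.  The class name
FormatPatternCheck and the helper tryFormat are made up for
illustration.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical helper class; not part of Tika or POI.
class FormatPatternCheck {

    // Formats value with the given pattern, or returns null when
    // java.text.DecimalFormat refuses the pattern as malformed.
    static String tryFormat(String pattern, long value) {
        try {
            // Locale.ROOT keeps digits and separators locale-independent.
            DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(Locale.ROOT);
            return new DecimalFormat(pattern, symbols).format(value);
        } catch (IllegalArgumentException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // The first three patterns are the custom formats from the sheet;
        // "00" is a zero-padding pattern written the DecimalFormat way.
        String[] patterns = {"0#", "0#############", "###a", "00"};
        long[] values = {1L, 1L, 123L, 1L};
        for (int i = 0; i < patterns.length; i++) {
            String result = tryFormat(patterns[i], values[i]);
            System.out.println(patterns[i] + " -> "
                    + (result == null ? "(pattern rejected)" : result));
        }
    }
}
```

If this guess is right, the two zero-pad patterns would be rejected
here while '###a' formats 123 as 123a, which would match the output
above.  It would not by itself show where in Tika the fallback to the
unformatted value happens.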

Thanks for any advice!

Attachment: format_tests.xlsx
Description: MS-Excel 2007 spreadsheet
