On Thu, 26 Jan 2012, Gangwal, Adish (IS Consultant) wrote:
When I parse an Excel file that has an empty cell, it doesn't create an
extra tab character.
If there are three cells and the middle one is empty, it skips the
middle cell and outputs only the 1st and 3rd cells, separated by a single tab.
Tika itself doesn't generate tab characters; it generates XHTML table
elements. It's the text content handler that turns those into tabs.
In general, though, Tika only outputs the text that is actually present,
so an empty cell produces nothing.
If you're trying to generate a CSV or similar, and want full control over
what shows up (missing cells, etc.), then I'd suggest you look at using
Apache POI directly.
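
As a rough sketch of what "full control" could look like, the gap-filling
step is shown below in plain Java, with no POI dependency. The class name
SparseRowToTsv and the method toTsvLine are made up for illustration. The
sketch assumes you have already collected each row's cells keyed by column
index; with POI, Cell.getColumnIndex() reports that index, and
Row.getCell(int, Row.MissingCellPolicy) is another way to see the gaps.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Tika or POI: pads missing columns so
// that every row yields the same number of tab-separated fields.
class SparseRowToTsv {

    // cells: column index -> text, containing only the cells that exist.
    // width: number of columns to emit for this row.
    static String toTsvLine(Map<Integer, String> cells, int width) {
        StringBuilder line = new StringBuilder();
        for (int col = 0; col < width; col++) {
            if (col > 0) {
                line.append('\t');      // separator before every field but the first
            }
            String value = cells.get(col);
            if (value != null) {
                line.append(value);     // a missing cell leaves the field empty
            }
        }
        return line.toString();
    }

    public static void main(String[] args) {
        // Three columns, middle one empty (the case from the question).
        Map<Integer, String> cells = new TreeMap<>();
        cells.put(0, "first");
        cells.put(2, "third");
        // Tabs are made visible so the empty middle field can be seen.
        System.out.println(toTsvLine(cells, 3).replace("\t", "<TAB>"));
    }
}
```

With the three-cell row from the question, the empty middle cell shows up
as two consecutive tabs rather than being skipped.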
Nick