On Thu, 26 Jan 2012, Gangwal, Adish (IS Consultant) wrote:
When I parse an Excel file that has an empty cell, the output doesn't contain an extra tab character for it.

If there are three cells and the middle one is empty, it skips the middle cell and outputs only the 1st and 3rd cells, separated by a single tab.

Tika itself doesn't generate tab characters; it generates XHTML table elements. It's the text content handler that turns those into tabs.

In general though, Tika will generate the text that is present.
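
To illustrate the split described above, here is a rough sketch using only the JDK's SAX classes. It is a simplified stand-in, not Tika's actual text handler: the class name TableTextSketch, the sample markup and the exact tab/newline placement are all assumptions made for the example. The point is only that the handler can emit a tab for each td it sees, so a cell that never appears in the XHTML cannot produce one.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for a text-producing content handler (not a Tika class).
class TableTextSketch extends DefaultHandler {
    private final StringBuilder out = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        // Copy cell text through unchanged.
        out.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Assumption for this sketch: one tab after every cell, one newline after every row.
        if (qName.equals("td")) {
            out.append('\t');
        } else if (qName.equals("tr")) {
            out.append('\n');
        }
    }

    // Parses a small XHTML fragment and returns the plain text the handler built.
    static String render(String xhtml) {
        try {
            TableTextSketch handler = new TableTextSketch();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xhtml)), handler);
            return handler.out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample markup", e);
        }
    }

    public static void main(String[] args) {
        // Three td elements, the middle one empty: a tab is still emitted for it.
        System.out.print(render("<table><tr><td>a</td><td></td><td>c</td></tr></table>"));
        // Only two td elements: nothing marks the position of the missing cell.
        System.out.print(render("<table><tr><td>a</td><td>c</td></tr></table>"));
    }
}
```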

If you're trying to generate a CSV or similar and want full control over what shows up (missing cells, etc.), then I'd suggest using Apache POI directly.
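
As a minimal sketch of what "full control" could look like, the snippet below walks every column index and writes an empty field wherever no cell is stored. To keep it runnable without extra jars, a Map from column index to text stands in for a spreadsheet row; the class SparseRowTsv and the method toTsvLine are hypothetical names. With POI, the same loop would presumably run up to Row.getLastCellNum() and fetch each cell with Row.getCell(int, Row.MissingCellPolicy), but that wiring is not shown here.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: a sparse row is modelled as column index -> cell text.
class SparseRowTsv {

    // Builds one tab-separated line with a field for every column in
    // [0, columnCount), leaving the field empty when the row has no cell there.
    static String toTsvLine(Map<Integer, String> cells, int columnCount) {
        StringBuilder line = new StringBuilder();
        for (int col = 0; col < columnCount; col++) {
            if (col > 0) {
                line.append('\t'); // separator between columns, never skipped
            }
            String value = cells.get(col);
            if (value != null) {
                line.append(value);
            }
        }
        return line.toString();
    }

    public static void main(String[] args) {
        Map<Integer, String> row = new TreeMap<>();
        row.put(0, "first");
        row.put(2, "third"); // column 1 was never stored, like an empty Excel cell
        // The middle column is kept as an empty field between two tabs.
        System.out.println(toTsvLine(row, 3));
    }
}
```

Iterating over indexes rather than over the stored cells is the design choice that matters: the column count drives the output, so a gap in the data shows up as an empty field instead of disappearing.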

Nick
