On Thu, 6 Oct 2011, Joe Gallo wrote:
I ran into a problem with Tika extraction of ppt files today, and I think I traced it back to some mistaken code in the HSLFExtractor:
    xhtml.characters( hf.getFooterText() ); <----------
I think this was raised in TIKA-727 and fixed in r1177313. Any chance you could check with an svn checkout or a recent nightly build and verify that your problem is fixed?
Nick
