RE: PDF text extracted without spaces

Jukka Zitting Sun, 05 Dec 2010 03:54:53 -0800

Hi,

From: Ganesh [mailto:[email protected]]
> I am a newbie with Tika. I am using the latest version, 0.8. I extracted
> text from a PDF document but found spaces and newlines missing. Indexing
> the data gives wrong results. Could anyone in this group help me?


That's an unfortunate regression that got included in the 0.8 release. See 
TIKA-548 [1] for the details.

The problem is fixed in the latest 0.9-SNAPSHOT version, and we probably should 
cut a new release soon with this fix.

[1] https://issues.apache.org/jira/browse/TIKA-548

BR,

Jukka Zitting
