Re: PDF text extracted without spaces

Ganesh Mon, 06 Dec 2010 01:45:49 -0800

Exactly the same issue. Spaces and newlines are not extracted properly.

When could we expect the new release?


Regards
Ganesh 

----- Original Message ----- 
From: "Jukka Zitting" <[email protected]>
To: <[email protected]>
Sent: Sunday, December 05, 2010 5:24 PM
Subject: RE: PDF text extracted without spaces


> Hi,
> 
> From: Ganesh [mailto:[email protected]]
>> I am a newbie with Tika. I am using the latest version, 0.8. I extracted
>> text from a PDF document but found spaces and newlines missing. Indexing
>> the data gives wrong results. Could anyone in this group help me?
> 
> That's an unfortunate regression that got included in the 0.8 release. See 
> TIKA-548 [1] for the details.
> 
> The problem is fixed in the latest 0.9-SNAPSHOT version, and we probably 
> should cut a new release soon with this fix.
> 
> [1] https://issues.apache.org/jira/browse/TIKA-548
> 
> BR,
> 
> Jukka Zitting
>
