I suggest you give "LocationTextExtractionStrategy" a shot.  It has a
bit more brains when it comes to deciding which characters should be
grouped and which should not.

From the JavaDoc:
This renderer also uses a simple strategy based on the font metrics to
determine if a blank space should be inserted into the output. 

Specifically:
  float dist = chunk.distanceFromEndOf(lastChunk);
                    
  if (dist < -chunk.charSpaceWidth)
    sb.append(' ');


And thanks to this being an open source project, if that isn't sensitive
enough for you, you could add a "spaceFudge" setting to
LocationTextExtractionStrategy and play with its value to get the output
you want.

So you'd want something like:

    String pageText = PdfTextExtractor.getTextFromPage(reader, pageNum,
            new LocationTextExtractionStrategy());
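
As for the "spaceFudge" idea, here is a rough, standalone sketch of how such
a knob might behave. Nothing in it is iText code: the Chunk class, the join()
method and the spaceFudge parameter are all hypothetical stand-ins, and the
forward-gap test is my assumption about where a tunable threshold would go.
Only the "dist < -charSpaceWidth" check mirrors the snippet quoted above.

```java
import java.util.List;

/** Standalone model of a gap-based space heuristic. Not iText code. */
class SpaceFudgeSketch {

    /** Hypothetical stand-in for a run of text positioned on one line. */
    static final class Chunk {
        final String text;
        final float startX;          // where the run starts along the line
        final float endX;            // where the run ends along the line
        final float charSpaceWidth;  // width of a space in the run's font

        Chunk(String text, float startX, float endX, float charSpaceWidth) {
            this.text = text;
            this.startX = startX;
            this.endX = endX;
            this.charSpaceWidth = charSpaceWidth;
        }

        /** Gap between the end of the other run and the start of this one. */
        float distanceFromEndOf(Chunk other) {
            return startX - other.endX;
        }
    }

    /**
     * Concatenates runs that are assumed to sit on the same line, inserting
     * a blank when the gap between two runs looks like a word break.
     * spaceFudge is a hypothetical setting: smaller values insert spaces
     * more readily.
     */
    static String join(List<Chunk> chunks, float spaceFudge) {
        StringBuilder sb = new StringBuilder();
        Chunk lastChunk = null;
        for (Chunk chunk : chunks) {
            if (lastChunk != null) {
                float dist = chunk.distanceFromEndOf(lastChunk);
                if (dist < -chunk.charSpaceWidth) {
                    // Same test as the quoted snippet: a large backward jump.
                    sb.append(' ');
                } else if (dist > chunk.charSpaceWidth * spaceFudge) {
                    // Assumed forward-gap test, scaled by the tunable fudge.
                    sb.append(' ');
                }
            }
            sb.append(chunk.text);
            lastChunk = chunk;
        }
        return sb.toString();
    }
}
```

With two runs separated by a gap of 2 units and a space width of 5, a
spaceFudge of 0.5f leaves the words glued together, while 0.3f puts a blank
between them. That is the sort of value you would play with.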

I wish you luck.

--Mark Storer
  Senior Software Engineer
  Cardiff.com
 
import legalese.Disclaimer;
Disclaimer<Cardiff> DisCard = null;
 
 

> -----Original Message-----
> From: DivyaKambhatla [mailto:[email protected]] 
> Sent: Wednesday, March 16, 2011 7:13 AM
> To: [email protected]
> Subject: Re: [iText-questions] iText 5.0.5 + spaces between words.
> 
> Ohh..
> Thank You for your valuable inputs.
> 
> --
> View this message in context:
> http://itext-general.2136553.n4.nabble.com/iText-5-0-5-spaces-between-words-tp3381312p3381860.html
> Sent from the iText - General mailing list archive at Nabble.com.
> 
> --------------------------------------------------------------
> ----------------
> Colocation vs. Managed Hosting
> A question and answer guide to determining the best fit for 
> your organization - today and in the future.
> http://p.sf.net/sfu/internap-sfd2d
> _______________________________________________
> iText-questions mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/itext-questions
> 
> iText(R) is a registered trademark of 1T3XT BVBA.
> Many questions posted to this list can (and will) be answered 
> with a reference to the iText book: 
> http://www.itextpdf.com/book/ Please check the keywords list 
> before you ask for examples: http://itextpdf.com/themes/keywords.php
> 
> 

