Already fixed in the SVN.

Paulo
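For readers following the thread, here is a minimal sketch of the substring mismatch described in the quoted message. It is written in Java and only emulates the .NET argument convention. SubstringDemo, javaSingleChar and dotNetSubstring are hypothetical names used for illustration; none of this is iText or iTextSharp code.

```java
// Illustration only: hypothetical helper names, not part of iText or iTextSharp.
final class SubstringDemo {

    // Java convention: substring(beginIndex, endIndex), with endIndex exclusive.
    static String javaSingleChar(String text, int charIndex) {
        return text.substring(charIndex, charIndex + 1);
    }

    // Emulates the .NET convention: Substring(startIndex, length).
    static String dotNetSubstring(String text, int startIndex, int length) {
        if (startIndex < 0 || length < 0 || startIndex + length > text.length()) {
            throw new IndexOutOfBoundsException(
                "Index and length must refer to a location within the string.");
        }
        return text.substring(startIndex, startIndex + length);
    }

    public static void main(String[] args) {
        String text = "iText";
        // Java original: exactly one character.
        System.out.println(javaSingleChar(text, 2));
        // Corrected port: length 1 yields the same single character.
        System.out.println(dotNetSubstring(text, 2, 1));
        // Literal port: charIndex + 1 is read as a length, so too much text comes back.
        System.out.println(dotNetSubstring(text, 1, 1 + 1));
        // Literal port past the midpoint: start + length runs off the end of the string.
        try {
            dotNetSubstring(text, 3, 3 + 1);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("out of range: " + e.getMessage());
        }
    }
}
```

With a literal port, the call fails as soon as charIndex + (charIndex + 1) exceeds the string length, which matches the report that the exception appeared once charIndex was near half the text's length.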
On Fri, Oct 12, 2012 at 8:58 PM, Alekz ! <alek...@hotmail.com> wrote:
> Hello all,
>
> I've been playing around with the LocationTextExtractionStrategy class,
> trying to combine iText's text extraction capabilities to build a "Select
> Text" tool (like the one in Acrobat Reader) on a rendered PDF.
>
> Digging through the source code I found 2 things I'd like to share with
> you: one is what I think is a bug, the other is more of a question.
>
> The bug:
>
> LocationTextExtractionStrategy uses TextRenderInfo internally, and this
> class added a new method in v5.3.3: GetCharacterRenderInfos, which returns
> info for individual glyphs of a text chunk. This method calls a private
> constructor of TextRenderInfo, which has this body:
>
> private TextRenderInfo(TextRenderInfo parent, int charIndex, float horizontalOffset)
> {
>     this.text = parent.text.Substring(charIndex, charIndex + 1);
>     this.textToUserSpaceTransformMatrix = new Matrix(horizontalOffset, 0).Multiply(parent.textToUserSpaceTransformMatrix);
>     this.gs = parent.gs;
>     this.markedContentInfos = parent.markedContentInfos;
> }
>
> The first line should be parent.text.Substring(charIndex, 1); and I think
> the bug comes from the port from Java (I don't know Java, but I guess that
> Java's substring function takes <startIndex, endIndex> as parameters, while
> the .NET version of Substring takes <startIndex, length>).
>
> Without this fix I was getting an ArgumentOutOfRange exception: "Index and
> length must refer to a location within the string." when the charIndex
> variable had a value near half the text's length.
>
> Now, the question/request:
>
> While checking the information obtained from the chunks' TextRenderInfo
> objects, I've seen that the font size returned (at least with the PDF file
> I've been working with) is always 1.0f, no matter what chunk I'm parsing.
> But with Adobe I can select text, and with the Typewriter tool I can see
> what font and font size a chunk uses, for example Arial 20pt.
>
> I came back to iText and found out that the internal variable
> textToUserSpaceTransformMatrix holds a multiplier (?) for the font size in
> the "d" element (Matrix.I22 constant), so I added a method to the
> TextRenderInfo class, like this:
>
> public float GetRealFontSize()
> {
>     return gs.GetFontSize() * textToUserSpaceTransformMatrix[Matrix.I22];
> }
>
> Is there a better way to get the font size of a glyph without adding this
> custom method?
>
> I need the font size to calculate the user's selection when they are using
> the mouse, and the GraphicState (gs) is not accessible from outside iText.
>
> Thanks for reading! Any tips are very welcome.
>
> Alex
>
> ------------------------------------------------------------------------------
> Don't let slow site performance ruin your business. Deploy New Relic APM
> Deploy New Relic app performance management and know exactly
> what is happening inside your Ruby, Python, PHP, Java, and .NET app
> Try New Relic at no cost today and get our sweet Data Nerd shirt too!
> http://p.sf.net/sfu/newrelic-dev2dev
> _______________________________________________
> iText-questions mailing list
> iText-questions@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/itext-questions
>
> iText(R) is a registered trademark of 1T3XT BVBA.
> Many questions posted to this list can (and will) be answered with a
> reference to the iText book: http://www.itextpdf.com/book/
> Please check the keywords list before you ask for examples:
> http://itextpdf.com/themes/keywords.php
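The font size calculation proposed in the original message can be sketched in isolation as follows. This is a rough illustration, not iText code. FontSizeDemo and effectiveFontSize are hypothetical names, and the matrix is assumed to be a plain row-major 3x3 float array in which index 4 is the "d" element (the role the message assigns to Matrix.I22). The sample matrix values are invented for the demonstration.

```java
// Illustration only: hypothetical names and a stand-in matrix layout, not iText's API.
final class FontSizeDemo {

    // Assumed layout: row-major 3x3 array [a b 0, c d 0, e f 1]; index 4 is "d".
    static final int I22 = 4;

    // Scales the nominal font size by the vertical scale factor of the matrix.
    // This only reflects the rendered size when the text is not rotated or skewed.
    static float effectiveFontSize(float nominalFontSize, float[] textToUserSpace) {
        return nominalFontSize * textToUserSpace[I22];
    }

    public static void main(String[] args) {
        // Invented example: nominal size 1.0 with the scale carried by the matrix.
        float[] matrix = {20f, 0f, 0f, 0f, 20f, 0f, 72f, 700f, 1f};
        System.out.println(effectiveFontSize(1.0f, matrix));
    }
}
```

This mirrors the case described in the message, where the graphics state reports a font size of 1.0 and the real scale lives in the transformation matrix. For rotated or skewed text the "d" element alone would not give the rendered size, so the sketch should be read with that limitation in mind.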