Dear Michael,

How do I add two TextChunk objects?

Regards,
Kausik

On Tue, Jan 15, 2013 at 4:08 AM, mkl <m...@wir-sind-cool.org> wrote:

> Kausik Porel,
>
> Kausik Porel wrote
> > I have tried the following code to extract the coordinates of the
> > words, but it mainly gives the position of a line, not of a word. Can
> > you please look at the code and suggest a fix? The code is attached to
> > this mail. It is a copy of LocationTextExtractionStrategy with some
> > code added as per my requirement.
> >
> > TextStrategy.txt (19K)
> > <http://itext-general.2136553.n4.nabble.com/attachment/4657368/0/TextStrategy.txt>
>
> Yes, it obviously gives the position of a line, or of a segment not directly
> adjacent to the previous one, because you add data to the StringBuilder
> only in exactly those situations, i.e. if (dist < -chunk.charSpaceWidth), if
> (dist > chunk.charSpaceWidth / 2.0f), and if not
> (chunk.SameLine(lastChunk)).
>
> You completely forget the case of chunk.text containing a space character,
> let alone many! If there is a space character in the chunk, you have to
> analyze the partial chunk dimensions. Unfortunately the necessary
> information is lost at that point in time because TextChunk does not carry
> the needed data.
>
> Thus, unless you want to enhance the TextChunk class, you should check
> as early as RenderText() whether renderInfo.GetText() contains space
> characters. If it does, split the TextRenderInfo into individual
> per-character TextRenderInfo objects (TextRenderInfo has a method for
> that!) and add a matching TextChunk object for each of them.
>
> Now, when you hit a text chunk consisting only of a space character, you
> have found the end of a word.
>
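
A minimal sketch of the grouping step described above, in plain Java rather than iTextSharp. `Chunk`, `merge`, `WordGrouper`, and the `maxGap` threshold are hypothetical stand-ins, not iText classes or settings. It assumes the per-character chunks of one line arrive in reading order, and it treats "adding" two chunks as concatenating their text and spanning both extents.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a text chunk; NOT iText's TextChunk class.
final class Chunk {
    final String text;
    final float startX; // left edge on the baseline
    final float endX;   // right edge on the baseline

    Chunk(String text, float startX, float endX) {
        this.text = text;
        this.startX = startX;
        this.endX = endX;
    }

    // "Adding" two chunks: concatenate the text and span both extents.
    Chunk merge(Chunk next) {
        return new Chunk(text + next.text,
                Math.min(startX, next.startX),
                Math.max(endX, next.endX));
    }
}

final class WordGrouper {
    // Groups per-character chunks of one line into words. A word ends at a
    // whitespace-only chunk or at a horizontal gap wider than maxGap
    // (maxGap is an assumed, caller-chosen threshold).
    static List<Chunk> groupWords(List<Chunk> chars, float maxGap) {
        List<Chunk> words = new ArrayList<>();
        Chunk current = null;
        for (Chunk c : chars) {
            boolean isSpace = c.text.trim().isEmpty();
            boolean bigGap = current != null && c.startX - current.endX > maxGap;
            if (current != null && (isSpace || bigGap)) {
                words.add(current); // close the word collected so far
                current = null;
            }
            if (!isSpace) {
                current = (current == null) ? c : current.merge(c);
            }
        }
        if (current != null) {
            words.add(current); // flush the last word of the line
        }
        return words;
    }
}
```

Calling `WordGrouper.groupWords(chars, maxGap)` returns one merged chunk per word. In a real strategy the chunks would come from RenderText() and carry full start and end vectors rather than only x coordinates.
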
> Additionally, you accumulate lastWidth += rect.Width but completely forget
> to account for dist.
>
> Furthermore you also only set your variables `last*` at the beginning of a
> line. Whenever you process a horizontal gap, though, i.e. whenever (the
> absolute value of) dist is too big, you set them to 0.
>
> Regards,   Michael
>
>
>
> --
> View this message in context:
> http://itext-general.2136553.n4.nabble.com/How-do-I-extract-the-coordinate-of-the-words-from-a-pdf-document-tp4657306p4657375.html
> Sent from the iText - General mailing list archive at Nabble.com.
>
>
> _______________________________________________
> iText-questions mailing list
> iText-questions@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/itext-questions
>
> iText(R) is a registered trademark of 1T3XT BVBA.
> Many questions posted to this list can (and will) be answered with a
> reference to the iText book: http://www.itextpdf.com/book/
> Please check the keywords list before you ask for examples:
> http://itextpdf.com/themes/keywords.php
>