Dear all,

I am trying to check the underline and overstrike status of characters in a
PDF.

I have heard that underline and overstrike are drawn as graphical objects in
PDF, so I tried to detect the rectangle and line objects first. By comparing
their positions with the bounding boxes of the characters, I should be able to
identify which characters are underlined and which are struck through.
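
The comparison step could look roughly like the sketch below. It is only an illustration of the idea: `Box`, `Decoration` and `DecorationGuess` are hypothetical helpers, not iTextSharp types, the 25% thresholds are guesses, and it assumes the line and the character box are already expressed in the same coordinate system.

```csharp
using System;

// Hypothetical helper: a character bounding box (not an iTextSharp type).
public struct Box
{
    public float Left, Bottom, Right, Top;

    public Box(float left, float bottom, float right, float top)
    {
        Left = left; Bottom = bottom; Right = right; Top = top;
    }
}

public enum Decoration { None, Underline, Strikethrough }

public static class DecorationGuess
{
    // Classifies a horizontal line at height lineY spanning x1..x2 against
    // one character box. The 0.25 factors are assumed thresholds.
    public static Decoration Classify(Box ch, float x1, float x2, float lineY)
    {
        float lo = Math.Min(x1, x2);
        float hi = Math.Max(x1, x2);

        // The line must overlap the character horizontally.
        if (hi <= ch.Left || lo >= ch.Right)
            return Decoration.None;

        float band = 0.25f * (ch.Top - ch.Bottom);

        // A line through the middle half of the box is treated as a strike.
        if (lineY > ch.Bottom + band && lineY < ch.Top - band)
            return Decoration.Strikethrough;

        // A line near the bottom edge of the box is treated as an underline.
        if (lineY >= ch.Bottom - band && lineY <= ch.Bottom + band)
            return Decoration.Underline;

        return Decoration.None;
    }
}
```

For example, `DecorationGuess.Classify(new Box(10, 100, 20, 112), 5, 25, 106)` would be called once per character and line pair. The thresholds would need tuning against real fonts, since the position of the baseline inside the box depends on how the box is computed.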

I have a problem with the first step: detecting the positions of the graphical
objects. My code prints the following:
==OUTPUT==
PageSize: 0,0 [595,842]
RECTANGLE: 0,0.1 [595.2,841.8]
LINE: (0,-1.6) - (13,-1.6)
LINE: (0,5.1) - (12,5.1)
==OUTPUT==
The rectangle here is the page and can be ignored. However, the lines seem
to be in a different coordinate system from the page. Any ideas? Please help.
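
One possible explanation, stated as an assumption because the content stream of the sample has not been inspected here: in PDF, path operands are given in user space, and the `cm` operator can change the current transformation matrix (CTM) before a path is drawn, with `q` and `Q` saving and restoring it. If that is what happens in this file, the raw operands of `l` would have to be mapped through the CTM to get page coordinates. A minimal sketch of that bookkeeping follows; `Matrix` and `CtmTracker` are hypothetical helpers, not iTextSharp classes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: a PDF-style matrix [a b c d e f].
public struct Matrix
{
    public readonly float A, B, C, D, E, F;

    public static readonly Matrix Identity = new Matrix(1, 0, 0, 1, 0, 0);

    public Matrix(float a, float b, float c, float d, float e, float f)
    {
        A = a; B = b; C = c; D = d; E = e; F = f;
    }

    // Returns this x other, i.e. "this" is applied first, then "other".
    public Matrix Multiply(Matrix o)
    {
        return new Matrix(
            A * o.A + B * o.C,       A * o.B + B * o.D,
            C * o.A + D * o.C,       C * o.B + D * o.D,
            E * o.A + F * o.C + o.E, E * o.B + F * o.D + o.F);
    }

    // Maps a point (x, y) through this matrix.
    public void Transform(float x, float y, out float px, out float py)
    {
        px = A * x + C * y + E;
        py = B * x + D * y + F;
    }
}

// Hypothetical helper: tracks the CTM across q, Q and cm operators.
public class CtmTracker
{
    private readonly Stack<Matrix> saved = new Stack<Matrix>();

    public Matrix Current = Matrix.Identity;

    // "q": save the graphics state (only the CTM is tracked here).
    public void Save() { saved.Push(Current); }

    // "Q": restore the most recently saved CTM, if any.
    public void Restore()
    {
        if (saved.Count > 0) Current = saved.Pop();
    }

    // "a b c d e f cm": the new matrix is premultiplied onto the CTM.
    public void Concat(Matrix m) { Current = m.Multiply(Current); }
}
```

In a token loop like the one in this post, `q`, `Q` and `cm` would be handled next to `re` and `l`, with `cm` reading the last six numbers. Each line endpoint would then be passed through `Current.Transform(...)` before it is compared with the page size or with character positions.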


The sample PDF and a code fragment are included to illustrate the
problem.

UnderAndStrike.pdf
<http://itext-general.2136553.n4.nabble.com/file/n4657914/UnderAndStrike.pdf>  

==CODE==
int pg = 1;

iTextSharp.text.Rectangle pgsz = reader.GetPageSize(pg);
Console.WriteLine("PageSize: " + pgsz.Left + "," + pgsz.Bottom + " [" +
        pgsz.Width + "," + pgsz.Height + "]");


List<string> buf = new List<string>();
byte[] pageBytes = reader.GetPageContent(pg);
PRTokeniser tokeniser = new PRTokeniser(new
RandomAccessFileOrArray(pageBytes));

PRTokeniser.TokType tokenType;
string tokenValue;
while (tokeniser.NextToken())
{
        tokenType = tokeniser.TokenType;
        tokenValue = tokeniser.StringValue;

        if (tokenType == PRTokeniser.TokType.NUMBER)
        {
                buf.Add(tokenValue);
        }
        else if (tokenType == PRTokeniser.TokType.OTHER)
        {
                if (tokenValue == "re")
                {
                        // rectangle
                        float x = float.Parse(buf[buf.Count - 4]);
                        float y = float.Parse(buf[buf.Count - 3]);
                        float w = float.Parse(buf[buf.Count - 2]);
                        float h = float.Parse(buf[buf.Count - 1]);

                        Console.WriteLine("RECTANGLE: " + x.ToString() + "," +
                                y.ToString() + " [" + w.ToString() + "," + h.ToString() + "]");
                }
                else if (tokenValue == "l")
                {
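                        // "l" itself takes only the end point (x y). The two numbers
                        // before it are assumed to be the operands of the preceding
                        // "m", which are still in buf because buf is never cleared.
                        float fx = float.Parse(buf[buf.Count - 4]);
                        float fy = float.Parse(buf[buf.Count - 3]);
                        float x = float.Parse(buf[buf.Count - 2]);
                        float y = float.Parse(buf[buf.Count - 1]);

                        Console.WriteLine("LINE: (" + fx.ToString() + "," +
                                fy.ToString() + ") - (" + x.ToString() + "," + y.ToString() + ")");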
                }
        }
}
==END==



