On 18/02/2012 15:25, sselvia wrote:
> I sent a request to get pricing for commercial support. I brought up the PDF
> in Preview and saved the page in question to a separate PDF. When I process
> the file after the "Save As" the word "null" is returned by the getText()
> method.
> http://itext-general.2136553.n4.nabble.com/file/n4399827/iTextTest.pdf
> iTextTest.pdf
>
> Thanks for all of the help to this point.

I haven't seen the question on the internal support list yet. Am I correct that you posted it twice, once on the free list and once on the customers' list? I ask because I can't find it on the customers' support list.

Anyway, this is the syntax inside your PDF:

q Q q 0 0.03 612 791.97 re W n /Gs1 gs /Cs1 cs 1 1 1 sc /Gs2 gs 0 0.029999 612 791.97 re f Q q 0 594.42 612 197.58 re W n /Gs2 gs q 619.2599 0 0 198.48 -3.600012 594.42 cm /Im1 Do Q Q q 0 395.94 612 198.48 re W n /Gs2 gs q 619.2599 0 0 198.48 -3.600012 395.94 cm /Im2 Do Q Q q 0 197.46 612 198.48 re W n /Gs2 gs q 619.2599 0 0 198.48 -3.600012 197.46 cm /Im3 Do Q Q q 0 0.03 612 197.43 re W n /Gs2 gs q 619.2599 0 0 198.42 -3.600012 -0.959936 cm /Im4 Do Q Q q 0 0 612 792 re W n 0.754 sc /Gs1 gs /Gs2 gs BT -0.0004 Tc 28.02 0 0 28.02 69.12 742.98 Tm /TT1.1 1 Tf (!"#$%&'\(%\)*+) Tj 0 Tc ET BT 28.02 0 0 28.02 211.5008 742.98 Tm /G2 1 Tf <0001> Tj ET BT -0.0001 Tc 28.02 0 0 28.02 217.8614 742.98 Tm /TT1.1 1 Tf (\),) Tj 0 Tc ET BT 28.02 0 0 28.02 241.74 742.98 Tm /G2 1 Tf <0001> Tj ET BT 0.0005 Tc 28.02 0 0 28.02 248.1006 742.98 Tm /TT1.1 1 Tf (-.&.*\() Tj 0 Tc ET BT 28.02 0 0 28.02 328.6216 742.98 Tm /G2 1 Tf <0001> Tj ET BT -0.0011 Tc 28.02 0 0 28.02 334.9822 742.98 Tm /TT1.1 1 Tf (/012\)$\)3%&) Tj 0 Tc ET BT 28.02 0 0 28.02 459.5423 742.98 Tm /G2 1 Tf <0001> Tj ET BT -0.0001 Tc 28.02 0 0 28.02 465.9028 742.98 Tm /TT1.1 1 Tf (42.*1+) Tj 0 Tc ET 0 sc BT -0.0004 Tc 28.02 0 0 28.02 67.98239 744.1204 Tm /TT1.1 1 Tf (!"#$%&'\(%\)*+) Tj 0 Tc ET BT 28.02 0 0 28.02 210.3632 744.1204 Tm /G2 1 Tf <0001> Tj ET BT -0.0001 Tc 28.02 0 0 28.02 216.7238 744.1204 Tm /TT1.1 1 Tf (\),) Tj 0 Tc ET BT 28.02 0 0 28.02 240.6024 744.1204 Tm /G2 1 Tf <0001> Tj ET BT 0.0005 Tc 28.02 0 0 28.02 246.9629 744.1204 Tm /TT1.1 1 Tf (-.&.*\() Tj 0 Tc ET BT 28.02 0 0 28.02 327.484 744.1204 Tm /G2 1 Tf <0001> Tj ET BT -0.0011 Tc 28.02 0 0 28.02 333.8445 744.1204 Tm /TT1.1 1 Tf (/012\)$\)3%&) Tj 0 Tc ET BT 28.02 0 0 28.02 458.4047 744.1204 Tm /G2 1 Tf 
<0001> Tj ET BT -0.0001 Tc 28.02 0 0 28.02 464.7652 744.1204 Tm /TT1.1 1 Tf (42.*1+) Tj 0 Tc ET 0.754 sc BT -0.0023 Tc 28.02 0 0 28.02 112.5062 709.3196 Tm /TT1.1 1 Tf (\)*) Tj 0 Tc ET BT 28.02 0 0 28.02 142.566 709.3196 Tm /G2 1 Tf <0001> Tj ET BT -0.0006 Tc 28.02 0 0 28.02 148.9266 709.3196 Tm /TT1.1 1 Tf (\(5.) Tj 0 Tc ET BT 28.02 0 0 28.02 187.7455 709.3196 Tm /G2 1 Tf <0001> Tj ET BT 0.0018 Tc 28.02 0 0 28.02 194.106 709.3196 Tm /TT1.1 1 Tf (6',.) Tj 0 Tc ET BT 28.02 0 0 28.02 244.2646 709.3196 Tm /G2 1 Tf <0001> Tj ET BT -0.0001 Tc 28.02 0 0 28.02 250.6252 709.3196 Tm /TT1.1 1 Tf (7%.$1) Tj 0 Tc ET BT 28.02 0 0 28.02 308.1054 709.3196 Tm /G2 1 Tf <0001> Tj ET BT -0.0003 Tc 28.02 0 0 28.02 314.4043 709.3196 Tm /TT1.1 1 Tf (8+\(%"'\(.) Tj 0 Tc ET BT 28.02 0 0 28.02 416.2234 709.3196 Tm /G2 1 Tf <0001> Tj ET BT 0.0002 Tc 28.02 0 0 28.02 422.5839 709.3196 Tm /TT1.1 1 Tf (,\)2) Tj 0 Tc ET BT 28.02 0 0 28.02 456.4853 709.3196 Tm /G2 1 Tf <0001> Tj ET BT 0.0006 Tc 28.02 0 0 28.02 462.8458 709.3196 Tm /TT1.1 1 Tf (\(5.) Tj 0 Tc ET BT 28.02 0 0 28.02 501.7852 709.3196 Tm /G2 1 Tf <0001> Tj ET 0 sc BT -0.0023 Tc 28.02 0 0 28.02 111.3658 710.46 Tm /TT1.1 1 Tf (\)*) Tj 0 Tc ET BT 28.02 0 0 28.02 141.4256 710.46 Tm /G2 1 Tf <0001> Tj ET BT -0.0006 Tc 28.02 0 0 28.02 147.7861 710.46 Tm /TT1.1 1 Tf (\(5.) Tj 0 Tc ET BT 28.02 0 0 28.02 186.6051 710.46 Tm /G2 1 Tf <0001> Tj ET BT 0.0018 Tc 28.02 0 0 28.02 192.9656 710.46 Tm /TT1.1 1 Tf (6',.) Tj 0 Tc ET BT 28.02 0 0 28.02 243.127 710.46 Tm /G2 1 Tf <0001> Tj ET BT -0.0001 Tc 28.02 0 0 28.02 249.4875 710.46 Tm /TT1.1 1 Tf (7%.$1) Tj 0 Tc ET BT 28.02 0 0 28.02 306.9678 710.46 Tm /G2 1 Tf <0001> Tj ET BT -0.0003 Tc 28.02 0 0 28.02 313.2667 710.46 Tm /TT1.1 1 Tf (8+\(%"'\(.) 
Tj 0 Tc ET BT 28.02 0 0 28.02 415.0858 710.46 Tm /G2 1 Tf <0001> Tj ET BT 0.0002 Tc 28.02 0 0 28.02 421.4463 710.46 Tm /TT1.1 1 Tf (,\)2) Tj 0 Tc ET BT 28.02 0 0 28.02 455.3477 710.46 Tm /G2 1 Tf <0001> Tj ET BT 0.0006 Tc 28.02 0 0 28.02 461.7082 710.46 Tm /TT1.1 1 Tf (\(5.) Tj 0 Tc ET BT 28.02 0 0 28.02 500.6476 710.46 Tm /G2 1 Tf <0001> Tj ET 0.754 sc BT 0.0004 Tc 28.02 0 0 28.02 209.7076 675.7208 Tm /TT1.1 1 Tf (9&:$';'5') Tj 0 Tc ET BT 28.02 0 0 28.02 338.2269 675.7208 Tm /G2 1 Tf <0001> Tj ET BT -0.0004 Tc 28.02 0 0 28.02 344.5874 675.7208 Tm /TT1.1 1 Tf (-%<.2) Tj 0 Tc ET 0 sc BT 0.0004 Tc 28.02 0 0 28.02 208.5671 676.8612 Tm /TT1.1 1 Tf (9&:$';'5') Tj 0 Tc ET BT 28.02 0 0 28.02 337.0865 676.8612 Tm /G2 1 Tf <0001> Tj ET BT -0.0004 Tc 28.02 0 0 28.02 343.447 676.8612 Tm /TT1.1 1 Tf (-%<.2) Tj 0 Tc ET 1 sc BT 0.0008 Tc 13.98 0 0 13.98 393.12 29.1 Tm /TT3.0 1 Tf [ (SDI ) -1 (Environmental ) -1 (Services, ) -1 (Inc. ) ] TJ 0 Tc ET q 97.44208 0 0 65.99999 44.99999 51.48006 cm /Im5 Do Q BT 0.0001 Tc 13.98 0 0 13.98 45.18 28.86 Tm /TT3.0 1 Tf [ (Putnam ) 1 (County ) 1 (Environmental ) 1 (Council, ) 1 (Inc.) -3 ( ) ] TJ 0 Tc ET BT -0.0007 Tc 16.02 0 0 16.02 274.56 119.58 Tm /TT3.0 1 Tf [ (June ) 2 (2010) ] TJ 0 Tc ET q 103.7244 0 0 64.49999 488.16 52.98005 cm /Im6 Do Q Q

This is the result when iText parses this syntax for text:

Implications nullof nullRecent nullHydrologic nullTrends
Implications nullof nullRecent nullHydrologic nullTrends
on nullthe nullSafe nullYield nullEstimate nullfor nullthe null
on nullthe nullSafe nullYield nullEstimate nullfor nullthe null
Ocklawaha nullRiver
Ocklawaha nullRiver
June 2010
SDI Environmental Services, Inc.
Putnam County Environmental Council, Inc.

Why is some of the text duplicated? Because the text occurs twice in the PDF syntax. For instance, (!"#$%&'\(%\)*+) stands for "Implications". It occurs in two places: at coordinate 69.12, 742.98 and at coordinate 67.98239, 744.1204.
Because the two separate instances of the word are so close to each other, you see it only once in the PDF, but that doesn't mean it isn't there twice.

As for the 'null', you have some odd String <0001> that separates the words in those first sentences.

Hope this helps.
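
If you want to check a content stream for this kind of overlap yourself, the following is a minimal sketch in plain Java. It does not use iText at all: the class `DuplicateTextSketch`, its `Chunk` type, the methods `parse` and `dropNearDuplicates`, and the tolerance of 2.0 units are all hypothetical names and values chosen for illustration. It assumes the operators appear in the simple order seen in the excerpt from your file (`... Tm /Font size Tf (string) Tj`), so it is not a general PDF parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of iText: finds strings that are drawn
// twice at almost the same position in a simple content stream.
class DuplicateTextSketch {

    // One shown string plus the translation part (e, f) of its Tm matrix.
    static final class Chunk {
        final String text;
        final double x;
        final double y;

        Chunk(String text, double x, double y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    // Matches "e f Tm /Font size Tf (literal) Tj" or the <hex> variant.
    // Assumption: the operators occur in exactly this order.
    static final Pattern SHOW = Pattern.compile(
        "([-\\d.]+)\\s+([-\\d.]+)\\s+Tm\\s+/\\S+\\s+[-\\d.]+\\s+Tf\\s+"
        + "(\\((?:\\\\.|[^\\\\()])*\\)|<[0-9A-Fa-f]*>)\\s+Tj");

    // Four text blocks copied from the content stream quoted above.
    static final String SAMPLE =
        "BT -0.0004 Tc 28.02 0 0 28.02 69.12 742.98 Tm /TT1.1 1 Tf "
        + "(!\"#$%&'\\(%\\)*+) Tj 0 Tc ET\n"
        + "BT 28.02 0 0 28.02 211.5008 742.98 Tm /G2 1 Tf <0001> Tj ET\n"
        + "BT -0.0004 Tc 28.02 0 0 28.02 67.98239 744.1204 Tm /TT1.1 1 Tf "
        + "(!\"#$%&'\\(%\\)*+) Tj 0 Tc ET\n"
        + "BT 28.02 0 0 28.02 210.3632 744.1204 Tm /G2 1 Tf <0001> Tj ET\n";

    // Collects every shown string together with its Tm position.
    static List<Chunk> parse(String stream) {
        List<Chunk> chunks = new ArrayList<>();
        Matcher m = SHOW.matcher(stream);
        while (m.find()) {
            chunks.add(new Chunk(m.group(3),
                Double.parseDouble(m.group(1)),
                Double.parseDouble(m.group(2))));
        }
        return chunks;
    }

    // Keeps a chunk only if no earlier kept chunk has the same raw string
    // within `tolerance` units on both axes.
    static List<Chunk> dropNearDuplicates(List<Chunk> chunks, double tolerance) {
        List<Chunk> kept = new ArrayList<>();
        for (Chunk c : chunks) {
            boolean duplicate = false;
            for (Chunk k : kept) {
                if (k.text.equals(c.text)
                        && Math.abs(k.x - c.x) <= tolerance
                        && Math.abs(k.y - c.y) <= tolerance) {
                    duplicate = true;
                    break;
                }
            }
            if (!duplicate) {
                kept.add(c);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Chunk> all = parse(SAMPLE);
        List<Chunk> kept = dropNearDuplicates(all, 2.0);
        System.out.println(all.size() + " chunks parsed, " + kept.size() + " kept");
        for (Chunk c : kept) {
            System.out.println(c.text + " at " + c.x + ", " + c.y);
        }
    }
}
```

On the four-block excerpt, the second "Implications" string and the second <0001> sit about 1.14 units away from their first instances on both axes, so with a tolerance of 2.0 only two of the four chunks are kept. The comparison is done on the raw, still-encoded strings, so it says nothing about how <0001> should be decoded.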
