On 7/05/2013 8:08, shailendra3009 wrote:
> I used itextshap to extract text from pdf. i used below code to extract text
> line by line. It is extracting code perfectly only it is not reading white
> spaces in PDF. specially i need to read white spaces using this.

Your question sounds like "I have no money in my wallet; how can I fetch the zero dollar notes from my wallet?"
In a PDF, all text is added at absolute positions. For instance: one word is added at position x = 36, y = 806; another word is added at position x = 300, y = 806. Some other text is added at positions x = 36, y = 790; x = 36, y = 774; x = 36, y = 742; ...

Where are the spaces? There are none! But by doing the math, you can see that there's a gap between the text that starts at position x = 36 and the text that starts at position x = 300.

You can also see a pattern in the y positions: 806 - 16 = 790; 790 - 16 = 774; 774 - 16 = 758; 758 - 16 = 742; ... Nothing was added at y = 758, so it looks as if a line was skipped there. However, as explained multiple times, the concept of a line doesn't exist in PDF. See for instance:
http://stackoverflow.com/questions/16392886/need-to-extract-text-line-by-line-from-pdf-using-itextsharp-and-put-enter-at-eve
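
To make "doing the math" concrete, here is a rough sketch in Python (not iTextSharp code) of how white space could be inferred from positions alone. Everything in it is an assumption for illustration: the function name `rebuild_lines`, the `(x, y, text)` chunk format, the fixed line height of 16 taken from the example above, and the fixed character width of 6 units. A real extraction would need the actual font metrics and would have to tolerate small differences in y.

```python
from collections import defaultdict


def rebuild_lines(chunks, line_height=16.0, char_width=6.0):
    """Rebuild text lines from (x, y, text) chunks using their positions only.

    line_height and char_width are assumed values for this sketch; a real PDF
    does not state them, they would have to be derived from the fonts used.
    """
    if not chunks:
        return []

    # Chunks with the same y are treated as one line (exact match, for simplicity).
    rows = defaultdict(list)
    for x, y, text in chunks:
        rows[y].append((x, text))

    left_margin = min(x for x, _, _ in chunks)
    lines = []
    prev_y = None
    # In PDF the y axis points upward, so the highest y is the top of the page.
    for y in sorted(rows, reverse=True):
        if prev_y is not None:
            # A vertical jump of more than one line height suggests skipped lines.
            skipped = round((prev_y - y) / line_height) - 1
            lines.extend([""] * max(skipped, 0))
        line = ""
        for x, text in sorted(rows[y]):
            # Pad with spaces up to the column that the x position implies.
            column = round((x - left_margin) / char_width)
            if line and column <= len(line):
                # Simplification: always separate chunks by at least one space.
                column = len(line) + 1
            line = line.ljust(column) + text
        lines.append(line)
        prev_y = y
    return lines


# The positions from the example above, with made-up words.
chunks = [
    (36, 806, "Hello"), (300, 806, "World"),
    (36, 790, "second"), (36, 774, "third"), (36, 742, "fifth"),
]
for line in rebuild_lines(chunks):
    print(repr(line))
```

The spaces and the empty line in the output are not read from the PDF; they are guesses computed from the coordinates, which is why different guesses (another character width, another line height) give different white space.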