We simply extract the text from a PDF and use it to populate a tag in an XML file.
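The XML side of this can be sketched roughly as follows. This is only a minimal illustration using the JDK's built-in DOM and transformer classes: the class name `TextToXmlTag`, the method `toXml`, and the element names `document` and `content` are placeholders invented for the example, and the extracted text is passed in as a plain string rather than coming from iText.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: wraps already-extracted text in a single XML tag.
class TextToXmlTag {

    // Element names "document" and "content" are assumptions for this sketch.
    public static String toXml(String extractedText) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();

        Element root = doc.createElement("document");
        doc.appendChild(root);

        // The text becomes a text node, so characters such as & and < are
        // escaped by the serializer instead of being treated as markup.
        Element content = doc.createElement("content");
        content.setTextContent(extractedText);
        root.appendChild(content);

        // Serialize the DOM tree to a string without the XML declaration.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toXml("of ferritic alloys, largely due to the complexities"));
    }
}
```

Whatever string the extraction returns ends up in the tag verbatim, so any spaces lost during extraction are also missing in the XML.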
Say a PDF contains the text "....of ferritic alloys, largely due to the complexities associated with the solid state phase transformations that occur in multipass welding". When I extract the text from the PDF using iText 5.0.5, the output I get is "in the case of ferritic alloys, largelyduetothecomplexitiesassociatedwiththesolidstatephasetransformationsthatoccurin multipass welding". The spaces between most of the words are missing.

I have attached the PDF I tried to extract the content from, along with the extracted content as a document. I am using the PdfTextExtractor.getTextFromPage method for the extraction.

samplePDF.pdf: http://itext-general.2136553.n4.nabble.com/file/n3381417/samplePDF.pdf
iText5.0.5_PDFExtracted_Content.rtf: http://itext-general.2136553.n4.nabble.com/file/n3381417/iText5.0.5_PDFExtracted_Content.rtf
