Re: [iText-questions] Text Extraction with TABs instead of spaces

2011-07-21 Thread Dániel Kékesi
Hi, Thanks for the quick response. > On 20/07/2011 20:58, Dániel Kékesi wrote: >> Dear All, >> >> I am using iTextSharp in my application and found its text extraction >> capabilities excellent. I am facing a problem though. I use the >> PdfTextExtractor.GetTextFromPage method but it returns text

Re: [iText-questions] Text Extraction with TABs instead of spaces

2011-07-21 Thread 1T3XT BVBA
On 20/07/2011 20:58, Dániel Kékesi wrote: Dear All, I am using iTextSharp in my application and found its text extraction capabilities excellent. I am facing a problem though. I use the PdfTextExtractor.GetTextFromPage method but it returns text pieces that are far apart separated by a single

[iText-questions] Text Extraction with TABs instead of spaces

2011-07-21 Thread Kékesi Dániel
Dear All, I am using iTextSharp in my application and found its text extraction capabilities excellent. I am facing a problem though. I use the PdfTextExtractor.GetTextFromPage method but it returns text pieces that are far apart separated by a single space. Take the following example (as disp

[iText-questions] Text Extraction with TABs instead of spaces

2011-07-20 Thread Dániel Kékesi
Dear All, I am using iTextSharp in my application and found its text extraction capabilities excellent. I am facing a problem though. I use the PdfTextExtractor.GetTextFromPage method but it returns text pieces that are far apart separated by a singl
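
Editor's sketch of the idea behind this question: if each extracted text piece carries its horizontal start and end position, a wide gap between neighbours can be written as a TAB rather than a single space. The `Chunk` class, the `joinLine` method, and the `spaceWidth` and `tabFactor` parameters below are hypothetical names invented for illustration; they are not part of iText or iTextSharp, and the thresholds are assumptions that would need tuning per document. The chunks are assumed to lie on one line, already sorted left to right.

```java
import java.util.List;

final class TabJoiner {

    /** Hypothetical positioned text piece; NOT an iText/iTextSharp type. */
    static final class Chunk {
        final String text;
        final float startX;
        final float endX;

        Chunk(String text, float startX, float endX) {
            this.text = text;
            this.startX = startX;
            this.endX = endX;
        }
    }

    /**
     * Joins chunks of a single line, sorted left to right.
     * A gap wider than spaceWidth * tabFactor becomes a TAB,
     * a gap wider than half a space becomes a single space,
     * and anything narrower is joined without a separator.
     */
    static String joinLine(List<Chunk> chunks, float spaceWidth, float tabFactor) {
        StringBuilder out = new StringBuilder();
        Chunk prev = null;
        for (Chunk c : chunks) {
            if (prev != null) {
                float gap = c.startX - prev.endX;
                if (gap > spaceWidth * tabFactor) {
                    out.append('\t');   // far apart: treat as a column break
                } else if (gap > spaceWidth * 0.5f) {
                    out.append(' ');    // ordinary word gap
                }
            }
            out.append(c.text);
            prev = c;
        }
        return out.toString();
    }
}
```

In a real extraction the positions would have to come from whatever render callback the library offers; this sketch covers only the gap-to-separator decision.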

Re: [iText-questions] Text extraction - font height/bounding box

2011-07-07 Thread Kevin Day
What about getAscentLine and getDescentLine isn't working for you? That is the mechanism that you should be using for what you are trying to do. -- View this message in context: http://itext-general.2136553.n4.nabble.com/Text-extraction-font-height-bounding-box-tp3327638p3651761.html Sent from t

Re: [iText-questions] Text extraction - font height/bounding box

2011-07-07 Thread LIHE
^^ isn't it possible ? -- View this message in context: http://itext-general.2136553.n4.nabble.com/Text-extraction-font-height-bounding-box-tp3327638p3651543.html Sent from the iText - General mailing list archive at Nabble.com. --

[iText-questions] Text extraction - font height/bounding box

2011-02-28 Thread LIHE
Does someone out there know how to get the height of the font/text when doing text extraction? Background: I've got a drawing with position numbers; my aim is to draw a box around each one with some action. I can get the text and the baseline, and that's working great! But as I don't know the height - I've go
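
Editor's sketch of how the ascent and descent lines suggested in the reply above could give the box asked about here: the two lines bound the text from above and below, so the smallest and largest coordinates of their four end points form an axis-aligned rectangle. The `TextBox` class and `boundingBox` method are hypothetical names for illustration only, and the points are passed as plain `{x, y}` arrays rather than library types; extracting those points from the render information is assumed to happen elsewhere.

```java
final class TextBox {

    /**
     * Computes an axis-aligned bounding box from the end points of an
     * ascent line and a descent line. Each point is a {x, y} pair.
     * Returns {lowerLeftX, lowerLeftY, upperRightX, upperRightY}.
     */
    static float[] boundingBox(float[] ascentStart, float[] ascentEnd,
                               float[] descentStart, float[] descentEnd) {
        float[][] points = {ascentStart, ascentEnd, descentStart, descentEnd};
        float minX = points[0][0];
        float minY = points[0][1];
        float maxX = points[0][0];
        float maxY = points[0][1];
        for (float[] p : points) {
            minX = Math.min(minX, p[0]);
            minY = Math.min(minY, p[1]);
            maxX = Math.max(maxX, p[0]);
            maxY = Math.max(maxY, p[1]);
        }
        return new float[] {minX, minY, maxX, maxY};
    }
}
```

For horizontal text the box height is simply the upper-right y minus the lower-left y; for rotated text this rectangle is only the enclosing box, not the tight outline of the text.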

Re: [iText-questions] Text Extraction Problem

2010-12-16 Thread Paulo Soares
There's no ToUnicode table, the text is impossible to decode, not even Acrobat can do it. Paulo - Original Message - From: yun wang To: itext-questions@lists.sourceforge.net Sent: Friday, December 17, 2010 3:57 AM Subject: [iText-questions] Text Extraction Problem

[iText-questions] Text Extraction Problem

2010-12-16 Thread yun wang
Dear Support, I am using iText 5.06 to create a plain text file. I have a problem with PDF files generated from CAD: I cannot get the text output. Here is some debug information: inside PdfContentStreamProcessor::displayPdfString::unicode, the Unicode string outputs some squares. Inside T

Re: [iText-questions] Text Extraction filtering out artifacts

2010-06-14 Thread Kevin Day
How is the text 'tagged' ? If it's done using marked content, the MarkedContentRenderFilter may be of use. - K -- View this message in context: http://itext-general.2136553.n4.nabble.com/Text-Extraction-filtering-out-artifacts-tp2254895p2254975.html Sent from the iText - General mailing list a

[iText-questions] Text Extraction filtering out artifacts

2010-06-14 Thread Art Kirshner
On page 505 of the 2nd edition of the book, a reference is made to excluding items from a text extraction. I would like to filter items tagged as Artifacts from the text extraction. Is this possible? The example seems to show filtering by region. Thanks, Art --

Re: [iText-questions] Text extraction

2010-05-11 Thread 1T3XT info
Rui Ribeiro wrote: > Hi, > > I need to extract all text from a pdf file. Is there any code to achieve > that? I have bought the 2nd edition of itext in action, but seems some > of the relevant chapters for this are not yet available, but I do need > to do this as soon as possible. Any hints or

[iText-questions] Text extraction

2010-05-11 Thread Rui Ribeiro
Hi, I need to extract all text from a pdf file. Is there any code to achieve that? I have bought the 2nd edition of itext in action, but seems some of the relevant chapters for this are not yet available, but I do need to do this as soon as possible. Any hints or suggestions (I will be using iT