Hello,

I am trying to extract text from PDFs that use a subset font,
which means each PDF carries its own encoding table that defines the encoding.
When I extract the text I only get strange strings back.
After some digging I found out that the PDF contains a subset font,
but how can I decode these strings back into readable text?
I am basically using the code below, with iTextSharp.
I'm sure you can read it, too!
' TextString_Listener is my own render listener class
Dim listener As New TextString_Listener
Dim reader As New iTextSharp.text.pdf.PdfReader(Application.StartupPath & "\Sepp.pdf")
Dim parser As New iTextSharp.text.pdf.parser.PdfReaderContentParser(reader)
Dim pageNumber As Integer = 1
' TODO: multiple pages may not be working yet
While pageNumber <= reader.NumberOfPages
    ' ProcessContent feeds the page content to the listener and returns it
    listener = parser.ProcessContent(pageNumber, listener)
    pageNumber += 1
End While
reader.Close()
In my listener I just build one string for each text block, nothing special.
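
For reference, a listener of this kind could look roughly like the sketch below. Only the class name TextString_Listener comes from the code above. The class body, the TextBlocks property and the choice to start a new string at every text block are assumptions made for illustration, based on the IRenderListener interface in iTextSharp 5.x.

```vbnet
Imports System.Collections.Generic
Imports System.Text
Imports iTextSharp.text.pdf.parser

' Hypothetical sketch of a listener that collects one string per text block.
Public Class TextString_Listener
    Implements IRenderListener

    ' Finished text blocks, in the order they appear in the content stream
    Private ReadOnly blocks As New List(Of String)
    ' Text collected for the block that is currently being processed
    Private current As New StringBuilder()

    Public Sub BeginTextBlock() Implements IRenderListener.BeginTextBlock
        ' A new text block starts, so begin with an empty buffer
        current = New StringBuilder()
    End Sub

    Public Sub RenderText(renderInfo As TextRenderInfo) Implements IRenderListener.RenderText
        ' GetText() returns the string as decoded through the font's encoding
        current.Append(renderInfo.GetText())
    End Sub

    Public Sub EndTextBlock() Implements IRenderListener.EndTextBlock
        ' The block is complete, so store its text
        blocks.Add(current.ToString())
    End Sub

    Public Sub RenderImage(renderInfo As ImageRenderInfo) Implements IRenderListener.RenderImage
        ' Images are not relevant for text extraction and are ignored
    End Sub

    ' Hypothetical accessor for the collected strings
    Public ReadOnly Property TextBlocks As IList(Of String)
        Get
            Return blocks
        End Get
    End Property
End Class
```

A listener like this only passes on what GetText() returns. If the subset font has no usable mapping back to Unicode, the collected strings will presumably still be the strange ones described above.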
--
View this message in context:
http://itext-general.2136553.n4.nabble.com/Ectract-text-encode-with-subset-Font-GlyphID-tp3067012p3067012.html
Sent from the iText - General mailing list archive at Nabble.com.