Hi,

On 05.11.2011 02:05, Enio Lopes wrote:
Hello,

I'm using PDFBox in a .NET project, and so far it has worked really well for
reading the PDFs I've been using.
Now I need to identify the different fonts that are present in the PDF. I saw
that there is a function called getFonts on the stripper instance, but when I
use it I get the following error:

EmptyStackException was unhandled

Here is the code: https://gist.github.com/1340915

Your code can't work, as the doc and the stripper instance aren't connected in any way. However, you won't get the fonts you're looking for even if you initialize the stripper using the doc. As every page most likely has its own resources, you should implement something like the following:

- load the document
- get all pages by calling doc.getDocumentCatalog().getAllPages()
- iterate over the list containing the pages
- retrieve the resources from every page using page.getResources()
- get all fonts by calling resources.getFonts()
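
The steps above could be sketched roughly as follows. This is only an illustration of the traversal, not PDFBox code: PageFonts and FontCollector are hypothetical stand-ins so that the sketch compiles without the PDFBox jar. In a real program, fonts() would be replaced by page.getResources() followed by resources.getFonts(), and the list of pages would come from doc.getDocumentCatalog().getAllPages().

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a single page. It returns a map from the
// page-local resource name (e.g. "F1") to the font's name, or null when
// the page has no resources of its own. With PDFBox this would correspond
// to page.getResources() followed by resources.getFonts().
interface PageFonts {
    Map<String, String> fonts();
}

// Hypothetical helper, not part of PDFBox.
final class FontCollector {
    private FontCollector() {}

    // Walks every page and collects the distinct font names in first-seen order.
    static Set<String> collectFontNames(List<? extends PageFonts> pages) {
        Set<String> names = new LinkedHashSet<>();
        for (PageFonts page : pages) {
            Map<String, String> fonts = page.fonts();
            if (fonts == null) {
                continue; // this page carries no font resources
            }
            // Collect the values, not the keys: resource names such as "F1"
            // are only meaningful within one page's resource dictionary.
            names.addAll(fonts.values());
        }
        return names;
    }
}
```

The sketch collects font names rather than resource names because the same resource name may refer to different fonts on different pages.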

You may have a look at the command line tool ExtractImages [1], which works similarly, except that it extracts all images instead of all fonts.

Thank you.

BR
Andreas Lehmkühler

[1]
http://svn.apache.org/repos/asf/pdfbox/trunk/pdfbox/src/main/java/org/apache/pdfbox/ExtractImages.java
