I think you need to look further up the inheritance chain of PDFText2HTML, at one of its parent classes.
*org.apache.pdfbox.util.PDFStreamEngine* has a method with the signature

  processStream(PDPage aPage, PDResources resources, COSStream cosStream)
  <http://incubator.apache.org/pdfbox/javadoc/org/apache/pdfbox/util/PDFStreamEngine.html#processStream%28org.apache.pdfbox.pdmodel.PDPage,%20org.apache.pdfbox.pdmodel.PDResources,%20org.apache.pdfbox.cos.COSStream%29>

which may be what you want. Javadoc for the parameter types:

  PDPage <http://incubator.apache.org/pdfbox/javadoc/org/apache/pdfbox/pdmodel/PDPage.html>
  PDResources <http://incubator.apache.org/pdfbox/javadoc/org/apache/pdfbox/pdmodel/PDResources.html>
  COSStream <http://incubator.apache.org/pdfbox/javadoc/org/apache/pdfbox/cos/COSStream.html>

On Fri, Oct 23, 2009 at 1:14 PM, Shen Wang <felix.s.w...@gmail.com> wrote:

> Hey guys,
>
> I know this question may be silly, but I have worked on this for two days
> and got really frustrated (just jump in the javadoc back and forth and then
> get lost again and again). Does anybody know how to use the Class
> PDFText2HTML? The javadoc to this class is
> http://incubator.apache.org/pdfbox/javadoc/org/apache/pdfbox/util/PDFText2HTML.html.
> How can I let the class know which pdf I am looking at? Except the method
> endDocument(), I don't see any other way that I can pass the information of
> the file to this class. I know I must miss something, could you please help
> me out? Thanks.
>
> Best,
>
> Felix
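
If paging through the javadoc gets confusing, one way to see what a class such as PDFText2HTML inherits is to walk its superclass chain with reflection and list the public methods declared at each level. The following is only a sketch under a few assumptions: the class name InheritanceChain and its two helper methods are made up for illustration, and java.util.LinkedHashMap is used as a stand-in default so the example runs without PDFBox on the classpath.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFBox.
final class InheritanceChain {

    /** Returns the given class followed by each of its superclasses, most specific first. */
    static List<Class<?>> chainOf(Class<?> type) {
        List<Class<?>> chain = new ArrayList<>();
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            chain.add(c);
        }
        return chain;
    }

    /** Returns the names of the public methods declared directly on the given class. */
    static List<String> publicMethodNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method m : type.getDeclaredMethods()) {
            // Skip compiler-generated bridge/synthetic methods to keep the listing readable.
            if (Modifier.isPublic(m.getModifiers()) && !m.isSynthetic()) {
                names.add(m.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Stand-in default; with PDFBox on the classpath, pass
        // org.apache.pdfbox.util.PDFText2HTML as the first argument instead.
        String className = args.length > 0 ? args[0] : "java.util.LinkedHashMap";
        for (Class<?> c : chainOf(Class.forName(className))) {
            System.out.println(c.getName() + " -> " + publicMethodNames(c));
        }
    }
}
```

Run against PDFText2HTML, each output line corresponds to one level of the chain, so any method that accepts a document or a page shows up next to the parent class that declares it.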