Re: Help with PDFBox

Omar Chiyean Mon, 22 Feb 2010 07:24:22 -0800

Hi,
I used your code, but I get output like this:

org.apache.encoding.macromanencod...@22d3


This output doesn't tell me anything about the encoding of the PDF,
because MacRoman is the default charset in Java on the Mac.
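
For comparison, the JVM's own default charset can be printed separately from
anything PDFBox reports. This is a minimal sketch using only the standard
library; the class name CharsetCheck and the helper defaultToString are made up
for illustration. It also shows that the "name@hex" form of the output above is
what Object.toString() produces when a class does not override it:

```java
import java.nio.charset.Charset;

final class CharsetCheck {

    // Rebuilds the default Object.toString() form by hand:
    // class name, '@', then the identity hash code in hexadecimal.
    static String defaultToString(Object o) {
        return o.getClass().getName() + "@"
                + Integer.toHexString(System.identityHashCode(o));
    }

    public static void main(String[] args) {
        // The charset the JVM falls back to, independent of any PDF.
        System.out.println("JVM default charset: "
                + Charset.defaultCharset().name());

        // Both lines print the same "ClassName@hash" text.
        Object plain = new Object();
        System.out.println(plain);
        System.out.println(defaultToString(plain));
    }
}
```

The hex suffix is only an object hash, so it does not describe the encoding
itself.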

I want to know the encoding because some symbols and accented characters
are not processed correctly by PDFBox 1.0.0.

Thanks in advance...

2010/2/20 Villu Ruusmann <[email protected]>

> Hello there,
>
> On Fri, Feb 19, 2010 at 10:10 PM, Omar Chiyean <[email protected]>
> wrote:
> > Is there a way to see how the PDF that is being
> > stripped is character-encoded?
> >
>
> PDFTextStripper textStripper = new PDFTextStripper(){
>
>    @Override
>    public void processTextPosition(TextPosition text){
>        super.processTextPosition(text);
>
>        PDFont font = text.getFont();
>
>        Encoding fontEncoding = null;
>        try {
>            fontEncoding = font.getEncoding();
>        } catch(IOException ioe){
>            // Ignored
>        }
>
>        System.out.println(text.getCharacter() + " (font= " + font
>                + ", fontEncoding=" + fontEncoding + ")");
>    }
> };
>
