[ 
https://issues.apache.org/jira/browse/PDFBOX-932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maruan Sahyoun updated PDFBOX-932:
----------------------------------

    Labels: Appearance Encoding  (was: )

> Swedish characters are garbled in form
> --------------------------------------
>
>                 Key: PDFBOX-932
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-932
>             Project: PDFBox
>          Issue Type: Bug
>          Components: AcroForm
>    Affects Versions: 1.4.0
>         Environment: Mac OSX, Java6
>            Reporter: Pär Wenåker
>              Labels: Appearance, Encoding
>
> When using Swedish characters to fill in a form, they show up garbled in the 
> PDF. This seems to be related to the PDAppearance class. When calling 
> setValue on the field, the value seems to be set correctly, since COSString 
> handles characters outside ASCII in its writePDF method. When PDAppearance 
> writes the value in insertGeneratedAppearance, it does not do the same check. 
> If the same check is done, it seems to work for PDAppearance too (see patch 
> below). Since I do not know very much about the PDF format, I don't know if 
> this is the right way to do it...
>         PDDocument document = PDDocument.load(<pdf-file>);
>         PDDocumentCatalog docCatalog = document.getDocumentCatalog();
>         PDAcroForm form = docCatalog.getAcroForm();
>         PDField field = form.getField(<field name>);
>         field.setValue("åäö");
> @@ -400,9 +401,32 @@
>          {
>              throw new IOException( "Error: Unknown justification value:" + q );
>          }
> -        printWriter.println("(" + value + ") Tj");
> -        printWriter.println("ET" );
> -        printWriter.flush();
> +        boolean outsideASCII = false;
> +        byte[] bytes = value.getBytes("ISO-8859-1");
> +        int length = bytes.length; 
> +        
> +        for( int i=0; i<length && !outsideASCII; i++ )
> +        {
> +            //if the byte is negative then it is an eight bit byte and is
> +            //outside the ASCII range.
> +            outsideASCII = bytes[i] <0;
> +        }
> +        if(!outsideASCII) {
> +            printWriter.println("(" + value + ") Tj");
> +            printWriter.println("ET" );
> +            printWriter.flush();            
> +        } else {
> +            printWriter.print("<");
> +            for(int i=0; i<length; i++ )
> +            {
> +                String val = COSHEXTable.HEX_TABLE[ (bytes[i]+256)%256 ];
> +                printWriter.write(val);
> +            }
> +            printWriter.println("> Tj");
> +            printWriter.println("ET" );
> +            printWriter.flush();            
> +        }
>      }
>  
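
For anyone who wants to try the idea from the patch without a PDFBox checkout, here is a minimal stand-alone sketch of the same check. It is not the PDFBox implementation: the class and method names (ShowTextSketch, showText) are made up for illustration, and it formats the hex digits with String.format instead of COSHEXTable. It assumes, as the patch does, that the field's font can be addressed with ISO-8859-1 byte values.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-alone helper for illustration; not part of PDFBox. */
final class ShowTextSketch
{
    private ShowTextSketch() {}

    /**
     * Builds the operand and operator for a show-text instruction.
     * Pure ASCII values are written as a literal string "(...)",
     * anything with an eight-bit byte is written as a hex string "<...>".
     * Like the patch, this does not escape parentheses or backslashes
     * in the literal form.
     */
    static String showText(String value)
    {
        byte[] bytes = value.getBytes(StandardCharsets.ISO_8859_1);

        // A negative byte has its high bit set, so it is outside ASCII.
        boolean outsideAscii = false;
        for (byte b : bytes)
        {
            if (b < 0)
            {
                outsideAscii = true;
                break;
            }
        }

        if (!outsideAscii)
        {
            return "(" + value + ") Tj";
        }

        // Two hex digits per byte; b & 0xFF maps the signed byte to 0..255.
        StringBuilder hex = new StringBuilder("<");
        for (byte b : bytes)
        {
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.append("> Tj").toString();
    }

    public static void main(String[] args)
    {
        System.out.println(showText("abc"));
        // The three Swedish letters from the report, written as escapes
        // so the result does not depend on the source file encoding.
        System.out.println(showText("\u00e5\u00e4\u00f6"));
    }
}
```

With the value from the report this yields <E5E4F6> Tj, since the three letters map to the bytes E5, E4 and F6 in ISO-8859-1. Characters that have no ISO-8859-1 mapping are not handled by this sketch (or by the patch) and would need a different approach.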



--
This message was sent by Atlassian JIRA
(v6.2#6252)
