Re: Font names using multi-byte strings

David Law via fop-users Fri, 30 Jan 2026 01:15:04 -0800

Hi,

Would it make sense to use:

characters = name.substring((skipFirst ? 1 : 0)).getBytes(StandardCharsets.UTF_8);


...which does not throw an UnsupportedEncodingException?

All the best,
Dave


On 29/01/2026 20:53, Luca Bellonda wrote:

I was able to get the same name, and I suggest this method:
1- Convert the string to its UTF-8 byte representation.
2- Escape those bytes instead of the characters.

This way the UTF-8 statement should be respected.
In code (beware: not efficient and without error handling):

in escapeName of PDFName:

    byte[] characters;
    // get UTF-8 bytes
    try {
        characters = name.substring((skipFirst ? 1 : 0)).getBytes("UTF-8");
    } catch (UnsupportedEncodingException e) {
        throw new RuntimeException("Invalid Name:" + name);
    }

    // encode bytes
    for (int i = 0, c = characters.length; i < c; i++) {
        int ch = (0x00FF & characters[i]);

        if (ch < 33 || ch > 126 || ESCAPED_NAME_CHARS.indexOf(ch) >= 0) {
            sb.append('#');
            toHex(ch, sb);
        } else {
            char cxc = (char) ch;
            sb.append(cxc);
        }
    }
    return sb.toString();

where toHex() takes an int.
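
For anyone who wants to try the idea outside of FOP, here is a minimal standalone sketch of the byte-wise escaping. The class name PdfNameEscaper, the method escape() and the contents of ESCAPED_NAME_CHARS are assumptions made for this illustration and are not taken from the FOP sources; the hex conversion is inlined instead of calling toHex().

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of FOP: escapes the body of a PDF name
// byte by byte after converting it to UTF-8.
final class PdfNameEscaper {
    // Assumed set of characters that must be escaped (PDF delimiters plus '#');
    // the constant used in FOP's PDFName may differ.
    private static final String ESCAPED_NAME_CHARS = "/()<>[]{}%#";
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    static String escape(String name) {
        // The Charset overload of getBytes declares no checked exception.
        byte[] bytes = name.getBytes(StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder(bytes.length);
        for (byte b : bytes) {
            int ch = b & 0xFF; // treat the byte as unsigned
            if (ch < 33 || ch > 126 || ESCAPED_NAME_CHARS.indexOf(ch) >= 0) {
                // write '#' followed by the two-digit hex code of the byte
                sb.append('#').append(HEX[ch >>> 4]).append(HEX[ch & 0x0F]);
            } else {
                sb.append((char) ch);
            }
        }
        return sb.toString();
    }
}
```

With this sketch, escape("A B") gives "A#20B", and escape("\u00e9") gives "#C3#A9", i.e. one #xx group per UTF-8 byte rather than per character.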

Best regards.

On Thu, 29 Jan 2026 at 01:28, Joao Andre Goncalves <[email protected]> wrote:

I tried that earlier, but while the PDF is valid, the font name would not
match the one in the original PDF. I did find the following in the ISO document:

c) Any character that is not a regular character shall be written using
its 2-digit hexadecimal code, preceded by the NUMBER SIGN only.



