The problem is that some files do this as an obfuscation technique.

What can be detected is fonts that have no Unicode mapping, so no text can be extracted from them. See "if (unicode == null)" in LegacyPDFStreamEngine. Write your own engine or extend that one, and check for TextPosition objects whose unicode is null. (See the PrintTextLocations example in the source code download for how to get TextPosition objects.)
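
As a rough illustration of the check described above, here is a library-independent sketch of the tallying step. It assumes you have already collected, for each glyph, the font name and the extracted Unicode string (for example from TextPosition.getFont().getName() and TextPosition.getUnicode() in your own engine). The Glyph class and the countUnmapped method are hypothetical names made up for this example; they are not part of PDFBox.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class UnmappedGlyphTally {

    // Hypothetical stand-in for the two values read from a TextPosition:
    // the font name and the extracted Unicode string (null if there is no mapping).
    public static final class Glyph {
        final String fontName;
        final String unicode;

        public Glyph(String fontName, String unicode) {
            this.fontName = fontName;
            this.unicode = unicode;
        }
    }

    // Counts, per font name, how many glyphs have a null Unicode value.
    // Fonts with no such glyphs do not appear in the result.
    public static Map<String, Integer> countUnmapped(List<Glyph> glyphs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Glyph g : glyphs) {
            if (g.unicode == null) {
                counts.merge(g.fontName, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample data, only to show the shape of the result.
        List<Glyph> glyphs = Arrays.asList(
                new Glyph("ABCDEF+Obfuscated", null),
                new Glyph("Helvetica", "H"),
                new Glyph("ABCDEF+Obfuscated", null),
                new Glyph("Helvetica", "i"));
        System.out.println(countUnmapped(glyphs));
    }
}
```

A font that shows up in this tally with a large share of its glyphs is a candidate for the obfuscation case. How high that share has to be before you flag a file is a judgment call that depends on your documents.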

Tilman
