The problem is that some files do this as an obfuscation technique.
What can be detected is fonts that have no Unicode mapping for text extraction. See "if (unicode == null)" in LegacyPDFStreamEngine. Write your own engine or extend that one, and check for TextPosition objects whose unicode is null. (See the PrintTextLocations example in the source code download for how to get at the TextPosition objects.)
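A rough sketch of the "extend it" route, in case it helps. It assumes PDFBox 2.0.x, where showGlyph still takes a unicode argument and PDDocument.load(File) exists; the signatures differ in 3.x. The class name UnmappedGlyphFinder and the per-font tally are made up for illustration. Rather than waiting for a TextPosition, it asks the font directly via PDFont.toUnicode(int) before handing the glyph on to the stock engine, so it does not depend on what the stock class does after its null check.

```java
import java.io.File;
import java.io.IOException;
import java.util.Map;
import java.util.TreeMap;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.font.PDFont;
import org.apache.pdfbox.text.PDFTextStripper;
import org.apache.pdfbox.util.Matrix;
import org.apache.pdfbox.util.Vector;

// Hypothetical helper (not part of PDFBox): counts glyphs whose font
// has no Unicode mapping for the character code being shown.
public class UnmappedGlyphFinder extends PDFTextStripper {

    // font name -> number of glyphs without a Unicode mapping
    private final Map<String, Integer> unmappedByFont = new TreeMap<>();

    public UnmappedGlyphFinder() throws IOException {
        super();
    }

    // Assumption: PDFBox 2.0.x signature (with the "unicode" parameter).
    @Override
    protected void showGlyph(Matrix textRenderingMatrix, PDFont font, int code,
                             String unicode, Vector displacement) throws IOException {
        // Ask the font itself; null means no Unicode could be extracted.
        if (font.toUnicode(code) == null) {
            unmappedByFont.merge(String.valueOf(font.getName()), 1, Integer::sum);
        }
        // Continue with the normal text extraction.
        super.showGlyph(textRenderingMatrix, font, code, unicode, displacement);
    }

    public Map<String, Integer> getUnmappedByFont() {
        return unmappedByFont;
    }

    public static void main(String[] args) throws IOException {
        try (PDDocument doc = PDDocument.load(new File(args[0]))) {
            UnmappedGlyphFinder finder = new UnmappedGlyphFinder();
            finder.getText(doc); // runs the engine over all pages
            for (Map.Entry<String, Integer> e : finder.getUnmappedByFont().entrySet()) {
                System.out.println(e.getKey() + ": " + e.getValue()
                        + " glyph(s) without Unicode");
            }
        }
    }
}
```

Run it with the PDF path as the only argument. A font that shows up here with many glyphs is a candidate for the obfuscation case, but a non-empty result by itself proves nothing: badly subsetted fonts can look the same.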
Tilman