On 03.11.23 at 17:24, Tilman Hausherr wrote:
https://www.danisch.de/blog/2023/10/31/aktennotiz-zu-pdftotext-bei-vermurksten-zeichensaetzen/

The text is in German, but he says that he was able to extract text from obfuscated PDFs by converting them to PostScript and then back to PDF. I didn't test this myself, but I suspect that the conversion to PostScript discards the /ToUnicode stream, and that it is rebuilt from the font itself when the file is converted back to PDF.
The information has to be somewhere, otherwise such a "conversion" wouldn't work.
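
For anyone who wants to try the round trip on their own file, here is a minimal sketch in Python. It assumes Poppler's pdftops and pdftotext and Ghostscript's ps2pdf are installed and on the PATH; the blog post may have used different tools, and the function and file names here are only illustrative. Whether the extracted text comes out readable depends on the file, so this is something to experiment with, not a guaranteed fix.

```python
import shutil
import subprocess
from pathlib import Path


def roundtrip_commands(src: Path, workdir: Path) -> list[list[str]]:
    """Build the command lines for a PDF -> PostScript -> PDF -> text round trip.

    The tool choice (pdftops, ps2pdf, pdftotext) is an assumption for this
    sketch, not necessarily what the blog author used.
    """
    ps = workdir / (src.stem + ".ps")
    rebuilt = workdir / (src.stem + "-rebuilt.pdf")
    txt = workdir / (src.stem + ".txt")
    return [
        ["pdftops", str(src), str(ps)],         # Poppler: PDF -> PostScript
        ["ps2pdf", str(ps), str(rebuilt)],      # Ghostscript: PostScript -> PDF
        ["pdftotext", str(rebuilt), str(txt)],  # Poppler: extract text from the rebuilt PDF
    ]


def run_roundtrip(src: Path, workdir: Path) -> Path:
    """Run the three steps in order and return the path of the text file."""
    commands = roundtrip_commands(src, workdir)
    for cmd in commands:
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"{cmd[0]} not found on PATH")
        # check=True stops the chain as soon as one tool reports an error
        subprocess.run(cmd, check=True)
    return Path(commands[-1][-1])
```

Calling run_roundtrip(Path("obfuscated.pdf"), Path(".")) would leave obfuscated.txt next to the input; comparing it with the output of pdftotext on the original file shows whether the round trip changed anything.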

@Tilman did you try to contact the author to ask for an example?

Andreas

Tilman


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]


