On 03.11.23 at 17:24, Tilman Hausherr wrote:
https://www.danisch.de/blog/2023/10/31/aktennotiz-zu-pdftotext-bei-vermurksten-zeichensaetzen/
The text is in German, but he says that he was able to extract text
from obfuscated PDFs by converting them to PostScript and then back to
PDF. I didn't test this myself, but I suspect that the conversion to
PostScript discards the /ToUnicode stream, and that it is rebuilt from
the font itself when the file is converted back to PDF.
The information has to be somewhere, otherwise such a "conversion" wouldn't work.
@Tilman did you try to contact the author to ask for an example?
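
For anyone who wants to try the round trip on a sample file, here is a
minimal Python sketch. It assumes the Poppler tools pdftops and pdftotext
and Ghostscript's ps2pdf are on the PATH; the blog post may well have used
different tools, so treat the tool choice, the function names and the file
naming as my assumptions, not as the author's method.

```python
import shutil
import subprocess
from pathlib import Path


def build_roundtrip_commands(src_pdf, workdir):
    """Return the command lines for PDF -> PostScript -> PDF -> text.

    Tool choice is an assumption: pdftops/pdftotext (Poppler) and
    ps2pdf (Ghostscript). File names are derived from the input name.
    """
    src = Path(src_pdf)
    work = Path(workdir)
    ps_file = work / (src.stem + ".ps")
    rebuilt_pdf = work / (src.stem + "-rebuilt.pdf")
    txt_file = work / (src.stem + ".txt")
    return [
        ["pdftops", str(src), str(ps_file)],             # PDF -> PostScript
        ["ps2pdf", str(ps_file), str(rebuilt_pdf)],      # PostScript -> PDF
        ["pdftotext", str(rebuilt_pdf), str(txt_file)],  # rebuilt PDF -> text
    ]


def run_roundtrip(src_pdf, workdir):
    """Run the three steps and return the path of the extracted text file."""
    commands = build_roundtrip_commands(src_pdf, workdir)
    # Fail early with a clear message if one of the tools is missing.
    for cmd in commands:
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(cmd[0] + " not found on PATH")
    Path(workdir).mkdir(parents=True, exist_ok=True)
    for cmd in commands:
        subprocess.run(cmd, check=True)
    return Path(commands[-1][-1])
```

Comparing the pdftotext output of the original file with the output for
the rebuilt file would show whether the round trip actually changes what
can be extracted.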
Andreas
Tilman
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]