On 02.04.2019 at 03:59, Tim Allison wrote:
> Again, short of AI, your best bet is to run OCR (tesseract) on these files.
Another possible idea: create a huge database of font names, glyph paths (or a hash of each path) and the corresponding Unicode code points.
One could build such a database by using "good" PDFs as the source, or (more simply) by getting the original fonts and going through them.
The main problem is that such a database might be huge or too slow. But it would bring better results than OCR.
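To make the idea a bit more concrete, here is a rough sketch in Python of what the lookup could look like. Everything in it is an assumption for illustration only: the names glyph_hash and GlyphDatabase, the representation of a glyph path as (operator, coordinates...) tuples, the rounding precision, and the font name "ExampleSans" are all made up, and a real implementation would have to extract the outlines from the font programs and probably normalize them for scale.

```python
import hashlib


def glyph_hash(path, precision=2):
    """Hash a glyph outline given as (operator, *coords) tuples.

    Coordinates are rounded so that tiny numeric noise does not
    change the hash. This path format is a hypothetical one.
    """
    parts = []
    for op, *coords in path:
        # "+ 0.0" turns a rounded -0.0 into 0.0 so both format the same way
        rounded = ",".join(
            f"{round(c, precision) + 0.0:.{precision}f}" for c in coords
        )
        parts.append(f"{op}{rounded}")
    canonical = ";".join(parts)
    return hashlib.sha256(canonical.encode("ascii")).hexdigest()


class GlyphDatabase:
    """Maps a glyph path hash to {font name: Unicode text}."""

    def __init__(self):
        self._by_hash = {}

    def add(self, font_name, path, text):
        self._by_hash.setdefault(glyph_hash(path), {})[font_name] = text

    def lookup(self, path, font_name=None):
        candidates = self._by_hash.get(glyph_hash(path))
        if not candidates:
            return None
        if font_name in candidates:
            return candidates[font_name]
        # Unknown font name (e.g. a subset font): accept the result
        # only if every known font agrees on the text for this shape.
        values = set(candidates.values())
        return values.pop() if len(values) == 1 else None


# A crude, made-up outline of a capital "L"
LETTER_L = [
    ("M", 0, 0), ("L", 0, 700), ("L", 100, 700), ("L", 100, 100),
    ("L", 400, 100), ("L", 400, 0), ("Z",),
]

db = GlyphDatabase()
db.add("ExampleSans", LETTER_L, "L")
```

With this, db.lookup(LETTER_L, "ExampleSans") returns "L", and a lookup with an unknown font name still succeeds as long as the shape is unambiguous. Storing only a hash per glyph keeps each entry small, which might help with the size concern, but an exact hash only matches identical outlines (after rounding), so fonts that were rescaled or re-hinted would need extra normalization.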
Tilman

