On Sat, Nov 8, 2025 at 4:16 AM Steve Litt via PLUG-discuss <[email protected]> wrote:

> Cool as hell. One thing though: pdf-OCR.sh doesn't completely do OCR,
> because it doesn't emit ASCII or UTF-8, but instead emits a searchable,
> and I would guess text-select/copyable PDF. What would need to be
> changed to emit ASCII or UTF-8?

Perhaps this could help extract text from PDFs:
https://www.pythontutorials.net/blog/how-to-extract-text-from-a-pdf-file/

Search string <python script to extract text from pdf> showed several
other tools if the above does not cut it for you.

HTH
-- 
Arun Khan
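
P.S. A minimal sketch of the Python route, in case it saves a search. It assumes the third-party pypdf package is installed (pip install pypdf) and that the PDF already carries a text layer, as the searchable output of pdf-OCR.sh apparently does. The names join_pages, pdf_to_text and pdf2txt.py are made up for illustration and are not part of pdf-OCR.sh.

```python
import sys


def join_pages(page_texts):
    """Join per-page strings into one document, one form feed between pages."""
    # Treat a missing page text (None) as an empty page.
    return "\f".join((text or "").strip() for text in page_texts)


def pdf_to_text(pdf_path):
    """Return the text layer of pdf_path as a single str."""
    # Assumption: pypdf is installed; imported here so the helper above
    # stays usable without it.
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    return join_pages(page.extract_text() for page in reader.pages)


# Hypothetical usage: python pdf2txt.py searchable.pdf out.txt
# With any other argument count the script does nothing.
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[2], "w", encoding="utf-8") as out:
        out.write(pdf_to_text(sys.argv[1]))
```

The output file is written as UTF-8. This only reads text that is already embedded in the PDF; it does no OCR of its own, so a scan without a text layer would come out empty.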
---------------------------------------------------
PLUG-discuss mailing list: [email protected]
To subscribe, unsubscribe, or to change your mail settings:
https://lists.phxlinux.org/mailman/listinfo/plug-discuss
