On Sat, Nov 8, 2025 at 4:16 AM Steve Litt via PLUG-discuss <
[email protected]> wrote:

> Cool as hell. One thing though: pdf-OCR.sh doesn't completely do OCR,
> because it doesn't emit ASCII or UTF-8, but instead emits a searchable,
> and I would guess text-select/copyable PDF. What would need to be
> changed to emit ASCII or UTF-8 ?
>
>
Perhaps this could help with extracting the text from the PDFs:
https://www.pythontutorials.net/blog/how-to-extract-text-from-a-pdf-file/

Searching for <python script to extract text from pdf> turned up several
other tools, if the above does not cut it for you.
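
For what it's worth, here is a minimal sketch of the idea, not a tested
drop-in for pdf-OCR.sh. It assumes the pypdf package is installed
(pip install pypdf) and that the PDF already carries a text layer, such as
the searchable PDF that pdf-OCR.sh emits. The function names (extract_text,
to_ascii, save_text) are my own, made up for illustration.

```python
import unicodedata


def extract_text(pdf_path):
    """Return the text layer of every page in pdf_path as one str."""
    # Imported here so the helpers below work without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    # extract_text() can come back empty for pages with no text layer.
    return "\n".join((page.extract_text() or "") for page in reader.pages)


def to_ascii(text):
    """Fold text to plain ASCII, dropping anything with no ASCII equivalent."""
    # NFKD splits accented letters and ligatures into base characters
    # plus combining marks; the encode step then discards the marks.
    folded = unicodedata.normalize("NFKD", text)
    return folded.encode("ascii", errors="ignore").decode("ascii")


def save_text(pdf_path, txt_path, ascii_only=False):
    """Write the PDF's text to txt_path as UTF-8 (or ASCII-folded) text."""
    text = extract_text(pdf_path)
    if ascii_only:
        text = to_ascii(text)
    with open(txt_path, "w", encoding="utf-8") as out:
        out.write(text)
```

Something like save_text("scan-ocr.pdf", "scan.txt") would then give UTF-8,
and passing ascii_only=True would give plain ASCII (both file names are
placeholders). Note that the ASCII folding is lossy: characters with no
ASCII equivalent are silently dropped.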

HTH
--
Arun Khan
---------------------------------------------------
PLUG-discuss mailing list: [email protected]
To subscribe, unsubscribe, or to change your mail settings:
https://lists.phxlinux.org/mailman/listinfo/plug-discuss
