On Wed, Aug 16, 2023 at 12:44:38PM -0700, DSpace Community wrote:
> DSpace does not have an OCR engine. It is only able to index PDFs (or
> other electronic files) if they have been previously OCR'ed by a different
> system.

Or if they contained machine-readable text to begin with. So: a PDF that
was rendered from a word-processing document (for example) probably
contains text that can be flattened and indexed. A PDF which contains
images of paper documents will not, unless the imaging software or some
other tool has OCRed the images and added a text layer to the PDF.

--
Mark H. Wood
Lead Technology Analyst

University Library
Indiana University - Purdue University Indianapolis
755 W. Michigan Street
Indianapolis, IN 46202
317-274-0749
www.ulib.iupui.edu
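
As a rough illustration of the distinction above (text-bearing PDF versus image-only PDF), here is a minimal sketch in Python. It is not how DSpace decides anything; it assumes you have already pulled per-page text out of the PDF with whatever extraction tool you prefer and passed it in as a list of strings. The names `pages_needing_ocr` and `is_indexable` and the `min_chars` threshold are hypothetical, chosen only for this example.

```python
def pages_needing_ocr(page_texts, min_chars=20):
    """Return 0-based indexes of pages with too little extractable text.

    page_texts: one string (or None) per page, as produced by some
    external text-extraction step (an assumption of this sketch).
    min_chars: arbitrary cutoff; tune it for your own collection.
    """
    needing = []
    for index, text in enumerate(page_texts):
        # Count only non-whitespace characters; a scanned page with no
        # text layer typically yields nothing or a few stray characters.
        visible = sum(1 for ch in (text or "") if not ch.isspace())
        if visible < min_chars:
            needing.append(index)
    return needing


def is_indexable(page_texts, min_chars=20):
    """True if at least one page carries enough text to be indexed."""
    return len(pages_needing_ocr(page_texts, min_chars)) < len(page_texts)
```

A file for which `is_indexable` returns False is probably a scan without a text layer, and would need to go through OCR before full-text indexing could find anything in it.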
