Hi all!

I'm currently evaluating Mayan as a replacement for my current DMS. My 
documents are all JPGs scanned at 300 dpi, with the pages of each document 
stored together in one folder. So far, adding JPGs does not allow me to 
create multi-page documents. I used img2pdf to generate multi-page PDFs for 
import into Mayan, which mostly works fine. BUT: the OCR quality for the 
same page is worse when using the PDF files.

I've tried multiple ways to generate the combined PDF, and I can see some 
differences, but I never managed to get the same recognition quality as with 
the original JPG. Since img2pdf (to my knowledge) does not touch the actual 
JPG data, and since I set the PDF page size to fit the image size, I don't 
know what's going wrong here. The PDFs look fine in my PDF viewer and are 
reported to have correct page sizes. Generating the pages with ImageMagick 
does not improve recognition.
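
A minimal sketch of this kind of per-folder conversion, in case it helps to 
reproduce the problem. It assumes img2pdf is installed and that the page 
files have zero-padded names so that sorting by name gives the page order; 
the function names are hypothetical and only for illustration.

```python
from pathlib import Path
import subprocess


def build_img2pdf_command(folder, output_pdf):
    """Return the img2pdf command line for all JPGs in `folder`, sorted by name."""
    folder = Path(folder)
    pages = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in {".jpg", ".jpeg"}
    )
    if not pages:
        raise ValueError(f"no JPG pages found in {folder}")
    # Input files first, then -o with the name of the combined PDF.
    return ["img2pdf", *[str(p) for p in pages], "-o", str(output_pdf)]


def convert_folder(folder, output_pdf):
    """Run img2pdf for one folder (assumes the img2pdf CLI is on PATH)."""
    subprocess.run(build_img2pdf_command(folder, output_pdf), check=True)
```

Calling convert_folder("scans/invoice-0001", "invoice-0001.pdf") would 
produce one PDF with one page per JPG (the paths here are made up).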

This leads me to the conclusion that the PDFs are rendered internally before 
OCR, which degrades the quality.
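
To illustrate why that would matter, here is a rough back-of-the-envelope 
sketch. The A4 scan size and the render resolutions are assumed example 
values, not settings I have confirmed in Mayan. It shows how many pixels 
are left if a page sized to fit a 300 dpi image is rasterized again at a 
lower resolution.

```python
def page_size_points(width_px, height_px, dpi):
    """PDF page size in points (1 pt = 1/72 inch) for an image at a given DPI."""
    return (width_px * 72 / dpi, height_px * 72 / dpi)


def rendered_pixels(page_pt, render_dpi):
    """Pixel dimensions when a page of that size is rasterized at render_dpi."""
    width_pt, height_pt = page_pt
    return (round(width_pt * render_dpi / 72), round(height_pt * render_dpi / 72))


scan = (2480, 3508)  # A4 scanned at 300 dpi (assumed example size)
page = page_size_points(*scan, dpi=300)

# Assumed render resolutions for comparison, not Mayan's actual settings.
for render_dpi in (72, 150, 300):
    print(render_dpi, rendered_pixels(page, render_dpi))
```

Only a render at the original 300 dpi gets back the full 2480 x 3508 pixels; 
at 150 dpi each dimension is halved, which alone could explain worse OCR 
results.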

I have two questions:

1) What can I do to improve PDF recognition quality, either when generating 
the PDF or in the Mayan settings?
2) Is there another way to make multi-page documents from JPGs? Maybe using 
the REST API?

I'm using Mayan version 2.6.2.

Cheers,
Flo

