[tesseract-ocr] Re: Text output vs. PDF

Jeff Breidenbach Mon, 29 Jun 2015 00:46:20 -0700

Unfortunately, I think there is nothing we can do. I've done everything I
can to maximize compatibility with various PDF rendering engines, but
Preview uses particularly terrible text extraction heuristics. To be fair,
the root problem is the design and complexity of the PDF specification
itself.


