This command:

$ tesseract.exe 18.jpg test

gives me "test.txt", which has all the text from 18.jpg, as expected.
This command:

$ tesseract.exe 18.jpg test pdf

gives me "test.pdf", which, when opened in SumatraPDF, doesn't appear to have most of the sentences that exist in test.txt. All the PDF text can be highlighted, but a search from within the PDF finds only fragments of sentences. When I open the same file in Adobe Reader, all the text can be found with the find function.

My environment:

$ tesseract.exe -v
tesseract 3.04.00
leptonica-1.71
libjpeg 8d : libpng 1.5.18 : libtiff 4.0.3 : zlib 1.2.8

SumatraPDF v2.5.2
Adobe Reader 11.0.07

Can someone help me out with why this might be happening?

Thanks,
Chris

