I am using the git version; the console output and messages are below, and the resulting 5.pdf is attached. The PDF seems to have all the lines.

User@HP ~/tesseract-ocr/testing
$ tesseract 5.tif 5 pdf
Tesseract Open Source OCR Engine v3.04.00 with Leptonica
Page 1
OSD: Weak margin (5.78), horiz textlines, not CJK: Don't rotate.
Page 2
Too few characters. Skipping this page
OSD: Weak margin (0.00) for 0 blob text block, but using orientation anyway: 0
Empty page!!
Too few characters. Skipping this page
OSD: Weak margin (0.00) for 0 blob text block, but using orientation anyway: 0
Empty page!!
Warning in pixReadMemTiff: tiff page 2 not found

User@HP ~/tesseract-ocr/testing
$ tesseract -v
tesseract 3.04.00
 leptonica-1.71
  libgif 5.1.0 : libjpeg 8d : libpng 1.6.14 : libtiff 4.0.3 : zlib 1.2.8 : libwebp 0.4.2

ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

On Thu, Jan 8, 2015 at 9:24 PM, C. <cars...@lehrach.de> wrote:

> sorry, meant: 5.pdf is the resulting file.
>
> On Thursday, January 8, 2015 at 16:53:31 UTC+1, C. wrote:
>
>> tesseract 3.03, example is attached (5.tif is the original, 5.tig the
>> result).
>>
>> On Thursday, January 8, 2015 at 16:02:31 UTC+1, shree wrote:
>>>
>>> I don't think that's the expected behavior. What version of tesseract
>>> are you using? Please post a sample image for testing.
>>>
>>> ShreeDevi
>>> ____________________________________________________________
>>> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>>>
>>> On Thu, Jan 8, 2015 at 8:00 PM, C. <car...@lehrach.de> wrote:
>>>
>>>> If I do a simple "tesseract 1.tif 2 pdf", all vertical and horizontal
>>>> lines (and graphics with small lines) in the source file disappear in the
>>>> resulting PDF file (Ubuntu server 12.04, tesseract 3.03).
>>>>
>>>> Is that the expected behavior?

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscr...@googlegroups.com.
To post to this group, send email to tesseract-ocr@googlegroups.com.
Visit this group at http://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/CAG2NduW_RyxzRCwSsec9q%3DVHsD6ogUHwxm5yVPZbmdF1SMqQWg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Attachment: 5.pdf (Adobe PDF document)