It looks like I have a related problem when trying to create hOCR files for
single-word images. The recognized word is missing from the hOCR output, 
although it does appear in the txt file when I run without the hocr parameter.

I am using:

tesseract 3.05.00dev
 leptonica-1.73
  libjpeg 8d (libjpeg-turbo 1.3.0) : libpng 1.2.50 : libtiff 4.0.3 : zlib 1.2.8

myimage.tif does not work (it only works if I use psm 5, 6, 7, 8, 9 or 10, 
and then only with plain text output):
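To make the comparison easier to reproduce, here is a minimal sketch that runs the same image through several page segmentation modes, once with plain text output and once with the hocr config. The helper names (build_command, run_matrix), the output file names and the list of modes are my own assumptions for illustration; the -psm spelling is the one used by the 3.x command line (newer releases use --psm instead).

```python
import subprocess

# Default layout analysis (3) plus the modes reported above to work for text.
PSM_MODES = (3, 5, 6, 7, 8, 9, 10)


def build_command(image, out_base, psm, hocr):
    """Build a tesseract 3.x command line (hypothetical helper)."""
    cmd = ["tesseract", image, out_base, "-psm", str(psm)]
    if hocr:
        cmd.append("hocr")  # config file that switches the output to hOCR
    return cmd


def run_matrix(image, modes=PSM_MODES):
    """Run every psm/output combination and collect tesseract's stderr."""
    results = {}
    for psm in modes:
        for hocr in (False, True):
            out_base = "out_psm{}_{}".format(psm, "hocr" if hocr else "txt")
            proc = subprocess.run(
                build_command(image, out_base, psm, hocr),
                stdout=subprocess.PIPE,
                stderr=subprocess.PIPE,
                universal_newlines=True,
            )
            # Messages such as "Empty page!!" are printed on stderr.
            results[(psm, hocr)] = proc.stderr.strip()
    return results
```

Calling run_matrix("myimage.tif") with tesseract on the PATH should leave one output file per combination in the current directory, so the text and hOCR results for each mode can be compared side by side.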

<https://lh3.googleusercontent.com/-JmhHhKSP2DU/V0g6kGB3qmI/AAAAAAAAAlE/uvgiK3n3rtcc-2RLdrgXKEumBS_vkKEAQCLcB/s1600/myimage.tif>

However, this image works (both txt and hocr formats):

<https://lh3.googleusercontent.com/-28Kktv_pmD0/V0g68yK8gwI/AAAAAAAAAlM/AkK5Z_BLfpUBuB35UpXNqRa7s2yvBGRaQCLcB/s1600/ja.tif>

ERROR message:

Too few characters. Skipping this page

OSD: Weak margin (0.00) for 1 blob text block, but using orientation anyway: 0

Empty page!!


