Hello, I'm running a simple command like this:
tesseract thumb0546.jpg outputbase tsv
The issue is that for one of the words, the letter 'a', it's giving me the
full image size as the rect containing the word:
5 1 1 1 3 2 *0 0 640 360* 96 a
I'm using OS X. Here's the version info. Image and full tsv attached.
Anyone know how to fix this?
*tesseract -v*
tesseract 4.0.0
leptonica-1.77.0
libgif 5.1.4 : libjpeg 9c : libpng 1.6.36 : libtiff 4.0.10 : zlib 1.2.11
: libwebp 1.0.2 : libopenjp2 2.3.0
Found AVX2
Found AVX
Found SSE
| level | page_num | block_num | par_num | line_num | word_num | left | top | width | height | conf | text |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 640 | 360 | -1 |
| 2 | 1 | 1 | 0 | 0 | 0 | 112 | 64 | 432 | 168 | -1 |
| 3 | 1 | 1 | 1 | 0 | 0 | 112 | 64 | 432 | 168 | -1 |
| 4 | 1 | 1 | 1 | 1 | 0 | 112 | 64 | 432 | 29 | -1 |
| 5 | 1 | 1 | 1 | 1 | 1 | 112 | 64 | 3 | 24 | 95 | \| |
| 5 | 1 | 1 | 1 | 1 | 2 | 129 | 64 | 53 | 23 | 95 | find |
| 5 | 1 | 1 | 1 | 1 | 3 | 195 | 65 | 16 | 22 | 96 | it |
| 5 | 1 | 1 | 1 | 1 | 4 | 225 | 64 | 63 | 24 | 96 | hard |
| 5 | 1 | 1 | 1 | 1 | 5 | 302 | 67 | 26 | 21 | 96 | to |
| 5 | 1 | 1 | 1 | 1 | 6 | 342 | 69 | 45 | 24 | 96 | say |
| 5 | 1 | 1 | 1 | 1 | 7 | 400 | 64 | 43 | 24 | 96 | the |
| 5 | 1 | 1 | 1 | 1 | 8 | 456 | 64 | 88 | 29 | 96 | things |
| 4 | 1 | 1 | 1 | 2 | 0 | 166 | 133 | 322 | 27 | -1 |
| 5 | 1 | 1 | 1 | 2 | 1 | 166 | 136 | 4 | 21 | 67 | \| |
| 5 | 1 | 1 | 1 | 2 | 2 | 182 | 137 | 68 | 21 | 67 | want |
| 5 | 1 | 1 | 1 | 2 | 3 | 263 | 137 | 27 | 20 | 95 | to |
| 5 | 1 | 1 | 1 | 2 | 4 | 304 | 140 | 45 | 20 | 96 | say |
| 5 | 1 | 1 | 1 | 2 | 5 | 361 | 133 | 44 | 25 | 96 | the |
| 5 | 1 | 1 | 1 | 2 | 6 | 419 | 137 | 69 | 20 | 96 | most |
| 4 | 1 | 1 | 1 | 3 | 0 | 151 | 203 | 355 | 29 | -1 |
| 5 | 1 | 1 | 1 | 3 | 1 | 151 | 203 | 57 | 24 | 96 | Find |
| 5 | 1 | 1 | 1 | 3 | 2 | 0 | 0 | 640 | 360 | 96 | a |
| 5 | 1 | 1 | 1 | 3 | 3 | 223 | 203 | 89 | 24 | 96 | little |
| 5 | 1 | 1 | 1 | 3 | 4 | 326 | 203 | 36 | 24 | 96 | bit |
| 5 | 1 | 1 | 1 | 3 | 5 | 373 | 203 | 29 | 24 | 96 | of |
| 5 | 1 | 1 | 1 | 3 | 6 | 414 | 203 | 92 | 29 | 96 | steady |
| 2 | 1 | 2 | 0 | 0 | 0 | 232 | 273 | 187 | 30 | -1 |
| 3 | 1 | 2 | 1 | 0 | 0 | 232 | 273 | 187 | 30 | -1 |
| 4 | 1 | 2 | 1 | 1 | 0 | 232 | 273 | 187 | 30 | -1 |
| 5 | 1 | 2 | 1 | 1 | 1 | 232 | 280 | 28 | 17 | 96 | as |
| 5 | 1 | 2 | 1 | 1 | 2 | 275 | 275 | 3 | 22 | 93 | \| |
| 5 | 1 | 2 | 1 | 1 | 3 | 291 | 278 | 45 | 25 | 95 | get |
| 5 | 1 | 2 | 1 | 1 | 4 | 348 | 273 | 71 | 24 | 96 | close |
| 2 | 1 | 3 | 0 | 0 | 0 | 552 | 317 | 75 | 27 | -1 |
| 3 | 1 | 3 | 1 | 0 | 0 | 552 | 317 | 75 | 27 | -1 |
| 4 | 1 | 3 | 1 | 1 | 0 | 552 | 317 | 75 | 15 | -1 |
| 5 | 1 | 3 | 1 | 1 | 1 | 552 | 317 | 11 | 14 | 67 | Sing |
| 5 | 1 | 3 | 1 | 1 | 2 | 586 | 320 | 41 | 12 | 62 | KING |
| 4 | 1 | 3 | 1 | 2 | 0 | 562 | 335 | 62 | 9 | -1 |
| 5 | 1 | 3 | 1 | 2 | 1 | 562 | 335 | 62 | 9 | 0 | PoC |
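
In case it helps anyone reproduce or work around this, here is a minimal Python sketch that scans the TSV output and flags word rows (level 5) whose box is identical to the page box (level 1), which is what happens to the 'a' above. The function name `find_page_sized_words` and the inline sample are my own, for illustration only, and are not part of Tesseract. This only detects the bad rows; it does not fix whatever causes them.

```python
import csv
import io

BOX_KEYS = ("left", "top", "width", "height")


def find_page_sized_words(tsv_text):
    """Return level-5 (word) rows whose box equals the level-1 (page) box."""
    # QUOTE_NONE so that a recognized quote character is not treated as CSV quoting.
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    page_boxes = {}  # page_num -> (left, top, width, height)
    suspicious = []
    for row in reader:
        box = tuple(int(row[key]) for key in BOX_KEYS)
        level = int(row["level"])
        if level == 1:
            page_boxes[row["page_num"]] = box
        elif level == 5 and page_boxes.get(row["page_num"]) == box:
            suspicious.append(row)
    return suspicious


# A few rows copied from the output above, rebuilt as tab-separated text.
SAMPLE = "\n".join(
    "\t".join(fields)
    for fields in [
        ["level", "page_num", "block_num", "par_num", "line_num", "word_num",
         "left", "top", "width", "height", "conf", "text"],
        ["1", "1", "0", "0", "0", "0", "0", "0", "640", "360", "-1", ""],
        ["5", "1", "1", "1", "3", "1", "151", "203", "57", "24", "96", "Find"],
        ["5", "1", "1", "1", "3", "2", "0", "0", "640", "360", "96", "a"],
        ["5", "1", "1", "1", "3", "3", "223", "203", "89", "24", "96", "little"],
    ]
)

for row in find_page_sized_words(SAMPLE):
    print(row["line_num"], row["word_num"], row["text"])
```

For the real file, the same function can be fed the contents of `outputbase.tsv`, for example `open("outputbase.tsv", encoding="utf-8").read()`.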