A common technique is to pre-process your input image.

Resizing produced good results. I also use psm 6 for these types of images,
where text appears at various locations.

In this case I first used your cropped image:

tesseract ArrisVIP2500_cropped.png out -l eng -psm 6 config

and got:

AT&T U verse
rowsn
O F3.
vrrzsoo ’e'

Then I resampled your image to 2000px wide:

tesseract ArrisVIP2500_cropped_2000.png out2000 -l eng -psm 6 config

and got:

AT&T U verse
POWER © " ‘|
/ ‘j""'j"’..
VIP2500 '%’

Cheers



On 7 January 2015 at 19:26, newbie <[email protected]> wrote:

> I am using tess4j, a Java wrapper around Tesseract, and here are the images
> and results. The intent is to extract VIP2500 (the model number) from the
> image. Any help is appreciated.
>
> Attached are the original PNG file (ArrisVIP2500.png), the binarized
> file (ArrisVIP2500_bin.TIF), and a zoomed and cropped
> file (ArrisVIP2500_cropped.png).
>
> *ArrisVIP2500.png*
>
>  é ATE-T U-verse
>
> rowan 0
> /
>
> *ArrisVIP2500_bin.TIF*
>
> AT&T U-verse
>
> rowan <3 3
> / --
>
> vxvzsoo ‘Q’
>
> *ArrisVIP2500_cropped.png*
>
> ATE-T U-verse
>
> rowsn Q
>
> VIPZSOO ‘e’
>
> This looks the closest to VIP2500. I need to get tess4j to recognize
> digits. That said, this might not be a realistic scenario, as
> someone/something needs to zoom and crop the image beforehand
> (preprocessing).
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/009ffbc7-90cc-417a-90c8-b4ac9b5bb203%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/009ffbc7-90cc-417a-90c8-b4ac9b5bb203%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>
