Hello Neo,

How did you turn the original images into those results? What kind of
image processing did you use?

Thanks

On Thu, Dec 6, 2012 at 4:41 AM, Neo Song <[email protected]> wrote:
> Dear SteveP and others,
>
>     I have managed to do some image preprocessing, and I now have the
> attached images to send to Tesseract. However, the characters are all
> hollow (outline only), which is very bad for OCR. How can I convert them
> to solid characters?
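
One possible way to turn outline characters into solid ones is to fill every
background region that is completely enclosed by stroke pixels. The sketch
below is only an illustration, not the method used in this thread. It assumes
the image is already binarized into lists of 0 (background) and 1 (outline),
and the name fill_holes is hypothetical. Note that it also fills genuine holes
such as the inside of 0, 6, 8 and 9, so those glyphs would need extra handling.

```python
from collections import deque


def fill_holes(img):
    """Return a copy of a binary image with enclosed 0-regions set to 1.

    img is a list of rows, each a list of 0 (background) or 1 (outline).
    """
    h, w = len(img), len(img[0])
    outside = [[False] * w for _ in range(h)]
    queue = deque()

    # Seed the flood fill with every background pixel on the image border.
    for y in range(h):
        for x in range(w):
            on_border = y in (0, h - 1) or x in (0, w - 1)
            if on_border and img[y][x] == 0:
                outside[y][x] = True
                queue.append((y, x))

    # 4-connected breadth-first flood fill through background pixels.
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w:
                if img[ny][nx] == 0 and not outside[ny][nx]:
                    outside[ny][nx] = True
                    queue.append((ny, nx))

    # Anything the fill did not reach is outline or enclosed interior.
    return [[0 if outside[y][x] else 1 for x in range(w)] for y in range(h)]


hollow = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
solid = fill_holes(hollow)
```

Gaps in the outline let the fill leak inside, so broken contours would have to
be closed first.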
>
> On Thursday, December 6, 2012 at 5:28:28 AM UTC+8, SteveP wrote:
>>
>> If your characters have a fixed size in terms of pixels, then you might
>> get better results from doing a subimage search than by using OCR.  I mean
>> searching for a rectangular subimage within the image of the card.  The
>> subimages that you could use would be reference images of each digit 0-9.
>>
>> You would probably need to do some image processing first to convert the
>> images to black and white.  Let me know if you need ideas.
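
The subimage search suggested above could be sketched roughly as follows. This
is a minimal illustration under stated assumptions: images are plain lists of
grayscale rows, the threshold of 128 is arbitrary, and the names to_binary and
find_subimage are hypothetical. It scores every placement of a reference digit
by counting mismatched pixels and keeps the best one.

```python
def to_binary(gray, threshold=128):
    """Convert a grayscale image (rows of 0..255) to rows of 0/1."""
    return [[1 if p >= threshold else 0 for p in row] for row in gray]


def find_subimage(image, template):
    """Return (row, col, mismatches) for the best placement of template.

    Both arguments are binary images (rows of 0/1). Returns None if the
    template does not fit inside the image.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best = None
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            # Count pixels where the image window differs from the template.
            mismatches = sum(
                image[r + y][c + x] != template[y][x]
                for y in range(th)
                for x in range(tw)
            )
            if best is None or mismatches < best[2]:
                best = (r, c, mismatches)
    return best


card = to_binary([
    [10, 10, 10, 10],
    [10, 200, 220, 10],
    [10, 210, 10, 10],
    [10, 10, 10, 10],
])
digit = [[1, 1], [1, 0]]
match = find_subimage(card, digit)
```

Running the search once per reference digit and keeping the placements with
the fewest mismatches would give a candidate reading. A real card image would
likely need a tolerance on the mismatch count rather than an exact match.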
>>
>> On Thu, Nov 22, 2012 at 9:04 AM, Tom Morris <[email protected]> wrote:
>>>
>>> Embossed cards are designed to be printed.  Have you considered taking an
>>> impression and scanning the impression?  Or just scanning (magnetically) the
>>> magnetic strip on the card?
>>>
>>> There have been other discussions of training Tesseract for OCR-A (and I
>>> think OCR-B).  Farrington 7B is another in that set of OCRable fonts, so the
>>> process should be the same.
>>>
>>> Tom
>>>
>>>
>>> On Wednesday, November 21, 2012 9:41:02 AM UTC-5, Neo Song wrote:
>>>>
>>>> Thank you for your reply!
>>>> Since the embossed characters on bank cards are designed to be
>>>> OCR-able (according to the ISO 7811 spec), why are there no example
>>>> implementations available on the internet? I cannot find a similar
>>>> question in the Tesseract forum either. I have searched a lot but found
>>>> nothing. Is this supposed to be an easy problem or not?
>>>>
>>>> On Tuesday, November 20, 2012 at 9:45:13 PM UTC+8, TP wrote:
>>>>>
>>>>> On Mon, Nov 19, 2012 at 9:07 AM, Neo Song <[email protected]> wrote:
>>>>>>
>>>>>> Dear All,
>>>>>>     I need to OCR the embossed characters on a bank card. These
>>>>>> characters are in two kinds of font. The first is Farrington 7B,
>>>>>> which is used for the account number. The other font is unknown
>>>>>> (maybe bank-dependent) and is used for the cardholder name, card
>>>>>> issue date, and card serial number.
>>>>>>     The problem is that the embossed characters are very difficult to
>>>>>> OCR, since they become very bright under special lighting. If no
>>>>>> extra light is applied, the card background strongly affects the
>>>>>> characters and causes errors.
>>>>>>     I have uploaded two images. The first sample image shows that
>>>>>> improper lighting makes the characters a mix of dark and light, and
>>>>>> the OCR result is very bad. The second image shows that better
>>>>>> lighting makes the background dark and the embossed characters very
>>>>>> sharp; the OCR result is a little better, but still not good enough.
>>>>>>     Can anybody give me some advice on the lighting, or on image
>>>>>> pre-processing techniques to improve the OCR result? Thank you all!
>>>>>
>>>>>
>>>>> Crazy (and expensive) idea:
>>>>>
>>>>> How about taking two or maybe four pictures of each card with the light
>>>>> coming low from the side on the left and right (and maybe also from
>>>>> top/bottom), then doing some sort of image processing combination?
>>>>> Hopefully, if the light is low enough, the background will fade out and
>>>>> only the various edges of the raised characters will be visible.  Of
>>>>> course this would require some special hardware and the ability to turn
>>>>> a different light on for each scan.
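
The message above does not say which combination to use. One simple candidate
is a per-pixel maximum, on the assumption that each low-angle light brightens
only the edges facing it while the flat background stays dark in every shot.
The sketch below assumes the shots are already aligned, equally sized
grayscale images stored as lists of rows; combine_max is a hypothetical name.

```python
def combine_max(images):
    """Return the per-pixel maximum over equally sized grayscale images."""
    first = images[0]
    h, w = len(first), len(first[0])
    for img in images:
        if len(img) != h or any(len(row) != w for row in img):
            raise ValueError("all images must have the same size")
    # A pixel is bright in the result if any single shot lit it up.
    return [
        [max(img[y][x] for img in images) for x in range(w)]
        for y in range(h)
    ]


lit_from_left = [[200, 10, 10]]
lit_from_right = [[10, 10, 220]]
combined = combine_max([lit_from_left, lit_from_right])
```

Whether a maximum, an average, or a difference between shots works best would
have to be checked against real card images.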
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To post to this group, send email to [email protected]
>>>
>>> To unsubscribe from this group, send email to
>>> [email protected]
>>>
>>> For more options, visit this group at
>>> http://groups.google.com/group/tesseract-ocr?hl=en
>>
>>
