Tesseract is mostly used to recognize text from images.

From what I understand, you want to protect yourself from phishing.
A very good way to do that is to familiarize yourself with the Levenshtein 
distance algorithm.
It's very simple: it counts the minimum number of single-character changes 
(insertions, deletions or substitutions) needed to turn one string into 
another.
For example, if you have paiipal and compare it to paypal, it will give you a 
distance of 2: replace one i with y and delete the other i.
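
To make this concrete, here is a minimal sketch in Python of the standard 
dynamic-programming version of the algorithm, plus a small check that flags 
names which are close to, but not exactly, a trusted name. The helper 
looks_like, the TRUSTED list and the threshold of 2 are my own assumptions 
for illustration, not something from your setup; you would tune them to 
your data.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or
    substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between the first i-1 chars of a
    # and the first j chars of b; start with the empty prefix of a.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca from a
                current[j - 1] + 1,      # insert cb into a
                previous[j - 1] + cost,  # substitute ca with cb (or keep)
            ))
        previous = current
    return previous[-1]


def looks_like(candidate, trusted, max_distance=2):
    """Hypothetical helper: return the trusted names that candidate
    nearly matches (distance 1..max_distance), ignoring case."""
    candidate = candidate.lower()
    return [name for name in trusted
            if 0 < levenshtein(candidate, name.lower()) <= max_distance]


# Hypothetical list of names you actually trust.
TRUSTED = ["paypal", "google", "amazon"]

print(levenshtein("paiipal", "paypal"))
print(looks_like("paiipal", TRUSTED))  # suspicious near-matches, if any
print(looks_like("paypal", TRUSTED))   # an exact match is not flagged
```

An exact match is deliberately excluded from the result, since only 
near-misses are interesting for phishing detection.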

Why am I suggesting this? Because your problem has already been solved in 
a slightly different setting: the corporate world.
Sometimes a dishonest employee will replace the company name on a document 
with a near-identical one, for example the same name with 2 letters swapped.
Small alterations like this are hard for a human to notice, as you 
pointed out, but very easy for a machine.

I hope this helps. If not, maybe I did not fully understand your intentions, 
and you would have to clarify why you need to use Tesseract so I can 
help you further.
