I recently released a book-club app called Bookship (www.bookshipapp.com), 
which lets users scan a page of text from a book so they can post quotes. So 
my use case is people taking pictures of pages with their phones and OCR-ing 
them.

I extensively tested Tesseract (an open source project at this point, not a 
formal Google product, I don't think) and compared it to the Google Cloud 
Vision API's OCR product (https://cloud.google.com/vision/). For my use case, 
the Google Cloud Vision API blew away Tesseract. Tesseract really struggled 
with images that weren't perfectly vertical/horizontal and had difficulty 
dealing with the top and bottom of images (i.e. if a line got cut in half by 
the edge of the picture, Tesseract produced a few lines of gibberish at the 
top). The Google Cloud API seems to be nearly flawless at all of that. It was 
also an order of magnitude faster, and it provides additional features (entity 
extraction, objectionable content detection, etc.).
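
For anyone who wants to repeat this kind of comparison on their own images, 
here is a minimal sketch in Python. It is not the harness used for the tests 
above. The names similarity and compare_engines are hypothetical, and each 
"engine" is assumed to be a function you write yourself that takes image bytes 
and returns the recognized text (for example a wrapper around Tesseract or the 
Cloud Vision API). It also assumes you have a hand-typed reference text for 
each page.

```python
import difflib
import time
from typing import Callable, Dict, List, Tuple

# Hypothetical type: an engine takes raw image bytes and returns OCR text.
OcrEngine = Callable[[bytes], str]


def similarity(expected: str, actual: str) -> float:
    """Return a 0..1 similarity ratio, ignoring differences in whitespace."""
    a = " ".join(expected.split())
    b = " ".join(actual.split())
    return difflib.SequenceMatcher(None, a, b).ratio()


def compare_engines(
    engines: Dict[str, OcrEngine],
    samples: List[Tuple[bytes, str]],
) -> Dict[str, Tuple[float, float]]:
    """Run every engine over (image_bytes, reference_text) samples.

    Returns {engine_name: (mean_similarity, total_seconds)}.
    """
    if not samples:
        raise ValueError("need at least one sample")
    results = {}
    for name, ocr in engines.items():
        scores = []
        start = time.perf_counter()
        for image_bytes, reference in samples:
            scores.append(similarity(reference, ocr(image_bytes)))
        elapsed = time.perf_counter() - start
        results[name] = (sum(scores) / len(scores), elapsed)
    return results
```

With real wrappers plugged in, the mean similarity gives a rough accuracy 
score per engine and the elapsed time gives a rough speed comparison. A 
character-level ratio like this is crude, but it is usually enough to show a 
gap as large as the one described above.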

Of course, Tesseract is free and the Google product requires licensing, 
although it provides a limited amount of usage (1000/month, I think) for free.

And of course these results may be due to my use case or to something 
incorrect in my setup.

Your Mileage May Vary :)

Mark
