Question #265586 on Sikuli changed:
https://answers.launchpad.net/sikuli/+question/265586

    Status: Open => Answered

RaiMan proposed the following answer:
I am sorry to say it, but the implementation of the Tesseract C++ API in
Sikuli is a mess.
It has been that way since X1.0-RC3 was released by the former developers.
The features used internally are at the level of Tesseract 2, with only some
minor fixes that allow Tesseract 3 to be used.
These fixes, however, apply neither to the text() feature (OCR) nor to the
findText() feature (currently find(some text)).
So the quality of both OCR and text search is still at the initial (poor)
level.

Light text on a dark background is a problem for Tesseract, and the
recommendation is to convert the image to dark text on a light background
before passing it to Tesseract. The only Tesseract OCR recommendations that
SikuliX currently follows are converting the image to greyscale and
rescaling it to approximate the recommended 300 dpi.
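
To make the preprocessing steps above concrete, here is a minimal sketch in
plain Python. It is not SikuliX or Tesseract code: the helper names, the
"mean brightness below 128" test for a dark background, and the 96 dpi source
resolution in the usage note are all assumptions made for illustration.

```python
# Hypothetical helpers for illustration only; not part of SikuliX or Tesseract.

def to_grey(pixel):
    """Convert one (r, g, b) pixel to a 0-255 grey value (BT.601 luma weights)."""
    r, g, b = pixel
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))


def preprocess(rows):
    """Greyscale an image given as rows of (r, g, b) tuples.

    If the result is mostly dark (assumed heuristic: mean grey value < 128),
    invert it so that the text ends up dark on a light background.
    """
    grey = [[to_grey(p) for p in row] for row in rows]
    values = [v for row in grey for v in row]
    if not values:
        return grey  # empty image: nothing to do
    mean = sum(values) / float(len(values))
    if mean < 128:
        grey = [[255 - v for v in row] for row in grey]
    return grey


def scale_factor(source_dpi, target_dpi=300):
    """Factor by which to enlarge an image to reach the target dpi."""
    return float(target_dpi) / source_dpi
```

For a capture assumed to be at 96 dpi, scale_factor(96) gives 3.125, so the
image would be enlarged by roughly three times before being passed to OCR.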

I will touch this area only later this year with version 2, and I will
definitely use Tess4J for OCR (Region.text()).
I have not yet decided how I will tackle the findText() feature, but if
possible it will also use Tess4J.

So if you want to know how SikuliX uses it:
you simply have to understand the code, which is mostly C++ and barely
documented.

What I added recently in version 1.1.0 (faq 2709) is the option to switch
to a different language pack, which in theory also allows you to use your
own traineddata.
Additionally, Tesseract allows different option files in the tessdata folder
for different goals. I have not yet tested whether the current SikuliX
approach lets Tesseract recognise these options.
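
Before switching language packs, it can help to confirm that the file you
expect is actually in the tessdata folder. The sketch below relies only on
Tesseract's convention of naming language packs <lang>.traineddata; the
function names are hypothetical, and the SikuliX setting that selects the
language is deliberately not shown (see faq 2709 for that).

```python
import os

SUFFIX = ".traineddata"


def find_traineddata(tessdata_dir, lang):
    """Return the path of <lang>.traineddata in tessdata_dir, or None if missing."""
    candidate = os.path.join(tessdata_dir, lang + SUFFIX)
    return candidate if os.path.isfile(candidate) else None


def list_languages(tessdata_dir):
    """List the language codes that have a traineddata file in tessdata_dir."""
    return sorted(
        name[:-len(SUFFIX)]
        for name in os.listdir(tessdata_dir)
        if name.endswith(SUFFIX)
    )
```

If find_traineddata() returns None for your own language code, the file is
either misnamed or in a different folder than the one you checked.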

... and yes, text() and findText() are completely different
implementations, so if one can do something, it does not mean that the
other can as well, and vice versa.

If you go into the code and find something to improve, contributions are
always welcome.

Just fork the GitHub repo and send pull requests.

-- 
You received this question notification because you are a member of
Sikuli Drivers, which is an answer contact for Sikuli.

