Question #266981 on Sikuli changed:
https://answers.launchpad.net/sikuli/+question/266981

    Status: Open => Answered

RaiMan proposed the following answer:
--- I am not sure that I understand the meaning of the paragraph "The big 
difference against...."
... no problem, it took me nearly a week to get behind the logic that the 
former developer implemented in the C++ code (it is a mess, as already 
mentioned, since obsolete parts and the interface to the text feature 
(Tesseract) are mixed with living code and spread over 3 modules).

-- some theory about the time cost of matchTemplate:
(for details see the doc of matchTemplate in OpenCV)

supposing you know how it works internally, you get the following:
base image: 1,000 x 600 (the image to search in)
probe image: 100 x 100 (the image to be searched)

we have 600,000 pixels minus 10,000 pixels, so roughly 590,000 pixels to
check (the lower right corner of the base need not be visited, since the
remaining area is smaller than the probe).

So for 590,000 pixels we have to evaluate the score, each such pixel being the 
top left corner of a possible match.
For each pixel the score formula has to be evaluated (the type TM_CCOEFF_NORMED 
in your case) over 10,000 base/probe pixel pairs (this effort depends only on 
the probe size; let's name it a score check).

So the score check has to be made 590,000 times, and if you have an RGB
image it has to be made 3 times per position (once for each channel,
finally merged into a single-value vector of scores).
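To make the arithmetic above concrete (same rough counting as in the text; note that the subtraction 600,000 - 10,000 is a convenient round figure, the exact number of candidate top-left corners is somewhat smaller):

```python
# Sizes from the example: base 1000x600, probe 100x100 (one channel)
base_px = 1000 * 600              # 600,000 pixels in the base
probe_px = 100 * 100              # 10,000 pixel pairs per score check
positions = base_px - probe_px    # the rough figure used above: 590,000
exact = (1000 - 100 + 1) * (600 - 100 + 1)   # true count of candidate corners
print(positions, exact)           # 590000 451401
print(positions * probe_px * 3)   # pixel-pair evaluations for RGB: 17,700,000,000
```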

Finally you have a vector containing a score value between 0 and 1 for
each pixel, which you can query with the minMaxLoc() function for the
minimum or maximum value (which one is relevant depends on the type used;
for TM_CCOEFF_NORMED it is the maximum).
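In OpenCV this whole pipeline is matchTemplate() plus minMaxLoc(). As a rough illustration of what the score vector contains, here is a brute-force NumPy sketch of the TM_CCOEFF_NORMED score on made-up toy data (real OpenCV does the same thing far faster in C++):

```python
import numpy as np

def match_template_ccoeff_normed(base, probe):
    """Brute-force TM_CCOEFF_NORMED: one score per candidate top-left corner."""
    H, W = base.shape
    h, w = probe.shape
    p = probe - probe.mean()                 # mean-subtracted probe (CCOEFF)
    scores = np.zeros((H - h + 1, W - w + 1))
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            win = base[y:y + h, x:x + w]
            wz = win - win.mean()
            denom = np.sqrt((wz ** 2).sum() * (p ** 2).sum())
            scores[y, x] = (wz * p).sum() / denom if denom > 0 else 0.0
    return scores

rng = np.random.default_rng(0)
base = rng.random((40, 60))                  # toy base image
probe = base[5:15, 7:17].copy()              # toy probe cut out at (5, 7)
scores = match_template_ccoeff_normed(base, probe)
y, x = map(int, np.unravel_index(scores.argmax(), scores.shape))  # like maxLoc
print((y, x))                                # the embedded location, score ~1.0
```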

--- So how to gain speed?
Looking at the above timing: the smaller the base, the faster; and the closer 
base and probe are in size, the faster.
If base and probe are equal in size you only have 1 score value to evaluate, 
which means that comparing 2 equal-sized images is very fast, and faster than 
searching for the same probe in a larger image.
So the recommendation for SikuliX users is: keep the search region (base) as 
small as possible.
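The effect of shrinking the search region shows up directly in the number of score checks (sizes here are illustrative):

```python
# Number of candidate top-left corners for a 100x100 probe in shrinking bases
probe_w, probe_h = 100, 100
for base_w, base_h in [(1000, 600), (300, 200), (100, 100)]:
    positions = (base_w - probe_w + 1) * (base_h - probe_h + 1)
    print((base_w, base_h), positions)  # drops to a single score check at equal size
```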

But there is another possibility to get faster, even if the user does not follow 
the recommendation and always searches the whole screen: resize base and probe 
to smaller images and do the search there.
It makes sense to set a resize limit, so that the small probe does not get 
smaller than about 100 - 200 pixels in total area (an experience value, to 
preserve uniqueness).

So to make it simple:
in our case the smallest sensible factor would be 1/8 (probe side: 100 / 8 ≈ 12), 
which would result in these images:
base: 125 x 75 (9,375 pixels)
probe: 12 x 12 (144 pixels)

This search costs you a few milliseconds (instead of a few hundred
milliseconds).
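Using the same rough counting as above, the saving of the 1/8 resize is easy to quantify:

```python
# Downscaled by 1/8: base 125x75, probe 12x12 (rough counting as above)
positions = 125 * 75 - 12 * 12   # ~9,231 candidate corners
per_check = 12 * 12              # 144 pixel pairs per score check
print(positions * per_check)     # ~1.3 million, vs ~5.9 billion at full size
```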

If you now take a match in this downscaled situation and calculate the position 
in the original base image using the resize factor, rounding differences may 
leave you with a position slightly different from what a search on the original 
images would give.
And you might be left with a different score value.

As already mentioned: I already implemented the SikuliX find method in
Java only (classes ImageFinder and ImageFind, currently switched off),
where I use this approach, but at the end evaluate the true result by
doing a search in the calculated area with some outer margin.
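The two-pass idea (coarse search on downscaled images, then an exact search in a small region around the scaled-up hit with an outer margin) can be sketched like this; all names, sizes, the block-averaging resize and the squared-difference score are illustrative assumptions, not SikuliX code:

```python
import numpy as np

def downscale(img, f):
    """Shrink by integer factor f via block averaging (stand-in for a real resize)."""
    H, W = img.shape
    H, W = H - H % f, W - W % f
    return img[:H, :W].reshape(H // f, f, W // f, f).mean(axis=(1, 3))

def best_match(base, probe):
    """Exhaustive sum-of-squared-differences search; returns the best (y, x)."""
    H, W = base.shape
    h, w = probe.shape
    best, pos = np.inf, (0, 0)
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            d = ((base[y:y + h, x:x + w] - probe) ** 2).sum()
            if d < best:
                best, pos = d, (y, x)
    return pos

rng = np.random.default_rng(1)
base = rng.random((120, 160))
probe = base[40:72, 88:120].copy()           # 32x32 probe embedded at (40, 88)

f = 4                                        # resize factor for the coarse pass
cy, cx = best_match(downscale(base, f), downscale(probe, f))
m = f                                        # outer margin in original pixels
y0, x0 = max(cy * f - m, 0), max(cx * f - m, 0)
region = base[y0:y0 + probe.shape[0] + 2 * m, x0:x0 + probe.shape[1] + 2 * m]
ry, rx = best_match(region, probe)           # exact pass, small region only
print((y0 + ry, x0 + rx))                    # recovers the embedded location
```

The coarse pass visits only a few hundred candidate corners instead of tens of thousands, and the refinement pass searches just an 81-position neighborhood at full resolution.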

So if you want a Java-only example: ImageFind.doFind()

Suggestions or even contributions are always welcome.
