Question #266981 on Sikuli changed:
https://answers.launchpad.net/sikuli/+question/266981
Status: Open => Answered
RaiMan proposed the following answer:
--- I am not sure that I understand the meaning of the paragraph "The big
difference against...."
... no problem. It took me nearly a week to understand the logic that the
former developer implemented in the C++ code (it is a mess, as already
mentioned: obsolete parts and the interface to the text feature
(Tesseract) are mixed with living code and spread over 3 modules).
-- some theory about the time cost of matchTemplate:
(for details see the doc of matchTemplate in OpenCV)
Supposing you know how it works internally, you get the following:
base image: 1,000 x 600 pixels (the image to search in)
probe image: 100 x 100 pixels (the image to be searched)
The probe's top-left corner can only be placed where the probe still fits
inside the base (the lower-right border strip need not be visited, since
the remaining area is smaller than the probe), which gives
(1,000 - 100 + 1) x (600 - 100 + 1) = 451,401 candidate positions, so
roughly 450,000 instead of the full 600,000 pixels.
For each of these positions we have to evaluate the score, the position
being the top-left corner of a possible match.
At each position the score formula (the type TM_CCOEFF_NORMED in your
case) has to be evaluated over 10,000 base/probe pixel pairs (this effort
depends only on the probe size; let's call one such evaluation a
score-check).
So the score-check has to be made roughly 450,000 times, and for an RGB
image it has to be made 3 times per position (once for each channel,
finally merged into a single matrix of scores).
Finally you have this matrix containing a score value for each position
(for TM_CCOEFF_NORMED between -1 and 1), which you can ask with the
minMaxLoc() function for the minimum or maximum value (depending on the
type used; for TM_CCOEFF_NORMED the maximum is relevant).
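The arithmetic above can be sketched in a few lines of Python (SikuliX
scripts are written in Python syntax anyway). This is only a cost model of
the described effort, not OpenCV's actual implementation, and the function
name is made up for illustration:

```python
def match_template_cost(base_w, base_h, probe_w, probe_h, channels=3):
    # Candidate positions: the probe's top-left corner can only be
    # placed where the probe still fits inside the base image.
    positions = (base_w - probe_w + 1) * (base_h - probe_h + 1)
    # Each position evaluates the score formula over all probe pixels.
    pixel_pairs = probe_w * probe_h
    # One evaluation per channel for an RGB image.
    return positions * pixel_pairs * channels

# The example from above: 451,401 positions x 10,000 pairs x 3 channels
print(match_template_cost(1000, 600, 100, 100))  # 13542030000
```

Note that with base and probe of equal size the position count collapses
to 1, which is the point made below about comparing equal-sized images.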
--- So how to gain speed?
Looking at the timing above: the smaller the base, the faster the search;
and the closer base and probe are in size, the faster as well.
If base and probe are equal in size there is only 1 score value to
evaluate, which means that comparing 2 equal-sized images is very fast,
much faster than searching for the same probe in a larger image.
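To make the score-check itself concrete, here is a minimal pure-Python
sketch of a TM_CCOEFF_NORMED-style comparison of two equal-sized grayscale
images (given as flat pixel lists), which yields exactly one score. It
illustrates the shape of the formula, not OpenCV's optimized code:

```python
from math import sqrt

def ncc(a, b):
    # Mean-centered normalized cross-correlation of two equal-sized
    # images; the score lands in [-1, 1], 1 meaning identical.
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    da = [x - ma for x in a]
    db = [x - mb for x in b]
    num = sum(x * y for x, y in zip(da, db))
    den = sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den

img = [10, 20, 30, 40]
print(ncc(img, img))                # 1.0  (identical images)
print(ncc(img, [40, 30, 20, 10]))   # -1.0 (inverted image)
```

A full search repeats exactly this kind of evaluation once per candidate
position, which is where the cost comes from.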
So the recommendation for SikuliX users is: keep the search region (base) as
small as possible.
But there is another way to gain speed, even if the user does not follow
the recommendation and always searches the whole screen: resize base and
probe to smaller images and do the search there.
It makes sense to set a resize limit, so that the shrunken probe does not
get smaller than about 100 - 200 pixels in area (an experience value, to
preserve uniqueness).
So to make it simple:
in our case the smallest sensible factor would be 1/8 (since 100 / 12 is
about 8), which results in these images:
base: 125 x 75 (9,375 pixels)
probe: 12 x 12 (144 pixels)
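Choosing the factor can be sketched like this; the helper name and the
minimum probe side of 12 pixels are illustrative values derived from the
numbers above, not SikuliX constants:

```python
def resize_plan(base_w, base_h, probe_w, probe_h, min_probe_side=12):
    # Largest integer shrink factor n such that the probe's smaller
    # side stays at or above min_probe_side (at least 1).
    n = max(1, min(probe_w, probe_h) // min_probe_side)  # 100 // 12 = 8
    return (base_w // n, base_h // n), (probe_w // n, probe_h // n)

small_base, small_probe = resize_plan(1000, 600, 100, 100)
print(small_base, small_probe)  # (125, 75) (12, 12)
```

With these sizes the search only has to visit about 7,000 positions with
144 pixel pairs each, instead of about 450,000 positions with 10,000
pairs each, which explains the milliseconds mentioned next.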
This search costs you some milliseconds (instead of some hundred
milliseconds).
If you now take a match found at this small scale and calculate the
position in the original base image using the resize factor, rounding may
produce small differences against what a search with the original images
would return, and you might be left with a slightly different score value.
As already mentioned: I have already implemented the SikuliX find method
in pure Java (classes ImageFinder and ImageFind, currently switched off),
where I use this approach, but at the end evaluate the true result by
doing a search in the calculated area with some outer margin.
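The "calculated area with some outer margin" step can be sketched as
follows; the function name, the margin of 10 pixels, and the clipping are
my illustration, not the actual ImageFind code:

```python
def refine_region(small_x, small_y, factor,
                  probe_w, probe_h, base_w, base_h, margin=10):
    # Scale the coarse hit from the shrunken image back up to
    # original coordinates.
    x = small_x * factor
    y = small_y * factor
    # Region to re-search at full resolution: the probe-sized area
    # plus an outer margin, clipped to the base image.
    left = max(x - margin, 0)
    top = max(y - margin, 0)
    right = min(x + probe_w + margin, base_w)
    bottom = min(y + probe_h + margin, base_h)
    return left, top, right - left, bottom - top

# Coarse hit at (40, 20) with factor 8 maps to (320, 160); the
# refined search region is that area plus 10 pixels on each side.
print(refine_region(40, 20, 8, 100, 100, 1000, 600))
# (310, 150, 120, 120)
```

The final full-resolution search inside this small region is cheap (the
region is barely larger than the probe) and restores the exact match
position and score.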
So if you want a Java-only example, look at ImageFind.doFind().
Suggestions or even contributions are always welcome.
You received this question notification because you are a member of
Sikuli Drivers, which is an answer contact for Sikuli.