[Sikuli-driver] [Bug 710586] Re: X 1.0rc2: Region.text() -- known problems and needed improvements

RaiMan Sun, 15 May 2011 23:35:50 -0700

** Description changed:

  ******* this report is a summary of known problems
  
  The text recognition feature (OCR - Region.text()), together with the
  ability to find text in an image, is still experimental and under
  development.

  These are the currently reported bugs:
  bug 777660: text recognition errors with some fonts
+ bug 783082: [request] want font parameters for text recognition
  bug 735434: Text extraction from Images fails in some cases on colored
    backgrounds
  bug 695616: Inconsistency in text recognition and matching, especially with
    integers-as-text!
  bug 695650: find(text).text() does not return same text
  bug 701005: text() always returns text with trailing x'200A20'
  bug 701012: text() does not return all intervening blanks, add's others
  
  Other observed oddities:
  -- text that is not in English causes problems
  -- very small and very large fonts may not work
  -- multi-line text causes problems
  -- intervening/preceding/trailing graphics and symbols are interpreted
     as text

  Tip when using Region.text():
  Currently you get the best results when the region covers only one line
  of text and contains only text (no graphics/symbols) in English. If you
  can influence it, make the text as large as possible.

  -- additional information:
  Internally, the Tesseract OCR engine
  (http://code.google.com/p/tesseract-ocr/) is used, so its restrictions
  apply (e.g. minimum font size, ...). More information can be found on the
  Tesseract wiki.


** Description changed:

- ******* this report is a summary of known problems
+ ******* this report is a summary of known problems and feature requests
  
  The text recognition feature (OCR - Region.text()), together with the
  ability to find text in an image, is still experimental and under
  development.

  These are the currently reported bugs:
  bug 777660: text recognition errors with some fonts
  bug 783082: [request] want font parameters for text recognition
  bug 735434: Text extraction from Images fails in some cases on colored
    backgrounds
  bug 695616: Inconsistency in text recognition and matching, especially with
    integers-as-text!
  bug 695650: find(text).text() does not return same text
  bug 701005: text() always returns text with trailing x'200A20'
  bug 701012: text() does not return all intervening blanks, add's others
  
  Other observed oddities:
  -- text that is not in English causes problems
  -- very small and very large fonts may not work
  -- multi-line text causes problems
  -- intervening/preceding/trailing graphics and symbols are interpreted
     as text

  Tip when using Region.text():
  Currently you get the best results when the region covers only one line
  of text and contains only text (no graphics/symbols) in English. If you
  can influence it, make the text as large as possible.

  -- additional information:
  Internally, the Tesseract OCR engine
  (http://code.google.com/p/tesseract-ocr/) is used, so its restrictions
  apply (e.g. minimum font size, ...). More information can be found on the
  Tesseract wiki.

-- 
You received this bug notification because you are a member of Sikuli
Drivers, which is subscribed to Sikuli.
https://bugs.launchpad.net/bugs/710586

Title:
  X 1.0rc2: Region.text() -- known problems and needed improvements

Status in Sikuli:
  In Progress

Bug description:
  ******* this report is a summary of known problems and feature
  requests

  The text recognition feature (OCR - Region.text()), together with the
  ability to find text in an image, is still experimental and under
  development.

  These are the currently reported bugs:
  bug 777660: text recognition errors with some fonts
  bug 783082: [request] want font parameters for text recognition
  bug 735434: Text extraction from Images fails in some cases on colored
    backgrounds
  bug 695616: Inconsistency in text recognition and matching, especially with
    integers-as-text!
  bug 695650: find(text).text() does not return same text
  bug 701005: text() always returns text with trailing x'200A20'
  bug 701012: text() does not return all intervening blanks, add's others

  Other observed oddities:
  -- text that is not in English causes problems
  -- very small and very large fonts may not work
  -- multi-line text causes problems
  -- intervening/preceding/trailing graphics and symbols are interpreted
     as text

  Tip when using Region.text():
  Currently you get the best results when the region covers only one line
  of text and contains only text (no graphics/symbols) in English. If you
  can influence it, make the text as large as possible.
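
A minimal sketch of the tip above, assuming a multi-line area whose line
height in pixels is already known: cut the area into one-line strips and read
each strip separately. The helper name split_into_line_strips and the fixed
line_height parameter are hypothetical and not part of Sikuli; the Sikuli
calls Region(x, y, w, h) and Region.text() appear only in a comment, so the
block runs without Sikuli.

```python
def split_into_line_strips(x, y, w, h, line_height):
    """Split the rectangle (x, y, w, h) into horizontal strips.

    Each strip is at most line_height pixels tall, so that it can cover
    a single line of text. Returns a list of (x, y, w, h) tuples ordered
    from top to bottom. line_height is an assumption supplied by the
    caller; it is not detected automatically.
    """
    if line_height <= 0:
        raise ValueError("line_height must be positive")
    strips = []
    top = y
    bottom = y + h
    while top < bottom:
        # The last strip may be shorter than line_height.
        strip_height = min(line_height, bottom - top)
        strips.append((x, top, w, strip_height))
        top += strip_height
    return strips


# Inside a Sikuli script the strips could be read one by one, e.g.:
#   lines = [Region(*strip).text() for strip in strips]
strips = split_into_line_strips(10, 20, 200, 50, 20)
```

A fixed line height only fits evenly spaced text; for anything else the strip
boundaries would have to be chosen by hand.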

  -- additional information:
  Internally, the Tesseract OCR engine
  (http://code.google.com/p/tesseract-ocr/) is used, so its restrictions
  apply (e.g. minimum font size, ...). More information can be found on the
  Tesseract wiki.
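
Until bugs 701005 and 701012 listed above are fixed, a script can clean up
the returned string before comparing it. The following is only a sketch of
such a workaround, not part of Sikuli: the function names are hypothetical,
and it assumes that x'200A20' denotes the three bytes 0x20 0x0A 0x20 (space,
line feed, space). It cannot restore blanks that the OCR dropped, which is
why the second helper compares with all blanks removed.

```python
import re


def normalize_ocr_text(raw):
    """Strip surrounding whitespace and collapse runs of whitespace.

    Removes a trailing space/line feed/space sequence (assumed meaning
    of x'200A20') and turns every run of whitespace into one space.
    """
    return re.sub(r"\s+", " ", raw.strip())


def same_text_ignoring_blanks(a, b):
    """Compare two strings after removing all whitespace from both.

    Useful when blanks may be missing or added, as in bug 701012.
    """
    return re.sub(r"\s+", "", a) == re.sub(r"\s+", "", b)


# Stand-in for a value returned by Region.text(); not real OCR output.
sample = "Total  42 \n "
cleaned = normalize_ocr_text(sample)
```

Removing all blanks before comparing is lossy ("ab c" and "a bc" become
equal), so it only suits checks where word boundaries do not matter.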

