Hey, some updates:
For preprocessing, I started experimenting with unsharp masks to increase the local contrast of the image. With that I am now able to get the 'Wallpaper' string.

However, I am still wondering why the results are sometimes so bad when I recognize the whole screen: the layout results I get from TessBaseAPIAnalyseLayout are pretty good now, but on some screens recognition gives completely wrong results for all strings. When I then use the results from TessBaseAPIAnalyseLayout to restrict the area that recognition should read text from, the results are fine. Any ideas why this happens?

BR
Andreas

On Thursday, October 17, 2013 8:15:32 AM UTC+2, Andreas Lüdeke wrote:
>
> Hey,
>
> I am currently trying to recognize text on screenshots (see the attached
> example). My goal is to recognize all strings on a screenshot.
>
> I am upscaling all images by a factor of 3 to get at least 300 dpi.
> I am using PSM_SINGLE_BLOCK page segmentation.
> I am using the Tess4J wrapper.
>
> With this I get most of the strings out of the images.
> Most strings still have a lot of issues, so I recognize every string again
> using the rectangle I got from the first scan.
> (This isn't very efficient.)
>
> Could someone recommend some preprocessing steps to get all strings out of
> a screenshot (e.g. in the sample the 'Wallpaper' string is missing)?
> Could someone propose how to avoid recognizing every string twice?
>
> Thanks in advance,
>
> Andreas
>

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en
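For reference, the 3x upscaling step mentioned in the quoted mail could be sketched in plain Java with java.awt only. The class and method names here are illustrative, not from the actual project:

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

public class Upscale {
    // Scale an image by an integer factor (e.g. 3) so that small UI text
    // reaches roughly 300 dpi before it is handed to Tesseract.
    static BufferedImage scale(BufferedImage src, int factor) {
        BufferedImage dst = new BufferedImage(
                src.getWidth() * factor, src.getHeight() * factor,
                BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        // Bicubic interpolation keeps glyph edges smoother than nearest-neighbour.
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_BICUBIC);
        g.drawImage(src, 0, 0, dst.getWidth(), dst.getHeight(), null);
        g.dispose();
        return dst;
    }

    public static void main(String[] args) {
        BufferedImage src = new BufferedImage(100, 50, BufferedImage.TYPE_INT_RGB);
        BufferedImage big = scale(src, 3);
        System.out.println(big.getWidth() + "x" + big.getHeight());
    }
}
```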
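The unsharp-mask preprocessing could, for example, be approximated with a single 3x3 convolution: sharpened = 2*original - box blur, i.e. an unsharp mask with a 3x3 box blur and amount 1. This is only one possible variant (the class name is hypothetical):

```java
import java.awt.image.BufferedImage;
import java.awt.image.ConvolveOp;
import java.awt.image.Kernel;

public class Unsharp {
    // Unsharp mask collapsed into one kernel:
    //   sharpened = original + 1.0 * (original - boxBlur(original))
    //             = 2*delta - box, so centre = 2 - 1/9, neighbours = -1/9.
    // The kernel sums to 1, so flat areas keep their brightness while
    // local contrast around glyph edges is boosted.
    static BufferedImage sharpen(BufferedImage src) {
        float a = 1f / 9f;
        float[] k = {
                -a, -a, -a,
                -a, 2f - a, -a,
                -a, -a, -a
        };
        ConvolveOp op = new ConvolveOp(new Kernel(3, 3, k),
                ConvolveOp.EDGE_NO_OP, null);
        return op.filter(src, null);
    }

    public static void main(String[] args) {
        BufferedImage in = new BufferedImage(20, 20, BufferedImage.TYPE_INT_RGB);
        BufferedImage out = sharpen(in);
        System.out.println(out.getWidth() + "x" + out.getHeight());
    }
}
```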
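Restricting recognition to the layout boxes could look roughly like this. The rectangles below are placeholders standing in for TessBaseAPIAnalyseLayout output; in Tess4J, ITesseract.getSegmentedRegions() and the doOCR(image, rect) overload would fill those roles, assuming your version exposes them (the crop helper and class name are hypothetical):

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;
import java.util.Arrays;
import java.util.List;

public class RegionOcr {
    // Run recognition per layout box instead of over the whole screen.
    static BufferedImage crop(BufferedImage src, Rectangle r) {
        // Clamp to the image bounds so a slightly-off layout box cannot
        // make getSubimage throw.
        Rectangle safe = r.intersection(
                new Rectangle(src.getWidth(), src.getHeight()));
        return src.getSubimage(safe.x, safe.y, safe.width, safe.height);
    }

    public static void main(String[] args) {
        BufferedImage screen = new BufferedImage(800, 600, BufferedImage.TYPE_INT_RGB);
        // Placeholder boxes; in practice these would come from layout analysis.
        List<Rectangle> regions = Arrays.asList(
                new Rectangle(10, 10, 200, 30),
                new Rectangle(10, 60, 300, 30));
        for (Rectangle r : regions) {
            BufferedImage part = crop(screen, r);
            // With Tess4J one would now call tesseract.doOCR(part)
            // or directly tesseract.doOCR(screen, r).
            System.out.println(part.getWidth() + "x" + part.getHeight());
        }
    }
}
```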

