Hi,

Adding an extra border is a known trick for fixing OCR results, especially
in single-character mode.
As far as I remember, nobody has found another way to solve this issue.
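
In case it helps, here is a minimal sketch of the border trick on a raw 8-bit,
single-channel buffer. The helper name addBorder and its parameters are my own
(it is not a Tesseract or Leptonica function), and it assumes dark text on a
white background, so the border is filled with 255.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Tesseract or Leptonica): returns a copy of
// an 8-bit, single-channel image with `border` pixels of `fill` on every side.
// `stride` is the number of bytes per source row. The result is tightly
// packed, so its row length is width + 2 * border bytes.
std::vector<std::uint8_t> addBorder(const std::uint8_t* src, int width, int height,
                                    std::size_t stride, int border,
                                    std::uint8_t fill = 255) {
    const int outW = width + 2 * border;
    const int outH = height + 2 * border;
    // Start from an image that is entirely border colour.
    std::vector<std::uint8_t> out(static_cast<std::size_t>(outW) * outH, fill);
    // Copy each source row into the interior of the padded image.
    for (int y = 0; y < height; ++y) {
        const std::uint8_t* srcRow = src + static_cast<std::size_t>(y) * stride;
        std::uint8_t* dstRow =
            out.data() + static_cast<std::size_t>(y + border) * outW + border;
        std::copy(srcRow, srcRow + width, dstRow);
    }
    return out;
}
```

The padded buffer could then be passed to api.SetImage() with the new width and
height, 1 byte per pixel, and the new width as bytes per line. If you work with
a Leptonica PIX instead, pixAddBorder() should serve the same purpose.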

Zdenko


On Wed, Dec 11, 2013 at 12:26 PM, Faruk Terzioğlu
<[email protected]> wrote:

> Using "api.GetComponentImages(RIL_SYMBOL, true, NULL, NULL);", I loop over
> every piece of text, take it with "api.GetThresholdedImage()", and save it
> to disk with "pixWrite". (The image is below.)
>
>
> <https://lh4.googleusercontent.com/--6j09JgYq0Q/Uqg6dprZU0I/AAAAAAAAAFE/bUPRhdYPHJM/s1600/binary.png>
> When I give the same image as input to the Tesseract API, it doesn't
> recognize it. But if I add a line only 1 pixel wide to any of the sides, it
> is recognized well. (The image is below.)
>
>
> <https://lh5.googleusercontent.com/-TBaomOmhywQ/Uqg6nXUrv9I/AAAAAAAAAFM/eHjMstd0UKk/s1600/binaryOK.png>
> (one more pixel line on right side)
>
>
> api.Init("","eng", OEM_TESSERACT_ONLY);
> PIX *pixs;
> pixs = pixRead("C:/tesseract/sampleImg/tr/binary.png");
> api.SetImage(pixs);
> text_out = api.GetUTF8Text();
>
> I did this because on the same image I get good results with
> PSM_SINGLE_BLOCK, PSM_SINGLE_WORD, and PSM_SINGLE_LINE, but with
> PSM_SINGLE_CHAR the results are terrible.
> To find out why, I saved every image part to disk and sent each one
> separately to Tesseract.
>
> If I add 1 pixel to every side as below, the results are pretty good:
> box->x = box->x - 1;
> box->y = box->y - 1;
> box->w = box->w + 2;
> box->h = box->h + 2;
>
> Sample result:
>
> <https://lh3.googleusercontent.com/-gcjCyirWWu0/Uqg3N09PlsI/AAAAAAAAAE4/fG_-acCmoNk/s1600/ikili.png>
> The first image (with wider boxes) is recognized very well (the results are
> at the top left; the question marks stand for newline characters), but in
> the second image every letter is recognized wrongly or not recognized at all.
>
> For this sample I added 1 pixel to every side, but I couldn't check this on
> every image.
> How can I solve this issue?
>
> void tesseractWithOpenCVComponentImages()
> {
>     initTesseract();
>
>     // Load the input as an 8-bit grayscale image.
>     Mat img;
>     img = imread(imagePath, CV_LOAD_IMAGE_GRAYSCALE);
>     if (!img.data) { cout << "Could not load image!"; cin.get(); exit(1); }
>
>     api.SetImage(img.data, img.cols, img.rows, 1, img.step1());
>     text_out = api.GetUTF8Text();
>
>     PageSegMode segMode = PSM_SINGLE_CHAR;
>     PageIteratorLevel level = RIL_SYMBOL;
>
>     api.SetPageSegMode(segMode);
>
>     Boxa *boxes = api.GetComponentImages(level, true, NULL, NULL);
>     for (int i = 0; i < boxes->n; i++)
>     {
>         Box *box = boxaGetBox(boxes, i, L_CLONE);
>         // This will give good result ->
>         // (a box touching the image edge would also need clamping here)
>         box->x = box->x - 1;
>         box->y = box->y - 1;
>         box->w = box->w + 2;
>         box->h = box->h + 2;
>         // <- This will give good result
>
>         // TesseractRect returns the recognized text (or NULL); the caller
>         // owns the buffer and must release it with delete[].
>         char *outText = api.TesseractRect(img.data, 1, img.step1(),
>                                           box->x, box->y, box->w, box->h);
>
>         printf("%d. : %s", i+1, outText ? outText : "");
>         delete[] outText;
>         boxDestroy(&box);   // release the clone taken by boxaGetBox
>     }
>     boxaDestroy(&boxes);    // release the box array from GetComponentImages
> }
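
One caveat with widening the box by hand as in the loop above: when a symbol
touches the image edge, x - 1 or y - 1 becomes negative, or the box runs past
the right or bottom edge. A minimal sketch of a clamped expansion follows. The
Rect struct and the expandClamped name are hypothetical; Rect only mirrors the
x/y/w/h fields of a Leptonica Box so that the example compiles on its own.

```cpp
#include <algorithm>

// Hypothetical plain struct mirroring the x/y/w/h fields of a Leptonica Box.
struct Rect { int x, y, w, h; };

// Grows `r` by `margin` pixels on each side, clipped to an image of
// imgW x imgH pixels so that the result never leaves the image.
Rect expandClamped(Rect r, int margin, int imgW, int imgH) {
    const int left   = std::max(r.x - margin, 0);
    const int top    = std::max(r.y - margin, 0);
    const int right  = std::min(r.x + r.w + margin, imgW);
    const int bottom = std::min(r.y + r.h + margin, imgH);
    return Rect{left, top, right - left, bottom - top};
}
```

Note that where the box gets clipped, no margin is gained on that side, so a
symbol sitting right at the image edge would still need the image itself to be
padded.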
>
>
>
>
>  --
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
>
> ---
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/groups/opt_out.
>

