On 17.07.2012 02:32, Wei Liu wrote:
>
> My platform: Mac OS X 10.7.4 + Xcode 4.3.2 + OpenCV 2.4.0
>
>
> I want to use tesseract-ocr to recognize the text in a few images (see
> attachment), and I wrote a simple function that processes the image using
> OpenCV, shown below:
>
>
> char* wl_ocr(const IplImage* im)
> {
>     // convert image to gray
>     IplImage* imGray = wl_rgb2gray(im);
>     cv::Mat matGray = imGray;
>
>     // initialize tesseract-ocr
>     tesseract::TessBaseAPI tess;
>     tess.Init("", "eng", tesseract::OEM_DEFAULT);
>     tess.SetVariable("tessedit_char_whitelist", "ABCDEFGHIJKLMNOPQRSTUVWXYZ");
>     // tess.SetVariable("tessedit_char_whitelist", "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789");
>     tess.SetPageSegMode(tesseract::PSM_AUTO);
>
>     // process the image
>     // tess.TesseractRect(matGray.data, 1, matGray.step1(), 0, 0, matGray.cols, matGray.rows);
>     tess.SetImage((uchar*)matGray.data, matGray.size().width, matGray.size().height, matGray.channels(), matGray.step1());
>     tess.Recognize(0);
>
>     // get the recognized text
>     char* text;
>     text = tess.GetUTF8Text();
>
>     // clean up
>     cvReleaseImage(&imGray);
>
>     return text;
> }
>
>
> I got the following results:
>
> 0.png --> CAUTION
> 1.png --> TILE WAL
> 2.png --> SLIPPERY
>
> The correct results should be:
>
> 0.png --> CAUTION
> 1.png --> TILE WALKWAY
> 2.png --> SLIPPERY WHEN WET
>
>
> The images seem pretty simple and clean, but my function outputs only
> part of the text instead of all the words. I am not sure whether I
> misconfigured something or whether there is a bug in my code.
>
> BTW, I did not train tesseract-ocr; I simply copied eng.traineddata to
> the tessdata folder (/usr/local/share/tessdata).
>

What version of tesseract are you using? At the moment I do not have
time to test your code, but I just tried this (using tesseract 3.02):

$ tesseract 0.png 0 && cat 0.txt
Tesseract Open Source OCR Engine v3.02 with Leptonica
CAUTION

$ tesseract 1.png 1 && cat 1.txt
Tesseract Open Source OCR Engine v3.02 with Leptonica
TILE WALKWAY

$ tesseract 2.png 2 && cat 2.txt
Tesseract Open Source OCR Engine v3.02 with Leptonica
SLIPPERY WHEN WET

It looks like tesseract 3.02 is able to OCR your images correctly, i.e. you
should either upgrade to version 3.02 or debug your code.
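
If you go the debugging route, a first check is whether the buffer handed to
SetImage() really contains the picture you expect. Below is a minimal sketch,
assuming an 8-bit single-channel image. dump_gray_pgm is a hypothetical helper
(not part of Tesseract or OpenCV) that writes the buffer to a binary PGM file,
honouring the row stride:

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical debugging helper (not a Tesseract or OpenCV function).
// Writes an 8-bit, single-channel buffer to a binary PGM (P5) file.
// bytes_per_line is the row stride in bytes and may exceed width when
// rows are padded.
bool dump_gray_pgm(const unsigned char* data, int width, int height,
                   int bytes_per_line, const std::string& path)
{
    assert(data != nullptr);
    assert(width > 0 && height > 0 && bytes_per_line >= width);

    std::ofstream out(path.c_str(), std::ios::binary);
    if (!out) {
        return false;  // could not open the output file
    }

    // PGM header: magic number, dimensions, maximum gray value.
    out << "P5\n" << width << " " << height << "\n255\n";

    // Copy only the visible pixels of each row, skipping any padding.
    for (int y = 0; y < height; ++y) {
        const unsigned char* row =
            data + static_cast<std::size_t>(y) * bytes_per_line;
        out.write(reinterpret_cast<const char*>(row), width);
    }
    return out.good();
}
```

In wl_ocr() this could be called as
dump_gray_pgm(matGray.data, matGray.cols, matGray.rows,
static_cast<int>(matGray.step), "debug.pgm") just before SetImage(); the file
name is only an example. Running the command-line tesseract on debug.pgm
(assuming your Leptonica build reads PNM files) should then show whether the
problem is in the image data or in the API settings.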

--
Zdenko
