Re: Running OSD support of Tesseract OCR

zdenko podobny Sun, 18 Nov 2012 12:27:53 -0800

Hi all,

You will not get OSD (orientation and script detection) output from the
tesseract executable. At the moment tesseract provides (saves) only the OCR
result. Some may find the help (tesseract --help) misleading because it
enumerates all possible page segmentation modes. I think that starting
with a psm other than 0 would also raise questions...


You get the OSD info via the API. Here is a simple snippet:

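    // Needs <cstdio>, <tesseract/baseapi.h> and <leptonica/allheaders.h>
    // (header locations may differ on your system).
    const char* inputfile = "/usr/src/tesseract-3.02/eurotextUpsideDown.png";
    PIX* image = pixRead(inputfile);

    // Initialise the engine and ask for automatic page segmentation with OSD.
    tesseract::TessBaseAPI* api = new tesseract::TessBaseAPI();
    api->Init("/usr/src/tesseract-3.02/", "eng");
    api->SetPageSegMode(tesseract::PSM_AUTO_OSD);
    api->SetImage(image);
    api->Recognize(0);

    // Query the layout analysis result for the page orientation.
    tesseract::PageIterator* it = api->AnalyseLayout();
    tesseract::Orientation orientation;
    tesseract::WritingDirection direction;
    tesseract::TextlineOrder order;
    float deskew_angle;

    it->Orientation(&orientation, &direction, &order, &deskew_angle);
    printf("Orientation: %d;\nWritingDirection: %d\nTextlineOrder: %d\n"
           "Deskew angle: %.4f\n",
           orientation, direction, order, deskew_angle);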


In the output you can see this information (eurotextUpsideDown.png is
eurotext.tif rotated by 180 degrees):

Orientation: 2;
WritingDirection: 0
TextlineOrder: 2
Deskew angle: -0.0038


This means that the page is upside down (Orientation)[1], the text is written
from left to right (WritingDirection)[2], the lines are ordered from top to
bottom (TextlineOrder)[3], and there is a small deskew angle[4].

As you can see, the attributes WritingDirection, TextlineOrder and deskew
angle do not reflect that the page is upside down, so you will get the same
values for them with eurotext.tif.


[1]
http://code.google.com/p/tesseract-ocr/source/browse/trunk/ccstruct/publictypes.h?r=716#81

[2]
http://code.google.com/p/tesseract-ocr/source/browse/trunk/ccstruct/publictypes.h?r=716#111

[3]
http://code.google.com/p/tesseract-ocr/source/browse/trunk/ccstruct/publictypes.h?r=716#125

[4]
http://code.google.com/p/tesseract-ocr/source/browse/trunk/ccmain/pageiterator.h?r=716#239

--
Zdenko

On Tue, Nov 13, 2012 at 10:26 AM, Alex <[email protected]> wrote:

> Hello, Chirag.
>
> I am also trying to find a way to detect the script of the input document.
>
> Kindly let me know if you have made any progress.
>
> Thanks and Regards,
> Alex
>
>
>
> On Wednesday, March 14, 2012 at 7:17:29 PM UTC+8, Chirag Jain wrote:
>>
>> With -psm 3, I got non-empty files (test_osd.txt), which were empty with
>> -psm 0. This is true both with and without the -l option.
>>
>> However, the results of detectOS are the same for -psm 0 and -psm 3,
>> with or without the -l option.
>>
>> Please note that I have modified the code slightly to call detectOS
>> separately, which has been doing a good job of orientation detection given
>> the script. I am struggling to detect the script of the input document.
>>
>> Regards,
>> Chirag
>>
>>
>> On Wed, Mar 14, 2012 at 4:05 PM, Sriranga(78yrsold) 
>> <[email protected]>wrote:
>>
>>> One more important thing: please test again as follows:
>>> 1st test: tesseract.exe  japanese_doc.tif  test_osd -l jpn -psm 3
>>> 2nd test: tesseract.exe  japanese_doc.tif  test_osd         -psm 3
>>> Please check the output text file "test_osd"; you will find a
>>> difference in script between the two.
>>>
>>> On Wed, Mar 14, 2012 at 3:51 PM, Sriranga(78yrsold) <[email protected]
>>> > wrote:
>>>
>>>> I noticed that "-l lang" before "-psm 0" is missing in your command line.
>>>> In the absence of "-l lang", tesseract will always assume "-l eng".
>>>>
>>>> An extract of the help is reproduced below:
>>>>
>>>> M:\>tesseract.exe -h
>>>> Usage:tesseract.exe imagename outputbase [-l lang] [-psm pagesegmode]
>>>> [configfile...]
>>>> pagesegmode values are:
>>>> 0 = Orientation and script detection (OSD) only.
>>>> 1 = Automatic page segmentation with OSD.
>>>> 2 = Automatic page segmentation, but no OSD, or OCR
>>>> 3 = Fully automatic page segmentation, but no OSD. (Default)
>>>> 4 = Assume a single column of text of variable sizes.
>>>> 5 = Assume a single uniform block of vertically aligned text.
>>>> 6 = Assume a single uniform block of text.
>>>> 7 = Treat the image as a single text line.
>>>> 8 = Treat the image as a single word.
>>>> 9 = Treat the image as a single word in a circle.
>>>> 10 = Treat the image as a single character.
>>>> -l lang and/or -psm pagesegmode must occur before anyconfigfile.
>>>>
>>>>
>>>>
>>>> On Wed, Mar 14, 2012 at 3:22 PM, Chirag <[email protected]> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I was able to successfully test orientation detection (after stepping
>>>>> through the code) for various scripts using the following commands:
>>>>>
>>>>> English: tesseract.exe  english_doc.tif  test_osd -l eng -psm 0
>>>>> Japanese: tesseract.exe  japanese_doc.tif  test_osd -l jpn -psm 0
>>>>> Korean: tesseract.exe  korean_doc.tif  test_osd -l kor -psm 0
>>>>>
>>>>> In these cases, the executable searches for eng.traineddata,
>>>>> jpn.traineddata and kor.traineddata respectively, along with
>>>>> osd.traineddata.
>>>>>
>>>>> The performance is really good.
>>>>>
>>>>>
>>>>> However, it seems that Tesseract is detecting orientation given the script.
>>>>>
>>>>>
>>>>> If I run the executable as follows:
>>>>>
>>>>> Japanese: tesseract.exe  japanese_doc.tif  test_osd  -psm 0
>>>>> Korean: tesseract.exe  korean_doc.tif  test_osd  -psm 0
>>>>>
>>>>> The results are not good. It seems that script detection is not robust.
>>>>>
>>>>> Am I missing some step? Kindly clarify.
>>>>>
>>>>>
>>>>> Regards,
>>>>> Chirag
>>>>>
>>>>>
>>>>> On Sat, Mar 3, 2012 at 7:12 PM, koray <[email protected]>wrote:
>>>>>
>>>>>> OSD returns empty text when I tried it. Can anyone please clarify whether
>>>>>> this is a bug or I am doing something wrong?
>>>>>>
>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "tesseract-ocr" group.
>>>>>> To post to this group, send email to [email protected]
>>>>>>
>>>>>> To unsubscribe from this group, send email to
>>>>>> tesseract-oc...@googlegroups.com
>>>>>>
>>>>>> For more options, visit this group at
>>>>>> http://groups.google.com/group/tesseract-ocr?hl=en
>>>>>>
>>>>>
>>>>>

