Re: Running OSD support of Tesseract OCR

Chirag Wed, 14 Mar 2012 03:35:37 -0700

Thanks Sriranga for the response.

If I specify the script (via "-l lang"), Tesseract only performs orientation
detection for that particular script.

Script detection itself is still an open question.
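
For anyone who wants to compare the two cases side by side, here is a
minimal sketch in Python that assembles both command lines. The helper name
build_osd_command is hypothetical (it is not part of Tesseract), and it only
uses the "-l" and "-psm" options exactly as printed in the help extract
quoted in this thread, placed before any config file.

```python
from typing import List, Optional


def build_osd_command(image: str, outputbase: str,
                      lang: Optional[str] = None,
                      exe: str = "tesseract.exe") -> List[str]:
    """Assemble an OSD-only command line (page segmentation mode 0).

    Hypothetical helper: it only builds the argument list and does not
    run tesseract itself.
    """
    cmd = [exe, image, outputbase]
    if lang is not None:
        # Per the help text, "-l lang" goes after the output base name.
        cmd += ["-l", lang]
    # "-psm 0" = orientation and script detection (OSD) only.
    cmd += ["-psm", "0"]
    return cmd


if __name__ == "__main__":
    # Print the command with an explicit language, then without one.
    for lang in ("jpn", None):
        print(" ".join(build_osd_command("japanese_doc.tif", "test_osd", lang)))
```

With lang="jpn" this gives the first form I tested; with lang=None it gives
the second form, where tesseract falls back to "-l eng" as Sriranga pointed
out.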

Regards,
Chirag


On Wed, Mar 14, 2012 at 3:51 PM, Sriranga(78yrsold) <[email protected]> wrote:

>  I noticed that "-l lang" is missing before "-psm 0" in your command line.
> In the absence of "-l lang", tesseract will always assume "-l eng".
>
> An extract of the help output is reproduced below:
>
> M:\>tesseract.exe -h
> Usage:tesseract.exe imagename outputbase [-l lang] [-psm pagesegmode] [configfile...]
> pagesegmode values are:
> 0 = Orientation and script detection (OSD) only.
> 1 = Automatic page segmentation with OSD.
> 2 = Automatic page segmentation, but no OSD, or OCR
> 3 = Fully automatic page segmentation, but no OSD. (Default)
> 4 = Assume a single column of text of variable sizes.
> 5 = Assume a single uniform block of vertically aligned text.
> 6 = Assume a single uniform block of text.
> 7 = Treat the image as a single text line.
> 8 = Treat the image as a single word.
> 9 = Treat the image as a single word in a circle.
> 10 = Treat the image as a single character.
> -l lang and/or -psm pagesegmode must occur before any configfile.
>
>
> On Wed, Mar 14, 2012 at 3:22 PM, Chirag <[email protected]> wrote:
>
>> Hi all,
>>
>> I was able to successfully test orientation detection (after stepping
>> through the code) for various scripts using the following commands:
>>
>> English: tesseract.exe  english_doc.tif  test_osd -l eng -psm 0
>> Japanese: tesseract.exe  japanese_doc.tif  test_osd -l jpn -psm 0
>> Korean: tesseract.exe  korean_doc.tif  test_osd -l kor -psm 0
>>
>> In these cases, the executable searches for eng.traineddata,
>> jpn.traineddata and kor.traineddata respectively, along with osd.traineddata.
>>
>> The performance is really good.
>>
>>
>> However, it seems that Tesseract detects orientation only when the script is given.
>>
>>
>> If I run the executable as follows:
>>
>> Japanese: tesseract.exe  japanese_doc.tif  test_osd  -psm 0
>> Korean: tesseract.exe  korean_doc.tif  test_osd  -psm 0
>>
>> The results are not good. It seems like script detection is not robust.
>>
>> Am I missing some step? Kindly clarify.
>>
>>
>> Regards,
>> Chirag
>>
>>
>> On Sat, Mar 3, 2012 at 7:12 PM, koray <[email protected]> wrote:
>>
>>>  OSD returns empty text when I try it. Can anyone please clarify whether
>>> this is a bug or I'm doing something wrong?
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To post to this group, send email to [email protected]
>>> To unsubscribe from this group, send email to
>>> [email protected]
>>> For more options, visit this group at
>>> http://groups.google.com/group/tesseract-ocr?hl=en
>>>
>>
>
