I tested using my own lang.tif as follows:
 1) With the -l option and -psm 3: please see the attached testtif-osd.txt
    (non-English output).
 2) Without the -l option and with -psm 3: please see the attached
    2testtif-osd.txt (English output).
In both cases the output is *not empty*, but the two files are in different
languages.

With -psm 0, however, processing fails both with and without -l. An extract
of the command session is reproduced below:
M:\>tesseract.exe test.tif 2testtif-osd    -psm 0
Tesseract Open Source OCR Engine v3.02 with Leptonica
Error during processing.

M:\>tesseract.exe test.tif 2testtif-osd -l k27   -psm 0
Tesseract Open Source OCR Engine v3.02 with Leptonica
Error during processing.
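
For anyone who wants to repeat the four combinations (with/without -l, -psm 0
and -psm 3) in one go, here is a minimal Python sketch. It only assumes the
argument order shown in the 3.02 usage text and that Tesseract writes its
result to outputbase.txt. The helper names (build_command, run_case) and the
default executable name are my own for illustration and are not part of
Tesseract.

```python
import subprocess
from pathlib import Path


def build_command(image, outputbase, lang=None, psm=None, exe="tesseract.exe"):
    """Build the argument list in the order given by the usage text:
    imagename outputbase [-l lang] [-psm pagesegmode]."""
    cmd = [exe, image, outputbase]
    if lang is not None:
        cmd += ["-l", lang]
    if psm is not None:
        cmd += ["-psm", str(psm)]
    return cmd


def run_case(image, outputbase, lang=None, psm=None, exe="tesseract.exe"):
    """Run one combination and return (exit code, stderr text, output size).

    The size is that of outputbase.txt in bytes, or None if the file was
    not created. Assumes the executable is on PATH.
    """
    cmd = build_command(image, outputbase, lang=lang, psm=psm, exe=exe)
    result = subprocess.run(cmd, capture_output=True, text=True)
    out_file = Path(outputbase + ".txt")
    size = out_file.stat().st_size if out_file.exists() else None
    return result.returncode, result.stderr, size
```

Calling run_case("test.tif", "2testtif-osd", lang="k27", psm=0) and then the
same with psm=3, and again with lang=None, covers the four cases discussed in
this thread. A size of 0 or None would correspond to the "empty output"
reported earlier.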


On Wed, Mar 14, 2012 at 4:47 PM, Chirag <[email protected]> wrote:

> With -psm 3, I got non-empty files (test_osd.txt), whereas they were empty
> with -psm 0. This is true both with and without the -l option.
>
> However, the results of detectOS are the same for -psm 0 and -psm 3,
> with or without the -l option.
>
> Please note that I have modified the code slightly to call detectOS
> separately, which has been doing a good job of orientation detection when
> the script is given. I am struggling to detect the script of the input
> document.
>
> Regards,
> Chirag
>
>
> On Wed, Mar 14, 2012 at 4:05 PM, Sriranga(78yrsold) <
> [email protected]> wrote:
>
>> One more important point: please test again as follows:
>> 1st test: tesseract.exe  japanese_doc.tif  test_osd -l jpn -psm 3
>> 2nd test: tesseract.exe  japanese_doc.tif  test_osd         -psm 3
>> Please check the output text file "test_osd" after each run; you will
>> find that the script differs between the two.
>>
>> On Wed, Mar 14, 2012 at 3:51 PM, Sriranga(78yrsold) <
>> [email protected]> wrote:
>>
>>> I noticed that "-l lang" is missing before "-psm 0" in your command
>>> line. In the absence of "-l lang", tesseract will always assume "-l eng".
>>>
>>> An extract of the help output is reproduced below:
>>>
>>> M:\>tesseract.exe -h
>>> Usage:tesseract.exe imagename outputbase [-l lang] [-psm pagesegmode]
>>> [configfile...]
>>> pagesegmode values are:
>>> 0 = Orientation and script detection (OSD) only.
>>> 1 = Automatic page segmentation with OSD.
>>> 2 = Automatic page segmentation, but no OSD, or OCR
>>> 3 = Fully automatic page segmentation, but no OSD. (Default)
>>> 4 = Assume a single column of text of variable sizes.
>>> 5 = Assume a single uniform block of vertically aligned text.
>>> 6 = Assume a single uniform block of text.
>>> 7 = Treat the image as a single text line.
>>> 8 = Treat the image as a single word.
>>> 9 = Treat the image as a single word in a circle.
>>> 10 = Treat the image as a single character.
>>> -l lang and/or -psm pagesegmode must occur before anyconfigfile.
>>>
>>>
>>>
>>> On Wed, Mar 14, 2012 at 3:22 PM, Chirag <[email protected]> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I was able to successfully test orientation detection (after stepping
>>>> through the code) for various scripts using the following commands:
>>>>
>>>> English: tesseract.exe  english_doc.tif  test_osd -l eng -psm 0
>>>> Japanese: tesseract.exe  japanese_doc.tif  test_osd -l jpn -psm 0
>>>> Korean: tesseract.exe  korean_doc.tif  test_osd -l kor -psm 0
>>>>
>>>> In these cases, the executable searches for eng.traineddata,
>>>> jpn.traineddata and kor.traineddata respectively, along with
>>>> osd.traineddata.
>>>>
>>>> The performance is really good.
>>>>
>>>>
>>>> However, it seems that Tesseract detects orientation only when the
>>>> script is given.
>>>>
>>>>
>>>> If I run the executable as follows:
>>>>
>>>> Japanese: tesseract.exe  japanese_doc.tif  test_osd  -psm 0
>>>> Korean: tesseract.exe  korean_doc.tif  test_osd  -psm 0
>>>>
>>>> The results are not good. It seems like script detection is not robust.
>>>>
>>>> Am I missing some step? Kindly clarify.
>>>>
>>>>
>>>> Regards,
>>>> Chirag
>>>>
>>>>
>>>> On Sat, Mar 3, 2012 at 7:12 PM, koray <[email protected]> wrote:
>>>>
>>>>> OSD returns empty text when I tried it. Can anyone please clarify
>>>>> whether this is a bug or I'm doing something wrong?
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "tesseract-ocr" group.
>>>>> To post to this group, send email to [email protected]
>>>>> To unsubscribe from this group, send email to
>>>>> [email protected]
>>>>> For more options, visit this group at
>>>>> http://groups.google.com/group/tesseract-ocr?hl=en
>>>>>
>>>>
>>>
>>>
>

|ಮ್ಮತ್ತೆ ಬರಪಣಿಗ್ಗೆಯ ನೂದಲನೆಯ ದಿನ ಹೃಷಿಂಕೆಂಶ ನನ್ನನ್ನು ಒೂಃ
ಹೆ|ಣುಆಗಿಬಿರಂತ್ತಿದ್ದ.ಕ್ರಿಕೆಟ್ ಕಾಮೆಂಟರಿ ಕೇಳುತ್ತಿರುವಾಗ, ಈಗ, ಹ್ನ
ಅವರೆದುರು ರಾಘಂ ನನ್ನ ಹಿಮಾಲಯ ಬರಹ ಮೆಚ್ಚಿದಜ್ಞಕ್ಷ್ಯ. ನನಗೇ -
ಹ್ನಷಿಕೇಶ, ಈಗಲ|ರ್ತಿಎ, ಕನಸಿನಲ್ಲಿ ಬಿರುವುದು ಹೀಗೆ ಏನೇನೆ|ನೀ ಕಾ
ಸಾರದ ಮೆಕೆಲೆ ನಡೆದದಸ್ಥಿ, ಹಿಂದೆ ಚಂಪ್ರಾ ಜೆ|ರ್ತಿಎತೆ ಪ್ರಕಿಂಗ್ ಹೆ|ನೀ
ದಾಟುವ ಸಾಹಸ ಅಭಾಕ್ಷ್ಯಸ ಮಾಡಿದಸ್ಥಿ ಇವಲ್ಲ ಆ ಸೇತುವೆ ಮೆಆಲೆ

33:39 zadwséfiofis mcsosozb as sfiaeseas awed Zoe:
£.re3efi?,3z33;§DdQ.§s,€é;:-5*? aainoesb éeeéoggdomfi, éeri, $2)
9365363 (Tag?) 6% aoaraoofia 236$ fiszgdg. 537% -
$e)e?a§Q23,é%r1©.@,3§%:§€; 2Dd3§)dD aoefi asfiefixae 3%
mad ineef’ szfiddg, aood zéomv zétraé Q3/5on9 fine
muss mm eazgafix araiadg wag es ?3’e:$>£ fined’
