I tested using my own lang.tif as follows:

1) With the -l option and -psm 3: please see the attached testtif-osd.txt (non-English output).
2) Without the -l option and -psm 3: please see the attached 2testtif-osd.txt (output in English).

In both cases the output is *not empty*, but it is in a different language.
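For anyone who wants to repeat the two tests, here is a minimal Python sketch of how the runs could be scripted. It assumes a Tesseract 3.02-style command line (`-l` and `-psm`, as in this thread) and that `tesseract` is on the PATH. The helper names `build_command` and `run_and_read` are my own and are not part of Tesseract.

```python
import subprocess
from pathlib import Path


def build_command(image, outputbase, lang=None, psm=3, exe="tesseract"):
    """Build a Tesseract 3.02-style argument list (hypothetical helper)."""
    cmd = [exe, image, outputbase]
    if lang is not None:
        # Without -l, tesseract falls back to "eng", as noted in this thread.
        cmd += ["-l", lang]
    cmd += ["-psm", str(psm)]
    return cmd


def run_and_read(image, outputbase, lang=None, psm=3):
    """Run tesseract and return (exit code, text of outputbase.txt or '')."""
    out_file = Path(outputbase + ".txt")
    if out_file.exists():
        out_file.unlink()  # avoid reading a stale result from an earlier run
    result = subprocess.run(build_command(image, outputbase, lang, psm))
    text = out_file.read_text(encoding="utf-8") if out_file.exists() else ""
    return result.returncode, text
```

Calling `run_and_read("test.tif", "testtif-osd", lang="k27")` and then `run_and_read("test.tif", "2testtif-osd")` would correspond to tests 1) and 2); comparing the two returned texts shows whether the output differs with and without `-l`.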
An extract of the cmd session with -psm 0 is reproduced below:

M:\>tesseract.exe test.tif 2testtif-osd -psm 0
Tesseract Open Source OCR Engine v3.02 with Leptonica
Error during processing.

M:\>tesseract.exe test.tif 2testtif-osd -l k27 -psm 0
Tesseract Open Source OCR Engine v3.02 with Leptonica
Error during processing.

On Wed, Mar 14, 2012 at 4:47 PM, Chirag <[email protected]> wrote:

> With -psm 3, I got non-empty files (test_osd.txt) which were empty with
> -psm 0. This is true for both with/without -l options.
>
> However, the results of detectOS is same for both -psm [0/3] option for
> any of with/without -l options.
>
> Please note that I have modified the code slightly to call detectOS
> separately, which has been doing a good job for orientation detection given
> script. I am struggling to detect the script of the input document.
>
> Regards,
> Chirag
>
>
> On Wed, Mar 14, 2012 at 4:05 PM, Sriranga(78yrsold) <
> [email protected]> wrote:
>
>> one more important - please test again as follows:
>> 1st test:tesseract.exe japanese_doc.tif test_osd -l jpn -psm 3
>> 2nd test:tesseract.exe japanese_doc.tif test_osd -psm 3
>> Please check the output text files "test_osd" - you will find difference
>> in script between two.
>>
>> On Wed, Mar 14, 2012 at 3:51 PM, Sriranga(78yrsold) <
>> [email protected]> wrote:
>>
>>> I noticed "-l lang" before "-psm 0" is missing in your commandline. In
>>> the absence of "-l lang" tesseract will always assume as "-l eng".
>>>
>>> extract of help is reproduced below:
>>>
>>> M:\>tesseract.exe -h
>>> Usage:tesseract.exe imagename outputbase [-l lang] [-psm pagesegmode]
>>> [configfile...]
>>> pagesegmode values are:
>>> 0 = Orientation and script detection (OSD) only.
>>> 1 = Automatic page segmentation with OSD.
>>> 2 = Automatic page segmentation, but no OSD, or OCR
>>> 3 = Fully automatic page segmentation, but no OSD. (Default)
>>> 4 = Assume a single column of text of variable sizes.
>>> 5 = Assume a single uniform block of vertically aligned text.
>>> 6 = Assume a single uniform block of text.
>>> 7 = Treat the image as a single text line.
>>> 8 = Treat the image as a single word.
>>> 9 = Treat the image as a single word in a circle.
>>> 10 = Treat the image as a single character.
>>> -l lang and/or -psm pagesegmode must occur before anyconfigfile.
>>>
>>>
>>>
>>> On Wed, Mar 14, 2012 at 3:22 PM, Chirag <[email protected]> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I was able to successfully test orientation detection (after stepping
>>>> though the code) for various scripts using following commands:
>>>>
>>>> English: tesseract.exe english_doc.tif test_osd -l eng -psm 0
>>>> Japanese: tesseract.exe japanese_doc.tif test_osd -l jpn -psm 0
>>>> Korean: tesseract.exe korean_doc.tif test_osd -l kor -psm 0
>>>>
>>>> In these cases, the executable search for eng.traineddata,
>>>> jpn.traineddata and kor.traineddata respectively along with
>>>> osd.traineddata.
>>>>
>>>> The performance is really good.
>>>>
>>>>
>>>> However, it seems like Tesseract is detecting orientation given script.
>>>>
>>>>
>>>> If I run the executable as following:
>>>>
>>>> Japanese: tesseract.exe japanese_doc.tif test_osd -psm 0
>>>> Korean: tesseract.exe korean_doc.tif test_osd -psm 0
>>>>
>>>> The results are not good. It seems like script detection is not robust.
>>>>
>>>> Am I missing some step? Kindly clarify.
>>>>
>>>>
>>>> Regards,
>>>> Chirag
>>>>
>>>>
>>>> On Sat, Mar 3, 2012 at 7:12 PM, koray <[email protected]> wrote:
>>>>
>>>>> OSD returns emty text when I tried. Can anyone please clarify if
>>>>> this is a bug or I m doing things wrong?
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "tesseract-ocr" group.
>>>>> To post to this group, send email to [email protected]
>>>>> To unsubscribe from this group, send email to
>>>>> [email protected]
>>>>> For more options, visit this group at
>>>>> http://groups.google.com/group/tesseract-ocr?hl=en
|ಮ್ಮತ್ತೆ ಬರಪಣಿಗ್ಗೆಯ ನೂದಲನೆಯ ದಿನ ಹೃಷಿಂಕೆಂಶ ನನ್ನನ್ನು ಒೂಃ ಹೆ|ಣುಆಗಿಬಿರಂತ್ತಿದ್ದ.ಕ್ರಿಕೆಟ್ ಕಾಮೆಂಟರಿ ಕೇಳುತ್ತಿರುವಾಗ, ಈಗ, ಹ್ನ ಅವರೆದುರು ರಾಘಂ ನನ್ನ ಹಿಮಾಲಯ ಬರಹ ಮೆಚ್ಚಿದಜ್ಞಕ್ಷ್ಯ. ನನಗೇ - ಹ್ನಷಿಕೇಶ, ಈಗಲ|ರ್ತಿಎ, ಕನಸಿನಲ್ಲಿ ಬಿರುವುದು ಹೀಗೆ ಏನೇನೆ|ನೀ ಕಾ ಸಾರದ ಮೆಕೆಲೆ ನಡೆದದಸ್ಥಿ, ಹಿಂದೆ ಚಂಪ್ರಾ ಜೆ|ರ್ತಿಎತೆ ಪ್ರಕಿಂಗ್ ಹೆ|ನೀ ದಾಟುವ ಸಾಹಸ ಅಭಾಕ್ಷ್ಯಸ ಮಾಡಿದಸ್ಥಿ ಇವಲ್ಲ ಆ ಸೇತುವೆ ಮೆಆಲೆ
33:39 zadwséfiofis mcsosozb as sfiaeseas awed Zoe: £.re3efi?,3z33;§DdQ.§s,€é;:-5*? aainoesb éeeéoggdomfi, éeri, $2) 9365363 (Tag?) 6% aoaraoofia 236$ fiszgdg. 537% - $e)e?a§Q23,é%r1©.@,3§%:§€; 2Dd3§)dD aoefi asfiefixae 3% mad ineef’ szfiddg, aood zéomv zétraé Q3/5on9 fine muss mm eazgafix araiadg wag es ?3’e:$>£ fined’

