Thanks, Sriranga, for the response. If I specify the script myself, Tesseract is only doing orientation detection for that particular script.
The script detection is still a question mark.

Regards,
Chirag

On Wed, Mar 14, 2012 at 3:51 PM, Sriranga(78yrsold) <[email protected]> wrote:

> I noticed "-l lang" before "-psm 0" is missing in your command line. In
> the absence of "-l lang", tesseract will always assume "-l eng".
>
> An extract of the help output is reproduced below:
>
> M:\>tesseract.exe -h
> Usage:tesseract.exe imagename outputbase [-l lang] [-psm pagesegmode] [configfile...]
> pagesegmode values are:
> 0 = Orientation and script detection (OSD) only.
> 1 = Automatic page segmentation with OSD.
> 2 = Automatic page segmentation, but no OSD, or OCR
> 3 = Fully automatic page segmentation, but no OSD. (Default)
> 4 = Assume a single column of text of variable sizes.
> 5 = Assume a single uniform block of vertically aligned text.
> 6 = Assume a single uniform block of text.
> 7 = Treat the image as a single text line.
> 8 = Treat the image as a single word.
> 9 = Treat the image as a single word in a circle.
> 10 = Treat the image as a single character.
> -l lang and/or -psm pagesegmode must occur before anyconfigfile.
>
> On Wed, Mar 14, 2012 at 3:22 PM, Chirag <[email protected]> wrote:
>
>> Hi all,
>>
>> I was able to successfully test orientation detection (after stepping
>> through the code) for various scripts using the following commands:
>>
>> English: tesseract.exe english_doc.tif test_osd -l eng -psm 0
>> Japanese: tesseract.exe japanese_doc.tif test_osd -l jpn -psm 0
>> Korean: tesseract.exe korean_doc.tif test_osd -l kor -psm 0
>>
>> In these cases, the executable searches for eng.traineddata,
>> jpn.traineddata and kor.traineddata respectively, along with
>> osd.traineddata.
>>
>> The performance is really good.
>>
>> However, it seems like Tesseract is detecting orientation given the
>> script.
>>
>> If I run the executable as follows:
>>
>> Japanese: tesseract.exe japanese_doc.tif test_osd -psm 0
>> Korean: tesseract.exe korean_doc.tif test_osd -psm 0
>>
>> The results are not good. It seems like script detection is not robust.
>>
>> Am I missing some step? Kindly clarify.
>>
>> Regards,
>> Chirag
>>
>> On Sat, Mar 3, 2012 at 7:12 PM, koray <[email protected]> wrote:
>>
>>> OSD returns empty text when I tried. Can anyone please clarify if
>>> this is a bug or I am doing things wrong?
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To post to this group, send email to [email protected]
>>> To unsubscribe from this group, send email to
>>> [email protected]
>>> For more options, visit this group at
>>> http://groups.google.com/group/tesseract-ocr?hl=en
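
For anyone scripting these test runs, here is a minimal Python sketch that assembles the command lines discussed in this thread. It only builds the argument list; it does not run Tesseract. The function name `build_tesseract_command` and its parameters are hypothetical, made up for illustration. The `-l` and `-psm` flags, the 0 to 10 mode range, and the rule that both must come before any config file are taken from the help text quoted above. Other Tesseract versions may spell these options differently, so check `tesseract -h` on your install.

```python
from typing import List, Optional, Sequence


def build_tesseract_command(
    image: str,
    output_base: str,
    lang: Optional[str] = None,
    psm: Optional[int] = None,
    configs: Sequence[str] = (),
    exe: str = "tesseract.exe",
) -> List[str]:
    """Build an argument list in the order shown by the quoted help text.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # The quoted help lists pagesegmode values 0 through 10.
    if psm is not None and not 0 <= psm <= 10:
        raise ValueError("psm must be between 0 and 10")

    cmd = [exe, image, output_base]
    # Per the quoted help, -l and -psm must occur before any config file.
    if lang is not None:
        cmd += ["-l", lang]
    if psm is not None:
        cmd += ["-psm", str(psm)]
    cmd += list(configs)
    return cmd


# OSD-only run with the script given explicitly (the case that worked well).
with_lang = build_tesseract_command("japanese_doc.tif", "test_osd", lang="jpn", psm=0)

# OSD-only run without -l (the case where results were poor).
without_lang = build_tesseract_command("japanese_doc.tif", "test_osd", psm=0)
```

If Tesseract is installed, either list can be passed to `subprocess.run(cmd, check=True)` to compare the two runs on the same image.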

