See https://github.com/tesseract-ocr/tessdoc/blob/master/examples/OSD_example.cc

//Get OSD - new code
int orient_deg;
float orient_conf;
const char* script_name;
float script_conf;
// DetectOrientationScript returns false when detection fails, in which case
// the output values must not be read.
if (api->DetectOrientationScript(&orient_deg, &orient_conf, &script_name, &script_conf)) {
  printf("************\n Orientation in degrees: %d\n Orientation confidence: %.2f\n"
         " Script: %s\n Script confidence: %.2f\n",
         orient_deg, orient_conf, script_name, script_conf);
}

On Thursday, March 25, 2021 at 2:11:42 PM UTC+5:30 charles...@gmail.com wrote:

> Hi,
>
> I have been looking into detecting the language automatically.
> I referred to these links. Thank you, Merlijn.
> https://archive.org/services/docs/api/ocr.html#autonomous-mode
> https://git.archive.org/www/tesseract/-/blob/master/main.py#L757
>
> From my analysis, it uses the OSD of the tesseract engine to detect the
> layout and script.
> After detecting the script, it detects the languages written in that script.
>
> So I tried to use the OSD engine mode in textfairy, which is an Android OCR
> app based on tesseract 4.1.1.
> But it doesn't work, and I am not sure how to use the OSD engine mode on
> Android.
> I set 'osd' as the language option string, used osd.traineddata, and set
> 'OEM_OSD_ONLY' as the engine mode.
> But it doesn't work.
>
> I hope someone can help me use the OSD engine mode on Android.
>
> Thank you.
> Best,
> Charles.
>
> On Monday, March 22, 2021 at 10:28:38 AM UTC+8 Charles Cho wrote:
>
>> Hi, Merlijn.
>>
>> Thanks for your kind response.
>>
>> Regarding autonomous mode, I'm trying to find such a module for Android.
>> But I have found nothing so far. I will keep trying.
>>
>> > I am not sure what you're finding on google play store, but I have found
>> > there to be no limitation to the amount of languages that can be used
>> > during OCR. Keep in mind that using more languages will slow down the
>> > OCR process.
>> It's textfairy, an open source app.
>> https://play.google.com/store/apps/details?id=com.renard.ocr
>>
>> Your response is really helpful.
>>
>> Best,
>> Charles.
>>
>> On Sunday, March 21, 2021 at 8:29:13 AM UTC+8 Merlijn Wajer wrote:
>>
>>> Hi,
>>>
>>> On 19/03/2021 10:11, Charles Cho wrote:
>>> > Hello,
>>> > I'm working on an OCR Android app based on tesseract.
>>> > I want to add a feature that detects the language automatically and
>>> > recognizes at least 2 languages at once.
>>> > I have been investigating this for a while, so I know that I have to
>>> > specify the language for tesseract.
>>> > Then how can I implement automatic language detection?
>>>
>>> Not exactly a mobile use case, but you can read how the Internet Archive
>>> does this (I coined it "autonomous mode", where the software just
>>> figures out the scripts and languages):
>>>
>>> https://archive.org/services/docs/api/ocr.html#autonomous-mode
>>>
>>> And the code is available here (I plan to split out the archive.org
>>> specific code from the python code that invokes Tesseract and performs
>>> heuristics like script detection):
>>>
>>> https://git.archive.org/www/tesseract/-/blob/master/main.py#L757
>>>
>>> The tl;dr is to first perform script detection, and use the detected
>>> script to OCR the page - then use language detection libraries to guess
>>> the languages on the page.
>>>
>>> > And tesseract on google play store can recognize 3 languages at once.
>>> > Is that the maximum?
>>>
>>> I am not sure what you're finding on google play store, but I have found
>>> there to be no limitation to the number of languages that can be used
>>> during OCR. Keep in mind that using more languages will slow down the
>>> OCR process.
>>>
>>> > Any help and advice would be really appreciated.
>>>
>>> Hope this helps.
>>>
>>> Cheers,
>>> Merlijn
>>>
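
The tl;dr quoted in this thread (detect the script first, then OCR the page with languages for that script) could be sketched roughly as below. This is only a minimal sketch, not the Internet Archive code: the helper names `JoinLanguages` and `LanguagesForScript`, the script-to-language table, the confidence threshold, and the "eng" fallback are all hypothetical and chosen for illustration. The script names in the table are assumed to match what `DetectOrientationScript` reports. The only Tesseract behaviour relied on is that several languages can be passed as one string joined with '+'.

```cpp
#include <map>
#include <string>
#include <vector>

// Join Tesseract language codes with '+', the separator Tesseract accepts
// for multi-language recognition (for example "eng+deu").
std::string JoinLanguages(const std::vector<std::string>& langs) {
  std::string out;
  for (const std::string& lang : langs) {
    if (!out.empty()) out += '+';
    out += lang;
  }
  return out;
}

// Hypothetical helper: pick a language string for a script name reported by
// OSD. Falls back to `fallback` when the script is missing, unknown, or the
// confidence is below `min_conf`. The table and the threshold are
// illustrative assumptions, not values taken from Tesseract.
std::string LanguagesForScript(const char* script_name, float script_conf,
                               float min_conf = 1.0f,
                               const std::string& fallback = "eng") {
  static const std::map<std::string, std::vector<std::string>> kTable = {
      {"Latin", {"eng", "deu", "fra"}},
      {"Cyrillic", {"rus", "ukr"}},
      {"Han", {"chi_sim", "chi_tra"}},
      {"Hangul", {"kor"}},
      {"Japanese", {"jpn"}},
  };
  if (script_name == nullptr || script_conf < min_conf) return fallback;
  const auto it = kTable.find(script_name);
  if (it == kTable.end()) return fallback;
  return JoinLanguages(it->second);
}
```

With the OSD snippet from the top of this thread, the result of `LanguagesForScript(script_name, script_conf)` would then be the language string for a second, regular OCR pass. As noted earlier in the thread, every extra language slows recognition down, so the per-script lists are probably best kept short.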