Yes, if a variable isn't mentioned in the config file it will just use its default, which for tessedit_ocr_engine_mode is Tesseract only (OEM_TESSERACT_ONLY).
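
To illustrate the fallback, here is a minimal sketch of the lookup logic. It is not Tesseract's actual parameter-loading code: EngineModeFromConfig is a hypothetical helper, and the enum values are as I understand them from the 3.02-era headers, so check them against your own tree. It assumes the unpacked .config file holds one "name value" pair per line.

```cpp
#include <sstream>
#include <string>

// Engine modes as I read them in the OcrEngineMode enum (assumption:
// 3.02-era headers; verify against your own source tree).
enum OcrEngineMode {
  OEM_TESSERACT_ONLY = 0,
  OEM_CUBE_ONLY = 1,
  OEM_TESSERACT_CUBE_COMBINED = 2,
  OEM_DEFAULT = 3
};

// Hypothetical helper, not part of the Tesseract API: scan the text of an
// unpacked <lang>.config file and return the engine mode it requests,
// falling back to the compiled-in default when the variable is absent.
int EngineModeFromConfig(const std::string& config_text) {
  int mode = OEM_TESSERACT_ONLY;  // default given in INT_INIT_MEMBER
  std::istringstream lines(config_text);
  std::string line;
  while (std::getline(lines, line)) {
    std::istringstream fields(line);
    std::string name;
    int value;
    // Lines that are not "name integer" (blank, comments, string values)
    // fail the extraction and are skipped.
    if (fields >> name >> value && name == "tessedit_ocr_engine_mode") {
      mode = value;
    }
  }
  return mode;
}
```

With the Arabic config line quoted below ("tessedit_ocr_engine_mode 1") this returns OEM_CUBE_ONLY. With a config that never mentions the variable, as you found for Spanish, it returns OEM_TESSERACT_ONLY.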

On Mon, Jul 15, 2013 at 10:53:57AM -0700, bear wrote:
> I see, that was very helpful. Thanks Shree. I unpacked Arabic, and noticed
> the engine mode:
>
> tessedit_ocr_engine_mode 1
>
> I unpacked Spanish, and it did not contain an engine mode variable
> declaration.
> Does that mean that it will default to using tesseract only (and not cube) as
> defined in my tesseractclass.cpp? Or, will the absence of the variable from a
> language specific .config file default to something else?
>
> Thanks again.
>
> On Monday, July 15, 2013 1:23:07 PM UTC-4, shree wrote:
> > You can unpack the traineddata file and take a look at the .config file in
> > it.
> >
> > eg. In case of hin.traineddata the config file uses combined mode - cube as
> > well as OEM which makes it very slow. I changed the config value to use OEM
> > only and recombined the file and that improved the speed.
> >
> > Please see
> > http://tesseract-ocr.googlecode.com/svn/trunk/doc/combine_tessdata.1.html
> >
> > Shree
> >
> > Shree Devi Kumar
> > ____________________________________________________________
> > भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
> >
> > On Mon, Jul 15, 2013 at 9:12 PM, bear <[email protected]> wrote:
> > > Thanks, Nick. After poking through the source, it seems that one of
> > > my assumptions was incorrect; tesseract will default to the
> > > OEM_TESSERACT_ONLY mode, therefore it will not try to infer the best
> > > mode to use for individual languages (by default).
> > >
> > > tesseractclass.cpp:
> > >
> > >     INT_INIT_MEMBER(tessedit_ocr_engine_mode,
> > >                     tesseract::OEM_TESSERACT_ONLY,
> > >                     "Which OCR engine(s) to run (Tesseract, Cube, both)."
> > >                     " Defaults to loading and running only Tesseract"
> > >                     " (no Cube,no combiner)."
> > >                     " Values from OcrEngineMode enum in tesseractclass.h)",
> > >                     this->params()),
> > >
> > > On Monday, July 15, 2013 10:38:00 AM UTC-4, Nick White wrote:
> > > > Hi,
> > > >
> > > > > I never set the tessedit_ocr_engine_mode configuration for
> > > > > tesseract, so I assume that it is using the default mode which,
> > > > > from my reading, will infer the best mode to use from the engine
> > > > > for the particular language.
> > > >
> > > > You're right in your assumptions, it will use the default (non-cube)
> > > > mode unless you tell it otherwise. You're also correct that the
> > > > default mode is likely the best for Spanish.
> > > >
> > > > > Finally, where can I set the tessedit_ocr_engine_mode? I cannot
> > > > > find this in any documentation online. Do I need to modify the
> > > > > source before compiling? Is there a configuration file that I can
> > > > > modify or add?
> > > >
> > > > It's a configuration variable, which you set the same way as any
> > > > other configuration variable. That is documented a little here:
> > > > http://code.google.com/p/tesseract-ocr/wiki/ControlParams
> > > >
> > > > I'm afraid I can't help you with performance, as I have no knowledge
> > > > of android stuff. You might find it useful to look at the code of
> > > > Renard's excellent looking Text Fairy app for android:
> > > > https://github.com/renard314/textfairy
> > > >
> > > > Nick
> > > >
> > > > --
> > > > --
> > > > You received this message because you are subscribed to the Google
> > > > Groups "tesseract-ocr" group.
> > > > To post to this group, send email to [email protected]
> > > > To unsubscribe from this group, send email to
> > > > [email protected]
> > > > For more options, visit this group at
> > > > http://groups.google.com/group/tesseract-ocr?hl=en
> > > >
> > > > ---
> > > > You received this message because you are subscribed to the Google
> > > > Groups "tesseract-ocr" group.
> > > > To unsubscribe from this group and stop receiving emails from it, send
> > > > an email to [email protected].
> > > > For more options, visit https://groups.google.com/groups/opt_out.

