I was reading the docs
(https://tesseract-ocr.github.io/4.0.0/a02186.html#a96899e8e5358d96752ab1cfc3bc09f3e)
and came across this apparent conflict. I also noticed that the two
paragraphs have overlapping content (i.e. datapath, language):


*The datapath must be the name of the parent directory of tessdata and must
end in /.* Any name after the last / will be stripped. The language is
(usually) an ISO 639-3 string or nullptr will default to eng. It is
entirely safe (and eventually will be efficient too) to call Init multiple 
times on the same instance to change language, or just to reset the 
classifier. The language may be a string of the form [~]<lang>[+[~]<lang>]* 
indicating that multiple languages are to be loaded. Eg hin+eng will load 
Hindi and English. Languages may specify internally that they want to be 
loaded with one or more other languages, so the ~ sign is available to 
override that. Eg if hin were set to load eng by default, then hin+~eng 
would force loading only hin. The number of loaded languages is limited 
only by memory, with the caveat that loading additional languages will 
impact both speed and accuracy, as there is more work to do to decide on 
the applicable language, and there is more chance of hallucinating 
incorrect words. WARNING: On changing languages, all Tesseract 
<https://tesseract-ocr.github.io/4.0.0/a02358.html> parameters are reset 
back to their default values. (Which may vary between languages.) If you 
have a rare need to set a Variable that controls initialization for a 
second call to Init you should explicitly call End()
<https://tesseract-ocr.github.io/4.0.0/a01625.html#ga38027513ee9c0348de1790bddcdc3391>
and then use SetVariable before Init. This is only a very rare use case, since
there are very few uses that require any parameters to be set before Init.

If set_only_non_debug_params is true, only params that do not contain 
"debug" in the name will be set.


*The datapath must be the name of the data directory (no ending /)* or some
other file in which the data directory resides (for instance argv[0]). The
language is (usually) an ISO 639-3 string or nullptr will default to eng.
If numeric_mode is true, then only digits and Roman numerals will be
returned.
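
To make the language-string form quoted in the first paragraph
([~]<lang>[+[~]<lang>]*) concrete, here is a minimal C++ sketch that splits
such a string into its parts. ParseLanguageString and LangEntry are
hypothetical names used only for illustration; they are not part of the
Tesseract API, and this is not how Tesseract itself is implemented. The sketch
only assumes what the quoted text says: entries are separated by +, and a
leading ~ marks a language that should not be loaded.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One entry of a language string: the language code and whether it was
// prefixed with '~' (meaning "do not load this language").
// Hypothetical type, not part of the Tesseract API.
struct LangEntry {
    std::string code;
    bool excluded;
};

// Hypothetical helper: splits a string of the form [~]<lang>[+[~]<lang>]*
// on '+' and records any leading '~'. Empty tokens are skipped.
std::vector<LangEntry> ParseLanguageString(const std::string& spec) {
    std::vector<LangEntry> entries;
    std::size_t start = 0;
    while (start <= spec.size()) {
        std::size_t plus = spec.find('+', start);
        if (plus == std::string::npos) plus = spec.size();
        std::string token = spec.substr(start, plus - start);
        bool excluded = !token.empty() && token[0] == '~';
        if (excluded) token.erase(0, 1);  // drop the '~' marker
        if (!token.empty()) entries.push_back({token, excluded});
        start = plus + 1;
    }
    return entries;
}
```

With this sketch, "hin+~eng" yields two entries: hin (to be loaded) and eng
(marked as excluded), which matches the example in the quoted paragraph.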
