Dear Tesseract developers,

The comment in ccutil/mainblk.cpp reads:

* TESSDATA_PREFIX Environment variable overrules everything.
* Compiled in -DTESSDATA_PREFIX is next.
* An actual value of argv0 is used if not NULL, otherwise current directory.

But the actual behaviour is:

* TESSDATA_PREFIX Environment variable overrules everything.
* Compiled in -DTESSDATA_PREFIX overrides argv0.
* argv0 is used if not NULL, otherwise current directory.

I think the order should be the other way around: more specific (application-specific) values should override system-wide settings. In particular:

* If argv0 is not NULL, it is used.
* Otherwise, if the TESSDATA_PREFIX environment variable is defined, it is used.
* Otherwise, if defined, the compiled-in TESSDATA_PREFIX is used.
* Otherwise the current directory is used.

I have a problem when trying to use the system-wide library (/usr/lib/libtesseract.so.3) with my own trained datasets in a multi-threaded environment. Setting the TESSDATA_PREFIX environment variable does not work reliably there, since the environment is shared by all threads of the process. The only correct way is to pass the directory to TessBaseAPI::Init(), but that value is ignored because the compiled-in TESSDATA_PREFIX takes precedence.

The attached patch solves this for me. It also makes sure that the directory is post-processed the same way regardless of its source (argument, environment, or compiled-in value).

--
With best regards,
Dmitry
--- ./ccutil/mainblk.cpp.orig  2012-10-26 19:03:06.000000000 +0200
+++ ./ccutil/mainblk.cpp  2013-06-06 20:46:27.000000000 +0200
@@ -51,30 +51,34 @@
   // TESSDATA_PREFIX Environment variable overrules everything.
   // Compiled in -DTESSDATA_PREFIX is next.
   // An actual value of argv0 is used if not NULL, otherwise current directory.
-  if (!getenv("TESSDATA_PREFIX")) {
+  if (argv0 != NULL) {
+    datadir = argv0;
+  } else {
+    if (getenv("TESSDATA_PREFIX")) {
+      datadir = getenv("TESSDATA_PREFIX");
+    } else {
 #ifdef TESSDATA_PREFIX
 #define _STR(a) #a
 #define _XSTR(a) _STR(a)
-    datadir = _XSTR(TESSDATA_PREFIX);
+      datadir = _XSTR(TESSDATA_PREFIX);
 #undef _XSTR
 #undef _STR
 #else
-    if (argv0 != NULL) {
-      datadir = argv0;
-      // Remove tessdata from the end if present, as we will add it back!
-      int length = datadir.length();
-      if (length >= 8 && strcmp(&datadir[length - 8], "tessdata") == 0)
-        datadir.truncate_at(length - 8);
-      else if (length >= 9 && strcmp(&datadir[length - 9], "tessdata/") == 0)
-        datadir.truncate_at(length - 9);
-      if (datadir.length() == 0)
-        datadir = "./";
-    } else {
       datadir = "./";
-    }
 #endif
+    }
+  }
+
+  // datadir may still be empty:
+  if (datadir.length() == 0) {
+    datadir = "./";
   } else {
-    datadir = getenv("TESSDATA_PREFIX");
+    // Remove tessdata from the end if present, as we will add it back!
+    int length = datadir.length();
+    if (length >= 8 && strcmp(&datadir[length - 8], "tessdata") == 0)
+      datadir.truncate_at(length - 8);
+    else if (length >= 9 && strcmp(&datadir[length - 9], "tessdata/") == 0)
+      datadir.truncate_at(length - 9);
   }
 
   // check for missing directory separator
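
For illustration only, the proposed lookup order can be sketched as a small standalone function. This is not the patch itself and not Tesseract code: ResolveDataDir and StripTessdataSuffix are hypothetical names, std::string stands in for Tesseract's own string class, and the compiled-in default is passed as a parameter instead of coming from -DTESSDATA_PREFIX. One assumption differs from the patch: the sketch checks for an empty result after stripping the suffix, so a bare "tessdata" argument also falls back to "./".

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper, not part of Tesseract: removes a trailing "tessdata"
// or "tessdata/" component, because the caller appends it again later.
static std::string StripTessdataSuffix(std::string dir) {
  const std::string kPlain = "tessdata";
  const std::string kSlash = "tessdata/";
  if (dir.size() >= kSlash.size() &&
      dir.compare(dir.size() - kSlash.size(), kSlash.size(), kSlash) == 0) {
    dir.erase(dir.size() - kSlash.size());
  } else if (dir.size() >= kPlain.size() &&
             dir.compare(dir.size() - kPlain.size(), kPlain.size(),
                         kPlain) == 0) {
    dir.erase(dir.size() - kPlain.size());
  }
  return dir;
}

// Hypothetical stand-in for the lookup. Precedence, most specific first:
//   1. explicit argument (argv0)
//   2. TESSDATA_PREFIX environment variable
//   3. compiled-in default (passed in here as a parameter)
//   4. current directory
std::string ResolveDataDir(const char* argv0, const char* compiled_in) {
  std::string dir;
  if (argv0 != nullptr) {
    dir = argv0;
  } else if (const char* env = std::getenv("TESSDATA_PREFIX")) {
    dir = env;
  } else if (compiled_in != nullptr) {
    dir = compiled_in;
  }
  // Same post-processing regardless of where the value came from.
  dir = StripTessdataSuffix(dir);
  if (dir.empty()) dir = "./";
  return dir;
}
```

With this order, a call such as ResolveDataDir("/opt/my/tessdata/", "/usr/share/") yields "/opt/my/" whether or not TESSDATA_PREFIX is set in the environment, which is the behaviour I need when several threads load different datasets.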