Not a bug; tess-two is just not up to date with Tesseract 4.x.

On Monday, December 9, 2019 at 12:14:49 AM UTC-6, NY C wrote:
>
> I know there are new OcrEngineMode values in Tesseract,
> but not in tess-two.
>
> In Tesseract 4.x, OcrEngineMode is:
>
> enum OcrEngineMode {
>   OEM_TESSERACT_ONLY,           // Run Tesseract only - fastest; deprecated
>   OEM_LSTM_ONLY,                // Run just the LSTM line recognizer.
>   OEM_TESSERACT_LSTM_COMBINED,  // Run the LSTM recognizer, but allow fallback
>                                 // to Tesseract when things get difficult.
>                                 // deprecated
>   OEM_DEFAULT,                  // Specify this mode when calling init_*(),
>                                 // to indicate that any of the above modes
>                                 // should be automatically inferred from the
>                                 // variables in the language-specific config,
>                                 // command-line configs, or if not specified
>                                 // in any of the above should be set to the
>                                 // default OEM_TESSERACT_ONLY.
>   OEM_COUNT                     // Number of OEMs
> };
>
> However, in the newest release of tess-two, OcrEngineMode is:
>
> @IntDef({OEM_TESSERACT_ONLY, OEM_CUBE_ONLY,
>         OEM_TESSERACT_CUBE_COMBINED, OEM_DEFAULT})
> public @interface OcrEngineMode {}
> public static final int OEM_TESSERACT_ONLY = 0;
> @Deprecated
> public static final int OEM_CUBE_ONLY = 1;
> @Deprecated
> public static final int OEM_TESSERACT_CUBE_COMBINED = 2;
> public static final int OEM_DEFAULT = 3;
>
> If there is no way to set OEM_LSTM_ONLY in tess-two,
> I can only assume this is a bug in tess-two.
>
> On Monday, December 9, 2019 at 12:38:56 AM UTC+8, Quan Nguyen wrote:
>>
>> There are new OcrEngineMode
>> <https://github.com/tesseract-ocr/tesseract/blob/master/include/tesseract/publictypes.h>
>> values.
>>
>> On Saturday, December 7, 2019 at 7:37:49 PM UTC-6, NY C wrote:
>>>
>>> Hi, I am using tess-two for OCR
>>> (Alex Cohn's version: https://github.com/alexcohn/tess-two).
>>>
>>> Code:
>>>
>>> TessBaseAPI baseApi = new TessBaseAPI();
>>> baseApi.setDebug(true);
>>> baseApi.init(pathfiles, language);
>>> //baseApi.setVariable(TessBaseAPI.VAR_CHAR_WHITELIST, "0123456789");
>>> baseApi.setPageSegMode(TessBaseAPI.PageSegMode.PSM_AUTO);
>>> baseApi.setImage(bmp);
>>> result = baseApi.getUTF8Text();
>>> baseApi.end();
>>>
>>> The code runs perfectly when I use this tessdata:
>>> https://github.com/tesseract-ocr/tessdata
>>>
>>> But when I use tessdata_fast
>>> (https://github.com/tesseract-ocr/tessdata_fast), the code crashes on
>>> baseApi.init.
>>>
>>> There is no error message, since the init method calls native C++. As far
>>> as I can trace, the init method crashes on this line:
>>>
>>> boolean success = nativeInitOem(mNativeData, datapath, language,
>>>         ocrEngineMode);
>>>
>>> I also tried to set the OEM like this:
>>>
>>> baseApi.init(pathfiles, language, TessBaseAPI.OEM_CUBE_ONLY);
>>>
>>> I have tried all the OEM parameters:
>>>
>>> (OEM_TESSERACT_ONLY = 0, OEM_CUBE_ONLY = 1, OEM_TESSERACT_CUBE_COMBINED
>>> = 2, OEM_DEFAULT = 3)
>>>
>>> It crashes with each of them as well.
>>>
>>> How could I fix this?
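
For reference, the two sets of constants quoted in this thread can be lined up by numeric value. The sketch below is only a side-by-side comparison in plain Java: the class name OemValues and its helper methods are hypothetical and are not part of tess-two or Tesseract. It relies on the C++ rule that an enum without explicit initializers numbers its members from 0 in declaration order. It does not show that passing the value 1 to a given tess-two build selects the LSTM engine; that depends on which Tesseract version the native library was built from.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of tess-two or Tesseract.
final class OemValues {
    // Tesseract 4.x OcrEngineMode names in declaration order, as quoted above.
    // A C++ enum without initializers starts at 0, so the list index is the value.
    // OEM_COUNT (value 4) is left out because it is a count, not a mode.
    static final List<String> TESSERACT_4X = Arrays.asList(
            "OEM_TESSERACT_ONLY",
            "OEM_LSTM_ONLY",
            "OEM_TESSERACT_LSTM_COMBINED",
            "OEM_DEFAULT");

    // tess-two constants as quoted above, indexed by their declared int values 0..3.
    static final List<String> TESS_TWO = Arrays.asList(
            "OEM_TESSERACT_ONLY",
            "OEM_CUBE_ONLY",
            "OEM_TESSERACT_CUBE_COMBINED",
            "OEM_DEFAULT");

    // Name of the Tesseract 4.x mode with the given numeric value.
    static String tesseract4xName(int value) {
        return TESSERACT_4X.get(value);
    }

    // Name of the tess-two constant with the given numeric value.
    static String tessTwoName(int value) {
        return TESS_TWO.get(value);
    }

    public static void main(String[] args) {
        // Print one row per numeric value: value, tess-two name, Tesseract 4.x name.
        for (int value = 0; value < TESS_TWO.size(); value++) {
            System.out.println(value + "  " + tessTwoName(value)
                    + "  ->  " + tesseract4xName(value));
        }
    }
}
```

The comparison shows that OEM_CUBE_ONLY and OEM_LSTM_ONLY share the value 1, and OEM_TESSERACT_CUBE_COMBINED and OEM_TESSERACT_LSTM_COMBINED share the value 2, so the difference between the two lists is in the names attached to those values.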