I know there are new OcrEngineMode values in Tesseract,
but they are not in tess-two.

In Tesseract 4.x, OcrEngineMode is:

enum OcrEngineMode {
  OEM_TESSERACT_ONLY,           // Run Tesseract only - fastest; deprecated
  OEM_LSTM_ONLY,                // Run just the LSTM line recognizer.
  OEM_TESSERACT_LSTM_COMBINED,  // Run the LSTM recognizer, but allow fallback
                                // to Tesseract when things get difficult.
                                // deprecated
  OEM_DEFAULT,                  // Specify this mode when calling init_*(),
                                // to indicate that any of the above modes
                                // should be automatically inferred from the
                                // variables in the language-specific config,
                                // command-line configs, or if not specified
                                // in any of the above should be set to the
                                // default OEM_TESSERACT_ONLY.
  OEM_COUNT                     // Number of OEMs
};

However, in the newest release of tess-two, OcrEngineMode is:

    @IntDef({OEM_TESSERACT_ONLY, OEM_CUBE_ONLY, OEM_TESSERACT_CUBE_COMBINED, OEM_DEFAULT})
    public @interface OcrEngineMode {}
    public static final int OEM_TESSERACT_ONLY = 0;
    @Deprecated
    public static final int OEM_CUBE_ONLY = 1;
    @Deprecated
    public static final int OEM_TESSERACT_CUBE_COMBINED = 2;
    public static final int OEM_DEFAULT = 3;

If there is no way to set OEM_LSTM_ONLY in tess-two,
I can only assume this is a bug in tess-two.
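
For comparison, here is a minimal sketch of how the two sets of constants line up numerically. The Tesseract 4.x values are my assumption, taken from the declaration order of the enum above (C++ enumerators count up from 0 unless stated otherwise); the tess-two values are copied from the constants above. The class name OemMapping and the helper tesseract4Name are made up for illustration and are not part of either library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class OemMapping {
    // Tesseract 4.x values, assumed from the declaration order of the
    // enum quoted above (C++ enumerators start at 0 by default).
    static final int OEM_TESSERACT_ONLY = 0;
    static final int OEM_LSTM_ONLY = 1;
    static final int OEM_TESSERACT_LSTM_COMBINED = 2;
    static final int OEM_DEFAULT = 3;

    // tess-two values, copied from the TessBaseAPI constants quoted above.
    static final Map<String, Integer> TESS_TWO = new LinkedHashMap<>();
    static {
        TESS_TWO.put("OEM_TESSERACT_ONLY", 0);
        TESS_TWO.put("OEM_CUBE_ONLY", 1);
        TESS_TWO.put("OEM_TESSERACT_CUBE_COMBINED", 2);
        TESS_TWO.put("OEM_DEFAULT", 3);
    }

    // Hypothetical helper: name of the Tesseract 4.x mode with this value.
    static String tesseract4Name(int value) {
        switch (value) {
            case OEM_TESSERACT_ONLY: return "OEM_TESSERACT_ONLY";
            case OEM_LSTM_ONLY: return "OEM_LSTM_ONLY";
            case OEM_TESSERACT_LSTM_COMBINED: return "OEM_TESSERACT_LSTM_COMBINED";
            case OEM_DEFAULT: return "OEM_DEFAULT";
            default: return "unknown";
        }
    }

    public static void main(String[] args) {
        // Print each tess-two constant next to the 4.x mode sharing its value.
        for (Map.Entry<String, Integer> e : TESS_TWO.entrySet()) {
            System.out.println("tess-two " + e.getKey() + " = " + e.getValue()
                    + " -> Tesseract 4.x " + tesseract4Name(e.getValue()));
        }
    }
}
```

If that assumption holds, OEM_CUBE_ONLY and OEM_LSTM_ONLY share the value 1, so passing OEM_CUBE_ONLY would reach the native layer as 1. Whether the native library then selects the LSTM engine depends on which Tesseract version the tess-two build links against, which I have not verified. As described in my original message below, all four values crash for me.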



On Monday, December 9, 2019 at 12:38:56 AM UTC+8, Quan Nguyen wrote:
>
> There are new OcrEngineMode values
> <https://github.com/tesseract-ocr/tesseract/blob/master/include/tesseract/publictypes.h>.
>
>
> On Saturday, December 7, 2019 at 7:37:49 PM UTC-6, NY C wrote:
>>
>> Hi, I am using tess-two for OCR.
>>
>>
>> (Alex Cohn's version: https://github.com/alexcohn/tess-two)
>>
>>
>> Code:
>>
>>         TessBaseAPI baseApi = new TessBaseAPI();
>>         baseApi.setDebug(true);
>>         baseApi.init(pathfiles, language);
>>         //baseApi.setVariable(TessBaseAPI.VAR_CHAR_WHITELIST, "0123456789");
>>         baseApi.setPageSegMode(TessBaseAPI.PageSegMode.PSM_AUTO);
>>         baseApi.setImage(bmp);
>>         result= baseApi.getUTF8Text();
>>         baseApi.end();
>>
>>
>> The code runs perfectly when I use this tessdata:
>> https://github.com/tesseract-ocr/tessdata
>>
>> But when I use tessdata_fast
>> (https://github.com/tesseract-ocr/tessdata_fast), the code crashes on
>> baseApi.init.
>>
>>
>> There is no error message since the init method calls native C++. As far 
>> as I can trace, the init method crashes on this line:
>>
>> boolean success = nativeInitOem(mNativeData, datapath, language, ocrEngineMode);
>>
>>
>> I also tried to set the OEM like this: 
>>
>>   baseApi.init(pathfiles, language, TessBaseAPI.OEM_CUBE_ONLY);
>>
>>
>> I have tried all of the OEM values:
>>
>> (OEM_TESSERACT_ONLY = 0, OEM_CUBE_ONLY = 1, OEM_TESSERACT_CUBE_COMBINED = 2,
>> OEM_DEFAULT = 3)
>>
>> Each one crashes as well.
>>
>>
>> How could I fix this?
>>
>>
>>
>>
