I solved it and am writing the fix down here in case it is useful to 
somebody else.

The problem was that I had not rebuilt tess-two after adding my code!
After I rebuilt it, the new API was available.
I must say that following the examples of the other APIs in the code makes 
it easy to implement a new one, which saved me a lot of time I would 
otherwise have spent reading the JNI tutorial.
BTW, I used some helper methods to support the new API. Just so you know: 
do not put the supporting methods in baseapi.h.
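
For anyone who hits the same UnsatisfiedLinkError: as far as I understand it, the Java side compiles fine as soon as the native method is declared, but the call only succeeds if the native library that is loaded actually contains the matching function. Below is a minimal sketch of that failure mode on a desktop JVM. The class and method names are made up for illustration and are not part of tess-two.

```java
// Hypothetical demo class. nativeNotBuilt stands in for any native method
// whose C/C++ implementation is missing from the loaded library.
class MissingNativeDemo {
    // Declared in Java, but no native library provides an implementation.
    private static native String nativeNotBuilt(int pageNumber);

    // Returns true if calling the native method fails at link time.
    static boolean callFailsToLink() {
        try {
            nativeNotBuilt(0);
            return false;
        } catch (UnsatisfiedLinkError e) {
            // Same error type as in the LogCat trace quoted below.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("link failure: " + callFailsToLink());
    }
}
```

In my case, rebuilding the native library so that it contains the new function is what made the error go away.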



On Sunday, November 18, 2012 10:46:20 AM UTC-6, Linda Li wrote:
>
> I am using the tesseract-ocr-android library.
>
> See https://github.com/rmtheis/tess-two
>
> Now I want to add a new API.
>
> (1) Current output API getUTF8Text()
>
> The current output API getUTF8Text() outputs the text with the words 
> separated by spaces. However, I find that a word can itself be a space, 
> so we cannot tell the word indexes from the getUTF8Text() output. Without 
> word indexes, we cannot match the results of the wordBoundingBoxes and 
> wordConfidences APIs to each word.
>
>
>  For example, for this line:
>
> line 8: 
>
> “ @- 7- - @ - - @ ”
>
> How can you tell how many words are in the line?
>
> In the hOCR text, we can see:
>
> <span class='ocr_line' id='line_8' title="bbox 0 771 2876 834">
>
> <span class='ocrx_word' id='word_125' title="bbox 0 805 70 834"></span>
> <span class='ocrx_word' id='word_126' title="bbox 184 773 318 831">@-</span>
> <span class='ocrx_word' id='word_127' title="bbox 387 820 389 822"> </span>
> <span class='ocrx_word' id='word_128' title="bbox 452 771 550 816">7-</span>
> <span class='ocrx_word' id='word_129' title="bbox 583 823 585 825"> </span>
> <span class='ocrx_word' id='word_130' title="bbox 616 808 622 815">-</span>
> <span class='ocrx_word' id='word_131' title="bbox 758 819 812 832"></span>
> <span class='ocrx_word' id='word_132' title="bbox 839 821 841 823"> </span>
> <span class='ocrx_word' id='word_133' title="bbox 865 818 888 831"></span>
> <span class='ocrx_word' id='word_134' title="bbox 923 802 1016 830"></span>
> <span class='ocrx_word' id='word_135' title="bbox 1214 816 1216 819">@</span>
>
> <span …..
>
>
>  Some words are spaces, some are empty. 
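
To make the ambiguity concrete, here is a small sketch (my own illustration, not tess-two code). It assumes the plain-text output simply joins the words with single spaces, as described above, and uses the first six words of line 8 from the hOCR. The class and method names are made up.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of why word indexes cannot be recovered from
// space-separated text when a word can itself be empty or a space.
class WordCountAmbiguity {
    // Joins words with single spaces (assumed model of the plain-text output).
    static String joinWords(List<String> words) {
        return String.join(" ", words);
    }

    // Naive attempt to recover the word count by splitting on whitespace.
    static int naiveWordCount(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    public static void main(String[] args) {
        // word_125 .. word_130 from the hOCR: empty, "@-", space, "7-", space, "-"
        List<String> words = Arrays.asList("", "@-", " ", "7-", " ", "-");
        String text = joinWords(words);
        System.out.println("actual words: " + words.size());
        System.out.println("recovered by splitting: " + naiveWordCount(text));
    }
}
```

The six recognized words collapse to three tokens after splitting, so the bounding boxes and confidences can no longer be lined up with the text.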
>
>
> (2) I want to add a new API that outputs the result as a string in the 
> following format:
>
> - - - - - - - - - - - - - - - - 
>
> line, 1, left, top, right, bottom, word1, word2, word3, word4, word5, 
> meanConfidenceOfThisLine \n
>
> line, 2, left, top, right, bottom, word1, word2, meanConfidenceOfThisLine 
> \n
>
> - - - - - - - - - - - - - - - - 
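
A rough sketch of building one record in this format, written in plain Java just to pin the layout down. The class name, the method name and the confidence value 57 are made up for illustration; the bounding box is the one from line 8 above. The sketch drops the space before the trailing newline.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that formats one line record as:
// line, <number>, <left>, <top>, <right>, <bottom>, <word>..., <meanConfidence>\n
class LineRecordFormat {
    static String formatLine(int lineNumber, int left, int top, int right,
                             int bottom, List<String> words, int meanConfidence) {
        StringBuilder sb = new StringBuilder("line");
        sb.append(", ").append(lineNumber);
        sb.append(", ").append(left);
        sb.append(", ").append(top);
        sb.append(", ").append(right);
        sb.append(", ").append(bottom);
        // Every word gets its own field, even if it is empty or a single space.
        for (String word : words) {
            sb.append(", ").append(word);
        }
        sb.append(", ").append(meanConfidence).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("", "@-", " ");
        System.out.print(formatLine(8, 0, 771, 2876, 834, words, 57));
    }
}
```

Empty and space-only words keep their position here because each one still occupies a field. A word that itself contains a comma would still make the record ambiguous, so a real implementation might want to escape or quote the words.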
>
> I am using tess-two as the base.
>
>
> I define a new API in 
> /jni/com_googlecode_tesseract_android/src/api/baseapi.cpp 
>
> char* TessBaseAPI::GetLineWordConfidenceText(int page_number) 
>
> (the definition is long, so it is omitted here)
>
> I declare the method in baseapi.h.
>
>
> Then in /jni/com_googlecode_tesseract_android/tessbaseapi.cpp, I add:
>
> - - - - - - - - - - - - - - - - 
>
> jstring Java_com_googlecode_tesseract_android_TessBaseAPI_nativeGetLineWordConfidenceText(
>     JNIEnv *env, jobject thiz, jint page_number) {
>   native_data_t *nat = get_native_data(env, thiz);
>
>   char *text = nat->api.GetLineWordConfidenceText((int) page_number);
>
>   jstring result = env->NewStringUTF(text);
>
>   free(text);
>
>   return result;
> }
>
> - - - - - - - - - - - - - - - - 
>
>
> Then in /src/com.googlecode.tesseract.android/TessBaseAPI.java, I add:
>
> - - - - - - - - - - - - - - - - 
>
> public String getLineWordConfidenceText() {
>   // Trim because the text will have extra line breaks at the end
>   String text = nativeGetLineWordConfidenceText(0);
>
>   return text.trim();
> }
>
> - - - - - - - - - - - - - - - - 
>
>
>  Now in my android app (based on SimpleAndroidOCR from 
> https://github.com/GautamGupta/Simple-Android-OCR), I add:
>
> - - - - - - - - - - - - - - - - 
>
> String str_LineWordConfidenceText = baseApi.getLineWordConfidenceText();
>
> - - - - - - - - - - - - - - - - 
>
>
> When I run the app, LogCat shows the following error:
>
> - - - - - - - - - - - - - - - - 
>
> 11-18 10:36:34.334: E/AndroidRuntime(12775): FATAL EXCEPTION: main
> 11-18 10:36:34.334: E/AndroidRuntime(12775): java.lang.UnsatisfiedLinkError: nativeGetLineWordConfidenceText
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at com.googlecode.tesseract.android.TessBaseAPI.nativeGetLineWordConfidenceText(Native Method)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at com.googlecode.tesseract.android.TessBaseAPI.getLineWordConfidenceText(TessBaseAPI.java:378)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at com.datumdroid.android.ocr.simple.SimpleAndroidOCRActivity.onPhotoTaken(SimpleAndroidOCRActivity.java:360)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at com.datumdroid.android.ocr.simple.SimpleAndroidOCRActivity.onActivityResult(SimpleAndroidOCRActivity.java:168)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at android.app.Activity.dispatchActivityResult(Activity.java:3997)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at android.app.ActivityThread.deliverResults(ActivityThread.java:2905)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at android.app.ActivityThread.handleSendResult(ActivityThread.java:2961)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at android.app.ActivityThread.access$2000(ActivityThread.java:132)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1068)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at android.os.Handler.dispatchMessage(Handler.java:99)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at android.os.Looper.loop(Looper.java:150)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at android.app.ActivityThread.main(ActivityThread.java:4263)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at java.lang.reflect.Method.invokeNative(Native Method)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at java.lang.reflect.Method.invoke(Method.java:507)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:839)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:597)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at dalvik.system.NativeStart.main(Native Method)
>
> - - - - - - - - - - - - - - - - 
>
>
> How can I correct it? (Sorry, I am new to Android, Java, and JNI.)
>
> Thanks a lot in advance!
>
