I solved it, and I am writing it down here in case it is useful to somebody else.
The problem was that I had not rebuilt tess-two after adding the new code! After I rebuilt it, the new API was available. I must say that following the examples of the other APIs in the code makes it easy to implement a new one, which saved me a lot of time I would otherwise have spent reading the JNI tutorial.

BTW, I used some helper methods to support the new API. Just so you know: do not include those supporting methods in baseapi.h.

On Sunday, November 18, 2012 10:46:20 AM UTC-6, Linda Li wrote:
> I am using the tesseract-ocr-android library.
> See https://github.com/rmtheis/tess-two
>
> Now I want to add a new API.
>
> (1) Current output API getUTF8Text()
>
> The current output API getUTF8Text() outputs the text with the words
> separated by a space. However, I find that a word can itself be a space,
> so we cannot tell word indexes from the getUTF8Text() output. Without
> word indexes, we cannot match the results of the wordBoundingBoxes and
> wordConfidences APIs to each word.
>
> For example, for this line:
>
> line 8:
> “ @- 7- - @ - - @ ”
>
> How can you tell how many words are in the line?
>
> In the hOCR text, we can see:
>
> <span class='ocr_line' id='line_8' title="bbox 0 771 2876 834">
> <span class='ocrx_word' id='word_125' title="bbox 0 805 70 834"></span>
> <span class='ocrx_word' id='word_126' title="bbox 184 773 318 831">@-</span>
> <span class='ocrx_word' id='word_127' title="bbox 387 820 389 822"> </span>
> <span class='ocrx_word' id='word_128' title="bbox 452 771 550 816">7-</span>
> <span class='ocrx_word' id='word_129' title="bbox 583 823 585 825"> </span>
> <span class='ocrx_word' id='word_130' title="bbox 616 808 622 815">-</span>
> <span class='ocrx_word' id='word_131' title="bbox 758 819 812 832"></span>
> <span class='ocrx_word' id='word_132' title="bbox 839 821 841 823"> </span>
> <span class='ocrx_word' id='word_133' title="bbox 865 818 888 831"></span>
> <span class='ocrx_word' id='word_134' title="bbox 923 802 1016 830"></span>
> <span class='ocrx_word' id='word_135' title="bbox 1214 816 1216 819">@</span>
> <span …..
>
> Some words are spaces, some are empty.
>
> (2) I want to add a new API that outputs the result as a string in the
> following format:
> - - - - - - - - - - - - - - - -
> line, 1, left, top, right, bottom, word1, word2, word3, word4, word5, meanConfidenceOfThisLine \n
> line, 2, left, top, right, bottom, word1, word2, meanConfidenceOfThisLine \n
> - - - - - - - - - - - - - - - -
>
> I am using tess-two as the base.
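
(Inline note.) The line format proposed above could be read back on the Java side roughly as follows. This is only a sketch under stated assumptions, not the code that went into the library: the class and field names are made up for illustration, fields are assumed to be separated by exactly ", ", words are assumed never to contain that separator, and the mean confidence is assumed to be an integer. Splitting with a negative limit keeps empty words, and not trimming the word fields keeps words that are a single space.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; none of these names exist in tess-two.
class LineRecordSketch {

    static final class LineRecord {
        final int lineNumber;
        final int left, top, right, bottom;
        final List<String> words;   // may contain "" and " " entries
        final int meanConfidence;   // assumed to be an integer

        LineRecord(int lineNumber, int left, int top, int right, int bottom,
                   List<String> words, int meanConfidence) {
            this.lineNumber = lineNumber;
            this.left = left;
            this.top = top;
            this.right = right;
            this.bottom = bottom;
            this.words = words;
            this.meanConfidence = meanConfidence;
        }
    }

    // Parses one record of the form
    // "line, N, left, top, right, bottom, word1, ..., meanConfidence \n".
    static LineRecord parse(String record) {
        // A limit of -1 keeps empty fields, so empty words are not dropped.
        String[] f = record.split(", ", -1);
        if (f.length < 7 || !f[0].equals("line")) {
            throw new IllegalArgumentException("Unexpected record: " + record);
        }
        int lineNumber = Integer.parseInt(f[1].trim());
        int left = Integer.parseInt(f[2].trim());
        int top = Integer.parseInt(f[3].trim());
        int right = Integer.parseInt(f[4].trim());
        int bottom = Integer.parseInt(f[5].trim());

        // Everything between the bounding box and the last field is a word.
        List<String> words = new ArrayList<>();
        for (int i = 6; i < f.length - 1; i++) {
            words.add(f[i]); // deliberately not trimmed: a word may be " "
        }
        // The last field carries the trailing " \n" from the format.
        int meanConfidence = Integer.parseInt(f[f.length - 1].trim());
        return new LineRecord(lineNumber, left, top, right, bottom, words, meanConfidence);
    }

    public static void main(String[] args) {
        LineRecord r = parse("line, 8, 0, 771, 2876, 834, , @-,  , 7-, 85 \n");
        System.out.println("line " + r.lineNumber + " has " + r.words.size() + " words");
    }
}
```

Because the word count comes from the field positions rather than from splitting on spaces, each word index can be matched against the corresponding bounding box and confidence.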
>
> I define a new API in
> /jni/com_googlecode_tesseract_android/src/api/baseapi.cpp
>
> char* TessBaseAPI::GetLineWordConfidenceText(int page_number)
>
> (The definition is long, so it is omitted here.)
>
> I declare the method in baseapi.h.
>
> Then in /jni/com_googlecode_tesseract_android/tessbaseapi.cpp, I add:
> - - - - - - - - - - - - - - - -
> jstring Java_com_googlecode_tesseract_android_TessBaseAPI_nativeGetLineWordConfidenceText(JNIEnv *env,
>                                                                                           jobject thiz,
>                                                                                           jint page_number) {
>
>   native_data_t *nat = get_native_data(env, thiz);
>
>   char *text = nat->api.GetLineWordConfidenceText((int) page_number);
>
>   jstring result = env->NewStringUTF(text);
>
>   free(text);
>
>   return result;
> }
> - - - - - - - - - - - - - - - -
>
> Then in /src/com.googlecode.tesseract.android/TessBaseAPI.java, I add:
> - - - - - - - - - - - - - - - -
> public String getLineWordConfidenceText() {
>   // Trim because the text will have extra line breaks at the end
>   String text = nativeGetLineWordConfidenceText(0);
>
>   return text.trim();
> }
> - - - - - - - - - - - - - - - -
>
> Now in my Android app (based on SimpleAndroidOCR from
> https://github.com/GautamGupta/Simple-Android-OCR), I add:
> - - - - - - - - - - - - - - - -
> String str_LineWordConfidenceText = baseApi.getLineWordConfidenceText();
> - - - - - - - - - - - - - - - -
>
> I run the app.
>
> LogCat shows the error information:
> - - - - - - - - - - - - - - - -
> 11-18 10:36:34.334: E/AndroidRuntime(12775): FATAL EXCEPTION: main
> 11-18 10:36:34.334: E/AndroidRuntime(12775): java.lang.UnsatisfiedLinkError: nativeGetLineWordConfidenceText
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at com.googlecode.tesseract.android.TessBaseAPI.nativeGetLineWordConfidenceText(Native Method)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at com.googlecode.tesseract.android.TessBaseAPI.getLineWordConfidenceText(TessBaseAPI.java:378)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at com.datumdroid.android.ocr.simple.SimpleAndroidOCRActivity.onPhotoTaken(SimpleAndroidOCRActivity.java:360)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at com.datumdroid.android.ocr.simple.SimpleAndroidOCRActivity.onActivityResult(SimpleAndroidOCRActivity.java:168)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at android.app.Activity.dispatchActivityResult(Activity.java:3997)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at android.app.ActivityThread.deliverResults(ActivityThread.java:2905)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at android.app.ActivityThread.handleSendResult(ActivityThread.java:2961)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at android.app.ActivityThread.access$2000(ActivityThread.java:132)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at android.app.ActivityThread$H.handleMessage(ActivityThread.java:1068)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at android.os.Handler.dispatchMessage(Handler.java:99)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at android.os.Looper.loop(Looper.java:150)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at android.app.ActivityThread.main(ActivityThread.java:4263)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at java.lang.reflect.Method.invokeNative(Native Method)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at java.lang.reflect.Method.invoke(Method.java:507)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:839)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:597)
> 11-18 10:36:34.334: E/AndroidRuntime(12775): at dalvik.system.NativeStart.main(Native Method)
> - - - - - - - - - - - - - - - -
>
> How can I correct it? (Sorry, I am new to Android, Java, and JNI.)
>
> Thanks a lot in advance!
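
For anyone who hits the same error: an UnsatisfiedLinkError like the one in the quoted log is raised when a Java method is declared native but no loaded native library provides a matching implementation, which is the state a project is in when the Java wrapper has been edited but the native library has not been rebuilt. The sketch below reproduces that state in plain Java so the failure can be recognized early. It is an illustration only: the class name and the describeBinding helper are hypothetical, and no native library is loaded at all, as a stand-in for a stale one.

```java
// Hypothetical demo class; it is not part of tess-two.
class NativeBindingCheck {

    // Declared native, but no library that implements it is ever loaded.
    // This stands in for a Java wrapper that is newer than the built library.
    private static native String nativeGetLineWordConfidenceText(int pageNumber);

    // Calls the native method once and reports whether it is bound.
    static String describeBinding() {
        try {
            nativeGetLineWordConfidenceText(0);
            return "bound";
        } catch (UnsatisfiedLinkError e) {
            // Thrown at call time because the method has no native implementation.
            return "unbound: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describeBinding());
    }
}
```

If a check like this reports the method as unbound in a real project, rebuilding the native code and making sure the app picks up the freshly built library is the first thing to try, as described at the top of this message.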

