Hi, I did the merge and also updated my repo with build instructions.
Changes:
- Extended ETEXT_DESC with a PROGRESS_FUNC field, so users of the API can register a callback function to get notified of the progress percentage as well as word bounding boxes. (Most people I have shown my app to really liked how it highlighted the current word while doing the OCR.)
- Changed the percentage progress values to start at 0% instead of 30%.
- Added row attributes to the hOCR output so that I can make straighter lines when creating the PDF files.

Cheers
Renard

On Monday, 20 May 2013 11:42:17 UTC+2, Nick White wrote:
>
> Hi Renard,
>
> Great, I'm glad you're merging them into the latest Tesseract
> revision. Could you then post the patches into the Tesseract bug
> tracker? http://code.google.com/p/tesseract-ocr/issues/list
>
> > But one change that I really need is the option to pass in a monitor to the
> > API. One reason is to cancel the OCR process from my app and the other reason
> > is so that I can show a progress bar to the user.
>
> That sounds like the sort of thing that others using the API would
> find useful too. If you can make the change in such a way that
> existing code using the API continues to work, I expect that could
> be incorporated into Tesseract too.
>
> Nick
>
> On Sat, May 18, 2013 at 07:01:52AM -0700, Renard Wellnitz wrote:
> > Hi Nick,
> >
> > thanks for the diff. I made the changes a long time ago and I did not really
> > remember everything. But the diff greatly helped me to remember :-)
> > I am currently merging my changes into the latest revision of Tesseract. I am also
> > trying to move as much of my code as possible out of your source files.
> > I will also create a new git repo that should be much smaller. I will also
> > mavenize the whole project so that it can be built with just one command.
> >
> > Cheers
> > Renard
> >
> > On Thursday, 16 May 2013 12:40:15 UTC+2, Nick White wrote:
> >
> > > That looks right, thanks for that.
> > >
> > > I'll try to take a proper look soon and figure out how best to
> > > upstream stuff, and where it's worth doing so. In the meantime I'll
> > > attach the .diff (very small; only 200 lines), in case anyone else
> > > is interested, and so I don't forget ;)
> > >
> > > Nick
> > >
> > > On Wed, May 15, 2013 at 07:18:42AM -0700, Renard Wellnitz wrote:
> > > > Hi Nick,
> > > >
> > > > here is the console output:
> > > >
> > > > localhost:tesseract-ocr-3.02 renard$ svn log -r COMMITTED
> > > > ------------------------------------------------------------------------
> > > > r705 | zde...@gmail.com | 2012-03-15 22:05:12 +0100 (Thu, 15 Mar 2012) | 1 line
> > > >
> > > > fixed build in java directory; create documentation package with 'make doc-pack'
> > > > ------------------------------------------------------------------------
> > > >
> > > > Cheers
> > > > Renard
> > > >
> > > > On Wednesday, 15 May 2013 14:28:35 UTC+2, Nick White wrote:
> > > >
> > > > > I'm no expert with SVN, but I think this command will tell me what I
> > > > > want to know:
> > > > >
> > > > > svn log -r COMMITTED
> > > > >
> > > > > Thanks.
> > > > >
> > > > > On Wed, May 15, 2013 at 04:02:34AM -0700, Renard Wellnitz wrote:
> > > > > > Hi Nick,
> > > > > >
> > > > > > I'm not really proficient with svn. Maybe this helps? If you want me to run a
> > > > > > specific svn command I'll gladly do it.
> > > > > >
> > > > > > localhost:tesseract-ocr-3.02 renard$ svn ls "^/tags"
> > > > > > release-2.04/
> > > > > > release-3.00/
> > > > > > release-3.00.1/
> > > > > > release-3.01/
> > > > > > release-3.02.01/
> > > > > > release-3.02.02/
> > > > > > localhost:tesseract-ocr-3.02 renard$ svnversion .
> > > > > > 705M
> > > > > > localhost:tesseract-ocr-3.02 renard$
> > > > > >
> > > > > > I do not remember the exact changes. But my main goal was to get progress
> > > > > > information during the OCR process so that my app could show the bounding boxes
> > > > > > of the currently processed word.
> > > > > >
> > > > > > Cheers
> > > > > > Renard
> > > > > >
> > > > > > On Wednesday, 15 May 2013 11:37:26 UTC+2, Nick White wrote:
> > > > > >
> > > > > > > Ah, I see it's pretty close to 3.02.01 (now only available as an SVN
> > > > > > > tag). Am I correct in thinking that's the release you used? Or was
> > > > > > > it an SVN revision near it?
> > > > > > >
> > > > > > > Thanks again,
> > > > > > >
> > > > > > > Nick
> > > > > > >
> > > > > > > On Wed, May 15, 2013 at 10:30:29AM +0100, Nick White wrote:
> > > > > > > > Hi Renard,
> > > > > > > >
> > > > > > > > This is awesome, great job :)
> > > > > > > >
> > > > > > > > I was interested to see what changes you'd made to tesseract, so ran
> > > > > > > > 'diff -r' on the tesseract-ocr-3.02 directory in github, but a quick
> > > > > > > > look made it seem quite different to the
> > > > > > > > tesseract-ocr-3.02.02.tar.gz currently available from Tesseract.
> > > > > > > >
> > > > > > > > Am I correct in thinking that? Is it based on a version from SVN? If
> > > > > > > > so, which? If not, I'll just have to spend more time with diff ;-)
> > > > > > > >
> > > > > > > > I'd be keen to try and isolate and generalise any changes you made
> > > > > > > > and get them back into the core code, if I can.
> > > > > > > >
> > > > > > > > Thanks for all this lovely free code!
> > > > > > > >
> > > > > > > > Nick
> > > > > > > >
> > > > > > > > On Tue, May 14, 2013 at 01:51:15PM -0700, Renard Wellnitz wrote:
> > > > > > > > > Hi Tom,
> > > > > > > > >
> > > > > > > > > I decided to publish the code of the app under the Apache 2 licence. However,
> > > > > > > > > the C++ code that deals with image processing uses the stricter GPL v3, since
> > > > > > > > > that is where I put in a lot of effort.
> > > > > > > > >
> > > > > > > > > The project still needs a readme and instructions on how to build the binaries.
> > > > > > > > > For someone with a bit of Android/NDK experience it should not be a big problem,
> > > > > > > > > however.
> > > > > > > > > Readme and build instructions will follow in a couple of days.
> > > > > > > > >
> > > > > > > > > https://github.com/renard314/textfairy
> > > > > > > > >
> > > > > > > > > Cheers!
> > > > > > > > > Renard

--
--
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to tesseract-ocr@googlegroups.com
To unsubscribe from this group, send email to
tesseract-ocr+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en

---
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
Index: ccmain/ltrresultiterator.h
===================================================================
--- ccmain/ltrresultiterator.h (revision 844)
+++ ccmain/ltrresultiterator.h (working copy)
@@ -110,6 +110,8 @@
                                  int* pointsize,
                                  int* font_id) const;
 
+  void RowAttributes(float* row_height, float* descenders, float* ascenders) const;
+
   // Return the name of the language used to recognize this word.
   // On error, NULL. Do not delete this pointer.
   const char* WordRecognitionLanguage() const;
Index: ccmain/ltrresultiterator.cpp
===================================================================
--- ccmain/ltrresultiterator.cpp (revision 844)
+++ ccmain/ltrresultiterator.cpp (working copy)
@@ -161,6 +161,14 @@
   return 0.0f;
 }
 
+void LTRResultIterator::RowAttributes( float* row_height,
+                                       float* descenders,
+                                       float* ascenders) const{
+  *row_height = it_->row()->row->x_height() + it_->row()->row->ascenders() - it_->row()->row->descenders();
+  *descenders = it_->row()->row->descenders();
+  *ascenders = it_->row()->row->ascenders();
+}
+
 // Returns the font attributes of the current word. If iterating at a higher
 // level object than words, eg textlines, then this will return the
 // attributes of the first word in that textline.
Index: ccmain/control.cpp
===================================================================
--- ccmain/control.cpp (revision 844)
+++ ccmain/control.cpp (working copy)
@@ -243,7 +243,11 @@
       word_index++;
       if (monitor != NULL) {
         monitor->ocr_alive = TRUE;
-        monitor->progress = 30 + 50 * word_index / stats_.word_count;
+        monitor->progress = 70 * word_index / stats_.word_count;
+        if (monitor->progress_callback!=NULL){
+          TBOX box = page_res_it.word()->word->bounding_box();
+          (*monitor->progress_callback)(monitor->progress,box.left(), box.right(), box.top(), box.bottom());
+        }
         if (monitor->deadline_exceeded() ||
             (monitor->cancel != NULL &&
              (*monitor->cancel)(monitor->cancel_this, stats_.dict_words)))
@@ -316,7 +320,10 @@
       word_index++;
       if (monitor != NULL) {
         monitor->ocr_alive = TRUE;
-        monitor->progress = 80 + 10 * word_index / stats_.word_count;
+        monitor->progress = 70 + 30 * word_index / stats_.word_count;
+        if (monitor->progress_callback!=NULL){
+          (*monitor->progress_callback)(monitor->progress,0,0,0,0);
+        }
         if (monitor->deadline_exceeded() ||
             (monitor->cancel != NULL &&
              (*monitor->cancel)(monitor->cancel_this, stats_.dict_words)))
Index: ccutil/ocrclass.h
===================================================================
--- ccutil/ocrclass.h (revision 844)
+++ ccutil/ocrclass.h (working copy)
@@ -101,6 +101,7 @@
  * the OCR engine is storing its output to shared memory.
  * During progress, all the buffer info is -1.
  * Progress starts at 0 and increases to 100 during OCR. No other constraint.
+ * Additionally the progress callback contains the bounding box of the word that is currently being processed
  * Every progress callback, the OCR engine must set ocr_alive to 1.
  * The HP side will set ocr_alive to 0. Repeated failure to reset
  * to 1 indicates that the OCR engine is dead.
@@ -108,6 +109,7 @@
  * user words found. If it returns true then operation is cancelled.
  **********************************************************************/
 typedef bool (*CANCEL_FUNC)(void* cancel_this, int words);
+typedef bool (*PROGRESS_FUNC)(int progress, int left, int right, int top, int bottom );
 
 class ETEXT_DESC { // output header
  public:
@@ -117,6 +119,7 @@
   volatile inT8 ocr_alive; // ocr sets to 1, HP 0
   inT8 err_code; // for errcode use
   CANCEL_FUNC cancel; // returns true to cancel
+  PROGRESS_FUNC progress_callback;/*called whenever progress increases*/
   void* cancel_this; // this or other data for cancel
   struct timeval end_time; // time to stop. expected to be set only by call
                            // to set_deadline_msecs()
Index: tessdata/Makefile.am
===================================================================
--- tessdata/Makefile.am (revision 844)
+++ tessdata/Makefile.am (working copy)
@@ -1,4 +1,4 @@
-datadir = @datadir@/tessdata
+dir = @datadir@/tessdata
 
 SUBDIRS = configs tessconfigs
 
Index: api/baseapi.cpp
===================================================================
--- api/baseapi.cpp (revision 844)
+++ api/baseapi.cpp (working copy)
@@ -71,6 +71,9 @@
 #include "version.h"
 #endif
 
+/* Version number of package */
+#define VERSION "3.02"
+
 namespace tesseract {
 
 /** Minimum sensible image size to be worth running tesseract. */
@@ -1062,17 +1065,32 @@
  * STL removed from original patch submission and refactored by rays.
  */
 char* TessBaseAPI::GetHOCRText(int page_number) {
+  return GetHOCRText(NULL,page_number);
+}
+
+
+/**
+ * Make a HTML-formatted string with hOCR markup from the internal
+ * data structures.
+ * page_number is 0-based but will appear in the output as 1-based.
+ * Image name/input_file_ can be set by SetInputName before calling
+ * GetHOCRText
+ * STL removed from original patch submission and refactored by rays.
+ */
+char* TessBaseAPI::GetHOCRText(struct ETEXT_DESC* monitor, int page_number) {
   if (tesseract_ == NULL ||
-      (page_res_ == NULL && Recognize(NULL) < 0))
+      (page_res_ == NULL && Recognize(monitor) < 0))
     return NULL;
 
   int lcnt = 1, bcnt = 1, pcnt = 1, wcnt = 1;
   int page_id = page_number + 1; // hOCR uses 1-based page numbers.
+  float row_height, descenders, ascenders;
 
   STRING hocr_str("");
 
-  if (input_file_ == NULL)
+  if (input_file_ == NULL) {
     SetInputName(NULL);
+  }
 
 #ifdef _WIN32
   // convert input name from ANSI encoding to utf-8
@@ -1121,6 +1139,11 @@
     }
     if (res_it->IsAtBeginningOf(RIL_TEXTLINE)) {
       hocr_str.add_str_int("\n <span class='ocr_line' id='line_", lcnt);
+      res_it->RowAttributes(&row_height,&descenders, &ascenders);
+      hocr_str.add_str_int("' font='", 15);
+      hocr_str.add_str_int("' size='", row_height);
+      hocr_str.add_str_int("' descenders='", descenders * -1);
+      hocr_str.add_str_int("' ascenders='", ascenders);
       AddBoxTohOCR(res_it, RIL_TEXTLINE, &hocr_str);
     }
 
Index: api/baseapi.h
===================================================================
--- api/baseapi.h (revision 844)
+++ api/baseapi.h (working copy)
@@ -521,8 +521,20 @@
    * Make a HTML-formatted string with hOCR markup from the internal
    * data structures.
    * page_number is 0-based but will appear in the output as 1-based.
+   * monitor can be used to
+   *  cancel the recognition
+   *  receive progress callbacks
    */
+  char* GetHOCRText(struct ETEXT_DESC* monitor, int page_number);
+
+  /**
+   * Make a HTML-formatted string with hOCR markup from the internal
+   * data structures.
+   * page_number is 0-based but will appear in the output as 1-based.
+   */
   char* GetHOCRText(int page_number);
+
+
   /**
    * The recognized text is returned as a char* which is coded in the same
    * format as a box file used in training. Returned string must be freed with
Index: api/capi.cpp
===================================================================
--- api/capi.cpp (revision 844)
+++ api/capi.cpp (working copy)
@@ -319,7 +319,7 @@
 
 TESS_API char* TESS_CALL TessBaseAPIGetHOCRText(TessBaseAPI* handle, int page_number)
 {
-    return handle->GetHOCRText(page_number);
+    return handle->GetHOCRText(NULL,page_number);
 }
 
 TESS_API char* TESS_CALL TessBaseAPIGetBoxText(TessBaseAPI* handle, int page_number)