Hacking ocropus

I've gotten to the point of being able to hack ocropus moderately efficiently; below are the steps to get to where I am. They involve building ocropus/iulib with additional debug information, configuring the project in NetBeans, setting up the logger and realtime display, and using debug mode to step through the code. I started in Eclipse CDT, but NetBeans proved to have better C++ support.

To ocropus developers:
The mercurial repository has an ocr-tesseract directory which breaks the build.
Missing graphviz causes an exception; add "sudo apt-get install graphviz" (the package that provides dot) to the "ubuntu" script.
iulib autoconf can't find "bithacks.h".
What is the recommended way of submitting ocropus bug fixes (i.e., patches)?

==============

Note: paths are taken relative to the root of the iulib and ocropus installation.

-Download and install Ubuntu 9.04
    http://ubuntu.osuosl.org/releases/jaunty/ubuntu-9.04-desktop-i386.iso

-Install JDK
    sudo apt-get install openjdk-6-jdk

-Install NetBeans 6.7 RC2

-Install Mercurial
    sudo apt-get install mercurial

-Install scons
    sudo apt-get install scons

-Install autoconf
    sudo apt-get install autoconf

-Install ocropus
    hg clone http://mercurial.iupr.org/iulib
    hg clone http://mercurial.iupr.org/ocropus
    sudo sh -x ocropus/ubuntu
    ... (answer Yes to all questions)
    cd iulib

--Add debugging information to the iulib library
In the SConstruct file, search for "Compiler flags" and change them to
    -g3 -fno-inline -O0
This increases the size of libiulib.a from 2892454 to 4902210, but it also makes stepping through the code a lot easier.

-Continue installing ocropus
    scons
    sudo scons install
    cd ../ocropus
    rm -r ocr-tesseract
    ./build

--Add debugging information to ocropus
By default, compilation inlines functions, which makes step-by-step debugging more difficult. Open the Makefile and change the line starting with "CXXFLAGS" to
    CXXFLAGS = -g3 -O0 -fno-inline -fno-eliminate-unused-debug-types -fopenmp
Then execute make from ./ocropus.

-In NetBeans, create a new project
File->New Project->C/C++->C/C++ project with existing sources->Automatic configuration. Then right-click the project in the left-hand pane and choose "Set as Main Project".

-Get a sample dataset
Download http://ocropus.googlegroups.com/web/lines.tgz, save it to . (the root directory), and unpack it:
    tar -zxvf lines.tgz

-Train model
Either train a simple model on the lines dataset (see the instructions at http://code.google.com/p/ocropus/wiki/Using), or use an existing model (http://yaroslavvb.com/upload/ocropus/lines.model or ./ocropus/data/models/default.model).

-Set Run properties
Create the directory ./ocrolog. Right-click your project, select Properties, and go to the Run options. Set the Run directory to the root of the iulib and ocropus installation, and add the following environment variables:
    ocrolog=glr
    ocrologdir=ocrolog
For Arguments, set
    recognize1 lines.model lines/0000/0001.png

-Debug project and find bug
Open ocr-commands.cc and go to line 1313 (use Navigate/Go to File and Navigate/Go to Line), then click in the left margin to set a breakpoint. Click the "Debug Main Project" button. When NetBeans asks for the main executable, choose "ocropus" in ./ocropus (not in ./ocropus/commands).
When the debugger reaches the breakpoint, use "Step Into" to go inside recognize1, then "Step Over" to continue inside the method. Make sure the Variables view is open (Window/Debugging/Variables) and observe the value of argc there. If it doesn't show, click the "Add Watch" button and add a watch for "argc". You can also add watches for more general expressions.
Continue stepping over the function and notice that the main loop inside recognize1 is never entered because of the value of argc. Correct the starting point of the loop: instead of "for(int i=3;", put in "for (int i=2;".

-Displaying information
Set the environment variable debug to "info,detail" to log two kinds of messages. To find more valid message types, search for "debugf" (Edit->Find in Projects).
To get a realtime graphical display of intermediate results, set the environment variable dgraphics, for instance to "setline". To find more valid keywords, search for "dsection". Run the program in debug mode to get the graphical window to show (in regular mode, I get no graphics window and the program waits for something forever).
Logging information goes to ./ocrolog/index.html; real-time graphical information shows in a separate window that pauses the program and needs keyboard input to continue.

-General hacking notes
You can get an idea of what some pieces do by looking at the Mercurial history. For instance, right-click on linerec.cc->Mercurial->History and choose the "Diff" view.
Most navigation can be done with "Go to Definition". Sometimes this fails, and you need Navigate->Go to Symbol. If that fails too, use Edit->Find in Projects.
When debugging, you can add a Watch for an arbitrary expression, not just a variable name.
