Hacking ocropus.

I've gotten to the point of being able to hack on ocropus moderately
efficiently; below are the steps to get to where I am. They involve
building ocropus/iulib with additional debug information, setting up
the project in NetBeans, setting up the logger/realtime display, and
using debug mode to step through the code. I started in Eclipse CDT,
but NetBeans proved to have better C++ support.

To ocropus developers: the mercurial repository has an ocr-tesseract
directory which breaks the build. Missing graphviz causes an
exception; the "ubuntu" script should install it, e.g. with
"sudo apt-get install graphviz" (the package that provides dot).
The iulib autoconf can't find "bithacks.h". What is the recommended
way of submitting ocropus bug fixes? (i.e., what are patches?)
==============
Note: paths are relative to the directory that contains both the iulib
and ocropus checkouts (referred to below as the root).

-Download and Install Ubuntu 9.04
http://ubuntu.osuosl.org/releases/jaunty/ubuntu-9.04-desktop-i386.iso

-Install JDK
sudo apt-get install openjdk-6-jdk

-Install NetBeans 6.7 RC2

-Install Mercurial
sudo apt-get install mercurial

-Install scons
sudo apt-get install scons

-Install autoconf
sudo apt-get install autoconf

-Install ocropus
hg clone http://mercurial.iupr.org/iulib
hg clone http://mercurial.iupr.org/ocropus
sudo sh -x ocropus/ubuntu  ... (answer Yes to all questions)
cd iulib

-- Add debugging information to iulib library
In the SConstruct file, search for "Compiler flags" and change them to
"-g3 -fno-inline -O0". This increases the size of libiulib.a from
2892454 to 4902210 bytes, but it makes stepping through the code a lot
easier.

- Continue installing ocropus
scons
sudo scons install
cd ../ocropus
rm -r ocr-tesseract
./build

-- Add debugging information to ocropus
(by default, compilation inlines functions, which makes step-by-step
debugging more difficult)
Open the Makefile and change the line starting with "CXXFLAGS" to
"CXXFLAGS = -g3 -O0 -fno-inline -fno-eliminate-unused-debug-types -fopenmp"

Then run make from ./ocropus
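
To see why these flags matter, here is a minimal sketch. It is not
ocropus code; the names scale and sum_scaled are made up for
illustration. With optimization enabled, g++ will typically inline a
small helper like scale, so "Step Into" has no separate frame to
enter at the call site. Built with -g3 -O0 -fno-inline it stays a real
function that the debugger can stop in.

```cpp
#include <cstddef>
#include <vector>

// Small helper of the kind an optimizing build usually inlines.
static int scale(int x) {
    return 2 * x;
}

// Calls scale() once per element. With -O0 -fno-inline every call is a
// separate stack frame that "Step Into" can enter.
int sum_scaled(const std::vector<int>& values) {
    int total = 0;
    for (std::size_t i = 0; i < values.size(); i++) {
        total += scale(values[i]);
    }
    return total;
}
```

To compare, call sum_scaled from a small driver, build it once with
"g++ -g3 -O0 -fno-inline" and once with "g++ -g -O2", and try stepping
into scale() in each build.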

-In NetBeans create new project
File->New Project->C/C++->C/C++ Project with Existing Sources, and
choose Automatic configuration.
Right-click on the project in the left-hand pane and choose "Set as
Main Project".

-Get a sample dataset
Download http://ocropus.googlegroups.com/web/lines.tgz, save it to .
(the root directory), and unpack it:
tar -zxvf lines.tgz

-Train model
Either train a simple model on the lines dataset (see the instructions
at http://code.google.com/p/ocropus/wiki/Using), or use an existing
model (http://yaroslavvb.com/upload/ocropus/lines.model or
./ocropus/data/models/default.model)

-Set Run properties
Create the directory ./ocrolog
Right-click on your project, select Properties, and go to the Run
options. Set Run directory to the root of the iulib and ocropus
installation, and add the following Environment variables:
ocrolog=glr, ocrologdir=ocrolog. For Arguments, set
"recognize1 lines.model lines/0000/0001.png"

-Debug project and find bug:
Go to line 1313 of ocr-commands.cc (use Navigate/Go to File, then
Navigate/Go to Line) and click in the left margin to set a breakpoint.
Click the "Debug Main Project" button. When NetBeans asks for the main
executable, choose "ocropus" in ./ocropus (not in ./ocropus/commands).
When the debugger stops at that line, use "Step Into" to go inside
recognize1, then "Step Over" to continue inside the function.
Make sure the Variables view is open (Window/Debugging/Variables).
Observe the value of argc in that view. If it doesn't show, click the
"Add Watch" button and add a watch for "argc". You can also add
watches for more general expressions.
Continue stepping over the function and notice that the main loop
inside recognize1 is never entered because of the argc value.
Correct the starting index of the loop: instead of "for(int i=3;", put
in "for(int i=2;"
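
The effect of the starting index can be reproduced outside ocropus
with a minimal sketch. It assumes that recognize1 receives its own
name as argv[0], the model as argv[1], and image files from argv[2]
on, which is consistent with the Arguments set above but is an
assumption about the actual code. visited_args is a made-up helper,
not an ocropus function.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the loop in recognize1: returns the
// arguments that a loop starting at index `first` would visit.
std::vector<std::string> visited_args(int argc, const char* const* argv,
                                      int first) {
    std::vector<std::string> visited;
    for (int i = first; i < argc; i++) {
        visited.push_back(argv[i]);
    }
    return visited;
}
```

With argv = {"recognize1", "lines.model", "lines/0000/0001.png"} and
argc = 3, starting at 3 visits nothing, while starting at 2 visits the
single image file.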

-Displaying information:
Set the environment variable debug to "info,detail" to log two kinds of
messages. To find other valid message keys, search for "debugf"
(Edit->Find in Projects). To get a realtime graphical display of
intermediate results, set the environment variable dgraphics, for
instance to "setline". To find other valid keywords, search for
"dsection". Run the program in debug mode to get the graphics window
to show (in a regular run I get no graphics window, and the program
waits forever for something). Logging information goes to
./ocrolog/index.html; realtime graphical information shows in a
separate window that pauses the program and needs keyboard input to
continue.
- General hacking notes:
You can get an idea of what some pieces do by looking at the Mercurial
history. For instance, right-click on linerec.cc->Mercurial->History
and choose the "Diff" view.

Most navigation can be done with "Go to Definition". Sometimes this
fails, and you need Navigate->Go to Symbol. If that fails too, use
Edit->Find in Projects.

When debugging, you can add a Watch for an arbitrary expression, not
just a variable name.