I'm having similar issues and hope you get an answer to this question.

Thomas L. Packer
~~~~~~~~~~~~~~~~~~~~


-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf
Of groupmeister
Sent: Tuesday, March 09, 2010 9:30 PM
To: ocropus
Subject: introduction and request for help getting up and running

Howdy,

A few days ago I posted a message asking for some help getting up and
running, but I haven't seen it show up. I notice now that there is a
note about posting an introduction of yourself before posting to the
group. Maybe that's why it hasn't shown up. Sorry I didn't notice that
before; I've been using Usenet for 21 years and am not used to things
working that way. Moderated groups have existed forever, but I hadn't
come across one that requires a personal introduction first. I'm happy
to do that, though.

I'm a physician, and used to be a computer programmer. I have a
medical services software company that may have a use for OCR in
medical records management. We are interested in OCRopus. I lead a
small development team, but I want to try out OCRopus myself and see
what it's capable of. If it looks good enough to
help us, I will start some projects on our end to make use of it,
which would almost certainly lead to code contributions from us. We
have made significant contributions to a number of other open source
projects already in the course of our work.

The developer I was going to assign this to looked into things and he
said he thought the project was in an unstable alpha phase and was not
likely to be useful to us. Looking through things myself, I don't
think that's true. Also I see that the next release (0.5) will be
easier to integrate into other software products. Anyway, I want to
see for myself if this can be helpful to us or not.

At any rate, I have been trying to do a proof-of-concept install of
OCRopus to evaluate what it can do, and I can't get it to produce
non-garbage output. I believe I have followed the step-by-step install
instructions for Ubuntu properly. I'm wondering if there's something
I'm missing: do I need to train the system for a while before using
it? It seems to come with pre-built models. Are there settings I can
tweak for processor speed, etc.? Maybe the fact that I'm running it in
a VM affects things.

If anyone could help me get up and running, I'd appreciate it. The
full details of my installation problem are pasted below, from my
first post attempt.

Thanks very much-

Jack



Howdy.... I am wondering if anyone can help me get up and running on
Ubuntu 9.10. I have a company that runs some custom-made software
packages on its own servers, and I'm wondering if there's a chance
this OCR package could be useful to us. If so, I may ask our IT people
to look at integrating it into some tools we use. There is definitely
potential for us to contribute back to the project if we wind up using
it. I myself used to be a hobbyist computer programmer until around
1995, after which I didn't have any time for it.

Anyway, sorry for all the irrelevant babble. I have a VMware "virtual
appliance" image of Ubuntu 9.10 desktop, which I downloaded from the
operating systems section of the virtual appliance marketplace on
VMware's website. I installed ocropus on that virtual Ubuntu
appliance, which appeared to go without a hitch, though frankly there
were one or two errors which appeared to be non-fatal and which, I
apologize profusely, I didn't copy down.

At any rate, I have ocropus fully up and running, but it always comes
up with garbage whatever I throw at it, whether with the one-line page
command or the recommended book2lines, etc., sequence of commands.

My questions:

(1) Does it come out-of-the-box trained on a recognition model, or is
it necessary to train it on a model to get it to work?

(2) When I test it out-of-the-box on a sample document that contains
what I think is easily recognizable text, I get this:

d...@ubuntu:~/ocropus$ ocropus page test.png
[info] got 895 bboxes
[info] all = 0
-~-- -. .
c|T 5G
[error] beam search failed
r2#,:-Y//>Ni| o :
%d : 6!5## ## #a Bk|v|
%25` ?Yl,?67o7 ao`|2/#|/e
9"q . "|`'+,oo<1|n|c
%;: .e;::|:rt: |--- e(< |`42 ?6: 5a:
tt+c:~ EMn-- //. ~#
;#; ..%::;:::- "-"##i
o,OB< y2~/<X#<, #|e.#--|
[error] beam search failed
#;;;. .;:o:::
ew7.$;:L#]i:I,r5t{- ..s7AT -- o3/||/|q ot24..
[error] beam search failed
## a: |;;A,%|2:2:# t
Tn%=c~
g;V;#U%-|r:#Y6:" ## #
;aa2<5J"| ::Li%Y:Ic <
f? ;;z%;;a-. #`-""9## ` ` @`|h "boo| | RR| su|
##V-.-.
;Jg
l e|i ' ` ,"d "||V,,#| |,| ,,t / Z  ~ J 4#/ s
#g;a :+. c
|osr Acct Ac7 ` 1a:#/ ! .Z -2 # / l ~ lll]yl l!l[g# l
[error] beam search failed
.---- #9" "66.` ` eer|#arent a-aa|ohe
# i 5`- | c7 R|An wi7Rout covro #j ]l;l+l#l[Ill:l
[error] beam search failed
[error] beam search failed
or. ="n--
'#4=7#%:G;|;7-f:|::b
`o|< #~#), >/<, "##,. #-,'~
[error] beam search failed
;<.';:L:#i:|-r5t|-
[error] beam search failed
g;#<;U|%; :uc::-# e ge. ~
[error] beam search failed
,c-.- .
.
-~-.
---. 9. ..- "` .=9&` ##o$o#>|
vad...@ubuntu:~/ocropus$
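
To put a rough number on how bad a run like the one above is, something
along these lines could tally the error lines and the share of letters
in the recognized text. This is only a sketch in plain Python, not part
of OCRopus; the function name summarize_ocr_log and the letter-ratio
heuristic are assumptions made up for illustration, and the sample is a
few lines copied from the output above.

```python
def summarize_ocr_log(text):
    """Separate log lines from recognized text and compute simple counts."""
    errors = 0
    recognized = []
    for line in text.splitlines():
        if line.startswith("[error]"):
            errors += 1                 # e.g. "[error] beam search failed"
        elif line.startswith("[info]"):
            continue                    # progress messages, not OCR output
        elif line.strip():
            recognized.append(line)     # everything else is treated as OCR text
    # Hypothetical heuristic: fraction of non-space characters that are letters.
    chars = "".join(recognized).replace(" ", "")
    letters = sum(c.isalpha() for c in chars)
    ratio = letters / len(chars) if chars else 0.0
    return {"errors": errors, "lines": len(recognized), "letter_ratio": ratio}


sample = """[info] got 895 bboxes
[info] all = 0
-~-- -. .
c|T 5G
[error] beam search failed
|osr Acct Ac7 ` 1a:#/ ! .Z -2 # / l ~ lll]yl l!l[g# l
"""
summary = summarize_ocr_log(sample)
print(summary)
```

A low letter ratio combined with many "beam search failed" lines would
at least make it easy to compare runs after changing the input image or
the model.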



(3) When I execute the command to check out the default model, I see:

vad...@ubuntu:~/ocropus$ ocropus cin
model'
Linerec
linerec_verbose=0
linerec_grouper=SimpleGrouper
linerec_use_reject=1
linerec_use_priors=0
linerec_invert=1
linerec_space_fractile=0.5
linerec_space_min=0.2
linerec_minheight=10
linerec_maxheight=300
linerec_space_max=1.1
linerec_space_yes=1
linerec_maxaspect=1
linerec_segmenter=DpSegmenter
linerec_classifier=latin
linerec_space_multiplier=2
linerec_extractor=scaledfe
linerec_cpreload=none
linerec_space_no=5
linerec_minclass=32
linerec_maxcost=20
linerec_maxrange=5
linerec_minprob=1e-06
segmenter: curved cut segmenter
grouper: SimpleGrouper
counts: 126 2309208
CHARCLASS MODEL
MLP
mlp_normalization=-1
mlp_hidden_hi=80
mlp_noopt=0
mlp_hidden_lo=20
mlp_cv_max=5000
mlp_cds=rowdataset8
mlp_eta=0.5
mlp_miters=8
mlp_hidden_varlog=1.2
mlp_sparse=-1
mlp_hidden_min=5
mlp_hidden_max=300
mlp_rounds=8
mlp_nensemble=4
mlp_%error=0.0267507
mlp_eta_varlog=1.5
mlp_eta_init=0.5
mlp_crossvalidate=1
mlp_extractor=none
mlp_cv_split=0.8
mlp_%nsamples=2.30921e+06
ninput 900 nhidden 90 noutput 93
w1 [-15.0474,25.0991] b1 [-24.2937,
w2 [-24.3254,14.7334] b2 [-5.92876,
JUNKCLASS MODEL
MLP
mlp_normalization=-1
mlp_hidden_hi=80
mlp_noopt=0
mlp_hidden_lo=20
mlp_cv_max=5000
mlp_cds=rowdataset8
mlp_eta=0.5
mlp_miters=8
mlp_hidden_varlog=1.2
mlp_sparse=-1
mlp_hidden_min=5
mlp_hidden_max=300
mlp_rounds=8
mlp_nensemble=4
mlp_%error=0.0232608
mlp_eta_varlog=1.5
mlp_eta_init=0.5
mlp_crossvalidate=1
mlp_extractor=none
mlp_cv_split=0.8
mlp_%nsamples=5.27385e+06
ninput 900 nhidden 103 noutput 2
w1 [-34.5243,30.0879] b1 [-29.1538,
w2 [-12.044,12.044] b2 [-0.0241522,
ULCLASS MODEL
MLP
mlp_normalization=-1
mlp_hidden_hi=80
mlp_noopt=0
mlp_hidden_lo=20
mlp_cv_max=5000
mlp_cds=rowdataset8
mlp_eta=0.5
mlp_miters=8
mlp_hidden_varlog=1.2
mlp_sparse=-1
mlp_hidden_min=5
mlp_hidden_max=300
mlp_rounds=8
mlp_nensemble=4
mlp_eta_varlog=1.5
mlp_eta_init=0.5
mlp_crossvalidate=1
mlp_extractor=none
mlp_cv_split=0.8
ninput 0 nhidden 0 noutput 0
vad...@ubuntu:~/ocropus$
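
One thing that stands out in the dump above is that the ULCLASS network
reports "ninput 0 nhidden 0 noutput 0" while the other two have real
sizes. Whether that is normal for this model is not known here. A small
sketch like the following, in plain Python and not an OCRopus tool,
could pull the sizes out of such a dump and list any network that is
all zeros; the function name model_sizes is hypothetical, and the dump
string is abbreviated from the output above.

```python
import re

# Matches the size line printed for each network in the dump.
SIZE_RE = re.compile(r"ninput (\d+) nhidden (\d+) noutput (\d+)")


def model_sizes(dump):
    """Map each '<NAME> MODEL' heading to its (ninput, nhidden, noutput)."""
    sizes = {}
    current = None
    for line in dump.splitlines():
        line = line.strip()
        if line.endswith(" MODEL"):
            current = line[: -len(" MODEL")]   # e.g. "CHARCLASS"
        else:
            match = SIZE_RE.match(line)
            if match and current is not None:
                sizes[current] = tuple(int(g) for g in match.groups())
    return sizes


dump = """CHARCLASS MODEL
MLP
ninput 900 nhidden 90 noutput 93
JUNKCLASS MODEL
MLP
ninput 900 nhidden 103 noutput 2
ULCLASS MODEL
MLP
ninput 0 nhidden 0 noutput 0
"""
sizes = model_sizes(dump)
# A network whose three dimensions are all zero is reported as empty.
empty = [name for name, dims in sizes.items() if not any(dims)]
print(sizes)
print("empty networks:", empty)
```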



Question: Am I doing something wrong? I don't understand why the
results of ocropus page are such garbage. I see a few places where
there are 3 characters in a row that I suspect may have been correctly
identified.

Should I be able to run it right out of the box, following the install
instructions, and running it using the page option immediately?

The VMware appliance is running on a 2.4 GHz Core 2 Duo Mac running
OS X 10.6.2, VMware v 3.02


-- 
You received this message because you are subscribed to the Google Groups
"ocropus" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to
[email protected].
For more options, visit this group at
http://groups.google.com/group/ocropus?hl=en.
