Howdy. I am wondering if anyone can help me get up and running on
Ubuntu 9.10. My company runs some custom-made software packages on
our own servers, and I'm wondering whether this OCR package could be
useful to us. If so, I may ask our IT people to look at integrating it
into some tools we use. There is definitely potential for us to
contribute back to the project if we wind up using it. I myself used
to be a hobbyist computer programmer until around 1995, after which I
didn't have any time for it.

Anyway, sorry for all the irrelevant babble. I have a VMware "virtual
appliance" image of the Ubuntu 9.10 desktop version, which I downloaded
from the operating systems section of the virtual appliance
marketplace on VMware's website. I installed ocropus on that virtual
Ubuntu appliance, which appeared to go without a hitch, though frankly
there were one or two errors that appeared to be non-fatal and which,
I apologize profusely, I didn't copy down.

At any rate, I have ocropus fully up and running, but it always comes
up with garbage whatever I throw at it, whether with the one-line page
command or the recommended sequence of commands (book2lines, etc.).

My questions:

(1) Does it come out-of-the-box trained on a recognition model, or is
it necessary to train it on a model to get it to work?

(2) When I test it out-of-the-box on a sample document that contains
what I think is easily recognizable text, I get this:

d...@ubuntu:~/ocropus$ ocropus page test.png
[info] got 895 bboxes
[info] all = 0
-~-- -. .
c|T 5G
[error] beam search failed
r2#,:-Y//>Ni| o :
%d : 6!5## ## #a Bk|v|
%25` ?Yl,?67o7 ao`|2/#|/e
9"q . "|`'+,oo<1|n|c
%;: .e;::|:rt: |--- e(< |`42 ?6: 5a:
tt+c:~ EMn-- //. ~#
;#; ..%::;:::- "-"##i
o,OB< y2~/<X#<, #|e.#--|
[error] beam search failed
#;;;. .;:o:::
ew7.$;:L#]i:I,r5t{- ..s7AT -- o3/||/|q ot24..
[error] beam search failed
## a: |;;A,%|2:2:# t
Tn%=c~
g;V;#U%-|r:#Y6:" ## #
;aa2<5J"| ::Li%Y:Ic <
f? ;;z%;;a-. #`-""9## ` ` @`|h "boo| | RR| su|
##V-.-.
;Jg
l e|i ' ` ,"d "||V,,#| |,| ,,t / Z  ~ J 4#/ s
#g;a :+. c
|osr Acct Ac7 ` 1a:#/ ! .Z -2 # / l ~ lll]yl l!l[g# l
[error] beam search failed
.---- #9" "66.` ` eer|#arent a-aa|ohe
# i 5`- | c7 R|An wi7Rout covro #j ]l;l+l#l[Ill:l
[error] beam search failed
[error] beam search failed
or. ="n--
'#4=7#%:G;|;7-f:|::b
`o|< #~#), >/<, "##,. #-,'~
[error] beam search failed
;<.';:L:#i:|-r5t|-
[error] beam search failed
 g;#<;U|%; :uc::-# e ge. ~
[error] beam search failed
,c-.- .
.
-~-.
---. 9. ..- "` .=9&` ##o$o#>|
vad...@ubuntu:~/ocropus$



(3) When I execute the command to check out the default model, I see:

vad...@ubuntu:~/ocropus$ ocropus cin
model'
Linerec
linerec_verbose=0
linerec_grouper=SimpleGrouper
linerec_use_reject=1
linerec_use_priors=0
linerec_invert=1
linerec_space_fractile=0.5
linerec_space_min=0.2
linerec_minheight=10
linerec_maxheight=300
linerec_space_max=1.1
linerec_space_yes=1
linerec_maxaspect=1
linerec_segmenter=DpSegmenter
linerec_classifier=latin
linerec_space_multiplier=2
linerec_extractor=scaledfe
linerec_cpreload=none
linerec_space_no=5
linerec_minclass=32
linerec_maxcost=20
linerec_maxrange=5
linerec_minprob=1e-06
segmenter: curved cut segmenter
grouper: SimpleGrouper
counts: 126 2309208
 CHARCLASS MODEL
 MLP
 mlp_normalization=-1
 mlp_hidden_hi=80
 mlp_noopt=0
 mlp_hidden_lo=20
 mlp_cv_max=5000
 mlp_cds=rowdataset8
 mlp_eta=0.5
 mlp_miters=8
 mlp_hidden_varlog=1.2
 mlp_sparse=-1
 mlp_hidden_min=5
 mlp_hidden_max=300
 mlp_rounds=8
 mlp_nensemble=4
 mlp_%error=0.0267507
 mlp_eta_varlog=1.5
 mlp_eta_init=0.5
 mlp_crossvalidate=1
 mlp_extractor=none
 mlp_cv_split=0.8
 mlp_%nsamples=2.30921e+06
 ninput 900 nhidden 90 noutput 93
 w1 [-15.0474,25.0991] b1 [-24.2937,
 w2 [-24.3254,14.7334] b2 [-5.92876,
 JUNKCLASS MODEL
 MLP
 mlp_normalization=-1
 mlp_hidden_hi=80
 mlp_noopt=0
 mlp_hidden_lo=20
 mlp_cv_max=5000
 mlp_cds=rowdataset8
 mlp_eta=0.5
 mlp_miters=8
 mlp_hidden_varlog=1.2
 mlp_sparse=-1
 mlp_hidden_min=5
 mlp_hidden_max=300
 mlp_rounds=8
 mlp_nensemble=4
 mlp_%error=0.0232608
 mlp_eta_varlog=1.5
 mlp_eta_init=0.5
 mlp_crossvalidate=1
 mlp_extractor=none
 mlp_cv_split=0.8
 mlp_%nsamples=5.27385e+06
 ninput 900 nhidden 103 noutput 2
 w1 [-34.5243,30.0879] b1 [-29.1538,
 w2 [-12.044,12.044] b2 [-0.0241522,
 ULCLASS MODEL
 MLP
 mlp_normalization=-1
 mlp_hidden_hi=80
 mlp_noopt=0
 mlp_hidden_lo=20
 mlp_cv_max=5000
 mlp_cds=rowdataset8
 mlp_eta=0.5
 mlp_miters=8
 mlp_hidden_varlog=1.2
 mlp_sparse=-1
 mlp_hidden_min=5
 mlp_hidden_max=300
 mlp_rounds=8
 mlp_nensemble=4
 mlp_eta_varlog=1.5
 mlp_eta_init=0.5
 mlp_crossvalidate=1
 mlp_extractor=none
 mlp_cv_split=0.8
 ninput 0 nhidden 0 noutput 0
vad...@ubuntu:~/ocropus$



Question: Am I doing something wrong? I don't understand why the
results of ocropus page are such garbage. I see a few places where
three characters in a row may have been correctly identified.

Should I be able to run it right out of the box, following the install
instructions, and running it using the page option immediately?

The VMware appliance is running on a 2.4 GHz Core 2 Duo Mac running
OS X 10.6.2, with VMware v3.02.
