[ocropus] "Meaning" of OCR costs?

Benjamin Lambert Fri, 28 Jan 2011 16:00:26 -0800

Hi all,

I have been experimenting successfully with a few toy language models for 
OCRopus.  I'm unsure how to set the weights in the language model because I 
don't know how the costs from the optical part of the recognition are 
determined.  I think I read somewhere that they are meant to be negative log 
likelihoods.  Is that right, and if so, in which base?
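
To make the question concrete, here is a minimal Python sketch. It assumes the 
optical costs really are negative log likelihoods, which is not confirmed for 
OCRopus. The function names and the lm_weight parameter are hypothetical and 
not part of any OCRopus API. It shows why the base matters: changing the base 
only rescales a cost by a constant, but that constant has to agree between 
the optical costs and the language model costs before they are added.

```python
import math

def prob_to_cost(p, base=math.e):
    """Negative log likelihood of probability p, expressed in the given base."""
    return -math.log(p) / math.log(base)

def convert_cost(cost, from_base, to_base):
    """Re-express a negative-log cost in another base (a constant rescaling)."""
    return cost * math.log(from_base) / math.log(to_base)

def combined_cost(ocr_cost, lm_cost, lm_weight=1.0):
    """Weighted sum of optical and language model costs.

    Assumes both costs are negative log likelihoods in the same base;
    lm_weight is a hypothetical tuning knob for the language model.
    """
    return ocr_cost + lm_weight * lm_cost

# A character with probability 0.5 costs 1 bit, or ln(2) nats.
cost_bits = prob_to_cost(0.5, base=2)
cost_nats = convert_cost(cost_bits, from_base=2, to_base=math.e)
total = combined_cost(cost_nats, lm_cost=prob_to_cost(0.1), lm_weight=0.5)
```

If the two cost sources used different bases, the mismatch would be absorbed 
into lm_weight, so a weight tuned for one setup would not carry over to 
another.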


I suppose the answer may also depend on the model(s) in use.  Right now, I'm 
still using just the character/line model, but I noticed that it contains 
some parameters in plain text (see below).  Is there any documentation on 
what these mean?  If not, could you point me to the relevant source code?

Best,
Ben

From ocropus/data/models/default.model:

<object>
linerec
linerecverbose=0
grouper=SimpleGrouper
use_reject=1
use_priors=0
invert=1
space_fractile=0.5
space_min=0.2
minheight=10
maxheight=300
space_max=1.1
space_yes=1
maxaspect=1
segmenter=DpSegmenter
classifier=latin
space_multiplier=2
extractor=scaledfe
cpreload=none
space_no=5
minclass=32
maxcost=20
maxrange=5
minprob=1e-06
END_OF_PARAMETERS=HERE
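
For inspecting these values outside OCRopus, here is a rough Python sketch 
that reads such a block into a dictionary. The layout it assumes (key=value 
lines terminated by END_OF_PARAMETERS) is inferred only from the excerpt 
above. parse_params and coerce are hypothetical helpers, not the OCRopus 
loader, and the sketch says nothing about what each parameter means.

```python
def coerce(value):
    """Try int, then float; otherwise keep the raw string."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value

def parse_params(text):
    """Collect key=value lines until the END_OF_PARAMETERS marker.

    Lines without '=' (such as '<object>' or 'linerec') are skipped.
    """
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("END_OF_PARAMETERS"):
            break
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        params[key] = coerce(value)
    return params

# A few lines copied from the excerpt above.
sample = """<object>
linerec
grouper=SimpleGrouper
space_fractile=0.5
maxcost=20
minprob=1e-06
END_OF_PARAMETERS=HERE
"""
params = parse_params(sample)
```

The real model file presumably continues with binary or serialized data 
after the marker, which is why the sketch stops reading there.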
