Re: Appending new ground truth to the default language model

Tom Fri, 03 Sep 2010 23:09:06 -0700

Well, it is trained, just not discriminatively trained.  What's
shipping is just a model built from word frequencies; we have tried
n-gram models with back-off, but those aren't ready for prime time yet.
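
To make "a model using word frequencies" concrete, here is a rough
sketch in Python. It is not the OCRopus code: the function names, the
toy corpora, and the unknown-word cost are all made up for
illustration. It also shows why new ground truth can't simply be
appended: every cost depends on the total word count, so the model is
rebuilt from the combined text.

```python
import math
from collections import Counter

def build_unigram_costs(texts):
    """Count words over all texts and convert frequencies to costs (-log p).

    Hypothetical helper for illustration only, not an OCRopus function.
    """
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    total = sum(counts.values())
    return {word: -math.log(n / total) for word, n in counts.items()}

def sequence_cost(costs, words, unknown_cost=20.0):
    """Sum per-word costs; unknown_cost is an arbitrary assumed penalty."""
    return sum(costs.get(w, unknown_cost) for w in words)

# Toy data standing in for the original corpus and the new ground truth.
base_corpus = ["the cat sat on the mat", "the dog sat"]
new_ground_truth = ["the zygote sat"]

base_model = build_unigram_costs(base_corpus)
# "Appending" means recounting everything: each cost depends on the total.
combined_model = build_unigram_costs(base_corpus + new_ground_truth)
```

Without the original counts (that is, without the original text), there
is nothing to re-normalize against, which is why a small model built
from scratch is the practical fallback.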


Tom

On Aug 20, 9:53 pm, Marcin <[email protected]> wrote:
> That's what I feared. It's not the end of the world, though. I can
> live with small models created from scratch for now. Thanks again for
> your time, Ilya.
>
> On Aug 20, 2:33 am, Ilya Mezhirov <[email protected]> wrote:
>
> > The language model isn't exactly trained, at least AFAIK, more like
> > constructed.
> > It's similar to a regexp like ((a | aaron | abacus | ... | zygote)
> > ( |,|.|!|?))* except more complicated and with probabilities on arcs.
> > One can't just add stuff to it, it has to be recreated from scratch. I
> > don't know how this is done currently.
>
> > On Aug 20, 8:37 am, Marcin <[email protected]> wrote:
> > > Thanks for your reply, Ilya, but I'm afraid I'm still none the wiser
> > > here. I know I can create a deterministic and minimal model from raw
> > > text files, but how do I add it to the default model that comes with
> > > Ocropus? I don't want to have to create a new comprehensive one from
> > > scratch because I don't have enough training data. Are there any other
> > > tools you know of?

