For specific questions about what you can and cannot do with OpenFST,
you might also want to try the OpenFST mailing list; people there have
much more experience with building complex language models.

Tom

On Aug 19, 11:37 pm, Marcin <[email protected]> wrote:
> Thanks for your reply Ilya, but I'm afraid I'm still none the wiser
> here. I know I can create a deterministic and minimal model from raw
> text files, but how do I add it to the default model that comes with
> Ocropus? I don't want to have to create a new comprehensive one from
> scratch because I don't have enough training data. Are there any other
> tools you know of?
>
> On Aug 19, 9:45 am, Ilya Mezhirov <[email protected]> wrote:
>
> > Yes, default.fst can't be determinized. There are some conditions
> > (which I don't remember) that an FST must meet to be determinizable,
> > but an acyclic FST should always work. So you can make a word model
> > first, determinize/minimize it, and then create cycles to get a line
> > model.
>
> > On Aug 19, 9:06 am, Marcin <[email protected]> wrote:
>
> > > I'm trying to build my own language model by extending the default one
> > > at /usr/local/share/ocropus/models/default.fst. Following the example
> > > of ocropus-linefst and fstutils, I'm doing the following:
>
> > > fst = openfst.StdVectorFst.Read("/usr/local/share/ocropus/models/default.fst")
> > > filenames = glob.glob("training/*.gt.txt")
> > > for filename in filenames:
> > >     file = open(filename)
> > >     for line in file.readlines():
> > >         l = line.strip()
> > >         if not l:
> > >             continue
> > >         fstutils.add_line(fst, l)
>
> > > det = Fst()
> > > openfst.Determinize(fst, det)
> > > (...)
>
> > > The rest is truncated because I never get there. The Determinize
> > > function aborts the program with the message:
>
> > > FATAL: StringWeight::Plus: unequal arguments (non-functional FST?)
>
> > > Is this even supposed to work? The same crash happens when I run
> > > Determinize on the original model, i.e. without running the for loop
> > > above. I suppose I should load the default model into an Ocropus
> > > container created with ocropy.make_OcroFST(), but then I can't use the
> > > functions in fstutils, which expect StdVectorFst. Does anyone have any
> > > advice here?
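
To make the suggestion quoted above more concrete (build an acyclic word
model, determinize/minimize it, then add cycles to get a line model), here
is a rough sketch of the idea. It is a pure-Python stand-in and does not
use the OpenFST or fstutils API; the function names are made up for
illustration. It also treats the model as an unweighted acceptor over
characters, whereas default.fst is a weighted transducer, so take it as a
picture of the order of operations rather than something to drop into
Ocropus. It assumes the words themselves contain no spaces.

```python
def build_word_model(words):
    """Build an acyclic, deterministic automaton (a trie) from a word list."""
    trans = {0: {}}          # state -> {character: next state}
    finals = set()
    for word in words:
        if not word:
            continue
        state = 0
        for ch in word:
            if ch not in trans[state]:
                new_state = len(trans)
                trans[new_state] = {}
                trans[state][ch] = new_state
            state = trans[state][ch]
        finals.add(state)
    return trans, finals, 0


def minimize(trans, finals, start):
    """Merge states with identical futures. Only valid while acyclic."""
    registry = {}            # signature -> new state id
    memo = {}                # old state -> new state id
    new_trans, new_finals = {}, set()

    def canon(state):
        if state in memo:
            return memo[state]
        arcs = tuple(sorted((ch, canon(dst))
                            for ch, dst in trans[state].items()))
        sig = (state in finals, arcs)
        if sig not in registry:
            new_id = len(registry)
            registry[sig] = new_id
            new_trans[new_id] = dict(arcs)
            if sig[0]:
                new_finals.add(new_id)
        memo[state] = registry[sig]
        return memo[state]

    new_start = canon(start)
    return new_trans, new_finals, new_start


def add_word_cycles(trans, finals, start, separator=" "):
    """Turn the word model into a line model: a separator after any
    complete word leads back to the start state."""
    for state in finals:
        trans[state][separator] = start


def accepts(trans, finals, start, text):
    state = start
    for ch in text:
        state = trans[state].get(ch)
        if state is None:
            return False
    return state in finals


words = ["cat", "cats", "bat", "bats"]
trans, finals, start = build_word_model(words)
min_trans, min_finals, min_start = minimize(trans, finals, start)
add_word_cycles(min_trans, min_finals, min_start)
print(len(trans), "states before minimization,", len(min_trans), "after")
print(accepts(min_trans, min_finals, min_start, "cat bats"))
```

The point of the ordering is that minimization here relies on the
automaton still being acyclic; the cycles are added only as the last step.
With the real library the same order would apply, but the arcs would also
carry output labels and weights.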

