@ondrej - Yes, Yulia's lattices look like confusion networks in disguised
so there will be a large number of paths through the lattice.

the memory explosion is due to my code creating an object for every path.
It was mainly for the reason mention previously above, ie:
   I want to give each feature function the opprtunity to score with full
knowledge of the path.

However, the old binary phrase-table doesn't require these objects to do
the lookups. Therefore, to enable Yulia and anyone else to decode large
lattices, my code will not run when
   1. decoding lattice/confusion networks, AND
   2. using the old binary phrase table.

@Liang - thanks for the suggestions. I'm not sure how our lattice were
created. Lexi knows

thanks for all who responded, was very useful.



On 4 October 2013 22:20, Ondrej Bojar <[email protected]> wrote:

> Hi,
>
> while you can always run rmepsilon from openfst or other toolkit, epsilon
> edges will be probably particularly useful if one would use different
> semirings for different components of the score vector. With generic
> toolkits, all the components of the score vector are processed in a single
> manner. Depending on whether Moses features do the "plus" of their
> respective scores on their own, each feature can use its own semiring.
>
> The probably (in some sense) maximal explosion in the number of paths is
> achieved when the lattice has the form of a confusion network (no
> epsilons). You get the full cartesian product of choices of the first
> token, the second token etc.
>
> Cheers, Ondrej.
>
> "Hieu Hoang" <[email protected]> wrote:
>
> >@nicola - i didn't see a reason either but some lattices from a speech
> >recognizer contains them so was just curious. I think chris has a point -
> >they may be easier to create.
> >
> >I think they may also more efficient to decode. In a non-deterministic
> >lattice, you might have the 2 edges with the same symbol coming out of 1
> >node. Each would have to be decoded separately.
> >
> >However, its a pain to decode epsilons and there might be weird edge
> cases,
> >eg. consecutive, beginning and end epsilons, entirely epsiloms.
> >
> >@chris - cheers for the explanation. i might use victor's code and see how
> >it goes.
> >
> >Do you have an example (large) lattice that blows up memory that you can
> >share?
> >
> >Yes - i've changed the code to extract all possible paths. In fact, i
> >extract all paths from beginning to end of sentence, without limit. 2
> >reasons for this
> >   1. I also divorced extracting the path creation from the phrase-table
> >lookup. In the general case there's multiple phrase-tables so it's
> >difficult to keep track of the tries. Also, the intertwinning of the
> binary
> >pt loookup with lattices made it difficult to read.
> >   2. I want to give each feature function the opprtunity to score with
> >full knowledge of the path.
> >
> >This may have to be altered if the memory explosion is too drastic
> >
> >
> >
> >
> >On 4 October 2013 17:49, Chris Dyer <[email protected]> wrote:
> >
> >> It's useful to have epsilons since it simplifies the creation of
> >> lattices in some cases. Yes, you can convert them to a deterministic
> >> equivalent, but that involves implementing FSA determinatization (or
> >> using a tool like https://pypi.python.org/pypi/pyfst), which may not
> >> be convenient.
> >>
> >> Btw, I've also noticed that memory usage with lattices/CNs explodes
> >> with non-binarized phrase tables (maybe also with binarized PTs?).
> >> This is independent of the size of the phrase table and only seems to
> >> be a function of the lattice structure. I'm not sure what's going on
> >> (the code has changed substantially since I last looked at it). But,
> >> you should always match paths in the lattice with paths in the phrase
> >> table trie- maybe moses is now trying to extract all possible paths in
> >> the lattice up to max-phrase-size or something?
> >>
> >> On Fri, Oct 4, 2013 at 11:22 AM, Nicola Bertoldi <[email protected]>
> wrote:
> >> > I don't see any reason why a lattice should contain an EPSILON edge.
> >> >
> >> > In a confusion network, EPSILON are needed to allow the translation of
> >> input of different lengths.
> >> > The sausage structure of the CN imposes the same amount of source
> words,
> >> > and the EPSILONs overcome this constraint.
> >> >
> >> > This is not the case for lattice, because you can have any number of
> >> edges/words in a complete source path.
> >> >
> >> >
> >> > cheers,
> >> > Nicola
> >> >
> >> >
> >> >
> >> > On Oct 4, 2013, at 2:52 PM, Hieu Hoang wrote:
> >> >
> >> > I'm just looking at the lattices decoding, as implemented in moses.
> >> >
> >> > for confusion networks, it's fair to have EPSILON words (that
> represent
> >> blank words). However, I don't see the point of them in lattices.
> >> >
> >> > Anyone have an opinion? How is it implemented in cdec & joshua?
> >> >
> >> > --
> >> > Hieu Hoang
> >> > Research Associate
> >> > University of Edinburgh
> >> > http://www.hoang.co.uk/hieu
> >> >
> >> > _______________________________________________
> >> > Moses-support mailing list
> >> > [email protected]<mailto:[email protected]>
> >> > http://mailman.mit.edu/mailman/listinfo/moses-support
> >> >
> >> >
> >> > _______________________________________________
> >> > Moses-support mailing list
> >> > [email protected]
> >> > http://mailman.mit.edu/mailman/listinfo/moses-support
> >>
> >> --
> >> You received this message because you are subscribed to the Google
> Groups
> >> "cdec users" group.
> >> To unsubscribe from this group and stop receiving emails from it, send
> an
> >> email to [email protected].
> >> For more options, visit https://groups.google.com/groups/opt_out.
> >>
> >
> >
> >
> >--
> >Hieu Hoang
> >Research Associate
> >University of Edinburgh
> >http://www.hoang.co.uk/hieu
> >_______________________________________________
> >Moses-support mailing list
> >[email protected]
> >http://mailman.mit.edu/mailman/listinfo/moses-support
>
> --
> Ondrej Bojar
> http://www.cuni.cz/~obo
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>



-- 
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support

Reply via email to