I will still get the blow-up, but since it is only generating phrase pairs for a single sentence, it should fit in memory. Of course decoding will take longer, but I'm eager to see what happens.
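To make the per-sentence idea concrete, here is a rough sketch in Python rather than Moses C++. It assumes the standard triangulation p(t|s) = sum over pivot phrases p of p(p|s) * p(t|p), applied only to source phrases that occur in the input sentence, with a top-k cut per source phrase as one possible way to prune. The function names (`sentence_phrases`, `pivot_for_sentence`), the parameters (`max_len`, `top_k`), and the nested-dict table layout are all hypothetical and are not part of Moses. It also tracks a single score per pair, which a real phrase table would not.

```python
from collections import defaultdict
import heapq


def sentence_phrases(tokens, max_len=3):
    """Enumerate every contiguous phrase of up to max_len tokens."""
    return {
        " ".join(tokens[i:j])
        for i in range(len(tokens))
        for j in range(i + 1, min(i + max_len, len(tokens)) + 1)
    }


def pivot_for_sentence(tokens, src_pivot, pivot_tgt, max_len=3, top_k=20):
    """Triangulate source->target phrase pairs for one sentence only.

    src_pivot: {source phrase: {pivot phrase: p(pivot | source)}}  (assumed layout)
    pivot_tgt: {pivot phrase: {target phrase: p(target | pivot)}}  (assumed layout)
    Returns {source phrase: [(target phrase, score), ...]}, best first,
    keeping at most top_k targets per source phrase.
    """
    result = {}
    for src in sentence_phrases(tokens, max_len):
        pivots = src_pivot.get(src)
        if not pivots:
            continue  # this source phrase is not in the source-pivot table
        scores = defaultdict(float)
        for piv, p_piv_given_src in pivots.items():
            for tgt, p_tgt_given_piv in pivot_tgt.get(piv, {}).items():
                # Marginalise over every pivot phrase linking src and tgt.
                scores[tgt] += p_piv_given_src * p_tgt_given_piv
        if scores:
            # Prune: keep only the top_k highest-scoring targets.
            result[src] = heapq.nlargest(
                top_k, scores.items(), key=lambda kv: kv[1]
            )
    return result
```

Memory here is bounded by the number of phrases in the sentence times `top_k`, not by the size of the full synthesized table. The work per source phrase still grows with the number of pivot and target candidates, so some pruning is needed even at run time.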
Sent from Samsung Mobile

-------- Original message --------
From: Barry Haddow <[email protected]>
Date: 24/10/2014 20:09 (GMT+09:00)
To: Raj Dabre <[email protected]>
Cc: [email protected]
Subject: Re: [Moses-support] Phrase pair generation at run time using Source-Pivot and Pivot-Target phrase tables

On 24/10/14 11:06, Raj Dabre wrote:
> That is a good starting point suggestion.
> Many thanks.
>
>     Perhaps an easier option would be to do the synthesis offline. In
>     other words, take your two tables and create a pivot table from them,
>     and then use it like a normal phrase table.
>
> This I have been doing for 6 months but the phrase tables generated
> are super-huge where the size almost is a square of the original size.
> I end up having to keep a threshold frequency and kill potentially
> good phrase pairs. Thats why I want to generate it online and keep all
> the pairs.

But if you synthesise the phrase pairs inside the decoder, then won't
you get the same n^2 blow-up? I think you have to find a good way to
prune however you implement the synthesis.

>
> Thanks again.
>
> On Fri, Oct 24, 2014 at 6:58 PM, Barry Haddow
> <[email protected]> wrote:
>
>     Hi Raj
>
>     You could create a custom phrase table implementation to produce
>     your synthesised phrase pairs. Have a look at the existing phrase
>     table implementations in moses/TranslationModel. In particular,
>     you need to subclass PhraseDictionary. The method
>     GetTargetPhraseCollectionLEGACY() returns a collection of phrase
>     pairs, given a source phrase.
>
>     Perhaps an easier option would be to do the synthesis offline. In
>     other words, take your two tables and create a pivot table from
>     them, and then use it like a normal phrase table.
>
>     cheers - Barry
>
>
>     On 21/10/14 11:14, Raj Dabre wrote:
>
>         Hello,
>         I am currently doing research on using pivot languages for
>         Phrase based SMT.
>
>         My current method involves the usage of alternate decoding
>         paths feature to combine multiple synthesized Source-Target
>         phrase tables. (I have noticed that not many people exploit
>         this method or even if they do.... they don't mention it clearly).
>
>         However pre-synthesized phrase tables need to be pruned to
>         remove low probability phrase pairs and I would like to
>         generate phrase pairs via a pivot at run time. I am ok with
>         taking additional decoding time.
>
>         I am aware that Bertoldi (2008) had already mentioned that he
>         had used this method but this is not present in the moses
>         decoder release.
>         I would very much like to implement this but do not know where
>         to start.
>         If someone could tell me the section of code that reads in
>         phrase pairs given a source phrase I think I might be able to
>         do something.
>         Any help would be appreciated.
>
>         Thanks in advance.
>
>         --
>         Raj Dabre.
>         Research Student,
>         Graduate School of Informatics,
>         Kyoto University.
>         CSE MTech, IITB., 2011-2014
>
>
>
>         _______________________________________________
>         Moses-support mailing list
>         [email protected]
>         http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
>
>     --
>     The University of Edinburgh is a charitable body, registered in
>     Scotland, with registration number SC005336.
>
>
>
>
> --
> Raj Dabre.
> Research Student,
> Graduate School of Informatics,
> Kyoto University.
> CSE MTech, IITB., 2011-2014
>

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
