You could also use the following WFST tools:

http://code.google.com/p/m2m-aligner/
http://www.isi.edu/licensed-sw/carmel/

Jörg

On Fri, Aug 24, 2012 at 11:14 AM, Francois Yvon <[email protected]> wrote:
> I concur with Marcin - GIZA has been repeatedly used for grapheme / phoneme
> alignment - I do not remember seeing complaints about non-monotonicity. See e.g.
>
> Gerosa, M., Federico, M.: Coping with out-of-vocabulary words: open versus huge
> vocabulary ASR. In: ICASSP, pp. 4313-4316 (2009)
>
> Laurent, A., Deleglise, P., Meignier, S.: Grapheme to phoneme conversion
> using a SMT system. In Proc. of Interspeech (2009)
>
> Taraka Rama, Anil Kumar Singh, Sudheer Kolachina. Modeling Letter-to-Phoneme
> Conversion as a Phrase Based Statistical Machine Translation Problem with
> Minimum Error Rate Training. Proc NAACL-HLT 2009
>
> F
> On 24/08/12 08:23, Marcin Junczys-Dowmunt wrote:
>> I have been using Giza (and Moses) quite successfully for letter<->phone
>> transcriptions and transcriptions between two different phonetic
>> alphabets just with the standard settings. If the data is monotone then
>> it is rather improbable that Giza will produce crossing alignments. I'd
>> guess it's just worth a try.
>>
>> On 24.08.2012 04:10, Chris Dyer wrote:
>>> It should be possible to adapt Giza's HMM implementation to produce
>>> monotone alignments. These are the changes that would be necessary
>>> (and which should be fairly easy, if you can figure out the code):
>>>
>>> 1) alignment distribution initialization. by default Giza initializes
>>> the HMM transition probabilities to be uniform (effectively making the
>>> first iteration of HMM training the same as one more iteration of
>>> Model 1). You would need to alter this to make "reverse" jumps have
>>> probability 0.
>>>
>>> 2) smoothing. by default, Giza does something to prevent probabilities
>>> from ending up zero (maybe add alpha?). This is fine for monotone
>>> jumps, but you want to make sure that "backward" jumps end up zero.
>>>
>>> I think adding this would have tremendous value.
>>>
>>> -Chris
>>>
>>> On Thu, Aug 23, 2012 at 7:53 PM, Philipp Koehn <[email protected]> wrote:
>>>> Hi,
>>>>
>>>> the IBM Models of GIZA++ are too complicated to be used
>>>> for simple monotone alignment. I am not aware of any
>>>> switches that would allow this either.
>>>>
>>>> I suggest to look at finite state machine tools such as
>>>> OpenFST - http://www.openfst.org/
>>>>
>>>> -phi
>>>>
>>>> On Wed, Aug 22, 2012 at 5:29 AM, Dario Ernst <[email protected]> wrote:
>>>>> Hello dear list,
>>>>>
>>>>> first off, i'm not quite sure this is the correct list to ask GIZA++
>>>>> questions - if not, please just tell me ;). I'm sorry for the trouble in
>>>>> that case.
>>>>>
>>>>> Anyways, my question. I'm currently trying to use GIZA++ together with
>>>>> PISA (http://pisa.googlecode.com/) to create monotone (linear?)
>>>>> alignments of words and phoneme-strings. For PISA i believe i've already
>>>>> found a way (thanks to the nice help of the author!), but for GIZA i'm a
>>>>> bit at a loss. Is there some external parameter that i can set, or would
>>>>> digging the source be necessary? If so (and i've already started to try
>>>>> to familiarize myself a bit with the GIZA internals), what would be a
>>>>> good starting point to look at? Unfortunately i'm not that good with
>>>>> SMT internals yet, so it'd be a bit hard for me ... so at this point any
>>>>> help, input and tips would be greatly appreciated!
>>>>>
>>>>> Best Regards from Germany (and, please excuse my bad English ;P), thanks
>>>>> for reading this ;)
>>>>>
>>>>> --
>>>>> Man's mind, once stretched by a new idea, never regains its original
>>>>> dimensions.
>>>>>   -- Oliver Wendell Holmes
>>>>> _______________________________________________
>>>>> Moses-support mailing list
>>>>> [email protected]
>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support

--
**********************************************************************************
Jörg Tiedemann                         [email protected]
Dep. of Linguistics and Philology      http://stp.lingfil.uu.se/~joerg/
Uppsala University                     tel: +46 (0)18 - 471 1412
Box 635, SE-751 26 Uppsala/SWEDEN      fax: +46 (0)18 - 471 1094
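
Chris Dyer's two suggested changes in the thread (zero out "reverse" jumps at initialization, and smooth only the monotone jumps) can be illustrated with a small toy sketch. This is not GIZA++ code and does not use GIZA++'s data structures: the function names `monotone_transitions` and `smooth_monotone`, the position-indexed table layout, the add-alpha smoothing, and the choice to treat "staying in place" (jump 0) as monotone are all assumptions made for illustration only.

```python
def monotone_transitions(n_positions):
    """Initial transition table, uniform over forward jumps only.

    trans[i][j] is the probability of moving from aligned position i to
    aligned position j. Backward jumps (j < i) get probability 0.
    Assumption: j == i (no movement) counts as monotone.
    """
    trans = []
    for i in range(n_positions):
        allowed = n_positions - i  # number of targets j with j >= i
        row = [1.0 / allowed if j >= i else 0.0 for j in range(n_positions)]
        trans.append(row)
    return trans


def smooth_monotone(counts, alpha=0.1):
    """Add-alpha smoothing restricted to forward jumps.

    counts[i][j] are expected jump counts (hypothetical values from an
    E-step). Forward jumps receive count + alpha; backward jumps are
    forced to 0 regardless of their count, then each row is renormalized.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive so every row has mass")
    n = len(counts)
    smoothed = []
    for i, row in enumerate(counts):
        kept = [(row[j] + alpha) if j >= i else 0.0 for j in range(n)]
        total = sum(kept)
        smoothed.append([v / total for v in kept])
    return smoothed


if __name__ == "__main__":
    init = monotone_transitions(3)
    for row in init:
        print(row)
    # Hypothetical counts; the backward entries (5 and 3) are discarded.
    new = smooth_monotone([[2, 1, 0], [5, 1, 0], [3, 0, 0]], alpha=0.5)
    for row in new:
        print(row)
```

Because a zero transition probability yields zero expected counts in EM, step 1 alone would already keep backward jumps at zero during training; step 2 matters only because smoothing would otherwise reintroduce mass for them.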
