Hi David,

Note that the extract and extract.inv files are both obtained from the symmetrised alignments (aligned.grow-diag-final-and). In fact they both contain the same information; they are just ordered differently, since one is used for p(f|e) and the other for p(e|f).
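To make the "same information, different order" point concrete, here is a minimal Python sketch. It assumes a three-field line layout (phrase ||| phrase ||| alignment points), and the function names and toy German-English lines are made up for illustration; none of this is taken from the Moses source. Grouping by the first field gives the relative frequency of the second field given the first, so the same counting routine run over each ordering yields the two directions.

```python
from collections import Counter


def invert_extract_line(line):
    """Swap the two phrases and flip every alignment point (i-j becomes j-i).

    Assumes the layout 'phrase ||| phrase ||| i-j i-j ...' (an assumption).
    """
    first, second, align = [field.strip() for field in line.split("|||")[:3]]
    flipped = " ".join("-".join(reversed(point.split("-"))) for point in align.split())
    return f"{second} ||| {first} ||| {flipped}"


def conditional_probs(lines):
    """Relative frequency of the second field given the first field."""
    pair_counts = Counter()
    first_counts = Counter()
    for line in lines:
        first, second = [field.strip() for field in line.split("|||")[:2]]
        pair_counts[(first, second)] += 1
        first_counts[first] += 1
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


# Toy data, invented for this example.
extract = [
    "das Haus ||| the house ||| 0-0 1-1",
    "das Haus ||| the building ||| 0-0 1-1",
    "Haus ||| house ||| 0-0",
]

# The inverse file carries the same phrase pairs, reordered and re-sorted.
extract_inv = sorted(invert_extract_line(line) for line in extract)

one_direction = conditional_probs(sorted(extract))
other_direction = conditional_probs(extract_inv)

print(one_direction[("das Haus", "the house")])
print(other_direction[("the house", "das Haus")])
```

Sorting each file by its first field is what lets the counts for one conditioning phrase be gathered in a single pass, which is why two orderings of the same pairs are kept.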
cheers - Barry

On 14/01/13 17:38, David Wilson-Parr wrote:
> Hi,
>
> I was wondering if there was any way to get a list of sentence ids in
> the final phrase table corresponding to where that phrase occurred?
>
> I noticed that the 'extract' program used in step (5) takes the argument
> '--IncludeSentenceId' and I tried this and it does include the ID (line
> number in corpus) in the extract.sorted and extract.inv.sorted however I
> don't suppose that these are still completely valid after the final
> phrase table is calculated after the score phrases (6.6) step which
> consolidates the normal and the inverse files together. Is there any
> 'idiots' process description of what the consolidate process does? I
> found the source code quite hard to follow.
>
> Also I didn't understand why the 'aligned.grow-diag-final-and' file is
> generated earlier which is an already combined version of the normal and
> inverse word alignments (I think - at least it seems to have many to
> many relationships in it!) if the processing then needs to go back to
> using them both separately.
>
> Sorry if I misunderstood something, I am just scratching the surface at
> the moment.
>
> Kind regards,
>
> Dave
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
