Thanks for looking at the problem. A test for alignment info is a good idea.
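For illustration, such a test could look like the minimal Python sketch below. It assumes the text phrase table uses "|||"-separated fields with the word alignment (pairs such as 0-0 1-2) in the fourth field, as in recent Moses training output. The function names are hypothetical and this is not the binarization tool's actual code.

```python
import re

# One alignment point looks like "0-0" or "3-12" (source index - target index).
ALIGN_TOKEN = re.compile(r"^\d+-\d+$")


def has_alignment_info(line: str) -> bool:
    """Return True if a phrase-table line carries a non-empty alignment field.

    Assumed layout (not verified against every Moses version):
        source ||| target ||| scores ||| alignment ||| counts
    """
    fields = [f.strip() for f in line.rstrip("\n").split("|||")]
    if len(fields) < 4:
        return False  # too few fields: no alignment column at all
    tokens = fields[3].split()
    return bool(tokens) and all(ALIGN_TOKEN.match(t) for t in tokens)


def count_missing_alignments(lines, limit=1000):
    """Count non-blank lines without alignment info among the first `limit` lines."""
    missing = 0
    for i, line in enumerate(lines):
        if i >= limit:
            break
        if line.strip() and not has_alignment_info(line):
            missing += 1
    return missing
```

A binarizer could run a check like this on the first few thousand lines and stop with a hint to use "-encoding None -no-alignment-info" when alignments are absent.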
I didn't think there were more verbose messages than before, but I'll look
at it again.

On 4 July 2013 22:15, Marcin Junczys-Dowmunt <[email protected]> wrote:

> OK, I think the mystery is solved. The text version does not contain
> alignment information. The standard algorithm for the compact phrase table
> requires alignment information to work properly.
> If alignments are not present, you should use the "-encoding None
> -no-alignment-info" options (bigger, but still quite compact). It's even
> mentioned in the documentation, but I think I should add a test to the
> binarization tool that croaks if alignment data is missing. A test with
> your phrase table and "-encoding None -no-alignment-info" works fine and
> produces the correct translation now. This also explains why the
> compression was so cruelly slow and the result is even smaller than the
> incorrectly built one. You wrote that you used a version from July 2012
> for training; with a recent moses version, this issue would not have
> arisen. Included alignment is now standard in the training scripts, and
> then you can use the standard procedure for compact binarization, which
> should save some additional 30%.
>
> BTW: New moses is very verbose, is this on purpose?
> Best,
> Marcin
>
> On 04.07.2013 22:01, Marcin Junczys-Dowmunt wrote:
>
> The binary format in the main branch actually never changed from the
> moment I released it. So it should not be an issue of binary
> incompatibility. I am planning to add version numbers with the first change
> in the binary format other than versioning itself :)
>
> On 04.07.2013 21:56, Hieu Hoang wrote:
>
> do your binary files have version numbers embedded in them? I would
> highly recommend they do.
>
> kenlm has it, it's even human readable by doing
>    head -1
> on any kenlm binary file. The decoder throws errors if running with an
> incompatible version.
>
> On 4 July 2013 20:52, Marcin Junczys-Dowmunt <[email protected]> wrote:
>
>> I had a similar issue a few days ago with a quite old moses version;
>> recompiling and rebuilding the phrase table seemed to solve it, so
>> I did not investigate. However, I am not quite sure what I actually did to
>> fix it. Currently I am building the binary phrase table from the text
>> version to compare. This will take a while, more fun tomorrow.
>>
>> On 04.07.2013 21:46, Hieu Hoang wrote:
>>
>> it's a bit strange. Many words are unknown in the compact-pt version, e.g.
>> this 1-word sentence is unknown:
>>    un
>> could it be encoding issues? or the wrong phrase table was binarized?
>>
>> On 4 July 2013 18:14, Hieu Hoang <[email protected]> wrote:
>>
>>> you can download my version
>>>    http://statmt.org/~s0565741/download/alex/
>>> I've also filtered the text phrase table so that it can run
>>>
>>> On 4 July 2013 17:47, Marcin Junczys-Dowmunt <[email protected]> wrote:
>>>
>>>> Hi Alexander,
>>>> I am able to log in, but then it hangs indefinitely while trying to
>>>> retrieve the directory list.
>>>> Best,
>>>> Marcin
>>>>
>>>> On 04.07.2013 16:59, Fishkov, Alexander wrote:
>>>>
>>>> Hi Hieu and Marcin!
>>>>
>>>> >> If either of you has a model (no matter how big) that reproduces
>>>> >> the problem, that I can download, I'll look into it
>>>>
>>>> I have set up an ftp to share the model, so I am sending this message
>>>> in private (not to the mailing list).
>>>>
>>>> ftp://hoang:[email protected]/
>>>>
>>>> The folder structure is as follows:
>>>>
>>>> /lm – contains binary language model (just in case)
>>>>
>>>> /model.fr-en – contains translation model in text format with moses.ini
>>>> file
>>>>
>>>> /compact-model.fr-en – contains compact model produced from the
>>>> previous one with moses.ini
>>>>
>>>> P.S. I will be out of office until 16 of July.
>>>>
>>>> Best regards, Alexander.

--
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
