Hi Thanh-Le,
We've been having problems adding the alignment information to the
phrase-table file format. I think we are going to settle on this format:
source ||| target ||| scores ||| [alignment] ||| [counts]
e.g.
Mushariff letzer Act ||| Mushariff 's last act ||| 0.5 0.2 0.7 0.8 2.718 ||| ||| 1 1
This should be backward compatible with the old Pharaoh code, and it
allows us to add more extensions in the future.
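
To make the field layout concrete, here is a minimal Python sketch of how a
line in this format could be split. It is only an illustration, not the
decoder's actual loader (that is C++); the names PhraseEntry and
parse_phrase_line are made up for this example, and it assumes the
alignment and counts fields are optional and may be empty.

```python
from typing import List, NamedTuple


class PhraseEntry(NamedTuple):
    """One phrase-table line (hypothetical container, for illustration only)."""
    source: str
    target: str
    scores: List[float]
    alignment: str  # kept as raw text; empty if absent
    counts: str     # kept as raw text; empty if absent


def parse_phrase_line(line: str) -> PhraseEntry:
    """Split 'source ||| target ||| scores ||| [alignment] ||| [counts]'."""
    fields = [field.strip() for field in line.rstrip("\n").split("|||")]
    if len(fields) < 3:
        raise ValueError("expected at least source, target and scores")
    source, target, score_text = fields[:3]
    scores = [float(value) for value in score_text.split()]
    # The trailing fields are optional, so old three-field lines still parse.
    alignment = fields[3] if len(fields) > 3 else ""
    counts = fields[4] if len(fields) > 4 else ""
    return PhraseEntry(source, target, scores, alignment, counts)


entry = parse_phrase_line(
    "Mushariff letzer Act ||| Mushariff 's last act ||| "
    "0.5 0.2 0.7 0.8 2.718 ||| ||| 1 1"
)
old_style = parse_phrase_line("der ||| the ||| 0.3")
```

Because the scores always sit in the third field, a reader that only looks
at the first three fields can ignore whatever comes after them.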
Can I ask you to svn update your training scripts and decoder, and run
your experiments again? The problems should be fixed. Let me know how it
goes. Sorry about the bugs.
Hieu
On 27/07/2010 08:24, Thanh-Le Nguyen wrote:
> Hi Hieu,
>
> I am working through the Moses tutorial
> (http://www.statmt.org/moses_steps.html) and got stuck at some steps.
> The first problem was at the step Confirm Setup Success:
>
> $ ../../../tools/moses/moses-cmd/src/moses -f moses.ini < in > out
> Defined parameters (per moses.ini or switch):
> DPR-file: DPR-file.txt
> config: moses.ini
> input-factors: 0
> lmodel-file: 0 0 3
> /home/ntle/__probin/demo/data/sample-models/lm/europarl.srilm.gz
> mapping: T 0
> n-best-list: nbest.txt 100
> ttable-file: 0 0 0 1
> /home/ntle/__probin/demo/data/sample-models/phrase-model/phrase-table
> ttable-limit: 10
> weight-DPR: 3
> weight-d: 1
> weight-l: 1
> weight-t: 1
> weight-w: 0
> Loading lexical distortion models...have 0 models
> Start loading LanguageModel
> /home/ntle/__probin/demo/data/sample-models/lm/europarl.srilm.gz :
> [0.000] seconds
> Finished loading LanguageModels : [1.000] seconds
> Start loading PhraseTable
> /home/ntle/__probin/demo/data/sample-models/phrase-model/phrase-table
> : [1.000] seconds
> filePath:
> /home/ntle/__probin/demo/data/sample-models/phrase-model/phrase-table
> ERROR:Size of scoreVector != number (0!=1) of score components on line 1
> Aborted
>
> It seems that the format of the phrase table was not correctly
> recognized. So I edited the phrase table, replacing
> "der ||| the ||| ||| ||| 0.3" with "der ||| the ||| 0.3".
> It works fine after that.
>
>
> At the step "Sanity Check Trained Model" I am having the following problem:
>
> $ echo "c' est une petite maison ." | tools/moses/moses-cmd/src/moses
> -f work/model/moses.ini
> Defined parameters (per moses.ini or switch):
> config: work/model/moses.ini
> distortion-file: 0-0 wbe-msd-bidirectional-fe-allff 6
> /home/ntle/__probin/demo/work/model/reordering-table.wbe-msd-bidirectional-fe.gz
> distortion-limit: 6
> input-factors: 0
> lmodel-file: 0 0 3 /home/ntle/__probin/demo/work/lm/news-commentary.lm
> mapping: 0 T 0
> ttable-file: 0 0 0 5 /home/ntle/__probin/demo/work/model/phrase-table.gz
> ttable-limit: 20
> weight-d: 0.3 0.3 0.3 0.3 0.3 0.3 0.3
> weight-l: 0.5000
> weight-t: 0.2 0.2 0.2 0.2 0.2
> weight-w: -1
> Loading lexical distortion models...have 1 models
> Creating lexical reordering...
> weights: 0.300 0.300 0.300 0.300 0.300 0.300
> Loading table into memory...done.
> Start loading LanguageModel
> /home/ntle/__probin/demo/work/lm/news-commentary.lm : [36.000] seconds
> /home/ntle/__probin/demo/work/lm/news-commentary.lm: line 1476:
> warning: non-zero probability for <unk> in closed-vocabulary LM
> Finished loading LanguageModels : [36.000] seconds
> Start loading PhraseTable
> /home/ntle/__probin/demo/work/model/phrase-table.gz : [36.000] seconds
> filePath: /home/ntle/__probin/demo/work/model/phrase-table.gz
> moses: PhraseDictionaryMemory.cpp:79: bool
> Moses::PhraseDictionaryMemory::Load(const std::vector<long unsigned
> int, std::allocator<long unsigned int> >&, const std::vector<long
> unsigned int, std::allocator<long unsigned int> >&, const
> std::string&, const std::vector<float, std::allocator<float> >&,
> size_t, const Moses::LMList&, float): Assertion `numElement == 3 ||
> numElement == 5' failed.
> Aborted
>
>
> I get a very similar error when I try to create a binary phrase table:
>
> $ gzip -cd work/model/phrase-table.gz | LC_ALL=C sort |
> tools/moses/misc/processPhraseTable -ttable 0 0 - -nscores 5 -out
> work/model/phrase-table
> processing ptree for stdin
> processPhraseTable: PhraseDictionaryTree.cpp:479: int
> Moses::PhraseDictionaryTree::Create(std::istream&, const
> std::string&): Assertion `numElement == 3 || numElement == 5' failed.
> Aborted
>
>
> Does it have anything to do with the new format of the phrase table? I
> have the latest version of Moses (revision 3362).
>
>
> Best,
> Le
>
> 2010/7/26 Hieu Hoang <[email protected]>:
>> Hi rico
>>
>> the format of the phrase table is in flux @ the moment 'cos we've added
>> alignment info so the extraction and decoder won't work unless it's all
>> same version. we're trying to sort this out.
>>
>> my suggestion is to update the extraction scripts and decoder to all the
>> same version. if you still have problems, let me know. apologies
>>
>> On 26/07/2010 12:56, Rico Sennrich wrote:
>>> Hi all,
>>>
>>> One of the latest commits (the one reintroducing alignment info to the
>>> phrasetable) shuffles the info in the phrasetable around.
>>>
>>> Here is a quick comparison:
>>>
>>> commit 3357:
>>> ! ||| ! ||| 0.863961 0.642857 0.604773 0.529412 2.718
>>>
>>> commit 3362:
>>> ! ||| ! ||| 0.604773 0.529412 ||| 0.863961 0.642857 10 2.718
>>>
>>> Unsurprisingly, the decoder can't handle the new format and crashes.
>>>
>>> I don't know if this change is intended and you're planning to adjust the
>>> decoder, or if you consider this a bug that you're going to revert, but I
>>> strongly plead for keeping the phrasetables backward compatible. I can't
>>> think of any reason to shuffle the probabilities around that justifies
>>> breaking compatibility, with all the disadvantages this entails.
>>>
>>> best regards,
>>> Rico
>>>
>>> _______________________________________________
>>> Moses-support mailing list
>>> [email protected]
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>
>>>