Re: [Moses-support] phrase extraction step

Philipp Koehn Sun, 31 Mar 2013 09:36:11 -0700

Hi,

I think I actually sent you the wrong commands; look at your
"training15.out" file for the right ones. Before all this, though, you
should try using full paths.


-phi
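
A minimal sketch of the full-path advice above, assuming a POSIX shell. The directory name "model" mirrors the relative "./model" paths in the quoted commands; the helper name abs_dir and the scratch directory are made up for illustration and are not part of Moses.

```shell
#!/bin/sh
# abs_dir: hypothetical helper, prints the absolute path of an existing directory.
abs_dir() {
    (cd "$1" && pwd)
}

# Demo in a scratch directory so nothing in a real training run is touched.
scratch=$(mktemp -d)
mkdir "$scratch/model"
cd "$scratch" || exit 1

# "./model" is resolved to a full path starting with "/".
MODEL_DIR=$(abs_dir ./model)
echo "$MODEL_DIR"
```

The resolved value could then be used wherever the training run currently receives "./model".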


On Sun, Mar 31, 2013 at 5:33 PM, Philipp Koehn <[email protected]> wrote:

> Hi,
>
> since something goes wrong in the phrase extraction step, please try to
> run the commands by hand and check where something fails. The commands are
> reported in STDERR of the step.
>
> In your case:
>
> /home/nikhila/project/mosesdecoder/scripts/generic/score-parallel.perl 1
> "sort    " /home/nikhila/project/mosesdecoder/scripts/../bin/score
> ./model/extract.sorted.gz ./model/lex.f2e ./model/phrase-table.half.f2e.gz
> 0
>
> ln -s ./model/extract.sorted.gz ./model/tmp.7551/extract.0.gz
>
> /home/nikhila/project/mosesdecoder/scripts/../bin/score
> ./model/tmp.7551/extract.0.gz ./model/lex.f2e
> ./model/tmp.7551/phrase-table.half.00000.gz
>
> ./model/tmp.7551/run.0.sh
>
> mv ./model/tmp.7551/phrase-table.half.00000.gz
>
> From looking at this, my guess is that the problem comes from
> not specifying full paths and using "." as the root directory instead.
>
> -phi
>
>
>
> On Fri, Mar 29, 2013 at 7:25 AM, Nikhila Achukatla <
> [email protected]> wrote:
>
>> Hi,
>>
>> Yes, the alignment file is generated correctly.
>> No, my data doesn't contain any special characters.
>> I ran each step in isolation and attached the results.
>> Please check them.
>> Already in the fifth step, the extract files are not generated.
>> I cleaned the data before proceeding.
>> I am working on Telugu (an Indian language).
>> Does Moses support such languages?
>>
>> I also ran it with the data provided on the Moses website.
>> The same problem occurred with that data.
>> The phrase-table.gz, extract.sorted.gz, and extract.inv.gz files are
>> just empty.
>> The extract.o.sorted.gz file is not created at all.
>>
>> Does it require any extra software to be installed?
>>
>>
>> On 28 March 2013 09:30, Philipp Koehn <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> > I'm attaching a file that I got when I executed the 5th step.
>>> > I don't know why the phrase table, extract.sorted.gz, etc. files
>>> > are not extracted.
>>> > Please help me.
>>>
>>> What do the input files to the extract step look like? Is the
>>> word alignment file correct, and does it have the same number of
>>> lines as the others?
>>>
>>> Do you have any forbidden characters (especially "|") in your
>>> data that may cause problems?
>>>
>>> You can run each step in isolation by running train-model.perl
>>> with the --first-step and --last-step switches.
>>> The numbers of the steps are listed here:
>>> http://www.statmt.org/moses/?n=FactoredTraining.HomePage
>>>
>>> A common mistake is to forget to clean the parallel corpus
>>> (throw out long sentences or length-mismatched sentence pairs)
>>> which causes faulty word alignment which then causes
>>> phrase extraction to fail.
>>>
>>> > I also want to know about the tokenization step.
>>> > In the tokenization step, other than dividing a sentence into
>>> > tokens, is any extra processing done?
>>>
>>> A typical additional step is lowercasing or truecasing, which
>>> normalizes words that occur at the beginning of the sentence ("The")
>>> or in all caps ("THE") to a common form ("the").
>>>
>>> -phi
>>>
>>> On Thu, Mar 28, 2013 at 6:14 AM, Nikhila Achukatla
>>> <[email protected]> wrote:
>>> > Hi,
>>> >
>>>
>>> >
>>> > _______________________________________________
>>> > Moses-support mailing list
>>> > [email protected]
>>> > http://mailman.mit.edu/mailman/listinfo/moses-support
>>> >
>>>
>>
>>
>
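
The input checks suggested earlier in the thread (all files have the same number of lines, no "|" characters in the data) can be sketched as follows, assuming a POSIX shell. The function names and the toy file names are hypothetical and do not come from the poster's setup.

```shell
#!/bin/sh
# same_line_count: succeeds only if every file given has the same number of lines.
same_line_count() {
    expected=""
    for f in "$@"; do
        n=$(wc -l < "$f" | tr -d ' ')
        if [ -z "$expected" ]; then
            expected=$n
        elif [ "$n" != "$expected" ]; then
            echo "line count mismatch: $f has $n, expected $expected" >&2
            return 1
        fi
    done
    return 0
}

# has_pipe: succeeds if the file contains a literal "|" anywhere.
has_pipe() {
    grep -q -F '|' "$1"
}

# Demo on made-up toy files (not the poster's data).
scratch=$(mktemp -d)
printf 'a b\nc d\n' > "$scratch/corpus.f"
printf 'x y\nz w\n' > "$scratch/corpus.e"
printf '0-0 1-1\n0-0 1-1\n' > "$scratch/aligned"

same_line_count "$scratch/corpus.f" "$scratch/corpus.e" "$scratch/aligned" \
    && echo "line counts match"
has_pipe "$scratch/corpus.f" || echo "no pipe characters in corpus.f"
```

A mismatch or a stray "|" found this way would be worth fixing before rerunning the extraction step.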

