Hi,

I think I actually sent you the wrong commands. Look at your "training15.out" file for the right command. Before any of that, though, you should try using full paths.
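To illustrate the full-path suggestion, here is a minimal Python sketch that rewrites the relative arguments of a logged command as absolute paths before you run it by hand. The function name `absolutize` and the root directory in the usage line are hypothetical (not part of Moses); substitute the directory that training was started from.

```python
import posixpath
import shlex


def absolutize(command, root):
    """Return `command` with each argument that starts with "./" or "../"
    rewritten as an absolute path under `root` (assumed to be absolute)."""
    parts = []
    for arg in shlex.split(command):
        if arg.startswith(("./", "../")):
            # Join with the root and collapse the "." and ".." components.
            arg = posixpath.normpath(posixpath.join(root, arg))
        parts.append(shlex.quote(arg))
    return " ".join(parts)


# Hypothetical working directory; replace it with your own.
print(absolutize(
    "ln -s ./model/extract.sorted.gz ./model/tmp.7551/extract.0.gz",
    "/path/to/working-dir",
))
```

Arguments that are already absolute are left untouched, so a logged command can be passed through unchanged apart from the "./" entries.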
-phi

On Sun, Mar 31, 2013 at 5:33 PM, Philipp Koehn <[email protected]> wrote:

> Hi,
>
> since something goes wrong in the phrase extraction step, please try to
> run the commands by hand and check where something fails. The commands are
> reported in STDERR of the step.
>
> In your case:
>
> /home/nikhila/project/mosesdecoder/scripts/generic/score-parallel.perl 1
> "sort " /home/nikhila/project/mosesdecoder/scripts/../bin/score
> ./model/extract.sorted.gz ./model/lex.f2e ./model/phrase-table.half.f2e.gz
> 0
>
> ln -s ./model/extract.sorted.gz ./model/tmp.7551/extract.0.gz
>
> /home/nikhila/project/mosesdecoder/scripts/../bin/score
> ./model/tmp.7551/extract.0.gz ./model/lex.f2e
> ./model/tmp.7551/phrase-table.half.00000.gz
>
> ./model/tmp.7551/run.0.shmv ./model/tmp.7551/phrase-table.half.00000.gz
>
> From looking at this, my guess is that there is problem with
> not specifying full paths, but rather "." as root directory.
>
> -phi
>
>
>
> On Fri, Mar 29, 2013 at 7:25 AM, Nikhila Achukatla <
> [email protected]> wrote:
>
>> Hi,
>>
>> yes, alignment file is correctly generated.
>> No, my data doesn't contain any special characters.
>> I ran each step in isolation and I attached them.
>> Please check them once.
>> In fifth step itself, extract files are not generated.
>> I cleaned the data before proceeding.
>> And I am working on Telugu(Indian language).
>> Will Moses support those languages?
>>
>> And also, I executed with the data provided by Moses website.
>> With that data also same problem occurred.
>> phrase-table.gz,extract.sorted.gz,extract.inv.gz files are just empty.
>> extract.o.sorted.gz file is not at all created.
>>
>> Do it requires any extra softwares to be installed??
>>
>>
>> On 28 March 2013 09:30, Philipp Koehn <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> > I'm hereby attaching a file. I got it when executed 5th step.
>>> > I don't why phrase table,extract.sorted.gz etc. files are not
>>> extracted.
>>> > please help me.
>>>
>>> How do the input files to the extract step look like. Is the
>>> word alignment file correct and has the same number of
>>> lines as the others?
>>>
>>> Do you have any forbidden characters (especially "|") in your
>>> data that may cause problems?
>>>
>>> You can run each step in isolation by running the train-model.perl
>>> with specifying the --first-step and --last-step switches.
>>> The numbers of the steps are listed here:
>>> http://www.statmt.org/moses/?n=FactoredTraining.HomePage
>>>
>>> A common mistake is to forget to clean the parallel corpus
>>> (throw out long sentences or length-mismatched sentence pairs)
>>> which causes faulty word alignment which then causes
>>> phrase extraction to fail.
>>>
>>> > And also I want to know about tokenization step.
>>> > In tokenization step, rather than dividing a sentence into tokens,
>>> will any
>>> > extra
>>> > processing is done?
>>>
>>> A typical additional step is lowercasing or truecasing, which
>>> normalizes words that occur at the beginning at the sentence ("The")
>>> or in all caps ("THE") to a common form ("the").
>>>
>>> -phi
>>>
>>> On Thu, Mar 28, 2013 at 6:14 AM, Nikhila Achukatla
>>> <[email protected]> wrote:
>>> > Hi,
>>> >
>>> > _______________________________________________
>>> > Moses-support mailing list
>>> > [email protected]
>>> > http://mailman.mit.edu/mailman/listinfo/moses-support
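
The checks suggested in the quoted message of 28 March (the alignment file has the same number of lines as both sides of the corpus, and the data contains no "|") can be sketched in Python as below. The function name and its three file arguments are hypothetical, not part of Moses, and the sketch assumes UTF-8 text with one sentence per line.

```python
def check_parallel_files(source_path, target_path, alignment_path):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    counts = {}
    for path in (source_path, target_path, alignment_path):
        with open(path, encoding="utf-8") as handle:
            lines = handle.readlines()
        counts[path] = len(lines)
        # Only the two corpus files are scanned for the "|" character.
        if path != alignment_path:
            for number, line in enumerate(lines, start=1):
                if "|" in line:
                    problems.append(f'{path}:{number}: contains "|"')
    # All three files must have exactly the same number of lines.
    if len(set(counts.values())) > 1:
        problems.append(f"line counts differ: {counts}")
    return problems
```

Calling it with the two cleaned corpus files and the alignment file, and printing the returned list, shows every offending line number, which is easier to act on than an empty extract file.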
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
