Hi,
I've finally succeeded in pruning the phrase-table. At 6% of the
original size, the translation has actually improved!
Now the phrase-table fits in memory and translation is fast, or at
least acceptable.
It took me a while to figure out what to write for TARGET and SOURCE in
the pruning command, as it is displayed on the Advanced Features page:
cat phrase-table | ./filter-pt -e TARGET -f SOURCE -l FILTER-VALUE > phrase-table.pruned
The readme file in the sigtest-filter folder helped me somewhat:
cat phrase-table.txt | ./filter-pt -e TARG.suffix -f SOURCE.suffix \
-l <FILTER-VALUE>
But I had to try several times until I realized that I was supposed to
supply only the "stem" of the filenames; the program adds the suffixes. I
successfully used the following:
cat /home/per/working/train/model/phrase-table | ./filter-pt \
-e /home/per/corpora/Total1.sv-fr.clean_urval.fr \
-f /home/per/corpora/Total1.sv-fr.clean_urval.sv \
-l a+e -n 30 \
> /home/per/working/train/model/phrase-table.pruned
The files needed by the program are the ones created by SALM:
Writing corpus to file:
/home/per/corpora/Total1.sv-fr.clean_urval.sv.sa_corpus
Writing offset to file:
/home/per/corpora/Total1.sv-fr.clean_urval.sv.sa_offset
Writing suffix information to file:
/home/per/corpora/Total1.sv-fr.clean_urval.sv.sa_suffix
and the corresponding files for the target language.
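As a quick sanity check before running filter-pt, something like the
following could confirm that all three SALM index files exist for a given
stem. This is only a sketch: the function name check_salm_index is made up
for illustration, and the three suffixes are simply the ones shown in the
SALM output above.

```shell
# Hypothetical helper: verify that the SALM index files exist for a stem.
# The suffixes (sa_corpus, sa_offset, sa_suffix) are taken from the SALM log.
check_salm_index() {
    stem=$1
    missing=0
    for suffix in sa_corpus sa_offset sa_suffix; do
        if [ ! -f "$stem.$suffix" ]; then
            # Report each missing file on stderr and remember the failure.
            echo "missing: $stem.$suffix" >&2
            missing=1
        fi
    done
    # Returns 0 when all three files are present, 1 otherwise.
    return "$missing"
}
```

It would be called once per language with the same stems that are passed
to -f and -e, for example with /home/per/corpora/Total1.sv-fr.clean_urval.sv
and then with the corresponding .fr stem.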
One more "hidden" prerequisite is that the phrase-table (a gzip file)
has to be unpacked before pruning. I got a clue from the command
above, taken from the readme file: cat phrase-table.txt
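Since filter-pt reads the phrase-table from standard input in the commands
above, the unpacking could probably be done on the fly instead of as a
separate step. A minimal sketch, assuming the compressed table has a .gz
extension; the helper name cat_phrase_table is hypothetical:

```shell
# Hypothetical helper: print a phrase-table to stdout as plain text,
# decompressing it first if the file name ends in .gz.
cat_phrase_table() {
    case $1 in
        *.gz) gzip -cd -- "$1" ;;   # decompress to stdout, keep the .gz file
        *)    cat -- "$1" ;;        # already plain text
    esac
}

# Intended use (TARGET, SOURCE and FILTER-VALUE as on the Advanced Features page):
#   cat_phrase_table phrase-table.gz | ./filter-pt -e TARGET -f SOURCE -l FILTER-VALUE > phrase-table.pruned
```

This leaves the original gzip file untouched and avoids keeping an unpacked
copy of the full table on disk.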
Maybe someone could add some more explicit information to the Advanced
Features page? It might prove very helpful to novice users like me.
Yours,
Per Tunedal
On Mon, Apr 8, 2013, at 10:35, Per Tunedal wrote:
> Hi again Rico,
> I've succeeded in building SALM thanks to you. I got the same error
> message as you.
> Successfully built sigtest-filter as well. Will proceed with next step.
> Yours,
> Per Tunedal
>
> On Thu, Apr 4, 2013, at 14:48, Rico Sennrich wrote:
> > Per Tunedal <per.tunedal@...> writes:
> >
> > > Unfortunately, my efforts to build SALM have failed. I tried building
> > > according to the instructions in the SALM readme file:
> > > make allO64
--snip--
> > Try Jonathan Clark's repository, which fixes the compilation issues:
> > https://github.com/jhclark/salm
> >
> > (it still stops with an error message for me [no rule to make target
> > `../../Bin/Linux/Search/SampleNGramIns.O64'], but the necessary binary is
> > successfully compiled: salm/Bin/Linux/Index/IndexSA.*)
> >
> > _______________________________________________
> > Moses-support mailing list
> > [email protected]
> > http://mailman.mit.edu/mailman/listinfo/moses-support