Hi Per

> Maybe someone could add some more explicit information to the Advanced
> Features page? It might prove very helpful to novice users like me.

Yes, we're always happy to receive contributions to the documentation.

cheers - Barry

On 08/04/13 15:25, Per Tunedal wrote:
> Hi,
> Finally, I've succeeded in pruning the phrase-table. With the table at
> 6 % of its original size, the translation has actually improved!
> Now the phrase-table fits in memory and the translation is fast, or at
> least acceptable.
>
> It took me a while to figure out how to write the TARGET and SOURCE for
> the pruning command, as it is displayed on the Advanced Features page:
>   cat phrase-table | ./filter-pt -e TARGET -f SOURCE -l FILTER-VALUE > phrase-table.pruned
> The readme file in the sigtest-filter folder helped me somewhat:
>   cat phrase-table.txt | ./filter-pt -e TARG.suffix -f SOURCE.suffix \
>     -l <FILTER-VALUE>
> But I had to try several times until I realized that I was supposed to
> supply only the "stem" of the filenames; the program adds the suffixes. I
> successfully used the following:
>   cat /home/per/working/train/model/phrase-table | ./filter-pt \
>     -e /home/per/corpora/Total1.sv-fr.clean_urval.fr \
>     -f /home/per/corpora/Total1.sv-fr.clean_urval.sv \
>     -l a+e -n 30 > /home/per/working/train/model/phrase-table.pruned
>
> The files needed by the program are the ones created by SALM:
>   Writing corpus to file:
>   /home/per/corpora/Total1.sv-fr.clean_urval.sv.sa_corpus
>   Writing offset to file:
>   /home/per/corpora/Total1.sv-fr.clean_urval.sv.sa_offset
>   Writing suffix information to file:
>   /home/per/corpora/Total1.sv-fr.clean_urval.sv.sa_suffix
> and the corresponding files for the target language.
>
> One more "hidden" prerequisite is that the phrase-table (in a
> gzip file) has to be unpacked before pruning. I got a clue from the
> command above, taken from the readme file: cat phrase-table.txt
>
> Maybe someone could add some more explicit information to the Advanced
> Features page? It might prove very helpful to novice users like me.
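
The steps described above (check that the SALM index files exist for both stems, unpack the phrase-table, then run filter-pt with the stems only) could be wrapped in a small shell sketch like the one below. The function names and the FILTER_PT variable are made up for illustration and are not part of Moses or sigtest-filter. The -e, -f, -l a+e and -n 30 options are copied from the command quoted above. Streaming the gzip file with `gzip -cd` instead of unpacking it on disk first is an assumption on my part: it should be equivalent, since filter-pt reads the table from standard input.

```shell
#!/bin/sh
# Sketch only: have_salm_index, prune_phrase_table and FILTER_PT are
# hypothetical names used for illustration.

# Location of the filter-pt binary (assumed to be in the current directory).
FILTER_PT="${FILTER_PT:-./filter-pt}"

# Succeed only if the three SALM index files exist for a corpus stem.
# filter-pt is given the stem; it appends the suffixes itself.
have_salm_index() {
    stem=$1
    for suffix in sa_corpus sa_offset sa_suffix; do
        if [ ! -f "$stem.$suffix" ]; then
            echo "missing SALM file: $stem.$suffix" >&2
            return 1
        fi
    done
    return 0
}

# Usage: prune_phrase_table TABLE TARGET_STEM SOURCE_STEM OUTPUT
prune_phrase_table() {
    table=$1; target=$2; source_stem=$3; out=$4
    have_salm_index "$target" || return 1
    have_salm_index "$source_stem" || return 1
    # Feed the plain-text table to filter-pt, decompressing if needed.
    case $table in
        *.gz) gzip -cd "$table" ;;
        *)    cat "$table" ;;
    esac | "$FILTER_PT" -e "$target" -f "$source_stem" -l a+e -n 30 > "$out"
}
```

With the paths from the message above, a call might look like:
prune_phrase_table /home/per/working/train/model/phrase-table /home/per/corpora/Total1.sv-fr.clean_urval.fr /home/per/corpora/Total1.sv-fr.clean_urval.sv /home/per/working/train/model/phrase-table.pruned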
>
> Yours,
> Per Tunedal
>
>
> On Mon, Apr 8, 2013, at 10:35, Per Tunedal wrote:
>> Hi again Rico,
>> I've succeeded in building SALM thanks to you. I got the same error
>> message as you.
>> Successfully built sigtest-filter as well. Will proceed with next step.
>> Yours,
>> Per Tunedal
>>
>> On Thu, Apr 4, 2013, at 14:48, Rico Sennrich wrote:
>>> Per Tunedal <per.tunedal@...> writes:
>>>
>>>> Unfortunately, my efforts to build SALM have failed. I tried building
>>>> according to the instructions in the SALM readme file:
>>>> make allO64
> --snip--
>
>>> Try Jonathan Clark's repository, which fixes the compilation issues:
>>> https://github.com/jhclark/salm
>>>
>>> (it still stops with an error message for me [no rule to make target
>>> `../../Bin/Linux/Search/SampleNGramIns.O64'], but the necessary binary is
>>> successfully compiled (salm/Bin/Linux/Index/IndexSA.*))
>>>
>>> _______________________________________________
>>> Moses-support mailing list
>>> [email protected]
>>> http://mailman.mit.edu/mailman/listinfo/moses-support

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
