Re: [Moses-support] Building the server with bjam
Hi,

If xmlrpc-c is installed in standard paths, including the abyss server option, then the following command should return zero when run from bash:

    g++ -include xmlrpc-c/base.hpp -lxmlrpc_server_abyss++ -x c++ - 'int main() {}'

and the server will be compiled automatically. Otherwise, it expects a path, as in --with-xmlrpc-c=/path/to/xmlrpc-c, to which it appends bin/xmlrpc-c-config in order to run:

    /path/to/xmlrpc-c/bin/xmlrpc-c-config c++2 abyss-server --libs
    /path/to/xmlrpc-c/bin/xmlrpc-c-config c++2 abyss-server --cflags

Expecting a path to the top-level installation directory is consistent with the behavior of the other --with flags but different from how autotools did it. This was documented in --help; I just committed a change that performs more error checking.

Also, the --install option is now dead (a no-op) and replaced with --prefix by popular demand. Note that this installs in bin and lib directories.

Kenneth

On 11/30/11 01:48, Kádár Tamás (KTamas) wrote:
> Hi
>
> Sorry for the newbie question, but I can't quite figure out how to build moses server under the new bjam building system. Right now I've built it with:
>
>     ./bjam --with-irstlm=/usr/local/irstlm --with-xmlrpc-c -j2 --install=/usr/local --with-giza=/home/ubuntu/bin
>
> That compiles moses and most of the stuff, but not the server.
>
> Thanks and best regards,
> Tamas

___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support
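To make the path handling described in this message concrete, here is a minimal Python sketch of how the two xmlrpc-c-config invocations follow from the value passed to --with-xmlrpc-c. It is not the actual Jamfile logic; the helper name xmlrpc_config_commands is hypothetical, and only the path layout and the arguments are taken from the message.

```python
import posixpath


def xmlrpc_config_commands(prefix):
    """Return the two xmlrpc-c-config command lines for an installation prefix.

    Hypothetical helper for illustration only; the real check lives in the
    Moses build files and may differ in detail.
    """
    # The prefix is the top-level installation directory, so the config
    # script is expected under <prefix>/bin.
    config = posixpath.join(prefix, "bin", "xmlrpc-c-config")
    return [
        [config, "c++2", "abyss-server", "--libs"],
        [config, "c++2", "abyss-server", "--cflags"],
    ]


for command in xmlrpc_config_commands("/path/to/xmlrpc-c"):
    print(" ".join(command))
```

Passing the bin directory itself (autotools style) would make the sketch look for bin/bin/xmlrpc-c-config, which is the kind of mistake the extra error checking is meant to catch.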
[Moses-support] add new feature into decoder
Hi everyone,

I want to add a new feature to the Moses decoder with chart decoding. I followed the documentation, but it does not work: the total score and the elements of the score in ScoreComponent don't change. I used GetScoreBreakDown() and printed all the values in the vector, and I saw that the value of the new feature didn't change.

Could you please give me some suggestions for adding a new feature to the Moses decoder with chart decoding?

Thank you very much.

-- 
Thu.
Re: [Moses-support] Removing duplicates when merging nbest lists for MERT
Hi Thomas

Yes, you're correct, mert doesn't remove duplicates in the nbest lists. It's something that we intended to do (and probably mentioned in the mert paper) but somehow never got around to.

As Lane pointed out, you have to be careful to do the duplicate removal correctly. You can only consider hypotheses to be duplicates if they have the same target text and the same feature values.

The mert optimisation actually does duplicate removal implicitly, during the optimisation, since duplicate hypotheses contribute the same line to the envelope. However, removing duplicates in the extractor could potentially be more efficient.

For pro, however, duplicates could make a difference to the optimisation, as they affect the sampling. I recently re-implemented the pro extraction to make it more efficient, and again did intend to do de-duping, but haven't got around to it yet. It would be interesting to know if de-duping makes a difference to the outcome.

cheers - Barry

On Tuesday 29 Nov 2011 20:06:20 Thomas Schoenemann wrote:
> Hi everyone!
>
> We all know that MERT gets slower in the later iterations. This is not surprising as the n-best lists of all previous iterations are merged. I believe this is quite important for translation performance. Still, it seems important to me to get the merged lists as small as possible.
>
> A quick inspection of mert/extractor indicates that duplicates are _not_ removed. Can anyone confirm this? And is this really not done anywhere else, e.g. in mert/mert?
>
> Removing duplicates in the extractor should be easy to implement and I don't think it will take more running time than one gains from smaller lists.
>
> Best,
> Thomas (currently University of Pisa)
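The duplicate criterion in this message (same target text and same feature values) can be sketched in a few lines of Python. This is only an illustration, not the mert extractor code: the function name dedupe_nbest and the tuple layout (sentence id, target text, feature values) are assumptions, and feature values are compared exactly.

```python
def dedupe_nbest(entries):
    """Drop repeated hypotheses from a merged n-best list, keeping first occurrences.

    Each entry is (sentence_id, target_text, feature_values). Two entries count
    as duplicates only if the sentence id, the target text and every feature
    value all agree. The layout is hypothetical, chosen for illustration.
    """
    seen = set()
    unique = []
    for sentence_id, text, features in entries:
        key = (sentence_id, text, tuple(features))
        if key not in seen:
            seen.add(key)
            unique.append((sentence_id, text, features))
    return unique


merged = [
    (0, "the house", [-1.5, -2.0]),
    (0, "the house", [-1.5, -2.0]),  # exact duplicate from a later iteration
    (0, "the house", [-1.0, -2.5]),  # same text, different feature values: kept
    (1, "the house", [-1.5, -2.0]),  # different source sentence: kept
]
print(len(dedupe_nbest(merged)))  # 3
```

Using a set of keys keeps the pass linear in the size of the merged list, which matters because the list grows with every iteration.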
Re: [Moses-support] add new feature into decoder
Hi Thu

Have you verified that your feature is being called, and inserting values into the feature vector? What do you mean when you say that the element of the score in ScoreComponent doesn't change? Do you mean that your new feature has value 0? Or is it missing?

To add a feature to the moses chart decoder, you have to implement EvaluateChart() (which may not be documented) and make sure that StaticData constructs and registers the feature (which is documented).

cheers - Barry

On Wednesday 30 Nov 2011 10:55:31 Hoai-Thu Vuong wrote:
> I want to add a new feature to the Moses decoder with chart decoding. I followed the documentation, but it does not work: the total score and the elements of the score in ScoreComponent don't change.
> [...]
Re: [Moses-support] Building the server with bjam
Hi

This is what I get:

    g++ -include xmlrpc-c/base.hpp -lxmlrpc_server_abyss++ -x c++ - 'int main() {}'
    /usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../lib/libxmlrpc_server_abyss++.so: undefined reference to `xmlrpc_c::registryPtr::get() const'
    /usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../lib/libxmlrpc_server_abyss++.so: undefined reference to `ServerSetKeepaliveMaxConn'
    /usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../lib/libxmlrpc_server_abyss++.so: undefined reference to `ServerRunConn'
    /usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../lib/libxmlrpc_server_abyss++.so: undefined reference to `girmem::autoObjectPtr::operator=(girmem::autoObjectPtr const)'
    /usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../lib/libxmlrpc_server_abyss++.so: undefined reference to `ServerFree'
    /usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../lib/libxmlrpc_server_abyss++.so: undefined reference to `xmlrpc_c::registryPtr::registryPtr()'
    /usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../lib/libxmlrpc_server_abyss++.so: undefined reference to `xmlrpc_c::registry::c_registry() const'
    /usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../lib/libxmlrpc_server_abyss++.so: undefined reference to `ServerCreate'
    /usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../lib/libxmlrpc_server_abyss++.so: undefined reference to `xmlrpc_c::registryPtr::operator-() const'
    /usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../lib/libxmlrpc_server_abyss++.so: undefined reference to `girmem::autoObjectPtr::~autoObjectPtr()'
    /usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../lib/libxmlrpc_server_abyss++.so: undefined reference to `ServerRunOnce'
    /usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../lib/libxmlrpc_server_abyss++.so: undefined reference to `girerr::throwf(char const*, ...)'
    /usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../lib/libxmlrpc_server_abyss++.so: undefined reference to `ServerInit'
    /usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../lib/libxmlrpc_server_abyss++.so: undefined reference to `ServerSetKeepaliveTimeout'
    /usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../lib/libxmlrpc_server_abyss++.so: undefined reference to `ServerCreateNoAccept'
    /usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../lib/libxmlrpc_server_abyss++.so: undefined reference to `DateInit'
    /usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../lib/libxmlrpc_server_abyss++.so: undefined reference to `xmlrpc_server_abyss_set_handlers2'
    /usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../lib/libxmlrpc_server_abyss++.so: undefined reference to `girmem::autoObjectPtr::~autoObjectPtr()'
    /usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../lib/libxmlrpc_server_abyss++.so: undefined reference to `ServerSetTimeout'
    /usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../lib/libxmlrpc_server_abyss++.so: undefined reference to `ServerCreateSocket'
    /usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../lib/libxmlrpc_server_abyss++.so: undefined reference to `ServerSetAdvertise'
    /usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../lib/libxmlrpc_server_abyss++.so: undefined reference to `ServerRun'
    collect2: ld returned 1 exit status

However, I have these packages installed: libxmlrpc-c3, libxmlrpc-c3-dev, libxmlrpc-core-c3, libxmlrpc-core-c3-dev

I also downloaded, compiled and installed the latest xmlrpc-c from svn...

KT

On Wed, Nov 30, 2011 at 9:59 AM, Kenneth Heafield mo...@kheafield.com wrote:
> If xmlrpc-c is installed in standard paths including the abyss server option, then the following command should return zero when run from bash:
>
>     g++ -include xmlrpc-c/base.hpp -lxmlrpc_server_abyss++ -x c++ - 'int main() {}'
>
> and server will be compiled automatically.
> [...]
Re: [Moses-support] Building the server with bjam
Fair enough. I've rewritten the test. Does it work now?

This library is harder to link against than SRILM...

Kenneth

On 11/30/11 12:19, Kádár Tamás (KTamas) wrote:
> This is what I get:
> g++ -include xmlrpc-c/base.hpp -lxmlrpc_server_abyss++ -x c++ - 'int main() {}'
> [...]
> collect2: ld returned 1 exit status
Re: [Moses-support] Building the server with bjam
Yup, that fixed it for me: I just compiled a fresh moses, and the server is compiled and working. Thanks a lot! It's great to see open-source software with such an awesome community and support.

Best regards,
Tamas

On Wed, Nov 30, 2011 at 1:53 PM, Kenneth Heafield mo...@kheafield.com wrote:
> Fair enough. I've rewritten the test. Does it work now?
>
> This library is harder to link against than SRILM...
[Moses-support] Scripts that are not installed
Dear Moses,

The following files exist in scripts or are compiled in scripts but were not installed by the pre-existing Makefile. The Jamfile currently just installs the same files when passed --install-scripts. Should these files be added?

Kenneth

Directories: regression-testing, tests, and other

    ems/web/spinner.gif
    ems/web/bilingual-concordance.css
    ems/web/close.gif
    ems/web/general.css
    ems/web/hierarchical-segmentation.css
    ems/web/hierarchical-segmentation.js
    ems/example/data/weight.ini
    ems/support/split-sentences.perl
    ems/biconcur/biconcur
    training/exodus.perl
    training/symal/cmd.c
    training/train-global-lexicon-model.perl
    training/wrappers/filter-excluded-lines.perl
    training/wrappers/make-factor-suffix.perl
    training/wrappers/find-unparseable.perl
    training/phrase-extract/consolidate-direct
    training/phrase-extract/extract-lex
    training/phrase-extract/statistics
    training/phrase-extract/consolidate-reverse
    training/phrase-extract/relax-parse
    training/analyse_moses_model.pl
    generic/giza-parallel.perl
    generic/fsa-sample.fsa
    generic/extract-parallel.perl
    generic/fsa2plf.pl
    analysis/perllib
    analysis/perllib/Error.pm
    analysis/show-phrases-used.pl
    analysis/oov.pl
    analysis/suspicious_tokenization.pl
    analysis/extract-target-trees.py
    analysis/sg2dot.perl
    analysis/weight-scan.pl
    analysis/bootstrap-hypothesis-difference-significance.pl
    analysis/smtgui
    analysis/smtgui/newsmtgui.cgi
    analysis/smtgui/file-descriptions
    analysis/smtgui/file-factors
    analysis/smtgui/Corpus.pm
    analysis/smtgui/filter-phrase-table.pl
    analysis/smtgui/README
    analysis/nontranslated_words.pl
Re: [Moses-support] Removing duplicates when merging nbest lists for MERT
Hi!

Well, it doesn't have to be the same target translation, just the exact same score vector (and the same feature vector, of course).

I agree that MERT is working correctly; my mail was only ever about efficiency. In my experiments MERT took the major part of the running time, and I believe others have the same problem. So I care about making it faster.

If you don't plan on an implementation, I will write my own (i.e. modify the existing one). I can then get back to you once I know it is working (and faster).

Concerning pro mode, I think that behavior could be simulated by storing weights for each list entry. It would be a little more complicated to implement, though.

Cheers,
Thomas

From: Barry Haddow bhad...@staffmail.ed.ac.uk
To: moses-support@mit.edu; Thomas Schoenemann thomas_schoenem...@yahoo.de
Sent: 12:35 Wednesday, 30 November 2011
Subject: Re: [Moses-support] Removing duplicates when merging nbest lists for MERT

> Hi Thomas
>
> Yes, you're correct, mert doesn't remove duplicates in the nbest lists.
> [...]
> You can only consider hypotheses to be duplicates if they have the same target text, and the same feature values.
> [...]
> For pro however, duplicates could make a difference to the optimisation, as they affect the sampling.
> [...]
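The idea of storing a weight for each list entry, so that pro-style sampling could still see how often a hypothesis occurred, might look roughly like the following Python sketch. It is an assumption-laden illustration rather than anything from the Moses code base: the function name dedupe_with_weights and the tuple layout (sentence id, feature vector, score vector) are hypothetical, and the weight is simply the number of collapsed copies.

```python
def dedupe_with_weights(entries):
    """Collapse entries with identical feature and score vectors, counting copies.

    Each entry is (sentence_id, features, scores); the target text is ignored
    on purpose, following the criterion proposed in the message above.
    Returns (sentence_id, features, scores, weight) tuples in first-seen order.
    """
    counts = {}
    for sentence_id, features, scores in entries:
        key = (sentence_id, tuple(features), tuple(scores))
        counts[key] = counts.get(key, 0) + 1
    return [
        (sentence_id, list(features), list(scores), weight)
        for (sentence_id, features, scores), weight in counts.items()
    ]


merged = [
    (0, [-1.5, -2.0], [3, 4]),
    (0, [-1.5, -2.0], [3, 4]),  # same vectors: folded into the first entry
    (0, [-1.0, -2.5], [2, 4]),
]
for entry in dedupe_with_weights(merged):
    print(entry)
```

A sampler could then draw each entry with probability proportional to its weight, which is one way to approximate sampling from the list with duplicates left in.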
Re: [Moses-support] Scripts that are not installed
The script merge_alignment.py should be in the scripts directory too, to run with mgiza.

On 11/30/11, Kenneth Heafield mo...@kheafield.com wrote:
> Dear Moses,
>
> The following files exist in scripts or are compiled in scripts but were not installed by the pre-existing Makefile. The Jamfile currently just installs the same files when passed --install-scripts. Should these files be added?
> [...]

-- 
Regards,
John J Morgan
Re: [Moses-support] Scripts that are not installed
On the assumption that the set of released files probably wasn't maintained, I went ahead and threw most everything in. It now covers everything except these directories:

    tests, regression-testing, other, bin (generated by bjam)

and these source files:

    *.(c|cpp|h|pbxproj|vcxproj|sln|xcodeproj|vcproj|missing_bin_dir)
    Jamfile

On 11/30/11 19:51, John Morgan wrote:
> The script merge_alignment.py should be in the scripts directory too to run with mgiza.
Re: [Moses-support] Scripts that are not installed
Hi John

This script is part of mgiza, rather than moses, so it should be copied to the --with-giza directory, along with mgizapp.

cheers - Barry

On Wednesday 30 Nov 2011 19:51:18 John Morgan wrote:
> The script merge_alignment.py should be in the scripts directory too to run with mgiza.
Re: [Moses-support] Scripts that are not installed
I'd prefer to handle GIZA++ like any other dependency: installed in a read-only directory.

On 11/30/11 20:46, Barry Haddow wrote:
> Hi John
>
> This script is part of mgiza, rather than moses, so it should be copied to the --with-giza directory, along with mgizapp
>
> cheers - Barry
Re: [Moses-support] --mgiza vs. --parallel
Er, no. As far as I can observe, --parallel parallelizes the tasks where that is possible into 2 threads, one for e2f, one for f2e. For step 2, giza++, it starts two threads, one for e2f and one for f2e. Then _that_ is parallelized by mgiza into whatever number of processes I tell it to (the original idea for mgiza is one process per CPU/CPU core). I believe the solution would be not to parallelize step 2 from train-model.perl but leave it to mgiza if we're training with that. (Until then the workaround is to halve the number of cpus in the mgiza parameters, I guess.) I have yet to (and will soon) test the performance differences between 16 processes on an 8-core system versus 8 processes on an 8-core system. In theory, I guess the latter should fare somewhat better. Best regards, Tamas
2011/11/30 Kenneth Heafield mo...@kheafield.com: Wouldn't it run 64 processes?
On 11/30/11 21:31, Kádár Tamás (KTamas) wrote: Hi With the option --parallel and running mgiza with whatever number of cpus, the training script parallelizes mgiza too... so if I run --mgiza --mgiza-cpus 8 --parallel on an 8-core machine, I actually get 16 giza processes. Shouldn't it either a) warn you about this behavior or b) not parallelize mgiza? Makes no sense to me to parallelize something that is already parallelized :) Just my $.02. Best regards, Tamas
___ Moses-support mailing list Moses-support@mit.edu http://mailman.mit.edu/mailman/listinfo/moses-support
Re: [Moses-support] --mgiza vs. --parallel
Hi Tamas The behaviour seems reasonable to me. The flag --parallel tells train-model.perl to run processes simultaneously if possible, so it will run giza in both directions at the same time. The --mgiza-cpus is just for mgiza. Why not just specify 4 cpus for mgiza, if you only have 8 in total?
> --mgiza --mgiza-cpus 8 --parallel on an 8-core machine, I actually get 16 giza processes.
That's odd, I'd expect 2 giza processes, each with 8 threads. cheers - Barry
On Wednesday 30 Nov 2011 22:05:40 Kádár Tamás (KTamas) wrote: Er, no. As far as I can observe, --parallel parallelizes the tasks where that is possible into 2 threads, one for e2f, one for f2e. For step 2, giza++, it starts two threads, one for e2f and one for f2e. Then _that_ is parallelized by mgiza into whatever number of processes I tell it to (the original idea for mgiza is one process per CPU/CPU core). I believe the solution would be not to parallelize step 2 from train-model.perl but leave it to mgiza if we're training with that. (Until then the workaround is to halve the number of cpus in the mgiza parameters, I guess.) I have yet to (and will soon) test the performance differences between 16 processes on an 8-core system versus 8 processes on an 8-core system. In theory, I guess the latter should fare somewhat better. Best regards, Tamas
2011/11/30 Kenneth Heafield mo...@kheafield.com: Wouldn't it run 64 processes?
On 11/30/11 21:31, Kádár Tamás (KTamas) wrote: Hi With the option --parallel and running mgiza with whatever number of cpus, the training script parallelizes mgiza too... so if I run --mgiza --mgiza-cpus 8 --parallel on an 8-core machine, I actually get 16 giza processes. Shouldn't it either a) warn you about this behavior or b) not parallelize mgiza? Makes no sense to me to parallelize something that is already parallelized :) Just my $.02.
Best regards, Tamas ___ Moses-support mailing list Moses-support@mit.edu http://mailman.mit.edu/mailman/listinfo/moses-support
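Barry's suggestion (give mgiza 4 CPUs on an 8-core machine) can be sketched as a small calculation. This is only an illustration: the function name `mgiza_cpus_per_direction` is hypothetical, and the assumption that --parallel runs exactly two alignment directions at once is taken from the thread, not from the train-model.perl source.

```shell
#!/bin/sh
# Sketch: pick a --mgiza-cpus value so that the two alignment directions
# started by --parallel together use roughly the available cores.
# mgiza_cpus_per_direction is a hypothetical helper, not part of Moses.
mgiza_cpus_per_direction() {
  total=$1        # total cores on the machine
  directions=2    # assumption from the thread: e2f and f2e run at the same time
  n=$(( total / directions ))
  if [ "$n" -lt 1 ]; then
    n=1           # never ask for fewer than one CPU
  fi
  echo "$n"
}

cores=8
echo "--mgiza --mgiza-cpus $(mgiza_cpus_per_direction "$cores") --parallel"
```

On an 8-core machine this prints `--mgiza --mgiza-cpus 4 --parallel`, which matches the workaround of halving the CPU count.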
Re: [Moses-support] --mgiza vs. --parallel
Seems the right way would be to account for the number of CPUs used by a process. But then you've started writing PBS. Wrap the mgiza command with flock?
On 11/30/11 22:19, Barry Haddow wrote: Hi Tamas The behaviour seems reasonable to me. The flag --parallel tells train-model.perl to run processes simultaneously if possible, so it will run giza in both directions at the same time. The --mgiza-cpus is just for mgiza. Why not just specify 4 cpus for mgiza, if you only have 8 in total?
> --mgiza --mgiza-cpus 8 --parallel on an 8-core machine, I actually get 16 giza processes.
That's odd, I'd expect 2 giza processes, each with 8 threads. cheers - Barry
On Wednesday 30 Nov 2011 22:05:40 Kádár Tamás (KTamas) wrote: Er, no. As far as I can observe, --parallel parallelizes the tasks where that is possible into 2 threads, one for e2f, one for f2e. For step 2, giza++, it starts two threads, one for e2f and one for f2e. Then _that_ is parallelized by mgiza into whatever number of processes I tell it to (the original idea for mgiza is one process per CPU/CPU core). I believe the solution would be not to parallelize step 2 from train-model.perl but leave it to mgiza if we're training with that. (Until then the workaround is to halve the number of cpus in the mgiza parameters, I guess.) I have yet to (and will soon) test the performance differences between 16 processes on an 8-core system versus 8 processes on an 8-core system. In theory, I guess the latter should fare somewhat better. Best regards, Tamas
2011/11/30 Kenneth Heafield mo...@kheafield.com: Wouldn't it run 64 processes?
On 11/30/11 21:31, Kádár Tamás (KTamas) wrote: Hi With the option --parallel and running mgiza with whatever number of cpus, the training script parallelizes mgiza too... so if I run --mgiza --mgiza-cpus 8 --parallel on an 8-core machine, I actually get 16 giza processes. Shouldn't it either a) warn you about this behavior or b) not parallelize mgiza?
Makes no sense to me to parallelize something that is already parallelized :) Just my $.02. Best regards, Tamas ___ Moses-support mailing list Moses-support@mit.edu http://mailman.mit.edu/mailman/listinfo/moses-support
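The flock idea could look roughly like the sketch below, assuming the `flock` utility from util-linux is installed. The helper `run_exclusive` is hypothetical, and the `sh -c` jobs are placeholders for the real mgiza invocations, which the thread does not show. Both jobs take the same lock file, so the second one waits until the first has finished.

```shell
#!/bin/sh
# Sketch: serialize two jobs with flock(1) so they never run at the same time.
# run_exclusive is a hypothetical wrapper; the echo/sleep commands stand in
# for the two mgiza runs (e2f and f2e).
run_exclusive() {
  lock=$1
  shift
  flock "$lock" "$@"   # hold an exclusive lock on $lock while the command runs
}

lock=$(mktemp)
log=$(mktemp)

# Both jobs are started in the background, as --parallel would do,
# but the shared lock forces them to run one after the other.
run_exclusive "$lock" sh -c "echo 'e2f start' >> $log; sleep 1; echo 'e2f end' >> $log" &
run_exclusive "$lock" sh -c "echo 'f2e start' >> $log; sleep 1; echo 'f2e end' >> $log" &
wait

cat "$log"
rm -f "$lock" "$log"
```

Which direction gets the lock first is not determined, but the start and end lines of one job are never interleaved with the other's. Note that this gives up the concurrency --parallel provides for that step, so each mgiza run can then use all cores.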
Re: [Moses-support] add new feature into decoder
Hi Thu Can you send me your code via personal mail and I'll try and see what's wrong. Best if it's on GitHub so I can just pull it. Hieu Sent from my flying horse
On 30 Nov 2011, at 06:52 PM, Hoai-Thu Vuong thuv...@gmail.com wrote: My feature has value 0. I did exactly as the manual says, but the evaluate function of my feature isn't called.
On Wed, Nov 30, 2011 at 6:38 PM, Barry Haddow bhad...@staffmail.ed.ac.uk wrote: Hi Thu Have you verified that your feature is being called, and inserting values into the feature vector? What do you mean when you say that "element of score in ScoreComponent doesn't change"? Do you mean that your new feature has value 0? Or is it missing? To add a feature to the moses chart decoder, you have to implement EvaluateChart() (which may not be documented) and make sure that StaticData constructs and registers the feature (which is documented). cheers - Barry
On Wednesday 30 Nov 2011 10:55:31 Hoai-Thu Vuong wrote: Hi everyone, I currently want to add a new feature to the moses decoder with chart decoding. I followed the documentation, but it does not work: the total score and the feature's element in ScoreComponent don't change. I used GetScoreBreakDown() and printed all values in the vector, and saw that the value of the new feature didn't change. Could you please give me some suggestions for adding a new feature to the moses decoder with chart decoding? Thank you very much. -- Thu.
___ Moses-support mailing list Moses-support@mit.edu http://mailman.mit.edu/mailman/listinfo/moses-support
[Moses-support] creating LM with IRST toolkit
hi all can anyone tell me if creating LM with the IRST toolkit is integrated into the EMS yet? if not, is this the entirety of what has to be run?
cat $CORPUSFILE | $IRSTLM/bin/add-start-end.sh | gzip -c > temp/monolingual.setagged.gz
$IRSTLM/bin/build-lm.sh -t stat4 -i "gunzip -c temp/monolingual.setagged.gz" -n 5 -p -o temp/iarpa.gz -k 10
$IRSTLM/bin/compile-lm temp/iarpa.gz --text yes /dev/stdout | gzip -c > $LMFILE
___ Moses-support mailing list Moses-support@mit.edu http://mailman.mit.edu/mailman/listinfo/moses-support
Re: [Moses-support] creating LM with IRST toolkit
Hi Hieu
On Dec 1, 2011, at 8:34 AM, Hieu Hoang wrote:
> hi all can anyone tell me if creating LM with the IRST toolkit is integrated into the EMS yet?
I'll let someone else answer this point.
> if not, is this the entirety of what has to be run?
> cat $CORPUSFILE | $IRSTLM/bin/add-start-end.sh | gzip -c > temp/monolingual.setagged.gz
> $IRSTLM/bin/build-lm.sh -t stat4 -i "gunzip -c temp/monolingual.setagged.gz" -n 5 -p -o temp/iarpa.gz -k 10
> $IRSTLM/bin/compile-lm temp/iarpa.gz --text yes /dev/stdout | gzip -c > $LMFILE
Yes, this is the procedure to train an LM with IRSTLM. If your corpus is not too big and fits in memory, you can use the tlm command to estimate the LM and directly store it in binary format (skipping the compile-lm step). Please see the IRSTLM manual for details on its usage, and send further questions directly to the irstlm mailing list: user-irs...@list.fbk.eu best Nicola
___ Moses-support mailing list Moses-support@mit.edu http://mailman.mit.edu/mailman/listinfo/moses-support
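For repeated use, the three steps confirmed above can be generated from one place. The sketch below only prints the command lines so they can be inspected before running; the function name `irstlm_plan` and the default `temp` work directory are assumptions for illustration, and the `>` redirections and the quotes around the `-i` argument are reconstructed, since the archived message appears to have lost them. Paths containing spaces are not handled.

```shell
#!/bin/sh
# Sketch: print the three IRSTLM steps from the thread for a given corpus
# and output file. irstlm_plan is a hypothetical helper, not part of IRSTLM.
irstlm_plan() {
  corpus=$1                 # plain-text monolingual corpus
  lmfile=$2                 # gzipped LM to produce
  work=${3:-temp}           # scratch directory (assumed default)
  irstlm=${IRSTLM:?set IRSTLM to the IRSTLM install directory}
  tagged=$work/monolingual.setagged.gz
  ilm=$work/iarpa.gz
  printf '%s\n' \
    "cat $corpus | $irstlm/bin/add-start-end.sh | gzip -c > $tagged" \
    "$irstlm/bin/build-lm.sh -t stat4 -i \"gunzip -c $tagged\" -n 5 -p -o $ilm -k 10" \
    "$irstlm/bin/compile-lm $ilm --text yes /dev/stdout | gzip -c > $lmfile"
}

# Example: review the plan first, then pipe it to sh to actually run it.
IRSTLM=${IRSTLM:-/path/to/irstlm}
irstlm_plan corpus.txt lm.gz
```

Printing rather than executing keeps the sketch safe to try without IRSTLM installed; `irstlm_plan corpus.txt lm.gz | sh -e` would run the steps in order and stop at the first failure.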