GitHub user mjpost opened a pull request: https://github.com/apache/incubator-joshua/pull/43

    Phrase-based decoder rewrite

    The phrase-based decoder used to add nonterminals to every phrase-based rule, treating all such rules as left-branching ones. This was a hassle because everything had to be converted, e.g., after extracting from Thrax.

    Now, the phrase tables have no nonterminals on the source and target sides. Instead, glue rules are used.

    As a result, this change is not backwards compatible. Phrase-based language packs will have to be recompiled, but this needs to be done anyway.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/incubator-joshua JOSHUA-284

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-joshua/pull/43.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #43

----
commit dcc7e7ee72228de08b70003a49344c2614eaedbe
Author: Matt Post <p...@cs.jhu.edu>
Date:   2016-08-16T22:13:06Z

    large commit converting phrase-based decoding to new rule format

    Not working yet, but much of the code is redone and future estimates are being computed correctly

commit 32504c47bbc90b3fd4a8d02298b9758fa8126a44
Author: Matt Post <p...@cs.jhu.edu>
Date:   2016-08-16T22:13:50Z

    updated scripts to work with the new format

commit 48a9aad7873b969230aad90d6e0c61e13ae2d2b4
Author: Matt Post <p...@cs.jhu.edu>
Date:   2016-08-16T22:14:15Z

    repacked the grammar

commit dac822d15145614c33f5fb12d2797e1f91825bb3
Author: Matt Post <p...@cs.jhu.edu>
Date:   2016-08-17T10:23:57Z

    missed file in commit

commit b1ec62711d15f3b692b6a7026752123f75522f6e
Author: Matt Post <p...@cs.jhu.edu>
Date:   2016-08-17T10:24:07Z

    enabled test

commit 1022699cc744fa9fbc21f4b19122f51e3985a371
Author: Matt Post <p...@cs.jhu.edu>
Date:   2016-08-17T10:24:46Z

    temporary commenting-out of very verbose output

commit 2e746c1864ca7eb6be27f2fca3ab258c9ebe7adb
Author: Matt Post <p...@cs.jhu.edu>
Date:   2016-08-19T18:14:18Z

    changed order of assert() args

commit 048b2e30f849de3f1ac82e6017ea2aab299f6b8d
Author: Matt Post <p...@cs.jhu.edu>
Date:   2016-08-19T18:15:18Z

    removed RHS nonterminal

commit af4ef88d5a6a6a1cc4167ec421b4b6bd1a91dc0a
Author: Matt Post <p...@cs.jhu.edu>
Date:   2016-08-19T18:15:36Z

    added derived directories

commit 9b73d6147a61580058cc57c86c1f623f44b7452a
Author: Matt Post <p...@cs.jhu.edu>
Date:   2016-08-19T18:16:47Z

    build two nodes over terminal productions

commit 5719c8cff728499bffd1053462351340f1d91353
Author: Matt Post <p...@cs.jhu.edu>
Date:   2016-08-19T18:17:21Z

    fixed distortion computation to work with new format

    code now produces a translation on my test case, though it's not the correct one

commit eb00223870c7683cf8e557ab689a1979fb36ec1d
Author: Matt Post <p...@cs.jhu.edu>
Date:   2016-08-20T00:43:58Z

    converted from span -> separate i, j

commit 473b3016562677671f70a19cd15d67a2bc1a5c83
Author: Matt Post <p...@cs.jhu.edu>
Date:   2016-08-20T00:44:14Z

    off-by-one error in computing future estimates

commit 574cb36b5e1b610e37eda81d6d76b4318c141a4c
Author: Matt Post <p...@cs.jhu.edu>
Date:   2016-08-20T00:44:44Z

    bugfix: this is (probably) supposed to return the pruning estimate

commit 16d5647bee30345ffa56b5b7d5bebc1021afa3fa
Author: Matt Post <p...@cs.jhu.edu>
Date:   2016-08-20T00:45:12Z

    fixed computation of distortion

commit 36cde50ba37df9c9b2ead6b063ac5935e3dd253d
Author: Matt Post <p...@cs.jhu.edu>
Date:   2016-08-20T13:30:42Z

    moved comparator into Candidate

commit 49dbf8cbaf2f1e0c648f8eb705ab3887aa06b039
Author: Matt Post <p...@cs.jhu.edu>
Date:   2016-08-20T13:31:18Z

    removed nonterminals from OOV rules

commit e3b60ca9a7fea7d25a8533b630a1a66d29349a6f
Author: Matt Post <p...@cs.jhu.edu>
Date:   2016-08-21T11:53:26Z

    minor cleanup of assignment logic

commit 293db94c2853f7dc15bd6fecdf3b39bd3a4b4965
Author: Matt Post <p...@cs.jhu.edu>
Date:   2016-08-21T11:53:41Z

    Bug fix in reporting inside cost - everything now works

commit 0e49bc537b05549930802bf6c187b849c4c67adb
Author: Matt Post <p...@cs.jhu.edu>
Date:   2016-08-21T11:55:55Z

    added debug-joshua which sets debugging and uses classes instead of the jar

    This means you can run the command-line version while in Eclipse without having to rebuild the jar file (which is time-consuming).

commit d6820c6f3bc41ca87dfff4a8ed18172de4f849e6
Author: Matt Post <p...@cs.jhu.edu>
Date:   2016-08-21T12:01:17Z

    removed debugging output

commit cd3ff0c6d0d2ad959cd4f292d9ee02f4e7da8b0a
Author: Matt Post <p...@cs.jhu.edu>
Date:   2016-08-21T12:01:38Z

    removed alignments from test, created new test with alignments (currently not working...)

commit 25d28fe2ce32a4b130a4412e982d6e16d5af8afc
Author: Matt Post <p...@cs.jhu.edu>
Date:   2016-08-21T12:24:58Z

    small cleanup

commit d28b4f39c578197803beba2c376db5ed95774576
Author: Matt Post <p...@cs.jhu.edu>
Date:   2016-08-21T17:36:37Z

    Merge branch 'master' into JOSHUA-284

commit 12b834e271a361417cbdabf79036538493cdb122
Author: Matt Post <p...@cs.jhu.edu>
Date:   2016-08-22T20:59:43Z

    Now building HGNode over the phrase when it's added

    This should be much quicker because the HGNode gets built only once, when the target phrases are added, instead of building it many times, each time they are used

commit bf12adc8b8e130c9f9addc69f47e9cf7e0774f72
Author: Matt Post <p...@cs.jhu.edu>
Date:   2016-08-22T21:26:44Z

    added note about phrase-based change to CHANGELOG

----

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---
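
For illustration only, the rule-format change described in the pull request (phrase rules no longer carry a nonterminal on the source and target sides) might look roughly like the sketch below. It assumes a `|||`-delimited text rule in which the old format prefixed both sides with a leading nonterminal such as `[X,1]`. That layout, the sample rule, and the helper name `strip_leading_nonterminal` are assumptions made for this example and are not taken from the pull request. As the description says, real phrase-based language packs have to be recompiled; this is not a conversion tool.

```python
import re

# Assumed old-style prefix: a leading nonterminal such as "[X,1]" on a rule side.
# The exact on-disk format is an assumption made for this illustration.
_LEADING_NT = re.compile(r"^\[[^\]\s]+,\d+\]\s*")


def strip_leading_nonterminal(rule: str) -> str:
    """Drop a leading nonterminal from the source and target sides of a rule.

    Hypothetical helper: expects "LHS ||| source ||| target ||| features...".
    The left-hand side and any trailing fields are left untouched.
    """
    fields = [field.strip() for field in rule.split("|||")]
    if len(fields) < 3:
        raise ValueError("expected at least LHS, source and target fields")
    fields[1] = _LEADING_NT.sub("", fields[1])  # source side
    fields[2] = _LEADING_NT.sub("", fields[2])  # target side
    return " ||| ".join(fields)


if __name__ == "__main__":
    old_rule = "[X] ||| [X,1] el gato ||| [X,1] the cat ||| 0.5 1.2"
    print(strip_leading_nonterminal(old_rule))
```

A rule that already has no leading nonterminal passes through unchanged, which is the property the new format relies on: the phrase table holds plain phrase pairs, and combination is left to the glue rules.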