speaking of cobbling together a good translation from imperfect parts:

google:
A motorist heard on the radio the announcement: "Caution Caution On the N9 you will encounter a ghost driver Please drive far right and do not overtake!.!" The driver: "What do you mean a dozens dozens?!""

microsoft:
"A motorist hears the announcement on the radio: 'warning! Caution! On the N9, a (s) satisfies you. Go quite right and not overtake!" The car driver: "what do you mean one? Dozens! Dozens!"

:)
~amittai

On 6/19/15 10:19, Marcin Junczys-Dowmunt wrote:
> German joke:
>
> Ein Autofahrer hört im Radio die Durchsage: "Achtung! Achtung! Auf der N9 kommt Ihnen ein Geisterfahrer entgegen. Fahren Sie bitte ganz rechts und überholen Sie nicht!"
> Der Autofahrer: "Was heißt hier einer? Dutzende! Dutzende!"
>
> On 2015-06-19 16:12, Read, James C wrote:
>
>> So we've gone from
>>
>> 1) Acknowledging that the search algorithm performs poorly with no LM, tuning or pruning despite the fact the search space clearly contains high quality translations
>>
>> 2) to a public display of en-masse reluctance to acknowledge that such is an undesirable quality of the system
>>
>> 3) to resorting to censorship not only in the literature but also on a public mailing list rather than acknowledge point 2.
>>
>> And your conclusion is that after being a witness to such behaviour I would still have a desire to contribute to this field?!? Why YES. I would love to keep banging my head against a brick wall. I have no other preferred pastimes.
>>
>> James
>>
>> ------------------------------------------------------------------------
>> *From:* Lane Schwartz <dowob...@gmail.com>
>> *Sent:* Friday, June 19, 2015 5:04 PM
>> *To:* Read, James C
>> *Cc:* Philipp Koehn; Burger, John D.; moses-support@mit.edu
>> *Subject:* Re: [Moses-support] Major bug found in Moses
>> James,
>> You may see the techniques that exist as outdated, wrong-headed, and inefficient. You have the right to hold that opinion. It may even be that history proves you right.
>> Progress in science is made by people posing questions - often questions that challenge the status quo - and then doing experiments to answer those questions.
>> However, it is incumbent upon you, the proponent of a new idea, to design good experiments to attempt to prove or disprove your new hypothesis. Dispassionately showing the relative merits and shortcomings of your technique with the existing state of the art is part of that process.
>> I, along with numerous other people on this list, have attempted in good faith to answer your questions, and to provide you with our perspective based on our collective understanding of the problem.
>> You, in turn, have responded belligerently.
>> I suggest that you have a frank conversation with your academic advisor or other appropriate mentor regarding your future. If you intend to pursue a successful career in science, academia, government, or industry, you would do well to reconsider the manner in which you interact with other people, especially people with whom you disagree.
>> In the meantime, I would respectfully request that until you learn how to respectfully interact with other adults that you refrain from posting to this mailing list.
>> Sincerely,
>> Lane Schwartz
>>
>> On Fri, Jun 19, 2015 at 8:45 AM, Read, James C <jcr...@essex.ac.uk> wrote:
>>
>> According to your book which I have on my desk the job of the TM is to model the most likely translations and the job of the decoder is to intelligently search the space of translations to find the most likely one/s (I'm paraphrasing of course).
>>
>> Would you like to retract that position and republish a next edition of your book which openly states that Moses when used with no LM or tuning or pruning can and should be expected to perform very poorly and select only the least likely translations?
>>
>> Don't you in the slightest find it worrying that like at least 90% of your code base could be thrown out of the window and high scoring results can be obtained with a simple phrase pair based rule based system?
>>
>> Which would you prefer? Would you prefer to consume computational resources calculating probabilities or get straight to the answer with simple logic and low computational requirements?
>>
>> BE HONEST!
>>
>> James
>>
>> ------------------------------------------------------------------------
>> *From:* moses-support-boun...@mit.edu on behalf of Philipp Koehn <p...@jhu.edu>
>> *Sent:* Thursday, June 18, 2015 9:39 PM
>> *To:* Burger, John D.
>> *Cc:* moses-support@mit.edu
>> *Subject:* Re: [Moses-support] Major bug found in Moses
>> Hi,
>> I am a great fan of open source software, but there is a danger to view its inner workings as a black box - which leads to strange theories of what is going on, instead of real understanding.
>> But we can try to understand it.
>> In the reported experiment, the language model was removed, while the rest of the system was left unchanged.
>> The default untuned weights that train-model.perl assigns to a model are the following:
>> WordPenalty0= -1
>> PhrasePenalty0= 0.2
>> TranslationModel0= 0.2 0.2 0.2 0.2
>> Distortion0= 0.3
>> Since no language model is used, a positive distortion cost will lead the decoder to not use any reordering at all. That's a good thing in this case.
>> The word penalty is used to counteract the language model's preference for short translations. Unchecked, there is now a bias towards too long translations.
>> Then there is the translation model with its equal weights for p(e|f) and p(f|e). The p(e|f) weight and scores are fine and well.
>> However, p(f|e) only makes sense if you have the Bayes theorem in your mind and a language model in your back. But in the reported setup, there is now a bias to translate into rare English phrases, since these will have high p(f|e) scores.
>> My best guess is that the reported setup translates common function words (such as prepositions) into very long rare English phrases - word penalty likes it, p(f|e) likes it, p(e|f) does not mind enough - which produces a lot of rubbish.
>> By filtering for p(e|f) those junky phrases are removed from the phrase table, restricting the decoder to more reasonable choices.
>> I contend that this is not a bug in the software, but a bug in usage.
>> -phi
>>
>> On Thu, Jun 18, 2015 at 11:32 AM, Burger, John D. <j...@mitre.org> wrote:
>>
>> On Jun 17, 2015, at 11:54, Read, James C <jcr...@essex.ac.uk> wrote:
>>
>> > The question remains why isn't the system capable of finding the most likely translations without the LM?
>>
>> Even if it weren't ill-posed, I don't find this to be an interesting question at all. This is like trying to improve automobile transmissions by disabling the steering. These are the parts we have, and they all work together.
>>
>> It's not as if human translators don't use their own internal language models.
>>
>> - John Burger
>> MITRE
>>
>> > Evidently, if you filter the phrase table then the LM is not as important as you might feel. The question remains why isn't the system capable of finding the most likely translations without the LM? Why do I need to filter to help the system find them? This is undesirable behaviour. Clearly a bug.
>> >
>> > I include the code I used for filtering. As you can see the 4th score only was used as a filtering criterion.
>> >
>> > #!/usr/bin/perl -w
>> > #
>> > # Program filters phrase table to leave only phrase pairs
>> > # with probability above a threshold
>> > #
>> > use strict;
>> > use warnings;
>> > use Getopt::Long;
>> >
>> > my $phrase;
>> > my $min;
>> > my $phrase_table;
>> > my $filtered_table;
>> >
>> > GetOptions( 'min=f' => \$min,
>> >             'out=s' => \$filtered_table,
>> >             'in=s' => \$phrase_table);
>> > die "ERROR: must give threshold and phrase table input file and output file\n" unless ($min && $phrase_table && $filtered_table);
>> > die "ERROR: file $phrase_table does not exist\n" unless (-e $phrase_table);
>> > open (PHRASETABLE, "<$phrase_table") or die "FATAL: Could not open phrase table $phrase_table\n";
>> > open (FILTEREDTABLE, ">$filtered_table") or die "FATAL: Could not open phrase table $filtered_table\n";
>> >
>> > while (my $line = <PHRASETABLE>)
>> > {
>> >     chomp $line;
>> >     my @columns = split ('\|\|\|', $line);
>> >
>> >     # check that file is a well formatted phrase table
>> >     if (scalar @columns < 4)
>> >     {
>> >         die "ERROR: input file is not a well formatted phrase table. A phrase table must have at least four columns each column separated by |||\n";
>> >     }
>> >
>> >     # get the probability and check it is greater than the threshold
>> >     # NOTE: if the score column begins with whitespace (as in ' ||| 0.1 ...'),
>> >     # split /\s+/ yields an empty first element, so $scores[3] is then the third score
>> >     my @scores = split /\s+/, $columns[2];
>> >     if ($scores[3] > $min)
>> >     {
>> >         print FILTEREDTABLE $line."\n";
>> >     }
>> > }
>> >
>> > From: Matt Post <p...@cs.jhu.edu>
>> > Sent: Wednesday, June 17, 2015 5:25 PM
>> > To: Read, James C
>> > Cc: Marcin Junczys-Dowmunt; moses-support@mit.edu; Arnold, Doug
>> > Subject: Re: [Moses-support] Major bug found in Moses
>> >
>> > I think you are misunderstanding how decoding works. The highest-weighted translation of each source phrase is not necessarily the one with the best BLEU score.
>> > This is why the decoder retains many options, so that it can search among them (together with their reorderings). The LM is an important component in making these selections.
>> >
>> > Also, how did you weight the many probabilities attached to each phrase (to determine which was the most probable)? The tuning phase of decoding selects weights designed to optimize BLEU score. If you weighted them evenly, that is going to exacerbate this experiment.
>> >
>> > matt
>> >
>> >> On Jun 17, 2015, at 10:22 AM, Read, James C <jcr...@essex.ac.uk> wrote:
>> >>
>> >> All I did was break the link to the language model and then perform filtering. How is that a methodological mistake? How else would one test the efficacy of the TM in isolation?
>> >>
>> >> I remain convinced that this is undesirable behaviour and therefore a bug.
>> >>
>> >> James
>> >>
>> >> From: Marcin Junczys-Dowmunt <junc...@amu.edu.pl>
>> >> Sent: Wednesday, June 17, 2015 5:12 PM
>> >> To: Read, James C
>> >> Cc: Arnold, Doug; moses-support@mit.edu
>> >> Subject: Re: [Moses-support] Major bug found in Moses
>> >>
>> >> Hi James
>> >> No, not at all. I would say that is expected behaviour. It's how search spaces and optimization work. If anything these are methodological mistakes on your side, sorry. You are doing weird things to the decoder and then you are surprised to get weird results from it.
>> >> On 2015-06-17 16:07, Read, James C wrote:
>> >>>
>> >>> So, do we agree that this is undesirable behaviour and therefore a bug?
>> >>>
>> >>> James
>> >>>
>> >>> From: Marcin Junczys-Dowmunt <junc...@amu.edu.pl>
>> >>> Sent: Wednesday, June 17, 2015 5:01 PM
>> >>> To: Read, James C
>> >>> Subject: Re: [Moses-support] Major bug found in Moses
>> >>>
>> >>> As I said.
>> >>> With an unpruned phrase table and a decoder that just optimizes some unreasonable set of weights all bets are off, so if you get very low BLEU points there, it's not surprising. It's probably jumping around in a very weird search space. With a pruned phrase table you restrict the search space VERY strongly. Nearly everything that will be produced is a half-decent translation. So yes, I can imagine that would happen.
>> >>> Marcin
>> >>> On 2015-06-17 15:56, Read, James C wrote:
>> >>> You would expect an improvement of 37 BLEU points?
>> >>>
>> >>> James
>> >>>
>> >>> From: Marcin Junczys-Dowmunt <junc...@amu.edu.pl>
>> >>> Sent: Wednesday, June 17, 2015 4:32 PM
>> >>> To: Read, James C
>> >>> Cc: Moses-support@mit.edu; Arnold, Doug
>> >>> Subject: Re: [Moses-support] Major bug found in Moses
>> >>>
>> >>> Hi James,
>> >>> there are many more factors involved than just probability, for instance word penalties, phrase penalties etc. To be able to validate your own claim you would need to set weights for all those non-probabilities to zero. Otherwise there is no hope that moses will produce anything similar to the most probable translation. And based on that there is no surprise that there may be different translations. A pruned phrase table will produce naturally less noise, so I would say the behaviour you describe is quite exactly what I would expect to happen.
>> >>> Best,
>> >>> Marcin
>> >>> On 2015-06-17 15:26, Read, James C wrote:
>> >>> Hi all,
>> >>>
>> >>> I tried unsuccessfully to publish experiments showing this bug in Moses behaviour. As a result I have lost interest in attempting to have my work published. Nonetheless I think you all should be aware of an anomaly in Moses' behaviour which I have thoroughly exposed and should be easy enough for you to reproduce.
>> >>>
>> >>> As I understand it the TM logic of Moses should select the most likely translations according to the TM. I would therefore expect a run of Moses with no LM to find sentences which are the most likely or at least close to the most likely according to the TM.
>> >>>
>> >>> To test this behaviour I performed two runs of Moses. One with an unfiltered phrase table, the other with a filtered phrase table which left only the most likely phrase pair for each source language phrase. The results were truly startling. I observed huge differences in BLEU score. The filtered phrase tables produced much higher BLEU scores. The beam size used was the default width of 100. I would not have been surprised if the differences in BLEU scores were minimal but they were quite high.
>> >>>
>> >>> I have been unable to find a logical explanation for this behaviour other than to conclude that there must be some kind of bug in Moses which causes a TM only run of Moses to perform poorly in finding the most likely translations according to the TM when there are less likely phrase pairs included in the race.
>> >>>
>> >>> I hope this information will be useful to the Moses community and that the cause of the behaviour can be found and rectified.
>> >>>
>> >>> James
>> >>>
>> >>> _______________________________________________
>> >>> Moses-support mailing list
>> >>> Moses-support@mit.edu
>> >>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>
>> --
>> When a place gets crowded enough to require ID's, social collapse is not far away. It is time to go elsewhere. The best thing about space travel is that it made it possible to go elsewhere.
>> -- R.A. Heinlein, "Time Enough For Love"
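
Philipp Koehn's explanation in the thread (default weights, no language model) can be illustrated with a small numeric sketch. The weights are the ones quoted in his message; everything else is an assumption made for illustration: the score order p(f|e), lex(f|e), p(e|f), lex(e|f), a word-penalty feature equal to minus the number of target words, a phrase-penalty feature of 1 per phrase, and the two hypothetical phrase pairs with invented probabilities. It is not Moses code and does not reproduce the reported experiment.

```python
import math
from dataclasses import dataclass

# Default untuned weights as quoted in the thread.
# Distortion0 is left out: the sketch scores one phrase pair, with no reordering.
DEFAULT_WEIGHTS = {
    "WordPenalty0": -1.0,
    "PhrasePenalty0": 0.2,
    "TranslationModel0": (0.2, 0.2, 0.2, 0.2),
}

# Marcin's suggestion: zero the weights of everything that is not a probability.
TM_ONLY_WEIGHTS = dict(DEFAULT_WEIGHTS, WordPenalty0=0.0, PhrasePenalty0=0.0)


@dataclass(frozen=True)
class PhrasePair:
    target: str
    # ASSUMED score order: p(f|e), lex(f|e), p(e|f), lex(e|f)
    scores: tuple


def model_score(pair, weights=DEFAULT_WEIGHTS):
    """Weighted sum of log-probabilities plus the two penalty features."""
    tm = sum(w * math.log(p)
             for w, p in zip(weights["TranslationModel0"], pair.scores))
    # ASSUMPTION: word-penalty feature = -(target words), phrase-penalty feature = 1.
    word_penalty = weights["WordPenalty0"] * -len(pair.target.split())
    phrase_penalty = weights["PhrasePenalty0"] * 1.0
    return tm + word_penalty + phrase_penalty


# HYPOTHETICAL candidates for one source function word; all numbers are invented.
SHORT_COMMON = PhrasePair("of", (0.3, 0.3, 0.5, 0.5))
LONG_RARE = PhrasePair("of the aforementioned committee 's",
                       (1.0, 0.2, 0.0001, 0.001))

if __name__ == "__main__":
    for name, weights in (("default", DEFAULT_WEIGHTS),
                          ("TM only", TM_ONLY_WEIGHTS)):
        best = max((SHORT_COMMON, LONG_RARE),
                   key=lambda p: model_score(p, weights))
        print(f"{name}: best = {best.target!r}")
```

Under these assumptions the long, rare phrase wins with the default weights, because each extra target word adds to the score and p(f|e) is high, even though its p(e|f) is far lower. With the penalty weights set to zero, or after filtering on p(e|f), the short common phrase is preferred, which matches the behaviour described in the thread.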