[Moses-support] Build failure

2016-11-16 Thread Tyler Wilbers
Dear Moses Community,

I have to install Moses for an SMT class this semester. I am using macOS
Sierra and installed the dependencies with Homebrew. I used the following
build command:

  ./bjam --with-boost=/usr/local/Cellar/boost/1.62.0 \
    --with-cmph=/usr/local/bin/cmph \
    --with-xmlrpc-c=/usr/local/Cellar/xmlrpc-c/1.39.07/ \
    --with-irstlm=~/SMT-dev/irstlm/irstlm-5.80.08/trunk \
    --with-mm --with-probing-pt \
    -j5 toolset=clang -q -d2

The build fails with a number of errors; the full build log is attached.

Best,

-T


build.log.gz
Description: GNU Zip compressed data


Re: [Moses-support] Does PhraseDictionaryMultiModel require all models to contain all phrases?

2016-11-16 Thread Michael Denkowski
Hi Lane,

As Vito mentioned, PhraseDictionaryMultiModel was originally designed for
linear interpolation.  It has an option to output all scores from all models
rather than interpolating them, but I ended up writing PhraseDictionaryGroup
to provide that specific functionality plus some additions.  One of the
additions is the "default-average-others" option, intended specifically for
disjoint phrase tables.  It fills in scores for phrase pairs missing from any
one model by averaging the scores from the other models.  Empirically, this
works better than zero-filling the scores or using a default score.
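As a made-up numeric illustration (the values are not from this thread): if
models A and B score a phrase pair 0.4 and 0.2 but the pair is absent from
model C, default-average-others fills in model C's value for that pair as
(0.4 + 0.2) / 2 = 0.3, rather than 0 or a fixed floor.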

Another addition (which was never fully documented) is the
"model-bitmap-count" option, which adds indicator features marking which
models contain a phrase.  For instance, with this option enabled for two
models, A and B, you get three features that take the following values:
- A phrase is in model A but *not* in model B: 1 0 0
- A phrase is *not* in model A but is in model B: 0 1 0
- A phrase is in *both* models A and B: 0 0 1
Empirically, these help as well, and batch MIRA does a fine job of finding
weights for the expanded feature set.
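For anyone wiring this up, a hypothetical moses.ini fragment combining both
options might look like the sketch below.  The table names, paths, and
feature counts are invented for illustration, and the exact parameter
spellings should be checked against the Moses source:

  [feature]
  PhraseDictionaryMemory name=TranslationModel0 num-features=4 path=/path/to/table-A input-factor=0 output-factor=0
  PhraseDictionaryMemory name=TranslationModel1 num-features=4 path=/path/to/table-B input-factor=0 output-factor=0
  PhraseDictionaryGroup name=Group0 members=TranslationModel0,TranslationModel1 default-average-others=true model-bitmap-count=true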

Best,
Michael

On Wed, Nov 16, 2016 at 11:32 AM, Vito Mandorino
<vito.mandor...@linguacustodia.com> wrote:

> Hi Lane,
>
> as far as I know, PhraseDictionaryMultiModel does a linear
> interpolation of two or more phrase tables at decoding time and is
> equivalent to linearly interpolating the phrase tables beforehand, except
> maybe for some pruning-related corner cases.
>
> If a translation option is not present in one of the phrase tables, it is
> considered as having zero (or very low, e.g. 10^-6) probability for that
> phrase table. I guess that the answer to your last question is yes, at
> least in the general case (phrase tables coming from training on a
> bilingual corpus).
>
> Vito
>
>
> 2016-11-16 16:41 GMT+01:00 Lane Schwartz:
>
>> Hi,
>>
>> I'm potentially interested in using PhraseDictionaryMultiModel, and in how
>> it differs from PhraseDictionaryGroup.
>>
>> With PhraseDictionaryMultiModel, is it OK to have disjoint phrase tables?
>> That is, can PhraseDictionaryMultiModel handle the situation where a
>> translation option is present in one table but not the other(s)?
>>
>> Thanks,
>> Lane
>>
>>
>
>
> --
> M. Vito MANDORINO -- Chief Scientist
>
> Lingua Custodia -- The Translation Trustee
>
> 1, Place Charles de Gaulle, 78180 Montigny-le-Bretonneux
>
> Tel: +33 1 30 44 04 23   Mobile: +33 6 84 65 68 89
>
> Email: vito.mandor...@linguacustodia.com
>
> Website: www.linguacustodia.finance
>


Re: [Moses-support] Does PhraseDictionaryMultiModel require all models to contain all phrases?

2016-11-16 Thread Vito Mandorino
Hi Lane,

as far as I know, PhraseDictionaryMultiModel does a linear interpolation
of two or more phrase tables at decoding time and is equivalent to linearly
interpolating the phrase tables beforehand, except maybe for some
pruning-related corner cases.

If a translation option is not present in one of the phrase tables, it is
considered as having zero (or very low, e.g. 10^-6) probability for that
phrase table. I guess that the answer to your last question is yes, at
least in the general case (phrase tables coming from training on a
bilingual corpus).

Vito


2016-11-16 16:41 GMT+01:00 Lane Schwartz:

> Hi,
>
> I'm potentially interested in using PhraseDictionaryMultiModel, and in how
> it differs from PhraseDictionaryGroup.
>
> With PhraseDictionaryMultiModel, is it OK to have disjoint phrase tables?
> That is, can PhraseDictionaryMultiModel handle the situation where a
> translation option is present in one table but not the other(s)?
>
> Thanks,
> Lane
>
>


--
M. Vito MANDORINO -- Chief Scientist

Lingua Custodia -- The Translation Trustee

1, Place Charles de Gaulle, 78180 Montigny-le-Bretonneux

Tel: +33 1 30 44 04 23   Mobile: +33 6 84 65 68 89

Email: vito.mandor...@linguacustodia.com

Website: www.linguacustodia.finance


[Moses-support] Does PhraseDictionaryMultiModel require all models to contain all phrases?

2016-11-16 Thread Lane Schwartz
Hi,

I'm potentially interested in using PhraseDictionaryMultiModel, and in how
it differs from PhraseDictionaryGroup.

With PhraseDictionaryMultiModel, is it OK to have disjoint phrase tables?
That is, can PhraseDictionaryMultiModel handle the situation where a
translation option is present in one table but not the other(s)?

Thanks,
Lane


Re: [Moses-support] Syntax-based Constrained Decoding

2016-11-16 Thread Hieu Hoang
Good to know that the constrained decoding works. And yes, the
reachability of the training data is only guaranteed in the absence of
pruning such as cube pruning, beams, etc.



On 15/11/2016 20:00, Shuoyang Ding wrote:

Hi Hieu,

I had made changes 1, 2, and 4 before emailing you, and the coverage didn't
change much. It turns out the bottleneck is beam-threshold: the
default value is 1e-5, which is a pretty tough limit for constrained
decoding.


After setting that to 0, I played around a little with the cube-pruning
limit. The coverage is around 25% to 40% depending on the limit you
use, but higher coverage comes with longer decoding time, which is
what one would expect.


Still, for string-to-tree constrained decoding the easiest approach may
still be to decode with phrase tables built per sentence, since that
decoding is generally slower. Even then, however, the default value
of beam-threshold needs to be overridden in order to make it work
properly.
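For example, overriding both settings at decoding time might look like the
line below, assuming the usual Moses convention that moses.ini parameters
can also be passed as command-line flags; the binary name, file names, and
pop-limit value are illustrative, not from this thread:

  moses -f moses.ini -beam-threshold 0 -cube-pruning-pop-limit 1000 < input.txt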


Hope the info helps.

Regards,
Shuoyang Ding

Ph.D. Student
Center for Language and Speech Processing
Department of Computer Science
Johns Hopkins University

Hackerman Hall 225A
3400 N. Charles St.
Baltimore, MD 21218

http://cs.jhu.edu/~sding 

On Oct 28, 2016, at 9:27 AM, Hieu Hoang wrote:


Good point. The decoder is set up to translate quickly, so there are a
few pruning parameters that throw out low-scoring rules or hypotheses.


These are some of the pruning parameters you'll need to change (there 
may be more):

  1. [feature]
  PhraseDictionaryWHATEVER table-limit=0
  2. [cube-pruning-pop-limit]
  100
  3. [beam-threshold]
  0
  4. [stack]
  100
Make the changes one at a time in case any of them makes decoding too slow,
even with constrained decoding.
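Put together with the feature that switches constrained decoding on, a
hypothetical moses.ini fragment might look like the sketch below; the
ConstrainedDecoding parameter spelling and the reference-file path are
assumptions to check against your Moses version:

  [feature]
  ConstrainedDecoding path=references.txt
  PhraseDictionaryWHATEVER table-limit=0

  [cube-pruning-pop-limit]
  100

  [beam-threshold]
  0

  [stack]
  100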


It may be that you have to run the decoder with phrase tables that
are trained on only one sentence at a time.


I'll be interested to know how you get on, so let me know how it goes.

On 26/10/2016 13:56, Shuoyang Ding wrote:

Hi All,

I'm trying to do syntax-based constrained decoding on the same data
from which I extracted my rules, and I'm getting very low coverage
(~12%). I'm using GHKM rule extraction, which in theory should be
able to reconstruct the target translation even with only minimal rules.


Judging from the search-graph output, the decoder seems to prune out
rules with very low scores, even if they are the only rules that can
reconstruct the original reference.


I'm curious whether there is a way in the current constrained decoding
implementation to disable pruning, or at least whether it would be
feasible to do so.


Thanks!

Regards,
Shuoyang Ding

Ph.D. Student
Center for Language and Speech Processing
Department of Computer Science
Johns Hopkins University

Hackerman Hall 225A
3400 N. Charles St.
Baltimore, MD 21218

http://cs.jhu.edu/~sding 


___
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support