[jira] [Commented] (JOSHUA-297) List supported versions of Hadoop

2016-08-24 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15436177#comment-15436177
 ] 

Lewis John McGibbney commented on JOSHUA-297:
-

The supported Hadoop version is 2.5.2; see the Thrax classpath:
https://github.com/joshua-decoder/thrax/blob/master/.classpath#L8


> List supported versions of Hadoop
> -
>
> Key: JOSHUA-297
> URL: https://issues.apache.org/jira/browse/JOSHUA-297
> Project: Joshua
> Issue Type: Task
> Reporter: Bob Paulin
> Assignee: Matt Post
> Priority: Minor
> Fix For: 6.1
>
> Attachments: thrax-hadoop0.20.2.log, thrax-hadoop2.6.4.log
>
>
> When working through the training tutorial, I noticed that no version of 
> Hadoop was listed, so I tried the latest, Hadoop 2.6.4. The Thrax job failed 
> on this version. It worked, however, with 0.20.2. I found that version on 
> http://joshua.incubator.apache.org/6.0/pipeline.html by hovering over a link 
> in the Hadoop section.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [jira] [Commented] (JOSHUA-304) word-align.conf alignment template file not compatible with berkeley aligner

2016-08-24 Thread Matt Post
It didn't regenerate. Try wiping out your rundir and starting over. 

matt (from my phone)

> On Aug 24, 2016, at 4:08 PM, Lewis John McGibbney (JIRA) wrote:
> 
> 
>[ 
> https://issues.apache.org/jira/browse/JOSHUA-304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435687#comment-15435687
>  ] 
> 
> Lewis John McGibbney commented on JOSHUA-304:
> -
> 
> [~post] unfortunately my local tests are still not coming up with anything 
> fruitful.
> {code}
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua(JOSHUA-304) $ 
> $JOSHUA/bin/pipeline.pl --type hiero --rundir 8 --readme "Baseline Hiero run 
> 8 --lm-gen berkeleylm --lm berkeleylm --aligner berkeley proposed bug fixed 
> in ../../scripts/training/paralign.pl" --source es --target en --lm-gen 
> berkeleylm --lm berkeleylm --aligner berkeley --corpus 
> $SPANISH/corpus/asr/callhome_train --corpus $SPANISH/corpus/asr/fisher_train 
> --tune  $SPANISH/corpus/asr/fisher_dev --test  
> $SPANISH/corpus/asr/callhome_devtest
> [train-copy-and-filter] cached, skipping...
> [train-tokenize-es] cached, skipping...
> [train-tokenize-en] cached, skipping...
> [train-trim] cached, skipping...
> [train-lowercase-es] cached, skipping...
> [train-lowercase-en] cached, skipping...
> [train-vocab-es] cached, skipping...
> [train-vocab-en] cached, skipping...
> [tune-copy-and-filter] cached, skipping...
> [tune-tokenize-es] cached, skipping...
> [tune-tokenize-en.0] cached, skipping...
> [tune-tokenize-en.1] cached, skipping...
> [tune-tokenize-en.2] cached, skipping...
> [tune-tokenize-en.3] cached, skipping...
> [tune-lowercase-es] cached, skipping...
> [tune-lowercase-en.0] cached, skipping...
> [tune-lowercase-en.1] cached, skipping...
> [tune-lowercase-en.2] cached, skipping...
> [tune-lowercase-en.3] cached, skipping...
> [tune-vocab-es] cached, skipping...
> [tune-vocab-en.0] cached, skipping...
> [tune-vocab-en.1] cached, skipping...
> [tune-vocab-en.2] cached, skipping...
> [tune-vocab-en.3] cached, skipping...
> [test-copy-and-filter] cached, skipping...
> [test-tokenize-es] cached, skipping...
> [test-tokenize-en] cached, skipping...
> [test-lowercase-es] cached, skipping...
> [test-lowercase-en] cached, skipping...
> [test-vocab-es] cached, skipping...
> [test-vocab-en] cached, skipping...
> [source-numlines] cached, skipping...
> [source-numlines] retrieved cached result =>   151810
> [berkeley-aligner-chunk-0] rebuilding...
>  dep=alignments/0/word-align.conf [CHANGED]
>  dep=/usr/local/incubator-joshua/8/data/train/splits/corpus.es.0 [NOT FOUND]
>  dep=/usr/local/incubator-joshua/8/data/train/splits/corpus.en.0 [NOT FOUND]
>  dep=alignments/0/training.align [NOT FOUND]
>  cmd=java -d64 -Xmx10g -jar 
> /usr/local/incubator-joshua/ext/berkeleyaligner/distribution/berkeleyaligner.jar
>  ++alignments/0/word-align.conf
>  JOB FAILED (return code 1)
> [aligner-combine] rebuilding...
>  dep=alignments/0/training.en-es.align [NOT FOUND]
>  dep=alignments/training.align [CHANGED]
>  cmd=cat alignments/0/training.en-es.align > alignments/training.align
>  JOB FAILED (return code 1)
> cat: alignments/0/training.en-es.align: No such file or directory
> {code}
> 
>> word-align.conf alignment template file not compatible with berkeley aligner
>> 
>> 
>> Key: JOSHUA-304
>> URL: https://issues.apache.org/jira/browse/JOSHUA-304
>> Project: Joshua
>> Issue Type: Bug
>> Components: alignment, berkeley, templates
>> Affects Versions: 6.0.5
>> Reporter: Lewis John McGibbney
>> Priority: Blocker
>> Fix For: 6.1
>> 
>> 
>> It took me quite some time to debug what was going on and why pipelines 
>> were failing when using the Berkeley aligner.
>> It turns out that the word-align.conf template provided at
>> https://github.com/apache/incubator-joshua/blob/master/scripts/training/templates/alignment/word-align.conf
>> is not compatible with the Berkeley aligner. 
>> In particular, the following lines are incompatible:
>> https://github.com/apache/incubator-joshua/blob/master/scripts/training/templates/alignment/word-align.conf#L12-L15
>> Evidence of this is provided below:
>> {code}
>> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 
>> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar 
>> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
>> Invalid enum: 'MODEL1 HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
>> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 
>> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar 
>> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
>> Invalid enum: 'MODEL1, HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
>> 

[jira] [Resolved] (JOSHUA-305) joshua-6.1-SNAPSHOT-source-release.zip takes ages to build

2016-08-24 Thread Lewis John McGibbney (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved JOSHUA-305.
-
Resolution: Not A Bug

This was due to a large language model being present within the joshua 
directory. This is not an issue.

> joshua-6.1-SNAPSHOT-source-release.zip takes ages to build
> --
>
> Key: JOSHUA-305
> URL: https://issues.apache.org/jira/browse/JOSHUA-305
> Project: Joshua
> Issue Type: Bug
> Components: build, core
> Affects Versions: 6.0.5
> Reporter: Lewis John McGibbney
> Priority: Blocker
> Fix For: 6.1
>
>
> When someone runs mvn clean install, the 
> joshua-6.1-SNAPSHOT-source-release.zip step takes absolutely ages to build. 
> We should investigate why this is the case.





[jira] [Created] (JOSHUA-305) joshua-6.1-SNAPSHOT-source-release.zip takes ages to build

2016-08-24 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created JOSHUA-305:
---

Summary: joshua-6.1-SNAPSHOT-source-release.zip takes ages to build
Key: JOSHUA-305
URL: https://issues.apache.org/jira/browse/JOSHUA-305
Project: Joshua
Issue Type: Bug
Components: build, core
Affects Versions: 6.0.5
Reporter: Lewis John McGibbney
Priority: Blocker
Fix For: 6.1


When someone runs mvn clean install, the joshua-6.1-SNAPSHOT-source-release.zip 
step takes absolutely ages to build. We should investigate why this is the case.





[jira] [Resolved] (JOSHUA-272) Simplify the packing and usage of phrase-based grammars

2016-08-24 Thread Matt Post (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Post resolved JOSHUA-272.
--
Resolution: Fixed

> Simplify the packing and usage of phrase-based grammars
> ---
>
> Key: JOSHUA-272
> URL: https://issues.apache.org/jira/browse/JOSHUA-272
> Project: Joshua
> Issue Type: Improvement
> Reporter: Matt Post
> Assignee: Matt Post
> Fix For: 6.1
>
>
> For historical reasons, phrase-based grammars add some complexity to 
> decoding. The complete tree under each top-level trie node in packed grammars 
> has to fit within a single packed grammar slice, which is limited to 2 GB 
> due to constraints on the size of Java byte[] arrays. We used to sort on just 
> the first item in the trie, which was a problem for phrase-based decoding, 
> since phrase-based rules are implemented as left-branching hierarchical 
> rules. In order to pack large grammars, we packed them without the leading 
> [X,1], and then added it when loading the grammars, both for the packed and 
> memory-based grammars. This was a real mess.
> This was all fixed with a commit a while ago that packs and reads packed 
> grammars based on the first two symbols on the source side. So we should 
> remove all the complexity associated with phrases. They should just be 
> regular rules. There is also a lot of redundancy across the codebase in 
> parsing rules, converting them to different formats, and so on.
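
The 2 GB figure follows from Java arrays being indexed by `int`, so a `byte[]` can hold at most `Integer.MAX_VALUE` bytes (just under 2 GiB). The sketch below illustrates why keying on the first two source-side symbols, rather than only the first, gives finer-grained groups that are easier to keep under that limit. It is not Joshua's packer: the class `SliceKeySketch`, its methods, and the toy rule strings are hypothetical.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; not Joshua's grammar packer.
final class SliceKeySketch {

    // Java arrays are indexed by int, so this is an upper bound on byte[] length.
    static final long MAX_SLICE_BYTES = Integer.MAX_VALUE;

    // Builds a grouping key from the first `depth` whitespace-separated source symbols.
    static String key(String sourceSide, int depth) {
        String[] symbols = sourceSide.trim().split("\\s+");
        int n = Math.min(depth, symbols.length);
        return String.join(" ", Arrays.copyOfRange(symbols, 0, n));
    }

    // Counts how many rules fall under each key.
    static Map<String, Integer> groupSizes(String[] sourceSides, int depth) {
        Map<String, Integer> sizes = new TreeMap<>();
        for (String s : sourceSides) {
            sizes.merge(key(s, depth), 1, Integer::sum);
        }
        return sizes;
    }

    public static void main(String[] args) {
        // Left-branching phrase rules all start with the same nonterminal.
        String[] rules = { "[X,1] la casa", "[X,1] el perro", "[X,1] la mesa" };
        System.out.println(groupSizes(rules, 1)); // {[X,1]=3}
        System.out.println(groupSizes(rules, 2)); // {[X,1] el=1, [X,1] la=2}
    }
}
```

With a depth of 1, every phrase rule lands in one group under `[X,1]`; with a depth of 2, the same rules spread across several smaller groups.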





[jira] [Commented] (JOSHUA-272) Simplify the packing and usage of phrase-based grammars

2016-08-24 Thread Matt Post (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435616#comment-15435616
 ] 

Matt Post commented on JOSHUA-272:
--

Phrase-based decoding has been changed to no longer use left-branching rules, 
so this no longer applies.






[jira] [Updated] (JOSHUA-272) Simplify the packing and usage of phrase-based grammars

2016-08-24 Thread Matt Post (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Post updated JOSHUA-272:
-
Fix Version/s: 6.1 (was: 6.2)






[jira] [Commented] (JOSHUA-304) word-align.conf alignment template file not compatible with berkeley aligner

2016-08-24 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435615#comment-15435615
 ] 

Lewis John McGibbney commented on JOSHUA-304:
-

ACK, will do.

> word-align.conf alignment template file not compatible with berkeley aligner
> 
>
> Key: JOSHUA-304
> URL: https://issues.apache.org/jira/browse/JOSHUA-304
> Project: Joshua
> Issue Type: Bug
> Components: alignment, berkeley, templates
> Affects Versions: 6.0.5
> Reporter: Lewis John McGibbney
> Priority: Blocker
> Fix For: 6.1
>
>
> It took me quite some time to debug what was going on and why pipelines 
> were failing when using the Berkeley aligner.
> It turns out that the word-align.conf template provided at
> https://github.com/apache/incubator-joshua/blob/master/scripts/training/templates/alignment/word-align.conf
> is not compatible with the Berkeley aligner. 
> In particular, the following lines are incompatible:
> https://github.com/apache/incubator-joshua/blob/master/scripts/training/templates/alignment/word-align.conf#L12-L15
> Evidence of this is provided below:
> {code}
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar 
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1 HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar 
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1, HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar 
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'MODEL1 HMM'; valid choices: MODEL1|MODEL2|HMM|SYNTACTIC|NONE
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar 
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Invalid enum: 'JOINT JOINT'; valid choices: FORWARD|REVERSE|BOTH_INDEP|JOINT
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar 
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Exception in thread "main" java.lang.NumberFormatException: For input string: 
> "5 5"
>   at 
> java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>   at java.lang.Integer.parseInt(Integer.java:580)
>   at java.lang.Integer.parseInt(Integer.java:615)
>   at 
> edu.berkeley.nlp.fig.basic.OptInfo.interpretValue(OptionsParser.java:143)
>   at 
> edu.berkeley.nlp.fig.basic.OptInfo.interpretValue(OptionsParser.java:240)
>   at edu.berkeley.nlp.fig.basic.OptInfo.set(OptionsParser.java:294)
>   at 
> edu.berkeley.nlp.fig.basic.OptionsParser.readOptionsFile(OptionsParser.java:555)
>   at 
> edu.berkeley.nlp.fig.basic.OptionsParser.doParse(OptionsParser.java:604)
>   at edu.berkeley.nlp.fig.exec.Execution.init(Execution.java:293)
>   at edu.berkeley.nlp.wordAlignment.Main.main(Main.java:149)
> lmcgibbn@LMC-032857 /usr/local/incubator-joshua/lib(master) $ java -d64 
> -Xmx10g -jar /usr/local/incubator-joshua/lib/berkeleyaligner.jar 
> ++/usr/local/incubator-joshua/experiments/fisher_callhome_experiment/6/alignments/0/word-align.conf
> Cannot create directory: alignments/0
> {code}
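
The errors above are what one would expect when a whitespace-separated value such as `MODEL1 HMM` or `5 5` is handed to a parser that expects exactly one enum constant or one integer. The following is a minimal Java sketch of that behaviour, not the Berkeley aligner's actual code: the `Model` enum, the `OptionValueSketch` class, and its helpers are hypothetical names, and splitting on whitespace is only an assumption about how a list-valued option could be read.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not taken from the Berkeley aligner sources.
final class OptionValueSketch {

    // Mirrors the choices listed in the "Invalid enum" message above.
    enum Model { MODEL1, MODEL2, HMM, SYNTACTIC, NONE }

    // Parses the whole value as one integer; returns null if it is not one.
    static Integer parseSingleInt(String value) {
        try {
            return Integer.parseInt(value);
        } catch (NumberFormatException e) {
            return null; // "5 5" is rejected, as in the stack trace
        }
    }

    // Parses the whole value as one enum constant; returns null if invalid.
    static Model parseSingleModel(String value) {
        try {
            return Model.valueOf(value);
        } catch (IllegalArgumentException e) {
            return null; // "MODEL1 HMM" names no single constant
        }
    }

    // Assumed alternative: treat the value as a whitespace-separated list.
    static List<Integer> parseIntList(String value) {
        List<Integer> result = new ArrayList<>();
        for (String token : value.trim().split("\\s+")) {
            result.add(Integer.parseInt(token));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parseSingleInt("5 5"));          // null
        System.out.println(parseSingleModel("MODEL1 HMM")); // null
        System.out.println(parseIntList("5 5"));            // [5, 5]
    }
}
```

Whether the template or the option parser should change is a separate question; the sketch only shows why a single-value parse of these strings fails.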





[jira] [Commented] (JOSHUA-304) word-align.conf alignment template file not compatible with berkeley aligner

2016-08-24 Thread Matt Post (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435601#comment-15435601
 ] 

Matt Post commented on JOSHUA-304:
--

I just pushed up some changes that should fix this. Give it a look? It's on the 
JOSHUA-309 branch. It passes my tests.






[jira] [Commented] (JOSHUA-304) word-align.conf alignment template file not compatible with berkeley aligner

2016-08-24 Thread Lewis John McGibbney (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15435133#comment-15435133
 ] 

Lewis John McGibbney commented on JOSHUA-304:
-

It may help for me to post the options available in the current Berkeley 
aligner jar, which was built when I installed Joshua:
{code}
lmcgibbn@LMC-032857 /usr/local/incubator-joshua(master) $ java -jar ./lib/berkeleyaligner.jar -help
Usage:
  log.maxIndLevel <  int> : Maximum indent level. [10]
  log.msPerLine <  int> : Maximum number of milliseconds between consecutive lines of output. [1000]
  log.file <  str> : File to write log. []
  log.stdout < bool> : Whether to output to the console. [true]
  log.note <  str> : Dummy placeholder for a comment []
  log.forcePrint < bool> : Force printing from logs* [false]
  log.maxPrintErrors <  int> : Maximum number of errors (via error()) to print [1]
  EMWordAligner.nullProb <  dbl> : How to assign null-word probabilities (=1 means 1/n) [1.0E-6]
  EMWordAligner.usePosteriorDecoding < bool> : Use posterior decoding (recommended for best performance). [true]
  EMWordAligner.posteriorDecodingThreshold <  dbl> : Threshold in [0,1] for deciding whether an alignment should exist. [0.5]
  EMWordAligner.mergeConsiderNull < bool> : When merging expected sufficient statistics, take into account the NULL (fix). [false]
  EMWordAligner.handleUnknownWords < bool> : Don't crash with unknown words (better to train on test set). [false]
  EMWordAligner.priorFraction <  dbl> : Fraction of a count to add for links in dictionary prior (1 works well). [0.0]
  EMWordAligner.numThreads <  int> : Number of concurrent threads to use during E-step (set to number of processors). [1]
  EMWordAligner.safeConcurrency < bool> : Safe concurrency (gets rid of concurrency warnings at the expense of speed) [false]
  EMWordAligner.evaluateDuringTraining < bool> : Whether to evaluate the model after each training iteration (slower, more memory). [false]
  TreeWalkModel.usePushProbabilities < bool> : Separate parameters for moving and pushing. [true]
  TreeWalkModel.conditionOnTag < bool> : Whether to condition distortion on the tag types. [true]
  TreeWalkModel.cacheTreePaths < bool> : Whether to cache paths through trees (uses lots of memory; faster). [false]
  Evaluator.searchForThreshold < bool> : Evaluate using line search [false]
  Evaluator.thresholdIntervals <  int> : Sets the number of intervals for posterior threshold line search [20]
  Evaluator.saveAlignmentObjects < bool> : Save object files for proposed alignments (large files) [false]
  Main.trainSources < str*> : Directories or files containing training files. [example/train]
  Main.testSources < str*> : Directory or file containing testing files. [example/test]
  Main.sentences <  int> : Maximum number of the training sentences to use [2147483647]
  Main.offsetTrainingSentences <  int> : Skip this number of the first training sentences [0]
  Main.maxTestSentences <  int> : Maximum number of the test sentences to use [2147483647]
  Main.offsetTestSentences <  int> : Skip this number of the first test sentences [0]
  Main.foreignSuffix <  str> : Foreign language file suffix [f]
  Main.englishSuffix <  str> : English language file suffix [e]
  Main.itgTrainTestSplitPoint <  int> : When writing test (ITG) posteriors, where to divide train/test data? [0]
  Main.itgInputDir <  str> : What directory should we dump ITG test data to? []
  Main.reverseAlignments < bool> : Reverse test set alignments (i.e., foreign to english) [false]
  Main.oneIndexed < bool> : Are alignments one-indexed (default == no, 0-indexed) [false]
  Main.lowercaseWords < bool> : Convert all words to lowercase [false]
  Main.leaveTrainingOnDisk < bool> : Don't load and store the training set upfront (slower, but less memory) [false]
  Main.saveRejects < bool> : Save rejected sentence pairs [false]
  Main.forwardModels : Which word alignment model to use in the forward direction. [MODEL1 HMM]
  Main.reverseModels : Which word alignment model to use in the backward direction. [MODEL1 HMM]
  Main.iters < int*> : Number of iterations to run the model. [5 5]
  Main.mode : Whether to train the two models jointly or independently. [JOINT JOINT]
  Main.trainingCacheMaxSize <  int> : Max sentence length for caching the HMM trellis (efficiency only). [100]
  Main.loadParamsDir <  str> : Directory to load parameters from. []
  Main.loadLexicalModelOnly < bool> : When true, the 

[GitHub] incubator-joshua pull request #:

2016-08-24 Thread mjpost
Github user mjpost commented on the pull request:


https://github.com/apache/incubator-joshua/commit/d4fdbfd88bab99e244d3ed1fc9cff4ba5e6d124c#commitcomment-18760447
  
I think maybe no, since all the feature functions are loaded using 
reflection. This could be a method available only on StatefulFF. Or maybe there 
is a better way to do it.




Re: Build failed in Jenkins: joshua_master #96

2016-08-24 Thread Matt Post
We are running out of space on builds...
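
The `No space left on device` failure and the JVM's `-Djava.io.tmpdir=` hint in the quoted log suggest checking how much room the temp directory's partition has. A minimal Java sketch of such a check follows; the class `TmpSpaceSketch` and its methods are hypothetical and are not part of the Joshua build or the Jenkins configuration.

```java
import java.io.File;

// Hypothetical diagnostic; not part of the Joshua build.
final class TmpSpaceSketch {

    // Returns the usable bytes on the partition holding the given directory
    // (0 if the path does not name a partition).
    static long usableBytes(File dir) {
        return dir.getUsableSpace();
    }

    // True when the directory's partition has at least `requiredBytes` usable.
    static boolean hasRoom(File dir, long requiredBytes) {
        return usableBytes(dir) >= requiredBytes;
    }

    public static void main(String[] args) {
        // java.io.tmpdir is the location the JVM warning refers to.
        File tmp = new File(System.getProperty("java.io.tmpdir"));
        System.out.println(tmp + " usable bytes: " + usableBytes(tmp));
    }
}
```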


> On Aug 23, 2016, at 10:15 PM, Apache Jenkins Server wrote:
> 
> See 
> 
> Changes:
> 
> [lewis.mcgibbney] Update examples README formatting and links.
> 
> [lewis.mcgibbney] Update examples README pipeline invocation parameters
> 
> --
> Started by an SCM change
> [EnvInject] - Loading node environment variables.
> Building remotely on ubuntu-us1 (Ubuntu golang-ppa ubuntu-us ubuntu) in 
> workspace 
>> git rev-parse --is-inside-work-tree # timeout=10
> Fetching changes from the remote Git repository
>> git config remote.origin.url 
>> https://git-wip-us.apache.org/repos/asf/incubator-joshua.git # timeout=10
> Fetching upstream changes from 
> https://git-wip-us.apache.org/repos/asf/incubator-joshua.git
>> git --version # timeout=10
>> git -c core.askpass=true fetch --tags --progress 
>> https://git-wip-us.apache.org/repos/asf/incubator-joshua.git 
>> +refs/heads/*:refs/remotes/origin/*
>> git rev-parse refs/remotes/origin/master^{commit} # timeout=10
>> git rev-parse refs/remotes/origin/origin/master^{commit} # timeout=10
> Checking out Revision 0744ebf56906dbe70292737cd50a39652407869d 
> (refs/remotes/origin/master)
>> git config core.sparsecheckout # timeout=10
>> git checkout -f 0744ebf56906dbe70292737cd50a39652407869d
>> git rev-list ff410c297a149400db3cb553b11a930ad01dc7ed # timeout=10
> [joshua_master] $ /home/jenkins/tools/maven/latest3/bin/mvn clean install 
> javadoc:aggregate
> Java HotSpot(TM) 64-Bit Server VM warning: Insufficient space for shared 
> memory file:
>   26586
> Try using the -Djava.io.tmpdir= option to select an alternate temp location.
> 
> [INFO] Scanning for projects...
> [INFO]
>  
> [INFO] 
> 
> [INFO] Building Apache Joshua Machine Translation Toolkit 6.1-SNAPSHOT
> [INFO] 
> 
> [INFO] 
> [INFO] --- maven-clean-plugin:2.4.1:clean (default-clean) @ joshua ---
> [INFO] Deleting 
> [INFO] 
> [INFO] --- maven-remote-resources-plugin:1.2.1:process (default) @ joshua ---
> [INFO] 
> [INFO] --- maven-resources-plugin:2.5:resources (default-resources) @ joshua 
> ---
> [debug] execute contextualize
> [INFO] Using 'UTF-8' encoding to copy filtered resources.
> [INFO] Copying 1 resource
> [INFO] Copying 3 resources
> [INFO] 
> [INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @ joshua ---
> [INFO] Compiling 266 source files to 
> 
> [INFO] 
> [INFO] --- maven-resources-plugin:2.5:testResources (default-testResources) @ 
> joshua ---
> [debug] execute contextualize
> [INFO] Using 'UTF-8' encoding to copy filtered resources.
> [INFO] Copying 349 resources
> [INFO] Copying 3 resources
> [INFO] 
> [INFO] --- maven-compiler-plugin:2.3.2:testCompile (default-testCompile) @ 
> joshua ---
> [INFO] Compiling 38 source files to 
> 
> [INFO] 
> [INFO] --- maven-surefire-plugin:2.19.1:test (default-test) @ joshua ---
> 
> ---
> T E S T S
> ---
> Java HotSpot(TM) 64-Bit Server VM warning: Insufficient space for shared 
> memory file:
>   27154
> Try using the -Djava.io.tmpdir= option to select an alternate temp location.
> Running TestSuite
> 102030405060708090.100%
> ERROR - Can't find libken.so (libken.dylib on OS X) on the Java library path.
> tm_pt_0=-2.000 tm_glue_0=3.000 lm_0=-206.718 lm_0_oov=2.000 
> OOVPenalty=-200.000 | -198.000
> ERROR - Can't find libken.so (libken.dylib on OS X) on the Java library path.
> ERROR - Can't find libken.so (libken.dylib on OS X) on the Java library path.
> ERROR - Can't find libken.so (libken.dylib on OS X) on the Java library path.
> %
> %
> %
> %
> %
> %
> %
> %
> %
> Tests run: 48, Failures: 1, Errors: 0, Skipped: 9, Time elapsed: 4.219 sec 
> <<< FAILURE! - in TestSuite
> externalizeVocabulary(org.apache.joshua.util.io.BinaryTest)  Time elapsed: 
> 0.01 sec  <<< FAILURE!
> java.io.IOException: No space left on device
>   at 
> org.apache.joshua.util.io.BinaryTest.externalizeVocabulary(BinaryTest.java:56)
> 
> 
> Results :
> 
> Failed tests: 
>  BinaryTest.externalizeVocabulary:56 » IO No space left on device
> 
> Tests run: 46, Failures: 1, Errors: 0, Skipped: 7
> 
> [INFO]
>  
> [INFO] 
> 
> [INFO] 

[jira] [Resolved] (JOSHUA-278) Alignments printed incorrectly for phrase-based decoder

2016-08-24 Thread Matt Post (JIRA)

 [ 
https://issues.apache.org/jira/browse/JOSHUA-278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Post resolved JOSHUA-278.
--
Resolution: Fixed

> Alignments printed incorrectly for phrase-based decoder
> ---
>
> Key: JOSHUA-278
> URL: https://issues.apache.org/jira/browse/JOSHUA-278
> Project: Joshua
> Issue Type: Bug
> Reporter: Matt Post
> Assignee: Matt Post
> Fix For: 6.1
>
>
> Type this to see the bug:
> echo YUP | $JOSHUA/bin/joshua -lowercase -search stack -project-case 
> -output-format "%s ||| %f ||| %a"





[jira] [Commented] (JOSHUA-278) Alignments printed incorrectly for phrase-based decoder

2016-08-24 Thread Matt Post (JIRA)

[ 
https://issues.apache.org/jira/browse/JOSHUA-278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15434883#comment-15434883
 ] 

Matt Post commented on JOSHUA-278:
--

This has been fixed in master by removing the alignment points in the 
BEGIN_RULE and END_RULE.



