This is strange. I haven't looked into this again and don't have any insights. Thanks for the follow-up.
> On Oct 21, 2016, at 3:35 PM, lewis john mcgibbney <[email protected]> wrote:
>
> Hi Folks,
> Follow up.
> It seems that when I clean the .cachepipe as well as all of the existing
> alignments, etc from the previous run and re-run the entire pipeline then
> this issue disappears.
> I have no real reason why this happened. All i can say is that it is of
> course best to run experiments in different directories when you make a
> tweak to a pipeline.
> Lewis
>
> On Thu, Oct 20, 2016 at 12:20 AM, lewis john mcgibbney <[email protected]>
> wrote:
>
>> Hi dev@,
>>
>> Sitting facing some issues with Thrax using Joshua master branch.
>> I invoke Joshua as follows
>>
>> /usr/local/incubator-joshua/bin/pipeline.pl --rundir . --type hiero
>> --corpus
>> /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en
>> --tune
>> /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en.tune
>> --test
>> /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en.test
>> --source en --target ru --readme "Experiment 1 Run 1 of ru --> en model
>> training" --aligner berkeley --tmp /usr/local/hadoop-2.5.2/hadoop_tmp_dir
>> --first-step thrax --no-prepare --alignment alignments/training.align
>> --hadoop-mem 10g
>>
>> I make the first step thrax as I have previously computed my alignment as
>> indicated by the arguments.
>> My Thrax log is available at
>> https://www.dropbox.com/s/pxld70ki656fn13/thrax.log?dl=0.
>> In the log you will see an exception as follows
>>
>> 16/10/19 22:56:59 WARN mapred.LocalJobRunner: job_local1314413872_0002
>> java.lang.Exception: java.lang.RuntimeException: Word id 2146928632 out of range 0 1727042
>>     at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
>> Caused by: java.lang.RuntimeException: Word id 2146928632 out of range 0 1727042
>>     at edu.jhu.thrax.hadoop.features.WordLexicalProbabilityCalculator$Partition.getPartition(WordLexicalProbabilityCalculator.java:133)
>>     at edu.jhu.thrax.hadoop.features.WordLexicalProbabilityCalculator$Partition.getPartition(WordLexicalProbabilityCalculator.java:121)
>>     at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:692)
>>     at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
>>     at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
>>     at edu.jhu.thrax.hadoop.features.WordLexicalProbabilityCalculator$Map.map(WordLexicalProbabilityCalculator.java:82)
>>     at edu.jhu.thrax.hadoop.features.WordLexicalProbabilityCalculator$Map.map(WordLexicalProbabilityCalculator.java:28)
>>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>>     at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>     at java.lang.Thread.run(Thread.java:745)
>>
>> I see no other issues until the end of the Thrax log where I see
>>
>> class edu.jhu.thrax.hadoop.jobs.TargetWordGivenSourceWordProbabilityJob FAILED
>> class edu.jhu.thrax.hadoop.jobs.OutputJob PREREQ_FAILED
>> class edu.jhu.thrax.hadoop.features.annotation.AnnotationFeatureJob PREREQ_FAILED
>> class edu.jhu.thrax.hadoop.features.mapred.TargetPhraseGivenSourceFeature SUCCESS
>> class edu.jhu.thrax.hadoop.jobs.ExtractionJob SUCCESS
>> class edu.jhu.thrax.hadoop.features.mapred.SourcePhraseGivenTargetFeature SUCCESS
>> class edu.jhu.thrax.hadoop.jobs.VocabularyJob SUCCESS
>> class edu.jhu.thrax.hadoop.jobs.SourceWordGivenTargetWordProbabilityJob FAILED
>>
>> This issue has previously been reported by Matt over on
>> https://github.com/joshua-decoder/thrax/issues/10
>>
>> Debugging right now folks.
>> Lewis
>>
>> --
>> http://home.apache.org/~lewismc/
>> @hectorMcSpector
>> http://www.linkedin.com/in/lmcgibbney
>>
>
>
>
> --
> http://home.apache.org/~lewismc/
> @hectorMcSpector
> http://www.linkedin.com/in/lmcgibbney
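
For anyone wanting to follow the workaround in the follow-up above (clear .cachepipe, or better, give each pipeline tweak its own run directory), here is a rough Python sketch. It is not part of Joshua or Thrax. The helper names clear_cachepipe and fresh_rundir and the run1, run2, ... naming are made up for illustration, and it assumes .cachepipe is a directory sitting directly under the run directory. It does not touch the alignments or other outputs of a previous run, and it does not explain why the "Word id ... out of range" error appeared in the first place.

```python
import shutil
from pathlib import Path


def clear_cachepipe(rundir):
    """Remove <rundir>/.cachepipe if it exists (hypothetical helper).

    Assumes .cachepipe is a directory directly under the run directory.
    Returns True if a directory was removed, False otherwise.
    """
    cache = Path(rundir) / ".cachepipe"
    if cache.is_dir():
        shutil.rmtree(cache)
        return True
    return False


def fresh_rundir(base, prefix="run"):
    """Create and return the next unused directory <base>/<prefix>N, N = 1, 2, ...

    The naming scheme is an assumption made for this sketch only.
    """
    base = Path(base)
    base.mkdir(parents=True, exist_ok=True)
    n = 1
    while (base / f"{prefix}{n}").exists():
        n += 1
    rundir = base / f"{prefix}{n}"
    rundir.mkdir()
    return rundir
```

The path returned by fresh_rundir could then be passed to pipeline.pl via --rundir instead of "." so that a tweaked run starts without cached state from the previous one.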
