Hi Folks,
A follow-up.
It seems that when I clean out .cachepipe as well as all of the existing
alignments, etc. from the previous run and re-run the entire pipeline, this
issue disappears.
I have no real explanation for why this happened. All I can say is that it
is, of course, best to run each experiment in its own directory whenever you
tweak a pipeline.
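
For anyone who wants to script the cleanup described above, here is a minimal
Python sketch. It is not part of Joshua or Thrax. The helper names
clean_stale_state and next_rundir are hypothetical, and the list of stale
names (.cachepipe and alignments) simply mirrors what I removed by hand.
Adjust it to your own run directory layout. Note that after removing the
alignments you have to re-run the pipeline from the start rather than from
the thrax step.

```python
import shutil
from pathlib import Path

# Assumption: these are the names removed by hand in the message above.
# Adjust to match whatever your own run directory contains.
STALE_NAMES = (".cachepipe", "alignments")


def clean_stale_state(rundir, names=STALE_NAMES):
    """Remove cached pipeline state from rundir and return the removed paths."""
    removed = []
    for name in names:
        path = Path(rundir) / name
        if path.is_dir() and not path.is_symlink():
            # A real directory: remove it and everything below it.
            shutil.rmtree(path)
            removed.append(path)
        elif path.exists() or path.is_symlink():
            # A plain file or a symlink: remove only the entry itself.
            path.unlink()
            removed.append(path)
    return removed


def next_rundir(base):
    """Create and return the next unused numbered directory under base."""
    base = Path(base)
    base.mkdir(parents=True, exist_ok=True)
    used = [int(p.name) for p in base.iterdir() if p.is_dir() and p.name.isdigit()]
    rundir = base / str(max(used, default=0) + 1)
    rundir.mkdir()
    return rundir
```

Calling next_rundir("runs") before each tweak gives every experiment its own
directory, which can then be passed to pipeline.pl via --rundir, so a stale
.cachepipe from an earlier run is never picked up.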
Lewis

On Thu, Oct 20, 2016 at 12:20 AM, lewis john mcgibbney <[email protected]>
wrote:

> Hi dev@,
>
> I am facing some issues with Thrax on the Joshua master branch.
> I invoke Joshua as follows:
>
> /usr/local/incubator-joshua/bin/pipeline.pl --rundir . --type hiero \
>   --corpus /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en \
>   --tune /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en.tune \
>   --test /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en.test \
>   --source en --target ru \
>   --readme "Experiment 1 Run 1 of ru --> en model training" \
>   --aligner berkeley --tmp /usr/local/hadoop-2.5.2/hadoop_tmp_dir \
>   --first-step thrax --no-prepare --alignment alignments/training.align \
>   --hadoop-mem 10g
>
> I start at the thrax step because I have already computed my alignment, as
> indicated by the arguments.
> My Thrax log is available at
> https://www.dropbox.com/s/pxld70ki656fn13/thrax.log?dl=0
> In the log you will see the following exception:
>
> 16/10/19 22:56:59 WARN mapred.LocalJobRunner: job_local1314413872_0002
> java.lang.Exception: java.lang.RuntimeException: Word id 2146928632 out of range 0 1727042
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
> Caused by: java.lang.RuntimeException: Word id 2146928632 out of range 0 1727042
>     at edu.jhu.thrax.hadoop.features.WordLexicalProbabilityCalculator$Partition.getPartition(WordLexicalProbabilityCalculator.java:133)
>     at edu.jhu.thrax.hadoop.features.WordLexicalProbabilityCalculator$Partition.getPartition(WordLexicalProbabilityCalculator.java:121)
>     at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:692)
>     at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
>     at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
>     at edu.jhu.thrax.hadoop.features.WordLexicalProbabilityCalculator$Map.map(WordLexicalProbabilityCalculator.java:82)
>     at edu.jhu.thrax.hadoop.features.WordLexicalProbabilityCalculator$Map.map(WordLexicalProbabilityCalculator.java:28)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
>
> I see no other issues until the end of the Thrax log, where I see:
>
> class edu.jhu.thrax.hadoop.jobs.TargetWordGivenSourceWordProbabilityJob    FAILED
> class edu.jhu.thrax.hadoop.jobs.OutputJob    PREREQ_FAILED
> class edu.jhu.thrax.hadoop.features.annotation.AnnotationFeatureJob    PREREQ_FAILED
> class edu.jhu.thrax.hadoop.features.mapred.TargetPhraseGivenSourceFeature    SUCCESS
> class edu.jhu.thrax.hadoop.jobs.ExtractionJob    SUCCESS
> class edu.jhu.thrax.hadoop.features.mapred.SourcePhraseGivenTargetFeature    SUCCESS
> class edu.jhu.thrax.hadoop.jobs.VocabularyJob    SUCCESS
> class edu.jhu.thrax.hadoop.jobs.SourceWordGivenTargetWordProbabilityJob    FAILED
>
> This issue has previously been reported by Matt over on
> https://github.com/joshua-decoder/thrax/issues/10
>
> Debugging right now folks.
> Lewis
>
> --
> http://home.apache.org/~lewismc/
> @hectorMcSpector
> http://www.linkedin.com/in/lmcgibbney
>



-- 
http://home.apache.org/~lewismc/
@hectorMcSpector
http://www.linkedin.com/in/lmcgibbney
