This is strange. I haven't looked into this again and don't have any insights.
Thanks for the followup.
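
In case it helps anyone who hits this later, the workaround Lewis describes below (clear the cached state from the previous run, or use a separate directory per experiment) could be sketched roughly as follows. This is only a sketch under assumptions: the names `clean_run_dir` and `fresh_run_dir` are made up for illustration, and the list of stale entries is a guess that covers just `.cachepipe` and the `alignments` directory taken from the `--alignment alignments/training.align` argument. The "etc" in Lewis's note may include other files that are not listed here.

```python
import shutil
from pathlib import Path


def clean_run_dir(run_dir, stale_names=(".cachepipe", "alignments"), dry_run=True):
    """Remove cached state left in run_dir by a previous pipeline run.

    stale_names is an assumption, not an authoritative list of what the
    pipeline caches. Returns the paths that were removed (or, with
    dry_run=True, the paths that would be removed).
    """
    removed = []
    for name in stale_names:
        path = Path(run_dir) / name
        if not path.exists():
            continue
        removed.append(path)
        if dry_run:
            continue  # report only, delete nothing
        if path.is_dir():
            shutil.rmtree(path)
        else:
            path.unlink()
    return removed


def fresh_run_dir(base_dir, prefix="run"):
    """Create and return the next unused numbered directory under base_dir."""
    base = Path(base_dir)
    base.mkdir(parents=True, exist_ok=True)
    n = 1
    while (base / f"{prefix}{n}").exists():
        n += 1
    new_dir = base / f"{prefix}{n}"
    new_dir.mkdir()
    return new_dir
```

Calling `clean_run_dir(".")` first only reports what would be deleted; pass `dry_run=False` to actually delete. Note that removing the alignments means the pipeline has to be re-run from the start rather than with `--first-step thrax`. Passing the directory returned by `fresh_run_dir` as `--rundir` avoids the cleanup entirely.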


> On Oct 21, 2016, at 3:35 PM, lewis john mcgibbney <[email protected]> wrote:
> 
> Hi Folks,
> Follow up.
> It seems that when I clean the .cachepipe as well as all of the existing
> alignments, etc. from the previous run and re-run the entire pipeline, this
> issue disappears.
> I have no real explanation for why this happened. All I can say is that it
> is, of course, best to run experiments in different directories when you
> make a tweak to a pipeline.
> Lewis
> 
> On Thu, Oct 20, 2016 at 12:20 AM, lewis john mcgibbney <[email protected]>
> wrote:
> 
>> Hi dev@,
>> 
>> I'm facing some issues with Thrax on the Joshua master branch.
>> I invoke Joshua as follows:
>> 
>> /usr/local/incubator-joshua/bin/pipeline.pl --rundir . --type hiero \
>>   --corpus /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en \
>>   --tune /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en.tune \
>>   --test /usr/local/joshua_resources/russian_experiments/data/commoncrawl.ru-en.test \
>>   --source en --target ru \
>>   --readme "Experiment 1 Run 1 of ru --> en model training" \
>>   --aligner berkeley --tmp /usr/local/hadoop-2.5.2/hadoop_tmp_dir \
>>   --first-step thrax --no-prepare --alignment alignments/training.align \
>>   --hadoop-mem 10g
>> 
>> I set the first step to thrax because I have already computed my
>> alignments, as indicated by the arguments.
>> My Thrax log is available at
>> https://www.dropbox.com/s/pxld70ki656fn13/thrax.log?dl=0
>> In the log you will see the following exception:
>> 
>> 16/10/19 22:56:59 WARN mapred.LocalJobRunner: job_local1314413872_0002
>> java.lang.Exception: java.lang.RuntimeException: Word id 2146928632 out of range 0 1727042
>>    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>>    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
>> Caused by: java.lang.RuntimeException: Word id 2146928632 out of range 0 1727042
>>    at edu.jhu.thrax.hadoop.features.WordLexicalProbabilityCalculator$Partition.getPartition(WordLexicalProbabilityCalculator.java:133)
>>    at edu.jhu.thrax.hadoop.features.WordLexicalProbabilityCalculator$Partition.getPartition(WordLexicalProbabilityCalculator.java:121)
>>    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:692)
>>    at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
>>    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
>>    at edu.jhu.thrax.hadoop.features.WordLexicalProbabilityCalculator$Map.map(WordLexicalProbabilityCalculator.java:82)
>>    at edu.jhu.thrax.hadoop.features.WordLexicalProbabilityCalculator$Map.map(WordLexicalProbabilityCalculator.java:28)
>>    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>>    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>>    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>>    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
>>    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>>    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>    at java.lang.Thread.run(Thread.java:745)
>> 
>> I see no other issues until the end of the Thrax log, where I see:
>> 
>> class edu.jhu.thrax.hadoop.jobs.TargetWordGivenSourceWordProbabilityJob    FAILED
>> class edu.jhu.thrax.hadoop.jobs.OutputJob    PREREQ_FAILED
>> class edu.jhu.thrax.hadoop.features.annotation.AnnotationFeatureJob    PREREQ_FAILED
>> class edu.jhu.thrax.hadoop.features.mapred.TargetPhraseGivenSourceFeature    SUCCESS
>> class edu.jhu.thrax.hadoop.jobs.ExtractionJob    SUCCESS
>> class edu.jhu.thrax.hadoop.features.mapred.SourcePhraseGivenTargetFeature    SUCCESS
>> class edu.jhu.thrax.hadoop.jobs.VocabularyJob    SUCCESS
>> class edu.jhu.thrax.hadoop.jobs.SourceWordGivenTargetWordProbabilityJob    FAILED
>> 
>> This issue has previously been reported by Matt over on
>> https://github.com/joshua-decoder/thrax/issues/10
>> 
>> Debugging right now folks.
>> Lewis
>> 
>> --
>> http://home.apache.org/~lewismc/
>> @hectorMcSpector
>> http://www.linkedin.com/in/lmcgibbney
>> 
> 
> 
> 
> -- 
> http://home.apache.org/~lewismc/
> @hectorMcSpector
> http://www.linkedin.com/in/lmcgibbney
