It looks like you have a very small corpus. Can you tell me what number this command reports?
gzip -cd grammar.gz | grep Infinity | wc -l

matt

> On Nov 22, 2016, at 5:28 PM, Fernando E Alva Manchego <fealvamanche...@sheffield.ac.uk> wrote:
>
> Hello,
>
> I'm using Hadoop 2.7.3 and Java 8. Apparently, the Hadoop setup is OK,
> according to the instructions given in:
>
> https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/SingleCluster.html#Standalone_Operation
>
> I'll try an earlier version of Hadoop and see how it goes.
>
> Cheers,
> Fernando
>
> On 22 November 2016 at 19:06, John Hewitt <john...@seas.upenn.edu> wrote:
> Grepping through the log file, I found the following problem:
>
> class edu.jhu.thrax.hadoop.features.annotation.AnnotationFeatureJob FAILED
>
> This is a prereq of OutputJob, hence OutputJob failed.
>
> Here's a link to a useful closed issue with an almost identical problem:
> https://issues.apache.org/jira/browse/JOSHUA-297
>
> +1 on the Hadoop setup question, as well as the version of Java you're using,
> for good measure.
>
> -John
>
> On Tue, Nov 22, 2016 at 1:28 PM, Fernando E Alva Manchego <fealvamanche...@sheffield.ac.uk> wrote:
> I'm attaching the file because it's too big to paste all its content here. The
> size of data/train/thrax-input-file is 4.9M. I'll check the Hadoop setup.
>
> Cheers,
> Fernando
>
> On 22 November 2016 at 18:15, Matt Post <p...@cs.jhu.edu> wrote:
> Okay, that is the size of a compressed empty file. So the grammar did not
> extract properly. Did you set up Hadoop properly? Can you paste the contents
> of thrax.log? What is the file size of data/train/thrax-input-file?
>
>
>> On Nov 22, 2016, at 1:12 PM, Fernando E Alva Manchego <fealvamanche...@sheffield.ac.uk> wrote:
>>
>> Hello,
>>
>> It's 20 Bytes.
>>
>> Best,
>> Fernando
>>
>> On 22 November 2016 at 18:00, Matt Post <p...@cs.jhu.edu> wrote:
>> eigen3 is not necessary. What is the file size of grammar.gz?
>>
>>
>>> On Nov 22, 2016, at 7:54 AM, Fernando E Alva Manchego <fealvamanche...@sheffield.ac.uk> wrote:
>>>
>>> Hello,
>>>
>>> Well, I ran that command and it went fine: build 100%
>>>
>>> However, now I ran the tutorial command again and I get:
>>>
>>> * Packing grammar at "grammar.gz" to "../joshua-tutorial/runs/1/tune/model/grammar.gz.packed"
>>> * Running the grammar-packer.pl script with the command: $JOSHUA/scripts/support/grammar-packer.pl -a -T /tmp -g grammar.gz -o ../joshua-tutorial/runs/1/tune/model/grammar.gz.packed
>>> Exception in thread "main" java.util.NoSuchElementException
>>>     at org.apache.joshua.util.io.LineReader.next(LineReader.java:276)
>>>     at org.apache.joshua.tools.GrammarPacker.getGrammarReader(GrammarPacker.java:239)
>>>     at org.apache.joshua.tools.GrammarPacker.pack(GrammarPacker.java:184)
>>>     at org.apache.joshua.tools.GrammarPackerCli.run(GrammarPackerCli.java:120)
>>>     at org.apache.joshua.tools.GrammarPackerCli.main(GrammarPackerCli.java:137)
>>> * FATAL: Couldn't pack the grammar.
>>> * Copying sorted grammars (/tmp/grammar.gzR7NI) to current directory.
>>> * __init__() takes at least 3 arguments (2 given)
>>>
>>> One thing I noticed is this "error" message when compiling:
>>>
>>> -- Could NOT find Eigen3 (missing: EIGEN3_INCLUDE_DIR EIGEN3_VERSION_OK) (Required is at least version "2.91.0")
>>> CMake Warning at lm/interpolate/CMakeLists.txt:65 (message):
>>>   Not building interpolation.
>>>   Eigen3 was not found.
>>>
>>> Is Eigen3 really necessary?
>>>
>>> Cheers,
>>> Fernando
>>>
>>> On 18 November 2016 at 18:15, Matt Post <p...@cs.jhu.edu> wrote:
>>> Okay, it looks like KenLM is not building. This is a perennial pain. You
>>> can see the KenLM build lines in download_deps.sh. What is the output when
>>> you run
>>>
>>> ./jni/build_kenlm.sh
>>>
>>> matt
>>>
>>>
>>>> On Nov 18, 2016, at 12:24 PM, Fernando E Alva Manchego <fealvamanche...@sheffield.ac.uk> wrote:
>>>>
>>>> Hello,
>>>>
>>>> UPDATE: I added $JOSHUA/lib to LD_LIBRARY_PATH because I saw that
>>>> libken.so is there. Now, when I run the command again, what I get is the
>>>> same error that Lewis pointed out:
>>>>
>>>> [lm-sort-uniq] rebuilding...
>>>>   dep= ../joshua-tutorial/runs/1/data/train/corpus.en [CHANGED]
>>>>   dep= ../joshua-tutorial/runs/1/data/train/corpus.en.uniq [NOT FOUND]
>>>>   cmd= $JOSHUA/scripts/training/scat /export/data/falva/joshua-tutorial/runs/1/data/train/corpus.en | sort -u -T /tmp -S 8G | gzip -9n >.../joshua-tutorial/runs/1/data/train/corpus.en.uniq
>>>>   took 1 seconds (1s)
>>>> * FATAL: $JOSHUA/bin/lmplz (for building LMs) does not exist.
>>>>   This is often a problem with the boost libraries (particularly threaded
>>>>   versus unthreaded).
>>>>
>>>> Cheers,
>>>> Fernando
>>>>
>>>> On 18 November 2016 at 16:40, Fernando E Alva Manchego <fealvamanche...@sheffield.ac.uk> wrote:
>>>> Hello,
>>>>
>>>> Sorry for the late reply.
>>>> I have downloaded joshua again and followed the
>>>> updated procedure, but I still get the same error when running the
>>>> following command:
>>>>
>>>> $JOSHUA/bin/pipeline.pl \
>>>>   --rundir 1 \
>>>>   --readme "Baseline Hiero run" \
>>>>   --source es \
>>>>   --target en \
>>>>   --type hiero \
>>>>   --corpus $FISHER/corpus/asr/fisher_train \
>>>>   --tune $FISHER/corpus/asr/fisher_dev \
>>>>   --test $FISHER/corpus/asr/fisher_dev2 \
>>>>   --maxlen 11 \
>>>>   --maxlen-tune 11 \
>>>>   --maxlen-test 11 \
>>>>   --tuner-iterations 1 \
>>>>   --lm-order 3
>>>>
>>>> The error is still:
>>>> [pack-grammar] rebuilding...
>>>>   dep= $HOME/joshua-tutorial/runs/1/grammar.packed/vocabulary [NOT FOUND]
>>>>   dep= $HOME/joshua-tutorial/runs/1/grammar.packed/encoding [NOT FOUND]
>>>>   dep= $HOME/joshua-tutorial/runs/1/grammar.packed/slice_00000.source [NOT FOUND]
>>>>   cmd= $JOSHUA/scripts/support/grammar-packer.pl -a -T /tmp -m 8g -g grammar.gz -o $HOME/joshua-tutorial/runs/1/grammar.packed
>>>>   JOB FAILED (return code 1)
>>>> Exception in thread "main" java.util.NoSuchElementException
>>>>     at org.apache.joshua.util.io.LineReader.next(LineReader.java:276)
>>>>     at org.apache.joshua.tools.GrammarPacker.getGrammarReader(GrammarPacker.java:239)
>>>>     at org.apache.joshua.tools.GrammarPacker.pack(GrammarPacker.java:184)
>>>>     at org.apache.joshua.tools.GrammarPackerCli.run(GrammarPackerCli.java:120)
>>>>     at org.apache.joshua.tools.GrammarPackerCli.main(GrammarPackerCli.java:137)
>>>> * FATAL: Couldn't pack the grammar.
>>>> * Copying sorted grammars (/tmp/grammar.gzTQzG) to current directory.
>>>>
>>>> What I have noticed now is that, when running the tests after compilation,
>>>> this error message appears:
>>>>
>>>> ERROR - Can't find libken.so (libken.dylib on OS X) on the Java library path.
>>>> WARN - No glue grammar found! Creating dummy glue grammar.
>>>>
>>>> Could that be the source of the error? Thank you.
>>>>
>>>> @Lewis: I'll make sure to give them your regards.
>>>>
>>>> Best
>>>> Fernando
>>>>
>>>> On 18 November 2016 at 13:42, Matt Post <p...@cs.jhu.edu> wrote:
>>>> I just updated that page to use "mvn package" instead of the old "mvn
>>>> compile assembly:single". So Fernando, please make sure you follow the
>>>> updated instructions.
>>>>
>>>>
>>>>> On Nov 17, 2016, at 10:10 PM, lewis john mcgibbney <lewi...@apache.org> wrote:
>>>>>
>>>>> Hi Fernando,
>>>>> First and foremost please give my regards to the GATE team at Sheffield. I
>>>>> spent a great week down there a number of years back and I am fond of the
>>>>> place.
>>>>> Are you following the tutorial at
>>>>> https://cwiki.apache.org/confluence/display/JOSHUA/Joshua+Tutorial ?
>>>>> If so then I'll try it out and see if I can reproduce.
>>>>> Lewis
>>>>>
>>>>> On Thu, Nov 17, 2016 at 9:38 AM, <user-digest-h...@joshua.incubator.apache.org> wrote:
>>>>> From: Fernando E Alva Manchego <fealvamanche...@sheffield.ac.uk>
>>>>> To: user@joshua.incubator.apache.org
>>>>> Cc:
>>>>> Date: Thu, 17 Nov 2016 17:37:53 +0000
>>>>> Subject: Error while running the tutorial
>>>>> Hello!
>>>>>
>>>>> I'm running the tutorial (phrase) and the following error came up:
>>>>>
>>>>> Error: Could not find or load main class org.apache.joshua.tools.GrammarPackerCli
>>>>>
>>>>> When I installed Joshua, I ran the tests and everything was OK. Do you
>>>>> have any idea what might be happening? Thank you.
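
For anyone who lands on this thread with the same packing failure: the two checks asked for above (the size of grammar.gz and the grep count) can be wrapped in a small helper. This is only a sketch. The function names are made up for illustration and are not part of Joshua, and the file to inspect is whatever path you pass in.

```shell
#!/bin/sh
# Sketch of the grammar checks discussed in this thread.
# grammar_line_count and infinity_line_count are hypothetical helper
# names, not Joshua commands.

# Print the number of lines (rules) in a gzipped grammar file.
# 0 means the grammar is empty, i.e. extraction produced nothing.
grammar_line_count() {
    gzip -cd "$1" | wc -l | tr -d ' '
}

# Print how many lines of the gzipped grammar contain "Infinity"
# (the same pipeline as the command quoted at the top of the thread).
infinity_line_count() {
    gzip -cd "$1" | grep Infinity | wc -l | tr -d ' '
}

# When called with a path, report the on-disk size and both counts.
if [ "$#" -ge 1 ]; then
    echo "size (bytes):   $(wc -c < "$1" | tr -d ' ')"
    echo "lines:          $(grammar_line_count "$1")"
    echo "Infinity lines: $(infinity_line_count "$1")"
fi
```

Saved under some name of your choosing and run as `sh <script> grammar.gz`, a line count of 0 together with a size of about 20 bytes points at an empty grammar, which is the case diagnosed in this thread. In that situation the packer has nothing to read, and the thing to look at is thrax.log and the Hadoop setup rather than the packing step.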