Hi,

The number is 0. The corpus I'm using is the one provided with the download: ASR.

Well, I tried with Hadoop 2.7.3, 2.6.5 and 2.5.2, and I get the exact same error. What could be wrong with the setup? It only involves adding $HADOOP_HOME/bin to the PATH.

By the way, I really appreciate all the help you're giving.

Cheers,
Fernando

On 22 November 2016 at 22:30, Matt Post <p...@cs.jhu.edu> wrote:

> It looks like you have a very small corpus. Can you tell me what number
> this command reports?
>
> gzip -cd grammar.gz | grep Infinity | wc -l
>
> matt
>
> On Nov 22, 2016, at 5:28 PM, Fernando E Alva Manchego <fealvamanche...@sheffield.ac.uk> wrote:
>
> Hello,
>
> I'm using Hadoop 2.7.3 and Java 8. Apparently, the Hadoop setup is OK,
> according to the instructions given in:
>
> https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/SingleCluster.html#Standalone_Operation
>
> I'll try an earlier version of Hadoop and see how it goes.
>
> Cheers,
> Fernando
>
> On 22 November 2016 at 19:06, John Hewitt <john...@seas.upenn.edu> wrote:
>
>> Grepping through the log file, I found the following problem:
>>
>> class edu.jhu.thrax.hadoop.features.annotation.AnnotationFeatureJob FAILED
>>
>> This is a prereq of OutputJob, hence OutputJob failed.
>>
>> Here's a link to a useful closed issue with an almost identical problem.
>> https://issues.apache.org/jira/browse/JOSHUA-297
>>
>> +1 on the hadoop setup question, as well as the version of Java you're
>> using, for good measure.
>>
>> -John
>>
>> On Tue, Nov 22, 2016 at 1:28 PM, Fernando E Alva Manchego <fealvamanche...@sheffield.ac.uk> wrote:
>>
>>> I'm attaching the file because it's too big to paste all its content here.
>>> The size of data/train/thrax-input-file is 4.9M. I'll check the hadoop
>>> setup.
>>>
>>> Cheers,
>>> Fernando
>>>
>>> On 22 November 2016 at 18:15, Matt Post <p...@cs.jhu.edu> wrote:
>>>
>>>> Okay, that is the size of a compressed empty file. So the grammar did
>>>> not extract properly. Did you set up Hadoop properly? Can you paste the
>>>> contents of thrax.log?
>>>> What is the file size of data/train/thrax-input-file?
>>>>
>>>> On Nov 22, 2016, at 1:12 PM, Fernando E Alva Manchego <fealvamanche...@sheffield.ac.uk> wrote:
>>>>
>>>> Hello,
>>>>
>>>> It's 20 bytes.
>>>>
>>>> Best,
>>>> Fernando
>>>>
>>>> On 22 November 2016 at 18:00, Matt Post <p...@cs.jhu.edu> wrote:
>>>>
>>>>> eigen3 is not necessary. What is the file size of grammar.gz?
>>>>>
>>>>> On Nov 22, 2016, at 7:54 AM, Fernando E Alva Manchego <fealvamanche...@sheffield.ac.uk> wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> Well, I ran that command and it went fine: build 100%
>>>>>
>>>>> However, when I run the tutorial command again, I get:
>>>>>
>>>>> * Packing grammar at "grammar.gz" to "../joshua-tutorial/runs/1/tune/model/grammar.gz.packed"
>>>>> * Running the grammar-packer.pl script with the command:
>>>>> $JOSHUA/scripts/support/grammar-packer.pl -a -T /tmp -g grammar.gz -o ../joshua-tutorial/runs/1/tune/model/grammar.gz.packed
>>>>> Exception in thread "main" java.util.NoSuchElementException
>>>>>     at org.apache.joshua.util.io.LineReader.next(LineReader.java:276)
>>>>>     at org.apache.joshua.tools.GrammarPacker.getGrammarReader(GrammarPacker.java:239)
>>>>>     at org.apache.joshua.tools.GrammarPacker.pack(GrammarPacker.java:184)
>>>>>     at org.apache.joshua.tools.GrammarPackerCli.run(GrammarPackerCli.java:120)
>>>>>     at org.apache.joshua.tools.GrammarPackerCli.main(GrammarPackerCli.java:137)
>>>>> * FATAL: Couldn't pack the grammar.
>>>>> * Copying sorted grammars (/tmp/grammar.gzR7NI) to current directory.
>>>>> * __init__() takes at least 3 arguments (2 given)
>>>>>
>>>>> One thing I noticed is this "error" message when compiling:
>>>>>
>>>>> -- Could NOT find Eigen3 (missing: EIGEN3_INCLUDE_DIR EIGEN3_VERSION_OK) (Required is at least version "2.91.0")
>>>>> CMake Warning at lm/interpolate/CMakeLists.txt:65 (message):
>>>>>   Not building interpolation. Eigen3 was not found.
>>>>>
>>>>> Is Eigen3 really necessary?
>>>>>
>>>>> Cheers,
>>>>> Fernando
>>>>>
>>>>> On 18 November 2016 at 18:15, Matt Post <p...@cs.jhu.edu> wrote:
>>>>>
>>>>>> Okay, it looks like KenLM is not building. This is a perennial pain.
>>>>>> You can see the KenLM build lines in download_deps.sh. What is the
>>>>>> output when you run
>>>>>>
>>>>>> ./jni/build_kenlm.sh
>>>>>>
>>>>>> matt
>>>>>>
>>>>>> On Nov 18, 2016, at 12:24 PM, Fernando E Alva Manchego <fealvamanche...@sheffield.ac.uk> wrote:
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> UPDATE: I added $JOSHUA/lib to LD_LIBRARY_PATH because I saw that
>>>>>> libken.so is there. Now, when I run the command again, I get the same
>>>>>> error that Lewis pointed out:
>>>>>>
>>>>>> [lm-sort-uniq] rebuilding...
>>>>>>   dep= ../joshua-tutorial/runs/1/data/train/corpus.en [CHANGED]
>>>>>>   dep= ../joshua-tutorial/runs/1/data/train/corpus.en.uniq [NOT FOUND]
>>>>>>   cmd= $JOSHUA/scripts/training/scat /export/data/falva/joshua-tutorial/runs/1/data/train/corpus.en | sort -u -T /tmp -S 8G | gzip -9n >.../joshua-tutorial/runs/1/data/train/corpus.en.uniq
>>>>>>   took 1 seconds (1s)
>>>>>> * FATAL: $JOSHUA/bin/lmplz (for building LMs) does not exist.
>>>>>> This is often a problem with the boost libraries (particularly
>>>>>> threaded versus unthreaded).
>>>>>>
>>>>>> Cheers,
>>>>>> Fernando
>>>>>>
>>>>>> On 18 November 2016 at 16:40, Fernando E Alva Manchego <fealvamanche...@sheffield.ac.uk> wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> Sorry for the late reply.
>>>>>>> I have downloaded joshua again and followed the updated procedure,
>>>>>>> but I still get the same error when running the following command:
>>>>>>>
>>>>>>> $JOSHUA/bin/pipeline.pl \
>>>>>>>   --rundir 1 \
>>>>>>>   --readme "Baseline Hiero run" \
>>>>>>>   --source es \
>>>>>>>   --target en \
>>>>>>>   --type hiero \
>>>>>>>   --corpus $FISHER/corpus/asr/fisher_train \
>>>>>>>   --tune $FISHER/corpus/asr/fisher_dev \
>>>>>>>   --test $FISHER/corpus/asr/fisher_dev2 \
>>>>>>>   --maxlen 11 \
>>>>>>>   --maxlen-tune 11 \
>>>>>>>   --maxlen-test 11 \
>>>>>>>   --tuner-iterations 1 \
>>>>>>>   --lm-order 3
>>>>>>>
>>>>>>> The error is still:
>>>>>>> [pack-grammar] rebuilding...
>>>>>>>   dep= $HOME/joshua-tutorial/runs/1/grammar.packed/vocabulary [NOT FOUND]
>>>>>>>   dep= $HOME/joshua-tutorial/runs/1/grammar.packed/encoding [NOT FOUND]
>>>>>>>   dep= $HOME/joshua-tutorial/runs/1/grammar.packed/slice_00000.source [NOT FOUND]
>>>>>>>   cmd= $JOSHUA/scripts/support/grammar-packer.pl -a -T /tmp -m 8g -g grammar.gz -o $HOME/joshua-tutorial/runs/1/grammar.packed
>>>>>>>   JOB FAILED (return code 1)
>>>>>>> Exception in thread "main" java.util.NoSuchElementException
>>>>>>>     at org.apache.joshua.util.io.LineReader.next(LineReader.java:276)
>>>>>>>     at org.apache.joshua.tools.GrammarPacker.getGrammarReader(GrammarPacker.java:239)
>>>>>>>     at org.apache.joshua.tools.GrammarPacker.pack(GrammarPacker.java:184)
>>>>>>>     at org.apache.joshua.tools.GrammarPackerCli.run(GrammarPackerCli.java:120)
>>>>>>>     at org.apache.joshua.tools.GrammarPackerCli.main(GrammarPackerCli.java:137)
>>>>>>> * FATAL: Couldn't pack the grammar.
>>>>>>> * Copying sorted grammars (/tmp/grammar.gzTQzG) to current directory.
>>>>>>>
>>>>>>> What I have noticed now is that, when running the tests after
>>>>>>> compilation, this error message appears:
>>>>>>>
>>>>>>> ERROR - Can't find libken.so (libken.dylib on OS X) on the Java library path.
>>>>>>> WARN - No glue grammar found! Creating dummy glue grammar.
>>>>>>>
>>>>>>> Could that be the source of the error? Thank you.
>>>>>>>
>>>>>>> @Lewis: I'll make sure to give them your regards.
>>>>>>>
>>>>>>> Best
>>>>>>> Fernando
>>>>>>>
>>>>>>> On 18 November 2016 at 13:42, Matt Post <p...@cs.jhu.edu> wrote:
>>>>>>>
>>>>>>>> I just updated that page to use "mvn package" instead of the old
>>>>>>>> "mvn compile assembly:single". So Fernando, please make sure you
>>>>>>>> follow the updated instructions.
>>>>>>>>
>>>>>>>> On Nov 17, 2016, at 10:10 PM, lewis john mcgibbney <lewi...@apache.org> wrote:
>>>>>>>>
>>>>>>>> Hi Fernando,
>>>>>>>> First and foremost please give my regards to the GATE team at
>>>>>>>> Sheffield. I spent a great week down there a number of years back and
>>>>>>>> I am fond of the place.
>>>>>>>> Are you following the tutorial at https://cwiki.apache.org/confluence/display/JOSHUA/Joshua+Tutorial ?
>>>>>>>> If so then I'll try it out and see if I can reproduce.
>>>>>>>> Lewis
>>>>>>>>
>>>>>>>> On Thu, Nov 17, 2016 at 9:38 AM, <user-digest-help@joshua.incubator.apache.org> wrote:
>>>>>>>>
>>>>>>>>> From: Fernando E Alva Manchego <fealvamanche...@sheffield.ac.uk>
>>>>>>>>> To: user@joshua.incubator.apache.org
>>>>>>>>> Cc:
>>>>>>>>> Date: Thu, 17 Nov 2016 17:37:53 +0000
>>>>>>>>> Subject: Error while running the tutorial
>>>>>>>>> Hello!
>>>>>>>>>
>>>>>>>>> I'm running the tutorial (phrase) and the following error came up:
>>>>>>>>>
>>>>>>>>> Error: Could not find or load main class
>>>>>>>>> org.apache.joshua.tools.GrammarPackerCli
>>>>>>>>>
>>>>>>>>> When I installed Joshua, I ran the tests and everything was OK. Do
>>>>>>>>> you have any idea what might be happening? Thank you.
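
For anyone retracing the checks made in this thread (is grammar.gz effectively empty, and which Thrax job failed in thrax.log), here is a minimal shell sketch. It is not part of Joshua: the function names grammar_rule_count, check_grammar and failed_jobs are made up for illustration, and the sketch assumes the grammar holds one rule per line.

```shell
#!/bin/sh
# Hypothetical helpers; only gzip, wc, tr and grep are real tools here.

# Print the number of lines (assumed: one rule per line) in a gzipped grammar.
grammar_rule_count() {
    gzip -cd "$1" | wc -l | tr -d ' '
}

# Print "missing", "empty" or "ok" for a gzipped grammar file.
# An empty grammar is what the 20-byte grammar.gz in this thread turned out to be.
check_grammar() {
    if [ ! -f "$1" ]; then
        echo "missing"
    elif [ "$(grammar_rule_count "$1")" -eq 0 ]; then
        echo "empty"
    else
        echo "ok"
    fi
}

# Print every line of a log file that reports a FAILED job.
failed_jobs() {
    grep 'FAILED' "$1" || true
}
```

Possible usage from the run directory: `check_grammar grammar.gz` followed by `failed_jobs thrax.log`. An "empty" result together with a FAILED line points at grammar extraction (the Thrax/Hadoop step) rather than at the grammar packer itself.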