Hi Kellen and Matt,

On Tue, Jul 19, 2016 at 8:20 PM, <dev-digest-h...@joshua.incubator.apache.org> wrote:
> From: Matt Post <p...@cs.jhu.edu>
> To: dev@joshua.incubator.apache.org
> Cc:
> Date: Sun, 17 Jul 2016 23:30:33 -0400
> Subject: Re: Issue Building LM on master branch
>
> Lewis — This is a good-sized dataset, and on a single desktop machine, I
> expect it would take at least a day to go all the way through alignment,
> model-building, and tuning.

OK, thanks for the estimate.

> fast_align is a good idea, though it isn't integrated into the pipeline
> (shouldn't be too hard, and is on the list). You could also just try
> "--aligner berkeley" and see if that works.

I'll do exactly that, starting with berkeley and then moving on to fast_align. I'll update here with any progress.

> Do you see anything in the GIZA error logs (RUNDIR/alignment/0/...)?
> Sometimes GIZA doesn't compile correctly, and this could be an error where
> it doesn't find GIZA++ or one of the support binaries (mkcls, snt2cooc.out).

AFAICT there are no errors before the last dozen or so lines of the log. I've put the log at the link below and would greatly appreciate it if you could have a look through it and provide some feedback.

http://home.apache.org/~lewismc/giza.log

I'll update this thread with the outcome of the berkeley alignment before moving on to fast_align.

Thanks both again.
Lewis
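P.S. In case it helps anyone else hitting this: a minimal sketch of how one could check whether GIZA++ and the support binaries Matt mentions (mkcls, snt2cooc.out) can be found at all. The helper name `find_missing_binaries` and the idea of looking on PATH or in a single directory are my own assumptions, not something the Joshua pipeline does; point `search_dir` at wherever GIZA++ was actually built.

```python
import os
import shutil

# Binary names come from Matt's message; where they live is an assumption.
GIZA_BINARIES = ["GIZA++", "mkcls", "snt2cooc.out"]


def find_missing_binaries(names, search_dir=None):
    """Return the names that are not executable files.

    If search_dir is given, look only in that directory;
    otherwise look each name up on PATH.
    """
    missing = []
    for name in names:
        if search_dir is not None:
            candidate = os.path.join(search_dir, name)
            found = os.path.isfile(candidate) and os.access(candidate, os.X_OK)
        else:
            found = shutil.which(name) is not None
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing_binaries(GIZA_BINARIES)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All GIZA++ binaries found on PATH.")
```

This only tells you whether the files exist and are executable; it says nothing about whether GIZA++ compiled correctly, so the error logs are still the place to look.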