Hi Folks,
When attempting to build a hiero model using 5K sentences for tuning, many
more than that for testing, and many more again for the actual corpus
(~880K), I get the following error within the GIZA alignment phase of the
pipeline.

Anyone have a clue what this means? I have the full GIZA logs if they are
useful.
I did find a thread on a VERY similar issue at [0]. The solution there seems
to be to use absolute paths for all input data to the pipeline; however,
that is exactly what I've done, e.g.

$JOSHUA/bin/pipeline.pl  --rundir . --type hiero --corpus
/usr/local/joshua_input/commoncrawl.ru-en --tune
/usr/local/joshua_input/commoncrawl.ru-en.tune --test
/usr/local/joshua_input/commoncrawl.ru-en.test --source en --target ru
--rundir experiment1/1 --readme "Experiment 1 Run 1 Hiero Russian to
English Translation model" --mbr

The parallel .en and .ru sentence files exist for each of the corpus, tune
and test paths above.
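
A quick way to double-check that claim can be sketched as below. This is
only a minimal sketch: check_parallel is a hypothetical helper, not part of
Joshua, and it assumes one sentence per line in each file. It confirms that
both language sides of a prefix exist and have the same number of lines.
Whether a mismatch has any bearing on the symal crash is not known.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Joshua): verify that a corpus prefix has
# both language sides and that the two files have the same line count.
check_parallel() {
  local prefix="$1" src="$2" tgt="$3"
  local f n_src n_tgt

  # Both sides must exist as regular files.
  for f in "$prefix.$src" "$prefix.$tgt"; do
    if [ ! -f "$f" ]; then
      echo "MISSING $f"
      return 1
    fi
  done

  # wc pads its output on some platforms (e.g. macOS), so strip whitespace.
  n_src=$(wc -l < "$prefix.$src" | tr -d '[:space:]')
  n_tgt=$(wc -l < "$prefix.$tgt" | tr -d '[:space:]')

  if [ "$n_src" -ne "$n_tgt" ]; then
    echo "MISMATCH $prefix: $n_src vs $n_tgt"
    return 1
  fi
  echo "OK $prefix: $n_src lines"
}

# Example call, using the prefixes from the command above:
#   for p in /usr/local/joshua_input/commoncrawl.ru-en{,.tune,.test}; do
#     check_parallel "$p" en ru
#   done
```

The function returns non-zero on the first problem it finds, so it can also
be used as a guard before launching the pipeline.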

[0] http://comments.gmane.org/gmane.comp.nlp.moses.user/10489

I have consistently had trouble generating models with GIZA. Is there a
suggested alternative aligner that I should be trying out?

One last question... roughly how long should building a Hiero-based LM for
a corpus of ~880K sentences take on, say, a MacBook Pro with a 2.7GHz Intel
Core i7 and 16GB of memory? I remember reading a while ago on the old Joshua
site that a pipeline would run in 10 or so minutes. That is clearly not the
case here, and I would like to share/compare some results, if possible, with
others who are in the business of generating LMs and language packs.

Thanks

==========================================================
Executing: bash -c rm -f alignments/0/giza.ru.0-en.0/ru.0-en.0.A3.final.gz
Executing: bash -c gzip alignments/0/giza.ru.0-en.0/ru.0-en.0.A3.final
Waiting for second GIZA process...
(3) generate word alignment @ Fri Jul 15 16:38:42 PDT 2016
Combining forward and inverted alignment from files:
  alignments/0/giza.en.0-ru.0/en.0-ru.0.A3.final.{bz2,gz}
  alignments/0/giza.ru.0-en.0/ru.0-en.0.A3.final.{bz2,gz}
Executing: bash -c mkdir -p alignments/0/model
Executing: bash -c /usr/local/incubator-joshua/ext/symal/giza2bal.pl -d
<(gzip -cd alignments/0/giza.ru.0-en.0/ru.0-en.0.A3.final.gz) -i <(gzip -cd
alignments/0/giza.en.0-ru.0/en.0-ru.0.A3.final.gz)
|/usr/local/incubator-joshua/ext/symal/symal -alignment="grow"
-diagonal="yes" -final="yes" -both="no"
-o=alignments/0/model/aligned.grow-diag-final
symal: computing grow alignment: diagonal (1) final (1)both-uncovered (0)
skip=<0> counts=<817962>
symal(9081,0x7fff76241310) malloc: *** error for object 0x7fff74472250:
pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
bash: line 1:  9080 Done
/usr/local/incubator-joshua/ext/symal/giza2bal.pl -d <(gzip -cd
alignments/0/giza.ru.0-en.0/ru.0-en.0.A3.final.gz) -i <(gzip -cd
alignments/0/giza.en.0-ru.0/en.0-ru.0.A3.final.gz)
      9081 Abort trap: 6           |
/usr/local/incubator-joshua/ext/symal/symal -alignment="grow"
-diagonal="yes" -final="yes" -both="no"
-o=alignments/0/model/aligned.grow-diag-final
Exit code: 134
ERROR: Can't generate symmetrized alignment file



-- 
*Lewis*
