Hi Sanjna,

The error you are getting comes from threshold.pl.

This means that the miner did not run correctly, so the *.probs file that
the threshold script reads is empty. Are you running the training manually
or through train-transliteration-module.pl? Please make sure to run the
cleaning script on your word list before running the miner.

If the above doesn't help, send me the 1-1 word list or the parallel data
(with alignments) that the miner is running on.
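
Before sending the data, a quick sanity check of the files may already show
the problem. The following is only a minimal sketch in Python, not part of
Moses. It assumes the word list has one pair per line with exactly two
whitespace-separated fields, which may not match your file. The function
names are made up for illustration.

```python
from pathlib import Path


def is_empty(path):
    """Return True if the file is missing or has zero bytes."""
    p = Path(path)
    return (not p.exists()) or p.stat().st_size == 0


def summarize_pairs(path):
    """Return (total_lines, malformed_lines) for a word-pair file.

    Assumption: one pair per line, exactly two whitespace-separated
    fields. Change the check if your word list uses another layout.
    """
    total = 0
    malformed = 0
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            total += 1
            if len(line.split()) != 2:
                malformed += 1
    return total, malformed
```

Calling is_empty() on the *.probs file shows whether the miner produced any
output, and summarize_pairs() on the word list counts lines that do not look
like a pair under the assumption above.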

Cheers,
Nadir

On Thu, May 5, 2016 at 1:54 PM, <[email protected]> wrote:

> Today's Topics:
>
>    1. Re: Data for building a factored model (Sašo Kuntaric)
>    2. Tranliteration error (Sanjanashree Palanivel)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Wed, 4 May 2016 21:30:17 +0200
> From: Sašo Kuntaric <[email protected]>
> Subject: Re: [Moses-support] Data for building a factored model
> To: Marwa Refaie <[email protected]>
> Cc: [email protected]
> Message-ID:
>         <CANsquDosSSn=__
> [email protected]>
> Content-Type: text/plain; charset="utf-8"
>
> Hello again,
>
> I believe I can wrap my head around the theoretical part, but the English
> and German corpora in the Moses factored model tutorial (
> http://www.statmt.org/moses/?n=Moses.FactoredTutorial) look beautifully
> factored, so my question is how were the original corpora processed? Was a
> specific tagger used and was there any manual/script postprocessing done?
>
> And since I am already bugging everyone, how is the language model pos.lm
> created? Is it extracted from a file, created manually or in another way?
>
> Thank you in advance for all the replies.
>
> Best regards,
>
> Sašo
>
> 2016-05-02 19:45 GMT+02:00 Marwa Refaie <[email protected]>:
>
> > The corpus for the translation model should be in two parallel files, one
> > per language, in the format word|pos|lemma ... You can prepare the files
> > with WordNet, the Stanford tools, or any tagger and stemmer that can
> > handle your language pair. Before passing the files to Moses, you may
> > need to adjust the text files with a Python script (write it yourself).
> >
> > For the language model, you must build the training text as follows:
> > Verb noun noun
> > Noun Det adj
> > ....... using the target language only. Then build it as a usual
> > n-gram LM.
> >
> > Sent from my iPad
> >
> > > On May 2, 2016, at 10:11, Sašo Kuntaric <[email protected]> wrote:
> > >
> > > Hi all,
> > >
> > > I am having some issues producing the corpora in the correct format for
> > > Moses to execute factored training.
> > >
> > > I am looking at the factored tutorial on the Moses website and I am
> > > wondering how to get such consistent corpora for two languages. What
> > > tools are being used, and can they be trained for specific languages
> > > (Slovenian in my example)? Are such tools available for download, or is
> > > such data produced with custom scripts?
> > >
> > > --
> > > Best regards,
> > >
> > > Sašo
> > > _______________________________________________
> > > Moses-support mailing list
> > > [email protected]
> > > http://mailman.mit.edu/mailman/listinfo/moses-support
> >
>
>
>
> --
> lp,
>
> Sašo
>
> ------------------------------
>
> Message: 2
> Date: Thu, 5 May 2016 16:24:04 +0530
> From: Sanjanashree Palanivel <[email protected]>
> Subject: [Moses-support] Tranliteration error
> To: [email protected]
> Message-ID:
>         <CAAc_kp69zSo0hBAkO=
> [email protected]>
> Content-Type: text/plain; charset="utf-8"
>
> Dear All,
>
>
> When I try to train the transliteration module, I get the following error.
> I don't know what is missing. Please help.
>
> Extracting Transliteration Pairs
> > Constructing Graph
> > Computing Probs : iteration 1
> > Computing Probs : iteration 2
> > Computing Probs : iteration 3
> > Computing Probs : iteration 4
> > Computing Probs : iteration 5
> > Computing Probs : iteration 6
> > Computing Probs : iteration 7
> > Computing Probs : iteration 8
> > Computing Probs : iteration 9
> > Computing Probs : iteration 10
> > Finished...
> > Selecting Transliteration Pairs with threshold 0.5
> > Name "main::hash" used only once: possible typo at
> > /home/sanjana/Documents/SMT/mosesdecoder/scripts/Transliteration/
> > threshold.pl line 26.
> > Preparing Corpus
> > Align Corpus
> > Using SCRIPTS_ROOTDIR: /home/sanjana/Documents/SMT/mosesdecoder/scripts
> > Using multi-thread GIZA
> > ERROR: Cannot find
> > /home/sanjana/Documents/SMT/mosesdecoder/tools/merge_alignment.py at
> >
> /home/sanjana/Documents/SMT/mosesdecoder/scripts/training/train-model.perl
> > line 393.
> > Using SCRIPTS_ROOTDIR: /home/sanjana/Documents/SMT/mosesdecoder/scripts
> > Using multi-thread GIZA
> > ERROR: Cannot find
> > /home/sanjana/Documents/SMT/mosesdecoder/tools/merge_alignment.py at
> >
> /home/sanjana/Documents/SMT/mosesdecoder/scripts/training/train-model.perl
> > line 393.
> > Using SCRIPTS_ROOTDIR: /home/sanjana/Documents/SMT/mosesdecoder/scripts
> > Using multi-thread GIZA
> > ERROR: Cannot find
> > /home/sanjana/Documents/SMT/mosesdecoder/tools/merge_alignment.py at
> >
> /home/sanjana/Documents/SMT/mosesdecoder/scripts/training/train-model.perl
> > line 393.
> > Using SCRIPTS_ROOTDIR: /home/sanjana/Documents/SMT/mosesdecoder/scripts
> > using gzip
> > (3) generate word alignment @ Thu May  5 16:19:50 IST 2016
> > Combining forward and inverted alignment from files:
> >
> >
> /home/sanjana/Documents/SMT/Transliteration/training/giza-inverse/en-hi.A3.final.{bz2,gz}
> >
> >
> /home/sanjana/Documents/SMT/Transliteration/training/giza/hi-en.A3.final.{bz2,gz}
> > ERROR: Can't read
> >
> /home/sanjana/Documents/SMT/Transliteration/training/giza-inverse/en-hi.A3.final.{bz2,gz}
> > Train Translation Models
> > Using SCRIPTS_ROOTDIR: /home/sanjana/Documents/SMT/mosesdecoder/scripts
> > using gzip
> > (4) generate lexical translation table 0-0 @ Thu May  5 16:19:50 IST 2016
> >
> >
> (/home/sanjana/Documents/SMT/Transliteration/training/corpus.en,/home/sanjana/Documents/SMT/Transliteration/training/corpus.hi,/home/sanjana/Documents/SMT/Transliteration/model/lex)
> > ERROR: Can't read
> >
> /home/sanjana/Documents/SMT/Transliteration/model/aligned.grow-diag-final-and
> > at
> >
> /home/sanjana/Documents/SMT/mosesdecoder/scripts/training/LexicalTranslationModel.pm
> > line 92.
> > Using SCRIPTS_ROOTDIR: /home/sanjana/Documents/SMT/mosesdecoder/scripts
> > using gzip
> > (5) extract phrases @ Thu May  5 16:19:50 IST 2016
> > File not found:
> >
> /home/sanjana/Documents/SMT/Transliteration/model/aligned.grow-diag-final-and
> > at
> >
> /home/sanjana/Documents/SMT/mosesdecoder/scripts/training/train-model.perl
> > line 1609.
> > Using SCRIPTS_ROOTDIR: /home/sanjana/Documents/SMT/mosesdecoder/scripts
> > using gzip
> > (6) score phrases @ Thu May  5 16:19:50 IST 2016
> > (6.1)  creating table half
> > /home/sanjana/Documents/SMT/Transliteration/model/phrase-table.half.f2e @
> > Thu May  5 16:19:50 IST 2016
> >
> /home/sanjana/Documents/SMT/mosesdecoder/scripts/generic/score-parallel.perl
> > 8 "sort    "
> /home/sanjana/Documents/SMT/mosesdecoder/scripts/../bin/score
> > /home/sanjana/Documents/SMT/Transliteration/model/extract.sorted.gz
> > /home/sanjana/Documents/SMT/Transliteration/model/lex.f2e
> >
> /home/sanjana/Documents/SMT/Transliteration/model/phrase-table.half.f2e.gz
> > --KneserNey  0
> > Executing:
> >
> /home/sanjana/Documents/SMT/mosesdecoder/scripts/generic/score-parallel.perl
> > 8 "sort    "
> /home/sanjana/Documents/SMT/mosesdecoder/scripts/../bin/score
> > /home/sanjana/Documents/SMT/Transliteration/model/extract.sorted.gz
> > /home/sanjana/Documents/SMT/Transliteration/model/lex.f2e
> >
> /home/sanjana/Documents/SMT/Transliteration/model/phrase-table.half.f2e.gz
> > --KneserNey  0
> > using gzip
> > Started Thu May  5 16:19:50 2016
> > gzip:
> /home/sanjana/Documents/SMT/Transliteration/model/extract.sorted.gz:
> > No such file or directory
> > /home/sanjana/Documents/SMT/mosesdecoder/scripts/../bin/score
> > /home/sanjana/Documents/SMT/Transliteration/model/tmp.10464/extract.0.gz
> > /home/sanjana/Documents/SMT/Transliteration/model/lex.f2e
> >
> /home/sanjana/Documents/SMT/Transliteration/model/tmp.10464/phrase-table.half.0000000.gz
> > --KneserNey  2>> /dev/stderr
> > /home/sanjana/Documents/SMT/Transliteration/model/tmp.10464/
> >
> run.0.sh/home/sanjana/Documents/SMT/Transliteration/model/tmp.10464/run.1.sh/home/sanjana/Documents/SMT/Transliteration/model/tmp.10464/run.2.sh/home/sanjana/Documents/SMT/Transliteration/model/tmp.10464/run.3.sh/home/sanjana/Documents/SMT/Transliteration/model/tmp.10464/run.4.sh/home/sanjana/Documents/SMT/Transliteration/model/tmp.10464/run.5.sh/home/sanjana/Documents/SMT/Transliteration/model/tmp.10464/run.6.sh/home/sanjana/Documents/SMT/Transliteration/model/tmp.10464/run.7.shScore
> > v2.1 -- scoring methods for extracted rules
> > adjusting phrase translation probabilities with Kneser Ney discounting
> > Loading lexical translation table from
> > /home/sanjana/Documents/SMT/Transliteration/model/lex.f2eCan't read
> > /home/sanjana/Documents/SMT/Transliteration/model/lex.f2e
> > mv
> >
> /home/sanjana/Documents/SMT/Transliteration/model/tmp.10464/phrase-table.half.0000000.gz
> >
> /home/sanjana/Documents/SMT/Transliteration/model/phrase-table.half.f2e.gzmv:
> > cannot stat
> >
> '/home/sanjana/Documents/SMT/Transliteration/model/tmp.10464/phrase-table.half.0000000.gz':
> > No such file or directory
> > Exit code: 1
> > ERROR: Scoring of phrases failed at
> >
> /home/sanjana/Documents/SMT/mosesdecoder/scripts/training/train-model.perl
> > line 1773.
> > (6.3)  creating table half
> > /home/sanjana/Documents/SMT/Transliteration/model/phrase-table.half.e2f @
> > Thu May  5 16:19:50 IST 2016
> >
> /home/sanjana/Documents/SMT/mosesdecoder/scripts/generic/score-parallel.perl
> > 8 "sort    "
> /home/sanjana/Documents/SMT/mosesdecoder/scripts/../bin/score
> > /home/sanjana/Documents/SMT/Transliteration/model/extract.inv.sorted.gz
> > /home/sanjana/Documents/SMT/Transliteration/model/lex.e2f
> >
> /home/sanjana/Documents/SMT/Transliteration/model/phrase-table.half.e2f.gz
> > --Inverse --KneserNey  1
> > Executing:
> >
> /home/sanjana/Documents/SMT/mosesdecoder/scripts/generic/score-parallel.perl
> > 8 "sort    "
> /home/sanjana/Documents/SMT/mosesdecoder/scripts/../bin/score
> > /home/sanjana/Documents/SMT/Transliteration/model/extract.inv.sorted.gz
> > /home/sanjana/Documents/SMT/Transliteration/model/lex.e2f
> >
> /home/sanjana/Documents/SMT/Transliteration/model/phrase-table.half.e2f.gz
> > --Inverse --KneserNey  1
> > using gzip
> > Started Thu May  5 16:19:50 2016
> > gzip:
> > /home/sanjana/Documents/SMT/Transliteration/model/extract.inv.sorted.gz:
> No
> > such file or directory
> > /home/sanjana/Documents/SMT/mosesdecoder/scripts/../bin/score
> > /home/sanjana/Documents/SMT/Transliteration/model/tmp.10512/extract.0.gz
> > /home/sanjana/Documents/SMT/Transliteration/model/lex.e2f
> >
> /home/sanjana/Documents/SMT/Transliteration/model/tmp.10512/phrase-table.half.0000000.gz
> > --Inverse --KneserNey  2>> /dev/stderr
> > /home/sanjana/Documents/SMT/Transliteration/model/tmp.10512/
> >
> run.0.sh/home/sanjana/Documents/SMT/Transliteration/model/tmp.10512/run.1.sh/home/sanjana/Documents/SMT/Transliteration/model/tmp.10512/run.2.sh/home/sanjana/Documents/SMT/Transliteration/model/tmp.10512/run.3.sh/home/sanjana/Documents/SMT/Transliteration/model/tmp.10512/run.5.sh/home/sanjana/Documents/SMT/Transliteration/model/tmp.10512/run.6.sh/home/sanjana/Documents/SMT/Transliteration/model/tmp.10512/run.7.sh/home/sanjana/Documents/SMT/Transliteration/model/tmp.10512/run.4.shScore
> > v2.1 -- scoring methods for extracted rules
> > using inverse mode
> > adjusting phrase translation probabilities with Kneser Ney discounting
> > Loading lexical translation table from
> > /home/sanjana/Documents/SMT/Transliteration/model/lex.e2fCan't read
> > /home/sanjana/Documents/SMT/Transliteration/model/lex.e2f
> > gunzip -c
> >
> /home/sanjana/Documents/SMT/Transliteration/model/tmp.10512/phrase-table.half.*.gz
> > 2>> /dev/stderr| LC_ALL=C sort     -T
> > /home/sanjana/Documents/SMT/Transliteration/model/tmp.10512  | gzip -c >
> >
> /home/sanjana/Documents/SMT/Transliteration/model/phrase-table.half.e2f.gz
> > 2>> /dev/stderr gzip:
> >
> /home/sanjana/Documents/SMT/Transliteration/model/tmp.10512/phrase-table.half.*.gz:
> > No such file or directory
> > rm -rf /home/sanjana/Documents/SMT/Transliteration/model/tmp.10512
> > Finished Thu May  5 16:19:50 2016
> > (6.6) consolidating the two halves @ Thu May  5 16:19:50 IST 2016
> > Executing:
> > /home/sanjana/Documents/SMT/mosesdecoder/scripts/../bin/consolidate
> >
> /home/sanjana/Documents/SMT/Transliteration/model/phrase-table.half.f2e.gz
> >
> /home/sanjana/Documents/SMT/Transliteration/model/phrase-table.half.e2f.gz
> > /dev/stdout --KneserNey
> >
> /home/sanjana/Documents/SMT/Transliteration/model/phrase-table.half.f2e.gz.coc
> > | gzip -c >
> > /home/sanjana/Documents/SMT/Transliteration/model/phrase-table.gz
> > Consolidate v2.0 written by Philipp Koehn
> > consolidating direct and indirect rule tables
> > adjusting phrase translation probabilities with Kneser Ney discounting
> > Can't read
> >
> /home/sanjana/Documents/SMT/Transliteration/model/phrase-table.half.f2e.gz.coc
> > Executing: rm -f
> > /home/sanjana/Documents/SMT/Transliteration/model/phrase-table.half.*
> > Train Language Models
> > one of required modified KneserNey count-of-counts is zero
> > error in discount estimator for order 2
> > while opening /home/sanjana/Documents/SMT/Transliteration/lm/targetLM
> > ERROR
> > Create Config File
> > Using SCRIPTS_ROOTDIR: /home/sanjana/Documents/SMT/mosesdecoder/scripts
> > using gzip
> > ERROR: Language model file not found or empty:
> > /home/sanjana/Documents/SMT/Transliteration/lm/targetLM.bin at
> >
> /home/sanjana/Documents/SMT/mosesdecoder/scripts/training/train-model.perl
> > line 602.
> > Running Tuning for Transliteration Module
> > Using SCRIPTS_ROOTDIR: /home/sanjana/Documents/SMT/mosesdecoder/scripts
> > using gzip
> > (9) create moses.ini @ Thu May  5 16:19:50 IST 2016
> > Executing: mkdir -p
> > /home/sanjana/Documents/SMT/Transliteration/tuning/filtered
> > Stripping XML...
> > Executing:
> >
> /home/sanjana/Documents/SMT/mosesdecoder/scripts/training/../generic/strip-xml.perl
> > < /home/sanjana/Documents/SMT/Transliteration/tuning/input >
> > /home/sanjana/Documents/SMT/Transliteration/tuning/filtered/input.10592
> > pt:PhraseDictionaryMemory name=TranslationModel0 num-features=4
> > path=/home/sanjana/Documents/SMT/Transliteration/model/phrase-table
> > input-factor=0 output-factor=0
> > Considering factor 0
> > Filtering files...
> > filtering /home/sanjana/Documents/SMT/Transliteration/model/phrase-table
> > ->
> >
> /home/sanjana/Documents/SMT/Transliteration/tuning/filtered/phrase-table.0-0.1.1...
> > No phrases found in
> > /home/sanjana/Documents/SMT/Transliteration/model/phrase-table! at
> > /home/sanjana/Documents/SMT/mosesdecoder/scripts/training/
> > filter-model-given-input.pl line 398.
> > sh: 1: cannot open
> > /home/sanjana/Documents/SMT/Transliteration/model/moses.ini: No such file
> > Using SCRIPTS_ROOTDIR: /home/sanjana/Documents/SMT/mosesdecoder/scripts
> > File not found:
> > /home/sanjana/Documents/SMT/Transliteration/tuning/moses.filtered.ini
> > (interpreted as
> > /home/sanjana/Documents/SMT/Transliteration/tuning/moses.filtered.ini).
> at
> > /home/sanjana/Documents/SMT/mosesdecoder/scripts/training/mert-moses.pl
> > line 494.
> > cp: cannot stat
> > '/home/sanjana/Documents/SMT/Transliteration/tuning/tmp/moses.ini': No such
> > file or directory
> > ERROR cannot open base-ini
> > '/home/sanjana/Documents/SMT/Transliteration/model/moses.ini': No such
> file
> > or directory at
> >
> /home/sanjana/Documents/SMT/mosesdecoder/scripts/ems/support/substitute-weights.perl
> > line 16.
> > Training Transliteration Module - End Thu May  5 16:19:50 IST 2016
> >
> > --
> Thanks and regards,
>
> Sanjanasri J.P
>
> ------------------------------
>
>
>
> End of Moses-support Digest, Vol 115, Issue 4
> *********************************************
>
