Why do I get "Use of uninitialized value in string eq at /home/mosesdecoder/scripts/Transliteration/clean.pl line 139, <$IN> line 1." while training the transliteration model? What is wrong?
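The `<$IN> line 1` suffix means Perl reported the undefined value right after reading the first line from the `$IN` handle, and the `$wrds[1]` warning further down points to a line that split into fewer than two fields. One plausible cause is an empty or malformed 1-1 alignment list reaching clean.pl. The Python sketch below is only a diagnostic aid and not part of Moses: the helper name `find_short_lines` is made up for illustration, and it assumes whitespace-separated fields, which may not match how clean.pl actually splits its input.

```python
def find_short_lines(path, min_fields=2, limit=10):
    """Return (line_number, text) for lines with fewer than min_fields fields.

    Assumption: fields are separated by whitespace. Adjust the split if the
    file that clean.pl reads uses a different delimiter.
    """
    problems = []
    with open(path, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if len(line.split()) < min_fields:
                problems.append((number, line.rstrip("\n")))
                if len(problems) >= limit:
                    break  # a handful of examples is enough to diagnose
    return problems
```

Calling `find_short_lines` on the extracted 1-1 list (for example the `1-1.en-hi` file mentioned later in the thread) and getting an entry for line 1, or finding that the file is empty, would be consistent with the warnings above.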
On Fri, May 6, 2016 at 4:20 PM, Sanjanashree Palanivel <[email protected]> wrote:

> I installed mgiza and copied its binaries, and also the merge_alignment.py
> file, into the same folder as the giza++ binaries, but I still get an
> error. This time the output is:
>
> Training Transliteration Module - Start
> Fri May 6 16:16:04 IST 2016
> Creating Model
> Extracting 1-1 Alignments
> Cleaning the list for Miner
> Source is Latin
> will run Transliteration module
> Three preprocessing steps to do:
> 1) Delete Symbol 2) Delete Latin from non-Latin langauge 3) Character Frequency based filtering
> STARTING 1 and 2 ...
> Use of uninitialized value in string eq at /home/sanjana/Documents/SMT/mosesdecoder/scripts/Transliteration/clean.pl line 139, <$IN> line 1.
> Use of uninitialized value $wrds[1] in numeric lt (<) at /home/sanjana/Documents/SMT/mosesdecoder/scripts/Transliteration/clean.pl line 143, <$IN> line 1.
> Use of uninitialized value $retur in numeric eq (==) at /home/sanjana/Documents/SMT/mosesdecoder/scripts/Transliteration/clean.pl line 61, <$IN> line 1.
> DONE 1 and 2
> STARTING 3) Preprocessing for Character filtering...
> Use of uninitialized value $keys[0] in hash element at /home/sanjana/Documents/SMT/mosesdecoder/scripts/Transliteration/clean.pl line 197.
> Use of uninitialized value $bestsrcfreq in multiplication (*) at /home/sanjana/Documents/SMT/mosesdecoder/scripts/Transliteration/clean.pl line 198.
> Use of uninitialized value $keys[0] in hash element at /home/sanjana/Documents/SMT/mosesdecoder/scripts/Transliteration/clean.pl line 227.
> Use of uninitialized value $besttrgfreq in multiplication (*) at /home/sanjana/Documents/SMT/mosesdecoder/scripts/Transliteration/clean.pl line 228.
> DONE 3
> Extracting Transliteration Pairs
> Constructing Graph
> Computing Probs : iteration 1
> Computing Probs : iteration 2
> Computing Probs : iteration 3
> Computing Probs : iteration 4
> Computing Probs : iteration 5
> Computing Probs : iteration 6
> Computing Probs : iteration 7
> Computing Probs : iteration 8
> Computing Probs : iteration 9
> Computing Probs : iteration 10
> Finished...
> Selecting Transliteration Pairs with threshold 0.5
> Name "main::hash" used only once: possible typo at /home/sanjana/Documents/SMT/mosesdecoder/scripts/Transliteration/threshold.pl line 26.
> Preparing Corpus
> Align Corpus
> Using SCRIPTS_ROOTDIR: /home/sanjana/Documents/SMT/mosesdecoder/scripts
> Using multi-thread GIZA
> using gzip
> (1) preparing corpus @ Fri May 6 16:16:05 IST 2016
> Executing: mkdir -p /home/sanjana/Documents/SMT/Transliteration/training/prepared
> (1.0) selecting factors @ Fri May 6 16:16:05 IST 2016
> (1.1) running mkcls @ Fri May 6 16:16:05 IST 2016
> /home/sanjana/Documents/SMT/mosesdecoder/tools/mkcls -c50 -n2 -p/home/sanjana/Documents/SMT/Transliteration/training/corpus.en -V/home/sanjana/Documents/SMT/Transliteration/training/prepared/en.vcb.classes opt
> Executing: /home/sanjana/Documents/SMT/mosesdecoder/tools/mkcls -c50 -n2 -p/home/sanjana/Documents/SMT/Transliteration/training/corpus.en -V/home/sanjana/Documents/SMT/Transliteration/training/prepared/en.vcb.classes opt
> ERROR: Execution of: /home/sanjana/Documents/SMT/mosesdecoder/tools/mkcls -c50 -n2 -p/home/sanjana/Documents/SMT/Transliteration/training/corpus.en -V/home/sanjana/Documents/SMT/Transliteration/training/prepared/en.vcb.classes opt
>
> On Fri, May 6, 2016 at 4:09 PM, Nadir Durrani <[email protected]> wrote:
>
>> You need to check if you have mgiza and its required components in the
>> external bin directory. Here's the git
>>
>> https://github.com/moses-smt/mgiza
>>
>> Have you ever trained a Moses SMT system? Here are the instructions.
>>
>> http://www.statmt.org/moses/?n=Development.GetStarted
>>
>> On Fri, May 6, 2016 at 11:36 AM, Sanjanashree Palanivel <[email protected]> wrote:
>>
>>> Dear Nadir,
>>>
>>> How should the input be given to train transliteration? Is just a raw
>>> parallel corpus enough?
>>>
>>> When I try running this script
>>>
>>>> /home/sanjana/Documents/SMT/mosesdecoder/scripts/Transliteration/train-transliteration-module.pl
>>>> --corpus-f DATA/ICON15/H_train.en
>>>> --corpus-e DATA/ICON15/H_train.hi
>>>> --alignment /home/sanjana/Documents/SMT/ICON15/Health/BL/En_H/model/aligned.grow-diag-final-and
>>>> --moses-src-dir /home/sanjana/Documents/SMT/mosesdecoder
>>>> --external-bin-dir /home/sanjana/Documents/SMT/mosesdecoder/tools
>>>> --input-extension en
>>>> --output-extension hi
>>>> --srilm-dir /home/sanjana/Documents/SMT/srilm-1.7.1/bin/i686-m64
>>>> --out-dir /home/sanjana/Documents/SMT/Transliteration
>>>
>>> GIZA is not running, I guess, because I do not find any folders for
>>> giza.
>>>
>>> I understand that the transliteration scripts work fine, but why am I
>>> unable to train models?
>>>
>>> What mistake am I making?
>>>
>>> SRILM was installed correctly; when I checked with ngram-count, it
>>> worked fine.
>>>
>>> Why did an error mentioning multi-thread GIZA occur (I didn't install
>>> mgiza)? Do I have to install mgiza?
>>>
>>> Please guide me, I do not understand why it is not working.
>>>
>>> On Fri, May 6, 2016 at 7:08 AM, Sanjanashree Palanivel <[email protected]> wrote:
>>>
>>>> Dear Nadir,
>>>> Thanks a lot. I will work on what you have said and update you on
>>>> what happens.
>>>> On May 6, 2016 4:36 AM, "Nadir Durrani" <[email protected]> wrote:
>>>>
>>>>> I can only ensure that there's no bug in the scripts. You will need to
>>>>> debug and troubleshoot the problem. The files I sent you should be
>>>>> helpful. Here are the steps
>>>>>
>>>>> Mining
>>>>>
>>>>> 1. Extract 1-1 alignments from parallel data, compare "1-1.en-hi" file
>>>>> with mine
>>>>> 2. Clean the list and make ready for miner, compare 1-1.en-hi.cleaned
>>>>> with mine
>>>>> 3. TMining to extract transliteration pairs,
>>>>> compare 1-1.en-hi.pair-probs with mine
>>>>> 4. Threshold.pl to extract the transliteration corpus,
>>>>> compare 1-1.en-hi.mined-pairs with mine
>>>>>
>>>>> Transliteration Model
>>>>>
>>>>> 1. Running Giza on the corpus, you should be able to see giza and
>>>>> giza-inverse folders inside training and
>>>>> model/aligned.grow-diag-final-and
>>>>> 2. Model training, you should be able to see the following files inside
>>>>> the model folder
>>>>>
>>>>> extract.inv.sorted.gz extract.sorted.gz lex.e2f lex.f2e moses.ini
>>>>> phrase-table.gz
>>>>>
>>>>> and targetLM.bin inside the lm folder
>>>>>
>>>>> 3. Tune the system, the tuning folder should have the following files
>>>>>
>>>>> filtered input moses.filtered.ini moses.ini moses.tuned.ini
>>>>> reference tmp
>>>>>
>>>>> moses.ini is the final file that is created. If you open it you will
>>>>> see the BLEU scores for the tuning set (if it ran properly)
>>>>>
>>>>> Just make sure that your Moses is compiled fine and works properly. If
>>>>> things still don't work then try pulling a new version and recompile
>>>>> from scratch.
>>>>>
>>>>> Good luck
>>>>>
>>>>> Nadir
>>>>>
>>>>> On Thu, May 5, 2016 at 6:44 PM, Sanjanashree Palanivel <[email protected]> wrote:
>>>>>
>>>>>> Dear Nadir,
>>>>>>
>>>>>> Thanks a lot. But why couldn't I train the transliteration model or
>>>>>> do anything regarding transliteration? What should I do to make it
>>>>>> work? Please help me with this.
>>>>>>
>>>>>> On Thu, May 5, 2016 at 8:18 PM, Nadir Durrani <[email protected]> wrote:
>>>>>>
>>>>>>> I just asked for the word-alignment :-)
>>>>>>>
>>>>>>> Anyway, I ran your script with my paths and it ran fine. I am
>>>>>>> attaching my Transliteration folder.
>>>>>>>
>>>>>>> As you can see in
>>>>>>>
>>>>>>> 1-1.en-hi.mined-pairs
>>>>>>>
>>>>>>> roughly 4000 transliteration pairs were mined. The threshold.pl
>>>>>>> script selects the word pairs whose probability is lower than 0.5.
>>>>>>> The entire list with probs can be seen in
>>>>>>>
>>>>>>> 1-1.en-hi.pair-probs
>>>>>>>
>>>>>>> The lower the probability, the better the transliteration.
>>>>>>>
>>>>>>> Looking at the transliteration module and tuning run, you can see
>>>>>>> that the transliteration system is pretty good. Check out
>>>>>>>
>>>>>>> moses.ini
>>>>>>>
>>>>>>> in the tuning folder. Tuning BLEU is 91.48 which is great.
>>>>>>>
>>>>>>> Nadir
>>>>>>>
>>>>>>> On Thu, May 5, 2016 at 3:18 PM, Sanjanashree Palanivel <[email protected]> wrote:
>>>>>>>
>>>>>>>> En_H.zip
>>>>>>>> <https://drive.google.com/file/d/0Bwi7uqU0aYEzQ1djLWlGMnQ3Q2c/view?usp=drive_web>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I have attached the zip file
>>>>>>>>
>>>>>>>> --
>>>>>>>> Thanks and regards,
>>>>>>>>
>>>>>>>> Sanjanasri J.P
>>>>>>
>>>>>> --
>>>>>> Thanks and regards,
>>>>>>
>>>>>> Sanjanasri J.P
>>>
>>> --
>>> Thanks and regards,
>>>
>>> Sanjanasri J.P
>
> --
> Thanks and regards,
>
> Sanjanasri J.P

--
Thanks and regards,
Sanjanasri J.P
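Nadir's checklist of mining, model, and tuning outputs can be turned into a quick check of the output directory. The Python sketch below is an illustration under assumptions, not a Moses tool: the file names are taken from the thread, but the folder names (`model`, `lm`, `tuning`) and the function name `missing_outputs` are guesses, so the search is done recursively instead of assuming an exact layout. Empty files are counted as missing, since an empty intermediate file would also break the next step.

```python
from pathlib import Path

# File names as listed in the thread; the sub-folder names are assumptions.
EXPECTED = {
    "mining": [
        "1-1.en-hi",
        "1-1.en-hi.cleaned",
        "1-1.en-hi.pair-probs",
        "1-1.en-hi.mined-pairs",
    ],
    "model": [
        "model/extract.sorted.gz",
        "model/extract.inv.sorted.gz",
        "model/lex.e2f",
        "model/lex.f2e",
        "model/moses.ini",
        "model/phrase-table.gz",
        "lm/targetLM.bin",
    ],
    "tuning": [
        "tuning/moses.ini",
        "tuning/moses.tuned.ini",
        "tuning/moses.filtered.ini",
    ],
}


def missing_outputs(out_dir):
    """Map each stage to the expected files that are absent or empty."""
    root = Path(out_dir)
    missing = {}
    for stage, patterns in EXPECTED.items():
        absent = [
            pattern
            for pattern in patterns
            # Search at any depth below out_dir for a non-empty match.
            if not any(
                match.is_file() and match.stat().st_size > 0
                for match in root.rglob(pattern)
            )
        ]
        if absent:
            missing[stage] = absent
    return missing
```

Running `missing_outputs` on the directory passed as `--out-dir` shows the first stage with missing files, which is where to start debugging; in the log above, that would be expected to be the mining stage if the cleaned list came out empty.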
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
