Re: [Moses-support] (no subject)

Sanjanashree Palanivel Fri, 06 May 2016 06:05:08 -0700

Why do I get "Use of uninitialized value in string eq at
/home/mosesdecoder/scripts/Transliteration/clean.pl line 139, <$IN> line
1." while training the transliteration model? What is wrong?
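
A note on reading the warning: "Use of uninitialized value $wrds[1]" at
"<$IN> line 1" (see the quoted log) suggests that the first line of the file
given to clean.pl split into fewer fields than the script expects. The thread
does not show the input format, so the sketch below only assumes
whitespace-separated fields and at least two fields per line; both are
assumptions, and find_short_lines / check_file are hypothetical helper names,
not part of Moses. It lists the lines that would leave the second field
undefined.

```python
from typing import Iterable, List, Tuple


def find_short_lines(lines: Iterable[str], min_fields: int = 2) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line with fewer than min_fields fields.

    Assumption: fields are separated by whitespace. The real format expected
    by clean.pl is not shown in the thread.
    """
    problems = []
    for number, raw in enumerate(lines, start=1):
        text = raw.rstrip("\n")
        # str.split() with no argument splits on any run of whitespace,
        # so an empty or blank line yields zero fields.
        if len(text.split()) < min_fields:
            problems.append((number, text))
    return problems


def check_file(path: str, min_fields: int = 2) -> List[Tuple[int, str]]:
    """Run find_short_lines over a file, tolerating undecodable bytes."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_short_lines(handle, min_fields)
```

For example, check_file("1-1.en-hi") would report any empty or one-field
lines in the extracted 1-1 alignment list. An empty result means the file is
at least well formed under these assumptions, and the cause lies elsewhere.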


On Fri, May 6, 2016 at 4:20 PM, Sanjanashree Palanivel <
[email protected]> wrote:

> I installed mgiza and copied its binary subfolder, along with the
> merge_alignment.py file, into the same folder that holds the giza++
> binaries, but I still get an error. This time the output is:
>
> Training Transliteration Module - Start
> Fri May  6 16:16:04 IST 2016
> Creating Model
> Extracting 1-1 Alignments
> Cleaning the list for Miner
> Source is Latin
> will run Transliteration module
> Three preprocessing steps to do:
>  1) Delete Symbol      2) Delete Latin from non-Latin langauge      3)
> Character Frequency based filtering
> STARTING 1 and 2 ...
> Use of uninitialized value in string eq at
> /home/sanjana/Documents/SMT/mosesdecoder/scripts/Transliteration/clean.pl
> line 139, <$IN> line 1.
> Use of uninitialized value $wrds[1] in numeric lt (<) at
> /home/sanjana/Documents/SMT/mosesdecoder/scripts/Transliteration/clean.pl
> line 143, <$IN> line 1.
> Use of uninitialized value $retur in numeric eq (==) at
> /home/sanjana/Documents/SMT/mosesdecoder/scripts/Transliteration/clean.pl
> line 61, <$IN> line 1.
> DONE 1 and 2
> STARTING 3) Preprocessing for Character filtering...
> Use of uninitialized value $keys[0] in hash element at
> /home/sanjana/Documents/SMT/mosesdecoder/scripts/Transliteration/clean.pl
> line 197.
> Use of uninitialized value $bestsrcfreq in multiplication (*) at
> /home/sanjana/Documents/SMT/mosesdecoder/scripts/Transliteration/clean.pl
> line 198.
> Use of uninitialized value $keys[0] in hash element at
> /home/sanjana/Documents/SMT/mosesdecoder/scripts/Transliteration/clean.pl
> line 227.
> Use of uninitialized value $besttrgfreq in multiplication (*) at
> /home/sanjana/Documents/SMT/mosesdecoder/scripts/Transliteration/clean.pl
> line 228.
> DONE 3
> Extracting Transliteration Pairs
> Constructing Graph
> Computing Probs : iteration 1
> Computing Probs : iteration 2
> Computing Probs : iteration 3
> Computing Probs : iteration 4
> Computing Probs : iteration 5
> Computing Probs : iteration 6
> Computing Probs : iteration 7
> Computing Probs : iteration 8
> Computing Probs : iteration 9
> Computing Probs : iteration 10
> Finished...
> Selecting Transliteration Pairs with threshold 0.5
> Name "main::hash" used only once: possible typo at
> /home/sanjana/Documents/SMT/mosesdecoder/scripts/Transliteration/
> threshold.pl line 26.
> Preparing Corpus
> Align Corpus
> Using SCRIPTS_ROOTDIR: /home/sanjana/Documents/SMT/mosesdecoder/scripts
> Using multi-thread GIZA
> using gzip
> (1) preparing corpus @ Fri May  6 16:16:05 IST 2016
> Executing: mkdir -p
> /home/sanjana/Documents/SMT/Transliteration/training/prepared
> (1.0) selecting factors @ Fri May  6 16:16:05 IST 2016
> (1.1) running mkcls  @ Fri May  6 16:16:05 IST 2016
> /home/sanjana/Documents/SMT/mosesdecoder/tools/mkcls -c50 -n2
> -p/home/sanjana/Documents/SMT/Transliteration/training/corpus.en
> -V/home/sanjana/Documents/SMT/Transliteration/training/prepared/en.vcb.classes
> opt
> Executing: /home/sanjana/Documents/SMT/mosesdecoder/tools/mkcls -c50 -n2
> -p/home/sanjana/Documents/SMT/Transliteration/training/corpus.en
> -V/home/sanjana/Documents/SMT/Transliteration/training/prepared/en.vcb.classes
> opt
> ERROR: Execution of: /home/sanjana/Documents/SMT/mosesdecoder/tools/mkcls
> -c50 -n2 -p/home/sanjana/Documents/SMT/Transliteration/training/corpus.en
> -V/home/sanjana/Documents/SMT/Transliteration/training/prepared/en.vcb.classes
> opt
>
>
> On Fri, May 6, 2016 at 4:09 PM, Nadir Durrani <[email protected]>
> wrote:
>
>> You need to check if you have mgiza and its required components in the
>> external bin directory. Here's the git
>>
>> https://github.com/moses-smt/mgiza
>>
>> Have you ever trained a Moses SMT system? Here are the instructions.
>>
>> http://www.statmt.org/moses/?n=Development.GetStarted
>>
>>
>>
>> On Fri, May 6, 2016 at 11:36 AM, Sanjanashree Palanivel <
>> [email protected]> wrote:
>>
>>> Dear Nadir,
>>>
>>> How should the input be given to train transliteration? Is just a raw
>>> parallel corpus enough?
>>>
>>>            When I try running this script
>>>
>>> /home/sanjana/Documents/SMT/mosesdecoder/scripts/Transliteration/train-transliteration-module.pl \
>>>   --corpus-f DATA/ICON15/H_train.en \
>>>   --corpus-e DATA/ICON15/H_train.hi \
>>>   --alignment /home/sanjana/Documents/SMT/ICON15/Health/BL/En_H/model/aligned.grow-diag-final-and \
>>>   --moses-src-dir /home/sanjana/Documents/SMT/mosesdecoder \
>>>   --external-bin-dir /home/sanjana/Documents/SMT/mosesdecoder/tools \
>>>   --input-extension en \
>>>   --output-extension hi \
>>>   --srilm-dir /home/sanjana/Documents/SMT/srilm-1.7.1/bin/i686-m64 \
>>>   --out-dir /home/sanjana/Documents/SMT/Transliteration
>>>
>>> But GIZA is not running, I guess, because I do not find any folders
>>> related to giza.
>>>
>>> I understand that the transliteration scripts work fine. But why am I
>>> unable to train models?
>>>
>>> What mistake am I making?
>>>
>>> SRILM was installed correctly; when I checked with ngram-count, it
>>> worked fine.
>>>
>>> Why has an error mentioning multi-thread GIZA occurred? (I didn't install
>>> mgiza.) Do I have to install mgiza?
>>>
>>> Please guide me. I do not understand why it is not working.
>>>
>>>
>>> On Fri, May 6, 2016 at 7:08 AM, Sanjanashree Palanivel <
>>> [email protected]> wrote:
>>>
>>>> Dear Nadir,
>>>>      Thanks a lot. I will work on what you have said and update you
>>>> on what happens.
>>>> On May 6, 2016 4:36 AM, "Nadir Durrani" <[email protected]>
>>>> wrote:
>>>>
>>>>>
>>>>> I can only assure you that there's no bug in the scripts. You will need to
>>>>> debug and troubleshoot the problem. The files I sent you should be
>>>>> helpful.
>>>>> Here are the steps
>>>>>
>>>>> Mining
>>>>>
>>>>> 1. Extract 1-1 alignments from parallel data, compare "1-1.en-hi" file
>>>>> with mine
>>>>> 2. Clean the list and make ready for miner, compare 1-1.en-hi.cleaned
>>>>> with mine
>>>>> 3. TMining to extract transliteration pairs,
>>>>> compare 1-1.en-hi.pair-probs with mine
>>>>> 4. Threshold.pl to extract the transliteration corpus,
>>>>> compare 1-1.en-hi.mined-pairs with mine
>>>>>
>>>>> Transliteration Model
>>>>>
>>>>> 1. Running Giza on the corpus, you should be able to see giza and
>>>>> giza-inverse folders inside training, and
>>>>> model/aligned.grow-diag-final-and
>>>>> 2. Model training, you should be able to see following files inside
>>>>> model folder
>>>>>
>>>>>  extract.inv.sorted.gz  extract.sorted.gz  lex.e2f  lex.f2e  moses.ini
>>>>>  phrase-table.gz
>>>>>
>>>>> and targetLM.bin inside lm folder
>>>>>
>>>>> 3. Tune the system, tuning folder should have the following files
>>>>>
>>>>> filtered  input  moses.filtered.ini  moses.ini  moses.tuned.ini
>>>>>  reference  tmp
>>>>>
>>>>> moses.ini is the final file that is created. If you open it you will
>>>>> see the BLEU scores for the tuning set (if it ran properly)
>>>>>
>>>>> Just make sure that your Moses build compiled cleanly and works properly.
>>>>> If things still don't work, then try pulling a new version and recompiling
>>>>> from scratch.
>>>>>
>>>>> Good luck
>>>>>
>>>>> Nadir
>>>>>
>>>>> On Thu, May 5, 2016 at 6:44 PM, Sanjanashree Palanivel <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Dear Nadir,
>>>>>>
>>>>>> Thanks a lot. But why can't I train a transliteration model or do
>>>>>> anything regarding transliteration? What should I do to make it work?
>>>>>> Please help me with this.
>>>>>>
>>>>>> On Thu, May 5, 2016 at 8:18 PM, Nadir Durrani <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> I just asked for the word-alignment :-)
>>>>>>>
>>>>>>> Anyway, I ran your script with my paths and it ran fine. I am
>>>>>>> attaching my Transliteration folder.
>>>>>>>
>>>>>>> As you can see in
>>>>>>>
>>>>>>> 1-1.en-hi.mined-pairs
>>>>>>>
>>>>>>> roughly 4000 transliteration pairs were mined. The threshold.pl
>>>>>>> script selects the word pairs which have probability lower than 0.5.
>>>>>>> The entire list with probs can be seen in
>>>>>>>
>>>>>>> 1-1.en-hi.pair-probs
>>>>>>>
>>>>>>> The lower the probability number, the better the transliteration.
>>>>>>>
>>>>>>> Looking at the transliteration module and tuning run, you can see
>>>>>>> that the transliteration system is pretty good. Check out
>>>>>>>
>>>>>>> moses.ini
>>>>>>>
>>>>>>> in the tuning folder. Tuning BLEU is 91.48 which is great.
>>>>>>>
>>>>>>> Nadir
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, May 5, 2016 at 3:18 PM, Sanjanashree Palanivel <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> 
>>>>>>>>  En_H.zip
>>>>>>>> <https://drive.google.com/file/d/0Bwi7uqU0aYEzQ1djLWlGMnQ3Q2c/view?usp=drive_web>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I have attached the zip file
>>>>>>>>
>>>>>>>> --
>>>>>>>> Thanks and regards,
>>>>>>>>
>>>>>>>> Sanjanasri J.P
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Thanks and regards,
>>>>>>
>>>>>> Sanjanasri J.P
>>>>>>
>>>>>
>>>>>
>>>
>>>
>>> --
>>> Thanks and regards,
>>>
>>> Sanjanasri J.P
>>>
>>
>>
>
>
> --
> Thanks and regards,
>
> Sanjanasri J.P
>
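
The checkpoints Nadir lists in the quoted thread above (the files expected in
the model, lm and tuning folders) can be checked mechanically, which may help
to see at which step training stopped. The sketch below is only an
illustration: the file names come from his message, but the assumption that
the three folders sit directly under one root directory is mine, and
missing_or_empty is a hypothetical helper, not part of Moses. The mining
files and the giza folders are left out because the thread does not state
their exact locations or full names.

```python
from pathlib import Path
from typing import Dict, List

# File names as listed in the quoted message; the folder layout relative
# to a single root directory is an assumption.
EXPECTED: Dict[str, List[str]] = {
    "model": [
        "extract.inv.sorted.gz", "extract.sorted.gz",
        "lex.e2f", "lex.f2e", "moses.ini", "phrase-table.gz",
    ],
    "lm": ["targetLM.bin"],
    "tuning": [
        "filtered", "input", "moses.filtered.ini", "moses.ini",
        "moses.tuned.ini", "reference", "tmp",
    ],
}


def missing_or_empty(root: Path, expected: Dict[str, List[str]]) -> List[str]:
    """Report expected entries that are missing, or regular files of size zero."""
    problems = []
    for folder, names in expected.items():
        for name in names:
            path = root / folder / name
            if not path.exists():
                problems.append(f"{folder}/{name}: missing")
            elif path.is_file() and path.stat().st_size == 0:
                # Directories are only checked for existence.
                problems.append(f"{folder}/{name}: empty")
    return problems
```

For example, missing_or_empty(Path("/home/sanjana/Documents/SMT/Transliteration"), EXPECTED)
would list what is absent under the --out-dir used in the command above, if
the folders are laid out that way. The first folder that reports missing
files points at the step to debug.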



-- 
Thanks and regards,

Sanjanasri J.P

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
