There are actually 7 different configurations. You have to look at the config file in steps/?/config.?

For fr-en:
  1. phrase-based, truecased
  2. phrase-based, lowercased then recased
  3. hierarchical model, lowercased then recased
  4. phrase-based, lowercased then recased, using target-side word + POS factors
  5. like (2), but using batch-mira to tune
  6. like (2), but using PRO to tune
  7. like (2), but using CreateOnDiskPt to create a binary phrase table

You can see the BLEU scores in evaluation/report.*
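To make the lookup above concrete, here is a minimal Python sketch that lists the config and report files for a downloaded model directory. It assumes that the "?" in steps/?/config.? stands for the run number and can be matched with a wildcard, and that both paths are relative to one model directory. The function name find_experiment_files and that directory layout are illustrative assumptions, not part of Moses.

```python
from pathlib import Path


def find_experiment_files(root):
    """Return (config_files, report_files) found under a model directory.

    Assumed layout (taken from the message above, not verified against
    every release): configs in steps/<run>/config.<run> and BLEU reports
    in evaluation/report.<run>.
    """
    root = Path(root)
    # Sort so that runs are listed in a stable, predictable order.
    configs = sorted(root.glob("steps/*/config.*"))
    reports = sorted(root.glob("evaluation/report.*"))
    return configs, reports
```

Calling find_experiment_files("fr-en") and printing both lists should show which config belongs to which run, so each report can be matched to one of the seven configurations by its run number.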

Hieu Hoang
http://www.hoang.co.uk/hieu

On 12 December 2016 at 13:28, Shubham Khandelwal <[email protected]> wrote:

> Okay, thanks Hieu. I will try it on a machine with a 1TB hard disk.
> By the way, I can see there are 4 pre-made models available for fr-en
> and de-en (http://www.statmt.org/moses/RELEASE-3.0/models/fr-en/model/
> and http://www.statmt.org/moses/RELEASE-3.0/models/de-en/model/). Can
> you please tell me which of these 4 is the better model (in terms of
> BLEU score), apart from the huge model that is present in both? I
> cannot understand how the analysis is shown in the steps folder.
> Also, are all these pre-made models hierarchical models?
>
> On Mon, Dec 12, 2016 at 6:09 PM, Hieu Hoang <[email protected]> wrote:
>
>> Hieu Hoang
>> http://www.hoang.co.uk/hieu
>>
>> On 10 December 2016 at 14:06, Shubham Khandelwal <[email protected]>
>> wrote:
>>
>>> Yes, the CreateOnDiskPt command executed without any error.
>>>
>>> There are 5 files in phrase-table.3.folder: Misc.dat, Source.dat,
>>> TargetColl.dat, TargetInd.dat, Vocab.dat.
>>> The Misc.dat and Vocab.dat files are empty.
>>> I just checked and my hard disk is full, as this folder already
>>> takes 165G. So maybe that is why those 2 files are empty. But the
>>> CreateOnDiskPt command should throw a "No space left on machine"
>>> error when it stops.
>>> Let me know whether the lack of space on my machine is the issue,
>>> so that I can move to a machine with a larger hard disk.
>>>
>> Good idea. Not sure who's going to do it, but if you do it, please
>> send me a patch & I'll check it in.
>>
>>> Also, may I know how much disk space phrase-table.3.folder
>>> generally takes when the CreateOnDiskPt command runs to completion,
>>> given that phrase-table.3.gz is only 23GB?
>>>
>> I'm not too sure. Try it on a disk with 1TB and please report back
>> what you find, for future reference.
>>
>>> Thanking you.
>>>
>>> On Sat, Dec 10, 2016 at 6:53 PM, Hieu Hoang <[email protected]> wrote:
>>>
>>>> Strange, did the CreateOnDiskPt command execute OK, i.e. with no
>>>> error?
>>>>
>>>> Does this file exist:
>>>>    /home/shubham/models/fr-en/phrase-table.3.folder/Misc.dat
>>>> If you do
>>>>    cat Misc.dat
>>>> what does it say?
>>>>
>>>> Hieu Hoang
>>>> http://www.hoang.co.uk/hieu
>>>>
>>>> On 10 December 2016 at 11:30, Shubham Khandelwal <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Thanks Hieu for your reply.
>>>>> I have used CreateOnDiskPt to binarize the model and stored it in
>>>>> phrase-table.3.folder using the following command:
>>>>>
>>>>> ~/mosesdecoder/bin/CreateOnDiskPt 1 1 4 100 2 phrase-table.3.gz phrase-table.3.folder
>>>>>
>>>>> I have also changed moses.ini.3 (i.e. I have converted
>>>>> PhraseDictionaryMemory to PhraseDictionaryOnDisk) as follows:
>>>>>
>>>>> PhraseDictionaryOnDisk name=TranslationModel0 num-features=4 path=/home/shubham/models/fr-en/phrase-table.3.folder input-factor=0 output-factor=0
>>>>>
>>>>> Now, when I run it using
>>>>>
>>>>> ~/mosesdecoder/bin/moses -f moses.ini.3
>>>>>
>>>>> it gives the following error after "Created input-output object":
>>>>>
>>>>> terminate called after throwing an instance of 'util::Exception'
>>>>>   what(): OnDiskPt/OnDiskWrapper.cpp:217 in uint64_t
>>>>> OnDiskPt::OnDiskWrapper::GetMisc(const string&) const threw
>>>>> util::Exception because `iter == m_miscInfo.end()'.
>>>>> Couldn't find value for key NumSourceFactors
>>>>> Aborted (core dumped)
>>>>>
>>>>> I do not know what key value I should pass here, or how. Can you
>>>>> please help me in this regard?
>>>>>
>>>>> Thank you so much for your help.
>>>>>
>>>>> Regards,
>>>>> Shubham
>>>>>
>>>>> On Fri, Dec 9, 2016 at 4:27 PM, Hieu Hoang <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> This is a hierarchical model. You must binarize with
>>>>>> CreateOnDiskPt for this model.
>>>>>>
>>>>>> Hieu Hoang
>>>>>> http://www.hoang.co.uk/hieu
>>>>>>
>>>>>> On 9 December 2016 at 08:18, Shubham Khandelwal <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> Thanks. It worked now. I have created the compact phrase table.
>>>>>>> Now, when I run it using the following command:
>>>>>>>
>>>>>>> ~/mosesdecoder/bin/moses -f ~/Translate/models/de-en/model/moses.ini.2 -threads all
>>>>>>>
>>>>>>> then, after creating the input-output object, it gives the
>>>>>>> following segmentation fault:
>>>>>>>
>>>>>>> Created input-output object : [14.796] seconds
>>>>>>> Ich bin ein Student
>>>>>>> Line 0: Initialize search took 0.000 seconds total
>>>>>>> Translating: <s> Ich bin ein Student </s> ||| [0,0]=X (1) [0,1]=X
>>>>>>> (1) [0,2]=X (1) [0,3]=X (1) [0,4]=X (1) [0,5]=X (1) [1,1]=X (1)
>>>>>>> [1,2]=X (1) [1,3]=X (1) [1,4]=X (1) [1,5]=X (1) [2,2]=X (1)
>>>>>>> [2,3]=X (1) [2,4]=X (1) [2,5]=X (1) [3,3]=X (1) [3,4]=X (1)
>>>>>>> [3,5]=X (1) [4,4]=X (1) [4,5]=X (1) [5,5]=X (1)
>>>>>>>
>>>>>>> Segmentation fault (core dumped)
>>>>>>>
>>>>>>> My machine has 40GB of RAM, so I am confused about why it gives
>>>>>>> this error.
>>>>>>> Can you please help me in this regard? I have attached
>>>>>>> moses.ini.2 for your reference.
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Shubham
>>>>>>>
>>>>>>> On Fri, Dec 9, 2016 at 2:02 AM, Hieu Hoang <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> maybe try
>>>>>>>>
>>>>>>>>    -encoding None
>>>>>>>>
>>>>>>>> On 08/12/2016 19:44, Shubham Khandelwal wrote:
>>>>>>>>
>>>>>>>> Hi Hieu,
>>>>>>>>
>>>>>>>> Thanks for your reply.
>>>>>>>> Yes, I have used the absolute path, and I also tried with -T,
>>>>>>>> but it did not work.
>>>>>>>> Is there any other solution to this problem?
>>>>>>>>
>>>>>>>> By the way, can anybody please upload compact versions of all
>>>>>>>> the pre-made models? They would take less space and would also
>>>>>>>> be very fast during decoding.
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> On Fri, Dec 9, 2016 at 12:50 AM, Hieu Hoang <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> The previous email you referred to says that the directory
>>>>>>>>>
>>>>>>>>>    binarised-model/
>>>>>>>>>
>>>>>>>>> must exist before you run it, otherwise it will segfault. I
>>>>>>>>> would also use an absolute path to make sure, i.e. not
>>>>>>>>>
>>>>>>>>>    binarised-model/phrase-table
>>>>>>>>>
>>>>>>>>> but
>>>>>>>>>
>>>>>>>>>    /home/shubham/moses/binarised-model/phrase-table
>>>>>>>>>
>>>>>>>>> The previous email exchange also says you should try to add
>>>>>>>>> the argument
>>>>>>>>>
>>>>>>>>>    -T .
>>>>>>>>>
>>>>>>>>> Hieu Hoang
>>>>>>>>> http://www.hoang.co.uk/hieu
>>>>>>>>>
>>>>>>>>> On 8 December 2016 at 15:52, Shubham Khandelwal <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hello,
>>>>>>>>>>
>>>>>>>>>> This is just a reminder of my previous email.
>>>>>>>>>>
>>>>>>>>>> Thanking you.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Shubham
>>>>>>>>>>
>>>>>>>>>> On Thu, Dec 8, 2016 at 9:04 AM, Shubham Khandelwal <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> I have just downloaded the phrase-table.2.gz (18GB) de-en
>>>>>>>>>>> model and the phrase-table.3.gz (22GB) fr-en model from the
>>>>>>>>>>> available pre-made models.
>>>>>>>>>>> Now, I am converting them to PhraseDictionaryCompact using
>>>>>>>>>>> the following command (for example):
>>>>>>>>>>>
>>>>>>>>>>> ~/mosesdecoder/bin/processPhraseTableMin -threads all -in ~/model/phrase-table.3.gz -nscores 4 -out binarised-model/phrase-table
>>>>>>>>>>>
>>>>>>>>>>> But after starting pass 1/3, it gives the following
>>>>>>>>>>> segmentation fault:
>>>>>>>>>>>
>>>>>>>>>>> Pass 1/3: Creating hash function for rank assignment
>>>>>>>>>>> Segmentation fault (core dumped)
>>>>>>>>>>>
>>>>>>>>>>> I have found almost the same issue in this thread:
>>>>>>>>>>> http://comments.gmane.org/gmane.comp.nlp.moses.user/13033
>>>>>>>>>>> However, the binarised-model folder I pass in the command
>>>>>>>>>>> already exists. I also have write access to /tmp, but it
>>>>>>>>>>> still gives a segmentation fault.
>>>>>>>>>>>
>>>>>>>>>>> Can you please tell me what could be wrong here?
>>>>>>>>>>>
>>>>>>>>>>> Thanking you.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Shubham
>>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Moses-support mailing list
>>>>>>>>>> [email protected]
>>>>>>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>
> --
> Yours Sincerely,
>
> Shubham Khandelwal
> Masters in Informatics (M2-MoSIG),
> University Joseph Fourier-Grenoble INP,
> Grenoble, France
> Webpage: https://sites.google.com/site/skhandelwl21/
