PhraseDictionaryDynSuffixArray is deprecated and should not be used any more. It will be replaced with memory-mapped suffix array phrase tables (mmsapt) which are currently in the branch dynamic-phrase-tables.
In order to use them, you need:

- the two text files, one sentence per line
- the word alignments in symal format

Let fr be the tag for the language you are translating from and en the tag for the language you are translating to:

  cat train.fr | mtt-build -i -o train.fr
  cat train.en | mtt-build -i -o train.en
  cat train.symal | symal2mam train.fr-en.mam
  mmlex-build train fr en -o train.fr-en.lex -c train.fr-en.coc

Then, in moses.ini, the line for the phrase table should look like this:

  Mmsapt name=PT0 output-factor=0 num-features=5 base=/path/to/train L1=fr L2=en

No guarantee that this works; this is work in progress. It probably won't work on Mac, and it works in multi-threaded mode only.

- Uli

On Mon, Mar 17, 2014 at 4:17 PM, Mirkin, Shachar < [email protected]> wrote:

> Hi,
>
> I'm now subscribed also from this email address.
>
> Let me give more details about the problems that I encountered.
> Trying to load the Moses server with the modified ini file, after
> replacing the PhraseDictionaryBinary line with:
>
> PhraseDictionaryDynSuffixArray source=<path-to-source-corpus>
> target=<path-to-target-corpus> alignment=<path-to-alignments>
>
> (with the correct paths, of course), I got:
>
> Feature function PhraseDictionaryDynSuffixArray0 specified 1 dense scores
> or weights. Actually has 0
>
> This was solved by adding "num-features=0" to the
> PhraseDictionaryDynSuffixArray line.
>
> The next error was:
>
> ...
> Loading source corpus...
> terminate called after throwing an instance of
> 'Moses::StrayFactorException'
> what(): moses/Word.cpp:112 in void
> Moses::Word::CreateFromString(Moses::FactorDirection, const
> std::vector<long unsigned int, std::allocator<long unsigned int> >&, const
> StringPiece&, bool) threw StrayFactorException because `fit'.
> You have configured 0 factors but the word le contains factor delimiter |
> too many times.
>
> In this test my source, target and alignment files consist each of a
> single line with no "|"s, and the word "le" is the first one in the source.
>
> Is there anything else I should do in the ini file?
>
> Thanks,
> Shachar
>
>
> On 03/17/2014 02:58 PM, Hieu Hoang wrote:
>
> Hi Shachar
>
> can you please subscribe to the mailing list before posting to it. It's
> a public email address so there's a lot of automated spammers. You can
> subscribe here
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
> To answer your question - the webpage does document it in the new ini
> format, eg.
> PhraseDictionaryDynSuffixArray source=<path-to-source-corpus> ...
> Do you have a printout of the old version?
>
> Also, the dynamic suffix array is undergoing updates as Uli Germann
> (cc'ed) is updating it with more features. He can tell you more about it
>
>
> ---------- Forwarded message ----------
> From: <[email protected]>
> Date: 17 March 2014 12:13
> Subject: Moses-support post from [email protected] requires
> approval
> To: [email protected]
>
>
> As list administrator, your authorization is requested for the
> following mailing list posting:
>
> List: [email protected]
> From: [email protected]
> Subject: Incremental training and the new ini format
> Reason: Post by non-member to a members-only list
>
> At your convenience, visit:
>
> http://mailman.mit.edu/mailman/admindb/moses-support
>
> to approve or deny the request.
>
>
> ---------- Forwarded message ----------
> From: "Mirkin, Shachar" <[email protected]>
> To: [email protected]
> Cc:
> Date: Mon, 17 Mar 2014 13:06:47 +0100
> Subject: Incremental training and the new ini format
> Hi,
>
> I'm trying to use incremental training with the latest Moses version, but
> the documentation refers to the old ini format (
> http://www.statmt.org/moses/?n=Moses.AdvancedFeatures#ntoc34).
> Can you please explain what changes are required to get the incremental
> training working with the new ini format?
>
> Thanks,
> Shachar
>
>
> ---------- Forwarded message ----------
> From: [email protected]
> To:
> Cc:
> Date:
> Subject: confirm 2701c5fb8f659b6037c9e0bf07ad70095ba4ffe2
> If you reply to this message, keeping the Subject: header intact,
> Mailman will discard the held message. Do this if the message is
> spam. If you reply to this message and include an Approved: header
> with the list password in it, the message will be approved for posting
> to the list. The Approved: header can also appear in the first line
> of the body of the reply.
>
>
> --
> Hieu Hoang
> Research Associate
> University of Edinburgh
> http://www.hoang.co.uk/hieu
>

--
Ulrich Germann
Research Associate
School of Informatics
University of Edinburgh
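
As a sketch only, the build steps and the moses.ini line given in the reply above can be generated by two small shell functions. The function names and the choice to print the commands instead of running them are assumptions made for illustration. The commands and the Mmsapt line are copied from the message as written, and nothing here has been checked against the dynamic-phrase-tables branch.

```shell
#!/bin/sh
# Sketch under stated assumptions: mmsapt_build_commands and mmsapt_ini_line
# are hypothetical helper names, not part of Moses. They only print text.

# Print the four build commands from the message, in order.
# $1 = corpus base name (e.g. train), $2 = source tag, $3 = target tag
mmsapt_build_commands() {
    base=$1; src=$2; tgt=$3
    # Index the two sides of the corpus (one sentence per line).
    printf 'cat %s.%s | mtt-build -i -o %s.%s\n' "$base" "$src" "$base" "$src"
    printf 'cat %s.%s | mtt-build -i -o %s.%s\n' "$base" "$tgt" "$base" "$tgt"
    # Convert the symal word alignments.
    printf 'cat %s.symal | symal2mam %s.%s-%s.mam\n' "$base" "$base" "$src" "$tgt"
    # Build the lexicon and co-occurrence files.
    printf 'mmlex-build %s %s %s -o %s.%s-%s.lex -c %s.%s-%s.coc\n' \
        "$base" "$src" "$tgt" "$base" "$src" "$tgt" "$base" "$src" "$tgt"
}

# Print the phrase-table line for moses.ini.
# $1 = value for base=, $2 = source tag (L1), $3 = target tag (L2)
mmsapt_ini_line() {
    printf 'Mmsapt name=PT0 output-factor=0 num-features=5 base=%s L1=%s L2=%s\n' \
        "$1" "$2" "$3"
}
```

Calling `mmsapt_build_commands train fr en` prints the commands for review. Piping that output into `sh` would run them, assuming the tools are on the PATH and the file names contain no spaces.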
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
