Re: [Moses-support] please help me with the code - getting word index

Matthias Huck Sat, 20 Jun 2015 09:23:36 -0700

Hi,

Yes, you need to calculate the absolute position by adding up the start
position of the current rule application, the relative index within the
rule, and the span width of any right-hand side non-terminals in the
current rule with a smaller source index.
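
The arithmetic above can be sketched in isolation as follows. This is only an illustration under stated assumptions, not Moses code: RuleSymbol and AbsoluteSourceSpans are hypothetical names, and the sketch assumes that every non-terminal covers at least one source word and that each preceding non-terminal contributes its span width in place of the single rule position it occupies. In Moses, the rule start and the non-terminal widths would presumably be read from the ranges returned by ChartHypothesis::GetCurrSourceRange() for the current hypothesis and its predecessors.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for one source-side symbol of a rule; not a Moses type.
struct RuleSymbol {
  bool isNonTerminal;
  std::size_t spanWidth;  // source words covered by a non-terminal (assumed >= 1);
                          // ignored for terminals, which always cover one word
};

// For each symbol of the rule, returns the inclusive [first, last] source
// positions it covers, given the absolute start position of the rule application.
std::vector<std::pair<std::size_t, std::size_t>>
AbsoluteSourceSpans(std::size_t ruleStart, const std::vector<RuleSymbol>& rule) {
  std::vector<std::pair<std::size_t, std::size_t>> spans;
  spans.reserve(rule.size());
  std::size_t next = ruleStart;  // absolute position of the next uncovered word
  for (const RuleSymbol& sym : rule) {
    const std::size_t width = sym.isNonTerminal ? sym.spanWidth : 1;
    spans.emplace_back(next, next + width - 1);
    next += width;  // symbols further right are shifted by this symbol's width
  }
  return spans;
}
```

For a rule application starting at source position 4 whose source side is a terminal, a non-terminal covering three words, and another terminal, the sketch yields the spans [4,4], [5,7] and [8,8].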


As Rico noted, you'll find some similar code examples in existing
feature functions.

Cheers,
Matthias


On Sat, 2015-06-20 at 18:05 +0430, amir haghighi wrote:
> Thanks Matthias
> ChartHypothesis::GetCurrSourceRange() gets the source span that all
> terminals and non-terminals in the current hypothesis cover in the
> source sentence. I'd like to know which terminal (or non-terminal)
> corresponds to which word index in the source sentence. Could you
> guide me on how to obtain that?
> 
> 
> Thanks again
> 
> 
> On Thu, Jun 18, 2015 at 9:48 PM, Matthias Huck <[email protected]>
> wrote:
>         Hi,
>         
>         You can calculate absolute positions in the source sentence
>         based on the word ranges of the current hypothesis and those
>         of its direct predecessors (in the case of right-hand side
>         non-terminals).
>         
>         Take a look at these methods:
>         
>                 InputPath::GetWordsRange()
>                 ChartHypothesis::GetCurrSourceRange()
>                 ChartCellLabel::GetCoverage()
>         
>         Cheers,
>         Matthias
>         
>         
>         On Thu, 2015-06-18 at 20:23 +0430, amir haghighi wrote:
>         > Hi everybody
>         >
>         >
>         > I wrote the following code to get an ordered list of the source words
>         > inside a hypothesis. It gets the words in their translation order, but I
>         > need not only the words' strings but also the index of each word in the
>         > original sentence.
>         >
>         > Could you please help me get the index of each word in srcPhrase within
>         > the sentence?
>         >
>         >
>         > void Amir::GetSourcePhrase2(const ChartHypothesis& cur_hypo, Phrase &srcPhrase) const
>         > {
>         >     AmirUtils utility;
>         >     TargetPhrase targetPh = cur_hypo.GetCurrTargetPhrase();
>         >     const Phrase *sourcePh = targetPh.GetRuleSource();
>         >     int targetWordsNum = cur_hypo.GetCurrTargetPhrase().GetSize();
>         >     std::vector <Word> source, orderedSource;
>         >     std::vector <int> alignmentVector;
>         >     std::vector <bool> isAligned;
>         >
>         >     std::vector <std::set <size_t> > sourcePosSets;
>         >
>         >     for (int targetP = 0; targetP < targetWordsNum; targetP++) {
>         >         //std::cerr<<"setting alignments for targetword: "<<targetP<<endl;
>         >         sourcePosSets.push_back(cur_hypo.GetCurrTargetPhrase().GetAlignTerm().GetAlignmentsForTarget(targetP));
>         >     }
>         >
>         >     for (int ii = targetWordsNum - 1; ii >= 0; ii--) {
>         >         std::set <size_t> cur_srcPosSet = sourcePosSets[ii];
>         >         for (std::set <size_t>::const_iterator alignmet = cur_srcPosSet.begin(); alignmet != cur_srcPosSet.end(); ++alignmet) {
>         >             int alignmentElement = *alignmet;
>         >             for (int index = 0; index < ii; index++) { //keep the rightmost one and remove the others
>         >                 //remove it from the list
>         >                 if (sourcePosSets[index].size() > 0) {
>         >                     //std::cerr<<" removing "<<*alignmet<<endl;
>         >                     //std::cerr<<"  for set with size: "<<sourcePosSets[index].size()<<endl;
>         >                     sourcePosSets[index].erase(alignmentElement);
>         >                 }
>         >             }
>         >         }
>         >     }
>         >
>         >     for (size_t posT = 0; posT < cur_hypo.GetCurrTargetPhrase().GetSize(); ++posT) {
>         >         const Word &word = cur_hypo.GetCurrTargetPhrase().GetWord(posT);
>         >         if (word.IsNonTerminal()) {
>         >             // non-term. fill out with prev hypo
>         >             size_t nonTermInd = cur_hypo.GetCurrTargetPhrase().GetAlignNonTerm().GetNonTermIndexMap()[posT];
>         >             const ChartHypothesis *prevHypo = cur_hypo.GetPrevHypo(nonTermInd);
>         >
>         >             GetSourcePhrase2(*prevHypo, srcPhrase);
>         >         }
>         >         else {
>         >             for (std::set<size_t>::const_iterator it = sourcePosSets[posT].begin(); it != sourcePosSets[posT].end(); it++) {
>         >                 srcPhrase.AddWord(sourcePh->GetWord(*it));
>         >             }
>         >         }
>         >     }
>         > }
>         
>         > _______________________________________________
>         > Moses-support mailing list
>         > [email protected]
>         > http://mailman.mit.edu/mailman/listinfo/moses-support
>         
>         
>         
>         --
>         The University of Edinburgh is a charitable body, registered
>         in
>         Scotland, with registration number SC005336.
>         
> 
> 



-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
