Re: [Moses-support] Sentence IDs in phrase table

Philipp Koehn Mon, 14 Jan 2013 11:15:39 -0800

Hi,

this is technically possible, although there is currently no good
generic support for attaching additional information to phrase table
entries (several other kinds of information could be useful as well).
One problem is that some phrase pairs (think translations
of "the") occur in a very large number of sentence pairs, so storing
their sentence IDs would require a lot of additional memory.
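
As a rough illustration of the memory concern, here is a minimal Python
sketch (not part of Moses) that collects sentence IDs per phrase pair from
extract-style lines. The field layout is an assumption: fields separated
by "|||", source and target phrase first, and the sentence ID as the last
field. The function name collect_sentence_ids and the sample lines are
hypothetical; check the actual output of 'extract' with
'--IncludeSentenceId' before relying on this.

```python
from collections import defaultdict


def collect_sentence_ids(lines):
    """Map each (source, target) phrase pair to its sorted sentence IDs.

    Assumed layout (not confirmed against Moses): fields separated by
    '|||', source phrase first, target phrase second, and an integer
    sentence ID as the last field.
    """
    ids = defaultdict(set)
    for line in lines:
        fields = [field.strip() for field in line.split("|||")]
        if len(fields) < 3:
            continue  # skip lines that do not match the assumed layout
        source, target, sentence_id = fields[0], fields[1], fields[-1]
        ids[(source, target)].add(int(sentence_id))
    return {pair: sorted(found) for pair, found in ids.items()}


# Hypothetical sample lines, made up for illustration only.
sample = [
    "das Haus ||| the house ||| 0-0 1-1 ||| 1",
    "das ||| the ||| 0-0 ||| 1",
    "das ||| the ||| 0-0 ||| 2",
    "das ||| the ||| 0-0 ||| 3",
]
table = collect_sentence_ids(sample)
```

Even in this toy input the frequent pair ("das", "the") already carries
one ID per sentence it occurs in, while the rarer pair carries a single
ID. On a real corpus the lists for such frequent pairs grow with the
corpus size, which is where the extra memory goes.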


-phi

On Mon, Jan 14, 2013 at 9:38 AM, David Wilson-Parr
<[email protected]> wrote:
> Hi,
>
> I was wondering if there was any way to get a list of sentence ids in
> the final phrase table corresponding to where that phrase occurred?
>
> I noticed that the 'extract' program used in step (5) takes the argument
> '--IncludeSentenceId' and I tried this and it does include the ID (line
> number in corpus) in the extract.sorted and extract.inv.sorted however I
> don't suppose that these are still completely valid after the final
> phrase table is calculated after the score phrases  (6.6) step which
> consolidates the normal and the inverse files together.  Is there any
> 'idiots' process description of what the consolidate process does?  I
> found the source code quite hard to follow.
>
> Also I didn't understand why the 'aligned.grow-diag-final-and' file is
> generated earlier which is an already combined version of the normal and
> inverse word alignments (I think - at least it seems to have many to
> many relationships in it!) if the processing then needs to go back to
> using them both separately.
>
> Sorry if I misunderstood something, I am just scratching the surface at
> the moment.
>
> Kind regards,
>
> Dave
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
