Hi,

I have one more question:
In the lex.e2f file there is a translation Gitarre->guitar:

        Gitarre guitar 0.4000000
        Gitarre using 0.0000284
        Gitarre ; 0.0000017

Why has it not become part of the phrase table?

Thanks again!
Vera

-----Original Message-----
From: Vera Aleksic, Linguatec GmbH 
Sent: Thursday, 27 November 2014 09:42
To: 'Matthias Huck'; Raj Dabre
Subject: RE: [Moses-support] Unknown single words that are part of phrases

Hi,
Thank you for your answers.
@Raj, I searched for one-word translations and they do not exist. If the 
grow-diag method is the likely cause of such phenomena, are there any better 
alternatives?
@Matthias, you are right, the pair Gitarre-guitar is always unaligned, but I do 
not really understand why. Why is "guitar" in the example below aligned to 
both "Musikinstrument" and "Gitarre", and not to "Gitarre" only? I assume that 
decomposing the compound into "Musik + Instrument" would help? How else could 
I improve the word alignment quality?
Thanks!
Best,
Vera

für ein Musikinstrument wie eine elektrische Gitarre ,
NULL ({ }) for ({ 1 }) a ({ 2 }) musical ({ }) instrument ({ }) , ({ }) such ({ }) as ({ 4 }) an ({ 5 }) electric ({ 6 }) guitar ({ 3 7 }) ; ({ 8 })
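
A minimal sketch of why an alignment like the one above cannot yield the entry "Gitarre ||| guitar", assuming the usual consistency criterion for phrase extraction: a phrase pair needs at least one alignment link inside it and no link that leaves it. The helper names parse_giza and is_consistent are hypothetical, and the symmetrized link set is an assumption made for illustration (it simply drops both links of "guitar"); the real symmetrized alignment has to be read from the training output.

```python
import re

def parse_giza(source_line, aligned_line):
    """Parse one GIZA++-style sentence pair into tokens and 1-based links."""
    src = source_line.split()
    # Each entry looks like: word ({ i j ... })
    entries = re.findall(r"(\S+) \(\{([\d ]*)\}\)", aligned_line)
    tgt, links = [], set()
    for word, idxs in entries[1:]:            # entries[0] is the NULL token
        tgt.append(word)
        for i in idxs.split():
            links.add((int(i), len(tgt)))     # (German position, English position)
    return src, tgt, links

def is_consistent(links, src_span, tgt_span):
    """True if the span pair has a link inside and no link crossing its border."""
    (s1, s2), (t1, t2) = src_span, tgt_span
    has_inner_link = False
    for s, t in links:
        s_in = s1 <= s <= s2
        t_in = t1 <= t <= t2
        if s_in != t_in:                      # link leaves the phrase pair
            return False
        has_inner_link = has_inner_link or s_in
    return has_inner_link

source = "für ein Musikinstrument wie eine elektrische Gitarre ,"
aligned = ("NULL ({ }) for ({ 1 }) a ({ 2 }) musical ({ }) instrument ({ }) "
           ", ({ }) such ({ }) as ({ 4 }) an ({ 5 }) electric ({ 6 }) "
           "guitar ({ 3 7 }) ; ({ 8 })")
src, tgt, links = parse_giza(source, aligned)

gitarre = src.index("Gitarre") + 1            # German position 7
guitar = tgt.index("guitar") + 1              # English position 10

# One-directional alignment: "guitar" is also linked to "Musikinstrument",
# so the single-word pair is not consistent.
one_way = is_consistent(links, (gitarre, gitarre), (guitar, guitar))

# ASSUMPTION: after symmetrization both links of "guitar" are gone,
# leaving "Gitarre" and "guitar" unaligned.
symmetrized = {link for link in links if link[1] != guitar}
single_word = is_consistent(symmetrized, (gitarre, gitarre), (guitar, guitar))
with_comma = is_consistent(symmetrized, (gitarre, gitarre + 1), (guitar, guitar + 1))

print(one_way, single_word, with_comma)
```

Under these assumptions the single-word pair is rejected in both cases, while "Gitarre , ||| guitar ;" passes because the link between "," and ";" anchors it. That would match the phrase table excerpt quoted further down in this thread, where "Gitarre" appears only inside longer phrases.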

-----Original Message-----
From: Matthias Huck [mailto:[email protected]]
Sent: Wednesday, 26 November 2014 17:54
To: Raj Dabre
Cc: Vera Aleksic, Linguatec GmbH; moses-support
Subject: Re: [Moses-support] Unknown single words that are part of phrases

Hi,

Presumably your phrase table does not contain an entry "Gitarre ||| guitar" 
because this word pair is always unaligned in your training data. You could try 
to improve your word alignment quality.

Alternatively, you could implement a procedure in the manner of the "forced 
single word heuristic" as described in: 
D. Stein, D. Vilar, S. Peitz, M. Freitag, M. Huck, and H. Ney. A Guide to Jane, 
an Open Source Hierarchical Translation Toolkit. The Prague Bulletin of 
Mathematical Linguistics, number 95, pages 5-18, Prague, Czech Republic, April 
2011.
http://ufal.mff.cuni.cz/pbml/95/art-stein-vilar-ney-jane.pdf
(see Fig. 1c).

But the latter would rather be a workaround.

Cheers,
Matthias


On Thu, 2014-11-27 at 01:18 +0900, Raj Dabre wrote:
> Hello,
> 
> 
> If I am not wrong, this is most likely due to the grow(-diag) method applied 
> to the word-aligned data (both directions) before phrase extraction.
> 
> Furthermore, one-word translations should exist (though not always); search 
> for them.
> 
> 
> 
> Regards.
> 
> 
> On Thu, Nov 27, 2014 at 12:53 AM, Vera Aleksic, Linguatec GmbH 
> <[email protected]> wrote:
>         Hi,
>         
>         I have observed many times that some words do not exist as single 
> word translations in the phrase table, although they exist in the training 
> corpus and in multiword phrases.
>         An example:
>         German-English translation for "Gitarre" is unknown, i.e. there is no 
> single word entry  for "Gitarre" in the phrase table, although some other 
> phrases containing this word exist (see below).
>         How is it possible?
>         Thanks and best regards,
>         Vera
>         
>         
>         Gitarre , ||| guitar ; ||| 1 0.0284465 1 0.0654272 2.718 ||| ||| 1 1
>         Gitarre darstellt , unter Beanspruchung ||| guitar using ||| 0.25 
> 2.7351e-11 1 0.0625119 2.718 ||| ||| 4 1
>         Gitarre darstellt , unter ||| guitar using ||| 0.25 1.18917e-05 1 
> 0.0625119 2.718 ||| ||| 4 1
>         Gitarre darstellt , ||| guitar using ||| 0.25 0.00569228 1 0.0625119 
> 2.718 ||| ||| 4 1
>         Gitarre darstellt ||| guitar using ||| 0.25 0.0400028 1 0.0625119 
> 2.718 ||| ||| 4 1
>         Kopfplatte einer Gitarre darstellt , ||| head of a guitar using ||| 
> 0.5 4.23407e-08 1 0.00471281 2.718 ||| ||| 2 1
>         Kopfplatte einer Gitarre darstellt ||| head of a guitar using ||| 0.5 
> 2.97552e-07 1 0.00471281 2.718 ||| ||| 2 1
>         eine elektrische Gitarre , ||| an electric guitar ; ||| 1 0.00107982 
> 1 0.00163632 2.718 ||| ||| 1 1
>         einer Gitarre darstellt , unter ||| of a guitar using ||| 0.333333 
> 6.4754e-07 1 0.00471281 2.718 ||| ||| 3 1
>         einer Gitarre darstellt , ||| of a guitar using ||| 0.333333 
> 0.000309961 1 0.00471281 2.718 ||| ||| 3 1
>         einer Gitarre darstellt ||| of a guitar using ||| 0.333333 0.00217827 
> 1 0.00471281 2.718 ||| ||| 3 1
>         elektrische Gitarre , ||| electric guitar ; ||| 1 0.005661 1 
> 0.0142097 2.718 ||| ||| 1 1
>         wie eine elektrische Gitarre , ||| as an electric guitar ; |||
> 1 0.000177339 1 0.000809485 2.718 ||| ||| 1 1
>         
>         _______________________________________________
>         Moses-support mailing list
>         [email protected]
>         http://mailman.mit.edu/mailman/listinfo/moses-support
> 
> 
> 
> --
> Raj Dabre.
> Research Student,
> 
> Graduate School of Informatics,
> Kyoto University.
> CSE MTech, IITB., 2011-2014
> 
> 



--
The University of Edinburgh is a charitable body, registered in Scotland, with 
registration number SC005336.

