No problem! :)

F.

On Wed, 15 Feb 2012 at 19:58 -0500, Dan Loehr wrote:
> Fran, I didn't want to forget to close the loop.  apertium-mk-en-0.1.1
> compiles and runs just fine.  Thanks again for updating it.
>  
>    - Dan
> 
> On Thu, Feb 9, 2012 at 8:50 AM, Dan Loehr <[email protected]> wrote:
>         Thanks, Fran, for that explanation of Google's poor results.
>         I was wondering what might be going on.  Especially since
>         their Bulgarian works well.
>          
>            - Dan
>         
>         On Thu, Feb 9, 2012 at 2:34 AM, Francis Tyers
>         <[email protected]> wrote:
>                 On Wed, 08 Feb 2012 at 20:26 -0500, Dan Loehr wrote:
>                 > Many thanks, Fran.  I won't be able to download and test the new
>                 > version (apertium-mk-en-0.1.1.tar.gz) for a day or two.  But I did
>                 > want to reply right away and say thank you.
>                 >
>                 > You also asked for feedback on the quality.  You are probably already
>                 > aware that it does very well compared to Google Translate.  Your
>                 > online platform at apertium.org provides this translation of a section
>                 > from the Macedonian version of the UN Declaration of Human Rights:
>                 >
>                 > Since the recognition on врoдeнoтo dignity, and on
>                 the equal and
>                 > нeoтуѓиви authentic on all members on the humanity
>                 are тeмeлитe on the
>                 > freedom, the justice and the peace in the world;
>                 >
>                 > And here's Google Translate's translation of the same passage:
>                 >
>                 > A great priznavanjeto Following the vrodenoto dostoinstvo, also in
>                 > case of ednakvite and neotugjivi prava Following the all outdoor
>                 > chlenovi Following the choveshtvoto everything temelite Following the
>                 > slobodata, pravdata and mirot vo svetot;
>                 >
>                 > Here's the UN's English version (available at
>                 > http://www.ohchr.org/EN/UDHR/Pages/Language.aspx?LangID=eng):
>                 >
>                 > Whereas recognition of the inherent dignity and of the equal and
>                 > inalienable rights of all members of the human family is the
>                 > foundation of freedom, justice and peace in the world,
>                 >
>                 > And here's the actual section that was translated (available at
>                 > http://www.ohchr.org/EN/UDHR/Pages/Language.aspx?LangID=mkj):
>                 >
>                 > Бидejќи признaвaњeтo нa врoдeнoтo дoстoинствo, и нa
>                 eднaквитe и
>                 > нeoтуѓиви прaвa нa ситe члeнoви нa чoвeштвoтo сe
>                 тeмeлитe нa
>                 > слoбoдaтa, прaвдaтa и мирoт вo свeтoт;
>                 >
>                 > So for 8-10 days' work, I'd say you've done quite well!
>                 >
>                 > Thanks again,
>                 
>                 
>                 Hmm, the poor result from Google is surprising and leads me to think
>                 there is something else at play here. I'm sure they have the same corpus
>                 I was working with, 'SETimes'.  I would also be surprised if they haven't
>                 used the UDHR in their training corpus too.
>                 
>                 I just checked and the Macedonian input (from the UDHR) is full of Latin
>                 characters, e.g. Latin 'o' instead of Cyrillic 'о', and likewise for
>                 'e' and 'a'.
>                 
>                 If we replace them with their Cyrillic counterparts,
>                 Google gets a much
>                 better result:
>                 
>                 --
>                 
>                 Бидеjќи признавањето на вроденото достоинство, и на
>                 еднаквите и
>                 неотуѓиви права на сите членови на човештвото се
>                 темелите на слободата,
>                 правдата и мирот во светот;
>                 
>                 Since they recognizing the inherent dignity and equal and inalienable
>                 rights of all members of the human family is the foundation of freedom,
>                 justice and peace in the world;
>                 
>                 --
>                 
>                 So, if you want a free/rule-based system then Apertium
>                 is probably what
>                 you're looking for. And we'd definitely welcome
>                 further feedback and
>                 development. Otherwise, if you want to make a vanilla
>                 SMT system, use
>                 the SETimes corpus and make sure you sanitise your
>                 input on the
>                 Macedonian side for unexpected Latin characters (in
>                 Apertium we have an
>                 option to do it in the dictionary compilation stage).
>                 
>                 Best regards,
>                 
>                 Fran
>                 
>                 PS. I'm really surprised Google isn't doing this for languages using
>                 Cyrillic. Having Latin characters pop up doesn't just happen in
>                 Macedonian (sometimes from bad keyboard layouts, sometimes from bad OCR
>                 software), but also in other languages with Cyrillic-based scripts,
>                 Chuvash, Komi etc.
>                 
>         
>         
> 
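
For anyone following the sanitising advice in the quoted message above, here is a minimal sketch of one way to spot and repair Latin look-alikes inside Cyrillic words. It is not the Apertium dictionary-compilation option mentioned in the thread, just an illustration. The names `LATIN_TO_CYRILLIC`, `find_mixed_tokens` and `sanitise` are made up for this example, and the mapping table is an assumption: only 'a', 'e' and 'o' are named in the thread, while 'j' and the capitals are extras that seemed plausible for Macedonian.

```python
import re

# Latin letters mapped to the Cyrillic letters they resemble.
# Assumption: only 'a', 'e', 'o' are mentioned in the thread; 'j' and the
# capital letters are illustrative additions, not a complete confusables list.
LATIN_TO_CYRILLIC = str.maketrans({
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "e": "\u0435",  # CYRILLIC SMALL LETTER IE
    "o": "\u043e",  # CYRILLIC SMALL LETTER O
    "j": "\u0458",  # CYRILLIC SMALL LETTER JE
    "A": "\u0410",  # CYRILLIC CAPITAL LETTER A
    "E": "\u0415",  # CYRILLIC CAPITAL LETTER IE
    "O": "\u041e",  # CYRILLIC CAPITAL LETTER O
    "J": "\u0408",  # CYRILLIC CAPITAL LETTER JE
})

CYRILLIC = re.compile(r"[\u0400-\u04FF]")  # basic Cyrillic block
LATIN = re.compile(r"[A-Za-z]")            # unaccented ASCII letters only


def find_mixed_tokens(text):
    """Return whitespace-separated tokens that mix Cyrillic and Latin letters."""
    return [
        token for token in text.split()
        if CYRILLIC.search(token) and LATIN.search(token)
    ]


def sanitise(text):
    """Replace mapped Latin letters, but only inside tokens that already
    contain at least one Cyrillic letter, so genuine Latin-script words
    (names, URLs) are left untouched."""
    def fix(match):
        token = match.group(0)
        if CYRILLIC.search(token):
            return token.translate(LATIN_TO_CYRILLIC)
        return token

    return re.sub(r"\S+", fix, text)
```

Running `find_mixed_tokens` over a corpus first gives a rough idea of how widespread the problem is before anything is rewritten. One known gap in this sketch: a token made up entirely of Latin look-alikes (for example a one-letter word typed as Latin 'a') contains no Cyrillic letter, so it is left as it is.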



_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
