On 27 January 2011 09:07, Hèctor Alòs i Font <[email protected]> wrote:
> 2011/1/27 Francis Tyers <[email protected]>
>>
>>
>> I've added "Windows" as a proper noun, "do so" is difficult as I think
>> it should translate as "hacerlo", but then the pronoun moves if it is
>> finite, e.g. "lo hará", added "more than half of" as a determiner (as
>> with "a shitload of")
>>
>>  1) More than half of software developers are already building
>>     applications for Windows 7 and nearly 80% will do so within the
>>     next year, a new survey has found.
>>
>> web) Más que medio de software developers ya está construyendo
>>     aplicaciones para Ventanas 7 y nearly 80% hará tan dentro del año
>>     próximo, una encuesta nueva ha encontrado.
>>
>> svn) Más de la mitad de desarrolladores de software ya están
>>     construyendo aplicaciones para Windows 7 y casi 80% hará tan dentro
>>     del el año que viene, una encuesta nueva ha encontrado.
>>
>
>
> As you know, proper names are a mess, so we have to store large numbers of
> them to avoid looking ridiculous. Unfortunately I haven't yet found a list of
> trademarks, such as Windows,

Every country has a registry for trademarks, so it should at least be
possible to get lists of them.

> but only lists of company names (from the Fortune top 500
> list), first names, family names and place names (the last from
> French-speaking and Catalan-speaking countries). So Pierre Noël had better not
> be translated into e.g. Stone Christmas :-) (by the way, I still have lots
> of problems with the French "Marie" being analysed as a verb, while fortunately
> "Paris" is no longer taken as a common noun)
>
> You may find the lists I got at
> https://apertium.svn.sourceforge.net/svnroot/apertium/incubator/apertium-eo-fr/lexic/
> https://apertium.svn.sourceforge.net/svnroot/apertium/trunk/apertium-eo-ca/lexic/
> (but I haven't been able to fully process them yet)
>
> (For companies I got the Fortune top 500 list, but e.g.
> http://fr.transnationale.org/ is said to have 7,000 names - I don't know if
> they can be obtained somehow)
>
> It would be better for all of us if we could put lexical resources together
> somewhere, to make them easier to find and reuse in the
> translators (maybe a directory on sourceforge exists for that, but I don't
> know). I would prefer to have plain files with a clear explanation of their
> content rather than Apertium-formatted stuff because every translator

ITYM *not* every

> prefers to differentiate first names from family names or not, masculine
> first names from feminine ones or not, maybe someone would want to add the
> 5,000 most common Catalan family names (from Idescat) for his/her X-Y
> translator or not, and so on. From plain text files, anyone can quickly
> generate Apertium-like files. Lexical resources should also be referenced in
> the Apertium wiki.
>
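
To make the "plain text in, Apertium-like entries out" step concrete,
here is a rough sketch in Python. It assumes one name per line, "#" for
comment lines, and that every name goes into a single paradigm; the
function name names_to_entries is made up for illustration, and
"Andorre__np" is just the paradigm from the regex example in this
thread. Treat the entry layout as an assumption to check against your
own dictionary.

```python
from xml.sax.saxutils import escape, quoteattr


def names_to_entries(lines, paradigm="Andorre__np"):
    """Turn plain-text names (one per line) into dictionary-style <e> entries."""
    entries = []
    seen = set()
    for raw in lines:
        name = raw.strip()
        # Skip blank lines, comment lines and names we have already emitted.
        if not name or name.startswith("#") or name in seen:
            continue
        seen.add(name)
        # Escape XML special characters; spaces in multiword names become <b/>.
        body = escape(name).replace(" ", "<b/>")
        entries.append(
            f"<e lm={quoteattr(name)}><i>{body}</i>"
            f"<par n={quoteattr(paradigm)}/></e>"
        )
    return entries


if __name__ == "__main__":
    sample = ["Pierre", "Noël", "", "# a comment", "Pierre", "Le Havre"]
    for entry in names_to_entries(sample):
        print(entry)
```

Anyone who wants masculine and feminine first names kept apart could
keep separate plain files and call this once per file with a different
paradigm name.
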
> By the way, has any of us yet used Wikipedia as a resource for huge
> amounts of proper nouns (possibly with translations into other languages)?
>

Antonio Toral presented a paper at FreeRBMT about exactly that.

> In connection with proper noun translation, we have a problem in Apertium
> when dealing with regular expressions. If I include, for instance, this
> pair of regular expressions (in just one dictionary):
> <e>       <re>Saint\-[A-Z][a-z]+</re><i></i><par n="Andorre__np"/></e>
> <e>       <re>Sainte\-[A-Z][a-z]+</re><i></i><par n="Andorre__np"/></e>
> the compilation (lt-comp) slows down from 12 s. to 34 s. on my computer. If
> the expression is more complicated (in order to deal with non-English
> characters) it's even worse. I simply can't use them in most cases.

It also slows down processing and increases memory usage, as was
discussed here before.
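
For reference, here is what those two expressions accept, checked with
Python's re module rather than lt-comp (whose regex handling may well
differ). The merged pattern "Sainte?\-[A-Z][a-z]+" is only my
illustration that the pair describes one language; I have not checked
whether lt-comp accepts it or compiles it any faster. The sketch also
shows why the non-English case needs a bigger expression: [A-Z] does
not cover "É".

```python
import re

# The two expressions quoted above, used as whole-word patterns.
SAINT = re.compile(r"Saint\-[A-Z][a-z]+")
SAINTE = re.compile(r"Sainte\-[A-Z][a-z]+")

# Hypothetical single pattern covering both (not tested with lt-comp).
MERGED = re.compile(r"Sainte?\-[A-Z][a-z]+")


def matches_either(word):
    """True if the whole word matches one of the two original patterns."""
    return bool(SAINT.fullmatch(word) or SAINTE.fullmatch(word))


def matches_merged(word):
    """True if the whole word matches the merged pattern."""
    return bool(MERGED.fullmatch(word))


if __name__ == "__main__":
    for word in ["Saint-Malo", "Sainte-Marie", "Saint-Étienne", "Saint-malo"]:
        print(word, matches_either(word), matches_merged(word))
```
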

-- 
<Leftmost> jimregan, that's because deep inside you, you are evil.
<Leftmost> Also not-so-deep inside you.
