On Mon, 19 Nov 2012 at 22:32 +0100, Per Tunedal wrote:
> Hi again,
> please see my comments below.
> Per Tunedal
> 
> On Sun, Nov 18, 2012, at 14:22, Francis Tyers wrote:
> > On Sun, 18 Nov 2012 at 12:01 +0100, Per Tunedal wrote:
> > > Hi,
> > > I've started some work on abbreviations for the pair Swedish - Danish
> > > (sv- da). I imagined that this would be a well-defined, small task of
> > > limited scope, but it isn't.
> > > 
> > > Some questions:
> > > 
> > > 1. Internationally accepted standard abbreviations
> > > 
> > > I'm talking about abbreviations for countries (e.g. SE for Sweden), for
> > > languages (e.g. sv for Swedish), for currencies (e.g. SEK for svenska
> > > kronor - Swedish crowns), for measurement units (e.g. kg or m) and
> > > possibly some other domains. These abbreviations are common to all
> > > languages and it seems unnecessary to add them to all language pairs.
> > > Couldn't they be included by default in Apertium? Now they are marked as
> > > unknown words and thus get unnecessary attention when post editing.
> > 
> > There are complications to "including them by default"; for example, "kg" and
> > "m" are not the same in all languages. For some languages, codes may
> > have unexpected ambiguity with other words, thus decreasing the
> > performance of the tagger. So no, I don't think it's a good idea to
> > include them by default. 
> > 
> > A good idea would perhaps be to have a page on the Wiki with a list
> > that people can copy, paste and check. Or a script in the
> > trunk/apertium-tools which autogenerates dictionary entries for common
> > abbreviations which can then be manually checked before being added.
> > 
> 
> Is this a suitable GCI task?

Sure, would you like to mentor it?
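
The script idea quoted above could look roughly like the following. This
is only a minimal sketch: the sample list, the function names and the
"TODO" comment marker are hypothetical, and the entry shape assumes that
an identity element with an "abbr" symbol is what the pair's bilingual
dictionary wants. Every generated line still has to be checked by hand,
for the ambiguity reasons given above.

```python
from xml.sax.saxutils import escape

# Hypothetical sample input; a real list would be collected and reviewed by hand.
CANDIDATES = ["SE", "sv", "SEK", "kg", "m"]


def identity_entry(code, tag="abbr"):
    """Return one dictionary-style identity entry (same form on both sides)."""
    return f'<e><i>{escape(code)}<s n="{tag}"/></i></e>'


def generate(codes):
    """Yield one candidate entry per distinct code, flagged for manual review."""
    for code in sorted(set(codes)):
        yield f"{identity_entry(code)} <!-- TODO: check manually -->"


if __name__ == "__main__":
    for line in generate(CANDIDATES):
        print(line)
```

The output is meant to be pasted into a dictionary only after someone
has confirmed that each code really is identical in both languages.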

> > > 2. An abbreviation that hasn't any equivalent in the other language in
> > > the pair. It has to be translated by an expression.
> > > How to treat them?
> > 
> > No idea, give some examples.
> 
> Well, you might often abbreviate a word or an expression in one
> language, but never dream of abbreviating it in the other (or be
> reluctant to do so). Example:
> Danish "ejd." = Swedish "egendom". Further, there are phenomena in
> one country that simply don't exist in the other, e.g. Danish
> "D." = Swedish "-" (we don't have any "Dannebrogsordenen" in Sweden!)
> or even Danish "AB" = Swedish "-" (it doesn't exist; I have to use
> the Danish full word "andelsboligforening" or some related Swedish
> phenomenon like "bostadsrättsförening" (can be abbreviated to bfr.) or
> "bostadsförening" (very rare, as only a few old ones exist, because
> it's illegal to create new ones)).

This sounds really infrequent. In that case I might do something like:

da:

   D.<abbr> (LR)
   ejd.<abbr> 

da-sv:

   D.<abbr> = Dannebrogsordenen<abbr> (LR)
   ejd.<abbr> = egendom<abbr> (LR)
 
sv:

   Dannebrogsordenen<abbr> (RL)
   egendom<abbr> (RL)

This would allow you to translate from Danish->Swedish, but wouldn't
mess up the analysis in Swedish. If you don't understand this, then I
would just leave it until it starts causing serious translation problems
(maybe in 10 years).
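
For anyone who wants to try this, here is a minimal sketch of how the
da-sv lines above might be written out as dictionary XML. It assumes the
"(LR)" shorthand corresponds to an r="LR" attribute on the entry and
that <abbr> corresponds to <s n="abbr"/>; the function name and the PAIRS
list are made up for illustration, and the tags simply follow the
notation used above rather than any checked-in dictionary.

```python
from xml.sax.saxutils import escape


def bidix_entry(left, right, tag="abbr", restriction=None):
    """Build one bilingual entry; restriction is "LR", "RL" or None."""
    if restriction not in (None, "LR", "RL"):
        raise ValueError("restriction must be 'LR', 'RL' or None")
    r_attr = f' r="{restriction}"' if restriction else ""
    return (
        f"<e{r_attr}><p>"
        f'<l>{escape(left)}<s n="{tag}"/></l>'
        f'<r>{escape(right)}<s n="{tag}"/></r>'
        f"</p></e>"
    )


# The two Danish -> Swedish examples from this thread.
PAIRS = [("D.", "Dannebrogsordenen"), ("ejd.", "egendom")]

if __name__ == "__main__":
    for da, sv in PAIRS:
        # LR: only used when translating left to right (Danish -> Swedish).
        print(bidix_entry(da, sv, restriction="LR"))
```

Restricting the entries to one direction is what keeps the Swedish full
forms from being turned back into Danish abbreviations.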

Fran


_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff